The AI Monthly Top 3 - October 2021

October's most interesting AI breakthroughs, with video demos, short articles, code, and paper references.

Here are the 3 most interesting research papers of the month, in case you missed any of them. It is a curated list of the latest breakthroughs in AI and Data Science, ordered by release date, each with a clear video explanation, a link to a more in-depth article, and code (if applicable). Enjoy the read, and let me know if I missed any important papers in the comments or by contacting me directly on LinkedIn!

If you’d like to read more research papers as well, I recommend you read my article where I share my best tips for finding and reading more research papers.


Paper #1:

Skillful Precipitation Nowcasting using Deep Generative Models of Radar [1]

DeepMind just released a generative model that expert meteorologists preferred over widely used nowcasting methods in 89% of situations. More than 50 of them assessed it for accuracy and usefulness! The model focuses on predicting precipitation over the next 2 hours and does so surprisingly well. It is a generative model, which means it generates complete, realistic future radar frames (and can sample several plausible futures) instead of outputting a single averaged prediction. It basically takes past radar data and creates future radar data. By using both the temporal and spatial structure of past observations, it can generate what the radar will look like in the near future.
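To make the "generate instead of predict" idea concrete, here is a minimal sketch in Python with NumPy. It is not DeepMind's model: `sample_future` is a hypothetical toy generator that only shifts and rescales the latest radar frame using random draws, and the frame counts, grid size, and 1 mm/h threshold are all assumptions chosen for illustration. What it shows is the interface: past radar frames go in, and each call returns a different plausible future, so many calls give an ensemble you can turn into rain probabilities.

```python
import numpy as np

def sample_future(past, n_future, rng):
    """Draw ONE plausible future from a toy stochastic generator (not DeepMind's model).

    past: (T_past, H, W) radar rain rates; returns (n_future, H, W).
    This toy only looks at the latest frame; a real model uses the whole context.
    """
    # The random draws below play the role of the latent noise a generative model samples.
    drift = rng.integers(-1, 2, size=2)        # assumed storm motion, pixels per frame
    frame = past[-1]
    frames = []
    for _ in range(n_future):
        # np.roll wraps around the grid edges, which is fine for a toy example.
        frame = np.roll(frame, shift=(int(drift[0]), int(drift[1])), axis=(0, 1))
        frame = frame * rng.uniform(0.9, 1.1)  # storm slightly grows or decays
        frames.append(frame)
    return np.stack(frames)

rng = np.random.default_rng(0)
past = np.zeros((4, 32, 32))                   # 4 context frames (assumed)
past[:, 12:18, 12:18] = 5.0                    # a synthetic rain cell, in mm/h

# Sampling several times gives an ensemble of different, equally plausible futures.
ensemble = np.stack([sample_future(past, 24, rng) for _ in range(10)])

# Fraction of samples with rain above 1 mm/h at each future frame and pixel.
prob_rain = (ensemble > 1.0).mean(axis=0)
print(ensemble.shape, prob_rain.shape)
```

A deterministic predictor would return one sequence per input. Here the same `past` gives a different sequence on every call, which is the property that lets a generative nowcasting model express uncertainty.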

You can think of this like Snapchat filters, which take your face and generate a new face with modifications. To train such a generative model, you need a lot of data from both the human faces and the kind of face you want to generate. Then, after training a very similar model for many hours, you get a powerful generative model. This kind of model often uses a GAN architecture for training and then uses the generator on its own at inference time. If you are not familiar with generative models or GANs, I invite you to watch one of the many videos I made covering them, like this one about Toonify.

Watch the video

A short read version

DeepMind uses AI to Predict More Accurate Weather Forecasts
50+ expert meteorologists assessed DeepMind’s new model beating current nowcasting methods in 89% of situations for its accuracy and usefulness

Colab Notebook: https://github.com/deepmind/deepmind-research/tree/master/nowcasting


Paper #2:

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks [2]

Have you ever tuned in to a video or a TV show where the actors were completely inaudible or the music was way too loud? Well, this problem, a variant of the classic cocktail party problem, may never happen again. Mitsubishi Electric Research Laboratories (MERL) and Indiana University just published a new model and a new dataset tackling this task of separating a soundtrack into its component tracks. For example, if we take an audio clip with the music way too loud, you can simply turn the individual tracks up or down to give more importance to the speech than to the music.

The problem here is isolating each independent sound source from a complex acoustic scene, like a movie scene or a YouTube video where the sounds are not well balanced. Sometimes you simply cannot hear the actors because of the music, explosions, or other ambient sounds in the background. If you successfully isolate the different categories in a soundtrack, you can also turn just one of them up or down, like turning down the music a bit to hear the actors clearly. This is exactly what the researchers achieved.
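Once the stems are separated, the remixing step itself is simple. Here is a minimal sketch in Python with NumPy, assuming the separation has already been done: the three stems are synthetic stand-ins (sine waves and noise), and `remix`, the stem names, and the gain values are hypothetical choices for illustration, not part of the researchers' code.

```python
import numpy as np

def remix(stems, gains_db):
    """Sum separated stems after applying a per-stem gain given in decibels."""
    mix = np.zeros_like(next(iter(stems.values())), dtype=np.float64)
    for name, audio in stems.items():
        gain = 10.0 ** (gains_db.get(name, 0.0) / 20.0)  # dB to linear amplitude
        mix += gain * audio
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix             # rescale only if it would clip

# Synthetic stand-ins for the three separated stems (1 second at 16 kHz).
sr = 16000
t = np.arange(sr) / sr
rng = np.random.default_rng(0)
stems = {
    "speech": 0.2 * np.sin(2 * np.pi * 220 * t),
    "music": 0.5 * np.sin(2 * np.pi * 440 * t),
    "sfx": rng.uniform(-0.1, 0.1, size=sr),
}

# Turn the music down by 12 dB and the speech up by 6 dB; leave the effects alone.
balanced = remix(stems, {"music": -12.0, "speech": 6.0})
```

Gains are expressed in decibels because that is how audio levels are usually adjusted: every 6 dB roughly doubles or halves the amplitude. The hard part, of course, is getting clean stems in the first place, which is what the model is for.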

Watch the video

A short read version

Isolate Voice, Music, and Sound Effects With AI
If we take an audio clip with the music way too loud, you can simply turn up the speech and lower the music!

Project page: https://cocktail-fork.github.io/

DnR dataset: https://github.com/darius522/dnr-utils#overview


Paper #3:

ADOP: Approximate Differentiable One-Pixel Point Rendering [3]

Watch the video

A short read version

AI Synthesizes Smooth Videos from a Couple of Images!
Let’s construct 3D models from a couple of photos…

Code: https://github.com/darglein/ADOP


If you like my work and want to stay up-to-date with AI, you should definitely follow me on my other social media accounts (LinkedIn, Twitter) and subscribe to my weekly AI newsletter!

To support me:

  • The best way to support me is by following me here or on Medium, or by subscribing to my channel on YouTube if you like the video format.
  • Support my work on Patreon.
  • Join our Discord community: Learn AI Together and share your projects, papers, best courses, find Kaggle teammates, and much more!
  • Here are the most useful tools I use daily as a research scientist for finding and reading AI research papers… Read more here.

References

[1] Ravuri, S., Lenc, K., Willson, M., Kangin, D., Lam, R., Mirowski, P., Fitzsimons, M., Athanassiadou, M., Kashem, S., Madge, S. and Prudden, R., 2021. Skillful Precipitation Nowcasting using Deep Generative Models of Radar. https://www.nature.com/articles/s41586-021-03854-z.

[2] Petermann, D., Wichern, G., Wang, Z. and Le Roux, J., 2021. The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. https://arxiv.org/pdf/2110.09958.pdf.

[3] Rückert, D., Franke, L. and Stamminger, M., 2021. ADOP: Approximate Differentiable One-Pixel Point Rendering. https://arxiv.org/pdf/2110.06635.pdf.