NVIDIA has been hard at work on the problem of interpolating high frame rate video from footage shot at lower frame rates. This kind of tech has been around since the late 1990s with the advent of Twixtor, and it has been refined over the decades in systems like Twixtor Pro and Adobe’s Optical Flow in After Effects. You are still not getting real temporal detail, since the in-between frames are created by estimating velocity and direction vectors plus pixel values between the existing frames.
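To make the idea concrete, here is a toy sketch of what "estimating motion between frames" means. It is not how Twixtor, After Effects or NVIDIA's network works internally; it assumes a one-dimensional "frame" (a single row of pixels) and one global motion vector, and all function names are made up for illustration. It compares a plain cross-fade with a motion-compensated guess for the frame halfway between two captured frames.

```python
def blend(frame_a, frame_b, t):
    """Cross-fade: weighted average of pixel values, with no motion model."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


def estimate_shift(frame_a, frame_b, max_shift=8):
    """Brute-force search for the single shift that best maps frame_a onto frame_b."""
    n = len(frame_a)

    def cost(shift):
        # Sum of absolute differences after moving frame_a by `shift` pixels.
        return sum(abs(frame_a[(i - shift) % n] - frame_b[i]) for i in range(n))

    return min(range(-max_shift, max_shift + 1), key=cost)


def motion_compensated(frame_a, frame_b, t, max_shift=8):
    """Move frame_a's pixels a fraction t of the way along the estimated vector."""
    n = len(frame_a)
    step = round(estimate_shift(frame_a, frame_b, max_shift) * t)
    return [frame_a[(i - step) % n] for i in range(n)]


# A bright dot at pixel 4 in the first frame and at pixel 12 in the second.
frame_a = [0] * 20
frame_a[4] = 255
frame_b = [0] * 20
frame_b[12] = 255

mid_blend = blend(frame_a, frame_b, 0.5)            # two half-bright ghosts
mid_mc = motion_compensated(frame_a, frame_b, 0.5)  # one dot at pixel 8
```

The cross-fade leaves two faint ghosts at pixels 4 and 12, while the motion-compensated version places a single dot at pixel 8. That looks convincing, but pixel 8 is only a guess that the dot moved in a straight line at constant speed. If it actually bounced, or something else flashed by between the two captures, neither method would know.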
We explored this technique in our post on interpolation here and why it is no substitute for a real slow motion camera solution. NVIDIA’s new method uses machine learning trained on 11,000 videos to arrive at a more convincing result. Considering the relatively small sample size, we can imagine a future where hundreds of thousands or millions of footage samples are used to generate near flawless interpolation. This technique takes some serious computation and large data sets, so as of now it is not really ready for the mass market, but that could change with the cloud very soon.
NVIDIA Slow Motion Is New But Still Flawed:
As you can see in the sample video below, the artifacts produced by interpolation are very evident, and more so when fluid or fabric motion is introduced. The human visual system can mask some of these in real time playback due to the persistence of vision effect and the brain's image processing, but they are still quite apparent if you look with a critical eye.
Transforming Standard Video Into Slow Motion with AI by NVIDIA:
There is no question this might be the best looking interpolation method we have seen to date, but it is still not generating new information that has any scientific value. In other words, you can’t create something from nothing, the nothing here being the unrecorded moments between two frames that are distant in time, which the system can only estimate. It sure is a marvel of computation and could really help in getting many more frames with plenty of detail and suppressed artifacts, but none of those frames is a real image captured from a live event. If you record an explosion or fluid with this technique you will get what the computer estimates should be there and not what actually happened. Any rogue debris or physically distinct motion phenomena will not be there. This technique is completely useless for education and scientific research.
That said, the technique can make slow-mo videos shot on your phone just a little more interesting, even when shot at 30 or 60fps. As with any interpolation technique, you get better results the more frames you give the system. If you shoot at 1000fps with a shutter of 1/4000, for example, you will be able to interpolate up to 3,000 or 4,000fps without much artifacting. Then again, if you shoot at 4000fps, as an edgertronic SC2+ can, you could interpolate up to 16,000fps without much in the way of artifacting.
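The arithmetic behind those figures can be sketched as follows. This is only a back-of-the-envelope helper with made-up function names, and it assumes the target frame rate is a whole multiple of the capture rate. It shows how many frames the software has to invent for every gap between two real frames, and what share of the final footage is synthetic.

```python
from fractions import Fraction


def interpolation_plan(source_fps, target_fps):
    """Return (factor, invented frames per gap, synthetic share of the output).

    Assumes target_fps is a whole multiple of source_fps.
    """
    if target_fps % source_fps != 0:
        raise ValueError("target_fps must be a whole multiple of source_fps")
    factor = target_fps // source_fps
    invented_per_gap = factor - 1
    # Out of every `factor` output frames, only one was actually captured.
    synthetic_share = Fraction(invented_per_gap, factor)
    return factor, invented_per_gap, synthetic_share


def playback_slowdown(effective_fps, playback_fps=30):
    """How many times slower than real life the footage plays back."""
    return Fraction(effective_fps, playback_fps)


print(interpolation_plan(1000, 4000))   # 4x: 3 invented frames per gap
print(interpolation_plan(4000, 16000))  # also 4x
print(interpolation_plan(30, 240))      # 8x: 7 invented frames per gap
print(playback_slowdown(16000))         # slowdown factor at 30fps playback
```

Both camera examples above are 4x interpolations, so three out of every four output frames are estimates. Going from 30fps to 240fps is an 8x jump, with seven invented frames in every gap, and each of those gaps spans a much longer slice of real time, which is why the results degrade faster from slow sources.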
We can certainly see a future in which you can upload your lower frame rate footage to the cloud and choose which frame rate you want it at, within a reasonable range. The cloud AI and its machine learning algorithms should get better as more and more videos are added to the collection. It should be possible to train with millions of samples instead of only the 11,000 videos the NVIDIA researchers used in the lab, and the interpolation should keep improving as the computer learns from the added content.
It will also be possible to create footage from scratch by using video parts, much like what Google did with machine learning on images to create new art. What an interesting future it will be.
We are all for better interpolation, but do not believe the hype when you are told you may never need a slow-motion camera again. The fact is that temporal detail in real-world events cannot be interpolated into existence; only a capture produces real information. So you had better continue to use your slow motion camera, and expect to get a more capable one as technology improves and prices continue to drop. -HSC
Nvidia Slow Motion Interpolation Press Release on the Technology Below:
Link to the article here: https://news.developer.nvidia.com/transforming-standard-video-into-slow-motion-with-ai/?ncid=–43539
Researchers from NVIDIA developed a deep learning-based system that can produce high-quality slow-motion videos from a 30-frame-per-second video, outperforming various state-of-the-art methods that aim to do the same. The researchers will present their work at the annual Computer Vision and Pattern Recognition (CVPR) conference in Salt Lake City, Utah this week.
“There are many memorable moments in your life that you might want to record with a camera in slow-motion because they are hard to see clearly with your eyes: the first time a baby walks, a difficult skateboard trick, a dog catching a ball,” the researchers wrote in the research paper. “While it is possible to take 240-frame-per-second videos with a cell phone, recording everything at high frame rates is impractical, as it requires large memories and is power-intensive for mobile devices,” the team explained.
With this new research, users can slow down their recordings after taking them.
Using NVIDIA Tesla V100 GPUs and cuDNN-accelerated PyTorch deep learning framework the team trained their system on over 11,000 videos of everyday and sports activities shot at 240 frames-per-second. Once trained, the convolutional neural network predicted the extra frames.
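The release does not spell out how training examples were built from the 240fps footage. One plausible setup, assumed here purely for illustration, is to hand the network only every eighth frame (the 30fps view of the clip) and keep the seven frames in between as the answer key it is graded against. The function name and sample layout below are hypothetical, not NVIDIA's code.

```python
def training_samples(num_frames, step=8):
    """Yield (first_index, last_index, hidden_indices) for one high-speed clip.

    step=8 corresponds to treating a 240fps clip as 30fps input (240 / 30).
    The hidden frames are real captures the network never sees as input;
    they serve as ground truth for the frames it has to predict.
    """
    for start in range(0, num_frames - step, step):
        end = start + step
        yield start, end, list(range(start + 1, end))


# A 17-frame snippet of 240fps video yields two training samples.
for first, last, hidden in training_samples(17):
    print(f"input frames {first} and {last}, targets {hidden}")
```

Under this assumption the network only ever learns what typically happens between two frames in its training clips. At playback time there is no answer key, which is the article's point: the output is a learned guess, not a recording.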
The team used a separate dataset to validate the accuracy of their system.
The result can make videos shot at a lower frame rate look more fluid and less blurry.
“Our method can generate multiple intermediate frames that are spatially and temporally coherent,” the researchers said. “Our multi-frame approach consistently outperforms state-of-the-art single frame methods.”
To help demonstrate the research, the team took a series of clips from The Slow Mo Guys, a popular slow-motion based science and technology entertainment YouTube series created by Gavin Free, starring himself and his friend Daniel Gruchy, and made their videos even slower.
The method can take everyday videos of life’s most precious moments and slow them down to look like your favorite cinematic slow-motion scenes, adding suspense, emphasis, and anticipation.
The researchers, which include Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, and Jan Kautz, will present on Thursday, June 21 from 2:50 – 4:30 PM at CVPR.