AI tools have seemingly appeared out of nowhere, with each new release trying to fill a niche that the previous one missed. Google, however, has mostly been quiet during the current AI boom. That is, until recently. A team affiliated with Google Research has introduced Dreamix, which the company describes as a diffusion-based video editor. Although it can’t create films from nothing but a prompt, it can take existing footage and alter it using text prompts. But how does it actually work? Can the new technology help creatives? Or should we start to worry that, within the next ten years, audiences will simply be able to produce their own movies on the fly?
Basics of Dreamix
No, it’s unlikely that a Google app will take your job anytime soon. Even though AI technology is advancing quickly, we are still a long way from Star Trek levels of interaction. Google also has a history of discarding technology before it has truly served its purpose. However, given how important and popular AI tools have become, Dreamix might endure for some time. Here’s what the company says: Given a video and a text prompt, Dreamix edits the video while maintaining fidelity to color, posture, object size and camera pose, resulting in a temporally consistent video. Here, Dreamix turns the eating monkey (left) into a dancing bear (right) given the prompt “A bear dancing and jumping to upbeat music, moving his whole body.”
Video of an eating monkey is completely transformed
Dreamix can create videos based on image and text inputs. In this example it is able to instill complex motion in a static image, adding a moving shark and making the turtle swim. In this case, visual fidelity to object location and background was preserved but the turtle direction was flipped.
Creating Videos from Image
Given a small collection of images showing the same subject, Dreamix can generate new videos with the subject in motion. In this example, given a small number of images of the toy fireman, Dreamix is able to extract its visual features and then animate it to lift weights while maintaining fidelity and temporal consistency.
When given a video, “Dreamix edits the video while maintaining fidelity to color, posture, object size and camera pose, resulting in a temporally consistent video,” according to the information that has been made public. This is accomplished by using a “video diffusion model” and “combining, at inference time, the low-resolution spatio-temporal information from the original video with new, high-resolution information that is synthesized to coincide with the guiding text prompt.” Don’t worry if that makes no sense to you. It was the same for me. As far as I can understand, the core concept is that the program adds noise to the original video. A video diffusion model then “nudges” the noisy footage in the direction of your text prompt.
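To make the noise step a little more concrete, here is a minimal sketch in Python of how a clip might be degraded before a diffusion model reworks it. This is not Google's code. The function names (downsample, add_noise, corrupt), the grayscale nested-list video format, and the square-root blending formula (borrowed from standard diffusion models) are all assumptions made for illustration. For simplicity it only reduces resolution within each frame, not across time. The part that does the real work, a trained text-conditioned video diffusion model that removes the noise while following the prompt, is not shown.

```python
import math
import random

# Hypothetical toy format: a video is a list of grayscale frames,
# and each frame is a list of rows of pixel values in [0, 1].
Frame = list[list[float]]
Video = list[Frame]


def downsample(video: Video, factor: int) -> Video:
    """Average each factor x factor block of pixels, discarding fine detail."""
    out = []
    for frame in video:
        height, width = len(frame), len(frame[0])
        small = []
        for y in range(0, height - height % factor, factor):
            row = []
            for x in range(0, width - width % factor, factor):
                block = [
                    frame[y + dy][x + dx]
                    for dy in range(factor)
                    for dx in range(factor)
                ]
                row.append(sum(block) / len(block))
            small.append(row)
        out.append(small)
    return out


def add_noise(video: Video, strength: float, rng: random.Random) -> Video:
    """Blend every pixel with Gaussian noise.

    strength=0 keeps the clip unchanged; strength=1 replaces it with pure noise.
    The square-root weights follow the usual diffusion forward process
    (an assumption here, not something the article's sources confirm).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    keep = math.sqrt(1.0 - strength)
    noise = math.sqrt(strength)
    return [
        [[keep * p + noise * rng.gauss(0.0, 1.0) for p in row] for row in frame]
        for frame in video
    ]


def corrupt(video: Video, factor: int, strength: float, seed: int = 0) -> Video:
    """Low-resolution, noisy version of the clip: the editor's starting point."""
    rng = random.Random(seed)
    return add_noise(downsample(video, factor), strength, rng)


# A tiny two-frame, 4x4 clip with a simple gradient.
clip = [[[(x + y) / 6 for x in range(4)] for y in range(4)] for _ in range(2)]
noisy = corrupt(clip, factor=2, strength=0.5)
print(len(noisy), len(noisy[0]), len(noisy[0][0]))
```

The strength value is the interesting knob in this sketch. With a low value, most of the original footage survives and a model would have little room to change it. With a high value, only the rough layout and motion remain, which leaves more room for the text prompt to steer the result.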
Should Creatives Care?
A filmmaker or any other creative artist might be tempted to “hang up their artistic spurs,” so to speak, after seeing something like this. But to be honest, this represents a significant advance in a set of tools that could benefit editors, directors, and cinematographers alike. Think about pitching a project and having a stunning clip ready to show. With Dreamix, it will be a lot simpler to create something that suits your final narrative. Dreamix might help editors hone cuts or even supply stock footage. Cinematographers could alter footage that has already been shot to fit the plan.
Although Dreamix could (and certainly will) find a larger audience, these use cases are specifically for creatives. Think of social media filters, but instead of changing your face, you produce entirely new material. It is similar to motion capture, but enhanced by AI. Sure, this kind of technology could be used unethically, but the same could be said for seatbelts.
You can read the research paper here. With that said, what are your thoughts on Dreamix? How should the technology advance, in your opinion? Comment below and let me know!