We continue with the news about AI models that generate audiovisual content from a text description. Yesterday we covered AudioGen, a text-to-audio AI, and a few days earlier we talked about Make-A-Video, Meta’s AI that generates videos from text.
Which AI is it time to talk about today? Not one, but two models that could compete directly with Meta’s. They are called Imagen Video and Phenaki, they were presented by Google, and both convert text to video.
Imagen Video prioritizes the quality of its videos, even if they are shorter
If we technology enthusiasts know one thing, it is that Google’s experience in artificial intelligence is very extensive. So the fact that it has presented two text-to-video generators does not take us by surprise. But why two? Because each model takes a different approach, and because Google can.
The first model is Imagen Video, an AI that focuses on creating high-quality videos. It builds on the same foundation as Imagen, Google’s text-to-image AI that was introduced a few months ago. However, Imagen Video is a refined version that adds many new elements capable of turning static images into moving ones.
Like the Meta model, Google’s AI delivers some results that are not perfect, but certainly amazing. Some videos can be unsettling, especially if there are faces or people moving, but it’s still a big step forward.
The best part? It works like any other AI of this kind (it only requires a text description), but its image quality is superior to that of Make-A-Video. According to Google’s developers, Imagen Video starts from a base clip of only 16 frames at 3 fps and a resolution of 24 x 48 pixels.
Once the low-res base video is ready, several super-resolution AI models are run on it, producing the following end result: a 128-frame video at 24 fps with a resolution of 1280 x 768 pixels. In other words, an HD video of just over 5 seconds. In the case of Meta’s AI, the output resolution is 768×768 pixels.
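The figures quoted above can be checked with a bit of arithmetic. The following is only an illustrative Python sketch: the function name `clip_duration_seconds` is made up for this example and is not part of any Google tool, and the numbers are simply the ones reported in the article.

```python
from fractions import Fraction


def clip_duration_seconds(frames: int, fps: int) -> Fraction:
    """Duration of a clip as an exact fraction of a second (frames / fps)."""
    return Fraction(frames, fps)


# Figures reported in the article for Google's high-quality model.
base_duration = clip_duration_seconds(frames=16, fps=3)     # low-res base clip
final_duration = clip_duration_seconds(frames=128, fps=24)  # final HD clip

# How many times more frames the final clip has than the base clip.
frame_factor = 128 // 16

print(f"Base clip:  {float(base_duration):.2f} s")
print(f"Final clip: {float(final_duration):.2f} s")
print(f"Frame count multiplied by {frame_factor}")
```

With these numbers both clips last the same 16/3 of a second (just over 5 seconds), which suggests that the later stages add frames between the existing ones and add pixels, rather than making the clip longer.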
Phenaki bets on long videos, but sacrifices image quality
Google’s other text-to-video AI does just the opposite: it generates much longer videos, but at the cost of the final image quality.
The other difference? Since its goal is to make much longer videos, Phenaki requires much more detailed instructions. In fact, Imagen Video does its job with a single sentence, but you can ask Phenaki to animate a whole paragraph with different sequences and it will do it.
As you would expect, the consistency of the resulting images is not as good. But being able to handle several scenes and settings (as if it were a movie) is something that leaves us speechless.
Additionally, the Phenaki development team revealed another fact: its AI model generates videos of arbitrary length. There is no maximum time limit, although the same text can generate two videos of very different durations.
According to Google, future versions of these two artificial intelligences “will be part of an ever-expanding set of tools that will help artists and everyday users create exciting ways to express their creativity.”
Is this the future of cinema? We don’t know, but time will tell. How can you try these applications? Unfortunately, these two AI models are not yet available to users, although you can see some videos produced by them on their official sites.