Sora AI, the hot new launch from OpenAI, is the latest buzz in the tech world. It is the first OpenAI model capable of generating a video of up to a minute in length from a text prompt.
While the world is just getting used to AI chat assistants and image generation, OpenAI has given people a new reason to be both excited and worried about AI advancements.
Though Sora AI is still in development and not available for general use, it has excited a lot of enthusiasts like me.
Let’s take a quick look at this new AI text-to-video model!
What Is Sora AI?
First things first, Sora is not ChatGPT and cannot be used within ChatGPT or your ChatGPT Plus subscription.
Sora AI is a separate AI model that is still in the works, and it is not yet available to the general public.
With Sora AI, a user can enter a simple text prompt and get a generated video of up to a minute in length that closely follows the prompt while maintaining the quality of lighting, skin tones, textures, real-world physics, and other elements.
Here’s A Look At The Best Sora AI Example I Found:
How Does Sora AI Work?
Digging deeper into the technology behind it, Sora AI is the largest text-to-video model from OpenAI. It is a text-conditional diffusion model trained on images and videos of different aspect ratios and resolutions.
Sora AI uses a transformer architecture that operates on spacetime patches of video and image latent codes.
Just as LLMs work on text “tokens,” Sora works on visual “patches.” OpenAI found these patches to be a highly scalable and effective representation for training generative models on a variety of images and videos.
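To make the “patches” idea concrete, here is a minimal sketch in plain Python. It cuts a tiny toy video into non-overlapping spacetime blocks and flattens each block into one “token.” The function name `spacetime_patches`, the patch sizes, and the toy video are my own illustrative assumptions. Sora works on compressed latent codes rather than raw pixels, and OpenAI has not published its implementation, so treat this only as an illustration of the concept.

```python
def spacetime_patches(video, pt, ph, pw):
    """Split a video into non-overlapping spacetime patches.

    video: list of frames; each frame is a list of rows of pixel values.
    pt, ph, pw: patch size along time, height, and width (hypothetical names).
    Returns a list of patches, each flattened into a plain list ("token").
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")

    patches = []
    for t0 in range(0, frames, pt):
        for y0 in range(0, height, ph):
            for x0 in range(0, width, pw):
                # Collect every pixel inside this time x height x width block.
                patch = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                patches.append(patch)
    return patches


# Toy video: 4 frames of 4x4 pixels; each value encodes its (t, y, x) position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)] for t in range(4)]
tokens = spacetime_patches(video, pt=2, ph=2, pw=2)

print(len(tokens), len(tokens[0]))  # 8 8
print(tokens[0])  # [0, 1, 10, 11, 100, 101, 110, 111]
```

Each patch mixes pixels from two neighbouring frames, which is what makes it a “spacetime” patch rather than a purely spatial one.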
Put simply, Sora AI starts with a video that looks like static noise and gradually transforms it into a video that matches the prompt by removing the noise over many steps.
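The step-by-step noise removal can be sketched with a toy example. This is not how Sora is implemented: a real diffusion model uses a trained neural network to predict the noise, whereas the hypothetical `toy_denoise` function below cheats by already knowing the clean signal. It only shows the shape of the loop, in which a sample starts as random noise and gets a little cleaner at every step.

```python
import random


def toy_denoise(target, steps=10, seed=0):
    """Toy illustration of iterative denoising (not a real diffusion model).

    target: the clean signal, standing in for what a trained model would
            steer toward (assumption: here it is known in advance).
    Returns the final sample and the worst-case error after each step.
    """
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in target]  # start from pure noise
    errors = []
    for step in range(steps):
        remaining = steps - step
        # Stand-in for the model's noise prediction at this step.
        predicted_noise = [s - t for s, t in zip(sample, target)]
        # Remove an equal share of what is left, so the last step removes it all.
        sample = [s - n / remaining for s, n in zip(sample, predicted_noise)]
        errors.append(max(abs(s - t) for s, t in zip(sample, target)))
    return sample, errors


clean = [0.0, 0.5, 1.0]
sample, errors = toy_denoise(clean, steps=10)
print([round(e, 3) for e in errors])  # error shrinks toward zero step by step
```

In the real model, the text prompt conditions the noise prediction at every step, which is how the output ends up matching what the user asked for.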
Sora builds on the existing research behind DALL·E and GPT. It uses the same re-captioning technique as DALL·E 3, which generates highly descriptive captions for the visual training data.
Sora AI Availability
As of February 2024, Sora AI is not available to general users. This text-to-video model can only be used by select individuals chosen by OpenAI, called “red teamers,” who are currently testing the potential and limitations of the model.
These individuals are experts in identifying bias, hateful content, and misinformation.
Sora AI Safety
With such a high level of potential, this AI model also raises questions about the safety and security of using it.
Hence, before releasing the model, OpenAI is developing tools such as a detection classifier that can tell when a video was generated by Sora.
It is also adapting existing text classifiers to filter out requests for violent, sexual, or hateful content, celebrity likenesses, or other people's intellectual property.
Sora AI Potential
Apart from generating a brilliant video like the one above, what else can Sora AI do?
As of now, Sora can create widescreen 1920×1080 videos, vertical 1080×1920 videos, and any aspect ratio in between.
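If you are wondering how those resolutions map to familiar aspect ratios, a few lines of Python can reduce them. The helper name `aspect_ratio` is just my own choice for this illustration.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a resolution to its simplest width:height ratio."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


print(aspect_ratio(1920, 1080))  # widescreen -> 16:9
print(aspect_ratio(1080, 1920))  # vertical -> 9:16
```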
Check This Out:
Source: OpenAI YouTube
Sora AI can:
- Generate an AI video of up to a minute in length from text.
- Create content for a variety of devices in each device's native aspect ratio.
- Animate still images.
- Extend an existing video or fill in missing frames.
- Perform video-to-video editing.
- Connect two input videos.
- Generate AI images up to 2048×2048 resolution.
- Simulate digital worlds (like Minecraft).
Limitations Of Sora AI
Sora AI currently has many limitations, the most evident being its inability to accurately model the physics of many elements.
Here are some more Sora AI limitations:
- It is a private tool, not available to the general public.
- It is still in the testing phase.
- It can only generate videos of up to a minute in length.
- It does not generate any sound for the video.
- It struggles with spatial details in a prompt, such as mixing up left and right.
What Does the Launch of Sora AI Text-To-Video Model Mean For the Future?
Sora AI is a big leap toward artificial general intelligence and toward applying AI models in real-life use cases.
But it may sound the alarm for the video production industry, including video content creators, artists, stock video providers, and other professionals involved in video production.
For the tech world, on the other hand, this is yet another breakthrough that was bound to arrive, given the research and development that has gone into AI models and the potential they have shown.
We can expect the Sora AI text-to-video model to become available for use and to be integrated into other OpenAI products pretty soon.
And it is only a matter of time before other AI-focused tech giants like Google, Microsoft, and Meta come up with their own AI text-to-video models.