My First Try with Image-to-Video AI on Google Colab
Today I tried something exciting: setting up my own image-to-video AI model on Google Colab. My goal was simple. I wanted to upload an image, write a prompt describing what I wanted, and let the AI generate either a new image aligned with that prompt or a short video based on it.
The idea behind this is powerful. Imagine taking a single picture and transforming it into a living, moving story, all with the help of AI. Many platforms already provide this kind of service, but most of them are subscription-based and costly. Since I could not afford those, I decided to explore the free alternative that Colab offers.
The results were interesting. The AI did manage to create images and even videos based on my input, but they were far from perfect. The original image wasn’t always preserved well: faces and objects would sometimes distort, or the AI would hallucinate completely new details that weren’t there before. While this can look creative in its own way, it wasn’t what I was aiming for.
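Some of that drift from the original picture can be influenced by settings. The following is a minimal sketch, assuming an image-to-image diffusion pipeline such as `StableDiffusionImg2ImgPipeline` from the Hugging Face diffusers library, which may differ from the model used in this experiment. The helper names (`fit_to_multiple`, `effective_steps`, `run_img2img`) are hypothetical, and `model_id` is a placeholder for whichever checkpoint you choose. In that pipeline, a lower `strength` adds less noise to the input, so more of the source image tends to survive.

```python
def fit_to_multiple(width, height, max_side=768, multiple=8):
    """Shrink (width, height) so the longer side is at most max_side,
    then round each side down to a multiple of `multiple`.
    Latent diffusion models generally expect sides divisible by 8."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h


def effective_steps(num_inference_steps, strength):
    """Denoising steps an img2img run actually performs in diffusers:
    only the last `strength` fraction of the schedule is executed."""
    return min(int(num_inference_steps * strength), num_inference_steps)


def run_img2img(image_path, prompt, model_id, strength=0.5, steps=30):
    """Hypothetical wrapper; needs a GPU runtime plus torch, diffusers, Pillow."""
    # Heavy imports stay inside the function so the helpers above
    # can be used without these packages installed.
    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")

    init = Image.open(image_path).convert("RGB")
    init = init.resize(fit_to_multiple(*init.size))

    result = pipe(
        prompt=prompt,
        image=init,
        strength=strength,          # lower values stay closer to the input
        guidance_scale=7.5,
        num_inference_steps=steps,
    )
    return result.images[0]
```

With `steps=30` and `strength=0.5`, only 15 denoising steps are run on a lightly noised copy of the input, which is one reason lower values tend to keep faces and objects closer to the original.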
Still, the experience taught me a lot. Running such models on free resources like Colab is not easy: performance is limited, and results can be unpredictable. But it also opened up exciting possibilities. With refinement, better models, and more resources, these AI tools could become truly impressive.
For now, I’m happy I managed to bring my idea to life, even in this rough version. This is just the beginning, and I’ll try it again in the future. It was particularly helpful that Gemini on Colab adjusted the code and solved the issues, allowing it to run successfully.
Here’s the Google Colab Python code snippet, with a prompt describing the planned image or video:
(This version is the one adjusted by Gemini on Colab, so it runs successfully.)
 
