Director Paul Trillo’s work with AI tool Dall-E 2 featured in Fast Company

(FAST COMPANY) L.A.-based director Paul Trillo’s mind-twisting work has left me absolutely flabbergasted more than once. He has created some of the best visual material I’ve ever seen, which he made possible through raw creativity, a deep knowledge of the media, and the development of production techniques like the first fully mobile bullet-time rig.

Then he got DALL-E 2, and things dialed up to 11.

Text-to-image synthesis AI apps like DALL-E 2 allow anyone with an imagination to create pretty much anything they want, simply by typing a few guiding words known as “prompts.” You can imagine a dramatic love story between John Oliver and a cabbage or turn David Bowie’s lyrics into surrealist artwork worthy of an album by Ziggy Stardust himself. But, in the hands of someone as creative as Trillo, these AI tools are the equivalent of going from using a zoetrope to having Industrial Light & Magic following all your whims by just using a command line.

“It’s an incredibly exciting and overwhelming time for creators,” Trillo tells me over email. “In one sense, AI democratizes image-making so that people who are more verbal can express themselves visually. It also gives people who are already visual a way to evolve their work and go down paths they may have never explored.”

DALL-E 2 has become an extremely powerful tool in Trillo’s creative arsenal, one that has allowed him to create an impressive series of stop-motion composites combining real-world video imagery with DALL-E’s synthetic creations. His latest example is this beautiful 30-second fashion show that used AI to generate hundreds of outfits in collaboration with his wife, the artist Shyama Golden, who helped him art direct which outfits should appear in the final video and starred as the model.

Trillo tells me that the project grew out of his desire to use AI in a way he hadn’t seen before. “I first started creating some experiments combining live-action video with DALL-E back in June, beginning with this video of objects changing in the palm of my hand,” he says. That first clip was simply an AI experiment to recreate an effect from a previous short film of his called A Truncated Story of Infinity.

“[The experiment] worked a lot better than I had expected,” Trillo says. “My next impulse was to do an outfit swap, which is also an effect I’ve done practically in previous projects, but I wanted to experiment with some other ideas before using it with a full body person.” That’s where the power of the tool became so obvious to him. The generative work with AI is so quick that “we can now have access to a multiverse of ideas,” he says.

To create the outfits seen in the video, he used variations of several text prompts to guide the AI, from “purple iridescent mylar oversize T-shirt” to “lavender purple puffy jumpsuit” to “lavender purple retro futuristic fashion jumpsuit with mock turtleneck, puffy feather shoulder pads, avant garde fashion, Japanese minimalism from 2040, Barbarella.”

“[It] opened the door to some pretty wild designs that I would have never come to on my own,” says Trillo, who describes the power as “limitless and overwhelming.”


Another DALL-E 2 stop motion experiment by Trillo, who also used text-to-voice AI synthesis to narrate it with Sir David Attenborough’s voice.

The difference between using a traditional technique and AI is staggering. As Trillo says, you would literally need to design and build 100 outfits and then have the model change into them every few frames as you motion-control the camera. Another way would be to design and build the clothes in 3D, having to create the fabrics and textures, then light and do the necessary composite work over the video. But DALL-E 2 can do this with a text prompt. It not only generates an object like clothing, it can also recreate a photograph and instantly composite something into that photograph. This is a feature unique to DALL-E 2, one that the other AI synthesis programs don’t offer: “DALL-E analyzes the aesthetic of the original image, the lighting, the perspective, everything, and seamlessly blends something new into the original image. It’s incredibly good at adding objects into a scene or filling erased parts, a process known as ‘inpainting,’” says Trillo.

Making the actual video is a straightforward and even tedious process. First, you have to capture the base video, which in this case is just a dolly shot of his wife, Shyama, on a garden path. Then you have to extract the video frames and feed them to DALL-E one by one, entering a text prompt for each. “It’s essentially AI-generated stop motion,” he says.
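For readers who think in code, that frame-by-frame loop can be sketched roughly as follows. This is only an illustration under stated assumptions, not Trillo’s actual setup: the function names, the `hold` parameter (how many consecutive frames keep the same outfit), and the `inpaint` callable are all hypothetical, and `inpaint` stands in for whatever tool edits a single frame from a text prompt. It does not represent DALL-E’s real interface.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def plan_prompts(num_frames: int, prompts: Sequence[str], hold: int = 2) -> List[str]:
    """Assign one text prompt to every frame.

    Each prompt is held for `hold` consecutive frames, then the next
    prompt takes over; the list of prompts repeats if it runs out.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if hold < 1:
        raise ValueError("hold must be at least 1")
    if not prompts:
        raise ValueError("at least one prompt is required")
    return [prompts[(i // hold) % len(prompts)] for i in range(num_frames)]


def restyle_frames(
    frames: Sequence[Frame],
    prompts: Sequence[str],
    inpaint: Callable[[Frame, str], Frame],
    hold: int = 2,
) -> List[Frame]:
    """Run every frame through `inpaint` with its planned prompt.

    `inpaint` is a hypothetical stand-in (assumption): it takes one frame
    and one prompt and returns the edited frame.
    """
    plan = plan_prompts(len(frames), prompts, hold)
    return [inpaint(frame, prompt) for frame, prompt in zip(frames, plan)]


if __name__ == "__main__":
    # Two of the prompts quoted in the article, used purely as sample input.
    outfit_prompts = [
        "purple iridescent mylar oversize T-shirt",
        "lavender purple puffy jumpsuit",
    ]
    # Placeholder frames and a fake editor that just labels each frame.
    fake_frames = ["frame_000", "frame_001", "frame_002", "frame_003"]
    for edited in restyle_frames(fake_frames, outfit_prompts, lambda f, p: f"{f} <- {p}"):
        print(edited)
```

Holding each prompt for a couple of frames before switching is what gives the result its stop-motion feel; a larger `hold` would slow the outfit changes down.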

“Once I explored a few different directions to go, I honed in on a particular style for the wardrobe,” Trillo says, taking into consideration the balance of visual design variation and visual consistency. He ended up using 115 outfits, with countless others left on the cutting room floor. Those had to be sequenced in a way that flowed together organically, he says, but was also unexpected. He finally used another AI program—called RunwayML—to rotoscope the image sequence into the source video. And then, to finalize the sequence, he created the floating objects that you can see using DALL-E, again using stop motion and placing them in different layers for added depth.

One of the dangers of DALL-E, Trillo believes, is that it’s a time sucker. Its “imagination” is so fascinating that you can easily lose hours and hours in the exploration process. But you can curb that compulsion, he believes, and that’s precisely why it is just a powerful tool in a creative’s arsenal.

Trillo knows that DALL-E could have negative impacts on the creative industry, but, he insists, “It isn’t going to be taking any jobs away from visual effects artists.” If anything, he anticipates, “it’s going to create efficiencies to work they’re already doing. It will open the door to entirely new kinds of techniques as well as allow for lower-budget projects to have photorealistic VFX.”

Which makes sense. I can see how truly creative people will be safe and empowered by these new tools rather than threatened by them. I can imagine people who are great at using After Effects or Photoshop, but have limited creativity, losing jobs, the same way that many other jobs were lost to technology that empowered others to do amazing things.

Trillo makes another good point: “If everyone can create spectacle, then spectacle will become boring.”