If you've ever been to Paris, chances are you've been to the Louvre and pushed through the crowds for a chance to see the famous Mona Lisa. For extra excitement, museum visitors can also move from one side of the frame to the other to experience the well-known sensation that the woman in the painting is watching them, whichever direction they go. Of course, the woman is not moving at all, but we enjoy the thrill of the idea. Now, thanks to artificial intelligence (AI), we can take that excitement a step further.
In a video shared on YouTube, viewers can see the Mona Lisa move her lips and head. The clips were created by a "convolutional neural network," a type of AI loosely modeled on how the human brain processes information and well suited to analyzing images. The computer scientists behind the clips trained the algorithm to understand the relationship between facial features and general shapes, and were then able to apply what it had learned to still images. As a result, a single still photo can be given multiple realistic moving expressions.
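For readers curious about what "convolutional" means, here is a minimal sketch of the basic operation such a network repeats many times: sliding a small filter over an image and summing the products. This is a toy illustration, not the researchers' code. The function name `convolve2d`, the 4x4 image, and the filter values are all assumptions made up for the example. In a real network the filter values are learned from data rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            # Multiply each filter value by the pixel underneath it
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 grayscale "image": dark left half (0), bright right half (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked filter (an assumption) that responds to dark-to-bright vertical edges
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is largest in the middle column, exactly where the image changes from dark to bright. Stacking many learned filters like this one is how a convolutional network picks out shapes such as the outline of a mouth or an eye.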
The scientists in the video explain that they taught the AI different facial movements based on videos of three people, each of whom helped produce a slightly different version of the "live" Mona Lisa. The Leonardo da Vinci painting wasn't the only one brought to life! The scientists also used AI to create animations from famous photos of Albert Einstein, Marilyn Monroe, and Salvador Dalí.
These videos, known in the industry as deepfakes, are complicated to produce. The scientists explained that the complexity and variety of human facial structure mean that 3D models of heads have "tens of millions of parameters" to work with. Moreover, according to the GAO, convincing deepfakes require "advanced technical skills and resources." As the clips show, AI can produce realistic deepfakes, but having multiple angles of the subject is normally crucial. For the Mona Lisa, the scientists had only the one image to work with, so they relied on an AI trained on a large dataset of human expressions and movements, which it could then map onto an image of another face. Stay tuned while this story develops.