“The place we end up is where technology is not the problem anymore”
-Image Metrics CES 2009 presentation
Imagine sitting down with a nice cup of coffee and opening your laptop computer in the corner of your
favorite cafe. In a few minutes you've logged on to the virtual world of your choice and been transported to a
friend's island for a meeting of animation geeks like yourself. After clicking a button on the interface
that allows the built-in camera on your laptop to target your facial expressions, you start talking as
Image Metrics software built into the virtual world adds your expressions in real time to your avatar's
face. Meanwhile, all of your friends' avatars have photo-realistic facial animation as well because they
are using the same cutting-edge facial animation software as you are.
Hard to imagine? Yet this is one of the goals of Image Metrics, a nearly 10-year-old animation
company that is arguably THE cutting-edge company for facial animation in 2009. Their marker-less
system of facial animation and analysis wowed the CGI world at SIGGRAPH 2008 with their presentation
of “The Emily Project”. Their stated goal was to create an “exact replica of the actress Emily O'Brien.”
And if you haven't seen the video of their results (which went viral for several months on
youtube.com), you are in for a jolt: the animation is so lifelike and natural that it's virtually
indistinguishable from the original.

Emily Project Video (QuickTime/high quality) Small format in Video Center
Up until about 10 years ago, facial animation existed as a kind of subset of standard physical animation
in CGI. Gary Oldman's face in 1998's Lost in Space and digital doubles for The Matrix series were
stepping stones to the real quantum leap made by Weta Digital when they created extraordinary facial
animation (along with a brilliant performance by Andy Serkis) for the character of Gollum in Peter
Jackson's Lord of the Rings. The success of that technology led to a focus on facial animation that has
grown in leaps and bounds both artistically and commercially.
Facial animation is particularly difficult because unlike general body animation, we (the audience) have
a highly developed ability to read faces and the subtle interplay of muscles and eye movement that
indicates emotional states. Essentially, we are all experts in reading faces, having trained from birth to
do so. That is why we find it so odd when we see a photo-realistic 3D character that has dead
eyes or lip-sync that isn't quite perfect. This phenomenon is called “the uncanny valley,” and bridging
it has been the holy grail of facial animation.
Technology for facial animation has been driven by the success of the LOTR trilogy, but also by the
video game industry cut scene tradition. What has changed in the last several years is that the
technology has become increasingly economical and relatively time efficient. Also, the competition in
the field has grown considerably so that where there were only a few major players as recently as 2000,
there are now 8 major companies developing and promoting their unique facial animation
technology. Companies like Digital Concepts Group, Vicon, Dimensional Imaging, Mova, Alter Ego and Image Metrics have all made major progress in the visual effects industry and in computer game
technology.

Emily Project "Making Of" Video (QuickTime/high quality) Small format in Video Center
Contemporary facial animation tech is centered around either marker-based or marker-less systems,
with the latter having a slight edge because of its ease of production. Image Metrics uses a
marker-less system that starts with the actor's physical performance (usually with a director present),
which is captured on a standard video recorder under reasonably good lighting conditions. Then the
footage is analyzed by IM's proprietary software, which focuses especially on the area around the eyes,
nose and mouth. Then the analyzed performance is transferred to a fully rigged 3D facial model
(usually in the program of choice for the company the work is being done for, e.g. Maya, 3D Studio
Max, etc.), where it is further tweaked and adjusted to match the live actor's facial performance as
closely as possible. Then the animation curves are exported and delivered to the company to be
incorporated into their project.
Image Metrics created “The Emily Project” as a way of pushing the envelope for photo-realistic facial
animation. Their success has not only pushed their company forward as the epitome of the cutting edge in the
field, but it has also inspired other companies to grow even faster. And it has created an atmosphere in
games and film special effects that tells major companies that this technology has arrived as a cost-effective
and time-efficient means to implement top-level facial animation in their projects.
At CES (the Consumer Electronics Show) 2009 in Las Vegas, Image Metrics and AMD presented an
upgraded version of The Emily Project by showing how their facial animation technology could be
adapted to a real-time environment like a video game. The presentation emphasized that the real
technical bottleneck was the CPU, but that AMD, in particular, had arrived at a level of power and
complexity with the central processor that now made this kind of detailed and lifelike facial animation
possible in games. Projecting further, there is a sense that real-time pre-viz for facial animation is a
distinct possibility, as is the virtual world scenario described at the beginning of this article.
And at the Video Game Awards for 2008, Image Metrics teamed up with Epic, the game company that
produced the Gears of War series, and applied their real-time facial animation ideas to a short scene
with one of the GOW2 characters, Augustus “Cole Train” Cole. The results are superb. Take a look at
the 16 second clip presented here.

VGA Clip (QuickTime/high quality) Small format in Video Center (note: clip contains video game violence)
Image Metrics has already created facial animation for games like Grand Theft Auto IV, Unreal
Tournament 3 and Operation Flashpoint 2, but the VGA project was created in association with Epic in
only 3 days, demonstrating that fast, quality facial animation can be added to a game production
pipeline in a very short amount of time.

Image Metrics Demo Reel
(QuickTime) (note: clip contains video game violence)
We are fortunate to have Jay Hosfelt, Lead Animator at Epic, and Peter Busch, Director of
Production at Image Metrics, to talk about how the VGA project came about and to comment on the
future of facial animation in games and in special effects. My thanks to both of them.
A Renderosity exclusive interview by Ricky Grove
Epic Answers: Jay Hosfelt, Lead Animator
Image Metrics Answers: Peter Busch, Director of Production
How did the GOW2 project come about? Why was Image Metrics chosen to create
the facial animation for the clip? Why did you choose the marker-less method
for facial animation?
Epic: The project came about because we were looking for a fast, high-quality facial
animation solution to use for the spot we did for Spike TV’s Video Game Awards. We chose
Image Metrics’ marker-less solution because it allows us to choose the fidelity of our capture,
so we could have thousands of points controlling the face. This picks up far more subtle
motion than a marker set would.
Describe the process of putting the clip together and the logistics involved. How
long did the GOW2 project take from start to finish? IM uses the "in-house"
animation program to rig the face for the model; what is the software that Epic
uses for GOW2?
Epic: Once we got the audio clip for Cole Train's lines, we had Patrick Downey (the motion
capture actor for Cole in Gears of War 2) come in and act to the lines. Using our 36-camera
Vicon motion capture system, we captured his body motion and imported it into Maya, where
we already had Cole's body/face rig set up. Our next step was to get Cole's rough mocap
motion into UE3 so Greg Mitchell, our Cinematic Director, could set up the shots in Matinee.
While Greg set up the cameras and audio in Matinee, Scott Dossett, an animator, polished up
the motion on Cole using Maya. Once Greg finalized the shots, we imported his cameras
back into the Maya scene with the cleaned mocap data, as it was very important that Image
Metrics could animate to the cameras we had set up in UE3. We sent this file off to Image
Metrics.
Image Metrics: For our portion, we shot the video in Burbank during the VO session on a Thursday
morning. Epic had called in from Raleigh and was on the line with us for the shoot. When Epic got
the performance to a point they liked, we instantly uploaded the HD video to them for approval. They
loved it, and within 20 minutes of completing the shoot in Burbank and having the performance
approved by Epic in North Carolina, we had our team in Santa Monica begin image analysis. We
finished analyzing the video images that day and began re-targeting the animation onto the 3D model
that afternoon. With just two animators working on it, we had a rough first pass of the sixteen
seconds that evening!
The next day Epic sent body data for us to apply to the face we had been working on. We
refined the re-targeting and finessed the facial data to feel seamless with the very large
performance of the body data. We then worked with Epic to fine-tune the performance over
the next two days to give them exactly what they wanted in the time schedule they needed.
Epic: What were your goals for the animation in the clip? Were there any particular
problems or obstacles you had to overcome?
Our goal was to improve upon anything we had done in the past. For this particular project,
we had a very tight deadline, and we wanted the quality to be higher than anything we had
done before. The talent and dedication from the teams at Epic and Image Metrics made the
deadline very easy to achieve and with great results.
Image Metrics: How much work do the animators do after you've used your
proprietary software to "read" the live action video? Is it an art in itself to re-work
the data? Do you have to do much clean up?
Our data is surprisingly clean and delivers results true to the actor’s performance in the
sound booth. Our “clean-up” boils down to several things. The first thing is finessing the
character to give a realistic performance when the character cannot be “posed” or when the
character does not have the true facial elasticity or expressiveness that the actor does. The
second is taking any notes from Epic that differ from that performance and making it
seamless with our re-targeted animation to fulfill the director’s decisions in the final product.
And the final part would be integrating and correcting certain expressions and eye lines to
match the body animation.
Epic: Did you use the original actor who voiced the "Cole Train" character when you
shot the video for Image Metrics? Did you have a director there to work with the
actor? Or did IM take care of this part?
Yes, we used the talented Lester Speight who has always brought life to Cole Train.
Image Metrics: Since most people are very good at reading faces (we do it every
day), how does the Image Metrics software interpret such subtle movements of the
eyes, nose and mouth without markers? After watching the "Emily" project
(amazing work), there still seemed to be some development for the eyes (imo).
What challenges do you face in overcoming the "uncanny valley" effect for facial
animation?
Since 2000 our team of Ph.D. computer vision scientists in Manchester, England have been
developing and improving not just our tracking algorithms, but toolsets for us to more effectively use
them. We run our video data through a video analysis process that looks at every pixel of all the
areas of the face: the mouth, the lips, the tongue, the eyes, the pupils, the cheeks, the eyebrows, the
jaw, well, everything. The tracking algorithms are so good, we can define and track all the areas of
the face, from gross shapes down to pore-level detail, without any markers.
That said, we are always working to make every project better than the last; but in this particular
case we were helped by the fact that Cole was stylized and addressed the camera directly. Our brains
know this character is not real, so it allows the humanity we add to shine through that much more.
There are many challenges we face in consistently pulling ourselves out of the uncanny valley. The
first is creating characters that look and have the ability to move like real people. Then there are the
hurdles that are not necessarily within our control, such as proper lighting, rendering, compositing,
realistic environments, and processing power (just to name a few). Because of this we have partnered
with many of the top visual effects studios, games studios, and research institutions so we can better
learn how our animation can be a flawless cornerstone in building our way out of the uncanny valley.
Epic: The original GOW had superb cut scenes and good facial animation. How does the
work that Image Metrics did on the clip compare? How pleased were you with the results?
With GOW, everything was key-framed by hand, or it was auto-generated with FaceFX and then
polished from there. Image Metrics' work brings an entirely new level of fidelity to the motion and
retains the performance of the voice actors. The results we got with the VGA spot from Image
Metrics exceeded our expectations, and we were very pleased.
Image Metrics: Is your work different depending on whether you are working on a game or
CGI in film? I know film is more photo-realistic, but are there other differences in your
experience? Do you think IM's work will move into television?
We do have different “service levels”. These are mostly based on the complexity of the character’s
facial rig and the runtime environment, i.e. game engine or pre-render. There is a huge difference between
a background character in a video game and what Emily can do, both physically in the 3D space and
perceptually to the viewer. Because of this, our technology has been developed to be fully scalable.
We are really hoping to take this work into the realm of television because we can create convincing
characters, from realistic to cartoony, all within the tight time schedules television studios demand.
Epic: Do you see the game industry as a whole adopting more subtle and specific
facial animation in the future? Do you want to move to more photo-realistic
animation in general?
The game industry is slowly moving past a lot of the obstacles that once limited what we
could do with facial animation. Games in the future will certainly start using tools and
services that were only used on movie projects in the past. We at Epic are striving to create
characters that give good performances whether they're really stylized or more
photorealistic.
Epic: Do you plan on using Image Metrics in the future for cut-scenes in a full
game?
We will certainly explore that avenue in the future.
**All videos are in QuickTime format. Get QuickTime player here.
All supporting images and video are copyrighted and cannot be
copied or reproduced in any manner without written permission.
Ricky Grove [gToon], Staff Columnist with the Renderosity Front Page News. Ricky Grove is a bookstore clerk at the best bookstore in Los Angeles, the Iliad Bookshop. He's also an actor and machinima filmmaker. He lives with author Lisa Morton and three very individual cats. Ricky is into Hong Kong films, FPS shooters, experimental anything and reading, reading, reading. You can catch his blog here.