After returning from SXSW last year, our CSO Andrew and CEO Luke had lots of ideas about the future of digital and where things were headed for Redweb as an agency. One of the main talking points during their recap of the conference was chatbots.
Gazing into the future
Chatbots have seen a resurgence in popularity and use over the past few years, thanks in part to improvements in AI and an abundance of available platforms. As a result, brands and businesses have been adopting them en masse for various applications – from hailing an Uber to ordering a pizza from Domino’s.
Armed with ideas from SXSW, Andrew discovered Adobe Fuse. After experimenting with it briefly, he approached us with a question:
‘Can we use Adobe Fuse to build an interface for a chatbot?’
We weren’t entirely sure – but undertook his challenge nonetheless.
Okay, where do we start?
We began by investigating and planning out what we’d actually need in order to build an interface for a chatbot. With that in mind, we set out smaller goals to work towards. Our first port of call was to get to grips with Adobe Fuse and its capabilities.
Fuse is a 3D character-creation tool designed to be easy to use while still producing great results. It also includes an accompanying web-based service that allows users to automatically rig and produce animated versions of the characters. It’s basically an all-round solution that ties in nicely with Adobe’s Creative Cloud products.
We spent time exploring Fuse and its capabilities, concluding that in order to lip sync the characters, we’d need complete control over the motion and animation of the model. Thankfully, Adobe provides rigs for industry-standard 3D modelling and animating applications, so we could bring our Fuse model into one of those and gain more control over it.
We settled upon using Autodesk Maya as our modelling and animation tool – it’s a tried-and-tested product that has been successfully used by visual effects and animation studios behind films such as Star Wars and Frozen. Maya offers a broad range of features that make animating and lip syncing easier, and we found that the rig provided by Adobe for integrating Fuse characters into Maya was the most comprehensive of those available.
Getting a Fuse character to manually lip sync to a soundbite was relatively easy once we’d got all of the individual components configured and set up. Getting it to look good, however, was another matter. We had to iterate a lot and continuously tweak the movements that our character was making to make them look natural.
One of the biggest problems we faced with lip syncing a Fuse character was the ‘uncanny valley’ – the unease people feel when a humanoid character looks almost, but not quite, human, often because its mouth movement and facial features are slightly off. When we showed our character to people, the general reaction was “This looks creepy” or “This looks weird”.
Thankfully, we had an idea in mind from when the project was first suggested to us…
Your solution was a dog?!
Our solution came to us in the form of a dog! No, seriously. Andrew gave us an overview of where this could be used to drive user engagement for one of our design and build clients. This inspired us to use a dog as our character rather than those generated in Adobe Fuse.
Our decision to use a dog not only gave us a new challenge of creating a dog that could be lip synced to audio files, but also offered the opportunity to nickname the project after our resident dog, Oakley.
Producing the talking dog was an entirely different beast. We spent time exploring modelling, rigging and texturing – things that Adobe Fuse basically automates. Once we had sourced a model, we had to look at rigging it to move and lip sync. This was a challenging process, but Maya’s ‘blendshapes’ feature made creating a lip sync rig much easier.
After modifying the model and rigging it, it didn’t take long to lip sync it to an audio file in a similar fashion to the Fuse model, and this time it went much more smoothly than our first attempts in Fuse.
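For anyone wondering what a blendshape actually does, here’s a minimal sketch of the general idea in Python. It isn’t Maya’s implementation or our rig: the `blend` function and the two-vertex ‘jaw’ are made up purely for illustration. The principle is that every vertex slides in a straight line from a neutral pose towards a sculpted target pose, controlled by a single weight between 0 and 1.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def blend(base: List[Vertex], target: List[Vertex], weight: float) -> List[Vertex]:
    """Move every vertex from the base pose towards the target pose.

    weight 0.0 gives the base pose, 1.0 gives the target pose.
    Values outside that range are clamped.
    """
    if len(base) != len(target):
        raise ValueError("base and target must have the same vertex count")
    w = max(0.0, min(1.0, weight))
    return [
        tuple(b + w * (t - b) for b, t in zip(base_vertex, target_vertex))
        for base_vertex, target_vertex in zip(base, target)
    ]


# Hypothetical two-vertex "jaw": a closed pose and a fully open pose.
jaw_closed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
jaw_open = [(0.0, -2.0, 0.0), (1.0, -2.0, 0.0)]

# A weight of 0.5 puts the jaw halfway between closed and open.
half_open = blend(jaw_closed, jaw_open, 0.5)
print(half_open)
```

Animating the weight over time is what makes the mouth appear to move, so a lip sync rig largely comes down to deciding which weights to set and when.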
That’s cool, now make it real-time
Now, it’s all well and good producing a model that can be lip synced. But for usage as a chatbot interface, it needs to:
- Operate in real time
- Be dynamic (eg audio snippets can change and it can still maintain a form of lip sync)
- Be multi-platform
- Pull content from the internet, or a web server
This was our final goal. It was quite ambitious compared to our previous objectives, but we were up for the challenge.
We began by exploring how to run the chatbot in real time, settling on Unity as our solution. It’s best known as a game engine – but it’s increasingly used in installations, augmented reality applications and VR tours. It’s also very scalable and runs on pretty much every platform (including in a browser, to an extent).
We also used extensions to the Unity platform to provide lip syncing – and SALSA was a big help in realising our ambitions. SALSA runs in real time and adapts as the audio changes. It also provides support for all the major platforms supported by Unity, as well as the ‘blendshapes’ Maya produces. This made it easier to transition between the editor and the real-time engine.
Using Unity, SALSA and our existing assets, we created a real-time, lip-syncing dog that could be driven by a microphone, a soundbite or a text-to-speech engine. It seemed like the be-all and end-all solution to the challenge we’d been set.
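To give a flavour of how audio can drive a mouth in real time, here’s a rough Python sketch of one common approach: measure how loud each short slice of audio is and use that as the mouth-open weight. This is an assumption-laden illustration rather than SALSA’s actual code or the C# we wrote in Unity; the function names, `gain` and `smoothing` parameters are all hypothetical.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square loudness of a slice of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_weights(
    audio: Sequence[float],
    frame_size: int,
    gain: float = 1.0,
    smoothing: float = 0.5,
) -> List[float]:
    """Turn raw audio samples into one mouth-open weight (0 to 1) per frame.

    Each frame's loudness sets a target weight, and the mouth moves only
    part of the way towards it so that it doesn't snap open and shut.
    """
    weights = []
    current = 0.0
    for start in range(0, len(audio), frame_size):
        frame = audio[start:start + frame_size]
        target = min(1.0, rms(frame) * gain)
        current += smoothing * (target - current)
        weights.append(current)
    return weights


# Hypothetical clip: silence, a burst of sound, then silence again.
clip = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0] * 4
weights = mouth_weights(clip, frame_size=4)
print(weights)
```

Because it only looks at the audio it’s been given so far, the same loop works whether the samples come from a microphone, a soundbite or a text-to-speech engine.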
Unity wasn’t without its issues, however. It’s very developer-orientated and we found ourselves having to write C# to get some functionality to work. As a team with only basic HTML knowledge, attempting to write C# and wrap our heads around its concepts was quite the level up for us.
So, while what we’ve got now is cool and all, I’ve got a few ideas of where it needs to go to further meet the objectives we set out and become even better. We’re going to continue exploring and building upon it to improve its functionality and capabilities.
It goes without saying, but I’m personally quite proud of what we’ve achieved. It’s been really exciting to head up and produce something that’s innovative and new. I’ve also gained an understanding and insight into disciplines and processes I’d probably never have otherwise encountered in my day-to-day work.
Of course, it wasn’t without its stresses. But looking back, seeing all of the research and effort culminate in something that actually works made it worthwhile.