Future Tech

Lights, camera, AI! Real-time deepfakes coming to DEF CON

Tan KW
Publish date: Mon, 05 Aug 2024, 06:19 AM

Visitors to the AI Village at this year's DEF CON hacker conference will have the chance to star in their own deepfake video simply by standing in front of Brandon Kovacs' camera and watching as he turns them into a digital likeness of a fellow attendee - for a good cause.

Kovacs is a senior red teamer at security biz Bishop Fox, and this won't be his first time creating real-time video and voice clones - and demonstrating how easy it is for criminals to use these techniques to improve social engineering attacks.

In an interview with The Register, Kovacs said a $25 million scam involving a deepfake video call earlier this year caught his attention "and sent me down this rabbit hole of research."

That call was made in February, when a Hong Kong-based finance professional at a multinational bank thought he was seeing and speaking to his London-based chief financial officer. He was instead conversing with a real-time deepfake, which tricked him into making a $25 million wire transfer.

"At the time, I was like, 'Wow, that sucks.' But also, 'How do they do it?' I really admired it from a technical perspective," Kovacs said.

He therefore began looking for someone to clone and recruited his Bishop Fox colleague and DEF CON Social Engineering Capture the Flag champion Alethe Denis to help.

Denis' image and voice are all over the internet because she's appeared in many interviews, podcasts and webcasts, and has spoken at several infosec conferences.

Kovacs decided to test a hypothesis. "Can we successfully clone someone using only public information that's on the internet using open-source tooling?"

The duo then trained machine learning models using publicly available footage of Denis, and paired them with a professional DSLR camera, lens, lighting, wigs, props, a green screen and production software. The result is real-time video that appears to be Denis - but is really Kovacs sitting in what appears to be Denis' home office.

"At one point, in what we dubbed the 'Deepfake Turing Test,' we routed the outputs of the deepfake video and voice as the camera and microphone inputs for Microsoft Teams," Kovacs wrote in a LinkedIn post that includes the video. "I then spoke with her children over a live video call, who believed they were speaking with her mom."

While tricking kids is always fun, the joke falls flat when criminals benefit from similar techniques.

"In the context of the MGM [ransomware] attack, where hackers called the IT help desk, claiming to be someone within the organization trying to reset their password. Now imagine they have that same capability, but they can also sound like that person," Kovacs said.

Creating deepfakes does not require huge resources, for criminals or other users. DeepFaceLab, which allows users to train models and create deepfakes, is a free download. Retrieval-based Voice Conversion (RVC) is an open-source project for training voice models. A consumer-grade graphics card can run about $1,600.

And while Kovacs feels that using high-end studio lighting and a DSLR camera "greatly elevates the authenticity of the scene and swapping of the faces, versus using a standard webcam," he points out that spending "a couple grand on the camera, a lens, studio lighting and everything else, that's still chump change when it comes to stealing $25 million."

Kovacs will bring his "studio in a box" - which includes wigs, lights, a green screen and other equipment - to Las Vegas, and will let DEF CON attendees test it out.

"The idea is that I will transform that person, in real time and make them look like someone else, and then drop them into an interactive environment, or a studio or office setting, to demonstrate what it looks like when this is pulled up in real time," he said.

The video and audio clones Kovacs makes will be used to feed a deepfake detection tool being developed under DARPA's Semantic Forensics program, which will also be demonstrated in the AI Village.

"This is a new adventure for us," said DARPA program manager Wil Corvey.

The Semantic Forensics program, which aims to develop semantic technologies for analyzing media, is four years old. Its work includes creating detection algorithms to determine whether video, audio, images and text have been generated or manipulated. An attribution algorithm that will determine whether media originates from a particular organization or person is in the works too.

"We made a platform that will appear on screen at DEF CON, in the AI Village, which is basically a triage utility, among other things, for video, audio, image and text that have been manipulated or synthesized by SM means," Corvey told The Register. "Deepfake videos are one layer of that, that captures people's imaginations."

"So we're bringing some of our analytics to DEF CON, putting them in this user interface and helping people to understand current workflows for this kind of forensic analysis, and help us, in turn, to understand how we should think about red teaming those so that we have as robust an information triage capability, collectively as a society, as we can," Corvey explained. ®

 

https://www.theregister.com//2024/08/04/realtime_deepfakes_defcon/
