
DeepPhishing: The New Weapon in Cybercrime

by Joshua McKenty

The global wire transfer system processes roughly 45 million messages per day, moving well over $10 trillion. Combined with other money transfer mechanisms, including cryptocurrency payments, ACH, and country-specific platforms, it is easy to see how phishing attacks on financial transactions have become one of the largest and fastest-growing criminal enterprises in the world. The use of real-time audio and video deepfakes (what we are calling DeepPhishing) represents a dramatic escalation in this threat, one for which most financial professionals are unprepared.

In February of this year, Interpol announced that the Hong Kong offices of British design firm Arup were the victims of a $25.6 million DeepPhishing attack. These attacks have increased by 3,000% in the past year, and are a major driver of the increasing cybercrime losses reported globally. The Arup attack is a textbook example of the new playbook for AI-powered financial fraud.

The new playbook

Fraud is older than finance itself. In recent years, the category of "Business Email Compromise" (BEC) has increasingly become a misnomer, as attackers have moved from email phishing to text messaging over SMS, iMessage or WhatsApp. Now, with the help of generative AI technologies, they are combining these messages with realistic audio and video impersonations. These attacks may impersonate internal team members (such as the CFO or CEO), banking partners, external financial advisors, even vendors and customers.

This escalation is more dangerous than it might appear. We have been grappling with email for decades, and we have enterprise protection tools available. Direct messaging, by contrast, is immediate and personal, and because most organizations have embraced a "bring-your-own-device" approach to smartphones, we have very few tools available to protect us. But the bigger risks are in audio and video.

Audio is instinctively persuasive because it triggers recognition, which is a limbic response, not a cognitive one: Our nervous system reacts to the familiarity of someone's voice before we have any opportunity to engage our critical faculties. Our visual assessment is faster still: Unconscious assessment of trustworthiness occurs in a small fraction of a second, and strongly influences our decision making. Whether we like it or not, we are biologically wired to "trust the evidence of our eyes".

"(1) people cannot reliably detect deepfakes and (2) neither raising awareness nor introducing financial incentives improves their detection accuracy."

Vulnerability is increasing

Thanks to a global pandemic and the successful spread of smartphones, we are now reachable by Zoom in every corner of the world. This has led to a culture that demands "exceptions" to the normal processes and timelines of financial prudence: Vacations, hybrid work, employee turnover and a new generation of managers with a digital-first mindset have increased the expectation of fast-turnaround, one-off transactions. What could be more innocuous than a Zoom call from your CFO at the beach on holiday?

With a systemic increase in social media scraping, large-scale data breaches and identity theft, attackers have all the data they need to craft high-credibility emails and text messages. (They know your boss's name, office location, favourite restaurant, spouse, kids, dog's name, etc.) Using AI, these attacks are completely automated, at scale. But most importantly, the task of the email has gotten much simpler: Attackers don't need to get you to approve a fraudulent transaction in an email; they just need to get you to agree to a phone or video call.

Attacks are proliferating

The tools being used to craft real-time video and audio impersonation have been around since 2019. But they used to be hard to use, requiring custom software development skills, specialized equipment and expensive computing resources. More importantly, each attack would take weeks or months to prepare, so targets were individually selected. For an attack to be worthwhile, it needed to net millions, if not tens of millions, of dollars.

Now, these attacks are fast, easy and cheap. As the barriers to access are removed, the number of criminal teams that can pivot into this market is going up. If that weren't enough, here is one more driver: AI-powered translation is now flawless; attackers no longer need to have a grasp of the language of their targets.

We’re doing the wrong things to protect ourselves

Each generation of warfare has been characterized by improvements over the previous generation, in both attack and defense. But some of those improvements have been incremental, and others have been disruptive: As arrows became sharper, armor became tougher. But with the shift from bows to pistols, wearing armor was no longer enough. Our defense against phishing has always been education and vigilance: Double-check the sender, and don't click any links unless you recognize the URL. With DeepPhishing, training no longer works: No one can be trained to spot deepfakes.