A deepfake is synthetic media — video, image, audio, or text — generated or manipulated by artificial intelligence to portray people saying or doing things they never said or did. The term combines "deep learning," the AI technique that powers them, and "fake." Five years ago, deepfakes were a niche curiosity. In 2026, they are a mainstream fraud vector: the FBI, WEF, and Gartner now treat deepfake-enabled identity attacks as a top-tier enterprise risk.
This guide is for security leads, compliance officers, IDV product managers, and anyone who needs a precise, technically accurate definition of what a deepfake is — and why the category matters. We cover the formal definition, the four media types, the difference between deepfakes and "cheapfakes," real-world incidents, and where the technology is heading.
- iProov 2025: just 0.1% of humans can reliably distinguish modern AI-generated content from real content.
- Arup February 2024: $25.6M wired after a video conference of deepfaked executives — the largest publicly confirmed deepfake fraud loss to date.
- Voice deepfakes increased 680% year-over-year in 2024 (Group-IB) and remain the highest-volume enterprise deepfake category.
- CEO fraud now targets at least 400 companies per day using deepfakes, with average losses exceeding $500,000 (Keepnet Labs 2026).
- AI-generated identity forgeries are now convincing enough to defeat selfie checks and liveness tests (Microsoft Digital Defense Report 2025).
- Gartner projects 30% of enterprises will consider standalone identity verification unreliable in isolation by year-end 2026.
What Is a Deepfake?
A deepfake is content — most commonly audio, video, or images — that has been generated or manipulated by AI to closely resemble real people, voices, or events, portraying things that did not actually exist or occur. The defining property is synthesis: a deepfake is not a Photoshopped image or a slowed-down video. It is content produced by a neural network that has learned a target's appearance, voice, or mannerisms from training data, and can produce novel outputs that match those patterns.
Three components are typical of any deepfake:
- A target identity — a real person whose likeness, voice, or behavior is being replicated.
- A generative model — usually a generative adversarial network (GAN), an autoencoder pair, or a diffusion model — trained on data of the target.
- A delivery medium — a video file, an audio call, an image, or, increasingly, a real-time live stream.
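The classic face-swap architecture behind the term can be sketched in a few lines: one shared encoder learns identity-agnostic features (pose, expression, lighting), and two identity-specific decoders render those features back as a particular face. The sketch below is illustrative only: plain numpy, untrained random weights, and toy dimensions chosen for the example, showing the data flow rather than a working model.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(n_in, n_out):
    """Random weight matrix standing in for a trained layer (illustrative)."""
    return rng.normal(0, 0.1, (n_in, n_out))

# Classic face-swap setup: ONE shared encoder, TWO identity-specific decoders.
# The encoder captures identity-agnostic features; each decoder learns to
# render one specific person's face from those features.
IMG, LATENT = 64 * 64, 128          # toy sizes, not real model dimensions
W_enc = layer(IMG, LATENT)          # shared encoder
W_dec_a = layer(LATENT, IMG)        # decoder trained on person A
W_dec_b = layer(LATENT, IMG)        # decoder trained on person B

def encode(img):
    return np.tanh(img @ W_enc)

def decode(z, W_dec):
    return np.tanh(z @ W_dec)

# The swap: encode a frame of person A, decode with person B's decoder,
# yielding B's face with A's pose and expression.
frame_of_a = rng.normal(size=(1, IMG))
swapped = decode(encode(frame_of_a), W_dec_b)
print(swapped.shape)  # (1, 4096)
```

Training (omitted here) is what makes the trick work: both decoders are optimized against the same shared encoder, so latent features transfer between identities.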
The Alan Turing Institute defines a deepfake simply as "AI-generated video, image or piece of audio content that is designed to mimic a real-life person or scene." That definition is broad on purpose: deepfakes range from completely synthetic faces of people who do not exist (NVIDIA's StyleGAN faces), to puppeteered videos of sitting heads of state, to fully cloned voices created from three seconds of audio.
The term itself was coined in 2017 by a Reddit user who posted face-swapped pornography. The technique has since matured from amateur novelty to enterprise-grade fraud tooling. To understand the mechanics behind the term, see our companion guide on how deepfakes are made.
The Four Types of Deepfakes
Deepfake research literature generally divides synthetic media into four functional categories. Each carries different threat models and detection challenges.
Face swap is what most people picture when they hear "deepfake": the source person's face is mapped onto the target's body in a video. The subject keeps their original gestures and expressions but appears to be someone else.
Face reenactment / lip sync preserves the target's identity but puppets their facial expressions, head pose, or lip movements to match a new audio track. This is the technique behind videos of politicians appearing to say things they never said. Modern reenactment models can drive a single still image to produce minutes of fluent speech.
Full-body synthesis generates entire bodies in motion — gait, gestures, clothing — from text prompts or reference video. Models like LTX-2, OpenAI's Sora 2, and Runway Gen-3 produce 4K-quality body video at near-real-time speeds on consumer hardware. This category was largely theoretical in 2023; in 2026, it is widely available.
Voice cloning / synthetic audio generates speech in a target's voice. Modern systems require as little as three seconds of clean reference audio to produce convincing clones. According to Group-IB's analysis, voice deepfakes increased 680% year-over-year in 2024, and remain the highest-volume deepfake category in enterprise fraud. For an audio-specific deep dive, see our voice cloning threat analysis.
A growing category is multimodal deepfakes — synchronized fake video plus fake audio plus fake chat messages, deployed in live calls. The Arup $25.6M fraud (Hong Kong, February 2024) was a multimodal attack: an entire video conference of "executives" was synthetic.
How Deepfakes Differ from Cheapfakes and Traditional Edits
Not every manipulated video is a deepfake. The distinction matters legally, technically, and operationally.
A cheapfake is content edited or recontextualized using ordinary tools — speed adjustments, splicing, mislabeled captions, color correction — without AI generation. The 2019 viral video of Nancy Pelosi slowed to make her appear drunk was a cheapfake, not a deepfake. A traditional edit uses tools like Photoshop or Premiere but does not synthesize new content. A deepfake requires a learned generative model that produces previously nonexistent pixels or audio samples.
This distinction matters because detection methods differ. Cheapfakes are caught by metadata analysis, reverse-image searching, and contextual debunking. Deepfakes require detection of the statistical fingerprints that generative models leave behind — frequency-domain artifacts, temporal inconsistencies, blood-flow patterns invisible to humans. For a full breakdown of detection approaches, see our guide on how to spot a deepfake.
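As a toy illustration of the frequency-domain idea (not a production detector), a crude spectral statistic can be computed with a 2D FFT. Generative upsampling layers have historically left periodic high-frequency artifacts, so an unusual share of energy in the outer spectrum is one weak statistical fingerprint. The function name and cutoff below are illustrative assumptions.

```python
import numpy as np

def high_freq_energy_ratio(image, cutoff=0.25):
    """Fraction of spectral energy beyond `cutoff` of the half-spectrum radius.

    An illustrative heuristic only: real detectors combine many such signals
    with learned models, not a single hand-set threshold.
    """
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Radial distance from the spectrum centre, normalised so 1.0 is the
    # edge of the smaller axis.
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return float(spectrum[r > cutoff].sum() / spectrum.sum())

rng = np.random.default_rng(1)
smooth = rng.normal(size=(64, 64)).cumsum(0).cumsum(1)   # low-frequency-heavy
noisy = rng.normal(size=(64, 64))                        # broadband noise
print(high_freq_energy_ratio(smooth) < high_freq_energy_ratio(noisy))  # True
```

The point of the sketch is the category of signal, not the specific statistic: production detectors look for fingerprints like this across frequency, time, and physiological channels simultaneously.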
A Brief History: From Reddit to $25M Wire Fraud
A short timeline is the fastest way to grasp how the category has evolved.
- 2014 — Ian Goodfellow and colleagues introduce generative adversarial networks (GANs), the foundational architecture for early deepfakes.
- 2017 — A Reddit user posting under the handle "deepfakes" releases face-swapped videos using open-source code; the term enters mainstream usage.
- 2019 — A UK energy firm is defrauded of €220,000 via a voice-cloned CEO call. First publicly documented enterprise voice deepfake fraud.
- 2020–2022 — Diffusion models (DALL·E, Stable Diffusion, Midjourney) replace GANs as the dominant image-synthesis architecture. Quality improves dramatically.
- February 2024 — A finance worker at engineering firm Arup is tricked into wiring $25.6M after a multi-person video call populated entirely by deepfaked colleagues. This becomes the largest publicly confirmed deepfake fraud loss to date.
- July 2024 — Ferrari narrowly avoids deepfake CEO fraud when a senior executive challenges the caller with a personal question the synthetic voice cannot answer.
- 2025 — The U.S. TAKE IT DOWN Act criminalizes non-consensual intimate deepfakes. The EU AI Act's deepfake disclosure provisions take effect.
- 2026 — Open-source video models like LTX-2 generate 4K deepfakes at 50fps on consumer GPUs. iProov's Threat Intelligence Report 2025 finds just 0.1% of humans can reliably distinguish real from AI-generated content. Gartner projects 30% of enterprises will consider standalone identity verification unreliable in isolation by year-end.
The trajectory is clear: the technical barrier to creating a convincing deepfake has collapsed, the volume of synthetic media has grown roughly 16x since 2023, and the financial loss per incident now routinely exceeds half a million dollars, per Brightside AI's 2025 enterprise data.
Why Deepfakes Matter: The 2026 Threat Landscape
Deepfakes are no longer a misinformation curiosity. They are a structural shift in enterprise risk because they target identity — the layer that virtually every authentication, authorization, and verification process ultimately depends on.
The current threat surface includes:
- Executive impersonation and BEC. CEO fraud now targets at least 400 companies per day using deepfakes (Keepnet Labs, 2026). Average loss per successful incident exceeds $500,000; large enterprises lose $680,000 on average. The WEF Global Cybersecurity Outlook 2026 reports that cyber-enabled fraud has overtaken ransomware as the top concern for CEOs.
- Identity verification (IDV) bypass. Deepfakes are used to defeat selfie checks and liveness tests during onboarding. Microsoft's Digital Defense Report 2025 confirmed that AI-driven identity forgeries are now "convincing enough to defeat selfie checks and liveness tests." See our deepfake detection accuracy analysis for benchmarks on what works.
- Synthetic identity fraud at scale. Attackers generate thousands of unique synthetic personas — face, voice, ID document, social-media history — to open accounts that pass automated KYC. AI-generated identities now account for 42.5% of fraud attempts in the financial sector, according to Signicat.
- Deepfake job candidates. The FBI's 2024 Internet Crime Report documented $13M in losses from deepfake-driven remote-job-interview fraud.
- Disinformation and reputational attacks. Synthetic videos of executives announcing false product news, fake earnings statements, or fabricated misconduct have moved markets and triggered regulatory investigations.
The common thread is that all of these attacks bypass technical perimeter defenses entirely. They target human trust, business process, and identity verification — not firewalls or endpoints.
Legitimate Uses of Deepfake Technology
It is worth noting that the underlying technology has substantial non-malicious applications. A 2025 systematic review of 826 peer-reviewed papers on deepfakes found a meaningful subset of research focused specifically on beneficial use cases. The same models that enable fraud also enable:
- Film and entertainment — de-aging actors, posthumous performances, dubbing without lip-sync mismatch.
- Accessibility — voice cloning for ALS patients losing their natural voices, sign-language avatars.
- Privacy-preserving research — synthetic faces in medical and surveillance datasets that protect real individuals' identities.
- Education and training — historical figures "delivering" their own speeches in classroom settings.
- Localization — generating native-quality video and audio in dozens of languages from a single source.
The technology is dual-use. The detection challenge is therefore not to ban the underlying capability, but to authenticate origin, consent, and provenance — which is increasingly what regulation (EU AI Act, TAKE IT DOWN Act) is targeting.
Common Misconceptions
A few mental models still circulate that no longer match the 2026 reality.
"Humans can spot deepfakes if they look carefully." False. iProov's 2025 study found that just 0.1% of participants could reliably distinguish real from AI-generated content. The unnatural-blinking and weird-teeth heuristics that worked on 2019-era deepfakes do not work on 2026-era models. Human perception has been overtaken.
"Deepfakes require huge GPU clusters." False. A single consumer GPU (RTX 4090 or 5090) can generate near-real-time 4K deepfake video. Voice cloning runs on a laptop. The barrier to creation has collapsed; only the barrier to high-volume industrial production remains.
"Watermarking will solve this." Partially. C2PA and similar provenance standards help authenticate genuine content, but they do nothing about deepfakes generated by adversarial actors who simply do not watermark. Provenance is necessary but not sufficient.
"Deepfakes are mostly a misinformation problem." Outdated. In 2024-2026, financial fraud, IDV bypass, and synthetic identity attacks have become the dominant use cases by volume and dollar value.
"Deepfake detection is unreliable, so why bother?" This conflates two things. Lab-benchmark accuracy on academic datasets does diverge from production accuracy on adversarial real-world content. But that gap is precisely what enterprise-grade detectors are built to close — see deepfake detection accuracy in production for the data.
What Comes Next
The trajectory through 2027 points in three directions.
First, real-time deepfakes are now operationally feasible. Live face-swap and voice-clone systems run with sub-100ms latency on consumer hardware. This forces detection from "post-hoc forensic" into "in-call streaming" — the same shift that anti-virus made when malware moved from disk to memory.
Second, multimodal attacks will dominate. Single-channel deepfakes (audio-only, video-only) are easier to spot than synchronized audio + video + chat + email campaigns. Defending against multimodal attacks requires correlation across signals, not isolated checks.
Third, explainability will move from a research nice-to-have to a regulatory requirement. Compliance officers, courts, and insurance carriers increasingly need to understand why a piece of media was flagged. Black-box "trust us" detection is becoming insufficient. Our explainer on explainable AI in deepfake detection covers what XAI looks like in practice and why the major IDV buyers are now writing it into RFPs.
DuckDuckGoose's DeepDetector is built around the principle that a detection result without an explanation is operationally incomplete — a security analyst, a compliance officer, and a court of law all need to know which artifacts drove the verdict, not just the verdict itself.
FAQ
What is a deepfake in simple terms? A deepfake is a video, image, audio clip, or piece of text generated or manipulated by AI to make it look or sound as if a real person said or did something they never did. It differs from ordinary editing because the content is synthesized by a neural network, not just cut, pasted, or filtered.
Are all AI-generated images considered deepfakes? Not quite. The term "deepfake" historically refers to synthetic media that mimics a specific real person or scene. A purely fictional AI-generated landscape is AI-generated content (AIGC) but is not usually called a deepfake. AI-generated images of real people, however, are deepfakes by most working definitions.
Are deepfakes illegal? It depends on jurisdiction and use. In the U.S., the TAKE IT DOWN Act (signed May 2025) criminalizes the non-consensual creation or distribution of intimate deepfakes. The EU AI Act requires disclosure when synthetic media is used. Forty-six U.S. states have enacted deepfake-specific laws. Creating a deepfake for entertainment, satire, research, or with the subject's consent is generally legal in most jurisdictions.
What's the difference between a deepfake and a cheapfake? A deepfake is generated or substantially manipulated by an AI model trained on the target's likeness or voice. A cheapfake is content edited using conventional tools (slowing, splicing, mislabeling) without AI synthesis. Both can be deceptive; only deepfakes require AI-based detection methods.
Can humans reliably spot deepfakes? No, not anymore. iProov's 2025 study found only 0.1% of people could reliably distinguish real from synthetic content. Older heuristics like "look for weird blinking" or "check the teeth" no longer work on modern diffusion-based generators. Detection needs to sit in the system, not with the user.
How long does it take to make a deepfake? Voice cloning: seconds, from as little as three seconds of source audio. Face-swap video: minutes to hours, depending on quality and length. Real-time live deepfakes: zero — they are generated as the call happens. The technical barrier has effectively collapsed.
What is the most expensive deepfake fraud incident on record? The Arup case in February 2024, in which a finance employee in Hong Kong authorized 15 wire transfers totaling $25.6 million after a video call populated entirely by deepfaked executives. Hong Kong Police confirmed the incident, and Arup publicly acknowledged the loss in May 2024.
How can my organization defend against deepfakes? Three layers: (1) process — out-of-band verification for any high-value transaction or sensitive request, regardless of how convincing the source appears; (2) detection — automated deepfake detection in IDV pipelines, contact centers, and video conferencing; (3) training — staff awareness that visual and auditory evidence is no longer sufficient on its own. See our guide on how to spot a deepfake for practical playbooks.
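The process layer above can be reduced to one policy rule: no high-value action proceeds on the strength of a single channel, however convincing. A minimal sketch of that rule follows; the threshold, function name, and channel labels are hypothetical and would be set by your own risk policy.

```python
# Hypothetical policy sketch: gate high-value actions on out-of-band
# confirmation, regardless of how convincing the requesting call appeared.
OUT_OF_BAND_THRESHOLD = 50_000  # illustrative limit, set per risk policy

def approve_transfer(amount, requested_via, confirmed_via):
    """Approve only if large transfers were confirmed on a SECOND channel.

    `requested_via` / `confirmed_via` are channel names, e.g. "video_call" or
    "known_phone_number". A deepfaked video call cannot also answer a callback
    to a number pulled independently from the employee directory.
    """
    if amount < OUT_OF_BAND_THRESHOLD:
        return True
    return confirmed_via is not None and confirmed_via != requested_via

print(approve_transfer(25_600_000, "video_call", None))                 # False
print(approve_transfer(25_600_000, "video_call", "known_phone_number")) # True
```

The Arup-scale request in the example is blocked not because anyone spotted the fake, but because the process never trusted a single channel in the first place.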
Defending against deepfakes starts with understanding what you're up against. DuckDuckGoose's DeepDetector provides explainable, production-grade deepfake detection for IDV providers, contact centers, and enterprises. Request a demo to see how detection fits into your pipeline.
Last update: Q2 2026.