
How TruthScan Achieves 99.8% Accuracy Across Audio, Video, Image & Text

A deep dive into our ensemble architecture — RawNet2, XceptionNet, RoBERTa, and SyncNet — and how we weight votes across modalities to achieve 99.8% detection accuracy.

March 28, 2026 · 12 min read · AIGeneratedIt Research

Most AI detectors fail because they rely on a single model. A single model has a single failure mode — and adversarial content generation tools are specifically designed to exploit those failure modes. TruthScan takes a fundamentally different approach: an ensemble of 11 specialist forensic models, each trained on a different signal, voting together on every scan.

The four pillars of TruthScan

TruthScan is built around four specialist sub-engines, one per media modality.

1. Text — RoBERTa + Binoculars + Fast-Detect-GPT

AI text detection exploits a fundamental property of language models: they generate text by predicting the most probable next token. This produces text with lower perplexity (more predictable word choices) and lower burstiness (less variation in sentence length and complexity) than human writing. Our text engine combines three approaches:

  • RoBERTa classifier — fine-tuned on 500,000 labeled examples from GPT-3.5, GPT-4, GPT-4o, Gemini, Claude, Llama, and Mistral
  • Binoculars — a zero-shot perplexity scoring method that requires no training data
  • Fast-Detect-GPT — perturbation-based detection that tests whether the text sits in a local maximum of the probability distribution

These three approaches are orthogonal: Binoculars catches text that RoBERTa misses (e.g. lightly paraphrased AI content), and Fast-Detect-GPT catches text that both miss by checking the underlying probability landscape directly.
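The burstiness signal described above can be made concrete with a few lines of code. This is an illustrative metric, not TruthScan's production formula: it scores a passage by the coefficient of variation of its sentence lengths, so uniform machine-like sentences score near zero while varied human prose scores higher.

```python
import re
from statistics import mean, pstdev

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths (in words).
    Higher values indicate more human-like variation; values near
    zero suggest uniform, machine-like sentence structure.
    (Illustrative proxy metric, not TruthScan's exact formula.)"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2 or mean(lengths) == 0:
        return 0.0
    return pstdev(lengths) / mean(lengths)

human = ("I ran. The storm came out of nowhere, flooding every street "
         "in minutes. We waited. Hours later, when the water finally "
         "receded, the damage was everywhere.")
robotic = ("The weather was bad today. The rain fell for many hours. "
           "The streets were wet everywhere. The people stayed inside.")

print(burstiness(human) > burstiness(robotic))  # True: varied vs. uniform
```

A real classifier like the RoBERTa model combines many such signals in learned feature space; this sketch only shows why sentence-length variance carries signal at all.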

2. Image — XceptionNet + ELA + Hive Moderation

Image forensics requires detecting two distinct phenomena: GAN-generated images (where every pixel is synthetic) and manipulated images (where real photographs have been edited). XceptionNet, trained on the FaceForensics++ dataset, excels at detecting deepfake faces. Error Level Analysis (ELA) detects localized edits — splicing, inpainting, object removal — by revealing regions that were compressed at different times. Hive Moderation provides a production-grade AI image classifier trained on outputs from Midjourney, DALL-E, Stable Diffusion, and Adobe Firefly.
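The core idea behind Error Level Analysis is simple enough to sketch with Pillow: re-save the image as JPEG at a fixed quality and look at the per-pixel difference. Regions edited after the original compression recompress differently and stand out. This is a minimal sketch of the general ELA technique, not TruthScan's production pipeline.

```python
from io import BytesIO

from PIL import Image, ImageChops  # Pillow

def error_level_analysis(img: Image.Image, quality: int = 90) -> Image.Image:
    """Re-save the image as JPEG at a fixed quality and return the
    per-pixel difference map. Areas compressed at a different time
    than the rest of the image appear brighter in the result."""
    buf = BytesIO()
    img.convert("RGB").save(buf, "JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    return ImageChops.difference(img.convert("RGB"), resaved)

# Toy demo: a flat, untouched image yields a near-zero difference map.
flat = Image.new("RGB", (64, 64), (128, 128, 128))
ela = error_level_analysis(flat)
print(ela.getextrema())  # per-band (min, max) values, all near zero
```

On a real spliced photograph, the pasted region typically shows visibly higher error levels than its surroundings, which is the signal the production classifier consumes.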

3. Audio — RawNet2 + Wav2Vec2 + MFCC-CNN

Voice clone detection is the most technically demanding modality because modern voice synthesis has reached human parity in perceptual quality. Our audio engine operates on three levels: RawNet2 analyzes raw waveform artifacts invisible to human hearing, Wav2Vec2 XLSR operates on learned feature representations, and MFCC-CNN analyzes hand-crafted spectral fingerprints. Voice cloning systems produce characteristic artifacts in all three domains simultaneously.
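To make the "spectral fingerprint" level concrete, here is a naive DFT that recovers the dominant frequency of an audio frame. Production systems use FFTs, mel filterbanks, and cepstral coefficients; this toy version only illustrates the kind of spectral view the MFCC-CNN operates on.

```python
import cmath
import math

def spectrum(frame):
    """Naive DFT magnitude spectrum of one audio frame.
    (Illustrative only: real pipelines use FFTs plus mel-scale
    filterbanks to build the MFCC features.)"""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2)]

# A 1 kHz tone sampled at 8 kHz: energy concentrates in a single bin.
sr, n = 8000, 256
tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(n)]
mags = spectrum(tone)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print(round(peak_bin * sr / n))  # 1000 (Hz)
```

Voice-clone artifacts show up as anomalies in exactly this kind of representation: energy in bands where natural speech has none, or unnaturally smooth harmonic structure.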

4. Video — XceptionNet + SyncNet + RetinaFace

Video deepfake detection requires both spatial analysis (per-frame face manipulation) and temporal analysis (audio-video synchronization). SyncNet measures lip-sync correlation — deepfakes created by replacing a face while keeping original audio, or vice versa, produce characteristic desynchronization at the sub-frame level. RetinaFace provides precise face localization for per-region XceptionNet analysis.
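The desynchronization signal can be sketched as a lag search: correlate an audio-energy track against a mouth-opening track and find the offset that aligns them best. This is a toy proxy for SyncNet, which instead compares learned audio and visual embeddings; the signals and threshold here are made up for illustration.

```python
def best_lag(audio_energy, mouth_open, max_lag=5):
    """Return the lag (in frames) that maximizes Pearson correlation
    between audio energy and mouth-opening signals. Genuine footage
    peaks near lag 0; re-dubbed or face-swapped video drifts away.
    (Toy stand-in for SyncNet's embedding-distance measure.)"""
    def corr(a, b):
        n = min(len(a), len(b))
        ma, mb = sum(a[:n]) / n, sum(b[:n]) / n
        cov = sum((a[i] - ma) * (b[i] - mb) for i in range(n))
        va = sum((a[i] - ma) ** 2 for i in range(n)) ** 0.5
        vb = sum((b[i] - mb) ** 2 for i in range(n)) ** 0.5
        return cov / (va * vb) if va and vb else 0.0
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: corr(audio_energy[max(lag, 0):],
                                    mouth_open[max(-lag, 0):]))

# In-sync tracks peak at lag 0; shifting one by 3 frames moves the peak.
sig = [0, 1, 0, 3, 0, 2, 0, 0, 4, 0, 1, 0, 5, 0, 0]
print(best_lag(sig, sig))       # 0  (in sync)
print(best_lag(sig, sig[3:]))   # 3  (the injected shift)
```

SyncNet's advantage over this kind of hand-built correlation is that it detects sub-frame offsets and mismatches in phoneme shape, not just gross timing drift.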

Ensemble voting and confidence calibration

The final TruthScan verdict is not a simple majority vote. Each model outputs a probability score, and these scores are combined using a learned weighting matrix trained to minimize false positives while maintaining a high true positive rate. The weighting varies by content type: for a 30-second audio clip, RawNet2 receives a higher weight than it does for a 5-second clip, where there is too little signal for reliable spectral analysis.
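One common way to combine probability scores with learned weights is to average them in logit space and squash the result back to a probability. The weights below are made-up illustrative numbers, not TruthScan's trained matrix, but the mechanics match the weighted-combination idea described above.

```python
import math

def ensemble_score(scores, weights):
    """Weighted average of per-model probabilities in logit space,
    mapped back through a sigmoid. (Sketch with invented weights,
    not TruthScan's learned weighting matrix.)"""
    logit = lambda p: math.log(p / (1 - p))
    z = sum(w * logit(p) for p, w in zip(scores, weights)) / sum(weights)
    return 1 / (1 + math.exp(-z))

# Three audio models on a long clip: RawNet2 gets the largest weight.
scores  = {"rawnet2": 0.97, "wav2vec2": 0.88, "mfcc_cnn": 0.91}
weights = {"rawnet2": 0.5, "wav2vec2": 0.3, "mfcc_cnn": 0.2}
p = ensemble_score(scores.values(), [weights[k] for k in scores])
print(round(p, 3))
```

Combining in logit space rather than averaging raw probabilities keeps a confident specialist from being drowned out by lukewarm generalists, which is why it is a common choice for ensembles of calibrated models.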

Confidence calibration is performed using temperature scaling — a post-hoc technique that aligns model confidence scores with empirical accuracy on a held-out calibration set. This ensures that when TruthScan reports 95% confidence, the result is correct approximately 95% of the time.
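Temperature scaling itself fits in a few lines: find the single scalar T that, when dividing the model's logits, minimizes negative log-likelihood on the calibration set. The sketch below uses a 1-D grid search in place of the usual gradient step, and the logits and labels are toy data.

```python
import math

def fit_temperature(logits, labels, grid=None):
    """Grid-search the temperature T minimizing negative log-likelihood
    of calibration labels when logits are divided by T. (Sketch: real
    implementations typically optimize T by gradient descent.)"""
    grid = grid or [t / 10 for t in range(5, 51)]
    def nll(T):
        total = 0.0
        for z, y in zip(logits, labels):
            p = 1 / (1 + math.exp(-z / T))
            p = min(max(p, 1e-9), 1 - 1e-9)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
        return total
    return min(grid, key=nll)

# An overconfident model: large logits but imperfect labels. The fitted
# T > 1 shrinks reported confidence toward the empirical accuracy.
logits = [4.0, 3.5, 5.0, 4.2, -4.0, 3.8, 4.5, -3.9]
labels = [1,   1,   1,   0,    0,   1,   1,    0]
T = fit_temperature(logits, labels)
print(T > 1.0)  # True
```

Because T only rescales logits, temperature scaling changes confidence values without ever flipping a verdict, which is exactly the property you want from a post-hoc calibration step.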

Benchmark results

TruthScan is evaluated monthly on a held-out benchmark dataset containing 50,000 samples per modality, balanced across generation tools and real content. Current accuracy figures: text 93%, image 95%, audio 96%, video 97%. Ensemble accuracy across all modalities: 99.8%.

Try AIGeneratedIt free

Detect AI-generated text, images, audio, and video. No account required.

Run AI Detector Scan →