Reverse engineering Gemini's SynthID detection

TL;DR Highlight

A project has been released that detects and removes SynthID, an invisible watermark inserted by Google Gemini into AI-generated images, using only signal processing and spectral analysis. This is controversial as it demonstrates vulnerabilities in AI-generated image identification technology.

Who Should Read

Developers interested in AI-generated image detection or the security of digital watermarking, and researchers or security engineers who need to evaluate how content authentication technologies like SynthID can be bypassed.

Core Mechanics

This project reverse-engineered Google's SynthID watermarking system. It analyzes the invisible watermark embedded in all images generated by Gemini using pure signal processing and spectral analysis, without access to the proprietary encoder/decoder.
The project claims to have successfully identified SynthID watermarks with 90% accuracy using a custom-built detector, but this figure is based on their own detector and has not been verified by Google's official SynthID detector.
A key finding is that SynthID watermarks have a 'resolution-dependent carrier frequency structure'. That is, the watermark is embedded in different frequency bands depending on the image resolution.
They developed a removal technique called 'V3 Multi-Resolution Spectral Bypass', claiming a 75% reduction in carrier energy, a 91% reduction in phase consistency, and a PSNR (a measure of how closely the modified image matches the original) of 43dB or higher. At that level, the modified image is almost indistinguishable from the original to the human eye.
It is distributed as a CLI tool installable via pip, with options like 'aggressive' and 'maximum' for setting the watermark removal strength, effectively making it a turnkey watermark removal tool.
The community collection method is unique: they generate reference images by pasting pure black (#000000) or white (#FFFFFF) images into Gemini (Nano Banana Pro) and prompting 'reproduce it exactly', then use these images to extract watermark patterns.
The README and code quality drew many criticisms. The README displays a '90% detection rate' badge that rests on only 88 verification images, and the repository has no CI and no test suite. The code example also mixes two import styles, which could cause an ImportError.
One commenter reported that simply downscaling and then upscaling an image removes the watermark. If the watermark is that fragile, the practicality of SynthID itself is in question.
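
The idea of extracting carrier frequencies from a set of flat reference images can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the function names (`mean_log_spectrum`, `top_peaks`, `synthetic_reference`) are hypothetical, and the "watermark" is a synthetic cosine planted at a known frequency, because the real SynthID pattern is not reproduced here. Averaging spectra over many content-free images suppresses random noise and leaves any consistent periodic component standing out as a peak.

```python
import numpy as np

def mean_log_spectrum(images):
    """Average the log-magnitude spectrum over same-sized grayscale images."""
    acc = None
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        img = img - img.mean()  # remove the DC term so it does not dominate
        mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
        log_mag = np.log1p(mag)
        acc = log_mag if acc is None else acc + log_mag
    return acc / len(images)

def top_peaks(spectrum, k=2):
    """Return the k strongest bins as (fy, fx) offsets from DC, in cycles per image."""
    h, w = spectrum.shape
    flat = np.argsort(spectrum, axis=None)[::-1][:k]
    ys, xs = np.unravel_index(flat, spectrum.shape)
    return [(int(y - h // 2), int(x - w // 2)) for y, x in zip(ys, xs)]

def synthetic_reference(size=64, fy=5, fx=12, amp=2.0, seed=0):
    """Hypothetical stand-in for a flat reference image: planted cosine plus noise."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    carrier = amp * np.cos(2 * np.pi * (fy * y + fx * x) / size)
    return carrier + rng.normal(0.0, 1.0, (size, size))

refs = [synthetic_reference(seed=s) for s in range(8)]
peaks = top_peaks(mean_log_spectrum(refs), k=2)
print(peaks)  # the planted frequency and its mirror image
```

A real-valued image always produces each peak together with its mirror at the negated frequency, so the two strongest bins form a symmetric pair. Peak positions are expressed in cycles per image, so in this sketch a separate reference set would be needed for each resolution.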

Evidence

Commenters speculated that Google may actually operate two watermarks: a 'loose' version provided as a public oracle and another held internally for law enforcement requests. It was also pointed out that Google is likely storing all generated images (or their neural hashes) linked to accounts.

The project was criticized for testing only with its own detector and not verifying results with Google's official SynthID detection app, so it does not prove that it can actually fool Google's detection system. One comment pointed out that ground truth could be obtained by reverse-engineering the network request and calling SynthID detection directly, without a browser instance or Gemini access, but this was not done.

The README is heavily criticized for showing traces of AI-generated text. V1 and V2 appear only in tables and diagrams without explanation, the same content is repeated three times across Overview, Architecture, and Technical Deep Dive, and approximately 1,600 words are padded out to look like a paper while lacking actual rigor. Ironically, one comment noted that the misaligned tables suggest the text was written by Claude.

Some commenters were uncomfortable with shipping a pip-installable CLI with 'aggressive' and 'maximum' strength settings for watermark removal while stating 'research purposes'. Ethical concerns were also raised about removing the only means of distinguishing AI-generated images from those created by humans.

Several commenters reported that simply downscaling and then upscaling an image removes the watermark. This led to reactions like 'if it's this easily removed, SynthID wasn't a good solution to begin with'. A similar bypass method was also described in a separate blog post (deepwalker.xyz).
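
Why a downscale/upscale round trip could weaken a frequency-domain watermark can be illustrated with a small experiment. This is a sketch under a loud assumption: the carrier is a hypothetical high-frequency cosine, and nothing here is known to match SynthID's actual frequencies. The helper names `downscale_upscale` and `bin_energy` are made up for illustration. Block averaging acts as a low-pass filter, so it attenuates high-frequency components far more than smooth image content.

```python
import numpy as np

def downscale_upscale(img, factor=2):
    """Block-average downscale by `factor`, then nearest-neighbour upscale back."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor  # crop to a multiple of factor
    img = img[:h2, :w2]
    small = img.reshape(h2 // factor, factor, w2 // factor, factor).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)

def bin_energy(img, fy, fx):
    """Squared FFT magnitude at frequency bin (fy, fx), in cycles per image."""
    spec = np.fft.fft2(img - img.mean())
    return float(np.abs(spec[fy % img.shape[0], fx % img.shape[1]]) ** 2)

size = 64
y, x = np.mgrid[0:size, 0:size]
base = 128.0 + 40.0 * np.sin(2 * np.pi * x / size)             # smooth, low-frequency content
carrier = 1.5 * np.cos(2 * np.pi * (28 * y + 30 * x) / size)   # hypothetical high-frequency carrier
marked = base + carrier
resampled = downscale_upscale(marked, factor=2)

# Fraction of energy removed at the carrier bin, and fraction kept at the content bin.
carrier_reduction = 1.0 - bin_energy(resampled, 28, 30) / bin_energy(marked, 28, 30)
content_retained = bin_energy(resampled, 0, 1) / bin_energy(marked, 0, 1)
print(f"carrier energy reduction: {carrier_reduction:.4f}")
print(f"content energy retained:  {content_retained:.4f}")
```

In this toy setup the carrier bin loses almost all of its energy while the low-frequency content is nearly untouched. Whether the same holds for a real watermark depends on where its energy actually sits, which is why checking against the official detector matters.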

How to Apply

If your service relies on SynthID to detect AI-generated images, be aware that this project claims the watermark can be removed with simple spectral processing, so avoid relying on watermark detection alone. Employ a multi-layered authentication strategy (neural hash + metadata + generation history) instead.
If you are conducting security audits or red team operations, you can use the methodology of this repo (extracting carrier frequency patterns from a set of pure-color reference images) to assess the spectral analysis vulnerabilities of the watermarking systems used within your organization.
If you are designing an AI-generated image verification pipeline, this case suggests prioritizing server-side verification, such as Google account-linked generation history or C2PA (Coalition for Content Provenance and Authenticity) metadata, over client-side watermark detection.
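
The layered strategy above can be sketched as a small decision function. Everything in it is hypothetical: the `Evidence` fields, the `classify` function, and the label strings are illustrative names, not part of any real SynthID or C2PA API. The point of the design is that server-side and provenance signals outrank the watermark, and that a missing watermark is never treated as proof of human origin, since watermarks may be removable.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Evidence:
    """Signals gathered for one image; None means the check was not run."""
    watermark_detected: Optional[bool] = None    # client-side watermark detector
    c2pa_declares_ai: Optional[bool] = None      # validated provenance metadata
    neural_hash_match: Optional[bool] = None     # match against stored generation hashes
    generation_log_match: Optional[bool] = None  # account-linked generation history

def classify(e: Evidence) -> str:
    # Server-side and provenance signals are harder to strip from the image itself.
    if e.generation_log_match or e.neural_hash_match or e.c2pa_declares_ai:
        return "ai_generated"
    # A watermark hit alone is supporting evidence, not the final word.
    if e.watermark_detected:
        return "likely_ai_generated"
    # No positive signal: absence of a watermark does not imply human origin.
    return "unknown"

print(classify(Evidence(watermark_detected=False)))
print(classify(Evidence(watermark_detected=False, generation_log_match=True)))
```

A production pipeline would also need to handle conflicting signals and detector false positives, which this sketch deliberately leaves out.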

Code Example

# Repo basic usage (based on README)
pip install huggingface_hub

# Download reference images
python generate_references.py

# Detect watermark
python src/detect.py --image path/to/image.png

# Remove watermark (V3 Multi-Resolution Spectral Bypass)
python src/remove.py --image path/to/image.png --strength aggressive
# strength option: normal | aggressive | maximum

Terminology

SynthID: An AI-generated image watermarking system developed by Google DeepMind that embeds invisible signals into pixels to identify images created by AI.

Carrier frequency: The frequency component that carries the watermark signal. Just as radio broadcasts send voice on a specific frequency, image watermarks hide information in a specific frequency band.

PSNR: Abbreviation for Peak Signal-to-Noise Ratio, a metric that expresses the difference in image quality between the original and modified images in dB. 40dB or higher is considered almost indistinguishable to the human eye.

Spectral analysis: A technique for analyzing which frequency components exist, and how strongly, by converting an image or signal into the frequency domain (usually with a Fourier transform). Edges, textures, and patterns in images show up as specific frequencies.

Neural hash: A fixed-length hash value derived from the semantic features of an image using a neural network. Similar hash values are generated for the same image even if it is slightly modified or resized.

C2PA: Abbreviation for Coalition for Content Provenance and Authenticity, an industry standard specification for recording and verifying metadata about where, when, and by whom content was created.
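
For reference, the PSNR figure quoted for the removal tool can be checked independently with the standard definition, PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. The sketch below is a generic implementation of that formula, not code from the repository, and the function name `psnr` is an illustrative choice.

```python
import numpy as np

def psnr(original, modified, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    original = np.asarray(original, dtype=np.float64)
    modified = np.asarray(modified, dtype=np.float64)
    mse = np.mean((original - modified) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy check: every pixel of an 8-bit image is off by exactly one gray level (MSE = 1).
original = np.full((8, 8), 100, dtype=np.uint8)
modified = original + 1
value = psnr(original, modified)
print(f"{value:.2f} dB")
```

On 8-bit images, 43dB corresponds to an MSE of about 3.26, or a root-mean-square error of roughly 1.8 gray levels per pixel.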