Achieving More Realistic Deepfakes by Making the Image ‘Worse’

Martin Anderson

I'm Martin Anderson, a writer occupied exclusively with machine learning, artificial intelligence, big data, and closely-related topics, with an emphasis on image synthesis, computer vision, and NLP.

Visual effects artists have long known that the key to realism often lies not in enhancing images and video but in degrading them, so that the general effect of the synthesis more closely matches the domain that it’s trying to simulate, using techniques such as Color Lookup Tables (CLUTs).

Color Lookup Tables (CLUTs) are reference files which contain color mappings. These are frequently used to match current footage or images to older film stocks - for visual effects purposes, or for general styling. Older and less capable emulsions hailing back 50-60 years can be simulated in this way, and the effect is generally destructive, in that applying a 'retro' CLUT will downgrade a modern, full-color, full-range image. Source: https://patdavid.net/2013/08/film-emulation-presets-in-gmic-gimp/

At its simplest, this principle is found in online and phone-based photo filters that attempt to emulate the look of film emulsions from bygone eras, or the effects of scratching and poor handling, from the analog age, when photos were more commonly real-world objects, either as prints or reels of celluloid.
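
As a rough illustration of the principle, the destructive effect of a ‘retro’ color mapping can be approximated with a simple per-channel lookup table. The sketch below is a simplification of a true 3D CLUT, assuming NumPy and Pillow, and uses a hypothetical input filename; it lifts the blacks, compresses the highlights and adds a warm cast to mimic a faded emulsion:

import numpy as np
from PIL import Image

def apply_channel_luts(img, luts):
    # Apply a per-channel 1D lookup table (shape: 3 x 256) to an RGB image.
    arr = np.asarray(img.convert("RGB"))
    out = np.empty_like(arr)
    for c in range(3):
        out[..., c] = luts[c][arr[..., c]]
    return Image.fromarray(out)

# A crude 'faded film' curve: lifted blacks, compressed highlights, warm cast.
x = np.arange(256, dtype=np.float32)
base = 30.0 + (x / 255.0) ** 0.9 * 200.0
luts = np.stack([
    np.clip(base * 1.05, 0, 255),   # push red up slightly
    np.clip(base, 0, 255),          # leave green roughly as-is
    np.clip(base * 0.92, 0, 255),   # pull blue down for warmth
]).astype(np.uint8)

degraded = apply_channel_luts(Image.open("portrait.jpg"), luts)   # hypothetical filename
degraded.save("portrait_retro.jpg")

A production CLUT maps full RGB triplets rather than treating each channel independently, but the destructive principle is the same: the ‘aged’ output has less dynamic range and less accurate color than the source.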

There are many hobbyist and professional resources for ‘degrading’ footage to match the visual domain of older or damaged images and video. Source: https://archive.is/ZLT9C

The pursuit of ‘suitably damaged’ output is an active area of interest both for hobbyist video producers and VFX professionals, with a wide range of methodologies, products and tutorials available to help ‘perfect’ footage and pictures replicate the imperfections of time and/or reality.

Therefore, in the area of human facial synthesis, the key to creating a convincing effect can often be to ‘downgrade’ the quality of an image to within the known visual parameters of common streams of (real) output.

Face Values

In general, face replacement and editing systems, such as autoencoder deepfakes, use style transfer to match the faked elements to the background ‘plate’ (the original image, whether a single static image or just one frame of many in a video sequence), retaining elements such as grain and lighting, while replacing the substantive content, such as facial features.

The open source streaming deepfakes package DeepFaceLive applies style transfer, automatically matching not only the facial lineaments of Tom Cruise to the target footage (a webcam), but also the lighting style and general image quality of the target video. Source: https://www.youtube.com/watch?v=GoEwXJxbk8c

However, the difficulty that face-replacement algorithms have in exactly matching the tonality and qualities of a plate can lead to the faked elements having different statistical qualities than the rest of the image. One reason for this is that systems such as DeepFaceLab or FaceSwap do not take into account the entirety of the image when dropping in new content, but effectively ‘crop’ the affected area and calculate the style transfer based on its discrete (rather than contextual) qualities.

The same applies for more recent latent diffusion models, such as Stable Diffusion, which rely on the qualities of the swap area (in the example below, a limited inpainting area restricted to the face) to evaluate a ‘style’ to emulate:

Though integration with the source image varies according to transfer settings and the capability of the generative model, the results are rarely able to combine 100% accuracy of identity with total immersion into the native style of the target picture.

This means that while the fake face might be accurate within the replacement area, the replacement area itself doesn’t necessarily represent the quality and fidelity of the entire photo. Because of this, even a very good deepfaked face, and one that might fool most people, can be picked out as a ‘fake’ element by the latest advances in deepfake detection technologies.
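
The mismatch is straightforward to probe in principle: simple statistics such as brightness, contrast and residual noise energy will often disagree between the swapped region and the frame as a whole. The sketch below, which assumes OpenCV and NumPy and uses a hypothetical filename and face bounding box, illustrates the kind of discrepancy that a statistically-minded detector can exploit:

import cv2
import numpy as np

def region_stats(gray):
    # Brightness and residual (high-frequency) noise statistics for a grayscale region.
    smooth = cv2.GaussianBlur(gray, (0, 0), 2)
    residual = gray.astype(np.float32) - smooth.astype(np.float32)
    return {
        "mean": float(gray.mean()),
        "std": float(gray.std()),
        "noise_energy": float(np.mean(residual ** 2)),
    }

img = cv2.imread("deepfaked_frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical filename
x, y, w, h = 180, 90, 160, 200                                  # hypothetical face bounding box
print("swapped face region:", region_stats(img[y:y + h, x:x + w]))
print("whole frame:        ", region_stats(img))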

StatAttack

Entering into this hot research field is a new academic collaboration between Japan, Singapore and Canada, which claims to be able to ‘inject’ the common flaws of real photos into faked images, by simulating perturbations that are already present in the target image, so that the superimposed content replicates the ‘flaws’ of the original content.

An illustration of the principles of StatAttack – the light blue areas represent the embedding space of real images, the redder areas, fake images. The use of crafted adversarial examples (i.e., perturbations that simulate common shortcomings in real images) brings the statistical placement of the fake embedding over into the 'real' space. Source: https://arxiv.org/pdf/2304.11670.pdf

This has two practical applications: one is that entirely synthetic images, such as those created with Generative Adversarial Networks (GANs) and other types of generative system (including Stable Diffusion), could potentially be made even more realistic with this knowledge.

The second is that AI-based image-editing systems (including deepfakes, which amend existing images and video rather than entirely replacing them) could be able to conceal evidence of changes made to real images, by adding this layer of ‘authentic disruption’ in the final stages of generation.

In tests, the new system performed comparably to other frameworks, and exceeded the performance of several of them; but perhaps the greatest value in the work is that it prominently emphasizes the need for generative output to match the qualities of the target domain, even in cases where that means lowering the quality – and that, in this sense, ‘worse’ is actually better, in regard to verisimilitude. 

The new method is titled StatAttack, and appears in the paper Evading DeepFake Detectors via Adversarial Statistical Consistency, from six researchers spanning Kyushu University, the Center for Frontier AI Research (CFAR), Singapore’s Agency for Science, Technology and Research (A*STAR) and Institute of High Performance Computing (IHPC), Nanyang Technological University, Singapore Management University, the University of Alberta in Canada, and the University of Tokyo.

Three Roads to Realism

Though the authors concede that there are several other factors found in images which could later be included in such a system, theirs concentrates on injecting three commonly-found, real-world perturbations into fake images: exposure, blur, and noise.

Conceptual workflow for StatAttack. The addition of alterations in exposure, blur and noise can lead to misclassification of a fake image as real.

The system first adds these three qualities to faked or amended images, and then calculates the maximum mean discrepancy (MMD) between the feature distributions of the real and the faked images. In this way, the initial perturbations are optimized through loss functions, and a refined set of degradations is eventually generated.
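
MMD is a standard statistical distance between two sets of samples. The minimal NumPy sketch below uses an RBF kernel; the feature matrices are hypothetical stand-ins for embeddings extracted from real and fake images, and the optimization loop that would minimize this discrepancy is omitted:

import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    # Biased estimate of squared MMD between sample sets X (n, d) and Y (m, d),
    # using a Gaussian (RBF) kernel.
    def gram(A, B):
        sq_dists = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
        return np.exp(-sq_dists / (2.0 * sigma ** 2))
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

# Hypothetical feature matrices, e.g. penultimate-layer activations from a detector.
real_feats = np.random.randn(128, 512)
fake_feats = np.random.randn(128, 512) + 0.5
print("MMD^2 (real vs. fake):", rbf_mmd2(real_feats, fake_feats))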

A single row from the extensive samples provided in the paper, comparing StatAttack's adversarial capabilities to rival frameworks (see below).

The paper classifies the functionality of the system as an ‘adversarial attack’, and characterizes the methodology as an exercise in shifting probability distributions:

The MMD values between feature distributions of a set of both real and amended images. The blue curves represent the high-frequency elements of GAN-generated images, and the red natural ones. Here we see the addition of Gaussian blur closing the gap in distributions between the two sets.

In essence, the system is simply observing how real photos are constituted, averaging the perceptual qualities found in both the real and the fake photos, and feeding back that information across the three specified facets of image quality, with the option to add others later.

Though it’s only briefly hinted at in the work, the ‘attack’ specified therein is simply a morally agnostic visual effects procedure, and one that could be useful rather than harmful – for instance, in creating more effective (legitimate and disclosed) synthesis systems. It could be argued that the term ‘adversarial’ is being a little abused here, with its context borrowed from attack frameworks that can have no benign intent, such as adversarial perturbations designed to make vision systems misinterpret road signs.

However, the paper states:

‘Our proposed attack method is capable of generating natural-looking adversarial examples that can transfer across different DeepFake detectors, posing a significant practical threat to the current DNN-based DeepFake detectors. This stealthy and transferable attack method can be employed to evaluate the robustness of Deepfake detectors in real-world applications or improve their robustness through adversarial training.’

The three facets chosen as adversarial perturbations represent common deficiencies in photos – flaws to which we have become so accustomed that we tend to ‘screen’ them out; spectral analysis, however, remains more discerning. The authors observe that GAN-generated or GAN-adulterated images demonstrate a spectral increase in high-frequency components, in comparison to real images:

On the left, an unadulterated source image, and its related spectrogram; middle, the same image amended by the StarGAN framework; rightmost, the tampered image, but with added frequency manipulation from StatAttack, restoring the 'authentic' spectrogram signature of the original image.
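
The kind of spectral check illustrated above can be approximated by measuring the proportion of Fourier-domain energy that lies beyond a low-frequency radius, before and after a mild Gaussian blur. The sketch below assumes OpenCV and NumPy, and uses a hypothetical filename and an arbitrary frequency threshold:

import cv2
import numpy as np

def high_freq_ratio(gray, radius_frac=0.25):
    # Fraction of spectral energy lying beyond a low-frequency radius.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray.astype(np.float32)))) ** 2
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h / 2.0, xx - w / 2.0)
    mask = dist > radius_frac * min(h, w)
    return float(spectrum[mask].sum() / spectrum.sum())

fake = cv2.imread("stargan_output.png", cv2.IMREAD_GRAYSCALE)   # hypothetical filename
blurred = cv2.GaussianBlur(fake, (0, 0), 1.2)
print("high-frequency ratio, fake:    ", high_freq_ratio(fake))
print("high-frequency ratio, blurred: ", high_freq_ratio(blurred))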

The distribution of brightness is another statistical ‘tell’ which StatAttack can address:

On the left, the brightness distribution found in an image from a real face dataset; middle, the brightness found in a fake image generated by ProGAN, where saturation and underexposure are not reproduced effectively enough, resulting in an excessively 'perfect' histogram; right, the addition of StatAttack brightness distribution restores the authentic histogram in the faked image.

Data and Approach

The researchers for the new project have, perhaps confusingly, created two versions of StatAttack, the second being titled MStatAttack; the latter generates more effective and natural-seeming ‘adversarial examples’ (i.e., fakes and edits) at some cost in architectural complexity and processing time. Though we’ll consider this when looking at the results of tests (see below), we will in general address the entire system, and presume that MStatAttack, as the more effective of the two methods, would be the one deployed in real-world scenarios.

The approach for StatAttack’s amendment to the brightness of altered content is based on prior work from many of the same authors as the new paper, which was designed to use adversarial patterns to prevent objects from being identified by image recognition systems.
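
Though the paper’s exposure module follows that earlier work, the general idea of a differentiable, optimizable brightness adjustment can be sketched quite simply in PyTorch. Everything below is illustrative: the gain/gamma parameterization and the placeholder objective are assumptions, not the authors’ formulation:

import torch

def adjust_exposure(img, gain, gamma):
    # Differentiable exposure tweak for images of shape (B, 3, H, W) in [0, 1].
    return (img.clamp(1e-6, 1.0) ** gamma.view(-1, 1, 1, 1)) * gain.view(-1, 1, 1, 1)

def placeholder_objective(x):
    # Stand-in for the paper's detector / statistical-consistency losses: nudges
    # mean brightness toward an arbitrary 'real-image' statistic of 0.45.
    return ((x.mean(dim=(1, 2, 3)) - 0.45) ** 2).mean()

imgs = torch.rand(4, 3, 256, 256)                   # stand-in fake images
gain = torch.ones(4, requires_grad=True)
gamma = torch.ones(4, requires_grad=True)
opt = torch.optim.Adam([gain, gamma], lr=0.01)

for _ in range(50):
    adv = adjust_exposure(imgs, gain, gamma).clamp(0, 1)
    loss = placeholder_objective(adv)
    opt.zero_grad()
    loss.backward()
    opt.step()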

To add Gaussian blur, the authors adopt a learning approach, where the blur module is itself trainable and adaptable to the target material. This aspect of StatAttack leans on further prior work from some of the paper’s authors, this time dealing with the artificial generation of motion blur effects.

The authors' prior project, which sought ways to effectively recreate motion blur effects. Source: https://proceedings.neurips.cc/paper/2020/file/0a73de68f10e15626eb98701ecf03adb-Paper.pdf
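
A trainable blur of this kind can be expressed as a depthwise convolution whose Gaussian sigma is itself a learnable parameter, so that gradients from the adversarial objective can adjust the blur strength. The sketch below is a generic construction along those lines, not the authors’ motion-blur formulation:

import torch
import torch.nn.functional as F

def gaussian_blur(img, sigma, ksize=11):
    # Differentiable Gaussian blur: gradients flow back into sigma.
    half = ksize // 2
    coords = torch.arange(ksize, dtype=img.dtype, device=img.device) - half
    kernel_1d = torch.exp(-(coords ** 2) / (2.0 * sigma ** 2))
    kernel_1d = kernel_1d / kernel_1d.sum()
    kernel_2d = torch.outer(kernel_1d, kernel_1d)
    weight = kernel_2d.expand(img.shape[1], 1, ksize, ksize).contiguous()
    return F.conv2d(img, weight, padding=half, groups=img.shape[1])

imgs = torch.rand(2, 3, 256, 256)
sigma = torch.tensor(1.5, requires_grad=True)   # learnable blur strength
blurred = gaussian_blur(imgs, sigma)
blurred.mean().backward()                       # sigma.grad now holds a gradient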

The adversarial noise component of the StatAttack architecture was easier to simulate: an adversarial loss function is minimized and the resulting noise added to the fake image, while sparsity is enforced to ensure that the effect does not begin to ‘saturate’ (sparsity removes non-contributing values from the matrix, similar to the way that 16-bit floats can cut the size of an AI model in half, with – arguably – minimal effect on the outcome).
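
In rough terms, this amounts to optimizing an additive perturbation against an adversarial objective with an L1 penalty that keeps the noise sparse. The sketch below is a generic illustration of that idea, with a placeholder objective standing in for the paper’s detector and statistical-consistency terms:

import torch

imgs = torch.rand(4, 3, 256, 256)                    # stand-in fake images in [0, 1]
delta = torch.zeros_like(imgs, requires_grad=True)   # additive adversarial noise
opt = torch.optim.Adam([delta], lr=0.005)
l1_weight = 1e-3                                     # sparsity strength (assumed value)

def placeholder_objective(x):
    # Stand-in for the adversarial / statistical-consistency terms.
    return ((x.mean(dim=(1, 2, 3)) - 0.45) ** 2).mean()

for _ in range(100):
    adv = (imgs + delta).clamp(0, 1)
    loss = placeholder_objective(adv) + l1_weight * delta.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()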

For the more complex MStatAttack approach, the weights and parameters of each successive attack pattern are jointly optimized while calculating the optimal objective function (i.e., the ‘degrading’ process itself):

MStatAttack's more involved approach.
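
Conceptually, the joint optimization can be pictured as a learnable, softmax-normalized mixture over the outputs of the exposure, blur and noise branches, with the mixing weights receiving gradients from the same objective as the branch parameters. The short sketch below illustrates that idea rather than the paper’s exact formulation:

import torch

# Stand-ins for the outputs of the exposure, blur and noise branches, shape (B, C, H, W).
exposure_out, blur_out, noise_out = (torch.rand(4, 3, 256, 256) for _ in range(3))
branches = torch.stack([exposure_out, blur_out, noise_out])   # (3, B, C, H, W)

logit_w = torch.zeros(3, requires_grad=True)                  # learnable mixing weights
w = torch.softmax(logit_w, dim=0).view(3, 1, 1, 1, 1)
combined = (w * branches).sum(dim=0).clamp(0, 1)
# In a joint optimization, logit_w would receive gradients through the same adversarial
# objective as the per-branch parameters, so mixture and perturbations co-adapt.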

Tests

The researchers used four datasets in comparative tests for StatAttack and MStatAttack: for whole-face synthesis, NVIDIA’s StyleGAN2 and ProGAN; for modification, StarGAN; and for face-swapping, the DeepFake dataset provided by FaceForensics++.

For real face images, in addition to the FaceForensics++ dataset, the authors used CelebA and FFHQ.

The datasets used.

The metrics used for the tests were Attack Success Rate (ASR) and the no-reference image quality measure BRISQUE, in which a higher score actually denotes lower image quality.

For spatial-based detectors, the researchers used four classification models: ResNet50, EfficientNet-b4, DenseNet, and Google’s 2017 offering MobileNets. For frequency-based detection systems, the systems used were DCTA and DFTD.

The baseline attacks employed as reference for the tests were the frequently-used PGD and FGSM – ‘static’ attack methods which are specific to the technologies used. In addition, the authors compared StatAttack to MIFGSM and VMIFGSM – two approaches which claim transferability (i.e., adversarial examples crafted against one detector should continue to work against other detectors).

Input images were resized to 256x256px, a standard resolution for attack models. MStatAttack used three layers (rather than the single layer of StatAttack), and a batch size of 30.

The majority of the results, perhaps unfortunately, have been concatenated into a single and difficult-to-navigate results table (pictured below – see the paper for better resolution), rather than handled separately, but we’ll run some of the most significant results here.

Concatenated results across all the tests. Refer to paper for better legibility.

Attacks were simulated in both a white-box and a black-box setting (i.e., tested for cases where access to the targeted detector’s architecture and parameters is either assumed or denied).

Regarding results for the tested attacks on spatial-based detectors, the authors state:

‘In the white-box setting, our attack achieves competitive results with [baselines]. Moreover, in comparison to StatAttack, MStatAttack is able to select a more efficient combination of perturbations, resulting in an improved attack success rate […]

‘In the black-box setting, we use adversarial examples generated by a specific detector to attack other detectors, enabling us to evaluate the transferability of our attack methods […]

‘[When] using adversarial examples crafted from ResNet on the DeepFake dataset, we achieve transfer attack success rates of 65.9%, 81.3%, and 71.2% on EfficientNet, DenseNet, and MobileNet, respectively.’

In the image above, we can see a) the stark distributional contrast between the fake and real images; b) the almost total assimilation achieved by the StatAttack method; and c), d), the failure of competing methods by comparison.

Regarding the attacks on frequency-based detectors, the paper states:

‘The attack success rates of our proposed attack are significantly higher than those of baseline attacks in both white-box and black-box settings. Moreover, our method can alter the frequency components of the fake images, resulting in generated adversarial examples closer to natural images in terms of frequency information.’

Results from the frequency-based test rounds.

Finally, the authors compared the realism of the affected images across all methods, using BRISQUE:

Results from the BRISQUE realism tests across four datasets.

Here, the authors state:

‘[Our] attack method can maintain image quality while achieving high attack success rates. [Our] method (i.e., StatAttack) can [generate] natural-looking adversarial examples, and [MStatAttack] can further enhance the realism of the adversarial examples compared to StatAttack.’

Conclusion

The new paper achieves important work in at least one respect, in that it emphasizes the need for visual effects pipelines to give greater consideration to the contextualization of synthetic generation within the original (target) source material; and that, perhaps, the time has come to consider that the 98% effectiveness of Style Transfer, on which the research community has been leaning complacently for quite a long time, isn’t going to remain sufficient as audiences become more inured to facial replacement, and more discerning of the results.

To date, the locus of research has centered on obtaining accurate physiognomic results, and on tackling some of the thornier problems therein, such as the difficulty of generating certain oblique facial angles; but, as these bugbears diminish, the problem of authentic placement is certainly going to become a more prominent concern.

At the moment, many of these issues are handled, within the VFX community, by low-AI or zero-AI solutions such as tracking and application of traditional grain filters, or other methods of degrading images down to the target domain. Eventually, this work should move into a more central position within the generative pipeline itself.

This will involve a greater study of context, film emulsions (in the case of recreating truly ‘historic’ effects), and the general peculiarities and eccentricities of digital photography, accounting for such anomalies as jpeg compression and other kinds of artefacts, which will need to be simulated as accurately as the faces being interpolated into the target material.
