As deepfakes continue to surge globally, generative AI firm OpenAI is taking steps to combat misleading content created by its popular image generator, DALL-E.
Last week, the company announced it will be releasing a “deepfake detector” tool that can identify AI-generated images produced by its latest text-to-image model, DALL-E 3.
According to OpenAI, internal testing shows that the deepfake detector can accurately identify 98 percent of DALL-E 3 images and has a low false positive rate of less than 0.5 percent. However, before releasing it to the public, OpenAI will first share the tool with a select group of disinformation researchers to test its effectiveness in real-world situations.
How the deepfake detector works
The deepfake detector returns a binary verdict, either “true” or “false”, indicating whether an image was likely generated by DALL-E 3. The tool can also display a straightforward content summary stating that “this content was generated with an AI tool”. This summary includes fields that ideally identify the “app or device” and the AI tool used to create the content.
To develop the tool, OpenAI has added metadata to all images created and edited by DALL-E 3. This metadata can be used to verify the source of the content. The aim of the detector is to read and analyze this provenance information and help prevent the spread of disinformation online.
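OpenAI has not published the detector’s interface, but the workflow described above can be illustrated with a minimal sketch: read whatever provenance metadata an image carries and turn it into a binary verdict plus a short content summary. The sketch below uses Pillow to read simple PNG text metadata as a stand-in for a real C2PA manifest; the field names “ai_tool” and “app_or_device” and the file path are assumptions for illustration only.

```python
# Illustrative sketch only: not OpenAI's detector or a real C2PA verifier.
# It reads simple embedded metadata (PNG text chunks via Pillow) as a stand-in
# for a provenance manifest and derives a binary verdict plus a human-readable
# summary like the one described in the article.
from PIL import Image

# Hypothetical metadata keys a generator *might* embed; assumed for illustration.
AI_TOOL_KEY = "ai_tool"
APP_OR_DEVICE_KEY = "app_or_device"

def summarize_provenance(path: str) -> dict:
    """Return a verdict and content summary based on embedded metadata, if any."""
    with Image.open(path) as img:
        metadata = dict(getattr(img, "text", {}) or {})  # PNG tEXt/iTXt chunks

    ai_tool = metadata.get(AI_TOOL_KEY)
    verdict = ai_tool is not None  # the "true"/"false" style binary response

    return {
        "likely_ai_generated": verdict,
        "summary": "This content was generated with an AI tool."
                   if verdict else "No AI provenance metadata found.",
        "ai_tool": ai_tool,
        "app_or_device": metadata.get(APP_OR_DEVICE_KEY),
    }

if __name__ == "__main__":
    print(summarize_provenance("example.png"))  # hypothetical file path
```

In practice, this kind of metadata can be stripped when an image is re-encoded or screenshotted, which is one reason OpenAI pairs it with classifier-based detection and, as discussed below, watermarking.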
Not bullet-proof
While the deepfake detector is an important step toward maintaining the integrity of information on the internet, the current version of the tool has its limitations. For example, OpenAI only tested the detector on DALL-E 3 images, meaning it may not work against newer models or other popular generators such as Midjourney and Stability AI’s Stable Diffusion.
To address some of these limitations, OpenAI is joining forces with leading tech companies Google and Microsoft on the steering committee of the Coalition for Content Provenance and Authenticity (C2PA). The companies will contribute to the development of ethical standards across the industry.
The C2PA was established to provide a means of displaying when and how digital content was created with AI — similar to a “nutrition label” for digital content.
“Over time, we believe this kind of metadata will be something people come to expect, filling a crucial gap in digital content authenticity practices,” OpenAI wrote in a statement.
OpenAI to roll out watermarking tool
To further these efforts, OpenAI announced that it is developing a “tamper-resistant watermarking” tool for its digital content. This tool will involve adding an imperceptible signal to content such as audio, making it difficult to remove without causing noticeable degradation.
The watermarking tool will also include capabilities to detect whether content was generated using generative models, providing an additional layer of protection against misuse. Although these tools do not prevent deepfakes, they make it harder for a user to falsify or alter the provenance information attached to the digital content itself.
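OpenAI has not described how its watermark works, but the general idea of an imperceptible yet detectable signal can be shown with a toy sketch: a low-amplitude pseudorandom sequence, keyed by a secret seed, is added to an audio buffer and later detected by correlation. Everything here (the seed, strength, and detection threshold) is an illustrative assumption; real tamper-resistant watermarks are considerably more sophisticated.

```python
# Toy illustration of the general idea of audio watermarking, not OpenAI's method.
# A low-amplitude pseudorandom signal (keyed by a secret seed) is added to the
# audio; the detector checks for it by correlation. Real schemes are far more
# robust to compression, cropping, and deliberate removal.
import numpy as np

SECRET_SEED = 1234          # assumed shared secret between embedder and detector
WATERMARK_STRENGTH = 0.005  # kept small relative to typical audio amplitudes

def _watermark(length: int) -> np.ndarray:
    """Pseudorandom +/-1 sequence derived from the secret seed."""
    rng = np.random.default_rng(SECRET_SEED)
    return rng.choice([-1.0, 1.0], size=length)

def embed(audio: np.ndarray) -> np.ndarray:
    """Add the keyed pseudorandom signal to the audio samples."""
    return audio + WATERMARK_STRENGTH * _watermark(len(audio))

def detect(audio: np.ndarray, threshold: float = 0.5) -> bool:
    """Correlate against the keyed signal; a high score suggests the mark is present."""
    score = np.dot(audio, _watermark(len(audio))) / (len(audio) * WATERMARK_STRENGTH)
    return score > threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = 0.1 * rng.standard_normal(48_000)   # one second of noise-like "audio" at 48 kHz
    marked = embed(clean)
    print(detect(clean), detect(marked))        # expected: False True
```

Removing such a mark without knowing the key generally requires adding enough distortion to degrade the audio, which is the trade-off the “tamper-resistant” label refers to.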

Deepfakes have surged tenfold
The AI industry is under growing pressure to take responsibility for the content generated by its products. Recent advancements in machine learning and AI have significantly sped up the development and distribution of deepfakes, leading to widespread concern.
According to recent estimates, 500,000 video and voice deepfakes were shared on social media platforms in 2023. The trend is particularly pronounced in certain regions: the Middle East and Africa (MEA) saw a 450 percent increase in deepfakes since 2022, North America reported a 1,750 percent surge, and Europe experienced a 780 percent rise.
Although OpenAI’s new deepfake detector may help stem the problem, experts believe more should be done. As Rajesh Nambiar, Chairman of Nasscom, puts it: “Today, we don’t have a silver bullet for it. It is something that will have to be tackled by all stakeholders.” The fight against deepfakes continues.