OpenAI Introduces DALL-E 2: A New AI System That Can Create And Edit Realistic Images And Art From A Description In Natural Language
This research summary is based on the paper 'Hierarchical Text-Conditional Image Generation with CLIP Latents'.
The OpenAI team has released a new version of DALL-E, its text-to-image generation tool. DALL-E 2 is a higher-resolution, lower-latency variant of the original system that generates images from user-written descriptions. It also adds new capabilities, such as editing an existing image.
According to the researchers, unCLIP (the paper's name for the model behind DALL-E 2) is partly immune to a rather amusing shortcoming of CLIP: people can fool CLIP’s identification capabilities by labeling one object (such as a Granny Smith apple) with a word that denotes something else (such as “iPod”).
In addition, the researchers have implemented some built-in precautions such as:
- The model was trained on data from which some offensive material had been removed, limiting its capacity to generate objectionable content.
- Generated images carry a watermark identifying their AI-generated origin, although it could in theory be cropped out.
- As an anti-abuse measure, the model cannot generate identifiable faces from a name. Even asking for the Mona Lisa would reportedly return a variation on the actual face from the painting.
The researchers state that DALL-E 2 will be tested by vetted partners. Users are not allowed to upload or create images that are “not G-rated” or “may cause harm,” such as hate symbols, nudity, obscene gestures, or “big conspiracies or events relating to important ongoing geopolitical events.”
The team aims to maintain a staged process, continuing to review how to release this technology safely based on the feedback it receives. Users must disclose the use of AI in creating the images and may not share them with others through an app or website. However, the team plans to add DALL-E 2 to its API toolset in the future, making it available to third-party apps.
Paper: https://cdn.openai.com/papers/dall-e-2.pdf
References:
- https://www.nytimes.com/2022/04/06/technology/openai-images-dall-e.html
- https://www.theverge.com/2022/4/6/23012123/openai-clip-dalle-2-ai-text-to-image-generator-testing
- https://openai.com/dall-e-2/
Reference-www.marktechpost.com