The Morning After: Google claims ‘unprecedented photorealism’ from its new text-to-image AI
Google has shown off a new artificial intelligence system that can create images based on text input. Its Imagen diffusion model, created by the Brain Team at Google Research, offers “an unprecedented degree of photorealism and a deep level of language understanding.”
This isn’t the first time we’ve seen AI models like this. OpenAI’s DALL·E (and its successor, DALL·E 2) performed similar witchcraft, turning text into visuals. Google’s version, however, tries to create more realistic images. The researchers created a benchmark and asked human raters to compare images from a range of AI models. The raters “prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment,” Google said.
It’s not available to the public, and there are reasons for this. Text-to-image models like this are trained on large, mostly uncurated datasets scraped from the web. “Datasets of this nature often reflect social stereotypes, oppressive viewpoints, and derogatory, or otherwise harmful, associations to marginalized identity groups,” the researchers wrote. Imagen has inherited the “social biases and limitations of large language models” and may depict “harmful stereotypes and representation.” The team said the AI encodes social biases, including a tendency to create images of people with lighter skin tones and place them in certain stereotypical gender roles.