Text-to-image generators have recently become one of the hottest trends in the AI community. Type a text description into one of these programs and it generates an image to match, often a strikingly realistic one. Until now, OpenAI’s DALL-E led the field, but Google has come out with its own model, called Imagen.
Introducing Imagen, a new text-to-image synthesis model that can generate high-fidelity, photorealistic images from a deep level of language understanding. Learn more and and check out some examples of #imagen at https://t.co/RhD6siY6BY pic.twitter.com/C8javVu3iW
— Google AI (@GoogleAI) May 24, 2022
How it works
The best way to get a sense of what these systems can do is to look at the images they generate. Each image is created from a text prompt fed to Imagen. The output is, as Google puts it, an image with “unprecedented photorealism”.
You type what you want and the program generates it. Samples are available on the Imagen page, but take them with a grain of salt. When research models are announced, the teams behind them tend to cherry-pick the best results. So while the samples look impressive, they may not represent the system’s average output.
Alright, Google has also made their own AI tool that creates photorealistic images from text prompts! This one is called Imagen. Very similar to DALL-E. SCARY GOOD results https://t.co/RdlHzyv53v pic.twitter.com/JDr4Cl2CDO
— Marques Brownlee (@MKBHD) May 25, 2022
Imagen should not be confused with something like a reverse Google image search: it creates a new, unique image from the text provided rather than finding an existing one. Imagen is not currently available to the general public, because Google does not consider it ready for release. The main concern is misuse: if the model were public, people could use it to generate highly realistic but fake images for hoaxes or harassment.
As Google itself acknowledges, systems like this encode social and racial biases, which means their output can be sexist, racist, and sometimes extremely toxic. With those caveats in mind, it will be interesting to see what tools like Imagen and DALL-E bring to the world.
Photo credit: The feature image is symbolic and was taken by Bekky Bekks.