
OpenAI, the developer of ChatGPT, unveiled ChatGPT-4o Image Generation on the 25th. Given only a request such as “Please draw a picture explaining Newton’s prism experiment,” with no detailed explanation of the underlying science, the image-generating artificial intelligence (AI) model draws a conceptual diagram accurate enough to be printed in a textbook as is.
Earlier image-generating AI required users to describe a specific scene in detail, but the new model is sophisticated enough to work out the relevant scientific principle or physical concept on its own and draw a conceptual diagram from a brief mention of the idea. The feature is also available in the free version of ChatGPT. According to OpenAI, the biggest difference between its earlier image-generating AI, DALL-E, and the new model is how well the model understands language. To get a desired conceptual diagram from DALL-E, the user had to type in every phrase that should appear in it, whereas the new model infers the user’s intent and generates both the image and suitable labels.
Even if the user does not explain that Newton’s prism experiment uses a prism, which refracts light, to show that visible light contains many colors, the AI understands the principle of prism optics and creates the images and labels needed to explain it.

The new model is especially useful because it writes appropriate labels itself and places them in the right spots. Given a command of the same level of detail, earlier models often filled a concept diagram with incorrect text or with unreadable “alien words.” Asked to “make a poster showing various types of whales,” the new model produces images and labels that correctly match each type of whale with its name. Existing models do not understand the relationship between images and text, so they often render broken letters or fail to match images with names, whereas the new model handles much more complex instructions well. It is expected to be used for educational graphics, promotional pamphlets, and informational webtoons.
Complicated requests are also handled more easily than with existing models. For example, asked to draw a bicycle with ‘triangular wheels’, the new model draws one readily, even though it is unlikely to have been trained on such an image. OpenAI explained that although the new model can be seen as an upgrade of the existing DALL-E in that it creates images, the underlying technologies of the two models are completely different. “This model, which started training two years ago, is to the best of our knowledge the best image-generating AI at integrating text and images,” said Gabriel Goh, head of multimodal at OpenAI. “Image AI is moving from novelty, drawing imagined pictures, to usefulness, drawing accurate graphics.”
Starting that day, the new model’s image generation feature is available to free users as well as to paid subscribers such as ChatGPT Pro, and it can be used in Korean.