ChatGPT Images 2.0

wahnfrieden

Comments (200)

dakiol
> On the flip side, there are hundreds of ways that these tools cause genuine harm, not just to individuals but to entire systems.
Yeah, agree. I think it's the first time I'm asking myself: Ok, so this new cool tech, what is it good for? Like, in terms of art, it's discarded (art is about humans); in terms of assets: sure, but people are getting tired of AI-generated images (and even if we cannot tell if an image is AI-generated, we can know if companies are using AI to generate images in general, so the appeal is decreasing). Ads? C'mon, that's depressing. What else? In general, I think people are starting to realize that things generated without effort are not worth spending time with (e.g., no one is going to read your 30-page draft generated by AI; no one is going to review your 500-file PR generated by AI; no one is going to be impressed by the images you generate with AI; same goes for music and everything). I think we are gonna see a Renaissance of "human-generated" sooner rather than later. I see it already at work (colleagues writing in Slack "I swear the next message is not AI generated" and the like).
kibibu
Genuine question: what positive use cases are sufficient to accept the harm from image generators?
One that I can think of:
- replacing photography of people who may be unable to consent or for whom it may be traumatic to revisit photographs and suitable models may not be available, e.g. dementia patients, babies, examples of medical conditions.
Most other vaguely positive use cases boil down to "look what image generators can do", with very little "here's how image generators are necessary for society".
On the flip side, there are hundreds of ways that these tools cause genuine harm, not just to individuals but to entire systems.
madrox
This seems like a great time to mention C2PA, a specification for positively affirming image sources. OpenAI participates in this, and if I load an image I had AI generate in a C2PA Viewer it shows ChatGPT as the source. Bad actors can strip sources out so it's a normal image (that's why it's positive affirmation), but eventually we should start flagging images with no source attribution as dangerous the way we flag non-https. Learn more at https://c2pa.org
simonw
I've been trying out the new model like this:
  OPENAI_API_KEY="$(llm keys get openai)" \
    uv run https://tools.simonwillison.net/python/openai_image.py \
    -m gpt-image-2 \
    "Do a where's Waldo style image but it's where is the raccoon holding a ham radio"
Code here: https://github.com/simonw/tools/blob/main/python/openai_imag...
Here's what I got from that prompt. I do not think it included a raccoon holding a ham radio (though the problem with Where's Waldo tests is that I don't have the patience to solve them for sure): https://gist.github.com/simonw/88eecc65698a725d8a9c1c918478a...
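For readers who want to see roughly what such a call involves without the helper script, here is a minimal standard-library sketch. It is not the linked script (its source is truncated above). The model name `gpt-image-2` comes from the comment; the function names, the `1024x1024` size, and the assumption that the response carries a base64 `b64_json` field (as the gpt-image-1 models return) are illustrative guesses, not confirmed behavior of the new model.

```python
import base64
import json
import urllib.request

# OpenAI's image generation endpoint.
API_URL = "https://api.openai.com/v1/images/generations"


def build_request(prompt: str, api_key: str, model: str = "gpt-image-2") -> urllib.request.Request:
    """Build (but do not send) a POST request asking for one image."""
    # The size value is an assumption chosen for illustration.
    payload = {"model": model, "prompt": prompt, "n": 1, "size": "1024x1024"}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_image_bytes(response_body: dict) -> bytes:
    """Decode the first image, assuming a base64 `b64_json` field in the response."""
    return base64.b64decode(response_body["data"][0]["b64_json"])


def generate(prompt: str, api_key: str, out_path: str = "out.png") -> None:
    """Send the request and write the image to disk (needs network and a valid key)."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        body = json.load(resp)
    with open(out_path, "wb") as fh:
        fh.write(extract_image_bytes(body))
```

Calling `generate("where is the raccoon holding a ham radio", os.environ["OPENAI_API_KEY"])` would be the rough equivalent of the command above, with the key read from the environment rather than from `llm keys get`.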
porphyra
The improvement in Chinese text rendering is remarkable and impressive! I still found some typos in the Chinese sample pic about Wuxi though. For example the 笼 in 小笼包 was written incorrectly. And the "极小中文也清晰可读" section contains even more typos although it's still legible. Still, truly amazing progress. Vastly better than any previous image generation model by a large margin.
skybrian
This time it passed the piano keyboard test: https://chatgpt.com/s/m_69e7ffafbb048191b96f2c93758e3e40
But it screwed up when attempting to label middle C: https://chatgpt.com/s/m_69e8008ef62c8191993932efc8979e1e
ea016
Price comparison:
GPT Image 2
  Quality   1024×1024   1024×1536   1536×1024
  Low       $0.006      $0.005      $0.005
  Medium    $0.053      $0.041      $0.041
  High      $0.211      $0.165      $0.165
GPT Image 1
  Quality   1024×1024   1024×1536   1536×1024
  Low       $0.011      $0.016      $0.016
  Medium    $0.042      $0.063      $0.063
  High      $0.167      $0.25       $0.25
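To make the direction of each change easier to read, the figures above can be turned into percentage differences. This is a small sketch using only the numbers quoted in the comment; the dictionary layout and function names are made up for illustration.

```python
# Prices in USD, copied from the comment above (not re-checked against OpenAI's pricing page).
SIZES = ("1024x1024", "1024x1536", "1536x1024")
GPT_IMAGE_2 = {
    "low": (0.006, 0.005, 0.005),
    "medium": (0.053, 0.041, 0.041),
    "high": (0.211, 0.165, 0.165),
}
GPT_IMAGE_1 = {
    "low": (0.011, 0.016, 0.016),
    "medium": (0.042, 0.063, 0.063),
    "high": (0.167, 0.25, 0.25),
}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means cheaper)."""
    return (new - old) / old * 100


def compare() -> dict:
    """Map (quality, size) to the percent change from GPT Image 1 to GPT Image 2."""
    changes = {}
    for quality in ("low", "medium", "high"):
        for size, old, new in zip(SIZES, GPT_IMAGE_1[quality], GPT_IMAGE_2[quality]):
            changes[(quality, size)] = round(percent_change(old, new), 1)
    return changes


if __name__ == "__main__":
    for (quality, size), change in compare().items():
        print(f"{quality:<7}{size:<11}{change:+.1f}%")
```

On these numbers, 1024x1024 gets about 26% more expensive at medium and high quality, while every other cell in the table gets cheaper.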
swalsh
Been using the model for a few hours now. I'm actually really impressed with it. This is the first time I've found value in an image model for stuff I actually do. I've been using it to build PowerPoint slides and mockups. It's CRAZY good at that.
dazhbog
Yay, let's burn the planet computing more slopium..
minimaxir
HN submission for a direct link to the product announcement which for some reason is being penalized by the HN algorithm: https://news.ycombinator.com/item?id=47853000
nickandbro
200+ points in Arena.ai, that's incredible. They are cleaning house with this model.
throwaway2027
I know people like to dunk on ChatGPT and Gemini and say Claude is or used to be better, but you can still use worse models when you're out of usage AND make use of Nano Banana and ChatGPT Image generation with separate limits for your subscription. I think it could make it a better package as a whole for some people (non-programmers). I do like having the option and am excited to see what improvements they've made to ChatGPT Image generation, because in the past it had this yellow piss filter, and 1.5 sort of fixed it but made things really generic, with Nano Banana beating it (although Gemini also had a too aggressively tuned racial bias, which they fixed). It seems the images ChatGPT generates have gotten better.
dktp
One interesting thing I found comparing OpenAI and Gemini image editing is that Gemini rejects anything involving a well-known person. Anything. OpenAI is happy to edit and change every time I tried.
I have a side project where I want to display standup comedies. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any image of any standup comedy poster involving a well-known human. OpenAI does not care and is happy to edit away.
6thbit
System card link with safety details: https://deploymentsafety.openai.com/chatgpt-images-2-0
direct pdf: https://deploymentsafety.openai.com/chatgpt-images-2-0/chatg...
joegibbs
The quality of the text is really impressive and I can’t seem to see any artefacts at all. The fake desktop is particularly good: Nano Banana would definitely slip up with at least a few bits of the background.
louiereederson
The image of the messy desktop with the ASCII art is so impressive - the text renders, the date is consistent, it actually generated ASCII art in "ChatGPT", etc. I was skeptical that it was cherry-picked but was able to generate something very similar and then edit particular parts on the desktop (i.e. fixing content in the browser window and making the ASCII dog "more dog like"). It's honestly astounding, to me at least.
Oras
My test for image models is asking them to create an image showing chess openings. Both this model and Banana Pro are so bad at it. While the image looks nice, the actual details are always wrong, such as showing pawns in the wrong locations, missing pawns, etc.
Try it yourself with this prompt: Create a poster to show opening game for Queen's Gambit to teach kids to play chess.
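If you want a reference to check a generated poster against, the position is easy to write down. The sketch below assumes the poster should show the board after 1. d4 d5 2. c4 (the standard Queen's Gambit moves), encodes it as the piece-placement field of a FEN string, and lists where each side's pawns should be. The function names are hypothetical helpers, not part of any chess library.

```python
# Piece placement after 1. d4 d5 2. c4, in FEN notation, listed from rank 8 down to rank 1.
# Uppercase letters are white pieces, lowercase are black, digits are runs of empty squares.
QUEENS_GAMBIT = "rnbqkbnr/ppp1pppp/8/3p4/2PP4/8/PP2PPPP/RNBQKBNR"


def expand(placement: str) -> list:
    """Turn a FEN piece-placement field into eight 8-character rows ('.' means empty)."""
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            row += "." * int(ch) if ch.isdigit() else ch
        rows.append(row)
    return rows


def pawn_squares(placement: str) -> dict:
    """Return the squares holding white ('P') and black ('p') pawns."""
    squares = {"P": [], "p": []}
    for row_index, row in enumerate(expand(placement)):
        for file_index, ch in enumerate(row):
            if ch in squares:
                squares[ch].append("abcdefgh"[file_index] + str(8 - row_index))
    return squares


if __name__ == "__main__":
    print("\n".join(expand(QUEENS_GAMBIT)))
    print(pawn_squares(QUEENS_GAMBIT))
```

A correct poster should show eight pawns per side, with white pawns on c4 and d4 and a black pawn on d5; anything else is the kind of error the comment describes.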
amunozo
This is not as exciting as previous models were, but it is incredibly good. I am starting to think that expressing thoughts in words clearly is probably the most important and general skill of the future.
ibudiallo
And here I was proud of myself, having taught my mom and her friends how to discern real from fakes they get on WhatsApp groups. Another even more powerful tool for scammers. I'm taking a break.
volkk
the guys presenting are probably all like 25x smarter than I am but good god, literally 0 on screen presence or personality.
____tom____
No mention of modifying existing images, which is more important than anything they mentioned. I think we all know the feeling of getting an image that is ok, but needs a few modifications, and being absolutely unable to get the changes made. It either keeps coming up with the same image, or gives you a completely new take on the image with fresh problems. Anyone know if modification of existing images is any better? Anything better than OpenAI?
modeless
Can it generate transparent PNGs yet?
bensyverson
I caught the last minute of this—was it just ChatGPT Images 2.0?
muyuu
I wonder if this will be decent at creating sprite frame animations. So far I've had very poor results and I've had to do the unthinkable and toil it out manually.
kanodiaayush
It stands out to me that this page itself is wonderful to go through (the telling of the product through model generated images).
samiwami
do they have anything similar to SynthID, or are they just pretending that problem doesn't exist? I know this is probably mega cherry-picked to look more impressive, but some of the images are terrifyingly realistic. They seem to have put a lot of effort into the lighting.
RigelKentaurus
If every single image on their blog was generated by Images 2.0 (I've no reason to believe that's not the case), then wow, I'm seriously impressed. The fidelity to text, the photorealism, the ability to show the same character in a variety of situations (e.g. the manga art) -- it's all great!
hahahacorn
One of the images in the blog (https://images.ctfassets.net/kftzwdyauwt9/4d5dizAOajLfAXkGZ7...) is a carbon copy of an image from an article posted Mar 27, 2026 with credits given to an individual: https://www.cornellsun.com/article/2026/03/cornell-accepts-5...
Was this an oversight? Or did their new image generation model generate an image that was essentially a copy of an existing image?
etothet
I would love to see prompt examples that created the images on the announcement page.
vunderba
OpenAI’s gpt-image-1.5 and Google’s NB2 have been pretty much neck and neck on my comparison site which focuses heavily on prompt adherence, with both hovering around a 70% success rate on the prompts for generative and editing capabilities. With the caveat being that Gemini has always had the edge in terms of visual fidelity.
That being said, gpt-image-1.5 was a big leap in visual quality for OpenAI and eliminated most of the classic issues of its predecessor, including things like the “piss filter.”
I’ll update this comment once I’ve finished running gpt-image-2 through both the generative and editing comparison charts on GenAI Showdown.
Since the advent of NB, I’ve had to ratchet up the difficulty of the prompts especially in the text-to-image section. The best models now score around 70%, successfully completing 11 out of 15 prompts.
For reference, here’s a comparison of ByteDance, Google, and OpenAI on editing performance: https://genai-showdown.specr.net/image-editing?models=nbp3,s...
And here’s the same comparison for generative performance: https://genai-showdown.specr.net/?models=s4,nbp3,g15
UPDATES:
gpt-image-2 has already managed to overcome one of the so‑called “model killers” on the test suite: the nine-pointed star.
Results are in for the generative (text to image) capabilities: Gpt-image-2 scored 12 out of 15 on the text-to-image benchmark, edging out the previous best models by a single point. It still fails on the following prompts:
- A photo of a brightly colored coral snake but with the bands of color red, blue, green, purple, and yellow repeated in that exact order.
- A twenty-sided die (D20) with the first twenty prime numbers (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71) on the faces.
- A flat earth-like planet which resembles a flat disc is overpopulated with people. The people are densely packed together such that they are spilling over the edges of the planet. Cheap "coastal" real estate property available.
All Models: https://genai-showdown.specr.net
Just Gpt-Image-1.5, Gpt-Image-2, Nano-Banana 2, and Seedream 4.0: https://genai-showdown.specr.net?models=s4,nbp3,g15,g2
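Two of the numbers in this comment are easy to sanity-check. The sketch below recomputes the first twenty primes to confirm the list given in the D20 prompt, and converts the 11/15 and 12/15 scores into percentages. The function names are illustrative and have nothing to do with how GenAI Showdown itself scores results.

```python
def first_primes(count: int) -> list:
    """Return the first `count` primes by trial division against earlier primes."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # A candidate is prime if no earlier prime up to its square root divides it.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


# Face values as listed in the D20 prompt above.
PROMPT_FACES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]


def success_rate(passed: int, total: int) -> int:
    """Share of prompts passed, as a whole-number percentage."""
    return round(100 * passed / total)


if __name__ == "__main__":
    print(first_primes(20) == PROMPT_FACES)
    print(success_rate(11, 15), success_rate(12, 15))
```

The prompt's list does match the first twenty primes, and the scores work out to 73% for 11 out of 15 (the "around 70%" figure) and 80% for gpt-image-2's 12 out of 15.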
gfody
there's something funny going on with the live stream audio
biosubterranean
Oh no.
thelucent
It seems to still have this gpt image color that you can just feel. The slight sepia and softness.
thevinter
Every time a new image gen comes out I keep saying that it won't get better just to be surprised again and again. Some of the examples are incredible (and incredibly scary. I feel like this is truly the point where understanding if something is AI becomes impossible)
dzonga
for video game assets this is massive.
but in general though - will people believe in anything photographic? imagine dating apps, photographic evidence.
I'm guessing we're gonna reach a point where you fuck things up on purpose to leave a human mark.
Melatonic
We were afraid it would be Skynet and instead we got the ultimate meme generator!
throw310822
Ok, I can hear the sound of entire industries crumbling right now.
ai4thepeople
Each day when my AI girlfriend wakes me up and shows me the latest news, I feel: This is it! We are living in a revolution! Never before in history did humanity have the possibility of seeing a picture of a pack of wolves! The dearth of photographs has finally been addressed! I told my AI girlfriend that I will save money to have access to this new technology. She suggested a circular scheme where OpenAI will pay me $10,000 per year to have access to this rare resource of 21st-century daguerreotype.
minimaxir
Model card for the API endpoint gpt-image-2 (which may or may not reflect the output from ChatGPT Images 2): https://developers.openai.com/api/docs/models/gpt-image-2
API Pricing is mostly unchanged from gpt-image-1.5, the output price is slightly lower: https://developers.openai.com/api/docs/pricing
...buuuuuuuuut the price per image has changed. For a high quality image generation the 1024x1024 price has increased? That doesn't make sense that a 1024x1024 is cheaper than a 1024x1536, so assuming a typo: https://developers.openai.com/api/docs/guides/image-generati...
The submitted page is annoyingly uninformative, but from the livestream it purports the same exact features as Gemini's Nano Banana Pro. I'll run it through my tests once I figure out how to access it.
Melatonic
Can it generate anything high resolution at increased cost and time? Or is it always restricted?
retrac98
The page keeps crashing on my iPhone 17 Pro.
bitnovus
great obfuscation idea - hidden message on a grain of rice
ChrisArchitect
Fake layouts, fake handwritten kid story, fake drunk photos? All from training on real things people did. As with anything AI, we are not ready for the scale of impact. And for what? Like, why are you proud of this?
Bennettheyn
fal has the endpoint under openai/gpt-image-2
ieie3366
It's great. Also doesn't seem to have any "slop" standard look, the images it produces are quite diverse. I would imagine this will hit illustrators / graphics designers / similar people very hard, now that anyone can just generate professional looking graphical content for pennies on the dollar.
szmarczak
Wow, the difference between AI and non-AI images collapses. I hate the future where I won't be able to tell the difference.
esafak
https://openai.com/index/introducing-chatgpt-images-2-0/
bitnovus
No gpt-5.5
rqa129
Can it generate Chibi figures to mask the oligarchy's true intentions on Twitter and make them more relatable?
simonw
Suggest renaming this to "OpenAI Livestream: ChatGPT Images 2.0"
sho_hn
In 5 years and 3 months between DALL-E and Images 2.0 we've managed to progress from exuberant excitement to jaded indifference.
zb3
Image generation? Hmm, would be cool if OpenAI also made a video-generation model someday..
aliljet
I am hopeful that OpenAI will potentially offer clarity on their loss-leading subscription model. I'd prefer to know the real cost of a token from OpenAI as opposed to praying the venture-funded tokens will always be this cheap.