ChatGPT's “powerful new image engine”
There seems to be some excitement around “ChatGPT’s powerful new image engine”, but as ever, its functional understanding of the world seems limited. I first learned about the new system when some smart aleck on X sent me an example of it trying to label a bike (an example I have considered before), with the caption “Uh oh”, apparently believing that my longstanding challenges to image generation had been solved.
It does look impressive on first inspection, better than some examples I showed here before. But if you look closely, there are several errors, and those errors are revealing. For example, the rear center-pull (?) brake is mislabeled as the seat stay, and the big gear on the back is mislabeled as the rear brake. There is a label for a spoke that points to blank space. In many modern bikes, of course, a rear brake can be found back there, but not in this diagram. Instead, the system has combined a typical position for a modern disc brake system with a diagram of an older (though still in use) caliper (or similar) system. The system doesn’t actually understand how the various parts function.
And of course there are literally hundreds of labeled bikes on the internet, as a quick Google Image Search would reveal. (Which is why my usual test here has been a tandem bike, to make things a little more challenging.) To up the degree of difficulty, I asked ChatGPT to “please draw a taller than average tandem bike, and include a bike rack and panniers”, which is not something you could readily find on the internet, and not something I have used here before, and got this:
Bike nuts would have a field day finding problems with this. (Feel free to drop your favorite error in the comments.) Suffice to say that most people don’t stuff their rear derailleur in the back wheel. And I don’t even know what to say about that “rear brake lever”, or the saddle-shaped rear handlebar, let alone the “rear brake” that is somehow part of the rear rack.
As in the first example, the lack of functional understanding is manifest. Of course, to be fair, the average human couldn’t complete this task, either. But anybody knowledgeable about bikes (racers, mechanics, designers, etc.) would immediately see numerous problems. And honestly, is anybody tall enough to ride in the front?