Morfternight #99: Hands-on with ChatGPT-4 Vision
The one where we experiment with ChatGPT's new image feature.
Welcome to Morfternight, the digital postcard
about Photography, Leadership, and AI.
If you enjoy it, share it with your friends.
tl;dr
Today, we navigate through Vienna's concealed courtyards, reflect on the end of a popular product and the integrity of academic trails, journey through the evolution of machine learning, and dive hands-on into the visual capabilities of ChatGPT-4.
Good Morfternight!
Greetings from Vienna!
This is Morfternight #99, the last with two digits. We'll hit three figures in two weeks: next weekend I'll be at the ICANN conference in Hamburg, Germany, and after that I'll meet a couple hundred fellow Automatticians in Spain. With seven days of travel starting Friday, I know I won't have time to prepare an issue between now and Thursday.
In the meantime, enjoy Morfternight #99, and see y'all for Halloween!
📷 Photo of the week
Have you ever been to Vienna? If so, you might have noticed something unusual: most of the city's building façades aren't visible from the streets. It's a delightful paradox! Instead, a maze of quiet courtyards tucks them away, offering locals an escape from the urban buzz. You won't spot these hidden gems unless you're looking at aerial photos or stumbling upon the few open to the public, connecting parallel streets like secret passageways.
What I love about these courtyards isn't just the quiet. It's also their lushness. You'll find them filled with trees, ivy, and greenery. It's like stumbling upon a micro-forest in the heart of the city.
🗺️ Three places to visit
① Can you believe it's been a decade since Google pulled the plug on its RSS Reader? The tale behind this much-missed product is fascinating, and "Who Killed Google Reader?" tells us how it could've been a game-changer for Google. What's your go-to tool for keeping tabs on your favorite sites?
I've got a soft spot for the WordPress.com Reader. It's from Automattic, so of course, I'm a fan. It's a straightforward, no-frills solution for devouring website content. But if you're after more (a tool that lets you save highlights, Twitter threads, videos, PDFs, books, and even emails), Readwise Reader is your best bet. I'm a fan of both, using each for different purposes.
Do you use either of these or have you found another gem worth sharing?
⥠They Studied Dishonesty. Was Their Work a Lie? Hereâs a long piece in the New Yorker that made me sad. I read most of Dan Arielyâs books and have found his work interesting and effective in its real-life applications. Apparently, Ariely and another behavioral economist, Francesca Gino, took some liberty with the data they used for their research.
I donât have enough information to have an opinion. Still, Iâd wait a second before throwing Arielyâs work into the trash because I have found that many of his recommendations and findings worked in my work or personal life.
⢠Ever feel like discussions about machine learning are either too technical or too sensational? Finally, hereâs a conversation that strikes a good balance. It traces the past decadeâs evolution in machine learning that gave us todayâs Large Language Models (LLMs) and also dives into its broad societal implicationsâfrom politics to economics.
What sets this discussion apart? Itâs a nuanced conversation, a rare thing in this polarized world. Itâs a refreshing deep dive that appreciates the complexities of the matters discussed.
What are your thoughts on how machine learning is shaping our world? Do you find the conversation as illuminating as I do?
🤖 Hands-on session with ChatGPT-4 vision
OpenAI enabled the new Image feature on my ChatGPT account a few days ago. In a nutshell, it means I can now upload images and ask ChatGPT to work with them. I started playing with it, and I want to share with you the results, some of which are impressive. I took screenshots of the chats so you can visualize them here without having to click on links and also because sharing chats with images is not yet supported by ChatGPT.
First, a few failures…
ChatGPT fails the geography test.
I mean, it got Brazil right, but at this point, I am not sure if it's random. I tried with Europe and got the same type of results. This surprised me a bit, as I imagined the tool being trained on maps.
No, but seriously, where is Waldo?
This one is interesting because it shows how similar Waldo's images are to each other from an LLM's perspective. ChatGPT's answers are clearly based on different pictures than the one I uploaded.
The conversation was a bit longer, but even after I detailed the steps to follow, it got it wrong each time, with precise descriptions of Waldo's surroundings that are not in this image.
Then, one that is almost right and yet already impressive.
Where was this photo taken?
This is a photo of Camogli, just over 10 km east of Genova. The response is very accurate, considering that no recognizable landmark is visible.
And now, the excellent answers
Where was this photo taken? (again)
With recognizable landmarks, ChatGPT nails it.
Recognizing a molecule of glucose
It is glucose, indeed.
Solving integral calculations
The result is correct, by the way.
Generating ALT text for images to improve website accessibility.
This is a pretty big one. Visually impaired people who browse the web with screen readers need images to be described in words.
Writing ALT text is often overlooked because it is very time-consuming. Using AI to solve this problem is a clear, immediate benefit of the technology.
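If you want to see how much ALT text a page is missing before asking an AI for help, here is a minimal Python sketch that uses only the standard library. It is my own illustration, not something ChatGPT produced in these chats: the names `MissingAltFinder` and `find_images_missing_alt` are hypothetical, and treating a blank `alt` as missing is an assumption, since purely decorative images legitimately use an empty one.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> whose alt attribute is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: a blank alt counts as missing. Decorative images may
        # intentionally use alt="", so review the results by hand.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", ""))


def find_images_missing_alt(html):
    """Return the src values of images that still need a description."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

Feed it the HTML of a post, and the list it returns is the set of images worth uploading to ChatGPT for a description.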
Criticizing a photo and providing suggestions for improvement.
The analysis is not revolutionary, but ChatGPT makes a great point about trying to move the horizon away from the middle of the image.
I barely scratched the surface over the weekend. I'll try more things over the following days.
What do you think? Have you also tried this feature? Do you have ideas you want me to try?