Morfternight #99: Hands-on with ChatGPT-4 Vision
The one where we experiment with ChatGPT's new image feature.
Welcome to Morfternight, the digital postcard
about Photography, Leadership, and AI.
If you enjoy it, share it with your friends.
👇 tl;dr
Today, we navigate through Vienna’s concealed courtyards, reflect on the end of a popular product and the integrity of academic trails, journey through the evolution of machine learning, and dive hands-on into the visual capabilities of ChatGPT-4.
👋 Good Morfternight!
Greetings from Vienna!
This is Morfternight #99, the last with two digits. We’ll hit three figures in two weeks rather than one: next weekend I’ll be at the ICANN conference in Hamburg, Germany, and after that I’ll meet a couple hundred fellow Automatticians in Spain. With seven days of travel starting Friday, I know I won’t have time to prepare an issue between now and Thursday.
In the meantime, enjoy Morfternight #99, and see y’all for Halloween! 🎃
📷 Photo of the week
Have you ever been to Vienna? If so, you might have noticed something unusual—most of the city’s building façades aren’t visible from the streets. It’s a delightful paradox! Instead, a maze of quiet courtyards tucks them away, offering locals an escape from the urban buzz. You won’t spot these hidden gems unless you’re looking at aerial photos or stumbling upon the few open to the public, connecting parallel streets like secret passageways.
What I love about these courtyards isn’t just the quiet. It’s also their lushness. You’ll find them filled with trees, ivy, and greenery. It’s like stumbling upon a micro-forest in the heart of the city.
🗺️ Three places to visit
① Can you believe it’s been a decade since Google pulled the plug on its RSS Reader? The tale behind this much-missed product is fascinating, and ‘Who Killed Google Reader?’ tells us how it could’ve been a game-changer for Google. What’s your go-to tool for keeping tabs on your favorite sites?
I’ve got a soft spot for the WordPress.com Reader. It’s from Automattic, so of course, I’m a fan. It’s a straightforward, no-frills solution for devouring website content. But if you’re after more—a tool that lets you save highlights, Twitter threads, videos, PDFs, books, and even emails—Readwise Reader is your best bet. I’m a fan of both, using each for different purposes.
Do you use either of these or have you found another gem worth sharing?
② ‘They Studied Dishonesty. Was Their Work a Lie?’ is a long piece in The New Yorker that made me sad. I’ve read most of Dan Ariely’s books and have found his work interesting and effective in its real-life applications. Apparently, Ariely and another behavioral economist, Francesca Gino, took some liberties with the data they used in their research.
I don’t have enough information to form an opinion. Still, I’d wait a second before throwing Ariely’s work in the trash, because many of his recommendations and findings have held up in my professional and personal life.
③ Ever feel like discussions about machine learning are either too technical or too sensational? Finally, here’s a conversation that strikes a good balance. It traces the past decade’s evolution in machine learning that gave us today’s Large Language Models (LLMs) and also dives into its broad societal implications—from politics to economics.
What sets this discussion apart? It’s a nuanced conversation, a rare thing in this polarized world. It’s a refreshing deep dive that appreciates the complexities of the matters discussed.
What are your thoughts on how machine learning is shaping our world? Do you find the conversation as illuminating as I do?
🤖 Hands-on session with ChatGPT-4 Vision
OpenAI enabled the new Image feature on my ChatGPT account a few days ago. In a nutshell, it means I can now upload images and ask ChatGPT to work with them. I started playing with it, and I want to share the results with you, some of which are impressive. I took screenshots of the chats so you can see them here without having to click on links, and also because ChatGPT does not yet support sharing chats that contain images.
First, a few failures…
ChatGPT fails the geography test.
I mean, it got Brazil right, but at this point, I’m not sure that wasn’t just luck. I tried with Europe and got the same type of results. This surprised me a bit, as I had assumed the tool was trained on plenty of maps.
No, but seriously, where is Waldo?
This one is interesting because it shows how similar Waldo’s images are to each other from an LLM’s perspective. ChatGPT’s answers are clearly based on different pictures than the one I uploaded.
The conversation was a bit longer, but even after detailing the steps to follow, it got it wrong each time, with precise descriptions of Waldo’s surroundings that are not in this image.
Then, one that is almost right and yet already impressive.
Where was this photo taken?
This is a photo of Camogli, just over 10km east of Genova. The response is very accurate, considering that no recognizable landmark is visible.
And now, the excellent answers
Where was this photo taken? (again)
With recognizable landmarks, ChatGPT nails it.
Recognizing a molecule of glucose
It is glucose, indeed.
Solving integral calculations
The result is correct, by the way.
Generating ALT text for images to improve website accessibility.
This is a pretty big one. Visually impaired people who browse the web with screen readers need images to be described in words.
Writing those descriptions is often overlooked because it is very time-consuming. Using AI to solve this problem is a clear, immediate benefit of the technology.
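If you want to know how much ALT text your own pages are missing before asking an AI for help, here is a minimal sketch in Python. It is only an illustration: the helper names `MissingAltFinder` and `find_images_missing_alt` are mine, not part of any tool mentioned above, and it relies solely on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Hypothetical helper: collects the src of every <img> without ALT text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: an absent or whitespace-only alt counts as missing.
        # Purely decorative images may legitimately use alt="", so review
        # the results by hand rather than treating them all as errors.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def find_images_missing_alt(html):
    """Return the src values of images in `html` that have no ALT text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing


page = '<p><img src="a.jpg" alt="A courtyard"><img src="b.jpg"></p>'
print(find_images_missing_alt(page))
```

The resulting list is the set of images you could then describe one by one, by hand or with an image-capable model. I left that second step out on purpose, since it depends entirely on the tool you use.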
Critiquing a photo and providing suggestions for improvement.
The analysis is not revolutionary, but ChatGPT makes a great point about trying to move the horizon away from the middle of the image.
I barely scratched the surface over the weekend. I’ll try more things over the coming days.
What do you think? Have you also tried this feature? Do you have ideas you want me to try?