Top insights for IT pros
From cybersecurity and big data to cloud computing, IT Brew covers the latest trends shaping business tech in our 4x weekly newsletter, virtual events with industry experts, and digital guides.
A text-only exchange with ChatGPT is so last year.
In a May 13 presentation, pros at Microsoft-backed artificial-intelligence company OpenAI demonstrated the org’s new AI model, GPT-4o, and its ease in comprehending problems like written algebra, code screenshots, and, for good measure, Italian.
The OpenAI team displayed GPT-4o’s lightning-fast, multimodal nature, a capability that IT pros who spoke with IT Brew said may speed up applications in fields like programming and customer service.
“The fact that they’ve built one model that can interpret and ingest multiple mediums and multiple formats is really impressive from a technology perspective; that’s a pretty big, fundamental shift in the architecture and the way the models work,” Christine Livingston, managing director and global leader of AI services at business-consulting firm Protiviti, told IT Brew.
Along with an announcement of ChatGPT’s desktop version, OpenAI pros demonstrated GPT-4o’s ability to quickly process text, audio, and imagery—and respond with similar output.
- Voice: The OpenAI team prompted GPT-4o to respond in emotive ranges, from theatrically dramatic to purposely robotic.
- Emotion: The model registered a user’s emotional state, interpreting smiles and the sounds of heavy, panic-like breathing.
- Math on paper: “Don’t tell me the solution. Just give me hints along the way,” Barret Zoph, OpenAI’s head of post-training, told GPT-4o during the demo, asking for help with the handwritten equation 3x + 1 = 4. (“Think about what operation would undo multiplication,” GPT-4o responded. “Is it subtraction?” Zoph asked. “Close, but think of the opposite of multiplication,” GPT-4o said, with the speed of a veteran teacher.)
- Code on screen: After being shown on-screen code, GPT-4o, upon request, recited the code’s function: to fetch weather data.
Livingston sees GPT-4o’s screen-reading ability as game-changing, particularly for writing code, where someone could go from a visual workflow to a running program without having to describe an environment.
“You really start to get to a true no-code user interface, in a sense. It’s changing the way that some of the development processes work,” Livingston said.
Generative AI investments, according to Bloomberg Intelligence, are poised to make up 10–12% of IT hardware, software, services, advertising, and gaming expenditures by 2032, up from less than 1% currently.
Tobias Dengel, author of the 2023 book The Sound of the Future and president of WillowTree, a digital-services provider and Telus International company, found GPT-4o’s speed “breathtaking” and sees both consumer expectations and companies’ customer-service offerings climbing as AI responsiveness increases.
“The expectations are immediately going to go to, ‘Why is ChatGPT able to do this stuff, but I can’t just ask my United Airlines app [for example] to switch my seat?’” Dengel told IT Brew.
“And you’re not going to want to go through ChatGPT to switch your United seat. You’re going to want to have it on the United Airlines website or app,” he said.
Dengel also appreciated GPT-4o’s math skills.
“In theory, you’re going to get to a place, real quick, where the best private tutor in the world is ChatGPT or similar system,” Dengel said.
But who, or what, knows, really?