What’s cool (and a bit funny) about OpenAI GPT-4o?

By Staff Reporter Last updated May 14, 2024

OpenAI releases GPT-4o, a free-to-use upgrade to its powerful language model, offering freelancers a leap forward in AI capabilities for their work. This multimodal AI can handle text, voice, and even images, making it a versatile tool for various tasks.

OpenAI has released GPT-4o, the next iteration of its popular GPT-4 language model, and made it free for all users. The upgrade promises a significant leap forward in interactive AI capabilities, which could be particularly useful for freelancers. It’s early days though, so some of its limitations could leave you laughing.

GPT-4o goes beyond just text, now incorporating image and voice processing. This means you can have a two-way audio conversation with the platform.

Imagine a new AI model that can chat with you naturally, just like a person. You can talk to it (audio), show it pictures (image), or type text, and it will understand and respond in any way you like – text, voice, or even an image.

It can respond to audio in about 320 milliseconds on average, which is similar to human response time in a conversation. OpenAI says the model is especially good at understanding different languages, as well as what it sees in images and video, compared with earlier models.

Here’s an example, which you can watch in this video. It comes off as a bit corny, but the technological breakthrough more than makes up for it.

OpenAI rumours

Alon Yamin, co-founder and CEO of Copyleaks, says all of these new options could be part of a build-up to the launch of a search engine to rival Google:

“The release of OpenAI’s GPT-4o is intriguing because it follows a rumour that OpenAI is developing a search engine that could rival Google’s Chrome. With this release, ChatGPT will become similar to an AI assistant, like Alexa or Siri. GPT-4o enhances capabilities across text, image, and audio with real-time responsiveness, highlighting how OpenAI is intentionally putting the focus on the interaction between technology and humans, which can offer important feedback for possible future offerings, like a search engine.

“Nevertheless, no matter how advanced the technology gets, keeping some skepticism around it is crucial because while it might be more advanced, it still isn’t without potential flaws, such as misinformation. So, the need for tools to set guardrails is becoming more essential as this technology evolves and becomes easier for the public to interact and integrate it into their lives.”

Boosting efficiency

For writers, GPT-4o can assist with research, content creation, and even brainstorming ideas. Imagine a freelance journalist being able to quickly gather information and generate drafts, allowing them to focus on crafting compelling narratives.

Beyond text: A boon for creatives

Freelance graphic designers and web developers can use GPT-4o’s image generation capabilities to create mockups, design elements, or even generate initial drafts based on text descriptions. This can significantly speed up the design process and free up time for further refinement.

Voice and image

Voice and image give you more ways to use ChatGPT. For example, you could snap a picture of a landmark while travelling and have a live conversation about what’s interesting about it. When you’re home, you could snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.

The inclusion of voice processing opens up exciting possibilities for freelancers. Imagine a freelance translator using GPT-4o to convert audio recordings into text and then translate that text into another language. This could streamline workflows and increase productivity when producing in-house communication materials or social media posts. However, there are some limitations, some of them quite humorous, as you can see in this video.

OpenAI is rolling out voice and images in ChatGPT to Plus and Enterprise users in the coming weeks. Voice is coming to iOS and Android (opt in via your settings), and images will be available on all platforms.

Focus on safety and security

OpenAI emphasises that safety remains a priority. They’ve implemented safeguards to mitigate potential risks associated with the new features.

While GPT-4o’s initial release is limited, it represents a significant step forward for AI accessibility. Freelancers should keep a close eye on this evolving technology, as it has the potential to enhance the way they work.