OpenAI Rolls Out More Powerful Version Of GPT-4 Turbo
By Mikelle Leow, 11 Apr 2024
Photo 246538623 © Rafael Henrique | Dreamstime.com
OpenAI recently released GPT-4 Turbo with Vision, a new iteration of its large language model (LLM). This update, newly available to developers, incorporates visual understanding capabilities alongside traditional natural language processing.
Previously, LLMs like GPT-4 were limited to text-based inputs and outputs. GPT-4 Turbo with Vision expands this functionality by allowing developers to integrate image analysis into their applications. The model can interpret the content of images and answer questions about them, potentially leading to a wider range of applications. Notably, this is the first time OpenAI has made this technology available to third-party developers.
While core text processing capabilities remain consistent with GPT-4 Turbo, the new model offers additional functionalities. It can not only analyze images but also extract and return text embedded within them. For example, a user could provide an image of a restaurant menu and receive the listed food options.
The primary focus of GPT-4 Turbo with Vision appears to be streamlining workflows for developers accessing the OpenAI model through an API. OpenAI suggests the update removes the need to call separate image and text models, which could make building apps simpler and more efficient.
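To illustrate the single-model workflow, here is a minimal sketch of what a combined text-and-image request might look like. It assumes the standard Chat Completions request shape, with the image sent inline as a base64 data URL; the model name and helper function are illustrative, and actually sending the request would require an API key.

```python
import base64
import json

def build_vision_request(prompt: str, image_bytes: bytes,
                         model: str = "gpt-4-turbo") -> dict:
    """Build one Chat Completions payload pairing a text prompt with an image.

    Because the model handles both modalities, a single request replaces
    the old pattern of calling separate image and text models.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{encoded}"},
                    },
                ],
            }
        ],
        "max_tokens": 300,
    }

# Example: ask the model to transcribe a menu photo in one call.
payload = build_vision_request("List the dishes on this menu.",
                               b"\xff\xd8placeholder-jpeg-bytes")
print(json.dumps(payload)[:80])
```

The payload would then be POSTed to the chat completions endpoint; the point of the sketch is that the image and the question about it travel together in one message.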
It’s important to acknowledge the model’s limitations. While GPT-4 Turbo with Vision can recognize object relationships in images, it may struggle with detailed location-based questions. OpenAI acknowledges this, stating the model might not pinpoint the exact location of an object within an image.
“For example, you can ask it what color a car is or what some ideas for dinner might be based on what is in your fridge, but if you show it an image of a room and ask it where the chair is, it may not answer the question correctly,” says OpenAI.
Additionally, the model may misinterpret rotated text or images.
The update also extends the model’s knowledge cutoff, the date of the latest information it was trained on. Previously, the cutoff was April 2023; GPT-4 Turbo with Vision’s knowledge base now reaches December 2023, reflecting ongoing development and improvement.
Next, OpenAI plans to bring GPT-4 Turbo with Vision to ChatGPT, opening access to ChatGPT Plus and Enterprise subscribers.
[via ZDNET, Gadgets 360, Tom’s Guide]