Apple Debuts AI Image Editor That Turns Text Into ‘Photoshop-Style’ Tweaks
By Mikelle Leow, 09 Feb 2024
Image generated with AI and tested on Apple’s MLLM-Guided Image Editing (MGIE) / Hugging Face
Apple is skipping the lasso tool and magic brush in its leap into the photo editing world. The Cupertino tech giant has introduced MLLM-Guided Image Editing (MGIE), an artificial intelligence model that transforms typed commands into “Photoshop-style” edits. Developed in partnership with the bright minds at the University of California, Santa Barbara, MGIE leverages multimodal large language models (MLLMs) to make professional-level photo manipulation easy and accessible to the masses.
Instead of making users drag a cursor around a canvas, MGIE lets them tweak images with just a few keystrokes, be it for basic adjustments like cropping and adding filters or more intricate edits such as altering the brightness of specific objects or even reshaping them entirely.
Want to brighten a gloomy photo? Just type “add contrast to simulate more light,” and voilà, your image is as sunny as a summer afternoon. Fancy changing the color of a laptop screen in a photo? Tell MGIE to “let the laptop have a green web page,” and watch your photo turn into a mockup. In another example shared by the researchers, MGIE adds more veggies to a pepperoni pizza when guided to “make it more healthy.”
Apple's new AI model could revolutionize Image Editing
— Gina Acosta (@ginacostag_) February 7, 2024
Apple has released an exciting new AI tool called MGIE (MLLM-Guided Image Editing) that allows users to edit photos through natural language instructions. This technology aims to simplify image editing using the power of… pic.twitter.com/og8SjmFYQo
It’s also useful for those who often find themselves at a loss for words when coming up with prompts: the engine’s strength lies in its ability to digest simple or even vague instructions and translate them into precise editing actions.
Apple's 🤩 MGIE is now opensource! Edit your images with natural language
— Gradio (@Gradio) February 5, 2024
Multimodal LLM-Guided Image Editing or MGIE derives more expressive instructions for the model to follow the edits & leads to notable improvements in editing
Project links pic.twitter.com/1gEaJlozB6
MGIE runs on two core functions: understanding user prompts through the lens of multimodal language models and visualizing the requested edits to bring them to life. This dual capability enables anyone with a keyboard to get creative without mastering complex software.
While the tool has yet to launch officially as an app, eager users can get a taste of its capabilities on platforms like GitHub and Hugging Face.