OpenAI and Be My Eyes collaborate to redefine tech inclusivity

People who cannot see may no longer be at a disadvantage compared to those who can. Be My Eyes and OpenAI have joined forces to create a new app called "Be My AI", which helps blind people experience what has usually been the preserve of the sighted. The app is making waves in the world of inclusive AI, because few had imagined that blind people would take such an interest in capturing pictures with their smartphones.


What is the app all about?

The app, developed through a collaboration between OpenAI and Be My Eyes, leverages generative AI to produce a text description of any image uploaded to it. All a person needs to do is take a photograph using the app and upload it; the app then returns an AI-generated description of everything in the picture. Imagine a blind person travelling to the Eiffel Tower with a sighted friend, who explains the grandeur of the monument in plain language so the blind person can appreciate what a wonder the tower is. Now replace the sighted friend with the app and voilà: you have an AI assistant helping you visualize what is around you.
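To make the workflow concrete, here is a minimal sketch of how an image-to-description request might be sent to a vision-capable GPT-4 model through the OpenAI API. This is an illustration of the general approach only, not Be My AI's actual implementation; the model name, prompt, and helper function are assumptions.

```python
# Illustrative sketch only -- not Be My AI's actual code.
# Assumes the official `openai` Python package and an OPENAI_API_KEY
# set in the environment.
import base64
from openai import OpenAI

client = OpenAI()


def describe_image(path: str) -> str:
    """Send a local photo to a vision-capable model and return a text description."""
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed vision-capable model; substitute as appropriate
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this photo in detail for a blind user.",
                    },
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                    },
                ],
            }
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(describe_image("photo.jpg"))
```

In practice, an assistive app would wrap a call like this in a conversational interface so the user can ask follow-up questions about the same image rather than re-uploading it each time.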


Has anyone done this before?

To some extent, yes. Microsoft launched Seeing AI, a tool that extracts basic information from images and reads out written text. However, it missed some essential details and could seldom pick out the less prominent entities in a picture. Be My AI takes such analysis to a different level altogether: the app from Be My Eyes recognises even subtle and obscure objects in an uploaded image and can describe their colours.


Why does it matter?

The app matters because it aims to help blind people get a sense of their surroundings with just a few taps. It helps them stay more aware of their environment and ask for finer details about any picture. For example, a blind individual can now independently photograph a restaurant menu and ask GPT-4 to read out the prices. Or they can sweep their smartphone around to take multiple pictures and ask GPT-4 to explain what is around them, simply to gauge how safe their surroundings are.


What’s for Tomorrow?

The app is still in its beta testing phase and is improving based on user suggestions so that it adequately serves the needs of blind people. It certainly needs to explain in detail how the privacy of users and their shared images will be protected. Moreover, the accuracy of image descriptions is a concern, since any wrong interpretation of an uploaded image can mislead blind users and misguide their actions. Nonetheless, the emergence of such applications pushes the frontiers of technology and makes it inclusive for a larger population.


Arijit Goswami
