In recent years, several companies have developed apps or implemented technology which uses artificial intelligence to describe photos to blind people. In 2016, Facebook released a feature which auto-generates alt text for photos uploaded to the social network. A year later, Microsoft released Seeing AI, an app with multiple features, one of which is analysing photos. As is typical for Apple, the company has built auto-generated image descriptions into its screen reader, VoiceOver, which can be found on all its devices.
With the release of iOS 14, Apple included lengthier image descriptions which should provide context to photos, enabling a blind person to fully understand what is happening. Rather than saying “two people, table,” the description might now say “two people sitting at a table, probably eating dinner.” Apple also very clearly differentiates between descriptions of everyday objects and content which it deems to be of an adult nature. VoiceOver users can customise content filters, receiving either a verbal warning or a sound alert to let them know that an image may contain adult content. I wanted to test the feature to see if I could receive high-quality descriptions, or whether Apple would filter certain aspects.
First, I visited a website selling adult toys. I wanted to see if AI-generated descriptions would be able to describe the toys, or if they would even flag them as adult. The descriptions I received were generally underwhelming. Typically, vibrators were assigned descriptors like “cylindrical object” or “probably a bottle.” I feel this has more to do with the feature still being fairly new than with gatekeeping and denying us access to information.
Secondly, I investigated how iOS recognises photos of people wearing lingerie. I used the Asos app, viewing various kinds of underwear. The descriptions I received, whilst not overly descriptive, were fairly accurate. Photos were described as “a person wearing a black bra,” or “a person wearing a bikini,” or “black lacy panties.”
I took screenshots from the Asos app and tested them in Microsoft Seeing AI, to see how another service which uses artificial intelligence to generate image descriptions would describe them. The app identified one of the photos as “probably a woman wearing a bra,” but for the other two photos I only received responses of “probably a woman taking a selfie.” This doesn’t indicate whether Seeing AI will censor adult content, as the quality of image descriptions varies so widely that it could simply be a case of the app choosing certain features to focus on.
Finally, I wanted to see how pictures of actual naked bodies would be described. I felt extremely uncomfortable searching the internet for someone else’s photos, so I took photos of myself. This also enabled me to control the test a little more. I started by taking a photo of the top half of my body while I was wearing a bra. Again, it described this as “a person wearing a bra,” which showed consistency between image descriptions in apps and in the camera.
When I took my bra off, getting descriptions was a little harder. At first, I was only able to generate “possible adult content, photo of an adult.” I then took photos of myself from different angles and with varied lighting, and received a description of “possible adult content, photo of a naked person.” I only took photos of the top half of my body, so I don’t know how it would handle photos of an entire naked body, or whether specific body parts would be identified. I also didn’t upload these photos to Seeing AI to compare descriptions, because the app requires an internet connection and uploads photos to the cloud, something I was not comfortable with.
The fact that we’re getting this level of description, explicitly identifying naked bodies, is extremely important. Historically, blind people have been denied access to information which is deemed to be of an adult nature. When technology is developed, content filters are applied with no way of removing them. Effectively, tech companies have decided what blind people should be allowed to know. Apple have chosen not to do this. By building this feature into their devices and enabling us to control whether we view descriptions of adult content, they have made a statement regarding our right to information.
Even services with the mission of providing blind people with visual interpreting do not allow for this. When working directly with an interpreter, whether they are comfortable with the content or not, it would be a violation of company policy to ask for descriptions of adult content. Perhaps AI can fill the gap. If Apple chooses to continue to develop this technology, we might be able to receive very detailed descriptions, enabling us to view media online or take photos for a partner. Finally, we’ll have a choice.