Cécile Eymard from Nurun’s Montreal office also contributed to this article.
Nurun worked on a prototype that combines new technologies with insights from a recent ethnographic sprint. The overall project aimed to improve the shopping experience of visually impaired people through new technologies. The insights that emerged from the ethnography served as foundational elements for our technological analysis and eventually led us to prototyping. We learned surprising things at every step of the process, from the way visually impaired people use technology to the inner workings of computer vision.
It all started with a simple question: how do the blind go shopping? It’s the kind of question that this guy answers on his YouTube channel.
We were planning on refreshing our knowledge of computer vision technologies, but were looking for a way to constrain it with real user requirements—the visually impaired seemed like the perfect audience to have in mind.
We started with an ethnographic study to gain more insight into the challenges that the visually impaired face when they go shopping. We then used those insights as inspiration for a prototype idea enhanced by an exploration of the latest in computer vision technologies.
Designing for the visually impaired requires more empathy than designing for other audiences, so conducting an ethnographic study was crucial to the success of this project. As we discovered, our assumptions were slightly off from the actual realities of the visually impaired. For instance, their knowledge of accessible technologies surpassed what we initially thought, so much so that they had already ruled out some technical approaches to enhancing their lives, such as guided indoor navigation. We also discovered that the visually impaired are methodical in how they adopt new tools, which leaves developers little margin for error. It became clear that good intentions were not enough: understanding their true realities mattered far more.
We interviewed five people who had different levels of visual impairment and who lived in different social contexts. We wanted to understand the role that technology currently plays in their lives and how they used technology, if at all, to overcome the challenges and anxieties related to shopping with a visual impairment.
Throughout our ethnographic interviews and the analysis that followed, a few key elements emerged:
Example of a tool: a contrast plate that helps people with low vision see their food.
These insights then guided a brainstorming session.
We set out to develop an idea that would enhance, rather than replace, a skill the visually impaired have mastered. It was tempting to build a navigation system that would let them move around the store by themselves, but we resisted: our informants were adamant that the technology is still not as reliable as a human assistant.
Instead, we came up with a mobile application that augments the autonomy of the visually impaired by allowing them to manage their own shopping list, empowering them to direct their shopping assistant.
Furthermore, we sought to develop this application in a way that would also be useful for the sighted.
After identifying these three principles, we were ready to prototype.
To achieve the features of our prototype, we looked into three technological topics:
Computer Vision: As computers interact more with the real world, they are developing the ability to understand image data. Computer vision is a key component of many advanced technologies, such as robotics, medical image analysis and surveillance. In our day-to-day lives, computer vision powers technologies such as OCR (optical character recognition), face detection (used by the auto-focus on modern cameras) and even QR codes. While computer vision tends to be computationally intensive, improvements in processing capabilities are making it increasingly practical, especially in the mobile space.
After looking at several image recognition solutions for our prototype, such as IQ Engines, Qualcomm Vuforia, Pointcloud, Kooaba, Layar, Metaio and OpenCV, we chose Moodstocks. Moodstocks provides an impressive real-time image recognition engine that can track more than 1,500 images, ten times more than competing solutions can recognize.
The size of the image set, however, was still insufficient for our needs, so we decided to use indoor location data to switch data sets as the user walks through the store, allowing a potentially unlimited number of recognizable images.
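To make this concrete, here is a minimal, hypothetical sketch of the data-set switching logic. The aisle identifiers and data-set names below are our own illustrations, not Moodstocks API concepts: the idea is simply that each aisle maps to its own bounded image set, so the engine only ever tracks the images relevant to where the user is standing.

```python
# Hypothetical mapping from an aisle (e.g. derived from a beacon region)
# to the bounded image set to load for that aisle. Names are illustrative.
DATASETS_BY_AISLE = {
    "aisle-03-cereals": "cereals-v2",  # stays under the ~1,500-image limit
    "aisle-07-dairy": "dairy-v2",
}

def dataset_for_aisle(aisle):
    """Return the image set to load for an aisle, or None if uncatalogued."""
    return DATASETS_BY_AISLE.get(aisle)
```

When the user's indoor location changes, the app would ask the recognition engine to load the corresponding set, keeping each set within the engine's tracking limit.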
Moodstocks is able to recognize a product even when it is partially occluded. In addition, when an occluded product is recognized but the visible part is not enough to distinguish it from similar products, Moodstocks correctly flags the match as partial, which lets the application give the user better feedback and directives. While this partial matching is much more advanced than in other solutions, it does not work in all cases. For instance, when color is the only difference between products, Moodstocks does not recognize them as different.
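The distinction between full and partial matches is what drives the feedback. The sketch below is our own illustration of how an app could map those results to spoken directives; the result kinds and messages are assumptions, not part of the Moodstocks SDK.

```python
def feedback(kind, products=()):
    """Map a recognition result ('full', 'partial' or 'none') to a directive.

    `products` holds the candidate product names for the match.
    """
    if kind == "full":
        return f"Found {products[0]}."
    if kind == "partial":
        # Too little packaging is visible to tell similar products apart,
        # so ask for the product to be uncovered before confirming.
        return ("Possible match: " + " or ".join(products)
                + ". Please uncover the product.")
    return "No product recognized."
```

A partial match thus becomes an actionable instruction rather than a silent failure, which matters for users who cannot visually confirm the result themselves.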
Another key limitation of the non-specialized image recognition solutions we analyzed is that they rely on image features (such as edges) being present on the objects to be recognized. Smooth, textureless surfaces and highly repetitive patterns are poorly recognized, and organic items such as vegetables did not work with any of the solutions we tested.
Indoor location technologies track the position of a device inside a building, where GPS is usually out of reach. To meet our indoor location requirements, we chose iBeacons, a new technology introduced in iOS 7.
Beacon region monitoring is a technique that notifies a device when a beacon comes into close proximity or goes out of range. It is similar in many ways to circular region monitoring (geofencing), except that a beacon region has no inherent location coordinates. Beacon monitoring is among the most precise indoor location technologies available today, and it was accurate enough to present a shopping list scoped to the aisle in which the user was standing.
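The aisle-scoping step can be sketched as follows. On iOS this would be driven by Core Location's beacon callbacks; here we model only the filtering logic, under an assumed convention (ours, not Apple's) where a beacon's "minor" value encodes the aisle number.

```python
def nearby_items(shopping_list, beacon_minor):
    """Return only the shopping-list items stocked in the user's current aisle.

    `shopping_list` is a list of (item_name, aisle_number) pairs, and
    `beacon_minor` is the minor value of the nearest detected beacon,
    assumed here to encode the aisle.
    """
    return [name for name, aisle in shopping_list if aisle == beacon_minor]
```

As the user moves between beacon regions, the app re-filters the list, so the voice guide only ever mentions items the user can actually act on from where they stand.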
Speech synthesis converts text into speech. For our prototype, we used the speech synthesis system built into iOS’s accessibility features.
The resulting experience is a voice guide throughout the shopping trip. The guide accompanies users from the moment they step into the store to the moment they leave. At every step, their location is used to guide them efficiently through their shopping list, and they are notified only about nearby items. Throughout, users retain full control over their shopping.
The result is a functional prototype that combines ethnographic insights and the newest technologies to enhance the shopping experience of the visually impaired. The technologies we analyzed are impressive; however, some limitations remain, especially around image recognition. Even if our idea does not remove all of the friction the visually impaired face while shopping, it has the potential to enhance their experience significantly.