Functional Low Vision: A SLAM perspective

Representing geometric 3D data efficiently is a significant challenge.

If you are interested in efficient computation, you might wonder how much vision is enough for a particular task in the real world. There are animals around us with a much finer sense of vision than us humans, and, of course, animals with very poor vision. Elephants, for example, have a limited field of vision, with a superior ability to see forward rather than sideways or backward. Their eyesight is considered poor in bright light, but in dim light elephants can detect movement at up to 45 m.

How can we mathematicians/computer scientists understand low vision? In this article, I will take a SLAM, or computer vision, approach. In the next article, I will also consider a principled data science explanation.

Before you read further, here are my disclaimers: I know very little about the anatomy of the eyes or the neuroscience of the visual cortex. All I have to say is theorized from personal experience.

(Okay, I am blind in one eye, and according to my doctor there is about 5% vision in the other eye.)

So, back to the question of low/efficient vision. Let's begin with a scenario.

Suppose you travel to a new country, call it coconut-land, and suppose it's your first time traveling there. On your first day, you go for a walk towards the woods all by yourself. There, you notice something wobbly and black in the distance, plodding its way towards you. Since you are new to this country, you are a bit apprehensive, and you start wondering, "Is that a bear?" But then, all of a sudden, you remember reading in the travel guide that the native bear species has been hunted to extinction, and there are no bears in coconut-land. So you keep going, continuing towards the bear-like thing. But now you are really getting confused, as you perceive the UWO (unidentified wobbly object) even more as a bear. The hair on your back starts standing up, you feel more scared, and a shot of doubt creeps up: "I am sure the guide book said there are no bears in these woods?" And just as you are about to take to your heels, you finally understand that the wobbly object is a teenager in a bear costume, and it is Halloween in coconut-land!

Notice that being in a "foreign country", that is, in unfamiliar territory, is a very important component of this story. In our everyday setting, our knowledge and memory fill in the blanks that our visual perception leaves, so most people never experience this scenario in daily life.

So, going back to the original question, what has this story got to do with functional low vision? Here are some observations.

  1. People with low vision are guessing about things at much shorter distances, half a meter to a meter. I can see a person, but I cannot tell who it is, as the information I can squeeze out of my retina is of lower resolution (or so I think). This partly explains why functioning low vision is more common among people who have had such conditions since childhood. When people acquire vision loss later in life, not knowing what lies half a meter away might be too scary. Children seem not to know this fear.
  2. A comparison with SLAM algorithms: A fundamental problem in robotics is how a robot should create a map of its environment (an environment the robot has not visited before, to make matters worse) and know its own location on that map. All moving creatures, animals and in particular humans included, must also solve this problem and know exactly where they are relative to, say, their home. Of course, we take this ability in animals for granted; birds do fly back to their nests from far away, but it is not at all obvious how this works. A classical algorithm that solves this problem is the famous Kalman filter, which helped the Apollo missions land on the moon. It is with this algorithm and its more modern derivatives, such as Monte Carlo (particle) filters, that I want to compare functioning low vision here.

The Kalman algorithm is an application of Bayesian statistics. It starts by making a guess of the location and the map. This first guess could have very little knowledge fed into it, or it could be a well-educated guess, but we have to dare a guess to start with. The Kalman algorithm assumes a Gaussian distribution for its guess; other algorithms make different initial choices. Every observation we can make is then used to update the guess in essentially the most effective way; the Kalman algorithm uses Bayes' formula to achieve this. But here is where something different happens in functioning low vision (again, remember that this might be my personal bias): every new observation is processed against past memories, mediated by logic and/or emotions, before we update our belief about the situation. And only certain memories attend to this update.
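The guess-then-update loop described above can be sketched in a few lines. This is a minimal one-dimensional illustration, not a full SLAM system; all the numbers (measurement noise, the readings themselves) are my own made-up assumptions.

```python
# Minimal 1-D Kalman filter sketch (illustrative numbers are my own
# assumptions): track a position from noisy readings, starting from a
# deliberately vague first guess, exactly as described above.

def kalman_update(mean, var, measurement, meas_var):
    """Bayes-style update: fuse the Gaussian belief with one observation."""
    k = var / (var + meas_var)              # Kalman gain: trust in new data
    new_mean = mean + k * (measurement - mean)
    new_var = (1.0 - k) * var
    return new_mean, new_var

def kalman_predict(mean, var, motion, motion_var):
    """Prediction step: shift the belief by the motion, adding uncertainty."""
    return mean + motion, var + motion_var

# A first guess with very little knowledge fed into it: huge variance.
mean, var = 0.0, 1000.0
for z in [5.1, 4.9, 5.2, 5.0]:              # made-up noisy position readings
    mean, var = kalman_update(mean, var, z, meas_var=0.5)
    mean, var = kalman_predict(mean, var, motion=0.0, motion_var=0.1)

print(round(mean, 2))                       # belief settles near 5
```

Note how little the quality of the first guess matters: after a handful of observations, the huge initial variance has collapsed and the belief sits near the true position.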

Attention

So here is a personal anecdote. Just after the COVID lockdown was lifted, I got a takeaway coffee from a metro-station bakery in Vienna and drank it on the street. Once finished, I wanted to get rid of the empty coffee cup, and I saw an orange strip somewhere. So I started walking towards it, and at a closer distance I realized that it was a woman with a fluorescent orange stripe on her dress. Now why did I end up deciding to walk up to this woman, if all I was searching for was a way to get rid of my coffee cup? If you have not been to Vienna, have a look at the picture below:

So how can we improve the Kalman filter? As many people are trying it today, the answer is no surprise, at least given the technology available today: use an attention-based neural network to improve the prediction of a filter. Attention-based neural networks, or transformers, have been all the rage since their very successful application in large language models (LLMs) like ChatGPT. My story is just anecdotal justification to use them in efficient vision computations as well.

Well, if you are an AI or computer-vision expert, I hope that I convinced you to try a low-vision approach towards solving SLAM problems.
