# Introductory post, and the invariance problem

There are several topics I would like to talk about in future posts. They generally fall under theoretical (or systems) neuroscience, and sometimes more broadly under biological physics. The topics include: the problem of invariance in theoretical neuroscience; Schrödinger’s take on the physics of living matter and other modern thoughts on the fundamental principles underlying biology (optimality principles, the role of noise, etc.); dynamics and computation (are they mutually exclusive concepts?); correlations (in statistical physics, and in an ensemble of neurons); reinforcement learning; and more. I will start with a post on invariance.

The problem of invariance, in computational neuroscience, is commonly thought of as an effort to understand how the brain’s cognitive performance stays invariant despite variation in latent features that are irrelevant to the task. Researchers in machine learning and computer vision also seem to be interested in the topic, because (among many other reasons) there is a lot of commercial interest in building algorithms that can recognize useful features in a robust manner. As a theoretical neuroscientist, I am interested in the problem of invariance because the phenomenon seems to be present in various modalities, such as vision, audition, and somatosensation, and across different scales. In the hopeful case, understanding invariance may be linked to something universal in the brain’s information processing.

First, how can we define invariance? Since it refers to invariance in performance, one has to define a task, and then define the latent variables under which the performance is invariant. Consider an example where a subject has to discriminate the direction of motion of a bar along one axis, say x. In other words, the bar is presented twice, the second time shifted slightly, and the subject is forced to decide whether the shift was to the left or to the right. Suppose that the first bar is at position x, and the second bar is at position x+dx or x-dx, where dx is much smaller than x. If x is chosen at random and the subject is not told explicitly what x is, then x is the latent variable. The problem can be made more complicated (and more relevant) by introducing other variables such as the thickness, color, or contrast of the bar. These problems are simplified versions of a problem where a subject detects the direction of motion of an arbitrary object regardless of the nature of the visual input produced by the object. This is what we do every day when we notice the direction of objects moving around us, and it is clear that the real brain can do these tasks well regardless of what the objects are, as long as they produce considerable signals in the visual periphery.
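To make the task concrete, here is a minimal sketch of a single trial in Python. Everything specific in it is my own assumption for illustration and not part of the task definition above: the Gaussian tuning curves, the additive Gaussian noise, the ranges, and all names (`make_trial`, `tuning_width`, `noise_sd`).

```python
import numpy as np

def make_trial(rng, n_neurons=50, dx=0.05, tuning_width=0.5, noise_sd=0.1):
    """Simulate one trial of the two-bar discrimination task.

    Assumed model (illustrative only): each neuron has a Gaussian tuning
    curve over bar position, plus independent Gaussian noise.
    """
    # Preferred positions of the neurons, tiling the x axis evenly.
    centers = np.linspace(-5.0, 5.0, n_neurons)

    # Latent variable: the reference position x, unknown to the decoder.
    x = rng.uniform(-2.0, 2.0)

    # Task-relevant variable: shift left (-1) or right (+1).
    direction = int(rng.choice([-1, 1]))

    def population_response(position):
        mean = np.exp(-(position - centers) ** 2 / (2.0 * tuning_width ** 2))
        return mean + noise_sd * rng.standard_normal(n_neurons)

    first = population_response(x)                    # bar at x
    second = population_response(x + direction * dx)  # bar at x +/- dx
    return first, second, direction, x

rng = np.random.default_rng(0)
first, second, direction, x = make_trial(rng)
```

A decoder in this setting sees only `first` and `second` and has to report `direction`; the value of `x` changes from trial to trial and is never given to it.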

Then can we mathematically define invariance for the task above? The naive and simplest answer would be the following: the performance on the task is exactly the same for every value of the latent variable. This is not wrong, but it is too strict a definition for theoretical neuroscientists to say something meaningful about the brain. According to this definition, the brain is certainly not invariant, as shown by numerous psychophysics tests. For example, we are better at discriminating the orientation of bars around the cardinal angles than around the oblique angles [1]. The point is not to have exactly the same performance for every value of the latent variable. When neuroscientists say invariant recognition or discrimination, what they really refer to is robust performance. There are two ways one can mathematically define this robustness. One way to describe “invariance” is by comparing the performance to the bound set by the available information (e.g. the Cramér-Rao bound given by the Fisher information). In this sense, achieving Cramér-Rao optimality for each value of the latent variable suffices as invariant performance. Another definition of invariance uses a notion that physicists like: “extensivity.” If the performance of a particular network structure on a task can be described as a function of the signal-to-noise ratio of the network output, then does this signal-to-noise ratio grow linearly with the size of the network N (the number of neurons)? If the extensivity of a neural model agrees with that of an optimal decoder, we at least know that the performance of the network is qualitatively the same as that of an optimal decoder, up to a coefficient (even when the performance of the model is not exactly optimal).
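For readers who want the two notions written out, here is a short sketch. The symbols are my own choices for illustration: $\mathbf{r}$ is the population response, $f_i(x)$ the tuning curve of neuron $i$, $\sigma^2$ the noise variance, and $\delta$ the separation between the two stimuli to be discriminated. The second line assumes independent Gaussian noise, which is the simplest case and not a claim about real neurons.

```latex
% Cramér-Rao bound for any unbiased estimator \hat{x} of x
\mathrm{Var}(\hat{x}) \;\ge\; \frac{1}{J(x)},
\qquad
J(x) = \mathbb{E}\!\left[\left(\frac{\partial}{\partial x}
       \ln P(\mathbf{r} \mid x)\right)^{2}\right]

% Fisher information under the assumed independent Gaussian noise
J(x) = \frac{1}{\sigma^{2}} \sum_{i=1}^{N} \bigl(f_i'(x)\bigr)^{2}

% Squared discriminability of an optimal decoder for separation \delta
(d')^{2} = \delta^{2}\, J(x) \;\propto\; N
```

Under these assumptions the sum has N terms of comparable size, so the squared discriminability of the optimal decoder (one common way to define the signal-to-noise ratio) grows linearly with N. A model readout is extensive in the sense above if its own signal-to-noise ratio also grows linearly with N, possibly with a smaller coefficient.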

The natural question that follows is: what is a candidate principle for reading out information from sensory responses such that invariant performance is achieved? As it turns out, the presence of nonlinearity is crucial for achieving invariant performance (a linear decoder has been shown to fail the invariant bar discrimination task [2]). Then, what kind of nonlinearity does the readout need in order to achieve invariant performance? Answering this question is closely related to the role of correlations in population coding. In the next post, I will discuss the issue of correlations in neural decoding.

[1] Girshick, Ahna R., Michael S. Landy, and Eero P. Simoncelli. “Cardinal rules: visual orientation perception reflects knowledge of environmental statistics.” Nature Neuroscience 14.7 (2011): 926-932.

[2] Seung, H. S., and H. Sompolinsky. “Simple models for reading neuronal population codes.” Proceedings of the National Academy of Sciences 90.22 (1993): 10749-10753.