Recent discoveries have clarified the role of Layers 2, 3 and 4 of the neocortex as recognizers and predictors of feedforward patterns. However, the roles of Layers 5 and 6 are still unclear. In this paper, I propose that the function of Layer 5 is to recognize patterns representing needs for manipulations: actions that can be either physical (e.g., motor actions) or mental (e.g., cognitive tasks such as mental rotations). Such manipulations help us survive (e.g., fight, find food) and disambiguate the local context of feedforward input (i.e., determine which object the sensory stimuli belong to).
Introduction
Recent research backed by experimental data has shed light on how sensory processing might work. The neocortex is made up of 6 layers, labelled 1 to 6 from the outermost inwards. Neurons in the neocortex are organized into regions, and each region is made up of many cortical columns, each of which spans all 6 layers.
Each region is tasked with processing a given range of inputs; for example, one region might be responsible for processing tactile stimuli. Each column in that region activates when a given pattern of incoming stimuli is recognized: for example, one column might activate whenever the right index finger touches a sharp edge along a given direction (i.e., whenever it receives incoming stimuli from the tactile neurons of the right index finger in a pattern indicating contact with a sharp edge).
Each column not only receives feedforward stimuli from the receptive field of the region it belongs to, but also receives feedback stimuli from other regions of the neocortex and lateral stimuli from adjacent columns. These stimuli help the column recognize the context in which the feedforward stimuli are being detected (Hawkins, 2011). For example, a column might receive feedback that the right hand is holding a teacup; in that case, when the right index finger comes into contact with the top circular edge of the cup and the column receives feedforward input indicating the edge, it knows not only that it is an edge, but also that it is the teacup’s edge. Such context helps the column predict its future input.
When a region receives a previously learnt pattern of stimuli, the corresponding column activates and fires the neurons encoding the local context of those stimuli. There exist three possible scenarios, summarized in Table 1 below. They all follow the same sequence: identification of previously learnt contexts – incoming feedforward stimuli – selection of which and how many neurons to fire.
- Based on the feedback / lateral stimuli, the column identifies a single previously learnt context. When an incoming feedforward stimulus that matches the pattern the column is trained to recognize arrives at the region, the column fires only a few of its neurons; the set of neurons it fires encodes the local context. The meaning of this event is “the column recognized its stimulus pattern in that local context”.
- Based on the feedback / lateral stimuli, the column identifies a few previously learnt contexts. They might all apply to the situation at hand, or the column might have incomplete information that allows a few possible contexts but does not yet narrow them down to a single one. In this case, when an incoming feedforward stimulus that matches the pattern the column is trained to recognize arrives at the region, the column fires multiple sets of its neurons, each set encoding one possible local context. The meaning of this event is “the column recognized its stimulus pattern in a few possible local contexts”.
- Despite the feedback / lateral stimuli, the column does not recognize any previously learnt context that is supposed to contain the stimulus it is tasked with recognizing. When an incoming feedforward stimulus that matches the pattern the column is trained to recognize arrives at the region, the column fires all of its neurons, possibly to represent the fact that all local contexts are possible. Indeed, if the column did not recognize any previously learnt context, it has no information about the current local context; and if it has no information about the current local context, it is as if all contexts were possible.
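The three scenarios above can be caricatured in a few lines of code. This is a minimal toy sketch, not a biophysical model: it assumes a column can be reduced to a mapping from each learnt context to the small set of neurons encoding it, with the `fire` function and all context names being hypothetical illustrations.

```python
# Toy model of the three firing scenarios (hypothetical simplification):
# a column stores one small set of neurons per previously learnt context.
# On a matching feedforward stimulus, it fires the union of the sets for
# the recognized contexts, or every neuron when no context is recognized.

def fire(column, recognized_contexts):
    """Return the neurons that fire when the column's stimulus pattern arrives."""
    if not recognized_contexts:
        # Scenario 3: no known context recognized -> all neurons fire
        return set().union(*column.values())
    # Scenarios 1 and 2: fire only the sets encoding the candidate contexts
    return set().union(*(column[c] for c in recognized_contexts))

column = {
    "teacup edge": {0, 1},
    "glass edge":  {2, 3},
    "can edge":    {4, 5},
}

print(fire(column, ["teacup edge"]))                # single context: few neurons
print(fire(column, ["teacup edge", "glass edge"]))  # ambiguity: multiple sets
print(fire(column, []))                             # unknown context: all neurons
```

Note how ambiguity is represented by the sheer number of active neurons: the more contexts remain possible, the denser the firing.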

An example of local context determination
Let’s take a region tasked with processing touch sensations. The region is composed of many columns, each of them tasked with recognizing a particular pattern of tactile sensations. Imagine a blindfolded person sitting in front of a teacup, unable to see it. The person rests one of her fingers on the top edge of the teacup. The column responsible for recognizing when the fingertip touches an edge becomes active. As it has no idea of the local context of the edge, all possibilities about the nature of the object the edge belongs to are open; therefore, the column fires all of its neurons.

Now, the person starts moving her finger around the cup, following the edge for about one inch. The column responsible for recognizing when the fingertip touches an edge keeps firing; however, this time, only with a subset of its neurons. By integrating feedback from other parts of the brain, this column now knows that the edge is longer than one inch, and is thus able to eliminate the possibilities that do not apply to this scenario. Only the neurons that correspond to possible scenarios remain active. This process of disambiguation is presented in Table 2 below.

Then, imagine that the person keeps moving her finger around the circular edge of the cup until the circle is completed. Now, the tactile region knows that the object being touched must include a deep circular edge. Only a few known objects match this criterion; for example, a glass, a teacup or a can. Thus, only a few of the cells of the column remain active – those that correspond to local contexts matching these objects.

Finally, imagine that the person keeps her finger on the edge of the teacup while using the other hand to remove her blindfold, thus seeing what lies in front of her. Now, our column fires with only a handful of neurons – those corresponding to the local context “teacup”.
This example has been freely adapted from “A Theory of How Columns in the Neocortex Enable Learning the Structure of the World” (Hawkins, Ahmad, & Cui, 2017). Note that a given concept or context is not represented by a single neuron, but by a handful of them, which form a sparse distributed representation. This detail has been left out of this paper as it is not necessary to understand the hypotheses presented here.
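The teacup narrative can be sketched as progressive elimination over a set of candidate objects. This is a deliberately naive illustration of the idea, assuming each object can be reduced to a set of feature strings; the objects, features and their assignments are hypothetical, chosen only to reproduce the stages of the story.

```python
# Toy sketch of the disambiguation process (assumed simplification):
# each new piece of evidence keeps only the candidate objects that are
# consistent with everything sensed so far.

objects = {
    "knife":  {"edge"},
    "plate":  {"edge", "long edge"},
    "glass":  {"edge", "long edge", "circular edge"},
    "can":    {"edge", "long edge", "circular edge"},
    "teacup": {"edge", "long edge", "circular edge", "seen: teacup"},
}

candidates = set(objects)  # blindfolded: everything is still possible
for evidence in ["edge", "long edge", "circular edge", "seen: teacup"]:
    candidates = {o for o in candidates if evidence in objects[o]}
    print(f"{evidence!r} -> {sorted(candidates)}")
```

Each print line corresponds to one stage of the example: first all objects remain, then those with a long edge, then those with a circular edge, and finally only the teacup once vision confirms it.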

The function of L5
The previous example shows how action helps disambiguate context. Action enables the collection of data, which progressively eliminates irrelevant local contexts until the range of possible contexts converges towards a single one. This concept helps clarify one of the functions of Layer 5 of the neocortex (L5), as shown in the next paragraphs. Consider the following two statements.
- It is widely accepted that L5 has a role in motion.
- It is also known that motion helps with disambiguating sensory input (Hawkins et al., 2017).
From these two statements, it follows that at least one of the functions of L5 may be to help with the disambiguation of sensory input. I propose that, often, when sensory input provides us with ambiguous data, L5 triggers manipulations that help disambiguate it. The first thing a toddler does when given a new object is to manipulate it to understand how it is made. The main thing a toddler does with his legs is to manipulate them (sending random motor orders to them) in order to understand first how they are made, then how they work and finally how he can use them.
However, disambiguation of sensory input is not the sole function performed by L5, but rather a specific instance of a more generic function. To understand what this generic function is, we need to examine L5’s connections.
L5 connections
L5 consists mainly of pyramidal neurons, of which there are two primary types: cortico-subcortical (CS) and cortico-cortical (CC). CS neurons connect with some of the areas of the central nervous system from which motor orders originate and are thus generally considered to be involved with motion. CC neurons, however, mostly connect to areas of the cortex that are unrelated to motion; in particular, they mostly connect to areas related to visual perception (Kim et al., 2015). Their role is unclear. Are L5-CC neurons related to action as well, despite their lack of connections to motor areas? If so, how?
(Some readers might know that, in addition to the areas of the CNS from which motor orders originate, CS neurons also connect with areas of the cortex not commonly related to motion. It has been hypothesized that CS neurons send motor orders to the subcortical areas and then send a copy of them to the cortex, so that the cortex can update its internal model to compensate for the motor orders that are about to be executed. This hypothesis might very well be true; however, it does not explain the purpose of the L5-CC neurons that connect solely to areas unrelated to motion. This question is the focus of this section.)
L5 triggers manipulations
I hypothesize that L5 (CC and CS alike) is concerned with triggering manipulations of two kinds: physical and mental. Physical manipulations are motor actions: movements that we make in order to change the position of parts of our body and/or of objects in our environment, either to understand them or to pursue our (conscious or unconscious) will. For example, we might eat an apple to obtain nutrients, or we might slide our fingers over an object in a dark room to gather information to identify it. Mental manipulations are cognitive tasks that we perform in order to apply transformations to items in our mind. For example, we might mentally rotate an upside-down image to visualize it from a different angle, imagine how a new piece of furniture would look in our living room, imagine what a nut will look like after we hit it with a rock, or use planning to achieve our goals. Within this framework, L5-CS neurons are primarily tasked with physical manipulations, while L5-CC neurons are tasked with mental ones.
L5 recognizes needs for manipulations
In addition to proposing that L5 triggers manipulations, I suggest that the firing patterns of L5 represent needs for manipulations rather than directly representing the manipulations themselves. This interpretation is plausible for several reasons. First, it allows for composability. Different regions of the brain might each compute the need for a distinct manipulation; a motor region receiving multiple needs can then compose them into a single motor order (for example, the need to push an object and the need to rotate it might be combined into a single smooth movement). Composability explains how we are able to make complex movements that serve more than one purpose, rather than serially queuing individual movements one after another like a primitive robot would. Second, this interpretation allows for specialization. Some manipulations are very complex, and yet shared between different contexts. For example, both writing and drawing require very precise movements of the hand; it makes sense to have a single area of the brain tasked with moving the hand with precision, and multiple other areas able to request its activation. Third, the output of L5 generally passes through the basal ganglia (BG) before reaching its destinations (Cowan & Wilson, 1994; Levesque, Charara, Gagnon, Parent, & Deschenes, 1994), and it is generally understood that the BG has a prioritization function; it makes much more sense for the BG to arbitrate between needs rather than actions, since the former are easier to compare and prioritize (the value of fulfilling a given need is much easier to compute and more predictable than the value of executing a given action).
Though the activations of L5 represent needs for manipulation, they sometimes become direct orders for manipulation. This happens in cases where composability and specialization are not necessary, and what is required is a single, simple action.
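The composability and prioritization arguments above can be made concrete with a small sketch. This is a hypothetical toy model, not a claim about actual BG circuitry: it assumes a need can be summarized as a requested movement plus a scalar value estimate, and the `Need`, `gate` and `compose` names are illustrative inventions.

```python
# Hedged sketch of "needs, not actions" (assumed model): regions emit
# need signals, a BG-like gate ranks them by an assumed value estimate,
# and a motor stage composes the surviving needs into one smooth command.
from dataclasses import dataclass

@dataclass
class Need:
    name: str
    dx: float      # requested displacement component
    dtheta: float  # requested rotation component
    value: float   # assumed worth of fulfilling this need

def gate(needs, threshold):
    """BG-like gating: by default inhibit; let pass only valuable needs."""
    return [n for n in needs if n.value >= threshold]

def compose(needs):
    """Motor stage: blend the gated needs into a single movement command."""
    return (sum(n.dx for n in needs), sum(n.dtheta for n in needs))

needs = [
    Need("push object",   dx=1.0, dtheta=0.0, value=0.9),
    Need("rotate object", dx=0.0, dtheta=0.5, value=0.8),
    Need("scratch nose",  dx=0.2, dtheta=0.0, value=0.1),
]

passed = gate(needs, threshold=0.5)
print(compose(passed))  # one smooth movement serving two needs: (1.0, 0.5)
```

The point of the sketch is the division of labour: the gate compares scalar values (easy to prioritize), while only the motor stage ever deals with concrete movement parameters.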
Further considerations
In this section, I provide five separate considerations that might explain some emergent behaviors of our brain.
The importance of motivation for abstract thought
Observing a classroom of kids studying mathematics, one would notice that unmotivated students appear unwilling or unable to perform the mental manipulations needed to make sense of the equations. Conscious abstract thought requires motivation. The hypothesis presented in the core of this paper explains why. It is known that most connections originating from L5 pass through the BG before reaching their final destination (Cowan & Wilson, 1994; Levesque, Charara, Gagnon, Parent, & Deschenes, 1994), and that the BG inhibits by default the signals transiting through it, selectively letting them pass only when it judges them worthy of execution. If mental manipulations are triggered by L5, they are subject to the same kind of BG-mediated, motivation-based gating that physical manipulations have to go through, which explains why motivation is important for the execution of both.
The representation of thoughts
It has been proposed that objects are defined by a set of locations, with grid cells in the neocortex representing the location of a sensor patch (for example, the tip of a finger) in the location space of the object (Hawkins et al., 2018). Similarly, I propose that abstract concepts are represented as sets of mental manipulations in a location space – in other words, as sets of transformations or comparisons that can be applied to other concepts and that are spatially defined.
To better understand this proposition, consider that whereas L2/3 in cortical regions close to sensory input recognizes patterns of sensory inputs, L2/3 in cortical regions distant from sensory input recognizes patterns of firing in the L2/3 of lower regions. The former recognizes patterns representing tangible objects, and the latter recognizes patterns representing patterns of tangible objects. And what are patterns of tangible objects, if not transformations of, or comparisons between, them?
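The two-level recognition described above can be sketched as nested pattern lookups. This is a toy illustration under strong assumptions: real recognition is not exact dictionary matching, and the feature names, object labels and relation labels here are hypothetical.

```python
# Toy sketch of hierarchical recognition (assumed model): a lower level
# maps raw feature patterns to object labels; a higher level recognizes
# patterns over those labels, i.e. relations between tangible objects.

lower = {
    frozenset({"circular edge", "handle"}): "teacup",
    frozenset({"flat top", "four legs"}):   "table",
}

higher = {
    frozenset({"teacup", "table"}): "teacup on table",
}

def recognize(level, features):
    """Return the label for a learnt pattern, or 'unknown' otherwise."""
    return level.get(frozenset(features), "unknown")

obj1 = recognize(lower, {"circular edge", "handle"})  # -> "teacup"
obj2 = recognize(lower, {"flat top", "four legs"})    # -> "table"
print(recognize(higher, {obj1, obj2}))                # -> "teacup on table"
```

The higher level never sees raw features; its "sensory input" is the output of the level below, which is the sense in which it recognizes patterns of patterns.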
The PFC as a sandbox for abstract interactions
A recent interpretation is that grid cells in the neocortex keep track of the location of sensory features (Hawkins et al., 2018). For the areas of the neocortex whose primary feedforward input is sensory, the presence of grid cells makes sense: such areas are indeed tasked with processing sensory stimuli. However, for the prefrontal cortex (PFC), a large part of the input is non-sensory; thus, the purpose of grid cells in the PFC is unclear.
It has been proposed that the location-based framework can be applied not only to physical structures but also to abstract concepts such as mathematics and language (Hawkins et al., 2018). In line with this, I hypothesize that one of the ways [footnote]The other way is by implicitly encoding them as properties of a given object[/footnote] our brain keeps track of whether abstract concepts apply to physical objects in our perception field is by “mapping them over”: i.e., assigning abstract concepts the same location as the physical objects they apply to. Similarly, abstract concepts can interact with each other by being assigned the same location as another abstract concept or by modifying the location of another abstract concept. In other words, location in the mental space is used not only to keep track of the position of physical objects, but also to keep track of logical relationships between abstract concepts and to attribute abstract concepts to physical ones.
In this interpretation, the PFC could be considered the provider of a sandbox for abstract concepts and abstract interactions: a playground in which to apply mental manipulations and mentally calculate their results.
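"Mapping over" can be reduced to a very small sketch: tracking entities in a shared coordinate space and reading relations off shared locations. Everything here is a hypothetical illustration (the entities, coordinates, and the idea that co-location alone encodes attribution are assumptions of the sketch, not claims about cortical coding).

```python
# Minimal sketch of "mapping over" (assumed model): an abstract concept
# is attributed to a physical object by assigning it the same location
# in a shared mental coordinate space.

locations = {}                              # entity -> (x, y) in mental space
locations["teacup"] = (3, 1)                # a physical object
locations["fragile"] = locations["teacup"]  # abstract concept mapped over it

def attributed_to(entity):
    """Return every entity sharing a location with the given one."""
    return sorted(k for k, v in locations.items() if v == locations[entity])

print(attributed_to("teacup"))  # -> ['fragile', 'teacup']
```

In this caricature, the single reference frame does double duty: the same machinery that answers "where is the teacup?" also answers "what applies to the teacup?".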
Explanation for a single conscious self
Healthy individuals report having a single “conscious self”. How can this be compatible with the fact that our brain performs distributed computation (each region, and even each column, performing its own computation), as if we were made of thousands of brains (Hawkins et al., 2017)? Perhaps the answer lies in the way our brain keeps track of location.
Our brain uses grid cells to build an internal spatial model which keeps track of proprioception and of where the objects and beings we observe(d) are relative to us or to the room/space we find ourselves in. Moreover, grid cells are found in every cortical column, suggesting that each column keeps track of location individually. However, the brain uses a single reference point at a time and expresses all locations in relation to that reference point (Hawkins et al., 2018). Otherwise, things would get messy pretty quickly, and thoughts and concepts would overlap each other [footnote]I believe that this is the case in dreams, where the absence of sensory stimuli coming from our body prevents the internal spatial model from functioning properly and causes concepts to overlap[/footnote]. The existence of such a single reference point could explain why we report having a single conscious self.
Consciousness as a simulator
Sometimes, our brain can compute the result of manipulations, or of interactions between concepts, intuitively. Other times it cannot; mostly when such manipulations or interactions are novel or complex. In such cases, I hypothesize, the brain uses a simulator: it plugs in data regarding the initial configuration of objects and concepts; it lets the simulator run, step by step and in a visual way; finally, it analyzes the result of the simulation and makes decisions accordingly. I propose that the observation of this simulator running is what we call “conscious thought”.
Conclusions
I propose that the role of Layer 5 of the neocortex is to recognize patterns representing needs for manipulations – both physical and mental.
