Pleasure of learning

From supermemo.guru
Revision as of 21:43, 13 January 2022 by Woz

This text is part of: "I would never send my kids to school" by Piotr Wozniak (2017)

The main problem in education

The main problem with regard to education is the belief that learning may cause displeasure, and that this displeasure should be endured to achieve more learning.

There are countless educators who believe that school should be like work: it is unpleasant but it just needs to be done. In this chapter, I will explain that the opposite is true:

Good learning is inherently pleasurable, and without pleasure there is no good learning.

The displeasure myth is so prevalent that even good teachers believe that pain is part of learning.

In this chapter, I show that the pleasure of learning is wired into the brain, and how we systematically destroy this gift of evolution at the cost of mankind's health, learning, creativity, and ultimately future.

The main problem of education is also one of the main problems of society. By destroying the pleasure of learning we are contributing powerfully to the destruction of the pleasure of living. We have built an education system that sets millions of people up for a life of unhappiness.

Chances are you are skeptical of my words, as the myth of unpleasant learning is a potent side effect of schooling. Therefore this chapter is an attempt to convince you. And all that is necessary to abolish this myth is an understanding of the simple mechanism by which new knowledge is encoded in the brain.

Learn drive and entropy

The concept of entropy is helpful in understanding why most kids do not learn much at school.

You may recall from your physics class that entropy is a measure of disorder, and that the second law of thermodynamics states that the entropy of an isolated system never decreases. This is the type of sexy law of physics that we tend to remember for life. It is universally applicable.

There is a sister concept in information theory called Shannon entropy. It can be understood as the average value of information transmitted by a source. For example, take a channel that is continually transmitting a string of identical letters into infinity (e.g. a string of As: "AAAAAA..."). It is entirely predictable and carries an entropy of zero. We do not learn from such a channel.
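As a rough illustration of the zero-entropy channel described above, entropy can be estimated from symbol frequencies. This is only a sketch under simplifying assumptions: the function name `shannon_entropy` and the sample strings are made up for this example, and the estimate treats symbols as independent, so it ignores any order in the stream (a strictly alternating string is in fact predictable from context, even though its symbol frequencies suggest otherwise).

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Estimate entropy in bits per symbol from observed symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    # Sum p * log2(1/p) over all distinct symbols, with p = n / total.
    return sum((n / total) * log2(total / n) for n in counts.values())


constant_channel = "AAAAAAAA"      # a string of identical letters
alternating_channel = "ABABABAB"   # two equally frequent symbols

print(shannon_entropy(constant_channel))     # 0.0
print(shannon_entropy(alternating_channel))  # 1.0
```

The string of identical letters yields zero bits per symbol, which is the formal counterpart of "we do not learn from such a channel".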

Claude Shannon proposed the concept of information entropy in 1948. Soon after, scientists were hypothesizing as to whether the entropy of an information channel may have a powerful impact on how the brain perceives the value of the channel. In 1957, Meyer hypothesized that the entropy of music determines the perception of its beauty. He concluded that a higher entropy may result in subjective tension, which correlates with more meaningful musical moments.

Meyer's thinking was later refined to better understand the perception of music and information in general. There is more to music than just information. This is visible in the phenomenon of a song remaining entertaining and fun over many playbacks, which is rarely the case with books.

Music is a universal message. If you were given a choice of a radio channel, you would quickly tune out from noisy static and you would also not be too excited about zero entropy silence. However, most people will respond positively to a regular beat of a drum. As long as it wasn't being drummed on broken glass, which we are wired to dislike, we would find a radio channel with a regular drumbeat more interesting than a silent one. This will naturally last only for a while until the drumbeat itself becomes boring and too predictable.

Today, we can finally test the response of the brain to information entropy. Neuroimaging shows that the anterior hippocampus responds to the entropy of a visual stream, and similar findings have been confirmed for the ventral striatum. Therefore we are now certain that the brain responds to information entropy. The entropy sensor is important in scanning the environment for learning opportunities. This is the prelude to the reward that underlies the learn drive.

Prior knowledge in information seeking

We need to distinguish between information and meaning. Entropy is not a good measure of the latter. The measure of meaning must involve the brain itself in addition to the information channel metric. Prior knowledge is essential in learning. Imagine that in your search for an interesting channel on the radio you find a news service. If the service is delivered in Thai and you do not speak Thai, you will prefer a service delivered in English. In the information-theoretic sense, the two news channels may have the same entropy, yet your prior knowledge will make you opt for the English channel. While the Thai channel delivers a stream of sounds, the English channel delivers a stream of concepts. Without understanding the knowledge of the recipient, information entropy tells us little. We cannot determine a signal-to-noise ratio.

Every listener will have his or her own preferred level of information entropy. For most music lovers, the regular beat of a disco or techno will be somewhat more interesting than the isolated beat of a drum. This type of music carries a higher average level of information. For a more sophisticated listener a bit of syncopation will be welcome. However, syncopation requires a degree of prior learning. Those with lesser knowledge of music may get confused by increased rhythmic complexity. If there is too much information in the beat it may no longer be possible to dance to the music. To an average ear, the genius of Wynton Marsalis may be hard to perceive. Top-shelf jazz is reserved for a small fraction of highly educated listeners. For most of the population, as the complexity increases, the music slowly disintegrates in the direction of radio static.

Entropy detectors in the brain

The brain cannot effectively detect the entropy of the signal hitting the retina or the eardrum. Like pixels of a monitor, retinal cells are not aware of what they display. If the detector, such as the hippocampus, is to light up in response to entropy, it must operate on the inputs from the entorhinal cortex (i.e. the input to the hippocampus itself). Those inputs will present the signal after a high degree of processing. Instead of pixels, they may present a concept. A high entropy signal at the sensory inputs will lose most of its noise component early in the process of neural selection, completion, and generalization. The signal-to-noise ratio will determine how much information is lost. The bigger the noise, the bigger the loss. The smarter we are, the more selective this processing will be and the more information will be lost at that stage. That's good. We become blind to detail. Pattern recognition will act like a deterministic function, which can never increase entropy and, by mapping many inputs to one output, results in a drop in entropy. Complex patterns may become simple concepts. Those concepts will provide the actual input to the detector, e.g. the hippocampus.
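The claim that a deterministic recognizer lowers entropy can be checked on a toy signal. The sketch below is an illustration only: the eight brightness levels, the two-concept recognizer `to_concept`, and the helper `entropy_bits` are all assumptions invented for this example, not a model of actual neural processing.

```python
from collections import Counter
from math import log2


def entropy_bits(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum((n / total) * log2(total / n) for n in counts.values())


def to_concept(brightness):
    """Hypothetical deterministic 'pattern recognizer': 8 levels -> 2 concepts."""
    return "dark" if brightness < 4 else "light"


# Raw input: eight brightness levels, each seen equally often.
raw_signal = [0, 1, 2, 3, 4, 5, 6, 7] * 10
concepts = [to_concept(v) for v in raw_signal]

raw_entropy = entropy_bits(raw_signal)      # 3 bits per sample
concept_entropy = entropy_bits(concepts)    # 1 bit per sample
print(raw_entropy, concept_entropy)
```

Because many raw values collapse into one concept, two of the three bits per sample never reach the detector in this toy setup.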

Note that the visual stream produced in experiments that prove the response of the hippocampus to signal entropy has a highly symbolic nature. As such, the stream will lose far less information in processing. That highly simplified and conceptualized message will be scanned for surprisal and provide guidance to the entire learn drive system. This is why, in this case, the hippocampus appears to be responding to input entropy.

The above reasoning explains why both low and high entropy sensory signals can be uninteresting. After a degree of processing, a high entropy signal may lose all its noise and deliver a low entropy input to the hippocampus. We then observe the illusion of an "optimum entropy" level at sensory input. We need a new concept, learntropy, that will help us accurately determine the attractiveness of the signal. Learntropy needs to take into account the high degree of processing of information before it can activate reward centers in the brain. Learntropy is discussed later in this text.
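The illusion of an "optimum entropy" can be reproduced with three toy channels: silence, a regular drumbeat, and static. The sketch below is a deliberately crude assumption: the amplitudes, the `perceive` function that lumps every unrecognized sample into a single "noise" concept, and the channel lengths are all invented for illustration.

```python
import random
from collections import Counter
from math import log2


def entropy_bits(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum((n / total) * log2(total / n) for n in counts.values())


def perceive(sample):
    """Hypothetical preprocessing: collapse raw amplitudes into concepts."""
    if sample == 0:
        return "silence"
    if sample == 9:
        return "beat"
    return "noise"  # anything unrecognized is lumped together


rng = random.Random(0)  # fixed seed so the run is repeatable
channels = {
    "silence": [0] * 64,
    "drumbeat": [9, 0, 0, 0] * 16,
    "static": [rng.randint(1, 8) for _ in range(64)],
}

results = {}
for name, raw in channels.items():
    processed = [perceive(s) for s in raw]
    # Store (entropy at the sensory input, entropy after processing).
    results[name] = (entropy_bits(raw), entropy_bits(processed))
    print(name, results[name])
```

At the sensory input, the drumbeat sits between silence and static. After processing, it is the only channel in this toy that still carries any entropy, which is how a middle value can look like an optimum.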

Speed of information processing

An under-appreciated factor in sensory information scanning is the speed of information processing in the brain.

For every piece of music, there is a tolerable playback range where the beauty of the music is appreciated. A high speed playback can be annoying and the music may become hard to decode, as the high speed goes beyond our processing power. The same piece of music slowed down can quickly lose its appeal. The same happens in speech delivery or in classroom lecturing. For the same information and the same entropy level, we may accomplish highly different levels of signal attractiveness. There is always an optimum speed of delivery and that speed depends on all other factors that power the learn drive, incl. prior knowledge. As such, speed of delivery is highly individual.

I like to listen to lectures at 1.4x speed. I use 1.3x for more ambitious pieces. I never speed up Fareed Zakaria, though. I would rather relish every piece of information in his show. Students in a classroom lecture do not have a speed-up or slow-down button. Even the pause button, if available, is hard to hit as it may annoy other students.

In schools, all too often, the speed of delivery surpasses students' processing capacity. This results in negligible learning and high stress. There is no time to enjoy the landscapes in the window of a high-speed train. At MIT they call it "drinking from a firehose".

Probability vs. knowledge

Low probability events carry more information. Average information determines entropy. Prior knowledge determines the perception of an information channel's entropy.
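The first two statements above can be made concrete with a small calculation. In the sketch below, the function names and the headline probabilities are assumptions made up for illustration; the only fixed ingredients are the standard definitions of surprisal (log of the inverse probability) and entropy (the probability-weighted average surprisal).

```python
from math import log2


def surprisal_bits(probability):
    """Information content of an event with the given probability, in bits."""
    return log2(1 / probability)


def entropy_bits(distribution):
    """Entropy as the probability-weighted average surprisal of all outcomes."""
    return sum(p * surprisal_bits(p) for p in distribution.values() if p > 0)


# Hypothetical news channel: made-up probabilities for three kinds of headline.
headlines = {"routine weather": 0.5, "sports result": 0.25, "celebrity baby": 0.25}

for headline, p in headlines.items():
    print(headline, surprisal_bits(p))  # 1.0, 2.0 and 2.0 bits

print(entropy_bits(headlines))  # 1.5 bits per headline
```

The same headline would score differently for a listener who assigns it a different probability, which is where prior knowledge enters the picture.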

If you happen to tune in to radio news and you hear that "Janet Jackson has delivered a baby", your degree of interest will depend on the probability of the event. If you have no idea who Janet Jackson is, this is a high probability event. As some 350,000 women deliver babies every single day, this is no longer news and is not new or interesting. The first death of a soldier in a war makes news, but when deaths increase into the thousands, young lives become just a statistic.

If you happen to know Janet Jackson or like her music, the probability of a baby delivery drops dramatically to the level of "once in a lifetime" (for Janet). This can make you become interested. However, if you recall Janet as a beautiful girl from some ancient sitcom, her baby delivery may go into the category of "Impossible!". If you realize Janet is 50 years old, and you know about menopause, you may instantly become morbidly curious about her case. Your prior knowledge determines how you respond to the message. There is no optimum entropy level for a channel. There is only an optimum entropy level that fits a specific brain. At this point you may see that we need to introduce a new derived concept, which we will later call learntropy. Learntropy will determine the attractiveness of a given channel for a given brain.

If you love Janet-like gossip, the channel rich in that gossip will provide the right level of surprisal for you. It will provide the learntropy match. If you lack knowledge or your priorities differ, you will tune out. Your learning priorities will also determine your level of knowledge in particular areas and your response to any particular channel and its information entropy.

Predictability and surprisal

Probability and complexity are not the only components in information perception. We seem to look for a balance between predictability and surprise. I like funk. In this type of music, the bassline is often highly predictable with the optimum dose of syncopation. It makes it easy to synchronize the body motion with the rhythm. However, funk would not be interesting if it did not carry surprise. This is where the sophisticated jazz riffs tickle the neural system responsible for the detection of surprisal. In addition, after decades of learning, there is a whole database of signals that my brain responds to. There may be that one backup singer voice that I recognize and like. My brain is ready for funk.

I love Ken Robinson lectures on creativity. In one way, they are highly predictable. I totally agree with Robinson, so you can say that Robinson feeds my confirmation bias. This is pleasurable. When people agree with us, we like to say "great minds think alike". But if Robinson just kept repeating the same dry mantras on how schools kill creativity, he would lose his appeal. Entropy can be interpreted as the average expected surprisal. Robinson's delivery carries a great deal of nice surprises. He may paint the same models in a different and unusually creative way. As a result, the brain receives new information, produces a generalization, and confirms the existing models. Generalizations derived from new contexts increase knowledge coherence. This is a very pleasing type of complementarity in a message based on a known model.

Robinson lectures find a good balance between predictability and surprise.

The most pleasing information channels will keep delivering surprises that confirm existing models and arm them with new semantic twigs on which new knowledge can be built. A surprise that destroys existing models may not be pleasing at first, but may lead to a highly pleasing revolution in thinking.

Metaphorically, you can imagine this as the information channel massaging your tree of knowledge and adding new branches like a potter who adds new layers of clay to his perfectly shaped creation.

Detecting surprisal

Human learn drive is based on detecting surprisal. We have known that for ages. All models of human and machine learning involve that concept under different names. Piaget wrote about schemata that fall into disequilibrium under the impact of surprisal. In his models of the neocortex, Jeff Hawkins speaks of prediction errors that underlie learning and intelligence. I like to speak of models, their elaboration (when new information fits the model), contradiction (when new information requires changes to the model), and generalization (when forgetting and memory optimization sculpt out new quality from the model).

For the reward of learning, a new surprising piece of information needs to fit pre-existing knowledge (models, schemata, predictions, or so). For the reward to be delivered, neural processing is necessary. Information on the input needs to be processed, and compared with information stored in the brain. One of the chief processors of input information in the brain is the hippocampus. It is the brain's information switchboard that is able to compare the input with prior knowledge.

Measuring the entropy of the visual stream is not necessarily a reliable indicator of the pleasing power of the information channel. All information streamed to the hippocampus undergoes a high degree of processing. A stream of pixels representing a beautiful beach will be processed into a series of shapes and textures. Those in turn will model palms, sand, and the sea. This highly compressed simple information will determine the original response to the information input.

Scanning for information in the environment is equivalent to scanning for scents of food. The scent is enticing, but only the actual feeding is a true reward. This is why entropy scanning does not need to be rewarding. All it needs to do is to lead to a reward. The anterior hippocampus responds to entropy, as noted earlier. However, the experimental design made sure that the entropy refers to combinations of simple shapes that do not lose much information during input processing. Instead of speaking of signal entropy, we should rather focus on the input entropy at the information comparator such as the hippocampus. It is not the retinal pixels that matter, but the shape of the palm as represented on the comparator input. For the comparator, the high entropy pattern of grayness or static noise will not differ from whiteness or silence. They will all bring the same entropy on input: zero. This is why I used the term learntropy to accurately refer to the attractiveness of the information channel.

The anterior hippocampus that responds to signal entropy is famous for the discovery of the Halle Berry neuron. Using electrodes implanted in a consenting epilepsy patient, researchers were able to pinpoint a single neuron consistently responding to images of Halle Berry in various contexts. The same neuron would also respond to Halle Berry's name. At the same time, the posterior hippocampus might respond less consistently to Jennifer Aniston (perhaps an indication of a preceding layer of neural processing).

Most of us have no idea how Halle Berry smells, and her smell might not be unique enough to activate the Halle Berry neuron in the hippocampus. However, even the smell signal can get there fast via just a few synapses in the olfactory bulb, olfactory tubercle, piriform cortex, and the entorhinal cortex (see picture). Moreover, if one could hear the sound of Halle's voice, the smell signal might meet the sound signal in the olfactory tubercle, contribute to recognition, and result in the subsequent activation of the Halle neuron in the hippocampus or further down in the neocortex.


Figure: Olfactory system anatomy. The smell signal can get to the hippocampus fast via just a few synapses in the olfactory bulb, olfactory tubercle, piriform cortex, and the entorhinal cortex. (source: Wikipedia)

Does it all mean that Halle resides permanently in the patient's hippocampus? Due to the association of the hippocampus with formation of new memories, we may rather think that Halle shows up in hippocampal neurons as a result of the recognition. Her permanent place in the heart of the patient is likely situated further downstream in the neocortex. We now know that in the process of memory consolidation, knowledge engrams move from the hippocampus to the neocortex. We are also pretty sure that this process is happening in sleep. It is in the neocortex that we should look for concept neurons representing Halle or one's grandmother. This last possibility gave rise to a hypothetical type of neuron called the grandmother cell.

In monkeys, researchers could identify grandmother cells in the visual cortex that respond to faces. There we might find cells that more consistently fire up in contact with Halle's image. However, the concept of Halle might still reside elsewhere and be activated, among others, by visual cortex cells upon noticing Halle.

Another activation route might come from hearing Halle's name on the news. The entire recognition process would be orchestrated by the entorhinal cortex and the hippocampus while the ultimate Halle neuron would light up somewhere in the layers of the neocortex.

For an information-rich signal to generate a reward, there must be a low probability event detected on input and encoded via association as new knowledge in the cortex. Where the anterior hippocampus would respond to the entropy, the activity of the extensive bilateral thalamo-cortical network would be modulated by the surprise factor. There we shall search for the roots of the pleasure of learning. There are also other comparator centers that might be involved depending on the type of the message. The amygdala has also been found to likely produce rewards when detecting novel visual signals. The same amygdala neurons that respond to rewarding visual stimuli may respond to novel visual stimuli. Rolls hypothesized that this may implement the reward of novelty via the amygdala.

We know that the hippocampus connects directly with the nucleus accumbens (the brain pleasure center). This connection might be used in two contexts:

  1. the anticipation of pleasure and
  2. the ultimate reward.

The anticipation would follow the detection of a high learntropy signal and would result in active pursuit of high value messages. Detecting a message by the hippocampus might then simultaneously send associative learning messages to the neocortex and the reward signal to the pleasure center. That would spell the moment of learning something new!

The wow factor

In the summer of 1977, while looking for extraterrestrial intelligence, SETI researchers discovered an unusual radio signal coming from Sagittarius. In the bland low-level noise of cosmic space the signal was highly unlikely. Low probability marks high surprisal. Astronomer Jerry Ehman circled the 6 characters corresponding to the signal on a printout and marked them with "Wow!".


Figure: A scan of a color copy of the original computer printout, taken several years after the 1977 arrival of the Wow! signal. (source: Wikipedia)

"Wow!" is how the brain responds to a sudden discovery. The moment is highly pleasurable. The entire purpose of the learn drive is to look for wow factors in the environment. These are the most valuable nuggets of knowledge that complement what is currently known: the current model of reality. The pleasure of incremental reading comes from the condensed power of wows streamed into the student's brain.

Thus far, we have seen the impact of entropy, surprisal, predictability, and current knowledge on learning. In this case, the mere probability of the signal does not fully explain its power. It is the interpretation that stands behind it (see: Knowledge valuation network). At the moment of making his note, Ehman could sense the enormity of its implications. This had been the most powerful evidence thus far and ever since for the existence of intelligence other than human intelligence. If the same signal represented detecting sardines in the ocean, there would be no "wow!". Not even in the Arctic.

The reliability of the information channel is important. If the error rate is high, the learn drive may weaken. When Penzias and Wilson discovered cosmic microwave background radiation in 1964, there was no "wow!". Perplexed researchers went on to remove pigeon droppings from their radio antenna. Pigeon droppings received a priority in their explanation of the mysterious noise. In 1978, for their discovery, Penzias and Wilson received a Nobel Prize.

When a scientist makes a discovery, he may exclaim "Eureka!" and punch the air. A neural network somewhere in his brain has produced a generalization that results in sending a reward signal. This propagates further and makes an old man jump around the lab like a child.

The same happens early in life. A toddler in an empty room will scan the environment for low probability components like colorful objects, new toys, etc. When a toddler experiments with a spoon dropping off the table, she is like a little scientist. When her brain makes the generalization "all falling spoons make noise", she is rewarded too. She may celebrate in exactly the same way as the happy scientist, independent of age. A big smile is the first clear sign.

The same happy thing occurs to a lesser degree in all forms of learning controlled by the learn drive. It does not matter if we learn about a celebrity or the chemical composition of a rock. Things are interesting because they reward the brain through the learn drive mechanism.

A creative process will also produce rewards. An association deemed useful is rewarding. An association that leads to a solution to a difficult problem is even more rewarding. Clearly there is a gradation of rewards. The system can quantify the probability of information, association, or a solution. The lower the probability, the higher the reward.

Knowledge valuation network

Knowledge valuations

All granular pieces of knowledge processed by the brain are instantly evaluated for their relevance, coherence, and value. We instantly know if information is understandable and useful. We also often instantly notice when it is inconsistent, incoherent or irrelevant.

Unusual and surprising bits of knowledge are highly valued. However, probability isn't the best reflection of value from the brain's point of view. There are highly unlikely events of low significance (e.g. asteroid strike in a remote planetary system), and likely events that change one's life (e.g. the answer to "Will you marry me?").

Knowledge valuation relies primarily on the applicability of knowledge in achieving personal goals.

The emotional brain and the rational brain

The knowledge valuation network is an evaluation system based on the resultant of emotional and rational valuations of knowledge. In the literature, it may be referred to broadly as neural valuation circuitry, which is not necessarily knowledge-specific.

In the valuation network, emotional valuations will connect information with rewards in the primitive brain centers responsible for hunger, thirst, sex drive, etc. Rational valuations will be knowledge-based. An example of a pure emotional valuation comes from an answer to "Where is the nearest fast food shop?". Knowledge-based valuations may be more complex and highly networked, i.e. dependent on a network of subvaluations. The answer to "Which book is best for my exam?" is evaluated through one's goals that include passing an exam leading to getting a degree affecting one's job prospects, and contributing to lifetime goals. Emotional and rational valuations segregate anatomically. The emotional valuations come from what has metaphorically been described as older portions of the triune brain: reptilian and paleomammalian structures. For example, a specific stimulus processed by the thalamus may send separate signals to the amygdala for an emotional evaluation, and to the neocortex for a rational valuation. The emotional brain is phylogenetically older. Personality and education determine if rational valuations can control or override emotional valuations.

Decision tree in fast thinking

The knowledge valuation network is the network of memory connections that determine the value of an individual piece of knowledge. If learning is interpreted as a task, the valuation network will determine the perceived task value (see: Problem valuation network).

In computational terms, the knowledge valuation network can be compared to a decision tree. Goals and emotions determine core values at the root of the tree. Semantic connections between pieces of knowledge can be interpreted as fractional value transfer from goals to details. A well-organized semantic network of well-consolidated and well-chosen knowledge will need mere milliseconds to make expert decisions. This is what Kahneman calls automatic fast thinking (if you are interested in tough problems that require slow problem solving, see How to solve any problem?). The same kind of processes that underlie decision making or problem solving participate in knowledge valuation. Like many expert decisions, the valuation is fast and often runs with low participation of conscious intentionality. In short, we sometimes die to know things without fully being able to explain why. This process is hardly under our own control, let alone the control of the teacher at school. For efficient learning, valuations must be high.
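The idea of fractional value transfer from goals to details can be sketched as a small tree walk. Everything in the example below is hypothetical: the concept names follow the exam example used earlier in this text, while the link fractions, the `LINKS` and `ROOT_VALUES` tables, and the function `value_of` are invented purely for illustration.

```python
# Hypothetical valuation tree: each entry maps a concept to (parent, fraction),
# where `fraction` is the share of the parent's value passed down the link.
LINKS = {
    "job prospects": ("lifetime goals", 0.8),
    "degree": ("job prospects", 0.5),
    "passing the exam": ("degree", 0.5),
    "book A": ("passing the exam", 0.5),
    "book B": ("passing the exam", 0.25),
}

# Core values at the root of the tree (assumed to be set by goals and emotions).
ROOT_VALUES = {"lifetime goals": 1.0}


def value_of(concept):
    """Value of a concept: its parent's value scaled by the link fraction."""
    if concept in ROOT_VALUES:
        return ROOT_VALUES[concept]
    parent, fraction = LINKS[concept]
    return value_of(parent) * fraction


best_book = max(["book A", "book B"], key=value_of)
print(best_book, value_of(best_book))
```

In a well-consolidated network, a shortcut could cache the value of "passing the exam" so that the walk up to the root is skipped, which is one way to picture the speed of expert valuations.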


Figure: Xefer.com is a tool that helps understand knowledge as a network. It relies on semantic links between Wikipedia articles

Valuation network in education

The brain builds a valuation network in the course of learning over years and decades. Through optimization in sleep and via forgetting, the network is polished and smoothed up for efficient operation. This makes it easy to take valuation shortcuts. A student choosing a book may no longer see his exam in the full context of his whole life. He might have developed a quick shortcut: "In the next 3 months, all I want to do is to pass geology".

The knowledge valuation network is highly specialized and differs greatly from individual to individual. The balance between reason and emotions will differ. The balance between goals will differ. The valuation network will shape differently in the mind of a criminal, and differently in the mind of a researcher with lofty goals based on the good of mankind.

The picture shows exemplary valuations that are determined by personal interests in cancer and fasting:

Figure: Exemplary hypothetical concept activations and valuations upon encountering a declarative statement "In fasting, the NK cells learned to use fatty acids as fuel instead of glucose, which is typically their primary energy source. This really optimizes their anti-cancer response because the tumor microenvironment contains a high concentration of lipids, and now they’re able to enter the tumor and survive better".

Here is a set of concept map activations that result from reading the passage. Colors indicate concept connections that form concept maps representing individual statements:

  • I employ fasting (time-restricted feeding) (light green)
  • I believe in health effects of fasting (brown)
  • Health is essential for productivity (pink)
  • Productivity serves IVS (intrinsically valuable state) (dark brown)
  • Fasting does metabolic training on NK cells (as suggested in the passage) (purple)
  • NK cells are important in combating cancer (this is prior knowledge reinforced by the passage, which is represented in total by light blue)
  • Cancer is my main longevity risk (e.g. due to my family history) (black)
  • Longevity serves IVS (intrinsically valuable state) (dark blue)

The highly branched concept map responsible for conveying the newly acquired knowledge from the passage is presented in the pinkish circular area. The essential value concepts are located on the right on the white background. They focus on self and the main life goals (incl. longevity, productivity, etc.). Red arrows show how concepts impart value on other concepts in the knowledge valuation network. The value of a piece of knowledge is imparted by associations with the goal of goals (IVS). IVS imparts value on health, longevity and productivity, which in turn make fighting cancer important, while the news reveals that fasting trains NK cells to improve the natural fight against cancer.

Compare: The same concept map generalized by forgetting

Valuation network in development

The development of the network will depend on the personality, lifetime experience, and the environment. Childhood trauma or personality characteristics, e.g. impulsivity, may increase chances of developing a criminal mindset. Some traumatic events in early life may favor developing biased networks based on single-minded obsessions (see: falsity vector). The environment and the available knowledge will determine passions, interests, goals, and network subvaluations (see: conceptualization).

The ideal path towards developing healthy network valuations is a childhood sheltered from trauma and chronic stress, with no external stressors shaping emotional valuations, plenty of play, and free learning in large behavioral spaces.

All strategies that promote healthy brain development will also promote a rich, highly individualized, and efficient valuation network. Such a network will underlie a sparkling learn drive. All educators agree that we want to help kids have a good grip on their emotional life and build smart, creative, and knowledgeable brains.

The chief problem of educational systems is a cookie-cutter approach in which all kids are fed the same knowledge in an industrial fashion with little respect to the key component of efficient learning: the learn drive. Learn drive is a perfect computational device that matches the current status of the semantic network representing knowledge in the brain with current input produced by the knowledge valuation network in response to information available in the environment. If the kid insists that he must see that YouTube video, his own brain is the best authority. All interference will affect future independence and creativity.

While a lecturing teacher may spend 45 minutes to feed a child with a long string of symbols that produce low valuations, and negligible memories, the same kid, with access to Google, within 3-5 minutes, will identify pieces of information with high valuations, and easy coding for lifetime retention (for an opposing view see: The morbid myth of Digital Dementia). For kids well trained in the process, the efficiency of knowledge acquisition may be an order of magnitude higher in self-learning. When I say "order of magnitude", I am just being cautious and conservative. I do not want to run into accusations of hyperbole. I included a couple of examples of specific comparisons in this text elsewhere (e.g. 13 years of school in a month or 1600% acceleration of learning during vacation).

Where I speak of golden nuggets of knowledge, Peter Thiel speaks of the power law: a small set of core skills honed to perfection can produce power returns.

Small investments in learning can produce dramatic changes to individual lives and to the entire planet!

Knowledge valuation in the brain

The research into the actual anatomical implementation of the knowledge valuation network in the brain is of paramount importance for the understanding of the human mind. It is essential for prevention of depression and addictions. Knowledge valuation underlies efficient learning, creativity, and problem solving.

Good learning is pleasurable. Rewards of food, sex, or drugs tend to saturate. Happy learning does not have this property. It is easy to avoid unhappy learning. This is done instinctively via the learn drive. This is why learning is of supreme hedonic importance. It can literally lift societies to a new happier level.

Orbitofrontal cortex

The networked nature of knowledge valuation is indicative of the use of cortical resources. Indeed, most researchers seem to lean towards the belief that the entire system of valuations might be centered in the orbitofrontal cortex (OFC) with the level of abstraction increasing towards anterior areas. There are many models and hypotheses on how individual subsystems affect valuations (e.g. common currency, common scaling, somatic marker, appraisal-by-content, multiple component, cognitive-motivational interface, parallel appraisal, locationist vs. constructionist models, etc.). In the common currency model, all valuations from all sub-systems (hedonic substrates) are integrated and provide the ultimate signal of "wanting" or "liking". For example, (1) knowledge-based valuations from medial OFC (mOFC) might combine with (2) reward anticipation from the nucleus accumbens (NA), and (3) food appraisal messages from the insula to affect decision-making in the choice of a restaurant for the next meal.

Common currency model

OFC is a fantastic research area due to the convergence of many lines of human interests: drug addictions, anhedonia, learned helplessness, obsessive-compulsive disorder, etc. The common currency model seems to indicate that the high associated with explosive creativity or explosive learn drive is neurochemically and neuroanatomically comparable to the high produced by low doses of cocaine.

There is a lively dispute on whether all rewards are translated into a reward signal that converges on the same type of neurons, or if they retain the origin of their character. I think the discussion is redundant as specificity can be conferred by individual concept map activations, while the ultimate valuation generated by a single output can constitute the common currency. In all valuations, we need to have a convergence due to the existence of a single answer corresponding with a single concept map activation. Some OFC neurons seem to specifically encode high-level value.

In knowledge valuation and in decision-making, we need a single boss. Redundancy can be used to restore the valuation system, but there is no escape from a concept neuron decision. We cannot have two decision makers that would make a hand with a fork stab an itching eye during a dinner even though competing neural forces make such a scenario possible due to a computing error.
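The common currency idea and the need for a single decision maker can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the signal names, the numbers, the weights, and the weighted sum itself. The text does not say how the brain combines subsystem signals. The sketch only shows that several inputs can be reduced to one number per option, and that a single winner is then selected.

```python
# Hypothetical subsystem signals (arbitrary units) for two restaurant options.
OPTIONS = {
    "salad bar": {"knowledge": 0.9, "anticipation": 0.3, "food_appraisal": 0.4},
    "burger joint": {"knowledge": 0.2, "anticipation": 0.8, "food_appraisal": 0.7},
}

# Assumed weights of the three subsystems in the final valuation.
WEIGHTS = {"knowledge": 0.5, "anticipation": 0.3, "food_appraisal": 0.2}


def common_currency(signals, weights):
    """Collapse all subsystem signals into one scalar valuation."""
    return sum(weights[name] * value for name, value in signals.items())


def decide(options, weights):
    """Single decision maker: the option with the highest valuation wins."""
    return max(options, key=lambda option: common_currency(options[option], weights))


print(decide(OPTIONS, WEIGHTS))
```

With these assumed numbers, knowledge-based valuation dominates and the salad bar wins. Shifting the weights towards reward anticipation flips the choice, while the output remains a single decision.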

Emergence of knowledge valuations

Building up the valuation network may occur via the interaction of individual concept maps. For example, if the exam concept is valued because of the job prospects concept, they may be coactivated, and the valuation of the employment concept may confer a valuation on the exam-related concept map. The degree of the activation and the associated concept valuations may determine the ultimate appraisal. Myelin concentration may increase in pathways targeting the ventral striatum, which may be one of the ways to explain how the learn drive can be boosted with learning (or suppressed with coercive learning at school). The role of the orbitofrontal cortex in determining valuations might be similar to the role of the hippocampus in establishing long-term memories. Those highly connected regions of the cortex may play the role of a switchboard that connects areas of interest, only to relinquish the role of a matchmaker while the linked concept maps (or centers) develop their own wiring for fast connectivity (e.g. in sleep). With new wiring, highly valued concepts might affect pleasure centers with no mediation from the OFC. This way, some concept cells (e.g. associated with one's favorite actor) could generate pleasant valuations by sheer solo activation.
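The way one concept may confer valuation on another (the exam valued through job prospects) can be sketched as value flowing along weighted associations. This is a toy model under stated assumptions: the concept names, the association strengths, the intrinsic value of the goal, and the rule "strength times the value of the linked concept" are all hypothetical.

```python
def concept_value(concept, links, intrinsic, _seen=frozenset()):
    """Value of a concept: its own intrinsic value plus value inherited
    from the concepts it serves, scaled by association strength."""
    seen = _seen | {concept}
    inherited = sum(
        strength * concept_value(target, links, intrinsic, seen)
        for target, strength in links.get(concept, {}).items()
        if target not in seen  # ignore cycles in this toy model
    )
    return intrinsic.get(concept, 0.0) + inherited


# Hypothetical associations: concept -> {concept it serves: strength}.
LINKS = {
    "exam": {"job prospects": 0.8},
    "job prospects": {"productivity": 0.9},
}
# Only the goal carries value of its own in this sketch.
INTRINSIC = {"productivity": 1.0}

for name in ("productivity", "job prospects", "exam"):
    print(name, round(concept_value(name, LINKS, INTRINSIC), 2))
```

In this sketch the exam has no value of its own. Weakening either association (e.g. when the student stops believing that the exam affects job prospects) lowers its valuation accordingly.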

Harms of reversal learning

In the case of negative school conditioning, we may associate irrelevant contexts (e.g. colors of items in SuperMemo) with low valuations. In this scenario, the concept of a white item, or the coactivation of the concept of item and color, will suppress valuation by providing a strong negative input. Outwardly it looks like a cut-off signal that blocks valuations (perhaps in the lateral OFC). In such contexts, association of concepts would still be possible, and short-term retrieval might be likely. However, low valuation would prevent consolidation of memory (e.g. by blocking the transfer to long-term cortical storage) (see: How school turns off memory). Reprogramming reward (e.g. swapping template color in SuperMemo) could occur in reversal learning. We know that animals with OFC injury are impaired at reversal learning (Mishkin 1972), which adds to the evidence for the anatomical location of the supreme valuation networks. If we keep overriding valuation signals, we might end up with the war of the networks, which is my hypothetical claim on the origins of learned helplessness induced by schooling. School coercion might be seen as a form of perpetual reversal learning that will wear on network plasticity, leading to long-term adverse effects on the ability to evaluate rewards in decision-making. In that light, human memory might be seen as an EPROM with a limited number of erase cycles. If long-term learning is seen as a buildup of the synaptic substrate that is then pruned in the process of stabilization (which in turn reduces synaptogenesis), reversal learning might lead to an unresponsive system in which learning is no longer possible.

Endless fake rewards and micro-penalties may turn off knowledge valuations and undermine long-term learning at school

Goals vs. habits

The knowledge valuation network is central in healthy free learning. In contrast, passive schooling leads to learned helplessness. Coercion converts goal-oriented behaviors into the acceptance of passive habits (as opposed to healthy habits honed in the pursuit of goals). The output from the knowledge valuation network is suppressed by lower valuations in the system (i.e. lower activation of concept maps of interest). This naturally leads to a less joyful state of mind. When learn drive withers, when curiosity dies, life becomes a series of habits executed with little reward (see: 50 bad habits learned at school).

Without the pleasure of learning, human existence becomes a joyless set of habits

Knowledge valuation that affects the course of life

Personal anecdote. Why use anecdotes?
My school tried to block the best thing in my life

I have my own striking example of the power of the valuation network in confrontation with the education system:

In 1985, I computed the approximate function of optimum intervals for knowledge review needed for developing long-term memories. This was the birth of spaced repetition. Originally, the function was applicable using a pen and paper. Within a few months, I realized the system was extremely powerful. I knew that I could double its power with the use of a computer. However, I did not know anyone who could write learning software based on my math. In those days, the entire population of programmers in Poland was made of old timers doing Fortran or Cobol on mainframes, or a growing mass of amateur enthusiasts working with microcomputers such as ZX 81, Commodore 64, or ZX Spectrum. I decided to write the program myself. I had no programming skills though. I was a student of computer science, and I asked my teachers for help. However, our only course of programming was the assembly language of Datapoint. Those skills were great for playing with registers and coming up with 11*11=121. I wanted to learn something more useful for programming SuperMemo. My school kept demanding that I learn to compute the resistance of an electronic circuit, or learn symbolic integration. My knowledge valuation network produced a simple output: programming skills would lead to SuperMemo, which would lead to faster learning (in all fields, incl. electronics or calculus). I was determined to learn programming. My school was determined to stop me (by loading other compulsory courses). In desperation, I enrolled in the University of Economics in Poznan, which had a course of algorithmic languages. The course focused on Pascal. I had to do my normal load of classes and do my Pascal in extra time. That course was nice, but we did all learning in theory, and on paper. There were very few PCs at Polish universities in those days (1986) and most practical applications ran on mainframes called Odra (produced for the Soviet bloc in Poland as of 1960).
When I finally got my first computer: ZX Spectrum (Jan 4, 1986), I could finally start learning to program real computers. Before my computer arrived, I started writing my first program. I wrote it on paper! It was a program for organizing my day (sort of Plan in SuperMemo). Not much later, I was able to learn Pascal too. First I had to reduce the bad impact of school and cut the load of classes. I struck a deal with my teacher of electronic circuits. I would do some high-pass filter calculations for him, and this would be a chance to improve my Pascal skills. The program took many hours to write and was a monumental waste of time. It was a perfect example of bad learning. I hardly understood how my own program worked. However, it was still better than just learning diagrams. For my programming skills, that learning was good, and I improved a lot.

It is hard to express it in words to those who do not know programming, but the difference of knowledge valuations between university courses and doing one's own programming is comparable to the size difference between a plum and Jupiter. While my colleagues suffered through boring lectures in electronics and metrology, I could make my start. I would learn nothing at school. I would learn a bit in my extracurricular course of Pascal. However, only the practical knowledge backed up by passion and clear goals mattered. By December 1987, my effort culminated in writing the first version of SuperMemo, which totally changed the course of my life. The open mind of my supervisor Dr Zbigniew Kierzkowski let me devote my whole Master's Thesis to the subject of SuperMemo. Happy 80th birthday Professor Kierzkowski! It was pretty unusual for a student to make his own determination on that scale, and then compound it with the fact that the thesis was written in English (less than a decade later, Polish parliament tried to make such efforts illegal). This involved a big administrative and tactical battle back in 1989.

My school almost destroyed SuperMemo, i.e. the major source of my present joy. There was no malice involved. Most of my college teachers were fantastic people. It was the system that was designed to squeeze students through a rigid curriculum rather than give them space for creative expression that is the best basis of education

My school was actively trying to block me from accomplishing the most important thing that underlay my entire professional life and future. If I had been a bit more compliant, more conformist, more prone to social pressures, I would have been a "better" student, and I would have invested more time in the theory of electronic circuits, calculus, metrology, and abstract algebra. As a result, this article would have never been written. This site would not exist.

I would not trade my present life for any other type of career in research or industry. I survived the denial attack by providing resistance based on a strong knowledge valuation network.

We need to design an education system in which kids do not need to battle for the right to develop

Learntropy

There are many factors that affect how messages and information channels are perceived and valued by the brain. In preceding sections we have noticed that the brain does not respond just to entropy. There are many factors that modulate the impact of entropy or surprisal of individual messages. Those factors include: encoding, speed of delivery, pre-processing (e.g. generalization, completion, recognition, etc.), prior knowledge (incl. valuation, emotional valence, channel reliability, etc.), optimum level (affected by speed of processing), and more.

The complexity of the process calls for a better concept that can encapsulate all those nuances. I suggest the use of the term learntropy to describe the attractiveness of an educational channel or signal from the point of view of an individual brain in a specific context.

Learntropy is the attractiveness of any educative signal as determined by the learn drive system.

Lectures can be boring or attractive. Learntropy expresses their attractiveness from the point of view of an individual.

While entropy has a precise mathematical definition, learntropy would probably best be measured by the response of the reward system to the act of learning from the analyzed signal. As much as entropy depends on the probability of individual messages, learntropy will depend on the rewarding power of these messages (pictures, sounds, sentences, etc). That rewarding power will be associated with probability, but the valuation will largely depend on the knowledge valuation network.

For good learning there is a reward. However, there is also bad learning. There is a decoding failure penalty. If a student makes an effort to decode a message and fails, he is penalized. This is how frustration is born. This is how the dislike of learning begins. If learntropy is low, reward is little, penalty is high, and the net result may be negative. If we take negative reward signals into account, learntropy could actually assume negative values. A boring lecture could carry negative learntropy. It will result in suppressing the learn drive.
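The contrast between entropy and learntropy, including the possibility of negative values, can be sketched in a few lines. Only `shannon_entropy` is the standard formula. The function `net_reward` and the per-message reward numbers are hypothetical stand-ins for the response of the reward system. They are not a definition of learntropy.

```python
import math


def shannon_entropy(probabilities):
    """Standard Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def net_reward(probabilities, rewards):
    """Hypothetical analogue: expected reward per message.

    A negative reward stands for a decoding failure penalty.
    """
    return sum(p * r for p, r in zip(probabilities, rewards))


# The same channel: four equally likely messages, 2 bits per message.
probs = [0.25, 0.25, 0.25, 0.25]
print(shannon_entropy(probs))

# Two students, same signal, different (assumed) rewards per message.
prepared_student = [1.0, 0.5, 0.0, 0.5]
lost_student = [-1.0, -0.5, 0.0, -0.5]
print(net_reward(probs, prepared_student))
print(net_reward(probs, lost_student))
```

The entropy of the signal is identical for both students. Only the second measure depends on the brain that receives the signal, and only the second measure can drop below zero.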

High knowledge valuations contribute to high learntropy, which in turn is necessary for attention and semantic slotting in of knowledge for long-term retention. In a powerful feedback loop, learntropy enhances the learn drive, which underlies valuations that determine learntropy. This feedback loop is kept in check by forgetting, learned helplessness, aging, injury, and the sheer availability of mental resources. With rational learning and lifestyle management, esp. with respect to the natural creativity cycle, the equilibrium can be maintained at the high learn drive level for decades.

Signal timing vs. learntropy

The degree of reward obtained from individual messages in the learning stream will determine the level of signal learntropy. A lecture on a boring topic will carry low learntropy. Surfing the net for titbits of information needed to solve a specific problem will carry high learntropy.

Unlike Shannon entropy that is based on averages, learntropy will be more of a trailing average where recent messages will carry a higher weight than messages delivered earlier in time. In addition, learntropy is rooted in rules that govern the consolidation of memory, incl. the spacing effect.

The learntropy of a boring lecture will shoot up once a golden nugget of knowledge fills an important gap in understanding. The increase in learntropy will be proportional to the expression of the stability of the memory trace determining knowledge valuation (incl. descending traces in the knowledge valuation network). The impact of a golden nugget will wane in time. The cumulative effect of those happy discoveries will determine the level of learntropy at any given time (e.g. during a lecture).
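The trailing average and the effect of a golden nugget can be sketched with an exponentially weighted average. This is only one possible reading of "trailing average". The `decay` parameter and the reward values are assumptions, and the sketch does not model memory stability or the valuation network.

```python
def trailing_learntropy(rewards, decay=0.8):
    """Exponentially weighted trailing average of per-message rewards.

    `decay` (assumed, 0 < decay < 1) sets how quickly older messages
    lose weight relative to recent ones.
    """
    level = 0.0
    history = []
    for reward in rewards:
        level = decay * level + (1 - decay) * reward
        history.append(level)
    return history


# A boring lecture (reward 0.1 per message) with one golden nugget (5.0).
lecture = [0.1] * 5 + [5.0] + [0.1] * 5
for step, level in enumerate(trailing_learntropy(lecture)):
    print(step, round(level, 3))
```

The level shoots up at the nugget and then wanes message by message, while staying above the pre-nugget level for a while.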

The above shows that educators can influence learntropy, enhance the learn drive, and enhance long-term learning outcomes. Feeding passive knowledge is a bad strategy. Providing answers should be selective and should favor high importance abstract and universal questions. Free explorations of self-directed learning are the best formula for lifelong sustainable learn drive and lifelong learning.

All forms of schooling tend to suppress the learn drive. As a result, many adults may find it difficult to internalize the message on the importance of learntropy in learning. However, in the modern world, nearly everyone is faced with the need to solve a minor technical or health problem on their own. The problem may be as simple as a trivial change to setup in Facebook options. The harder it is to find the solution to a problem, the greater the reward in finding answers. The harder it is to find answers, the more persistent and extensive the search and exploration. Those feelings should be familiar to everyone. However, suppression of the learn drive always results in lesser knowledge, lower self-esteem, and all explorations might come to an end earlier. In other words, those who lost their creative drive at school, or later in life, will give up earlier, or perhaps never even try. In that sense, all technical problems and glitches that come with computers, Internet, technology, etc. have some positive side effect of stimulating the vestiges of the lost learn drive even in the most passive individuals. The only requirement is that those quests need to end with a degree of success. Otherwise, the opposite may happen. The penalty signal may lead to conditioning a withdrawal from exploration.

You can quickly answer this instant quiz about your own learn drive. If you face a minor problem in life, do you seek a human expert or do you rely on Google? If your car fails, or your computer crashes, or you get injured, or you get a stomach ache, where do you go?

Learntropy and learn drive

In a process similar to forgetting, the impact of learntropy reward will decline exponentially over time. Like in spaced repetition review, the new reward will bring back learntropy to a high level. Like in a spacing effect, longer breaks may result in the same message being more rewarding.

There is a major difference between the reward signal determining learntropy and the consolidation signal determining recall in learning. Once you learn something, repeated review in short spaces of time is pointless. Once you drive recall probability to 100%, you can let time pass before the next review. The upper limit on learntropy might be hard to reach. If you love a lecture, with some twists of facts or delivery, you can love it more. If you remember a singular memory, you cannot remember it better by tricks employed in a short space of time. You can reformulate the memory using mnemonic techniques and affect its durability, but once the probability of recall is 100%, the best thing to do for the memory might be to leave it unused for a while or employ it in varying contexts, which may essentially lead to developing new memories that will form redundant connections to the original singular memory.

Extinction of learntropy occurs via lack of reward signal. Extinction of learn drive is a matter of forgetting (incl. forgetting through brain cell loss).

Learntropy will be additive over individual messages with exponential decline and diminishing returns. By optimizing the timing of rewarding messages, we can drive learntropy high and make learning become one of the most pleasurable activities, on a par with rewards of food, sex, drugs, etc. If you are skeptical, recall obsessive videogamers who can literally starve while playing through the night. Videogames can hijack the learn drive and combine it with the reward of gambling. Rewards of gambling might also be governed by similar rules of decline and boost as learntropy. However, they are subject to variable reward, which can lead to addiction. It is important to distinguish between the pleasure of learning and harmful addictions (see: Addiction to learning).
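Additivity with exponential decline and diminishing returns can be sketched as follows. The time constant `tau`, the ceiling `cap`, and the headroom rule (each reward adds less the closer the level is to the ceiling) are illustrative assumptions, with rewards assumed to lie between 0 and 1.

```python
import math


def learntropy_after(events, tau=10.0, cap=1.0):
    """Return (final level, list of gains) for (time, reward) events.

    Between events the level decays exponentially with time constant
    `tau`. Each reward adds a gain scaled by the remaining headroom
    below `cap`, which gives diminishing returns.
    """
    level, last_time, gains = 0.0, None, []
    for time, reward in events:
        if last_time is not None:
            level *= math.exp(-(time - last_time) / tau)
        gain = reward * (1.0 - level / cap)
        level += gain
        gains.append(gain)
        last_time = time
    return level, gains


# The same message (reward 0.5) repeated after a short and a long break.
_, massed = learntropy_after([(0, 0.5), (1, 0.5)])
_, spaced = learntropy_after([(0, 0.5), (20, 0.5)])
print(round(massed[1], 3), round(spaced[1], 3))
```

In this sketch the repeated message adds less after the short break than after the long one, because the earlier reward has not yet declined. This mirrors the remark that longer breaks may make the same message more rewarding.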

Learntropy will determine the learn drive, but the two will be sustained by different rules. Learn drive is knowledge dependent, and as such will be subject to spaced repetition. As knowledge is a network, speaking of optimum stimulation of learn drive is probably pointless. To maximize learn drive, we should engage in lifelong learning, respect the natural creativity cycle, and take care of brain health (i.e. health in general).

Optimum information delivery

In schooling, we might envisage a lecture delivered at optimum learntropy level, in which a student keeps saying "wow! wow!". She keeps taking down notes as fast as humanly possible. More often though, the lecture will buzz a high entropy signal or ooze boredom. Its learntropy will be low or even negative.

If optimum learntropy levels depend on the student, how can a teacher optimally deliver knowledge to a classroom? Sometimes universal delivery is impossible. In other cases, it is difficult enough to require genius teaching skills. For most teachers, lecture delivery keeps most kids bored or frustrated.

In lecture delivery, a lucky few may get most of the message. For a fraction of the gifted, the lecture may carry nothing new. For them it is boring. For other kids, message complexity goes above their comprehension level. In such cases, the lecture can be frustrating if they try to decode it. A lecture on string theory might be comparable to a noise of randomly shuffled English words. Lecturing is an exercise in timewasting. Nobel Prize winner Carl Wieman compared it to blood-letting.

To avoid the frustration of negative learntropy, students will tune out like you tuned out from that Thai channel I mentioned earlier. Children will ignore the static noise coming from the teacher and tune in to other channels that carry more appropriate levels of learntropy (e.g. Facebook on a phone under the desk). Even if their comprehension is good, the knowledge delivered may not complement their current knowledge. If it does not generate high-quality high-value generalization, it will be considered obvious or irrelevant.

Low learntropy, even if occurring occasionally, conditions the student to tune out. After a while, students will develop a filter that will turn a teacher into a silent radio channel carrying zero entropy and zero learntropy. Improvements to lecture quality will become futile. The teacher disappears!

In a classroom setting, a student will often not be able to zero in on a better signal. The same signal is dished out to all students and they all may get equally bored. In contrast, Googling for good keywords can bombard the brain with perfectly timed low probability messages that will fit the current knowledge tree like a jigsaw puzzle. Google is a very cheap and efficient generator of "wow!".

In incremental learning, the learntropy scanner will pick the best channels, prioritize those, and employ perfect timing for maximizing semantic connectivity and memory consolidation. This should make it easy to understand why I am extremely happy that I will never ever be forced to sit on a school bench! I love learning too much!

All the above examples illustrate how intricate the interaction between the signal and the brain is in recognizing things worth learning. The reward of learning is the best known indicator of learning quality. When students are happy, we are on the right track. When schools are the place of misery, we are failing on a societal scale.

The only reliable detector of knowledge complementarity and coherence is the set of neural networks of the learn drive system. This is why knowledge cannot be prepackaged and imposed on students.

This is explained using a crystallization metaphor. The neural details of the reward system follow in the section: Learning rewards.

Gripping lectures

We love learning, but we usually hate to be taught. Those feelings correlate with creativity, which can probably be explained by the fact that creative elaboration is essential for pattern completion that underlies comprehension.

In learning, we decide what to investigate. The learntropy evaluation strictly depends on the status of the brain and current memory activations. In teaching, knowledge is dished out independent of what we think of it. Many students list boring subjects as their number one reason for disliking school. Not bullying, stress, or early waking. Excruciating boredom! I write about the astronomical difference between self-directed learning and learning at school here. It is all about the learn drive!

I am amazed at how many resources are wasted on research that looks for ways to keep kids interested during lectures, while it should be obvious that lectures are just a poor educational tool. Eye contact analysis? Engagement analysis? Efforts to quantify passion? All kids are equipped with natural learn drive and our priority should be to ensure we do not destroy that drive. Force-feeding knowledge is the prime destroyer of the learn drive. In addition, there are many socioeconomic factors that prevent a great chunk of kids from thriving even in the best circumstances. Some kids will never show passion for learning. In most cases, it is not their fault. Only a tiny fraction are limited by disabilities, health, and less fortunate genetic endowments. The exponential decay in the learn drive with age is caused primarily by compulsory schooling. Passive lecturing is a huge contributor to that process.

Naturally, there are lectures that work. Khan Academy is jam-packed with good examples. Even a spoken lecture with no slides can work. A TED talk on YouTube can be fun. It can satisfy the learn drive. MOOCs are founded on the principle that one rock-star teacher is better than thousands of rank-and-file teachers repeating the same mantra. You can learn a lot even if you are just a passive listener. There are conditions though: you need to be intensely curious about the subject, or you need to love the speaker, or both. There is only one sure mechanism for ensuring the lecture is interesting: you need to choose it on your own! This is just one more aspect of the need for self-directed learning.

In addition to choice, in lecturing, you definitely need a pause button in case you need to take a toilet break, or quiet the hunger pangs. Nothing can ruin a lecture as effectively as a bursting bladder. Last but not least, most lectures could benefit from Netflix's Skip Intro feature.

Naturally, the lecture will work best if you enhance it with your own creative thinking or even quick research. This is why pausing for a minute, or for a day might be essential for learning efficiency. Against the claims of some psychiatrists, creative breaks and a wandering mind have nothing to do with ADHD. As long as they are remotely relevant, they are hallmarks of great learning.

I use two methods for consuming lectures incrementally. My first method is to listen and exercise. Exercise improves focus. Good focus reduces the need for a pause, however, it also reduces the creative aspect of learning. For subjects of highest priority, I use incremental video where I can pause and resume multiple times. I can even keep the most important lecture extracts for future review. However, even incremental video isn't the best approach to learning. It cannot compete in speed and volume with incremental reading. Sometimes it makes better sense to employ incremental reading and process the lecture transcript than to listen to the lecture itself. This is particularly visible in fact-rich lecturing.

I choose my video materials mostly on the basis of speakers who I just love to listen to. In the context of this article, I know you would love Ken Robinson lectures! Go and see: Robinson: Schools kill creativity!

Learning rewards

The pleasure of learning might be one of the most satisfying possible pleasures. As opposed to eating or having sex, the pleasure of learning does not terminate with the act. The pleasure of learning is sustainable and wanes slowly only with the overload of networks involved in learning. It can be reset back to the baseline with sleep. The pleasure of learning has been shown to involve the same mechanisms as the pleasure of heroin or cocaine. Unlike feeding or sex, pleasurable learning can fill most of the waking time. In that sense, the pleasures of learning, creativity, problem solving, and productivity might be great tools in stoic hedonic therapy. Whereas the need for food is easily satisfied in a healthy individual, the need for learning may never end. The learn drive depends on the status of current knowledge and this status can be manipulated with learning itself.

All people with mood swings should consider learning as therapy.

Learn drive reward

I have mentioned a couple of examples of how the learn drive leads to a reward signal in the brain. We know that low probability information can be rewarding. So can a generalization that contributes new knowledge. A snippet of information that leads to a great goal of understanding is highly valued. A missing piece in a jigsaw puzzle carries a great reward. One obscure word, once decoded, can make a whole long text switch from a tangle of sentences into a clear line of reasoning.

Confirming a model via a generalization or laying foundations for a new better model both feel great. In addition, all model confirmations associated with strong emotions can lead to euphoria: "My team is the best in the world!", or "Yes! My newborn is healthy indeed!", or "Yeah! I knew that hard work will earn me that promotion!". However, when discussing the learn drive, I would like to filter out that extra emotional layer that may obscure the picture. We need to remember that learning is pleasurable independent of whether it brings rewards from employing the knowledge.

The Aha!, Wow! or Eureka! of discovery is the purest and ultimate prize in learning. It does not need to entail further reward in accolades or praise from others. Here, the knowledge is its own reward.

The common denominator of this reward is the encoding of new highly-valued information in memory.

The learn drive reward comes from high-value knowledge ready for long-term storage.

In our quest to understand reality, while the total amount of information stored in the brain increases, the entropy of stored knowledge drops. With learning and modeling, it takes less and less effort to understand the complexity of the world.

Evolution of the learn drive

Scientists say that smart animals play more. I say that it is even more interesting to note that species that play more are smarter. I hypothesize that the learn drive may have been the trigger factor in the explosion of the human brain size. It is not that birds or mammals faced a change in environment that required more thinking. It is not that humans suddenly faced extinction had they not blown up the size of their cortex. It may have been the emergence of the learn drive that suddenly allowed better usage of the expensive increase in the number of brain cells. Before there was the learn drive, adding brain size might leave an animal with an extra head weight to carry and an extra set of cells to feed. Without the learn drive, the extra brain space might remain unused and likely undergo wasteful atrophy. If schooling attempts to override the learn drive, it will contribute to the disuse of that evolutionary advantage. It will contribute to a society that is less smart and less creative.

If we plot the brain size over the timeline of human evolution, we can see a powerful upswing around 2 million years ago. Paleoanthropologists tend to attribute that swing to better brain nutrients in the diet, cooking, and the like.

If the hypothesis on the emergence of the learn drive is correct, Homo habilis would be a candidate for the starting point of the breakthrough. This could point to the transition from a simple procedural play drive of birds and mammals towards a more sophisticated declarative learn drive that ultimately leads us to building abstract models of reality, which underlie human intelligence. Homo habilis has also been hypothesized to lead to the emergence of childhood dominated by brain growth (from weaning to an average of 7 years old).

The late arrival of the learn drive in evolution would suggest that it is not a simple property emergent in neural networks (see: Biederman model). Otherwise it might easily show up in fish or earlier. The learn drive requires a dedicated set of neural structures that are able to send a reward signal at the point of detecting an incremental contribution to a coherent structure of declarative knowledge. This signal and the underlying structure might differ in procedural learning and declarative learning. It might also differ for different classes of sensory input.

Procedural learning reward

I hypothesized about circuits that might run procedural learning back in the 1980s. In my Master's Thesis, out of ignorance, I used my own term "stochastic learning". I had no idea that two decades earlier, back in 1969, David Marr proposed a theoretical model of the cerebellar cortex that fit my own thinking. In the new millennium, there is a lot of data to confirm the model.

The idea of a procedural learning circuit is very simple. Imagine you ride a bicycle. You apply your conscious mind to learn individual moves needed to mount the bike and to then continue pedalling. However, once you are on the way, the procedural learning system makes sure you can execute all moves automatically with minimum neural effort without participation of conscious supervision or minimal supervision over a set of command neurons. Procedural learning will determine your motor program. This procedural learning system will make minor random adjustments to the sequence of signals sent to the motor system (hence the name "stochastic learning"). You can view those random changes as procedural creativity. Each time your bike loses balance, a penalty signal will be sent from the error-detecting network to cancel proposed corrections. That penalty signal will play the role of a teaching signal for the motor program.

During sleep, memories will be reorganized to eliminate the need for conscious input, simplified, optimized, and garbage signals that have a low contribution to the skill will be rejected. With each kilometer cycled, the sequence of signals will be perfected by trial and error. With each bout of sleep the wrinkles will get smoother. Riding a bike will become a pleasure. That pleasure seems to peak in the transition from clumsy conscious rider to a natural.
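The trial-and-error loop described above can be illustrated with a minimal sketch. It is not a model of the cerebellum and not the algorithm from the thesis mentioned earlier; it is plain stochastic hill climbing, in which a small random adjustment is proposed and a penalty signal cancels it whenever the outcome gets worse. The names stochastic_learning, balance_error, and TARGET, as well as all numbers, are hypothetical and chosen only for illustration.

```python
import random

def stochastic_learning(error, program, steps=2000, step_size=0.1, seed=0):
    """Tune a 'motor program' by random adjustments that are cancelled when penalized."""
    rng = random.Random(seed)
    program = list(program)
    current_error = error(program)
    for _ in range(steps):
        # Propose a minor random adjustment to one signal in the sequence.
        i = rng.randrange(len(program))
        candidate = list(program)
        candidate[i] += rng.uniform(-step_size, step_size)
        candidate_error = error(candidate)
        # Penalty signal: a worse outcome cancels the proposed correction.
        if candidate_error <= current_error:
            program, current_error = candidate, candidate_error
    return program, current_error

# Hypothetical ideal signal sequence standing in for "keeping the bike balanced".
TARGET = [0.5, -1.0, 2.0]

def balance_error(program):
    """Squared distance from the ideal sequence (the error-detecting network)."""
    return sum((p - t) ** 2 for p, t in zip(program, TARGET))

start = [0.0, 0.0, 0.0]
learned, final_error = stochastic_learning(balance_error, start)
print(f"error before: {balance_error(start):.3f}, after: {final_error:.6f}")
```

Because a change is kept only when it does not increase the error, the error can never grow from one trial to the next. This is the sense in which the penalty signal acts as a teaching signal in the sketch.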

In a similar fashion, with each sentence typed on the computer, you will strike fewer typos. Do you know where ")" is on the keyboard? How about "}"? The more fluent you are in typing, the more likely you are to forget this detail. When the conscious control of motor sequences is taken away, declarative knowledge of the position of ")" on the keyboard may be thrown away as "garbage". It is no longer needed.

Declarative learning reward

Things are a bit more complex in explaining the declarative learn drive. There is a definite reward to declarative learning. Some things are just interesting, and finding out the truth is pleasing. At the neural level, the brain will scan inputs and neural activations to look for areas of high learntropy with the maximum delivery of new knowledge matching the current status of memory. Any meaningful message of low probability will be deemed more attractive. A bright fractal pattern will be deemed beautiful. A gray randomness of colors will be deemed boring. The same will occur in the case of a more complex visual message. A vibrant forest is beautiful. The same forest may seem unattractive in winter, in drought, or under the impact of environmental pollution. Steven Pinker remarked that we are attracted to images that ooze vitality. I disagree. The attraction is much wider. We may be equally well attracted to a deadly volcano or a frozen landscape of Antarctica. We love environments, signals, messages, or brain activations that can express complex information using simple models. The picture of a beautiful beach can be represented by a couple of simple shapes and textures.

Entropy of information is related to the compressibility of data. Signal processing begins on the input. The retina performs a 100-fold compression of the visual input signal. The visual cortex receives simple representations of shapes and relationships. The hippocampus receives that information in an episodic context. Those signals may end up changing the status of a single synapse in the neocortical long-term memory storage.
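The link between entropy and compressibility can be shown with a small sketch. It uses a general-purpose compressor (zlib from the Python standard library) as a crude stand-in for "representing data with a simple model"; it says nothing about how the retina or the cortex actually encode signals. The two byte sequences and the name compression_ratio are made up for illustration.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more compressible."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(0)
size = 10_000

# A regular signal: a simple 16-step gradient repeated over and over.
patterned = bytes((i % 16) * 16 for i in range(size))
# An irregular signal: every byte drawn at random ("gray randomness").
noise = bytes(rng.randrange(256) for _ in range(size))

patterned_ratio = compression_ratio(patterned)
noise_ratio = compression_ratio(noise)
print(f"patterned: {patterned_ratio:.3f}, noise: {noise_ratio:.3f}")
```

The patterned signal shrinks to a small fraction of its size because a short rule describes all of it, while the random signal barely shrinks at all. Compressibility alone does not capture meaning or value, so this illustrates only the entropy side of the argument.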

The learn drive is based on seeking effective ways of representing knowledge in neural networks. Learn drive, memory optimization in sleep, and forgetting are essential to maximize compressibility, abstractness, applicability, and performance. This is how the brain makes sure that we can see a complex world using simple representations. That's the core of human intelligence. If artificial intelligence researchers could equip robots with a human-like learn drive, given sufficient memory, their learning capacity might be inexhaustible.

Reward centers in learning

In 2014, researchers reported that the activity in the nucleus accumbens was increased in the state of "high curiosity". They have also demonstrated what we have always known: this state improved memory performance. In addition, that improved performance spilled onto incidental learning, i.e. learning that would not spark curiosity on its own. This research was widely reported in media with a wrong interpretation: "curiosity primes the brain for better memory". For example, Scientific American headlined "Neuroimaging reveals how the brain’s reward and memory pathways prime inquiring minds for knowledge". The paper itself suggested the need for "stimulating curiosity".

As reward centers can be involved in the anticipation of pleasure, we should rather see the results of the research as an indicator that the learn drive is associated with pleasure. It is the learn drive that causes learning. It is learning that is pleasurable. The headline should be "Neuroimaging confirms that efficient learning is pleasurable". In other words, the sequence is not "drive -> pleasure -> learning", but "drive -> learning -> pleasure".

Instead of speaking of the need to "stimulate curiosity", we should rather speak of the need to "develop the learn drive". The key difference is in perceiving stimulation as a quick-fix approach that might be used in a classroom as opposed to a long-term process that takes months and years. An advertising campaign may use cheap tricks to stimulate our curiosity, while a lifelong passion is a formula for insatiable and unwaning learn drive, which is a perfect warranty for unceasing learning.

It is true that the state of curiosity will improve attention and this will improve overall learning. However, this should not ever be used as a classroom strategy. Gamification of learning makes sense only if rewards come from target learning, not from learning that surrounds the target. Many learning programs for children use bright colors, unusual sounds or smiling faces to attract attention to induce learning. However, once habituation sets in, this form of artificial gamification stops being effective. Moreover, incidental knowledge does not last. Any effort to employ curiosity to spark incidental learning is non-specific and inefficient. Equally well we might hope that pharmacological intervention, e.g. with Ritalin, could improve learning. Instead, learning must be its own reward.

The nucleus accumbens and the ventral tegmental area are involved in pleasure, in anticipation of pleasure, and in signal evaluation. The signals from the knowledge valuation network converge into those areas in both their motivational and affective valence. Dopamine is involved in the anticipation of pleasure. As dopamine is involved in attention, anticipation of pleasure alone would lead to improved learning due to a better focus on the source of information that is expected to deliver the pleasure.

If you are unconvinced, think of how much you hate your news channel when they do their tricks to pique your interest, and then say "find out after the break". You can get even more livid when they ruin it all with "Breaking News!". Anticipation can lead to frustration too. Only actual learning provides the reward. Only actual learning reward makes sense from the point of view of evolution. We do not want to reward an animal for the mere sight of food.

The buzz in the nucleus accumbens can be a direct expression of pleasure or might also indicate the state of pleasure seeking. In the end, the actual interpretation does not matter for the ultimate conclusion: boredom and displeasure are the enemies of learning.

For efficient learning in which new knowledge complements current knowledge, we need to follow the learn drive. In simple terms, this means that the pleasure of learning is desirable in education. We should never learn in the state of displeasure (cf. Desirable difficulty). Painful learning comes from the brain letting the student know that, in an information-theoretic sense, the new knowledge does not fit! It will be rejected. Pleasure is a good guide!

From the above neural reasoning we derive the obvious: the best warranty of efficient learning is to let students learn on their own and follow their own passions.

Biederman model

Pleasure of reading about the pleasure of reading

In 2006, Irving Biederman and Edward A. Vessel published a paper that gave me unforgettable pleasure to read. The article itself explained the pleasure of reading to me. In a paper titled "Perceptual pleasure and the brain", Biederman hypothesized that a gradient of opioid receptors in brain structures responsible for visual perception might contribute to the pleasure of viewing nice scenes such as beautiful landscapes. Biederman's idea seemed to explain to me what I have known for ages: learning is pleasurable. I always liked to learn, however, I never truly understood what underlies my liking in terms of brain science. Biederman's explanation was a perfect fit and it was powerfully pleasurable. It explained something that had bothered my mind for a long while. At the moment of reading, I was very self-analytical. While reading about the pleasure of reading I was trying to "feel" how the enlightenment of reading provides the pleasure. The pleasure of reading about the pleasure of reading became unforgettable.

What Biederman and Vessel proposed is monumental. Let me therefore name their thinking for simplicity: the Biederman model (name choice by seniority). In visual perception, successive layers of neurons are responsible for more abstract representations of the visual scene. Metaphorically speaking, it starts from pixels and colors, then it moves on to edges, textures and surfaces, then to objects, then to faces, places, and collections, and then to meaningful episodic scenes that, at the end of the chain, may activate a representation of a "beautiful mountain", and be remembered as such with only a few details perpetuated beyond the first impression in working memory. Millions of pixels of a photograph will turn into a meaningful scene that can be verbalized in just a few sentences and remembered as such for years, at a very little neural cost.

The Biederman model capitalizes on an earlier discovery (Michael E. Lewis et al., 1981) that there is a gradient of mu-opioid receptors along the visual perception pathway. The more meaning the neuron carries, the more opioid receptors it is likely to have. We know that opiates are rewarding and addictive. The Biederman model is based on the hypothesis that this gradient of opioid receptors is the source of perceptive pleasure.

There is a similar hierarchical system for processing speech and music. The temporal cortex is involved in processing sounds from pitch to melody. The processing of the rhythm involves yet other areas of the brain. Chances are, all those perceptive networks work along similar principles. This is the study subject of neuroesthetics.

Opioid vs. dopamine pleasure

There is a slight problem with the Biederman model though. The pleasure of learning can be analyzed consciously. The pleasure of reading about the Biederman model, in my own case, could be decomposed and tracked down to individual components of the model. This fact implies that the pleasure is integrated with conscious experience. Consciousness is a notoriously hard nut to crack for neuroscience. Most of what we know about consciousness is either speculative or based on hard and expensive experiments in which electrodes implanted in the brain can be used to elicit effects that can later, or concurrently, be reported by the affected individual. The evidence seems to be converging on the integrative model of consciousness in which an activation of several structures in the brain gets integrated and perceived as conscious self. In that line of thinking, activating a Halle Berry neuron somewhere in the cortex is not enough to bring Halle to one's consciousness. Millions of concept neurons can get activated at the same time and a thinking mind can only operate on a few pieces of the model of the perceived reality (see: attention). To bring Halle to one's mind, the activation must get integrated with other components of conscious perception, including the reward of the perception.

For those reasons, opioid receptors in cortical neurons will not do much for the ultimate reward of learning. An opioid antagonist, naloxone, can take away some of the pleasure of music in some people. However, the opioid pleasure of learning should rather produce a mild bliss of first-time micro-dose heroin or morphine use. In that sense, release of endomorphins and activation of opioid receptors can make a contribution to the pleasure of learning. Nevertheless, this pleasure isn't specific enough to give one a jolt of "wow!", "aha!" or "eureka!" (Biederman calls it "click of comprehension"). For that ultimate learning reward, there must be an integrative reward experience coming from the pleasure centers in the brain.

Pleasure of association

That ultimate pleasure jolt of discovery will come from a meaningful association. It can be explained using the pleasure of understanding the Biederman model itself. When thinking about the model, we activate two important concepts in our minds: (1) a gradient of meaning (derived from understanding neural structures involved in visual perception), and (2) a gradient of pleasure (derived from the observation on the content of opioid receptors in visual pathways). Once these two concepts come up in mind, there is a glue of analogy: the concept of "gradient". That glue helps bring up the association that gives a jolt of pleasant enlightenment: MEANING = PLEASURE! That's exactly what I experienced when reading Biederman's paper. For that jolt to happen, it is not enough that there are more opiate receptors associated with the concept of the gradient of pleasure than with gradient's mathematical underpinnings or its association with the word "gradient". It is not enough that there is more opiate associated with the novel concept of "gradient of meaning" than with the often used term "meaning". The jolt happens when those two highly prized concepts collide: meaning + pleasure.

Biederman noticed that the gradient of receptors proceeds far into the associative areas, incl. the parahippocampal cortex. We may remember that further downstream, in the hippocampus we have found the Halle Berry neuron. To illustrate the difference between the opioid pleasure and the associative pleasure, let us imagine meeting Halle on a beautiful beach. While walking on a beach, we may experience a delicate heroin-like breeze of bliss, which comes from the realization that our environment is perceptively beautiful: "the beach I walk on feels great". Once Halle shows up on the horizon, visual analysis may provide another breeze of opioid pleasure coming from the signal "beautiful lady approaching". Then the visual processing unit may identify the lady as Halle, which might activate cortical representation of Halle, which could be opioid-rich. However, only the ultimate association of Halle and "my beach" would trigger a major discovery, perhaps an atavistic reproductive dream: "Halle walks the same sand as me!". This is where the reward from the ventral striatum and the nucleus accumbens might come into play in "liking" the situation, and a jolt of dopamine might trigger a behavioral program of "wanting". The details of that behavioral "wanting" program have been cut out from this text by censorship. Nevertheless, execution of that program would inevitably be halted in highly-developed individuals by executive signals from the prefrontal cortex. In short, an injection of dopamine in the pleasure centers of the brain may give the brain some indecent ideas, while the release of opioid peptides might just result in an associative bliss.

The pleasure of learning does not need to involve attractive representatives of the opposite sex. Halle showed up in my example only because of the discovery of the Halle Berry neuron. For the pleasure of learning, all that is needed is a powerful and highly-valued association of ideas that activates the pleasure centers in the brain. The pleasure happens each time we learn something new, and the jolt is most powerful when we learn something of high value. The pleasure of discovering the Biederman model came from high valuations of the pleasure of learning itself in my knowledge valuation network. High valuations lead to high reward, which may facilitate memory (see: Dopamine may modulate plasticity in learning).

Impact of memory on the pleasure of learning

I would also add to Biederman's hypotheses on desensitization, i.e. the decline in pleasure with repeated exposure. Biederman suggests that children love repetitive videogames because of the gambling factor. However, gambling is no less potent in adults. I posit that children enjoy repetitive learning more because of childhood amnesia. Some of the repeat pleasure may come from limited comprehension, but some will simply be explained by accelerated forgetting. Poor comprehension and forgetting are the primary differentiators between the adult and the child brains.

We should also notice that a great deal of decline in pleasure of review will come not from competitive learning but from long-term memory consolidation that might result in signals flowing efficiently in the system. Competitive learning may be important in pattern recognition but in associative learning, it will be high retrievability that will undermine the pleasure of repeated exposure.

Stages of learn drive evolution

When I hypothesized on the emergence of powerful learn drive in humans, I had in mind the direct channel from knowledge to reward centers. It would ultimately be a higher level of learn drive than the one implied by the Biederman model. Each time receptors are involved, evolution has a simple and rewarding material to work with. The receptor gradient was originally discovered in the rhesus monkey cortex. Similar mechanisms might be involved in simpler brains or even more primitive nervous systems devoid of central control. I have no idea what an ant thinks or how it feels, but finding a great food source must definitely be a source of some kind of ant pleasure. From this we can conclude that the pleasure of learning might not be much phylogenetically younger than the nervous system itself. However, in the course of evolution, the drive has built up new layers of functionality and efficiency. Playful creativity seems to emerge only with some birds and with mammals. That evolutionary process might have ultimately peaked as human learn drive. This will naturally, at some point, be implemented in thinking machines. Understanding the power of the learn drive will be vital for survival of humanity: both in its need for artificial intelligence and the threat of having AI turn against mankind.

Desirable difficulty

Desirable difficulty is a concept that might be an excuse for tolerating the displeasure of learning at school. Here I explain why this excuse is unjust and dangerous.

Robert Bjork might be the best expert on learning theory. If he tells you that difficulties can be desirable in learning, he is right and it does not stand in contradiction to the fact that good learning is always pleasurable. Desirable difficulty is a conglomerate of concepts in which obstacles in learning lead to better learning. Let's tackle those one by one in the light of the pleasure of learning:

  • active recall: active recall is superior to passive review. Active recall is harder. This is a desirable difficulty. We need active recall in learning because it is the only procedure by which a memory engram can be effectively reconsolidated in spaced repetition. Active recall occurs each time we employ useful knowledge in practice. This use is pleasurable because it leads to productivity, which is a reward independent of learning. Humans simply love to achieve goals. If review is planned artificially, like in SuperMemo, it does not lead to a productive act and it may easily lose its appeal. All successful users of SuperMemo link the review with their goals. They see each item and each repetition as a step to a better future. Not all users have this imaginative capacity. This is why SuperMemo has not swept mankind off its feet despite its amazing efficiency.
  • spaced repetition: memory consolidation is more effective if retrievability of memory is lower. This leads to difficulty in recall. This is a desirable difficulty. Like with active recall, the reward of review comes from the employment of knowledge and productivity. In SuperMemo, by default, most of review ends with successful recall and there might be some link between difficulty and pleasure. Again, only a subset of users of SuperMemo can find this process pleasurable. Those who don't usually do not last long and drop out. We tell all users: make SuperMemo fun, or it won't work for you! See also: Pleasure of knowing
  • incremental review: SuperMemo advocates learning in spaces. It is more efficient from the point of view of memory and creativity to read an article in small portions over a longer period of time. The same refers to watching a video or listening to a lecture. This results in minor battles for context retrieval. However, it brings an extra bonus in creative elaboration. It also improves memory encoding, generalization, and long-term memory consolidation. Paradoxically, those extra difficulties result in extra learning efficiency that makes incremental reading one of the most pleasurable forms of learning.
  • learning context: changing the context in retrieval is a very simple and effective type of desirable difficulty. If the encoding is correct, retrieval will be successful, it will be more effective and it will be rewarding. If context change leads to generalization and better memory encoding, the effectiveness of learning will increase and the reward of learning will increase.
  • problem solving: solving problems can be very pleasurable. The harder the problem, the greater the pleasure of a solution. Problem solving involves a learning process as the solution requires intermediary steps that result in storing new knowledge in memory. All those steps are pleasurable. If the student struggles with the task and makes no progress, he will learn nothing and receive no reward. The task turns out too difficult. If the student fails to solve the problem, but makes progress with intermediary steps, even if they are unrelated to the solution, the learning will be there and the reward will be there. Again, if the difficulty is desirable, it will lead to a reward. If there is no reward, the difficulty appeared insurmountable. As such, it is neither rewarding nor desirable.
  • learning by doing: learning by doing may involve play, creativity, problem solving and more. Learning by doing takes more time and often brings better results and more reward.
  • delayed feedback: delayed feedback, in some circumstances, may result in more processing. In simplest terms, if the teacher does not tell you how well you have done, you may wonder for a while longer. This can benefit memory. If it does, the ultimate effect will be rewarding.
  • help withdrawal: I write about help withdrawal in the context of schools suppressing the learning drive. Kids who receive no answers may become more curious. Curiosity increases the reward of learning. Students who do not receive assistance in correcting their false models of reality get stronger rewards for resolving inconsistencies on their own.
  • other difficulties: the number of obstacles that can improve learning is endless; some of those can be hormonal in nature, some can involve motivational forces. The common denominator of all those obstacles seems to be some form of deeper processing, memory consolidation, improved attention, and more. Inevitably, obstacles that lead to better learning also involve better reward.

Desirable difficulty does not take away the pleasure of learning. Just the opposite, it makes learning more effective and more fun. If difficulty goes too far, and it results in displeasure then the difficulty is no longer desirable. This simple equivalence comes from the mechanics of the reward system in learn drive.

Note that reward bonus for efficient learning due to desirable difficulty does not need to correspond to high learntropy. Learntropy is a metric for an information channel. Active recall, for example, is unrelated to novelty. It refers to memory reconsolidation. Similarly, problem solving may in part come from the need to achieve goals unrelated to learning, or be rewarded by productivity other than gains in new knowledge.

Note also that nearly all of the above desirable difficulties are inherently wired into the process of incremental learning.

Addiction to learning

Inborn addiction

We are born in love with learning. That love usually wanes fast during the years of compulsory schooling. The longer we can sustain the love of learning, the bigger the benefit for the brain, health, and mankind. Love of learning has nothing to do with addiction. The definition of addiction includes adverse consequences that are a result of compulsive engagement in an activity.

Negative side effects of learning are tiny in comparison to benefits. If there is a degree of voracity or even compulsion, it can boost the positive effects even further. It is possible to boost one's love of learning. Good learning provides the best boost to further learning.

Learning and gambling

There is a close connection between the reward systems involved in learning and in gambling. Gambling and learning new words both activate the ventral striatum in a similar fashion. This close connection with gambling may confuse the picture for learning. A gambler at a slot machine does not learn much. Addictive videogaming is better. It can be pretty educational. Many team game addicts achieve fluency in English having made no progress at school before. Addiction to sports news may also involve a degree of learning. I learned about Cabinda only during the Africa Cup of Nations (football). Addiction to Facebook updates is no different. It is based on variable reward in anticipation of specific gains, however, it can also involve a great degree of learning. That learning may involve gossip, celebrity news, fake news, or actual useful learning. Even political poll updates can cause an addiction. In the battle between Hillary Clinton and Donald Trump, the polls were balanced enough to produce the cliffhanger effect. Compulsive checks for new polls have all the hallmarks of an addiction. This kind of addiction, however, can lead to a great deal of learning. It is up to the student to separate gambling from learning. Voracious learning is good. Learning derived from an addiction may be good too. However, gambling on its own brings little value to human existence. This is why it is very important to understand Reward diversity in preventing addictions.

Learning and sleep

Obsessive learning may encroach on sleep time, and may contribute to the epidemic of insomnia and DSPS. Creative minds with powerful learn drive may stay up learning till the early morning hours. This violation of sleep pattern was difficult or impossible before the arrival of electric lighting. The good news is that the learn drive tends to wane with network fatigue. The longer we learn, the greater the degree of saturation in memory circuits. Only sleep can bring relief. This is why even the most voracious learners tend to get sleepy and give up learning at some point. If a reader stays up all night over a novel, this is likely a combination of insufficient sleep drive, reduced learning, and increased variable reward that is typical of suspenseful fiction.

Learning and exercise

I hear that obsessive learning can lead to less exercise. That would be bad. However, I think that it is bad learning that is more likely to have this effect. Good learning is joyous and sparks extra energy. A happy kid should not survive long sitting over a book or over a computer. There must be a way to vent energy. Perhaps we should rather say that reduced exercise is a hallmark of learning addiction, while good learning has neurotrophic effects and should make one burst with extra energy to burn?

Learning restraint

Learning has its cost and it takes time. This is why it should be judicious. However, good learning is nearly always a good long-term investment. This is why we should never fear an addiction. Just the opposite, we should cherish and stoke up the learn drive to provide for happy lifelong learning.

Displeasure of learning

When I claim that all learning is pleasurable, I hear a chorus of voices like "I had to go through an awfully stressful exam that gave me lots of good knowledge for life". Those voices confuse the pleasure of good learning with the displeasure of factors that turn learning into a horror for many students. Those horror factors are bad teachers, harsh parents, deadlines, stress, bad sleep, awful textbooks, excess volume, and more.

I hear that without deadlines or school-imposed goals, the learning would be replaced with videogames, novels, TV, hobbies, sports, etc. This might be true for many reasons. Some of those activities may carry pleasures unrelated to learning. However, they will also be beneficial for reasons of learning or exercise. A well-rounded student should be free to slow down, allocate his time for fun learning and other fun activities. Slow progress might bring more benefit.

There is no way the equation of learning could produce unhappiness in the wake of good learning. The blame will always be elsewhere. All negatives should be studied and eliminated.

In the ultimate account, even if there is a displeasure related to exams, certificates and duties, this displeasure should be imposed on the student by herself.

Pleasurable learning can be buried in displeasure caused by stress, bad people, bad schools, bad textbooks, and more.

Learning and procrastination

If learning is the most sustainable form of pleasure, why do half of the students procrastinate? This is nearly triple the figure for the general population.

The answer is simple and important: students procrastinate because as much as good learning is a pleasure, bad learning is highly unpleasant. Most assignments at school or even college carry a great deal of mismatch with the needs of the learn drive. This kind of learning is ineffective and unpleasant. Those kids will often play computer games in the evening claiming they need to rest their brains. I doubt their brains are at rest. They actually do jobs that they find pleasurable. A great deal of that pleasure comes from new learning. Unfortunately, there are no credits at school for good gaming, so the sinusoidal cycle of chores-and-fun begins on the next day or even the same day with homework.

I never stop being amazed at how many students call themselves lazy. At the same time they can perform many heroic feats of physical or mental work as long as these are enjoyable or serve their own goals. Even those with thousands of memorized items in SuperMemo often give themselves low conscientiousness scores. Goals of learning can be hazy, but even if they are crystal clear, a poor match between the input and prior knowledge can result in significant displeasure. If learntropy is low, assignments can be boring. If it is negative, they will be repulsive.

The battle between high goal valuations and negative rewards of bad learning will result in procrastination. Procrastinators often call themselves lazy even if they are anything but.

If you think you are lazy about learning, you need to re-evaluate your materials and your methodology. Even simple violations of the natural creativity cycle can kill the fun of learning.

Learning and depression

Learning is a sustainable and non-addictive form of pleasure with hardly any side effects other than cost in time. In addition, good learning tends to absorb the mind, and promote more learning by boosting the learn drive. This means that learning should be employable as therapy in depression.

Learning at school

If learning is a source of pleasure and reward, why do we see rampant depression in kids of school age? Despite being institutions of learning, schools are more likely to contribute to depression than to act as a remedy. Without the freedom to learn, it is hard to achieve good learning. For learning to be pleasurable, it needs to be powered by the learn drive. It cannot be coercive or mandatory. It must be free.

Impact of memory on mood

Free learning is fun. However, the pleasure of learning is not what makes learning a great weapon against depression.

Memory is a factor that may trigger or suppress depression. Memories determine how input signals get routed in the brain. Memory determines what concepts get associated with inputs or neural activations. Memories determine how we react to the sound of a passing car. It may bring up the memories of a happy vacation, the inspiration of Elon Musk, or memories of a car accident that crippled a loved one.

For memories to have a significant impact on mood, we need many of them. It is not enough to sit through a session with a psychotherapist and learn a few key facts about the brain, our lives, or coping strategies. It takes months and years of learning to develop healthy tracks in the brain. We may build associations that are inherently optimistic or inherently pessimistic. We need thousands of such associations to swing the balance. However, even years of learning may easily be overturned by a pathology or trauma. Neurohormones can instantly change the mode in which the brain works. A switch in neurohormonal profile will instantly give preference to a subset of memories that may affect mood in a negative way. Trauma can plant memories that will stoke up a new source of activation that will override activation from other sources. In other words, an armament of good memories may count for nothing if a switch changes the tracks in use or if a new source of activation is born in the brain. It is hardly possible to mitigate the death of a close person with learning.

Once depression hits, the affected individual faces a double whammy. Not only are good memories on the defensive; bad memories start circling around, facilitating their own new tracks and gaining the upper hand. The brain reprograms itself and swings the balance of mood in the wrong direction. When this process becomes a runaway, we may have a clinical depression at hand. To complete the bad news, depressed patients lose their love of life and their love of learning.

Can learning disrupt this cycle? It can be extremely hard! Respect for the circadian cycle is the first step towards recovering the derailed brain. In the circadian cycle, the peak creativity window needs to be captured to attempt remedial learning. Learning needs to be prolific, intense, effective, and pleasurable. Incremental reading would be fantastic if it were not that difficult. For a depressed individual with no skills in the department, SuperMemo is no remedy. It is too late. Trying to master incremental reading in a bad state of mind could only make matters worse. It could result in a hate of incremental reading.

If learning is possible, it can act as a refuge, which might help suppress negative memories and build new connections. As of that point, the process of building new tendrils of knowledge may begin. This process, which should take the mind towards a more optimistic interpretation of the world, is slow and laborious. In the most severe cases, it may take months or years of hard work and the outcome is not guaranteed.

The ultimate conclusion is that learning is not a panacea; however, it can play an important role in therapy. Most of all, the risk of depression can be staved off years in advance by rich and effective learning. That learning must proceed in conditions of freedom and respect for the learn drive. In short, love of learning is a good way towards the love of life.

Anti-depressants

I am a medical Luddite. For a healthy body, I stick to the rule "if it ain't broke, don't fix it". I avoid all forms of pharmacological intervention. I believe in the powers of homeostasis and the dangers of homeostatic intervention. The strongest drugs I use are coffee and beer. I do not even use aspirin. I am most dismayed by the misuse of antibiotics, painkillers, sleeping pills and anti-depressants. It has been decades since I last took an antibiotic. Long enough to forget. I will use one on my deathbed if necessary. All drugs have their legitimate use and so do anti-depressants. As they result in receptor downregulation, once taken, they make the neurotransmitter status quo worse. This usually means that the more the drug is taken, the more it needs to be taken to avoid a setback. However, in severe cases of clinical depression, the drugs may stop the runaway process. They may protect the brain from self-injury. Once a depressed patient starts losing brain cells, the road to recovery becomes long and bumpy. The moment anti-depressant therapy begins, if it works, is the best moment to use learning as therapy. As long as the brain is willing to proceed, learning can start up those delicate tendrils of knowledge that will hook onto reality to produce a vestigial learn drive. In the ideal case, once the drugs are withdrawn, that learn drive should survive to begin a process that is a reverse of depression: positive feedback of learning, creativity, good sleep, and good mood. This is not easy, but it is very important. If drug therapy is the only thing that changes in a patient's life, it will work only as a break in the pathological process. It will not set the brain in a better state than the one from before the problem started. Improvements require active effort. Without a healthy learn drive, building up positive memories will not begin.

Learn drive and optimism

Toddlers seem to show the most exuberant learn drive. No wonder, healthy children are born optimistic. There is a correlation between optimism and the learn drive. A happy mind might act as an energizer of the learn drive on the neurochemical basis. Pessimism will definitely act as a suppressant or filter that will prevent the expression of the learn drive. In that sense, a pessimistic mind may mask the learn drive. In depression, the learn drive may disappear entirely. No wonder Dr Robert Sapolsky called depression the worst disease in the world.

A consensus seems to emerge that schools are a major contributor to depression among teenagers (and later in life). The mechanism isn't clear, but learned helplessness and the suppression of the learn drive emerge as possible keys to the pathology.

Can learning help you?

If you are reading this, and you are not sure learning can help you, ask yourself the question: Are you in a good mood today? As mentioned above, when you are on a downswing and looking for a solution, your interpretations are darker, and you may not find this text comforting enough. Remember then about the concept of activation energy: you need a little first step to begin, and you may then be pulled in by a vortex of interesting things to learn.

If you are in no mood for quantum mechanics today, start from petty celebrity news, or sports news. Lowly learning is better than no learning! Alternatively, if trivia make you even more depressed, see this fun-to-read text from Susan Engel about learning and depression.

Further reading

Optimization of education: Global or Local?

Is there a risk in using pleasure as a guiding light in education?

Perfect model of education

Over long years of schooling, we slowly develop an imaginary model of a perfect academic learning process in which we set long-term goals, follow the curriculum, add important pieces of knowledge, and get to the point when we receive a college degree with rock solid knowledge in a given area supported by extensive general knowledge needed for an efficient function in society. The longer we stay in the school system, the harder it is to step away and have an objective view of that model. Paradoxically, verification of that model comes hardest to those minds who do well at school and start believing they have succeeded thanks to that perfect model of academic learning. Smart people suffer less pain at school, and, as a result, think less of the problem of the school system. Successful students internalize the model and perpetuate it by providing the same fixed path for future generations.

The model in which we design a student's knowledge via curriculum is wrong! The model of a perfect school gives credit to the system and the teachers, while all actual learning should be credited to the student. When kids fail school in droves, we tend to blame the kids, or their parents, while a small fraction of successful students will continue dreaming of the perfect school model for their own kids, and keep pushing the model on the less fortunate ones.

Optimization based on the learn drive

Unlike the curriculum, the optimization mechanism behind the learn drive has been perfected in the course of human evolution. It is capable of driving individual knowledge to the level needed to disentangle all complexities of science or engineering. Before the arrival of compulsory schooling, mankind had achieved all imaginable breakthroughs needed to start the Enlightenment or the Industrial Revolution. Compulsory schooling originally helped to lift the "unenlightened" masses to a new level. However, it is increasingly driving itself into the optimization corner in which enlightenment is replaced by suppression of creative minds.

Designing a child's mind

I hear all the time from highly educated and very smart people that education is too important to let it rely on self-learning or on the blindness of the learn drive. Apparently, education is so important that we should plan it and design it globally with the best tools of science and using the best experts. While I was preoccupied with efficient learning, and before I really started thinking about the education system, I lived with the same conviction. It is quite natural to default to expert opinion.

Highly educated people often utter the following claims:

  • children are incapable of long-term planning, therefore a curriculum is needed
  • learn drive is a type of local optimization, while we need to plan education globally
  • following student interests is a recipe for disaster: they will all end up immersed in mind-numbing videogames

The problem is that global optimization of education sets performance targets that keep getting tighter. Global optimization keeps employing the same inefficient learning tools in an attempt to transfer more "necessary" knowledge to student minds. The outcome is misery for millions of students. While Stalin optimized globally for massive achievements of the Soviet Union, it was market economics, with its simple optimization algorithms, that lifted the western world to new heights. See: Modern schooling is like Soviet economy

Currently employed optimization of education uses knowledge tests as the measure of performance, but relies on cramming and short-term memory to achieve more in a shorter period of time. As a result, it keeps losing its grip on the learn drive. Competition between nations also employs performance tests. Instead of optimizing for actual long-term knowledge, we optimize for the speed of knowledge turnover in student heads. The result is unhappy students with knowledge that is tiny relative to the time invested and to the actual human potential.

Reliance on emergence

Optimization of education can employ the concept of emergence. The learn drive is a mechanism by which knowledge is self-organizing with no effort from teachers, and no pain from a child. Natural learning may take long hours, but it is pleasurable, and healthy kids don't mind learning all day long as long as this is learning of their own choosing.

There are two vital facts we should hold in mind in reference to the local optimization of learning based on the learn drive:

  • without a reliance on the learn drive, there is no good learning. All attempts at override will be massively rejected by human memory
  • learn drive brings amazingly efficient long-term optimization of the learning process. Nearly all human achievement before the 1850s was accomplished with the guidance of the learn drive

A skeptic would notice that human progress has accelerated since the introduction of compulsory schooling. He would be right. However, we have been on an accelerating ascent of progress ever since the emergence of the first forms of life 4 billion years ago. I see Gutenberg and Tim Berners-Lee as more significant contributors to that acceleration than the respectable Johann Julius Hecker.

Local optimization based on the learn drive is highly unintuitive. Creation science comes from similarly unintuitive feelings about the mechanism of natural selection. How can a local evolutionary optimization based on random mutations lead to a marvel of a human being? Global design/optimization/guidance by the hand of God seems unavoidable. Fewer people subscribe to creation science today. However, a vast majority of the population has no idea what mechanism underlies the learn drive, and why ignoring it is the chief problem of the Prussian education system.

The tree metaphor

Given enough time and access to knowledge-rich environments, without the need for an education system, the knowledge of an individual grows into a large, comprehensive, and coherent body. This is true of all free and healthy individuals. The size and the quality of the tree may depend on one's personality, interests, and the starting point of the intellectual development. However, one of the chief myths of education is that the organic growth of knowledge leads to multiple biases and areas of ignorance. Those blank spots are allegedly larger than those that remain after years of schooling. Due to the computational power of the learn drive, and the phenomenon of emergence, the opposite is true. The metaphor I like to use to explain the power of the learn drive is that of tree growth.

Natural growth of individual human knowledge can be compared to the growth of a tree. Individual cells in the meristem of a tree twig know very little of the tree and its global growth goals. The meristem follows simple hormonal, biochemical, or biophysical rules (e.g. apical dominance). Those simple rules guiding growth towards light are highly efficient and the tree can shape its crown beautifully. It will also efficiently organize into a canopy with other species. Force of gravity is tackled optimally. Redistribution of nutrients is easy. Absorption of light is excellent. All obstacles, e.g. other trees, rocks or lamp posts, are handled with ease. Similar mechanisms ensure an efficient growth of a plant root system. A simple set of local rules is also employed by the growth cone in sprouting new neural connections in the brain.

The tree of knowledge works along similar principles. The learn drive mechanism makes sure that individual leaves of memory crave light of new discovery and sprout branches in the direction of inspiration. Locally, the learn drive may seem simple and blind. Globally we grow great individuals with erudite knowledge needed to support all vital human functions in society. Self-learning brains can fit any environment and fulfill all imaginable human goals.

As much as trees need water, CO2, some nutrients and light, brains need energy, rich input, and unconstrained freedom. All attempts at coercive regulation suppress the learn drive and the tree of knowledge fails to germinate on its own.

Another metaphor that can help explain the emergence in building up coherent knowledge is the Knowledge crystallization metaphor:

Crystallization metaphor of schooling and unschooling

Figure: In perfect schooling we create a perfect crystal of knowledge. In college, we add an extra crystal of specialization. In reality though, learning looks a bit less perfect. For most kids, knowledge never builds sufficient coherence and falls apart due to interference (i.e. fast forgetting). As a result, in real schooling, knowledge asymptotically reaches a certain volume and keeps churning around from that point on with little progress in stability or coherence. In contrast, in free learning, the acquisition of knowledge is chaotic and uneven. However, as long as it is based on the learn drive, the volume of knowledge is very large. Individual crystals of knowledge collide, and build consistency and coherence. This in turn helps stability and further integration of knowledge. By the time of college, in terms of volume, free learners should know far more than ordinary students. Free knowledge has multiple areas of strength, and multiple areas of weakness. However, it is superior in coherence. This is why it is more applicable in problem solving.

Local optimization

Local optimization of the learn drive leads to a perfect match between human ability and the individual's environment and goals (see: Optimality of the learn drive). Global optimization of schooling suppresses the learn drive, defers to the suppressed learn drive when matching individuals with their jobs, and results in an unhappy society where most individuals crave the comfort of 9-5 jobs in which leadership, learning, and responsibility are delegated to someone else. The opposite happens in democratic schools which rely on self-learning to produce self-determined, self-fulfilled and self-reliant individuals ready to accept any challenge in their chosen area of interest.

In his historic commencement speech, Steve Jobs joked that before he was diagnosed with cancer, he did not know what the pancreas was. Apparently, his blind learn drive left a gap in his extensive knowledge. Even if this was true, I would never trade Steve Jobs and his opus vitae for a few failures of the local optimization of learning. One of the main points of his inspiring speech was to follow one's learn drive. In his words "the only way to do great work is to love what you do". This truth has been repeated by all wise people for millennia.

Is global optimization possible?

Global optimization finds an optimum for all input values. Global optimization of learning is done at the level of the department of education, e.g. by means of tools such as common core and standardized testing. Global optimization is based on the flawed reasoning that we can design a child's mind. Global optimization can also be done by parents who attempt to predict a child's future.

Can we determine a child's future in advance? If parents were to choose future globally and optimally, we would have a surplus of lawyers and doctors. We would also have a major increase in frustrated college dropouts. If governments were to help a bit and redistribute the jobs for kids optimally at an early age, we would end up with a variant of 1984. Few kids would love to find out at the age of 6 they are set for a life as a book-keeper or a carpenter. Job selection should obviously be based on love and passion, not a government decree.

Perhaps kids should then be allowed to optimize globally? That would not work either. We would end up with a surplus of rock musicians, professional videogamers, and football players.

Contrast this with optimization via the learn drive that has delivered the best of human achievement for centuries.

Is a curriculum then an attempt to find an intermediary optimum on the way to a global optimum? Curriculum as a guide to what is worth knowing seems like a good idea. When a kid or a teacher runs out of enthusiasm for learning, they might consult the curriculum. If the learn drive is in overdrive though, why slow down? Is there a risk the kid will never learn the dangers of alcohol? This isn't too likely. On the other hand, I am not aware of a curriculum that teaches kids how to employ incremental reading. I might be biased, but I would definitely put that skill ahead of the need to cram Kawalec or the Battle of Cedynia (examples taken from my own curriculum). I can appreciate the late Julian Kawalec today. However, mandatory reading of his novels imposed by the communist authorities was a source of school torture for me. You probably wonder who Kawalec was. I would love to tell you but Wikipedia has an article on his achievements in Polish only.

If you test student knowledge against the curriculum, it is easy to see they master a tiny subset of that globally optimized plan. They add to this a great deal of their own knowledge about the world obtained via self-learning. This leads to the illusion of good schooling. If the curriculum were not obligatory, and teachers had more room to adapt, the volume of knowledge and its coherence would increase. Coherence and speed are two hallmarks of self-learning. Fewer kids might choose to solve quadratic equations, but they would fill up that space many times over with other skills they consider important to them. All those who plan careers in STEM would get to quadratic equations anyway, sooner or later. The rest would fall back on the current default, which is to learn the equations and forget them fast. Most people do not know how to tackle quadratic equations. Few know of their purpose. Equations in the curriculum add distress and raise the cost of knowledge that might have been opportunistically acquired efficiently in a happy state of mind.

If the global long-term optimization is not possible, intermediate steps in the form of a curriculum plan are only less complex. They are still a departure from the optimum determined by the learn drive.

The only way to optimize efficiently is to let the learn drive determine the trajectory with gentle nudges from parents, mentors, peers, strangers, social media, Wikipedia, Google, and more. Optimization of education must adhere to the fundamental law of learning (next).

Fundamental law of learning

Most people know that learning can be pleasurable. However, very few people appreciate how important this fact is for the future of education.

Only a constant stream of precious findings in neuroscience helps us see the fundamental importance of pleasure in learning. The reward process begins at the level of perception, and proceeds via associative learning, to creativity, to problem solving, and the ultimate pleasure of achieving goals. At each station there are pleasure signals to reward the progress of brainwork.

I was slow to understand the power of pleasure too. Back in 1991, we wrote conservatively: "There is a sure way to tell if a given student will be successful in his work. If he finds pleasure in long-lasting learning sessions, he is bound to do a terrific job" (see: SuperMemo Decalog). Today, we realize that the pleasure is so inherently associated with all forms of learning in neural networks that it emerges as one of the best yardsticks in measuring learning progress.

This makes it possible to formulate the fundamental law of declarative learning:

When there is no pleasure, there is no good learning.

Naturally, this law needs to be qualified to be precise. Good declarative learning results in pleasure. This fact can be masked, for example when a bit of good learning hides in a mass of bad learning. Pleasure itself is no guarantee of learning. Facts that we discover can be distressing. Some declarative learning may occur in conditions of displeasure (e.g. fear conditioning). Classical conditioning often involves pain. Clinical depression will impede one's inclination to take on biking, but will not ruin the procedural learning that occurs while biking.

The fundamental law of declarative learning simply states that the acquisition of quality knowledge that satisfies the learn drive will produce a reward signal. Absence of that signal is an indication of the absence of learning. Dry facts can be committed short-term to declarative memory without having fun, but those facts will not adhere to solid models of reality if there is no reward from learning. Those facts are likely to be eliminated from memory fast by a healthy system of forgetting. Even worse, bad and persistent engrams can cause problems with learning later in life! The emergence of any coherent model in memory will inevitably produce a reward signal.

If you happen to impose the suffering on yourself, you need to rethink your strategies. You may need to slow down, or go back to basics, learn the rules of mental and sleep hygiene, manage your stress, learn the 20 rules of formulating knowledge or perhaps give incremental reading a try. If you persist despite pain, you will not be rewarded with good results. Gladwell's 10,000-hour rule also needs to be qualified. No violin virtuoso has ever been born out of sheer suffering through hours of practice. Like with learning, great music is a child of love.

On the other hand, most students of this world suffer through no fault of their own. Bad learning is imposed on them from above!

Students of the world unite! You no longer need to suffer the pain of learning. If you suffer, you have your basic student right to protest. If you suffer, something is going wrong! You can stop learning! If anyone demands learning from you, and you do not enjoy it, you can strike back, and demand pleasurable learning! This is not your elitist hedonistic weak heart demand. This is a demand of reason. No pleasure, no learning! Your suffering is a waste of time, a waste of health, and a waste of human global resources! See: Declaration of Educational Emancipation
Learn drive vs. School drive

Figure: This is how school destroys the love of learning. Learn drive is the set of passions and interests that a child would like to pursue. School drive is the set of rewards and penalties set up by the school system. Learn drive leads to simple, mnemonic, coherent, stable and applicable memories due to the fact that the quality of knowledge determines the degree of reward in the learn drive system. School drive leads to complex, short-term memories vulnerable to interference due to the fact that schools serialize knowledge by curriculum (not by the neural mechanism of the learn drive). Competitive inhibition between the Learn drive and the School drive circuits will lead to the weakening of neural connections. Strong School drive will weaken the learn drive, destroy the passion for learning, and lead to learned helplessness. Powerful Learn drive will lead to rebellion that will protect intrinsic passions, but possibly will also lead to problems at school. Storing new knowledge under the influence of Learn drive is highly rewarding and carries no penalty (by definition of the learn drive). This will make the learn drive thrive leading to success in learning (and at school). In contrast, poor quality of knowledge induced by the pressures of the School drive will produce a weaker reward signal, and possibly a strong incoherence penalty. The penalty will feed back to produce reactance against the school drive, which will in turn require further coercive correction from the school system, which will in turn reduce the quality of knowledge further. Those feedback loops may lead to the dominance of one of the forces: the learn drive or the school drive. Thriving learn drive increases rebellion that increases defenses against the school drive. Similarly, increased penalization at school increases learned helplessness that weakens the learn drive and results in submission to the system. 
Sadly, in most cases, the control system settles in the middle of those two extremes (see: the old soup problem). Most children hate school, lose their love of learning, and still submit to the enslavement. Their best chance for recovery is the freedom of college, or better yet, the freedom of adulthood. See: Competitive feedback loops in binary decision making at neuronal level
Copyright note: you can republish this picture under a Creative Commons license with attribution to SuperMemo World, and a link to the updated version here
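
The competing feedback loops described in the figure can be illustrated with a toy simulation. This is only a minimal sketch and not a model taken from this text or from neuroscience: the function name simulate_drives, the update equations (a standard two-population competition model), and the parameters inhibition, rate and steps are all assumptions made for illustration. Each drive is a number between 0 and 1 that reinforces itself and is suppressed by the other.

```python
def simulate_drives(learn, school, inhibition=1.5, rate=0.1, steps=2000):
    """Toy model of two mutually inhibiting drives, each scaled to [0, 1].

    All names and parameters are hypothetical; the update rule is a generic
    competition model chosen for illustration only.
    """
    for _ in range(steps):
        # Each drive grows on its own and is suppressed by the other drive.
        d_learn = rate * learn * (1.0 - learn - inhibition * school)
        d_school = rate * school * (1.0 - school - inhibition * learn)
        # Update both drives at once and keep them within [0, 1].
        learn = min(1.0, max(0.0, learn + d_learn))
        school = min(1.0, max(0.0, school + d_school))
    return learn, school


if __name__ == "__main__":
    # Strong mutual inhibition: the drive that starts stronger takes over.
    print(simulate_drives(0.6, 0.4))
    print(simulate_drives(0.4, 0.6))
    # Weak mutual inhibition: both drives coexist at an intermediate level.
    print(simulate_drives(0.9, 0.1, inhibition=0.5))
```

In this sketch, strong inhibition produces dominance of whichever drive starts with an edge, which loosely mirrors the rebellion and submission extremes. With weak inhibition, both drives settle at an intermediate level. Whether that intermediate state corresponds to the "middle" outcome mentioned above is an interpretation, not something the toy model can establish.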

Summary: Pleasure of learning

  • human brain naturally tunes in to "interesting information" in the environment
  • learning and discovering new things is rewarding
  • many educators subscribe to the dangerous myth that learning may cause displeasure and still be effective
  • surprisal is highly valued in new knowledge acquisition
  • predictability and surprisal may both add to attractiveness of the information channel
  • attractiveness of the information channel depends on the prior knowledge
  • information delivered to the brain must account for prior knowledge. This factor makes universal delivery, e.g. via lecturing, very difficult
  • attractiveness of the information channel depends on the speed of delivery and the speed of processing
  • the speed and complexity of information delivery in learning must be tailored to individual needs
  • the encoding of a new high value associative memory occurs simultaneously with sending a signal to reward centers in the brain
  • failed tailoring of information channels in schooling leads to lack of reward
  • learning provides a unique type of sustainable pleasure that may have therapeutic value
  • for systemic reasons, schooling usually fails to tune in to child interests
  • unrewarding nature of schooling is the chief cause of near-universal dislike of "learning" at school
  • by destroying the pleasure of learning we contribute to creating an unhappy society
  • the fundamental law of declarative learning states: No pleasure, no learning!