Optimality of the learn drive

This article by Dr Piotr Wozniak is part of SuperMemo Guru series on memory, learning, creativity, and problem solving.

Goals in learning

Education systems around the world attempt to define an optimum curriculum for an average intelligent man on the planet. In that quest, adults use the entire experience of mankind to optimize the goals for a child. The problem with this approach is that all knowledge in a concept network grows layer by layer in a conceptualization process. The adaptation process resulting from the conceptualization is controlled by a set of human needs, the interaction with the environment (esp. the knowledge environment), and the goals. It is possible to explain goals to a child and hope the adaptation will proceed in the direction of those goals. However, the goals themselves will obtain valuations that reflect the directions taken by the conceptualization process itself. It is only possible to plant the seeds of ideas and the seeds of goals. It is not possible to set an extrinsic goal for a brain and hope the brain arrives at it on its own by rational control. Without a goal rooted in the knowledge valuation network, it is impossible to employ the learn drive in a quest to achieve that goal.

In free learning, extrinsic goal setting for an intelligent developing brain is virtually impossible

Free exploration

The conceptualization process is based on the phenomenon of emergence. It can only be controlled by the environment and the needs. We can plant books in a child's room, but we can't make a child read. All the controlling factors must not interfere with the central valuation system of the entire process: the learn drive. Interference with the learn drive will inevitably lead to a war of the networks and the suppression of the drive. As a result, the entire conceptualization process can be derailed. Interference with the needs by means of rewards and penalties may distort the conceptualization too. If a child is left hungry, it will experience powerful incentives to direct the conceptualization towards the body's nutritional needs. However, this type of incentivization is not necessarily conducive to high-level intellectual achievement. In an extreme case, we can raise a perfect Spartan warrior by the age of 5. However, a brain busy with conceptualizing for the goals of warriorship is not likely to soar in the realms of abstract knowledge later in life.

If the basic needs are satisfied, access to knowledge becomes the chief driving force. Open access to the world wide web of knowledge is probably the least prejudiced tool of efficient and unbiased conceptualization. On the sidelines, the adults may plant enticing baits that trigger explorations. The baits might take the form of a microscope for a future scientist, or a baseball bat for a future sports star. With a bit of luck, the baits can nudge the trajectory. However, all explorations must be free to be optimal.

Free exploration is necessary for efficient conceptualization. Its direction can occasionally be nudged with lucky incentivization

Harms of intervention

Each time the adult world attempts to define optimum educational targets, it fails to account for the fact that all educational trajectories rely on incremental optimization based on the valuation of individual pieces of knowledge. All lofty educational goals may easily be ruined by the fact that pre-designed trajectories include learning steps that are too steep or too flat. Each time suboptimal learning choices are made, the process is slowed down, and the learn drive is conditioned out of the picture. In addition, coercive learning that keeps stumbling over steps that are too steep will inevitably lead to toxic memories and the hatred of learning.

Coercive intervention in the learning process will inevitably suppress the learn drive, and undermine further explorations

Learning as hill climbing

The learn drive operates by the comparison of value gradients in the environment. By comparing the valuation of pieces of knowledge, and the learntropy of the sources of information, the brain will fashion an objective function to optimally choose the most efficient stream of high value knowledge from the environment. As in hill climbing, this optimization may lead to a local maximum. A degree of stochastic disruption is often needed in the algorithms used in mathematical optimization to escape such traps. In the same way, the learn drive may employ the random aspect of creativity. This makes the guidance of the learn drive non-deterministic. The cumulative effect of minor stochastic disruptions will be that the same learn drive, in the same environment, for two identical brains will produce vastly different outcomes.
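The role of stochastic disruption in optimization can be illustrated with a minimal sketch. It is not a model of the brain. The two-peak `landscape` function, the step sizes, and the `jump_prob` parameter are all hypothetical choices made only to show how a purely greedy climber stays on a lower peak while occasional random jumps let the same climber reach a higher one.

```python
import math
import random


def landscape(x: float) -> float:
    """Hypothetical value landscape: a low peak near x = -1, a higher peak near x = 2."""
    return math.exp(-(x + 1.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)


def hill_climb(start: float, steps: int = 2000, step_size: float = 0.05,
               jump_prob: float = 0.0, jump_size: float = 3.0, seed: int = 0) -> float:
    """Greedy hill climbing with optional random long jumps (assumed parameters)."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    x = start
    for _ in range(steps):
        if rng.random() < jump_prob:
            # Stochastic disruption: try a distant point now and then.
            candidate = x + rng.uniform(-jump_size, jump_size)
        else:
            # Local comparison of neighboring values.
            candidate = x + rng.choice((-step_size, step_size))
        # Move only if the candidate is valued higher than the current position.
        if landscape(candidate) > landscape(x):
            x = candidate
    return x


if __name__ == "__main__":
    greedy = hill_climb(start=-1.5, jump_prob=0.0)   # settles on the lower peak
    jumpy = hill_climb(start=-1.5, jump_prob=0.05)   # very likely to find the higher peak
    print(f"greedy: x={greedy:.2f}, value={landscape(greedy):.2f}")
    print(f"jumpy:  x={jumpy:.2f}, value={landscape(jumpy):.2f}")
```

With a different `seed`, the climber takes a different path through the same landscape, which is the sense in which the guidance is non-deterministic.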

In a population of children, each will find her own local optimum niche. This will lead to a harmonious balance in the distribution of skill assets in a population. The role of the adult world should be limited to convincingly sketching out lofty horizons, and to non-coercive assistance for those who get stuck on unattractive local maxima. The learn drive satisfies the local or populational optimality criterion.

The guidance of the learn drive in exploration can be compared to hill-climbing in mathematical optimization

Learn drive as optimum guidance

In the process of building an efficient concept network, new pieces of knowledge must match prior knowledge (see: Jigsaw puzzle metaphor). This implies that it is impossible for an external agent to predict which pieces would provide such a match. Interactive tutoring may be pretty effective in identifying such pieces. However, the entire process must be under the supervision of the learn drive. For example, a tutor may be pretty efficient at explaining the intricacies of a math problem, but the learn drive may call for a switch to another form or area of learning (e.g. due to domain-specific fatigue).

The learn drive has the capacity to compare the value of two pieces of knowledge (using the knowledge valuation network). It also has the capacity to evaluate the value of the information stream (i.e. its learntropy). With those capacities, the learn drive can be compared to a sense of smell that can efficiently detect information value in the environment.

Due to its reliance on prior knowledge and unique knowledge valuations, the learn drive is irreplaceable. As such, it provides the optimum guidance in the learning process.

The guidance of the learn drive is optimum. Child is always right

It is easy to misunderstand the claim of the optimality of the learn drive. It is an efficient control system that supervises the selection of pieces of knowledge and the selection of information channels (on the basis of their learntropy). This does not imply that the reliance on the learn drive is hermetic. The learn drive needs to compete with other drives (e.g. the sex drive). In that sense, the learning choice may be suboptimum, esp. in conditions of reward deprivation that amplify selected drives (see: Reward deprivation).

Moreover, the optimum choice of knowledge or information channel does not ensure optimum knowledge. In an extreme case, knowledge may be false and lead to the death of an individual. The optimality of the learn drive is limited to choices made in learning.

Optimality of the learn drive does not imply optimality of decisions, let alone behaviors, let alone the outcomes

Populational optimality

The emergence of knowledge conceptualized under the guidance of the learn drive can be compared to the emergence of species in evolution. The optimality criterion ensures a locally optimum trajectory that can lead to a diverse and well-balanced ecosystem. As there is no optimum species, there is no optimum of human knowledge.

As Georgios Zonnios put it:

Evolution naturally proceeds towards a state of high interdependence between the many different parts. For knowledge, this means individual learners will learn things that are relevant to their context. For society, this means that all angles of knowledge will be covered by different people in different ways. Where opportunities abound and resources are sufficient, change may occur very quickly. For example, a significant shortage of workers in a specific field will naturally cause individuals to work towards that field, possibly by increasing incentives to work there

Curriculum designers attempt to play God and design a perfect singular species that could optimally use the earth's resources. This is a futile effort that undermines the lofty goals of education. Analogously, the late specialization in college is tantamount to deferring speciation till the emergence of the Dinosauria.

There is a simple mountain climb metaphor for the superiority of the learn drive over direct instruction:

In mountain climbing, the adult may see the summit (goals), but the child can see the path (via the learn drive). The adult will always attempt to deterministically go for the summit in sight. The child may climb to new heights (i.e. new discoveries). The view of the path ensures local optimality of the climb, and global optimality for a population of climbers. This way the individual climb does not need to be globally optimal nor deterministic. For more see: Mountain climb metaphor of schooling

Architectural differentiation

The power of free learning is rooted in the fact that the same abstract knowledge can be represented by different concept network architectures with different valuations, stabilities, retrievabilities, etc.

Different networks may produce different outcomes for the same input. They may produce different solutions to the same problem. Most of all, they may favor different models for the same supporting evidence.

Even if the topology of the network were to be identical, the outcomes depend on link properties and a degree of creative randomness. The outcomes of conceptual computation feed back to the network and result in diversification that is inevitable even if the environmental inputs and brain states were to be identical.

For example, a scientist who arrives at studying the brain from the field of artificial neural networks may hold a connectionist misinterpretation of how the brain works. As I arrived at the same field of study via the two-component model of long-term memory, I instantly favor the grandmother cell theory, which in turn favors my own take on the brain's conceptualization process, which in turn favors the differentiation in education, which loops back to my fervent support of free learning. A well-schooled approach would be to study the perfect textbook called "The Brain" and there would be no competing schools of thought. There would only be the one "true" school as defined in the perfect textbook.

For those architectural reasons, the sequence of learning determines the layering and the ultimate structure of knowledge. Homogenized learning at school aims at identical models and identical architectures. In reality, schooling collapses due to the natural differentiation of concept network architectures. Instead of paddling against the river of diverging conceptualization processes, we should let each student build her own semantic framework for each abstract model. This is the key to human innovation.

Conceptualization is always divergent even if environmental inputs, brain states, and network topologies were identical at the starting point

Videogame argument

The most often raised argument against the optimality of the learn drive is the claim that kids left free to choose would play videogames all day long.

Parents are right that kids would indeed binge on gaming in the first period of freedom. That binging would gradually ease due to the reward depletion that increases valuations of competing rewards: friends, sports, YouTube, social media, etc.

School is the prime cause of that game binging. Young kids may grow up with a gaming console used by their father, only to begin their true adventure with gaming in proportion to the pressures of schooling. At some point, a secondary factor may start playing a role: an inconsistent parent generating variable reward through occasional bans or limits on digital devices. This can spiral into a true addiction that may take quite a while to recover from at the time when the child is given total freedom.

Even a simple and consistent limit on gaming may backfire if it is too narrow relative to the needs. If there is no saturation, if the child is left unsatisfied, the value of the reward of gaming will increase by sensitization. This is analogous to sensitization that occurs when we are allowed to incompletely satisfy the thirst for water (e.g. at 80% only). This will result in relative suppression of other sources of reward even if they are freely available. Next time we are thirsty, we may fight for water a bit earlier and a bit harder. By choosing narrow time margins for gaming, the restrictions may increase the craving even if they are set consistently.

The optimality of the learn drive can be undermined in the same way as a single pest species can ruin a perfectly harmonious ecosystem.

Even a small seemingly rational intervention in the reward system of the brain may override the reward of the learn drive and undermine its optimality

Local maximum problem

In the optimization process, there are blind pathways strewn with candy leading kids astray in videogames. In theory, it seems possible to design a learning space in which a human would land in a local maximum of some virtual reality. Such a design is unlikely in the light of the access to an infinite variety of explorations on the web. However, it is interesting theoretically.

If someone were to find a fake summit in virtual reality, coercive learning is the exact societal error that could perpetuate the fake find forever. The learn drive is partially stochastic and its optimality must be seen from the populational point of view.

Optimality of the learn drive refers to its being the best comparator of the value of knowledge, and of the information channels. It does not mean that the learn drive is free from the competition from other rewards (e.g. gambling, alcohol, sex, etc.). The key to harmonious development is freedom. It is the limits on freedom that result in reward deprivation that may lead to addictions (incl. game addiction).

If the learn drive were to land one in a local maximum, freedom to explore is the best chance to get out of the trap, at least as a population

Electronic circuit metaphor

If you compare the learn drive control system to an electronic circuit, the reasoning about optimality becomes simple and obvious. The learn drive is like a switch. If you damage the switch, the whole network of knowledge can become dysfunctional. Without the learn drive, the store of knowledge cannot expand efficiently, and it cannot heal the damage to the switch itself.

See: Circuit metaphor of the learn drive

Optimal control theory

Personal anecdote. Why use anecdotes?
There is an ironic twist to the concept of the learn drive, and its use in education. One of the last acts of coercion in my 22 years of schooling was a clash with Professor Puchalka at the University of Technology. In 1986, I was finally free to learn the way I wanted. Having gotten rid of my military service problem, I was free to quit the university. However, I opted to pursue an MS degree in computer science using a so-called individual path of study. I could add and remove subjects from my list of books to study (see: How I invented perfect schooling). There was only one caveat. My new path had to be approved by the faculty, and Prof. Puchalka agreed on one condition: his own lecture on control theory was to remain on the list. It remained compulsory (if I wanted a degree). Puchalka said "control theory is everything. No engineer could leave the school with a degree without understanding the subject". He was right: control theory dominates so many branches of science that without it, we keep choosing wrong strategies all over the place. Puchalka was also awfully wrong. It is precisely control theory that should have told him that you cannot control the learn drive of a student.

The interaction of (1) the conceptualizing brain armed with the learn drive with (2) the environment is an example of a continuously operating dynamical system, i.e. the exact kind of system Prof. Puchalka wanted me to study. Stable and effective control is based on the freedom of choice. The learn drive guidance system is the optimum controller in the learning process. The learn drive uses the knowledge valuation network as a filter of the input received by the sensor. That filtered signal is fed into a signal comparator. The controlled process variable is the learntropy of information channels available in the environment. It is compared with the expected value of learntropy derived as a trailing average of input valuation. When the value of the signal drops below a certain level, the learn drive system may initiate a search for new sources of knowledge.
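The comparator described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the class name `LearnDriveSketch`, the `window` length of the trailing average, and the `tolerance` fraction that defines "a certain level" are hypothetical and do not come from the text. The input to `observe` stands for a signal that has already been filtered by the knowledge valuation network.

```python
from collections import deque
from typing import Optional


class LearnDriveSketch:
    """Toy comparator: current valuation vs. a trailing average (assumed design)."""

    def __init__(self, window: int = 5, tolerance: float = 0.7) -> None:
        # Trailing record of recent input valuations (hypothetical window size).
        self.history: deque = deque(maxlen=window)
        # Fraction of the expected value below which a search starts (assumption).
        self.tolerance = tolerance

    def expected(self) -> Optional[float]:
        """Expected learntropy, taken here as the trailing average of past valuations."""
        if not self.history:
            return None
        return sum(self.history) / len(self.history)

    def observe(self, valuation: float) -> bool:
        """Compare the new valuation with the expectation.

        Returns True when the signal has dropped far enough below the
        expectation to initiate a search for new sources of knowledge.
        """
        expected = self.expected()
        self.history.append(valuation)
        if expected is None:
            return False  # nothing to compare with yet
        return valuation < self.tolerance * expected


if __name__ == "__main__":
    drive = LearnDriveSketch()
    # A channel that stays rewarding for a while and then goes stale.
    for value in [1.0, 0.9, 1.0, 0.95, 0.4]:
        print(value, drive.observe(value))
```

In this toy run, only the last reading (0.4) falls below 70% of the trailing average, so only that reading triggers the search. A relative threshold was chosen over an absolute one so that the comparator adapts to whatever level of reward the learner has recently experienced. That choice is an assumption of the sketch.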

Only the learn drive system maximizes the learntropy of the input signal. A teacher introduces a control error in the system. Reactance is the controller's response to the error. Freedom is a conditio sine qua non of efficient learning. That includes the freedom to skip control theory. I would be back to control theory anyway; at the right time, in the right context.

You cannot make a student learn efficiently without building a foundation in the knowledge valuation network. Even worse, coercion leads to reactance that may hurt the learning process. In 1986, I was desperate to learn programming. I was right to give programming a priority. Programming laid some groundwork for a better understanding of control theory (algorithms can be more intuitive than calculus). More importantly, programming directed me on the path to SuperMemo that totally changed the course of my life. Puchalka was a great expert in control theory, but this did not help him understand the optimum theory of control of the learning process. I cheated on the exam (probably the only time in my life), left school with a very poor understanding of control theory, and had some toxic associations with the subject for quite a few years afterwards. Luckily, dendritic exploration of the world of knowledge had to lead me back to the subject. It is control theory that provides the theoretical underpinnings of the optimality of the learn drive. Teachers should never coerce students into learning. The smarter the kid, the bigger the reactance and the more violent the defense of one's own autonomy. 33 years later, Professor Puchalka has long retired, and I feel utterly vindicated. He was right about the value of control theory, but I was right about my terms of learning. This episode from my own history adds extra fuel to my fight for the educational liberation of the young generation: Compulsory schooling must end



For more texts on memory, learning, sleep, creativity, and problem solving, see Super Memory Guru