SuperMemo does not work for kids

This text is part of: "I would never send my kids to school" by Piotr Wozniak (2017)


Back in 1990, we had no doubt: SuperMemo would conquer the world. I envisaged it in every home. Every child would be introduced early to maximize her performance at school. Older people would use it to retain mental vitality into their nineties. We could even cure Alzheimer's disease. Unfortunately, we have gradually been discovering many areas where SuperMemo is not applicable. In this short text, I explain why SuperMemo does not work for kids.

Using SuperMemo with kids

The SuperMemo algorithm is based on a mathematical model of memory. The memory changes it describes are molecular and structural. The underlying assumption is that synapses that hold a memory are there in place and are accessible via the same fixed neuronal pathway. Only this pre-assumed network stability allows for memories to last for years and decades. In an Alzheimer's patient, a neural pathology takes away neurons needed to form memories. In a child, the opposite is a problem: too many neurons keep being added to the picture.

In a fast-growing brain with fast recycling of neurons and neural connections, SuperMemo does not work. A child's brain is in the state of neural flow, and a specific memory will often be lost to processes other than just forgetting based on interference or molecular decay. A memory may be lost because the involved synapses are lost or the neural pathway to those synapses gets severed or re-wired.

In an adult student, resuming repetitions after a memory lapse will often be a form of re-learning. It will re-establish memories that got weak. In a child, the same process will not differ from learning from scratch. The same memory that might have been established a dozen times can still behave as if it has never been there. Such a memory cannot be subject to SuperMemo review. It is as if we deleted an item and typed it into SuperMemo again. This will primarily be the case for memories that are abstract in nature, for example, the alphabet, numbers, proper names, capitals of the world, etc. Before a rock-solid model of reality develops, those abstractions do not integrate well and result in low memory coherence.

Low coherence is less of a problem if there is a sensory, emotional, or procedural component to memories. This will stop being a problem entirely if the new piece of knowledge complements the world model formed by a child's brain.

If knowledge is like a jigsaw puzzle, a child's brain will reject new pieces that do not fit the puzzle.

As explained in Pleasure of learning, pieces that fit produce a reward, and spark further interest. Pieces that do not fit cause discouragement and get rejected by the memory system. They can lead to toxic memories. In short, each time there is a passion and interest, different memory rules apply. If the kid is fascinated with trains, Lego, planets, or minerals, he will be able to establish complex memories "ahead of time".

Flat forgetting curve

This amazing graph, which I managed to obtain from a pre-schooler's SuperMemo collection, shows how different kids are from adults:

Figure: A forgetting curve from a preschooler's SuperMemo collection. The absence of forgetting indicates the absence of intentional declarative learning. The decay constant is nearly zero, which makes the optimum interval meaningless. 1706 repetition cases have been recorded. This flat forgetting curve would go unnoticed in older versions of SuperMemo due to the adult-centric assumption that on Day=0, retrievability is 100%. Over time, this forgetting curve will lean down to produce a graph typical of adult learning. This process may take a few years and should not be artificially accelerated, e.g. by means of coercion. This curve is a hypothetical expression of the semantic brain.

Instead of a typical forgetting curve that you know from collections used by adults, you see a brain whose content does not seem to be affected by the learning process in SuperMemo. It can be interpreted as absence of memory, absence of learning, or absence of forgetting. All interpretations carry a grain of truth.
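The caption's remark that a near-zero decay constant makes the optimum interval meaningless can be illustrated with a minimal sketch. It assumes a plain exponential forgetting curve, R(t) = R0 * exp(-d * t), which is an assumption made here for illustration and not necessarily the exact formula used by SuperMemo. The function name `optimum_interval`, the target recall levels, and all the numbers are hypothetical.

```python
import math


def optimum_interval(r0: float, decay: float, target: float) -> float:
    """Days until recall drops from r0 to target.

    Assumes (for illustration only) R(t) = r0 * exp(-decay * t).
    """
    if not 0.0 < r0 <= 1.0 or not 0.0 < target <= 1.0:
        raise ValueError("r0 and target must lie in (0, 1]")
    if decay < 0.0:
        raise ValueError("decay must be non-negative")
    if r0 <= target:
        return 0.0  # recall is already at or below the target on day 0
    if decay == 0.0:
        return math.inf  # a perfectly flat curve never reaches the target
    return math.log(r0 / target) / decay


# Hypothetical adult-like curve: starts at 100% and decays at a visible rate.
adult = optimum_interval(r0=1.0, decay=0.02, target=0.9)
# Hypothetical preschooler-like curve: starts near 66% and is almost flat.
child = optimum_interval(r0=0.66, decay=0.0001, target=0.6)

print(f"adult-like curve: review after about {adult:.1f} days")
print(f"nearly flat curve: review after about {child:.0f} days")
```

With a decay constant close to zero, the computed interval runs to years and swings wildly with tiny changes in the estimate. With a target above the day-0 recall, it collapses to zero. Either way, under these assumptions the number carries no scheduling information.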

The flat graph comes from the fact that small kids have a very poor metacognitive capacity. They do not employ mnemonic techniques. They easily remember by relying on "animal memory": colors, shapes, sounds, situations, episodes, threats, and important words of their native language. This is an extreme case of the need for knowledge coherence. Whatever enhances a child's model of the world will integrate with his "world" knowledge. It can even integrate at the physical neural level and last for a lifetime. All irrelevant and abstract material will be rejected. As a result, this is a perfect illustration of the fact that early teaching makes little sense. It is the child's brain that makes choices on what to remember. SuperMemo becomes useless because it does not contribute to the child's knowledge. That contribution will only slowly develop over time when memory improves, mnemonic capacity improves, and the ability for abstract reasoning shows up.

In the presented example, exactly 2/3 of knowledge is retained (stable in memory), while 1/3 is instantly rejected. There is no magic behind 66%. The number is a sheer coincidence. All it says is that 33% of the material is new and rejected, while 66% of the material is already well remembered and will be retained. The actual forgetting curve will never truly be flat. A child forgets and re-learns on a regular basis. This is why items that get lost may get compensated for with items that are re-learned. It all happens behind the back of SuperMemo.
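The arithmetic above can be made concrete with a small sketch. It treats a collection as a mix of items, each with its own day-0 recall and decay constant, under an assumed exponential model. The function name `expected_recall` and all item values are hypothetical; they are not taken from the child's actual data.

```python
import math

# Each item is (recall on day 0, decay constant per day); values are illustrative.
Item = tuple[float, float]


def expected_recall(items: list[Item], day: float) -> float:
    """Average recall across items on a given day, assuming r0 * exp(-decay * day)."""
    if not items:
        raise ValueError("items must not be empty")
    return sum(r0 * math.exp(-decay * day) for r0, decay in items) / len(items)


# Preschooler-like mix: two thirds of the items are already known and stable,
# one third is rejected outright, and nothing decays in between.
child_items = [(1.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
# Adult-like mix: every item starts at full recall and then decays.
adult_items = [(1.0, 0.05)] * 3

for day in (0, 1, 7, 30):
    child = expected_recall(child_items, day)
    adult = expected_recall(adult_items, day)
    print(f"day {day:>2}: child-like {child:.2f}, adult-like {adult:.2f}")
```

In this toy mix, the child-like curve stays at two thirds on every day because no single item changes with the interval, while the adult-like curve starts at 100% and falls. The real curve would wobble, since lost items and re-learned items only roughly cancel out.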

For a child, the best form of learning is to "experience life"

Proponents of early instruction are scornful about ideas such as "learning by doing", "project learning", "group learning", "outdoor learning", etc. It is very easy to dream of instilling math genius or reading genius in little kids. The main problems with that eager approach are that (1) kids lack long-term memory, (2) this deficiency is a welcome sign of growth, and (3) early instruction can lead to slower development and toxic memories.

Let's have a look at the possible interpretations of the graph:

  • absence of forgetting: as all knowledge retained is gained, lost, or re-gained behind the back of SuperMemo, the forgetting curve is perfectly flat. It seems as if forgetting did not take place
  • absence of learning: the total volume of knowledge seems to be constant. In reality, new items keep flowing into SuperMemo to express new knowledge
  • absence of memory: the child has a negligible mnemonic capacity and negligible conscious control over what to remember. She remembers things her brain deems useful and consistent with the model of the world. The ability to commit things to memory with intentionality will likely develop in the age bracket of 4-8

The correct interpretation is that what we see is rather the absence of "SuperMemo memory", i.e. the conscious ability to commit knowledge to memory and retain it along with spaced repetition as prescribed by the two-component model of memory (relevant to a mature brain).

Systematic scanning of memory, e.g. at repetition, is a skill kids develop only at the age of 4-7. The conscious ability to commit things to memory comes at the same time or even later. The ability to visualize and memorize abstract concepts comes later still. A late arrival of the said skills may mean the brain is experiencing a prolonged growth window. Prolonged and extensive growth is good news for the brain, bad news for the early adoption of SuperMemo.

Interestingly, this flat graph is reminiscent of the best performing students, who instinctively know which material will not stick in their memory. Good students use good knowledge formulation techniques and produce relatively little forgetting. Their forgetting curves are also pretty flat; however, they start from a very high base approaching 100%. This perfect learning would also be possible in children if SuperMemo knowledge were chosen on the basis of what they already know, not at 66% chance (as in the graph).

Employing accessory networks

Early in development, before declarative interests develop, emotions and procedures can drive and sustain learning.

For example, the value of accessory networks can be seen when developing speech. Here are some early words that might easily stick to memory:

  • ball, zipper, train, car: they all involve action, and connections in and to the motor networks are early to develop
  • sad face, fire, crash: they all involve emotion, and emotional networks are among the phylogenetically oldest and earliest developing portions of the brain
  • big, hand, cat: they all involve a sensory component
  • mama, milk, cold: these are items of importance that will be reinforced often and probably also wired early in the emotional and sensory context
  • words of special interest: those are entirely unpredictable. The kid may pick up a complex word in circumstances that are hard to explain and invariably involve curiosity, passion, hyperfocus, etc.

Employing the world model

Once the kid is able to build a model of the world, he will gradually be able to expand on that model. Accessory networks that provided the early incentive to build the model will no longer be needed. Declarative knowledge can abstract from basic needs related to survival and advance to understanding the world in which knowledge has its own value.

At its early stage, the brain is still in turmoil and long-term memory is crippled; however, the first budding interests in specific areas of knowledge may show up. This is the stage when parents and educators make the gravest errors in child learning. They attempt to impose their own understanding of the world and the learning process on the child. As only coherent models of reality can grow in complexity and be reinforced in sleep, all preconceived educational ideas that do not fit those models will be rejected by the child and her memory.

Early education must be child-led, or it risks inflicting harm on young brains!

A preschool with a program of academic learning may do more damage than good. Instead of fostering learning, it will begin the long process of conditioning in which kids hate learning, hate school and hate the coercion coming from the adult world.

Early learning

To fix knowledge with SuperMemo, there must be a clear pathway in a kid's brain that leads to a given synapse or a set of synapses. The pathways that develop early can be reused early, and be subject to learning. Abstractions need to build up gradually on knowledge with a lesser degree of complexity. Skills involved in reading develop slowly and incrementally: decoding letters, putting them in strings, converting to sounds, etc.

Larry Sanger, the co-founder of Wikipedia, did a great job. His kids could read the US Constitution at the age of three! They also adopted SuperMemo early. Will Larry be able to convert this early capital into long-term success in harboring future geniuses? There is no guarantee. Larry's focused and passionate parenting might actually be the best component of the entire equation. Sanger wrote a nice monograph for other parents to follow his example. This must be taken with caution though. An overzealous parent might proceed too fast and violate the fundamental law of learning. If the kid keeps jumping on the book during reading, he is not ready.

Similarly, SuperMemo is a great knowledge and memory management system. You can actually start in pregnancy with a kickchart, follow it with SleepChart, and follow the rules of incremental learning by managing, for example, the exercise and learning program. Managing and presenting knowledge is, at that stage, a parent's responsibility. Later on, say at the age of 3-5, you can try spaced repetition. The kid should not sit down at the computer, but you can employ spaced review by interweaving questions into play. This is less for memory's sake, and more for understanding what the kid knows and what is worth working on. Multiple memory lapses in an adult can be harmful. They may lead to establishing a toxic memory. In a kid, multiple lapses are the norm and a monumental waste of time. They are the best illustration of the fact that childhood amnesia is not a result of retrieval failure. Childhood amnesia is real. Memories are lost, and they are lost much faster than is popularly believed.

If items do not click (i.e. do not fit the jigsaw puzzle), they should be postponed, e.g. by a year. The kid may eat potatoes daily and still fail to remember the word "potato" just because it is too complex or it interferes with the word "tomato". Hard items can be tackled anew when they might become memorizable in a new memory context. They may become memorizable after changes to the current brain's "semantic status quo". As for items that "work", they are probably a waste of time too. It is unlikely they are remembered because of SuperMemo. More likely, they are remembered despite interference from SuperMemo. The kid probably needs or uses the piece of knowledge on a regular basis. However, every user of SuperMemo knows that well-remembered items cost little in the process. For the parent, it is an interesting source of information on the child's strengths and also on his surprising lapses (e.g. how memories related to summer disappear in winter).

Coercive learning

SuperMemo is unsuitable for little children because of their poor mnemonic capacity. However, it may be doubly harmful if the use of the program is coerced. If learning is a result of threats or fake rewards, the negligible mnemonic capacity can be obliterated by the lack of intrinsic motivation. The following forgetting curve from a first grader might be one of the worst I have ever seen:

Figure: Good educators know that you cannot motivate a child extrinsically. Poor intrinsic motivation and poor mnemonic skills make SuperMemo unsuitable for children. In the presented case, a forgetting curve shows catastrophically poor performance in a 7-year-old child coerced to learn the vocabulary of a foreign language. This is a classic case of asemantic learning. The curve bears no relevance to a child's IQ. At this age, some kids may already show some success, as long as the use of SuperMemo is entirely voluntary.

Personal anecdote. Why use anecdotes?
I recall my mom's heroic efforts to make sure I could speak German as a kid. She would speak to me in German. She would send me to a German class in the kindergarten, and in the primary school. She tried all the tricks in the book. She gave up after some 8-9 years of trying in vain. I was roughly 12 years old! I never learned any German. All her efforts came to naught. Kids need to want to learn or they will learn little. Coercive learning is wasteful, may result in toxic memories, and may ultimately lead to a hatred of learning.

When does SuperMemo start working?

We can measure the development of memory proceeding from emotional, sensory, and motor towards the abstract. Those measurements demonstrate the futility of all forms of acceleration into the areas where the involved neural circuits are immature.

This should tell parents that forcing their own accelerated agenda, like reading, or even passive reading, algebra, and so on, may backfire and discourage. This also explains why learning by doing, social interaction, conversation with a parent, following one's interests can do marvels. This phenomenon is most dramatically pronounced early in development. It will take years of meticulous weaving of granular memories before an abstract brain develops. I support early math or reading as long as there is support and active engagement from the kid herself! One may need to wait until early teens before the brain develops all abstract capacities needed to soar at the higher level and enjoy calculus, or incremental reading.

When SuperMemo starts working, the news is not so happy: the brain is past its most explosive growth phase

Conclusions: SuperMemo for kids

SuperMemo cannot be used in small kids because:

  • The spaced repetition algorithm does not work well in childhood due to childhood amnesia
  • Children are slow to develop abilities needed to use SuperMemo, incl. long-term memory
  • The active, conscious act of committing knowledge to memory is a skill little kids do not have
  • An active, conscious scan of memory to retrieve knowledge is not available to little kids
  • The ability to store asemantic knowledge develops very slowly. For a little child, even numbers and the alphabet are hard to memorize
  • The greatest danger of using SuperMemo in children is that any form of coercion may result in a hatred of SuperMemo

The latter factor is even worse because the kid may take a lifetime pass on the blessings of spaced repetition. Toxic associations can form even in a voluntary setting where extrinsic motivation is built with bribes. Nothing works better than a self-directed plunge into spaced repetition or incremental reading. The age of 9-12 is when more and more parents start having some success with SuperMemo as long as it is a supervised process. It is very rare to have solo SuperMemo students at this age.

I do invite kids to work with SuperMemo from time to time to collect some data, and I only had one case when a 9-year-old asked to have the program installed on his computer. His forgetting curve, when working with me, looked fantastic. We focused on memorizing things he liked and found easy to memorize. His new solo data is not available yet; however, I am not optimistic.

If your little ones use SuperMemo, please send me their forgetting curves, other data, or their entire collections.

As data collected with children are pretty precious, the following guidelines could be used by parents who are curious, want to understand their child better, and minimize damage at the same time:

  • never push kids to use SuperMemo
  • do not provide extrinsic rewards (e.g. bribes)
  • ask kids about things they want to keep in SuperMemo
  • keep in SuperMemo only knowledge that is well-remembered, i.e. delete, dismiss, or postpone knowledge that causes problems