The birthday of spaced repetition: July 31, 1985

From supermemo.guru
(Redirected from Birth of spaced repetition)
Jump to navigation Jump to search

This text is part of: "History of spaced repetition" by Piotr Wozniak (June 2018)

The drive for better learning

I spent 22 long years in the education system. Old truths about schooling match my case perfectly. I never liked school, but I always liked to learn. I never let school interfere with my learning. At entry to university, after 12 years in the public school system, I still loved learning. Schooling did not destroy that love for two main reasons: (1) the system was lenient for me, and, (2) at home, I was free to learn what I loved. In Communist Poland, I never truly experienced the toxic whip of heavy schooling. The system was negligent, and I loved the resulting freedoms.

We all know that best learning comes from passion. It is powered by the learn drive. My learn drive was strong and it was mixed with a bit of frustration. The more I learned, the more I could see the power of forgetting. I could not remedy forgetting by more learning. My memory was not bad in comparison with other students, but it was clearly a leaky vessel.

In 1982, I paid more attention to what most students discover sooner or later: the testing effect. I started formulating my knowledge for active recall. I would write questions on the left side of a page and answers in a separate column to the right:

English-Polish word pairs for active recall review
English-Polish word pairs for active recall review

Figure: In 1982, my learning materials were organized into columns: questions and answers, for active recall. The picture shows my Polish-English word pair database (June 1982)

This way, I could cover the answers with a sheet of paper, and employ active recall to get a better memory effect from review. This was a slow process but the efficiency of learning increased dramatically. My notebooks from the time are described as "fast assimilation material", which referred to the speed of learning and denoted this specific two-column way, in which my knowledge was written down.

In the years 1982-1983, I kept expanding my "fast assimilation" knowledge in the areas of biochemistry and English. I would review my pages of information from time to time to reduce forgetting. My retention improved, but it was only a matter of time when I would hit the wall again. The more pages I had, the less frequent the review, the more obvious was the problem of leaking memory. Here is an example of a repetition history from that time:

Intermittent learning started on June 3, 1982
Intermittent learning started on June 3, 1982

Figure: Repetition history is the record of review with dates and scores. In this example, 8 repetitions were executed in the years 1982-1985. The page of 38 word-pairs registered 0-7 memory lapses per repetition. This intermittent review was replaced with the SuperMemo method in 1985

Between June 1982 and December 1984, my English-Polish word-pairs notebook grew to include 79 pages looking like this:

Figure: A typical page from my English-Polish words notebook started in June 1982. Word pairs would be listed on the left. Review history would be recorded on the right. Recall errors would be marked as dots in the middle

Those 79 pages would include a mere 2794 words. This is just a fraction of what I needed, and already quite a headache to review. Interestingly, the pages were built for passive understanding of English (English:Polish). Only in 1984, I started learning English in an active way (Polish:English). This two year delay came from the fact that I was simply late to discover that passive knowledge of vocabulary is sufficient in reading, but it is not enough to speak a language. This kind of ignorance after 6 years of schooling is a norm. Schools do a lot of drilling, but shed very little light on what makes for efficient learning.

In late 1984, I decided to improve the review process and carry out an experiment that changed my life. In the end, three decades later, I am super-proud to notice that it actually affected millions. The experiment has opened the floodgates. We have reached the era of faster and better learning.

This is how this initial period was described in my Master's Thesis in 1990:

Archive warning: Why use literal archives?
This text was part of: "Optimization of learning" by Piotr Wozniak (1990)

It was 1982 when I made my first observations concerning the mechanism of memory that were later used in the formulation of the SuperMemo method. As a then student of molecular biology I was overwhelmed by the amount of knowledge that was required to pass exams in mathematics, physics, chemistry, biology, etc. The problem was not in being unable to master the knowledge. Usually 2-3 days of intensive studying were enough to pack the head with data necessary to pass an exam. The frustrating point was that only an infinitesimal fraction of newly acquired wisdom could remain in memory after few months following the exam.

My first observation, obvious for every attentive student, was that one of the key elements of learning was active recall. This observation implies that passive reading of books is not sufficient if it is not followed by an attempt to recall learned facts from memory. The principle of basing the process of learning on recall will be later referred to as the active recall principle. The process of recalling is much faster and not less effective if the questions asked by the student are specific rather than general. It is because answers to general questions contain redundant information necessary to describe relations between answer subcomponents.

To illustrate the problem let us imagine an extreme situation in which a student wants to master knowledge contained in a certain textbook, and who uses only one question in the process of recall: What did you learn from the textbook? Obviously, information describing the sequence of chapters of the book would be helpful in answering the question, but it is certainly redundant for what the student really wants to know. The principle of basing the process of recall on specific questions will be later referred to as the minimum information principle. This principle appears to be justified not only because of the elimination of redundancy.

Having the principles of active recall and minimum information in mind, I created my first databases (i.e. collections of questions and answers) used in an attempt to retain the acquired knowledge in memory. At that time the databases were stored in a written form on paper. My first database was started on June 6, 1982, and was composed of pages that contained about 40 pairs of words each. The first word in a pair (interpreted as a question) was an English term, the second (interpreted as an answer) was its Polish equivalent. I will refer to these pairs as items.

I repeated particular pages in the database in irregular intervals (dependent mostly on the availability of time) always recording the date of the repetition, items that were not remembered and their number. This way of keeping the acquired knowledge in memory proved sufficient for a moderate-size database on condition that the repetitions were performed frequently enough.

The birthday of spaced repetition: July 31, 1985

Intuitions

In 1984, my reasoning about memory was based on two simple intuitions that probably all students have:

  • if we review something twice, we remember it better. That's pretty obvious, isn't it? If we review it 3 times, we probably remember it even better
  • if we remember a set of notes, they will gradually disappear from memory, i.e. not all at once. This is easy to observe in life. Memories have different lifetimes

These two intuitions should make everyone wonder: how fast and how many notes we lose and when we should review next?

To this day, I am amazed that very few people ever bothered to measure that "optimum interval". When I measured it myself, I was sure I would find more accurate results in books on psychology. I did not. See: Why spaced repetition research kept failing?

Experiment

The following simple experiment led to the birth of spaced repetition. It was conducted in 1985 and first described in my Master's Thesis in 1990. It was used to establish optimum intervals for the first 5 repetitions of pages of knowledge. Each page contained around 40 word-pairs and the optimum interval was to approximate the moment in time when roughly 5-10% of that knowledge was forgotten. Naturally, the intervals would only be suited for that particular type of knowledge, and for a specific person (in this case me). In addition, to speed things up, the measurement samples were small. Note that this was not a research project. It was not intended for publication. The goal was to just speed up my own learning. I was convinced someone else must have measured the intervals more accurately, but 13 years before the birth of Google, I thought measuring the intervals would be faster than digging into libraries to find better data. The experiment ended on Aug 24, 1985, which I originally named the birthday of spaced repetition. However, while writing this text in 2018, I found the original learning materials, and it seems my eagerness to learn made me formulate an outline of the algorithm on Jul 31, 1985. This was the day I started learning human biology using my spaced repetition algorithm.

For that reason, I can say that the most accurate birthday of SuperMemo and computational spaced repetition was Jul 31, 1985.

By July 31, before the end of the experiment, the results seemed predictable enough. In later years, the findings of this particular experiment appeared pretty universal and could be extended to more areas of knowledge and to the whole healthy adult population. Even in 2018, the default settings of Algorithm SM-17 do not depart far from those early rudimentary findings.

Spaced repetition was born on Jul 31, 1985

Here is the original description of the experiment from my Master's Thesis with minor corrections to grammar and style. Emphasis in the text was added in 2018 to highlight important parts. If the text seems boring and unreadable, compare Ebbinghaus 1885. This is the same style of writing in the area of memory. Only goals differed. Ebbinghaus tried to understand memory. 100 years later, I just wanted to learn faster:

Archive warning: Why use literal archives?
This text was part of: "Optimization of learning" by Piotr Wozniak (1990)

Experiment intended to approximate the length of optimum inter-repetition intervals (Feb 25, 1985 - Aug 24, 1985):

  1. The experiment consisted of stages A, B, C, ... etc. Each of these stages was intended to calculate the second, third, fourth and further quasi-optimal inter-repetition intervals (the first interval was set to one day as it seemed the most suitable interval judging from the data collected earlier). The criterion for establishing quasi-optimal intervals was that they should be as long as possible and allow for not more than 5% loss of remembered knowledge.
  2. The memorized knowledge in each of the stages A, B, C, consisted of 5 pages containing about 40 items in the following form:
    Question: English word,
    Answer: its Polish equivalent.
  3. Each of the pages used in a given stage was memorized in a single session and repeated next day. To avoid confusion, note that in order to simplify further considerations I use the term first repetition to refer to memorization of an item or a group of items. After all, both processes, memorization and relearning, have the same form: answering questions for as long as it takes for the number of errors to reach zero.
  4. In the stage A (Feb 25 - Mar 16), the third repetition was made in intervals 2, 4, 6, 8 and 10 days for each of the five pages respectively. The observed loss of knowledge after these repetitions was 0, 0, 0, 1, 17 percent respectively. The seven-day interval was chosen to approximate the second quasi-optimal inter-repetition interval separating the second and third repetitions.
  5. In the stage B (Mar 20 - Apr 13), the third repetition was made after seven-day intervals whereas the fourth repetitions followed in 6, 8, 11, 13, 16 days for each of the five pages respectively. The observed loss of knowledge amounted to 3, 0, 0, 0, 1 percent. The 16-day interval was chosen to approximate the third quasi-optimal interval. NB: it would be scientifically more valid to repeat the stage B with longer variants of the third interval because the loss of knowledge was little even after the longest of the intervals chosen; however, I was then too eager to see the results of further steps to spend time on repeating the stage B that appeared sufficiently successful (i.e. resulted in good retention)
  6. In the stage C (Apr 20 - Jun 21), the third repetitions were made after seven-day intervals, the fourth repetitions after 16-day intervals and the fifth repetitions after intervals of 20, 24, 28, 33 and 38 days. The observed loss of knowledge was 0, 3, 5, 3, 0 percent. The stage C was repeated for longer intervals preceding the fifth repetition (May 31 - Aug 24). The intervals and memory losses were as follows: 32-8%, 35-8%, 39-17%, 44-20%, 51-5% and 60-20%. The 35-day interval was chosen to approximate the fourth quasi-optimal interval.
It is not difficult to notice, that each of the stages of the described experiment took about twice as much time as the previous one. It could take several years to establish the first ten quasi-optimal inter-repetition intervals. Indeed, I continued experiments of this sort in following years in order to gain deeper understanding of the process of optimally spaced repetitions of memorized knowledge. However, at that time, I decided to employ the findings in my day-to-day process of routine learning.

On July 31, 1985, I could already sense the outcome of the experiment. I started using SuperMemo on paper to learn human biology. That day marks the best date to nominate for the birthday of SuperMemo and the birthday of spaced repetition.

The events of July 31, 1985

On July 31, 1985, SuperMemo was born. I had most of my data from my spaced repetition experiment available. As an eager practitioner, I did not wait for the experiment to end. I wanted to start learning as soon as possible. Having built a great deal of notes in human biology, I started converting those notes into Special Memorization Test format (SMT was the original name for SuperMemo, and spaced repetition).

Human biology in the Special Memorization Test format started on Jul 31, 1985 (i.e. the birth of SuperMemo)

Figure: Human biology in the Special Memorization Test format started on Jul 31, 1985 (i.e. the birthday of SuperMemo)

My calculations told me that, at 20 min/day, I would need 537 days to process my notes and finish the job by January 1987. I also computed that each page of the test would likely cost me 2 hours of life. Despite all the promise and speed of SuperMemo, this realization was pretty painful. The speed of learning in college is way too fast for the capacity of human memory. Now that I could learn much faster and better, I also realized I wouldn't cover even a fraction of what I thought was possible. Schools make no sense with their volume and speed. On the same day, I found out that the Polish communist government lifted import tariffs on microcomputers. This should make it possible, at some point, to buy a computer in Poland. This opened a way to SuperMemo for DOS that was developed 2.5 years later.

Also on July 31, I noted that if vacation could last forever, I would achieve far more in learning, and even more in life. School is such a waste of time. However, the threat of conscription kept me in line. I would enter a path that would make me enroll in university for another 5 years. However, most of that time was devoted to SuperMemo, and I have few regrets.

My spaced repetition experiment ended on Aug 24, 1985. I also started learning English vocabulary. By that day, I managed to have most of my biochemistry material written down in pages for SuperMemo review.

Note: My Master's Thesis mistakenly refers to Oct 1, 1985 as the day when I started learning human biology (not July 31 as seen in the picture above). Oct 1, 1985 was actually the first day of my computer science university and was otherwise unremarkable. With the start of the university, my time for learning and energy for learning were cut dramatically. Paradoxically, the start of school always seems to augur the end of good learning.

First spaced repetition algorithm: Algorithm SM-0, Aug 25, 1985

As a result of my spaced repetition experiment, I was able to formulate the first spaced repetition algorithm that required no computer. All learning had to be done on paper. I did not have a computer back in 1985. I was to get my first microcomputer, ZX Spectrum, only in 1986. SuperMemo had to wait even longer. I got my first computer with a floppy disk drive, Amstrad PC 1512, in the year 1987.

I often get asked this simple question: "How can you formulate SuperMemo after an experiment that lasted 6 months? How can you predict what would happen in 20 years?"

The first experiments in reference to the length of optimum interval resulted in conclusions that made it possible to predict the most likely length of successive inter-repetition intervals without actually measuring retention beyond weeks! In short, it could be illustrated with the following reasoning. If the first months of research yielded the following optimum intervals: 1, 2, 4, 8, 16 and 32 days, you could hope with some confidence that the successive intervals would increase by a factor of two.

Archive warning: Why use literal archives?
This text was part of: "Optimization of learning" by Piotr Wozniak (1990)

Algorithm SM-0 used in spaced repetition without a computer (Aug 25, 1985)

  1. Split the knowledge into smallest possible question-answer items
  2. Associate items into groups containing 20-40 elements. These groups are later called pages
  3. Repeat whole pages using the following intervals (in days):
    I(1)=1 day
    I(2)=7 days
    I(3)=16 days
    I(4)=35 days
    for i>4: I(i):=I(i-1)*2
    where:
    • I(i) is the interval used after the i-th repetition
  4. Copy all items forgotten after the 35th day interval into newly created pages (without removing them from previously used pages). Those new pages will be repeated in the same way as pages with items learned for the first time
Note, that inter-repetition intervals after the fifth repetition were assumed to increase twice in subsequent repetitions. This fact was based on an intuition rather than on experiment. In two years of using the Algorithm SM-0 sufficient data were collected to confirm a reasonable accuracy of this assumption.

To this day I hear some people use or even prefer the paper version of SuperMemo. Here is a description from 1992.

Note that the intuition that intervals should increase twice is as old as the theory of learning. In 1932, C. A. Mace hinted on the efficient learning methods in his book "The psychology of study". He mentioned "active rehearsal" and "repetitive revisions" that should be spaced in gradually increasing intervals, roughly "intervals of one day, two days, four days, eight days, and so on". This proposition was later taken on by other authors. Those included Paul Pimsleur and Tony Buzan who both proposed their own intuitions that involved very short intervals (in minutes) or "final repetition" (after a few months). All those ideas did not permeate well into the practice of study. Only a computer application made it possible to start learning effectively without studying the methodology.

That intuitive interval multiplication factor of 2 has also shown up in the context of studying the possibility of evolutionary optimization of memory in response to statistic properties of the environment: "Memory is optimized to meet probabilistic properties of the environment"

Despite all its simplicity, in my Master's Thesis, I did not hesitate to call my new method "revolutionary":

Archive warning: Why use literal archives?
This text was part of: "Optimization of learning" by Piotr Wozniak (1990)

Although the acquisition rate may not have seemed staggering, the Algorithm SM-0 was revolutionary in comparison to my previous methods because of two reasons:

  • with the lapse of time, knowledge retention increased instead of decreasing (as it was the case with intermittent learning)
  • in a long term perspective, the acquisition rate remained almost unchanged (with intermittent learning, the acquisition rate would decline substantially over time)

[...]

For the first time, I was able to reconcile high knowledge retention with infrequent repetitions that in consequence led to steadily increasing volume of knowledge remembered without the necessity to increase the timeload!

Retention of 80% was easily achieved, and could even be increased by shortening the inter-repetition intervals. This, however, would involve more frequent repetitions and, consequently, increase the timeload. The assumed repetition spacing provided a satisfactory compromise between retention and workload.

[...]

The next significant improvement of the Algorithm SM-0 was to come only in 1987 after the application of a computer to supervise the learning process. In the meantime, I accumulated about 7190 and 2817 items in my new English and biological databases respectively. With the estimated working time of 12 minutes a day for each database, the average knowledge acquisition rate amounted to 260 and 110 items/year/minute respectively, while knowledge retention amounted to 80% at worst.

Birth of SuperMemo from a decade's perspective

A decade after the birth of SuperMemo, it became pretty well-known in Poland. Here is the same story as retold by J. Kowalski, Enter in 1994:

It was 1982, when a 20-year-old student of molecular biology at Adam Mickiewicz University of Poznan, Piotr Wozniak, became quite frustrated with his inability to retain newly learned knowledge in his brain. This referred to the vast material of biochemistry, physiology, chemistry, and English, which one should master wishing to embark on a successful career in molecular biology. One of the major incentives to tackle the problem of forgetting in a more systematic way was a simple calculation made by Wozniak which showed him that by continuing his work on mastering English using his standard methods, he would need 120 years to acquire all the important vocabulary. This not only prompted Wozniak to work on methods of learning, but also, turned him into a determined advocate of the idea of one language for all people (bearing in mind the time and money spent by the mankind on translation and learning languages). Initially, Wozniak kept increasing piles of notes with facts and figures he would like to remember. It did not take long to discover that forgetting requires frequent repetitions and a systematic approach is needed to manage all the newly collected and memorized knowledge. Using an obvious intuition, Wozniak attempted to measure the retention of knowledge after different inter-repetition intervals, and in 1985 formulated the first outline of SuperMemo, which did not yet require a computer. By 1987, Wozniak, then a sophomore of computer science, was quite amazed with the effectiveness of his method and decided to implement it as a simple computer program. The effectiveness of the program appeared to go far beyond what he had expected. This triggered an exciting scientific exchange between Wozniak and his colleagues at Poznan University of Technology and Adam Mickiewicz University. A dozen of students at his department took on the role of guinea pigs and memorized thousands of items providing a constant flow of data and critical feedback. Dr Gorzelańczyk from Medical Academy was helpful in formulating the molecular model of memory formation and modeling the phenomena occurring in the synapse. Dr Makalowski from the Department of Biopolymer Biochemistry contributed to the analysis of evolutionary aspects of optimization of memory (NB: he was also the one who suggested registering SuperMemo for Software for Europe). Janusz Murakowski, MSc in physics, currently enrolled in a doctoral program at the University of Delaware, helped Wozniak solve mathematical problems related to the model of intermittent learning and simulation of ionic currents during the transmission of action potential in nerve cells. A dozen of forthcoming academic teachers, with Prof. Zbigniew Kierzkowski in forefront, helped Wozniak tailor his program of study to one goal: combining all aspects of SuperMemo in one cohesive theory that would encompass molecular, evolutionary, behavioral, psychological, and even societal aspects of SuperMemo. Wozniak who claims to have discovered at least several important and never-published properties of memory, intended to solidify his theories by getting a PhD in neuroscience in the US. Many hours of discussions with Krzysztof Biedalak, MSc in computer science, made them both choose another way: try to fulfill the vision of getting with SuperMemo to students around the world