SuperMemo 1.0 for DOS (1987)

This text is part of: "History of spaced repetition" by Piotr Wozniak (June 2018)

SuperMemo 1.0: day by day (1987)

The SuperMemo history page at supermemo.com says that "Wozniak wrote his first SuperMemo in 16 evenings". The reality was a bit more complex. Let me describe it in detail using the notes from the day.

Building up for the idea

I cannot figure out what I meant writing on Jul 3, 1987 that "I have an idea for a revolutionary program arranging my work and scientific experiments in SMTests" (SMTests stands for SuperMemo on paper here). A transition from paper to a computer seems like an obvious step. There must have been some mental obstacle on the way that required "thinking out of the box". Unfortunately, I did not write down details. Today it only matters in that it illustrates how excruciatingly slow a seemingly obvious idea may creep into the mind.

On Sep 8, 1987, my first PC arrived from Germany (Amstrad PC 1512). My enthusiasm was unmatched! I could not sleep. I worked all night. The first program I planned to write was to be used for mathematical approximations. SuperMemo was second in the pipeline.

Figure: Amstrad PC-1512 DD. My version had only one diskette drive. The MS-DOS operating system had to be loaded from one diskette, Turbo Pascal 3.0 from another, and SuperMemo from yet another. Until I got my first hard drive in 1991, my English collection had to be split into 3000-item portions. My 39,000 items had to be kept on 13 diskettes. I had many more for other areas of knowledge. On Jan 21, 1997, SuperMemo World tracked down that original PC and bought it back from its owner: Jarek Kantecki. The PC was fully functional for the whole decade. It is now buried somewhere in the dusty archives of the company. Perhaps we will publish its picture at some point. The presented picture comes from Wikipedia.

On Oct 16, 1987, Fri, in 12 hours, I wrote my first SuperMemo in GW-BASIC (719 minutes of non-stop programming). The program allowed adding new items and making simple repetitions. It was slow as a snail and buggy. I did not like it much. I did not start learning. Could this be the end of SuperMemo? Wrong choice of a programming language? Busy days at school kept me occupied with a million unimportant things. Typical school effect: learn nothing about everything. No time for creativity and your own learning. Luckily, I never stopped using SuperMemo on paper. The idea of SuperMemo could not have died. It had to be converted into a piece of software sooner or later.

On Nov 14, 1987, Sat, SuperMemo on paper got its first user: Mike Kubiak. He was very enthusiastic. The fire kept burning. On Nov 18, I learned about Turbo Pascal. It did not work on my computer. In those days, if you had the wrong graphics card, you might struggle. Instead of Hercules, I had a text-mode monochrome (black-and-white) CGA. I managed to solve the problem by editing programs in the RPED text editor rather than in the Turbo Pascal environment. Later, I got the right version of Turbo Pascal for my display card. Incidentally, old SuperMemos display in color. I was programming them in shades of gray, and never knew how they really looked in color mode.

Writing SuperMemo 1.0

SuperMemo 1.0 chronology:

Nov 21, 1987 was an important day. It was a Saturday. Days free from school are days of creativity. I hoped to get up at 9 am but I overslept by 72 minutes. This is bad for the plan, but this is usually great for the brain and the productivity. I started the day by learning with SuperMemo on paper (reviewing English, human biology, computer science, etc.). Later in the day, I read my Amstrad PC manual, learned about Pascal and Prolog, spent some time thinking about how the human cortex might work, did some exercise, and in the late evening, in a slightly tired state of mind, as an afterthought, decided to write SuperMemo for DOS. This would be my second attempt. However, this time I chose Turbo Pascal 3.0 and never regretted it. To this day, as a direct consequence, SuperMemo 19 code is written in Pascal (Delphi XE3). The name SuperMemo was proposed much later. In those days, I called my program: SMTOP for Super-Memorization Test Optimization Program. In 1988, Tomasz Kuehn insisted we call it CALOM for Computer-Aided Learning Optimization Method.
Nov 22, 1987 was a mirror copy of Nov 21. I concluded that I knew how the brain cortex works and that one day it would be nice to build a computer using similar principles (check Jeff Hawkins's work). The fact that I returned to programming SuperMemo in the late evening, i.e. a very bad time for creative work, seems to indicate that the passion had not kicked in yet.
Nov 23, 1987 looked identical. I am not sure why I did not have any school obligations on Monday, but this might have saved SuperMemo. On Nov 24, 1987, the excitement kicked in and I worked for 8 hours straight (in the evening again). The program had a simple menu and could add new items to the database.
Nov 25, 1987 was wasted: I had to go to school, I was tired and sleepy. We had excruciatingly boring classes in computer architecture, probably a decade behind the status quo in the west.
Nov 26 was free again and again I was able to catch up with SuperMemo work. The program grew to be 15,400 bytes "huge". I concluded the program might be "very usefull" (sic!).
On Nov 27, I added 3 more hours of work after school.
Nov 28 was a Saturday and I could add 12 enthusiastic hours of non-stop programming. SuperMemo now looked almost ready for use.
On Nov 29, Sunday, I voted for economic reforms and democratization in Poland. In the evening, I did not make much progress. I had to prepare an essay for my English class. The essay described the day in 1982 when I experimented with alcohol. I was a teetotaller, but as a biologist, I concluded I needed to know how alcohol affects the mind.
Nov 30 was wasted at school, but we had a nice walk home with Biedalak. We had a long conversation in English about our future. That future was mostly to be about science, probably in the US.
Dec 1-4 were wasted at school again. No time for programming. In a conversation with some Russian professor, I realized that I had completely forgotten Russian in just 6 years. I used to be proudly fluent! I had to channel my programming time into some boring software for designing electronic circuits. I had to do it to get credit for a class in electronics. I had a deal with the teacher that I would not attend lectures, just write this piece of software. I did not learn anything and to this day I mourn the waste of time. Had I been free, I could have invested this energy in SuperMemo.
Dec 5 was a Saturday. Free from school. Hurray! However, I had to start by wasting 4 hours on some "keycode procedure". In those days, even decoding the key pressed might become a challenge. And then another hour was wasted on changing some screen attributes. In addition, I spent 6 hours writing an "item editor". This way, I could conveniently edit items in SuperMemo. The effortless things you take for granted today (cursor left, cursor right, delete, up, new line, etc.) needed a day of programming back then.
Dec 6 was a lovely Sunday. I spent 7 hours debugging SuperMemo, adding "final drill", etc. The excitement kept growing. In a week, I might start using my new breakthrough speed-learning software.
On Monday, Dec 7, after school, I added a procedure for deleting items.
On Dec 8, while Reagan and Gorbachev signed their nuclear deal, I added a procedure for searching items and displaying some item statistics. SuperMemo "bloated" to 43,800 bytes.
Dec 9 was marred by school and programming for the electronics class.
On Dec 10, I celebrated power cuts at school. Instead of boring classes, I could do some extra programming.
On Dec 11, we had a lovely lecture with one of the biggest brains at the university: Prof. Jan Weglarz. He insisted that he could do more in Poland than abroad. This was a powerful message. However, as of 2018, his Wikipedia entry says that his two-phase method discovery was ignored, and later duplicated in the west because he opted for publishing in Polish. Weglarz did indeed create a formidable team of the best operations research brains in Poznan. If I had not swayed in the direction of SuperMemo, I would surely have come with a begging hat to look for an employment opportunity. In the evening, I added a procedure for inspecting the number of items to review each day. This is today's Calendar (or Workload in older versions).

First repetitions in SuperMemo

Dec 12, 1987 was a Saturday. I expanded SuperMemo by a pending queue editor, and seemed ready to start learning. However, … on Dec 13, I was hit by a bombshell: "Out of memory". Turbo Pascal refused to compile my program because it had grown too large. In those days, memory in DOS was split into 64 KB segments and I was probably limited to using just a single segment. I somehow managed to fix the problem by optimizing the code.

The last option I needed to add to SuperMemo was for the program to read the date. Reading the date was a big-deal hack in those days. Without it, I would need to type in the current date every day at the start of the work with SuperMemo.

Finally, at long last, in the afternoon, on Dec 13, 1987, I was able to add my first items to my human biology collection: questions about the autonomic nervous system. Just as July 31, 1985 could be considered the birthday of spaced repetition, Dec 13, 1987 could be named the birthday of spaced repetition software.

Spaced repetition software was born on Dec 13, 1987

By Dec 23, 1987, my combined paper and computer databases included 3795 questions on human biology (of which almost 10% were already stored in my new SuperMemo program database). Sadly, I had to remove the possibility of storing full repetition histories from SuperMemo on that day. There wasn't enough space on 360K diskettes. Spaced repetition research would need to wait a few more years. Only in 1996 did SuperMemo resume collecting the full record of repetitions.

Figure: SuperMemo 1.0 for DOS (1987) was the first computer application of spaced repetition. It introduced Algorithm SM-2, which remains popular three decades later.

Algorithm SM-2

Here is the description of the algorithm used in SuperMemo 1.0 for DOS (1987). The description was taken from my Master's Thesis written 2.5 years later (1990). SuperMemo 1.0 was soon replaced by a better-looking SuperMemo 2.0 that I would give away to friends at the university. There were insignificant updates to the repetition spacing algorithm that was named Algorithm SM-2 after the version of SuperMemo. This means there has never been Algorithm SM-1.

I mastered 1000 questions in biology in the first 8 months of use. Even better, I memorized exactly 10,000 items of English word pairs in the first 365 days. I worked an average of 40 minutes per day. This speed of learning was used as a benchmark in advertising SuperMemo in its first commercial days. Even today, 40 minutes is the daily investment recommended to master Advanced English in 4 years (40,000+ items).

Algorithm SM-2 remains popular and is still used by applications such as Anki, Mnemosyne, and more.


This text was part of: "Optimization of learning" by Piotr Wozniak (1990)

3.2. Application of a computer to improve the results obtained in working with the SuperMemo method

I wrote the first SuperMemo program in December 1987 (Turbo Pascal 3.0, IBM PC). It was intended to enhance the SuperMemo method in two basic ways:

apply the optimization procedures to smallest possible items (in the paper-based SuperMemo items were grouped in pages),
differentiate between the items on the basis of their different difficulty.

Having observed that subsequent inter-repetition intervals are increasing by an approximately constant factor (e.g. two in the case of the SM-0 algorithm for English vocabulary), I decided to apply the following formula to calculate inter-repetition intervals:

I(1):=1
I(2):=6
for n>2 I(n):=I(n-1)*EF
where:

I(n) - inter-repetition interval after the n-th repetition (in days)

EF - easiness factor reflecting the easiness of memorizing and retaining a given item in memory (later called the E-Factor).
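
The interval formula above can be illustrated with a short Python sketch. This is not code from the original program; the function name `interval_schedule` is hypothetical, the E-Factor is held constant for simplicity, and the rounding of fractional intervals (introduced only in the final form of the algorithm) is deliberately left out.

```python
def interval_schedule(ef, repetitions):
    """Return [I(1), ..., I(n)] in days for a constant E-Factor `ef`.

    Illustrative only: in real use the E-Factor changes between repetitions.
    """
    intervals = []
    for n in range(1, repetitions + 1):
        if n == 1:
            intervals.append(1)                    # I(1) := 1
        elif n == 2:
            intervals.append(6)                    # I(2) := 6
        else:
            intervals.append(intervals[-1] * ef)   # I(n) := I(n-1) * EF
    return intervals


print(interval_schedule(2.5, 5))  # [1, 6, 15.0, 37.5, 93.75]
```

With the starting E-Factor of 2.5, the fifth interval already exceeds three months, which shows how quickly the spacing grows for easy items.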

E-Factors were allowed to vary between 1.1 for the most difficult items and 2.5 for the easiest ones. At the moment of introducing an item into a SuperMemo database, its E-Factor was assumed to equal 2.5. In the course of repetitions, this value was gradually decreased in case of recall problems. Thus the greater problems an item caused in recall the more significant was the decrease of its E-Factor.

Shortly after the first SuperMemo program had been implemented, I noticed that E-Factors should not fall below the value of 1.3. Items having E-Factors lower than 1.3 were repeated annoyingly often and always seemed to have inherent flaws in their formulation (usually they did not conform to the minimum information principle). Thus not letting E-Factors fall below 1.3 substantially improved the throughput of the process and provided an indicator of items that should be reformulated. The formula used in calculating new E-Factors for items was constructed heuristically and did not change much in the following 3.5 years of using the computer-based SuperMemo method.

In order to calculate the new value of an E-Factor, the student has to assess the quality of his response to the question asked during the repetition of an item (my SuperMemo programs use the 0-5 grade scale - the range determined by the ergonomics of using the numeric key-pad). The general form of the formula used was:

EF':=f(EF,q)
where:

EF' - new value of the E-Factor

EF - old value of the E-Factor

q - quality of the response

f - function used in calculating EF'.

The function f initially had a multiplicative character and, in later versions of the SuperMemo program, when the interpretation of E-Factors changed substantially, was converted into an additive one without significant alteration of the dependencies between EF', EF and q. To simplify further considerations, only the function f in its latest shape is taken into account:

EF':=EF-0.8+0.28*q-0.02*q*q

which is a reduced form of:

EF':=EF+(0.1-(5-q)*(0.08+(5-q)*0.02))

Note that for q=4 the E-Factor does not change.
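
As a quick check of the two forms of the formula, the following Python sketch evaluates the change EF'-EF for every grade. The function names are hypothetical and chosen only for this illustration. In exact arithmetic the change is +0.10 for q=5, 0 for q=4, -0.14 for q=3, -0.32 for q=2, -0.54 for q=1 and -0.80 for q=0.

```python
import math


def ef_delta_reduced(q):
    """EF' - EF according to the reduced form."""
    return -0.8 + 0.28 * q - 0.02 * q * q


def ef_delta_factored(q):
    """EF' - EF according to the factored form."""
    return 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)


for q in range(5, -1, -1):
    reduced = ef_delta_reduced(q)
    factored = ef_delta_factored(q)
    # Compare with a tolerance: the two forms differ only by float rounding.
    assert math.isclose(reduced, factored, abs_tol=1e-12)
    print(f"q={q}: change in EF = {factored:+.2f}")
```

Only a perfect response (q=5) raises the E-Factor, while each lower grade below 4 reduces it by a progressively larger amount.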

Let us now consider the final form of the SM-2 algorithm that with minor changes was used in the SuperMemo programs, versions 1.0-3.0 between December 13, 1987 and March 9, 1989 (the name SM-2 was chosen because of the fact that SuperMemo 2.0 was by far the most popular version implementing this algorithm).

Algorithm SM-2 used in the computer-based variant of the SuperMemo method and involving the calculation of easiness factors for particular items:

1. Split the knowledge into smallest possible items.
2. With all items associate an E-Factor equal to 2.5.
3. Repeat items using the following intervals:
   I(1):=1
   I(2):=6
   for n>2: I(n):=I(n-1)*EF
   where:
   - I(n) - inter-repetition interval after the n-th repetition (in days),
   - EF - E-Factor of a given item.
   If interval is a fraction, round it up to the nearest integer.
4. After each repetition assess the quality of repetition response in 0-5 grade scale:
   5 - perfect response
   4 - correct response after a hesitation
   3 - correct response recalled with serious difficulty
   2 - incorrect response; where the correct one seemed easy to recall
   1 - incorrect response; the correct one remembered
   0 - complete blackout.
5. After each repetition, before computing the new interval, modify the E-Factor of the recently repeated item according to the formula:
   EF':=EF+(0.1-(5-q)*(0.08+(5-q)*0.02))
   where:
   - EF' - new value of the E-Factor,
   - EF - old value of the E-Factor,
   - q - quality of the response in the 0-5 grade scale.
   If EF is less than 1.3 then let EF be 1.3.
6. If the quality of the response was lower than 3 then start repetitions for the item from the beginning without changing the E-Factor (i.e. use intervals I(1), I(2) etc. as if the item was memorized anew).
7. After each repetition session of a given day repeat again all items that scored below four in the quality assessment. Continue the repetitions until all of these items score at least four.
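
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original Turbo Pascal code: the `Item` class and `review` function are hypothetical names, the rounded interval is assumed to be the one fed into the next multiplication, a grade below 3 is read as leaving the E-Factor untouched, and the same-day re-drill of items graded below four is not modeled.

```python
import math
from dataclasses import dataclass

MIN_EF = 1.3  # floor for E-Factors given in the text


@dataclass
class Item:
    ef: float = 2.5      # every new item starts with an E-Factor of 2.5
    repetition: int = 0  # index n of the last repetition in the current run
    interval: int = 0    # days until the next repetition


def review(item: Item, q: int) -> Item:
    """Update `item` after a repetition graded q (0-5)."""
    if not 0 <= q <= 5:
        raise ValueError("quality must be in the range 0-5")
    if q < 3:
        # Assumption: a failed item restarts at I(1) with its E-Factor unchanged.
        item.repetition = 1
        item.interval = 1
        return item
    # The E-Factor is modified before the new interval is computed.
    delta = 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)
    item.ef = max(MIN_EF, item.ef + delta)
    item.repetition += 1
    if item.repetition == 1:
        item.interval = 1
    elif item.repetition == 2:
        item.interval = 6
    else:
        # Round up fractional intervals; the inner round() guards against
        # float noise pushing a whole-number product over the next integer.
        item.interval = math.ceil(round(item.interval * item.ef, 6))
    return item


item = Item()
for q in (5, 5, 5):
    review(item, q)
    print(item.repetition, item.interval, round(item.ef, 2))
# Three perfect answers in a row yield intervals of 1, 6 and 17 days.
```

Other readings of the text are possible, for example multiplying the unrounded interval, so the exact intervals produced by a given implementation may differ slightly from this sketch.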

The optimization procedure used in finding E-Factors proved to be very effective. In SuperMemo programs you will always find an option for displaying the distribution of E-Factors (later called the E-Distribution). The shape of the E-Distribution in a given database was roughly established within a few months of the outset of repetitions. This means that E-Factors did not change significantly after that period and it is safe to presume that E-Factors correspond roughly to the real factor by which the inter-repetition intervals should increase in successive repetitions.

During the first year of using the SM-2 algorithm (learning English vocabulary), I memorized 10,255 items. The time required for creating the database and for repetitions amounted to 41 minutes per day. This corresponds to the acquisition rate of 270 items/year/min. The overall retention was 89.3%, but after excluding the recently memorized items (intervals below 3 weeks) which do not exhibit properly determined E-Factors the retention amounted to 92%. Comparing the SM-0 and SM-2 algorithms one must consider the fact that in the former case the retention was artificially high because of hints the student is given while repeating items of a given page. Items preceding the one in question can easily suggest the correct answer.

Therefore Algorithm SM-2, though not stunning in terms of quantitative comparisons, marked the second major improvement of the SuperMemo method after the introduction of the concept of optimal intervals back in 1985. Separating items previously grouped in pages and introducing E-Factors were the two major components of the improved algorithm. Constructed by means of the trial-and-error approach, the SM-2 algorithm proved in practice the correctness of nearly all basic assumptions that led to its conception.