Forgetting curve


The forgetting curve describes the decline in the probability of recall over time (source: Wozniak, Gorzelanczyk, Murakowski, 1995):
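Assuming the exponential form commonly given in the SuperMemo literature, the relationship can be written as below. The symbol S (memory stability, in the same units as t) is an assumption made for this sketch; R (retrievability) and t (time) are used as in the rest of this text.

```latex
R = e^{-t/S}
```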




In a mass of remembered details, the shape of the forgetting curve will depend on (1) memory complexity (i.e. how difficult it is to uniformly retrieve individual details from memory), and (2) memory stability (i.e. how well individual details have been established in memory). For example, a set of easy French words, memorized on the same day, may align into a curve that meets the above formula. Those French words will have low complexity (because they are easy), and low stability (because they have just been learned). Those French words will be lost to memory, one by one, with equal probability over time. The chance of recalling a given word will be R (retrievability) after time t. With time going to infinity, recall will approach zero. However, if all words are reviewed again, their stability will increase and retention time will be extended. This is used in spaced repetition to minimize the cost of an indefinite retention of memories.
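The French-word example can be sketched numerically. The snippet below is only an illustration: it assumes that every remembered word is lost with the same fixed probability each day, and it assumes the exponential form R = exp(-t/S), where S (stability) is a symbol introduced here. The value of `P_LOSS` and all function names are hypothetical.

```python
import math

def expected_recall(days, daily_loss_probability):
    """Expected fraction of words still recalled after `days` days,
    if each remembered word is lost with the same probability every day."""
    return (1.0 - daily_loss_probability) ** days

def implied_stability(daily_loss_probability):
    """Stability S (in days) for which exp(-t / S) reproduces expected_recall."""
    return -1.0 / math.log(1.0 - daily_loss_probability)

def retrievability(days, stability):
    """Assumed exponential forgetting curve: R = exp(-t / S)."""
    return math.exp(-days / stability)

P_LOSS = 0.05                  # hypothetical daily chance of losing a word
S = implied_stability(P_LOSS)  # stability implied by that chance

for day in (0, 1, 7, 30, 90):
    by_counting = expected_recall(day, P_LOSS)
    by_formula = retrievability(day, S)
    print(f"day {day:3d}: counting={by_counting:.4f}  formula={by_formula:.4f}")
```

Under these assumptions, losing words one by one at a constant rate and the exponential curve are the same thing. A larger stability (as after a review) gives a higher `retrievability` for the same elapsed time; how much a review increases stability is not modeled here.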

Power or Exponential?

Forgetting is exponential. However, the superposition of forgetting rates for different stabilities will make forgetting follow the power law. In other words, when memories of different complexity are mixed, the forgetting curve will change its shape, and may be better approximated with a negative power function (as originally discovered by Hermann Ebbinghaus in 1885). Plotting a single forgetting curve for memories of different stability is of less interest. It can be compared to establishing a single expiration date for products of different shelf life produced at different times. Power approximations also face a problem at the point t=0, where a negative power function is undefined. On the other hand, exponential forgetting may seem devastating in its power. Luckily, for well-formulated material, decay constants are very low due to high memory stabilities developed after just a few reviews.
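The effect of superposition can be illustrated with a small sketch. It assumes that each memory follows R = exp(-t/S) and that the mixed curve is a plain average over memories; the stability values and function names are hypothetical. For a single exponential the apparent decay constant is the same at every point of the curve, while for a mixture it falls with time, which is the behavior a negative power function also shows.

```python
import math

def retrievability(t, stability):
    """Assumed exponential forgetting for one memory: R = exp(-t / S)."""
    return math.exp(-t / stability)

def mixed_recall(t, stabilities):
    """Average recall over memories with different stabilities."""
    return sum(retrievability(t, s) for s in stabilities) / len(stabilities)

def apparent_decay(t1, t2, stabilities):
    """Decay constant an observer would infer between t1 and t2,
    i.e. the slope of -ln(R) over that interval."""
    r1 = mixed_recall(t1, stabilities)
    r2 = mixed_recall(t2, stabilities)
    return -(math.log(r2) - math.log(r1)) / (t2 - t1)

MIXED = [2.0, 10.0, 50.0, 250.0]  # hypothetical stabilities in days
UNIFORM = [10.0]                  # a single stability for comparison

for name, group in (("uniform", UNIFORM), ("mixed", MIXED)):
    early = apparent_decay(1, 2, group)
    late = apparent_decay(100, 200, group)
    print(f"{name:8s} early decay={early:.4f}  late decay={late:.4f}")
```

In the mixed group, the least stable memories are forgotten first, so the surviving population is increasingly dominated by stable memories and the curve flattens.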

Forgetting is exponential due to the random nature of memory interference

For more see: Exponential nature of forgetting


Spaced repetition software SuperMemo routinely collects data and displays a set of forgetting curves that depend on memory stability and knowledge complexity. Each user and each knowledge collection are assigned a set of 400 forgetting curves for different combinations of stability and complexity levels. In addition, newer versions of SuperMemo keep 400 curves in which time is expressed by a memory retrievability estimate. This rich dataset helps SuperMemo keep an accurate model of each student's memory.

Examples of curves collected with SuperMemo:

This text is a part of a series of articles about SuperMemo, a pioneer of spaced repetition software since 1987


Forgetting curve collected with SuperMemo 17

Figure: The first forgetting curve for newly learned knowledge collected with SuperMemo. A power approximation is used in this case due to the heterogeneity of the learning material freshly introduced into the learning process. Lack of separation by memory complexity results in a superposition of exponential forgetting curves with different decay constants. On a semi-log graph, the power regression curve is logarithmic (in yellow) and appears almost straight. The curve shows that, in the presented case, recall drops merely to 58% in four years, which can be explained by a high reuse of memorized knowledge in real life. The first optimum interval for review at retrievability of 90% is 3.96 days. The forgetting curve can be described with the formula R=0.9906*power(interval,-0.07), where 0.9906 is the recall after one day, while -0.07 is the decay constant. In this case, the formula yields 90% recall after 4 days. 80,399 repetition cases were used to plot the presented graph. A steeper drop in recall will occur if the material contains a higher proportion of difficult knowledge (esp. poorly formulated knowledge), or in new students with lesser mnemonic skills. Curve irregularity at intervals 15-20 comes from a smaller sample of repetitions (later interval categories on a log scale encompass a wider range of intervals).
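The power formula from the caption can be checked with a few lines of code. This is a minimal sketch using the rounded constants quoted above (0.9906 and -0.07); the function names are hypothetical, and because the constants are rounded, the computed optimum interval only approximates the 3.96 days reported for the graph.

```python
RECALL_AFTER_ONE_DAY = 0.9906  # from the caption
DECAY = -0.07                  # from the caption (rounded)

def power_recall(interval_days):
    """R = 0.9906 * interval ** -0.07, valid for interval > 0."""
    return RECALL_AFTER_ONE_DAY * interval_days ** DECAY

def interval_for_recall(target_recall):
    """Invert the power formula: the interval at which recall hits the target."""
    return (target_recall / RECALL_AFTER_ONE_DAY) ** (1.0 / DECAY)

print(f"recall after 1 day:  {power_recall(1):.4f}")
print(f"recall after 4 days: {power_recall(4):.4f}")
print(f"interval for R=0.90: {interval_for_recall(0.90):.2f} days")
```

Note that `power_recall` is not defined for an interval of zero, which is the t=0 problem of power approximations.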

Exponential forgetting curve collected with SuperMemo 17

Figure: Exponential forgetting curve for SuperMemo learning material after the equivalent of the 6th optimum interval review. 14,550 repetitions have been used to plot the presented graph.

Cumulative forgetting curve collected with SuperMemo 17

Figure: Cumulative forgetting curve for learning material of mixed complexity, and mixed stability. The graph is obtained by superposition of 400 forgetting curves normalized for the decay constant of 0.003567, which corresponds with recall of 70% at 100% of the presented time span (i.e. R=70% on the right edge of the graph). 401,828 repetition cases have been included in the graph. Individual curves are represented by yellow data points. Cumulative curve is represented by blue data points that show the average recall for all 400 curves. The size of circles corresponds with the size of data samples.
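The relationship between the decay constant of 0.003567 and 70% recall at the right edge of the graph can be verified directly, assuming the exponential form R = exp(-d*t) with t expressed as a percentage of the time span. The rescaling function below is a hypothetical illustration of how a curve with its own decay constant could be mapped onto the shared axis; it is not taken from SuperMemo.

```python
import math

def decay_constant(recall_at_edge, span_percent=100.0):
    """Decay constant d for which exp(-d * span_percent) equals recall_at_edge."""
    return -math.log(recall_at_edge) / span_percent

def recall_at(time_percent, decay):
    """Assumed exponential curve on the normalized axis: R = exp(-d * t)."""
    return math.exp(-decay * time_percent)

def normalized_time(days, own_decay, target_decay):
    """Hypothetical rescaling: map a curve's own time axis (in days) onto the
    shared axis so that recall is unchanged under the target decay constant."""
    return days * own_decay / target_decay

d = decay_constant(0.70)
print(f"decay constant for R=70% at 100%: {d:.6f}")
print(f"recall at 50% of the span:        {recall_at(50, d):.4f}")
```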

A forgetting curve from a preschooler's SuperMemo collection

Figure: A forgetting curve from a preschooler's SuperMemo collection. The absence of forgetting indicates the absence of intentional declarative learning. The decay constant is nearly zero, which makes the optimum interval meaningless. 1706 repetition cases have been recorded. This flat forgetting curve would go unnoticed in older versions of SuperMemo due to the adult-centric assumption that on Day=0, retrievability is 100%. Over time, this forgetting curve will lean down to produce a graph typical of adult learning. This process may take a few years and should not be artificially accelerated, e.g. by means of coercion. This curve is a hypothetical expression of the semantic brain.

Figure: Good educators know that you cannot motivate a child extrinsically. Poor intrinsic motivation and poor mnemonic skills make SuperMemo unsuitable for children. In the presented case, a forgetting curve shows catastrophically poor performance in a 7-year-old child coerced to learn the vocabulary of a foreign language. This is a classic case of asemantic learning. The curve bears no relevance to a child's IQ. At this age, some kids may already show some success, as long as the use of SuperMemo is entirely voluntary.