Linking Skinner’s Teaching Machines to Modern Adaptive Instructional Design

1/30/2026

From Teaching Machines to Adaptive Learning Systems

B. F. Skinner’s teaching machines of the 1950s were early prototypes of adaptive learning, grounded in behaviorist principles.

Skinner’s machines presented material in small, carefully sequenced steps with immediate feedback and reinforcement for each correct response.

By controlling the rate of content and requiring a correct answer before advancing, these devices ensured a high success rate.

Skinner’s approach aimed to “arrange contingencies of reinforcement” so that complex skills were built gradually from simple components.

This shaping of behavior through successive approximations meant difficulty increased only as the learner demonstrated readiness, an early form of adaptive difficulty.

Skinner argued that immediate positive feedback and error-minimizing sequences would keep learners engaged and on track (Skinner, 1958).

Notably, Skinner’s programmed instruction was linear – the same sequence for all learners – but it was personalized in pace: each student could proceed as quickly or slowly as needed.

The goal was mastery of each step, reflecting the behaviorist belief that strong stimulus–response associations (evidenced by correct answers) form the basis of learning.

Other pioneers introduced more explicitly adaptive features.

Norman Crowder developed intrinsic branching programs in the late 1950s that adjusted to the learner’s responses.

In Crowder’s system, each frame of instruction was a multiple-choice question.

If the student answered correctly, the program branched forward; if the answer was incorrect, the program detoured the student to remedial frames targeting the specific error.

In this way, Crowder’s “intrinsic programming” dynamically altered the path through material for each learner, providing immediate reteaching or review following an error.

As Crowder explained, learners come in with different prior knowledge and make different mistakes, so instruction should be “completely flexible” and adapt to those needs.

This was a stark departure from Skinner’s one-size-fits-all sequence – so much so that people began distinguishing “Skinnerian (linear) programs” versus “Crowderian (branching) programs” in programmed instruction (Crowder, 1960; Skinner, 1968, as discussed in McDonald, 2003).

Crowder’s adaptive strategy – essentially a “smart regression” logic – prefigures modern adaptive tutoring: on a wrong answer, the learner is immediately given an easier explanation or practice item, and only when they demonstrate understanding do they return to the main curriculum.
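To make this branching logic concrete, here is a minimal sketch in Python; the frame names and content are hypothetical, invented for illustration rather than drawn from Crowder’s actual materials.

```python
# A minimal sketch of Crowder-style intrinsic branching.
# Frame names and content are hypothetical, for illustration only.

FRAMES = {
    "main_1": {
        "question": "2 + 3 x 4 = ?",
        "options": {"14": "main_2",          # correct: branch forward
                    "20": "remedial_order",  # wrong: detour to remediation
                    "24": "remedial_order"},
    },
    "remedial_order": {
        "question": "Which operation is performed first in 2 + 3 x 4?",
        "options": {"multiplication": "main_1",    # re-attempt the main frame
                    "addition": "remedial_order"}, # loop until understood
    },
    "main_2": {"question": "Next topic...", "options": {}},
}

def next_frame(current: str, answer: str) -> str:
    """Return the next frame based on the learner's response."""
    return FRAMES[current]["options"].get(answer, current)

# Example: a wrong answer detours to remediation, then back to the main path.
print(next_frame("main_1", "20"))                      # -> remedial_order
print(next_frame("remedial_order", "multiplication"))  # -> main_1
```

The key design point, as in Crowder’s programs, is that each wrong option carries its own diagnosis: the detour targets the specific error rather than simply repeating the original frame.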

Around the same time, Gordon Pask built one of the first truly adaptive teaching machines. Pask’s Self-Adaptive Keyboard Instructor (SAKI), patented in 1956, taught typing skills using a cybernetic feedback system (Watters, 2018).

Unlike earlier machines that merely presented fixed frames or branched paths, Pask’s device continuously measured the learner’s performance (both accuracy and response latency) and adjusted the difficulty of tasks in real time.

As Pask described, “the difficulty of the questions are not pre-programmed or pre-ordained” – instead, the machine builds a profile of what the individual learner finds easy or hard, and adapts accordingly.

For example, if a learner struggled with a particular key or problem type, SAKI would present that item more frequently and with more support (e.g. slower pace or added hints).

Once the learner improved, the machine would fade out the help and increase the pace or complexity.

In essence, Pask’s system implemented a form of real-time difficulty adjustment very similar to modern adaptive learning algorithms.

The logic – also evident in later computer-based tutors – was to keep the learner in an optimal zone of challenge: not bored by tasks that are too easy, but not overwhelmed by those too hard.

In Cybernetics and Management (1959), Stafford Beer marveled that Pask’s adaptive trainer could “constantly adjust all the variables to reach a desired goal… In short, you are being conditioned”.

This early use of performance-driven adaptation directly anticipates features of today’s intelligent tutoring systems and adaptive learning software.

Adaptive Difficulty and Performance Thresholds

A core idea carried from mid-century programmed instruction into modern adaptive design is the use of performance thresholds to guide difficulty and advancement.

In behaviorist instructional design, mastery criteria are typically set to ensure a learner has adequately learned a unit before moving on.

Historically, a common rule of thumb was requiring about 80% correct on practice or a test to signal mastery (or fluency) of a skill.

This 80% criterion became pervasive in both programmed instruction and applied behavior analysis – for example, many skills programs or IEP objectives might say a student must demonstrate “80% accuracy” across several trials or days to be considered mastered (Leaf & McEachin, 1999, etc.).

The “4 out of 5 trials at 80%” type mastery criterion is still prevalent.

The choice of 80% was somewhat conventional, balancing a high success rate with a tolerance for the occasional mistake.

In behavior-analytic terms, one might consider a skill in the acquisition phase when performance is around 60–70% (some learning but not consistent), moving into fluency as it approaches 80%+ accuracy (more solid, quick responding), and nearing generalization or full mastery as accuracy climbs to ~90% or above.

These thresholds (roughly 60% = acquisition, 80% = fluency/mastery, 90% = generalization) are not hard rules but have been used as guidelines in educational practice.

They align with the instructional hierarchy proposed by Haring et al. (1978), in which a learner’s stage can be inferred from performance: during acquisition, errors are common and accuracy is improving; during fluency, errors decrease and speed increases; and by generalization/maintenance, the skill is performed with high accuracy across contexts.
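As a rough illustration, the guideline cutoffs just described could be expressed as a simple lookup (again, these are conventional guidelines, not hard rules):

```python
def learning_stage(accuracy: float) -> str:
    """Map a recent accuracy score (0.0-1.0) to an instructional-hierarchy stage.
    Thresholds follow the conventional ~60/80/90% guidelines, not fixed rules."""
    if accuracy >= 0.90:
        return "generalization"   # high accuracy: probe transfer to new contexts
    if accuracy >= 0.80:
        return "fluency"          # solid accuracy: build speed and consistency
    if accuracy >= 0.60:
        return "acquisition"      # learning, but not yet consistent
    return "pre-acquisition"      # needs more instruction and prompting

print(learning_stage(0.85))  # -> fluency
```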

Importantly, research in applied behavior analysis has examined the effects of different mastery criteria on long-term learning. Single-subject design studies (common in behavior analysis) have compared outcomes when mastery is set at various accuracy levels.

For example, Fuller and Fienup (2018) taught individuals new skills to mastery criteria of 50%, 80%, or 90% correct, and then measured retention of those skills weeks later.

The results were striking: only the skills mastered at the 90% criterion were consistently maintained over time, whereas those mastered at 50% or 80% tended to be forgotten or performed poorly later.

In a more granular study, Richling et al. (2019) used criteria of 60%, 70%, 80%, 90%, and 100% (with performance required across three consecutive sessions) to teach children with disabilities; they found that only the 100% mastery condition led to strong maintenance for all learners.

Conditions with 60–80% criteria often saw the skill deteriorate after teaching ended.

Two other studies (Longino et al., 2021; Pitts & Hoerger, 2021) reported that 90% and 100% criteria yielded better maintenance than more lenient criteria.

These findings echo early mastery learning research with adults.

In the 1970s, Johnston and O’Neill (1973) and Semb (1974), working with college students, similarly found that more stringent criteria (e.g., requiring near-perfect scores on module tests in a Personalized System of Instruction course) produced more durable learning than lower criteria.

In other words, a student who only learned something to “70% correct” might pass in the short term, but is less likely to retain or generalize that knowledge compared to a student pushed to 90–100% accuracy.

This body of evidence has encouraged instructors and instructional designers to set high proficiency bars (in clinical ABA practice, often mastery is now set at 90% or even 100% for certain critical skills).

It also illustrates the behavioral principle that consistency and fluency in performance (not just moderate success) are needed for true mastery.

Thus, modern adaptive systems often hold learners at a level until they achieve a high accuracy criterion, rather than moving on too quickly.

Many adaptive learning platforms explicitly incorporate mastery-based progression: students must demonstrate, say, 80–90% correct on a topic (sometimes across multiple attempts or days) before new, harder material is introduced. This ensures the foundation is solid – a direct legacy of behaviorist mastery learning strategies.
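A minimal sketch of such a mastery gate, assuming a simple list of per-session accuracy scores (the 90% criterion and three-consecutive-session requirement echo Richling et al., 2019, and are adjustable):

```python
def mastered(session_scores: list[float],
             criterion: float = 0.90,
             consecutive: int = 3) -> bool:
    """Return True if the last `consecutive` sessions all meet the criterion."""
    if len(session_scores) < consecutive:
        return False
    return all(score >= criterion for score in session_scores[-consecutive:])

history = [0.70, 0.85, 0.92, 0.95, 0.90]
print(mastered(history))  # -> True: the last three sessions are all >= 90%
```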

At the same time, researchers have long been interested in the optimal difficulty during practice – how challenging should tasks be to maximize the learning rate? Intriguingly, a recent line of cognitive and computational research suggests that there is a “sweet spot” around an 85% success rate (a 15% error rate) during training for maximal learning efficiency.

Wilson et al. (2019) dubbed this the “85% rule,” based on models and experiments showing that if a learner is too accurate (>95%), the material might be too easy (not enough challenge to stimulate improvement), whereas if accuracy is too low (<70%), the material may be too difficult, causing frustration or inefficient learning.

In other words, people (and even machine learning algorithms) learn fastest when they are getting things right about 80–90% of the time – a level that keeps them engaged but still making some mistakes to learn from.

This resonates with the intuitions of both teachers and behaviorists.

The Goldilocks principle of difficulty – not too easy, not too hard – underlies techniques like shaping and fading in animal training (gradually increasing task difficulty) and the common practice of raising a student’s level once they’ve achieved roughly “B” level proficiency on the current tasks.

It’s also seen in video games that automatically increase difficulty when a player succeeds or drop to an easier mode after failures, and in adaptive testing algorithms that select harder questions as the examinee answers correctly.

Skinner’s own preference was actually to minimize errors completely through careful programming (he even explored “errorless learning” approaches), but later work by others acknowledged that a modest error rate can indicate productive struggle.

Modern adaptive systems often strive to keep the learner in that productive zone by real-time difficulty adjustment.

For instance, an adaptive quiz engine might start presenting harder questions as the learner answers correctly, until the error rate starts to rise, thereby hovering around an 80–85% success rate.

Conversely, if the learner falters – e.g. getting 2 of the last 3 items wrong – the system may automatically select easier items or review material (a rule of thumb in some tutoring systems akin to Crowder’s old branching logic).

This kind of dynamic difficulty adjustment can be implemented with simple heuristics or more sophisticated algorithms, but the concept is directly traceable to early programmed instruction and behaviorist prescriptions to “aim for a high but not perfect success rate” during practice.
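As a minimal sketch (assuming a 1–5 difficulty scale and illustrative thresholds), such a heuristic might step difficulty up after a run of successes and step it down when 2 of the last 3 responses are wrong:

```python
from collections import deque

class DifficultyAdjuster:
    """Simple dynamic difficulty adjustment over a recent-response window."""

    def __init__(self, level: int = 1, min_level: int = 1, max_level: int = 5):
        self.level = level
        self.min_level, self.max_level = min_level, max_level
        self.recent = deque(maxlen=3)  # sliding window of the last 3 responses

    def record(self, correct: bool) -> int:
        self.recent.append(correct)
        wrong = self.recent.count(False)
        if wrong >= 2:                              # 2 of last 3 wrong: ease off
            self.level = max(self.min_level, self.level - 1)
            self.recent.clear()
        elif len(self.recent) == 3 and wrong == 0:  # 3 in a row right: step up
            self.level = min(self.max_level, self.level + 1)
            self.recent.clear()
        return self.level

adjuster = DifficultyAdjuster()
for response in [True, True, True, False, False]:
    level = adjuster.record(response)
print(level)  # rose after three successes, then dropped after two misses
```

In production systems, the same logic might operate on item difficulty estimates from an item bank rather than a flat 1–5 scale, but the contingency is the same.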

A concrete example of smart regression/progression logic is found in Pask’s 1950s SAKI machine described earlier. SAKI’s algorithm would increase problem difficulty or decrease assistance as the learner performed well, but if the learner started making errors, the machine would immediately adjust by reverting to easier prompts: “the number ‘5’ is eluding you… ‘5’ is put before you with renewed deliberation, slowly, and the red light [hint] comes back on brightly”.

Only once the learner regained accuracy on “5” would the pace quicken and the hint fade again.

This mirrors the common adaptive rule in educational software: if a student answers several items incorrectly, the system will drop back to an earlier, simpler level or provide extra help until their performance recovers.

In psychometrics, similar staircase methods (one step up for X correct, one step down for Y wrong) have been used since the 1950s to find a person’s skill threshold.
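Here is a minimal sketch of such a staircase, assuming a simple 1-up/1-down rule on a numeric difficulty scale (real psychophysical procedures often use transformed rules, such as 2-down/1-up, to converge on higher accuracy levels):

```python
def staircase(responses: list[bool], start: float = 5.0, step: float = 1.0) -> list[float]:
    """Classic up-down staircase: raise difficulty after a correct response,
    lower it after an error, so the level hovers near the learner's threshold."""
    levels = [start]
    for correct in responses:
        levels.append(levels[-1] + step if correct else levels[-1] - step)
    return levels

# Difficulty oscillates around the point where successes and errors balance.
print(staircase([True, True, False, True, False, False]))
# -> [5.0, 6.0, 7.0, 6.0, 7.0, 6.0, 5.0]
```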

The general principle is to respond to recent performance data in real time, a hallmark of adaptive instructional systems.

Personalization Through Data and Behavior Analysis

Another link between mid-20th-century behaviorist systems and today’s adaptive learning is the use of data to individualize the learning trajectory.

Skinner’s machines were relatively simple, but he emphasized data collection – the machine recorded each student’s responses, which allowed teachers (or programmers) to see which frames caused trouble.

This feedback to the programmer was valuable: if many students were getting a particular frame wrong, Skinner would revise the material.

In a sense, this was an early use of learner performance data to improve instruction design.

Modern adaptive platforms take this much further.

They maintain detailed learner profiles, tracking responses over time, and often begin by assessing prior knowledge to customize the starting point.

This practice has roots in earlier “individually prescribed instruction” models from the 1960s, where a student might take a placement test and then skip objectives they already mastered.

Today’s systems can automatically analyze a learner’s historical performance (e.g. past courses, pre-test results, or even real-time analytics from the current session) and initiate instruction at an appropriate difficulty level.

For example, an adaptive math program might use last year’s exam data to place an adult learner at a certain lesson, rather than starting from scratch.

Kara and Sevim (2013) note that adaptive learning environments typically “monitor student activity, interpret these activities based on domain-specific models, and then adjust content delivery in accordance with the learner’s characteristics” – such as their background or prior performance.

In essence, the system first “identifies differences such as [the learner’s] background [and] prior knowledge… and offers a learning environment to suit these differences.”

This could mean selecting different content, altering the difficulty, or even changing the pedagogy (e.g. providing more visual explanations if a learner has struggled with text).

The idea that instruction should start at the learner’s current level was also a tenet of mastery learning and PSI in the 1970s (where students would test out of modules they already knew).

It reflects Vygotsky’s “zone of proximal development” as well – teaching at a level just beyond the learner’s independent ability.

Adaptive initial placement is simply much more data-driven now: with large item banks and algorithms, systems can fine-tune the starting point and continuously update the learner’s model as new data come in.

Finally, it’s worth noting that both K-12 and adult learning contexts have embraced these adaptive, mastery-based strategies – though often under different names.

In K-12, Computer-Assisted Instruction (CAI) in the 1970s (e.g. the PLATO system) incorporated branching tutorials and mastery tests influenced by Skinner and Crowder.

Precision Teaching, a behaviorist approach developed by Ogden Lindsley, stressed daily measurement and aiming for fluency (often defined as high accuracy and speed), using timings and performance charts to adapt practice frequency – concepts we see in today’s learning analytics dashboards.

For adult learners, personalized adaptive learning in higher education and corporate training is a growing area.

Adult students bring widely varying prior knowledge and need flexible pacing – exactly what Skinner’s self-paced machines and Keller’s Personalized System of Instruction (PSI) were designed to address.

Modern adaptive courseware (for example, systems like ALEKS in college math or various adaptive e-learning platforms in professional training) echo those principles: they allow self-pacing, require mastery on quizzes (often ~80–90% to pass), and adjust the sequence of topics based on the learner’s demonstrated mastery or difficulties.

A recent scoping review of adaptive learning in higher education found substantial benefits, concluding that adaptive learning systems can significantly improve adult learners’ engagement, knowledge retention, and overall success compared to one-size-fits-all instruction.

This aligns with decades of mastery learning research that showed higher achievement when learners are allowed as much time and practice as needed to reach criterion (Kulik et al., 1990).

In military and technical training for adults, real-time adaptive difficulty is used to keep trainees in that optimal learning zone – for example, flight simulators that adapt scenarios based on the trainee’s performance, or language learning apps that adjust question difficulty on the fly.

These systems quietly implement the same “contingency management” that Skinner advocated: they deliver immediate feedback, they reinforce successes (even if just with a “ding” or progress bar), and they use the learner’s behavior to decide what to do next (harder task, easier task, or repeat).

Modern algorithms may use fancy names – reinforcement learning, Bayesian knowledge tracing, multi-armed bandits – but at heart they are fulfilling what Richard Atkinson in 1972 called the essential “ingredients for a theory of instruction”: a model of the learning process, measurable performance, and decision rules for selecting the next instructional action (Atkinson, 1972).
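For illustration, here is a minimal sketch of a single Bayesian knowledge tracing update (the standard BKT equations; the slip, guess, and learn parameter values are illustrative assumptions, not fitted estimates). The tutor maintains a running probability that the skill is known and revises it after each response:

```python
def bkt_update(p_know: float, correct: bool,
               slip: float = 0.10, guess: float = 0.20, learn: float = 0.15) -> float:
    """One Bayesian knowledge tracing step: update P(skill known) from a response.
    slip  = P(wrong answer | skill known)
    guess = P(correct answer | skill not known)
    learn = P(acquiring the skill on this practice opportunity)"""
    if correct:
        posterior = (p_know * (1 - slip)) / (
            p_know * (1 - slip) + (1 - p_know) * guess)
    else:
        posterior = (p_know * slip) / (
            p_know * slip + (1 - p_know) * (1 - guess))
    # Account for the chance the student learned the skill on this opportunity.
    return posterior + (1 - posterior) * learn

p = 0.3
for outcome in [True, True, False, True]:
    p = bkt_update(p, outcome)
print(round(p, 3))  # the estimate rises with correct answers, dips after the error
```

A mastery-based tutor could then gate progression on this estimate – for example, advancing once the probability exceeds 0.95, a common convention in knowledge-tracing systems.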

In fact, Atkinson and colleagues in the 1960s were already formulating teaching as an optimal policy problem: given a student’s pattern of responses, what should the ideal tutor do next (give a harder item, review an easier item, provide a hint, etc.) to maximize learning? This work directly foreshadows today’s AI-driven tutors.

It highlights a continuity from the single-subject experiments of behavior analysts – who would meticulously adjust their teaching based on one learner’s data – to personalized learning at scale with computers: both rely on data-informed adjustment of instruction for each learner.

Conclusion

In summary, many modern adaptive instructional design principles are deeply rooted in the work of Skinner and other behaviorists.

The concepts of adaptive difficulty, mastery thresholds, and data-driven individualized instruction were present in embryonic form in mid-century teaching machines and programmed instruction.

Skinner’s insistence on immediate feedback, frequent active responding, and tailored pacing set the stage for today’s on-demand, interactive learning platforms.

Crowder’s branching logic introduced the idea of a contingent path based on performance, which we now see in every “choose your own learning path” system or remedial loop in intelligent tutors.

The use of performance criteria like 80% or higher as gates for progress comes straight from mastery learning and behavior-analytic practice, reinforced by research showing superior long-term outcomes when high accuracy and fluency are achieved before moving on.

And the overall notion of keeping learners in a flow state – appropriately challenged but supported – harkens back to techniques like shaping, fading, and differentiated instruction that the best behaviorist teachers used, now corroborated by cognitive research (the 85% rule).

Both K-12 and adult education implementations have validated these ideas.

Adults, in particular, benefit from adaptive systems that respect their prior knowledge and time constraints, allowing them to test out of what they know and focus on what they need, while adjusting difficulty to maintain engagement (Estella, 2024).

In a very real sense, today’s sophisticated adaptive learning technologies are the direct descendants of Skinner’s teaching machines – “the new forms of teaching machines,” as Kara and Sevim (2013) put it.

The tools have evolved from mechanical boxes to AI-driven software, but the fundamental principles – individualization, ongoing assessment, responsive feedback, and mastery-oriented progression – show a clear throughline from the behaviorist era to the present.

As we design modern learning experiences, we continue to “stand on the shoulders” of these early giants, merging their empirically grounded techniques (often demonstrated through single-subject studies) with contemporary data science to fulfill the longstanding dream of efficient, personalized education for every learner.

References (APA):

Skinner, B. F. (1958). Teaching machines. Science, 128(3330), 969–977. – Classic paper in which Skinner introduces the concept of teaching machines and programmed instruction, describing principles like small-step sequencing, immediate reinforcement, and self-pacing. Skinner demonstrates how automated instruction can shape behavior by rewarding correct responses.

Crowder, N. A. (1960). Automatic tutoring by intrinsic programming. In A. A. Lumsdaine & R. Glaser (Eds.), Teaching machines and programmed learning: A source book (pp. 286–298). Washington, DC: National Education Association. – Norman Crowder’s method of intrinsic (branching) programming, which adapts instruction based on learner errors. Crowder argues that effective programs must diagnose a student’s needs in real time and provide targeted remediation for mistakes, rather than a one-path-fits-all approach. Crowder’s systems foreshadow modern adaptive branching logic.

Pask, G. (1958). Learning machines. Electronics, No. 101. (Referenced in Stafford Beer’s 1959 book Cybernetics and Management.) – Gordon Pask’s work on adaptive teaching machines, especially the SAKI system for keyboard training. Pask’s machine built a probabilistic model of the learner’s skill and adjusted question presentation speed and frequency based on performance. This is one of the first demonstrations of real-time adaptive difficulty: tasks the learner found easy were given less often and with less help, while difficult tasks were repeated more often with more cues.

Keller, F. S. (1968). “Good-bye, teacher…” Journal of Applied Behavior Analysis, 1(1), 79–89. – Keller’s introduction of the Personalized System of Instruction (PSI), a mastery-based, self-paced instructional method for college students. In PSI, students must master each unit at a high criterion (often 90–100% on a test) before proceeding, and they learn asynchronously. This paper contextualizes the mastery learning approach in adult/higher education, demonstrating improved outcomes when students are allowed to reach true mastery without time penalties (a Kulik et al., 1979 review found PSI greatly increased achievement). Illustrates the adoption of behaviorist mastery criteria and pacing in adult learning.

Haring, N. G., Lovitt, T. C., Eaton, M. D., & Hansen, C. L. (1978). The fourth R: Research in the classroom. Columbus, OH: Charles E. Merrill. – This work presents the Instructional Hierarchy (acquisition, fluency, generalization, and adaptation stages of learning), which has been influential in special education and behavior analysis. It outlines how instructional strategies should change as the learner moves from roughly 60% accuracy (acquisition) to high accuracy and speed (fluency, ~80%+) to application in new contexts (~90%+, generalization). While not available online, its concepts are summarized in many sources and form the basis for thinking about stage-wise criteria in skill learning.

Fuller, J. L., & Fienup, D. M. (2018). A preliminary analysis of mastery criterion level: Effects on response maintenance. Behavior Analysis in Practice, 11(1), 1–8. – An experimental study (multiple-probe single-case design) evaluating how different mastery criteria (50%, 80%, 90% correct) during training affect the long-term maintenance of skills for individuals with autism. It found only the 90% criterion led to consistently maintained responses after a few weeks. This evidence supports using high accuracy criteria (as opposed to a lax 60–80%) if the goal is durable learning.

Richling, S. M., Nazaruk, E., Ekert, K., & Dixon, M. R. (2019). Mastery criteria: Is higher better? Journal of Applied Behavior Analysis, 52(4), 882–898. – Another study on mastery criteria, systematically comparing 60%, 70%, 80%, 90%, and 100% (each required across three consecutive sessions) in an applied setting. It showed significantly better maintenance for the highest criteria. The authors discuss the trade-off between efficiency (fewer trials to reach 60% vs. more trials to reach 100%) and effectiveness (higher criteria yielding more persistent learning). This single-subject research provides empirical backing for the often-cited 80%/90% rules.

Wilson, R. C., Shenhav, A., Straccia, M., & Cohen, J. D. (2019). The Eighty-Five Percent Rule for optimal learning. Nature Communications, 10, 4646. – A contemporary computational study finding that the optimal error rate during training is around 15% (i.e., 85% correct), across both artificial neural networks and analyses of animal and human learning. It frames this sweet spot in terms of keeping difficulty neither too low nor too high, and explicitly connects to educational practice: “this intuition… is at the heart of modern teaching methods,” which is analogous to shaping in behaviorism. Bridges behaviorist concepts of gradually increasing difficulty (shaping/fading) with modern machine learning and cognitive theory, supporting the idea that ~80–90% accuracy is a productive training target.

Kara, N., & Sevim, N. (2013). Adaptive learning systems: Beyond teaching machines. Contemporary Educational Technology, 4(2), 108–120. – A review comparing 1950s–60s teaching machines to today’s adaptive learning technologies. It highlights both similarities (individualized pacing, immediate feedback, focus on reinforcement and mastery) and differences (modern systems leverage AI and often constructivist models). Importantly, the authors note that many principles of teaching machines – e.g., personalization, active learner participation, ongoing progress tracking – are still central in adaptive e-learning environments. This paper situates historical approaches in a modern context, essentially arguing that adaptive learning systems are an evolutionary extension of the teaching machine concept.

Watters, A. (2018, August 8). Teaching machines: An American story (and the case for Gordon Pask) [Blog post]. Hack Education. – Ed-tech historian Audrey Watters provides a narrative on the development of teaching machines with a focus on Pask. This piece gives a rich description of Pask’s SAKI adaptive machine, quoting primary sources like Pask (1958) and Stafford Beer’s account. It emphasizes how Pask’s device was “possibly the first truly cybernetic [adaptive] device” in education, able to model the learner’s knowledge state and adjust difficulty on an individualized basis. The blog format makes it very accessible, and it links the mid-century innovations to current discussions of personalization in ed-tech.

Estella, L. (2024). Adaptive learning and assessment strategies for adult learners: A comprehensive approach. [Unpublished manuscript]. – (Cited via ResearchGate abstract.) A recent paper examining the impact of adaptive learning in adult education settings. Although the full text is not publicly available, the abstract reports that implementing adaptive learning technology “significantly improve[s] learner engagement, knowledge retention, and overall success in adult education.” It also notes challenges such as the need for instructor training and infrastructure. This reference underscores that the principles inherited from Skinner and behaviorist designs (e.g., self-pacing, immediate feedback, mastery learning) remain highly relevant and beneficial for adult learners, who bring diverse backgrounds and require flexible, personalized pathways.

Critiques and Refinements of the 85% Rule

There have been recent critiques and refinements of the “85% Rule,” often prompted by broader frameworks like desirable difficulties, expertise reversal, and task complexity.

Here’s what the research says:

🎯 Desirable Difficulties – Not Always 85%

The concept of desirable difficulties (Bjork, 1994) emphasizes that learning benefits vary depending on task type and learner characteristics, not a fixed accuracy target. For example:

Retrieval practice benefits more from harder questions.

Spacing and interleaving impose effortful conditions that often lower accuracy but improve retention.

Very complex materials may demand lower initial accuracy thresholds to allow manageable processing.

Takeaway: In tasks requiring high cognitive load (e.g., conceptual problem solving), optimal learning might occur at lower than 85% accuracy, depending on support and learner expertise.

Expertise Reversal Effect – What Works for Novices May Hurt Experts

Instructional methods effective for novices can backfire as learners become more skilled (Kalyuga et al., 2003):

Early learners benefit from structured guidance, whereas experts benefit from reduced scaffolding.

This aligns with adaptive fading principles, where difficulty is adjusted based on learner progression rather than held to fixed accuracy targets.

Task Complexity Matters

Bjork and colleagues have shown that learning gains vary with task complexity:

For simple binary tasks, 85% might be ideal.

For complex, multi-step tasks, other dynamics (e.g., retrieval difficulty, interleaving, adaptive feedback) interact, making a single accuracy rule less predictive.

Summary – Is the 85% Rule Still Valid?

For binary-choice tasks and stochastic gradient-learning models, the 15.87% error (85% accuracy) rule is well supported (Wilson et al., 2019). But in real-world education – especially with complex content and diverse learners – adaptive challenge points and desirable-difficulty frameworks argue against a fixed target. Experts recommend modulating task difficulty based on learner level, employing scaffolding, spacing, and varied practice rather than insisting on 85% accuracy.


📎 Key References

Wilson, R. C., Shenhav, A., Straccia, M., & Cohen, J. D. (2019). The Eighty-Five Percent Rule for optimal learning. Nature Communications, 10(1), 4646. – Demonstrates theoretical support for ~85% accuracy as optimal in binary learning tasks.

Bjork, R. A. (1994). Desirable difficulties in learning. – The concept shows learning advantages arising from challenge (e.g., retrieval, spacing), not from fixed mastery levels.

Kalyuga, S., et al. (2003). Expertise reversal effect. – Discusses shifting instructional strategies based on learner expertise.

Conclusion: The 85% Rule holds up well for simple, two-choice tasks and is supported by models and lab findings. But real-world learning – especially with complex content, diverse learners, or advanced learners – benefits more from adaptive, context-sensitive approaches, where performance targets vary by task and learner expertise.


🧠 1. Does Learning Require 90% Accuracy or Higher?

Yes. Multiple studies within behavior analysis suggest that higher mastery criteria (≥90%) yield significantly better long-term retention than lower thresholds.

Fuller & Fienup (2018) found that only skills mastered at 90% correct showed consistent retention several weeks later; lower criteria (e.g., 80%) did not maintain gains.

Richling et al. (2019) compared criteria from 60% to 100%, finding that performance was maintained best under the highest criteria, especially 90–100% correct.

This supports the idea that requiring 90% (or higher) accuracy before progression leads to more robust, durable learning than permissive thresholds like 85%.

⚡ 2. Precision Teaching and CentralReach on Fluency: More Than Rate + Accuracy

Precision Teaching (PT) blends rate (frequency) and accuracy to define true fluency, going beyond traditional “percent correct” mastery benchmarks.

Kubina & Morrison (2000) show that PT practitioners set performance standards – target rates per minute – that ensure retention, endurance, application, and stable performance.

Critical outcomes (REAPS/RESA) include retaining skills over time, enduring performance over longer periods, generalizing skills, and maintaining stable behavior under distraction – all dependent on hitting fluency standards.

Fluency aims (e.g., 150–200 words per minute in reading, 70–90 digits per minute in math) are based on empirical norms: learners must hit quantified performance rates to demonstrate functional mastery; otherwise, errors and slow responding indicate that more training is needed.

PT uses Standard Celeration Charts to graph frequency, detect trends, and guide instructional adjustments in real time.

Thus, PT in systems like CentralReach underlines that fluency = accuracy + speed + maintenance + application, not just a static accuracy benchmark.

🎯 Summary

Behavior-analytic research highlights that requiring 90–100% accuracy before progression leads to true retention and mastery. Precision Teaching, as promoted in CentralReach, operationalizes this through combined fluency aims, graphing, and performance standards far richer than static accuracy cutoffs.

📚 References (APA)

Fuller, J. L., & Fienup, D. M. (2018). A preliminary analysis of mastery criterion level: Effects on response maintenance. Behavior Analysis in Practice, 11(1), 1–8. https://doi.org/10.1007/s40617-017-0201-0

Richling, S. M., Nazaruk, E., Ekert, K., & Dixon, M. R. (2019). Mastery criteria: Is higher better? Journal of Applied Behavior Analysis, 52(4), 882–898. https://doi.org/10.1002/jaba.640

Kubina, R. M., Jr., & Morrison, R. S. (2000). Fluency in education. Behavior and Social Issues, 10, 83–99.

Jimenez, J. E., et al. (2024). A precision teaching framework for training autistic students to fluency. Journal of Behavioral Education.

Binder, C. (1996). Behavioral fluency: Evolution of a new paradigm. The Behavior Analyst, 19, 163–197.

Bottom line: Yes – rigorous learning data often supports ≥90% correctness (plus fluency) as the threshold for true mastery.

And Precision Teaching shows fluency is a multidimensional construct: it includes speed, stability, endurance, retention, and generalization.

So fluency is far more than what percent-correct numbers capture.



Edited by Rob Spain, M.S., BCBA, IBA

