This is an extract from our forthcoming book Fear is the Mind Killer: Why Learning to Learn deserves lesson time – and how to make it work for your pupils.

The feedback we have received so far has been pretty phenomenal.

You can read all the very kind things people have said about it here.

And you can buy it here (John Catt) or here (Amazon).

This extended excerpt is the first part of Chapter 3: Learning to Learn on Trial.

The case against Learning to Learn

‘If you have a theory, you must try to explain what’s
good and what’s bad about it equally.’

 Richard Feynman (2007) [1]


In Chapter 2: A brief history of Learning to Learn, we saw that the research on Learning to Learn presents us with an apparent paradox. On the one hand, there is a compelling body of research literature to suggest that teaching children in such a way as to develop metacognition and self-regulation – the key components of Learning to Learn – leads to significant gains in student learning. On the other, when Learning to Learn initiatives have been evaluated on a large scale, the results have been rather more mixed. In Silicon Valley parlance, Learning to Learn doesn’t scale – or, at least, it hasn’t been proven to be scalable yet.

There are two ways in which we might interpret this observation. First, it is possible that there is something about Learning to Learn itself that makes it hard to scale. We might be able to harness the power of metacognition and self-regulation with a small number of highly motivated teachers, but it doesn’t lend itself to being implemented at a system level. A second interpretation would be to say that scaling up itself is hard to do. We think it’s likely that there is some truth in both of these interpretations.

However, there is good news on both fronts. First, there is good reason to believe that the Learning Skills curriculum is easier to implement, and therefore to scale up, than previous initiatives. We will explain why in Part 2, and especially in Chapter 8: An implementation (and evaluation) checklist. And second, in recent years, a new field of study has emerged – implementation science – which we believe holds the answers to the problem of how to scale up promising initiatives in such a way that the benefits can be replicated at a system level. We’ll discuss implementation science in more detail later in this chapter, and again in Chapter 8.

However, some people – typically, those who would describe themselves as traditionalist – don’t think this is a scaling up problem at all. Instead, they object to Learning to Learn because they don’t believe it is possible to teach people how to get better at learning in an abstract, or ‘generic’ way. There is a fascinating discussion to be had here. So, in this chapter – in the spirit of Feynman’s excellent suggestion in the quote above – we will explore a number of arguments for and against teaching children how, as well as what, to learn. To do this, we will put Learning to Learn in the dock. Its alleged crime? Well, as we saw in Chapter 1, Learning to Learn stands accused of being ‘bad education’; a ‘snake oil hoax’; a mistaken attempt to teach the unteachable.

The trial will run as follows:

  • The prosecution: arguments against Learning to Learn
  • The prosecution, cross-examined
  • The defence: arguments in favour of Learning to Learn
  • The defence, cross-examined

Court is in session: let battle commence!

The prosecution: arguments against Learning to Learn

Ladies and gentlemen of the jury. For as long as schools have existed, people have argued that we should do education differently – that instead of having a curriculum designed and taught by experts, we should teach children to teach themselves. In this way, it is believed, we will better prepare them for a life of learning beyond the school gates. As well as teaching subject content, therefore, schools should help children ‘learn how to learn’.

At face value, this is an attractive proposition. However, we are here to persuade you that any attempt to teach children how to learn is misguided – a mirage that rests on the false belief that it is possible to get better at learning in an abstract, or ‘generic’ way. To do this, we will draw on three lines of argument: knowledge is foundational; children are novices; and generic skills can’t be taught/don’t transfer. Let’s consider each in turn.

Knowledge is foundational

‘In this country, we are winning the argument
in favour of a knowledge-rich curriculum…’

 Nick Gibb, MP: Minister of State for School Standards (2017) [2]

In the 2000s, under the UK’s New Labour government, educational policy and practice were strongly guided by two big ideas: personalisation and skills. Schools were encouraged to personalise learning to suit the particular preferences and needs of individual pupils, through practices such as learning styles, differentiation and student voice. [3] And there was a strong emphasis on skills throughout this period, with the publication of frameworks such as the Personal Learning and Thinking Skills (PLTS) [4] and the Social and Emotional Aspects of Learning (SEAL). [5] The centrality of skills to educational policy during this period can even be seen in the fact that in 2001, the Department for Education and Employment was renamed the Department for Education and Skills.

Under the more recent coalition and Conservative governments, there has been a marked shift away from the personalisation/skills agenda and towards a ‘knowledge-rich curriculum’. However, support for this idea comes from across the political spectrum, and it should not be seen as a party-political concern. For example, E.D. Hirsch, an influential advocate of a ‘core knowledge’ curriculum, considers himself ‘practically a socialist’. [6] Hirsch’s work centres around the idea of ‘cultural literacy’, which he defines as:

‘the network of information that all competent readers possess. It is the background information, stored in their minds, that enables them to take up a newspaper and read it with an adequate level of comprehension, getting the point, grasping the implications, relating what they read to the unstated context which alone gives meaning to what they read.’ [7]

In his 1987 book Cultural Literacy, Hirsch included an appendix with a list of 5000 names, phrases and historical dates, under the heading What Literate Americans Know. [8] For Hirsch, a knowledge-rich curriculum is needed to provide young people with the minimum background information required to join what the philosopher Michael Oakeshott referred to as the great ‘conversation of mankind’. [9] In his 2007 book The Knowledge Deficit, Hirsch suggests that teaching children ‘a coherent, knowledge-based curriculum’ is the best way to close ‘the needlessly wide achievement gaps between ethnic and racial groups’, as well as between children from different socio-economic backgrounds. [10]

Alongside Hirsch, advocates of a knowledge-rich curriculum often cite the work of the cognitive scientist Daniel Willingham. In his book Why don’t students like school?, Willingham makes a powerful case that knowledge is foundational – it is the stuff that supposedly ‘higher order’ skills such as synthesis, analysis and critique are made of:

‘Trying to teach students skills such as analysis or synthesis in the absence of factual knowledge is impossible. Research from cognitive science has shown that the sorts of skills teachers want for their students – such as the ability to analyse and to think critically – require extensive factual knowledge. The cognitive principle that guides this chapter is: ‘Factual knowledge must precede skill’.’ [11]

To explain Willingham’s key insight in a little more detail: the most powerful determinant of whether an individual is able to think creatively or critically about a particular subject is how knowledgeable they are within that domain. By way of a thought experiment, try thinking creatively or critically about a difficult problem – how to get large multinational companies to pay more tax, say. It is unlikely that you will get very far without knowing a considerable amount about systems of taxation, ‘creative accountancy’ practices and how to write watertight legislation and regulations. Similarly, when your car breaks down, who would you rather call – a mechanic or a university professor? University professors are often extremely knowledgeable, but it’s not often the kind of knowledge that can help you diagnose and fix a faulty alternator.

In recent years, Willingham’s work – in particular, this idea that knowledge is foundational – has been widely embraced by teachers, researchers and politicians alike. For example, in 2012, Michael Gove – then Secretary of State for Education (in England) – said:

‘one of the biggest influences on my thinking about education reform has been the American cognitive scientist Daniel T. Willingham… [who] demonstrates brilliantly in his book, memorisation is a necessary precondition of understanding.’ [12]

Similarly, Nick Gibb, the current Minister of State for School Standards in England, twice invoked Willingham in a recent debate in which he argued in favour of the notion that ‘Learners’ heads should be filled with facts’:

‘Daniel Willingham talks about [how] an educated person has vast amounts of knowledge in his or her long-term memory which you can retrieve instantly… As Daniel Willingham has said, education is about… ensuring that we have facts and knowledge securely embedded in long-term memory…’ [13]

In recent years, some people have suggested that we don’t need to teach knowledge in the internet age, because when they need to know something, children can just ‘look it up on their smartphone’. However, this is mistaken for two main reasons. First – counterintuitively – you need knowledge in order to look something up in an accurate way. As Hirsch (2000) explains:

‘There is a consensus in cognitive psychology that it takes knowledge to gain knowledge. Those who repudiate a fact-filled curriculum on the grounds that kids can always look things up miss the paradox that de-emphasizing factual knowledge actually disables children from looking things up effectively. To stress process at the expense of factual knowledge actually hinders children from learning to learn. Yes, the internet has placed a wealth of information at our fingertips. But to be able to use that information – to absorb it, to add it to our knowledge – we must already possess a storehouse of knowledge. That is the paradox disclosed by cognitive research.’ [14]

As Hirsch argues powerfully, when it comes to thinking critically or creatively about something, it is vastly preferable to have the relevant knowledge stored in your long-term memory, rather than in your smartphone. Indeed, it is having relevant knowledge stored in your long-term memory that allows you to both comprehend and critique the search results that appear on your screen. The implications for Learning to Learn are clear: teach them knowledge instead. This brings us to our second argument…

Children are novices

Children, it will not surprise you to learn, are young and inexperienced. This means that, by definition, children are usually novices, rather than experts – especially in the school context, where much of what they learn, they are encountering for the first time. As we will see, this has important consequences for how we educate young people. To understand why, we need to acquaint ourselves with cognitive load theory, described by Dylan Wiliam as ‘the single most important thing for teachers to know’. [15]

Cognitive load theory is based on the ‘multistore model of memory’ first proposed by Atkinson and Shiffrin (1968). [16] A simplified version of this model – described by Willingham as ‘just about the simplest model of the mind possible’ [17] – can be seen in Figure 1.

Figure 1. The working memory model of the mind. [18]

In this scheme, working memory is defined as the ‘site of awareness and thinking’. This is often thought of as the window of mental space we ‘live in’, or through which we attend to the world. There are two ways in which information can enter your working memory: by paying attention to something in your immediate environment (e.g. teacher, book, screen), or by recalling information stored in your long-term memory.

In a seminal 1956 paper, George Miller proposed what would become known as Miller’s Law – that the number of ‘bits’ of information that can be held in the working memory is limited to ‘the magical number seven, plus or minus two’. [19] While the precise value has been contested over the years (others have suggested the limit is more like four, plus or minus one [20]), the central idea of Miller’s Law is widely accepted. Working memory – the window of human consciousness through which we attend to the world and manipulate mental objects – operates within a fairly narrow bandwidth. It’s easy to demonstrate this by asking someone to hold a novel string of random numbers in their head for, say, 20 seconds, without rehearsal. 2473 is pretty doable. 24739638194… not so much.

Building on Miller’s Law, cognitive load theory states that because working memory has a limited capacity, if a student is presented with too much information or an overly complex task, their working memory becomes overloaded and they can’t learn effectively.

John Sweller, the founder of cognitive load theory, highlights a crucial point about the limited capacity of working memory. That is, the limit on the number of ‘bits’ of information that can be held in working memory applies ‘when dealing with novel information, and only when dealing with novel information’. [21]

In other words, the limitations of working memory are especially pertinent to the education of children – novices who are almost always dealing with novel information. In experts, information stored in the long-term memory is organised into schemata – dynamic networks of knowledge and understanding that guide our beliefs and behaviours. Together, networks of schemata act as a kind of ‘index’ which allows knowledgeable experts to recall and retrieve relevant information instantly from long-term memory, without burdening the working memory. [22]

To place this in a classroom context, suppose a student is working on a complex maths problem and that as part of the solution, they need to work out 7×6. If they have to stop what they’re doing to find a calculator, or count it out on their fingers, this would take up valuable working memory capacity. In contrast, had the student memorised their times tables, they would know in an instant that the answer is 42 and this would allow them to think about the wider problem. Thus, by storing and automating information in the long-term memory, we can bypass the limits of Miller’s Law and free up our working memory to attend to the problem in hand. So when people say ‘there’s no point teaching knowledge any more, people can just look stuff up on their smartphone’, they would do well to look up cognitive load theory on their smartphone and then reflect a little longer.

Cognitive load theory suggests that if we want people to be able to think creatively and critically – and everyone seems to agree that this is a desired goal of education – we need to view creativity and critical thinking as the endpoint, and not as the method by which we get there. The way to get there is to teach a knowledge-rich curriculum. If you aren’t persuaded by the cognitive science, you can arrive at the same conclusion using common sense. If you want a student to think creatively about something – to write a song for a musical, say – it stands to reason that they’ll write a better song if they know a lot about music theory and have seen lots of musicals, than if they just thrash away on a detuned ukulele in a state of blissful ignorance. Likewise, if you want your students to think critically about why a science experiment produced unexpected results, they’ll stand a better chance if they know their dependent variable from their control variable.

Here, we can see how cognitive load theory connects the fact that children are novices with our first argument, that knowledge is foundational. Working memory acts as a kind of ‘bottleneck’ – a narrow passage through which information passes into an individual’s mind. When we try to force too much information through this bottleneck, it can become overloaded and this can prevent the individual from learning effectively. However, the bottleneck can be bypassed by having knowledge stored in the long-term memory, freeing up working memory to attend to the task in hand. The difference between a novice and an expert on a given topic is the amount of relevant information stored in the long-term memory, and storing knowledge and building schemata in long-term memory is precisely how novices turn into experts.

When we are dealing with the education of novices – and that, by definition, is what schools exist to do – we need to think carefully about how to teach in such a way that the learning is remembered over the long term. In schools, time is finite, and every decision we make about how to spend that time comes with an opportunity cost: what else might we have achieved, had we spent that time differently? If we want to furnish children’s minds with ‘the best of what has been thought and said’, which is likely to be the most efficient use of time – to have a knowledge-rich curriculum, designed and taught by experts, or to teach novice children to ‘learn how to learn’ in the hope that they will learn it by themselves?

Generic skills can’t be taught/don’t transfer

There are two related arguments here. First, so-called ‘generic skills’ such as creativity and critical thinking are not really generic at all – they are highly subject-specific – and therefore they cannot be taught in a generic way. And second, learning is situated, which is to say, knowledge and skills tend to remain rooted in the context in which they were developed. Therefore, even if we could teach such skills, it is unlikely that they would meaningfully transfer to other lessons. We will now consider these arguments in turn. 

Perhaps the best-known example of the argument that generic skills such as creativity and critical thinking cannot be taught is an article by Tricot and Sweller (2014). [23] Following the evolutionary psychologist David Geary, [24], [25] Tricot and Sweller draw a distinction between ‘biologically primary’ and ‘biologically secondary’ knowledge. [26] They define biologically primary knowledge as that which ‘we have evolved to acquire over many generations’; this includes things like ‘learning to listen and speak, learning to recognise faces, engage in social relations, basic number sense or learning to use a problem-solving strategy’. [27] In contrast, biologically secondary knowledge has no evolutionary precedent; this is defined as things like ‘reading, writing and arguably, all other content taught in modern educational establishments’. [28]

According to Tricot and Sweller, biologically primary knowledge ‘is acquired easily, unconsciously and without explicit tuition. Barring learning deficits such as those associated with autism, it will be acquired automatically simply as a consequence of membership of a normal society’. [29] Tricot and Sweller consider biologically primary knowledge to be synonymous with ‘generic skills’, and because biologically primary knowledge is ‘acquired easily’ and ‘without explicit tuition’, this is presented as ‘an alternative to the perspective that teaching generic skills is important’. [30]

Elsewhere, Sweller has written: ‘If children are not explicitly taught to read and write in school, most of them will not learn to read and write. In contrast, they will learn to listen and speak without ever going to school’. [31] This idea that written literacy needs to be taught, while people usually learn to speak and listen even in the absence of schooling, is supported by historical literacy rates. In 1820, only 12% of the world could read and write. Following the invention of schools, this pattern was swiftly reversed: in 2015, the global literacy rate was around 86%. [32] In contrast, one assumes at least, the proportion of people capable of holding a conversation in 1820 was very much the same as it is today.

We will now turn to the second part of the argument – that even when people do develop so-called higher-order skills such as creativity and critical thinking, they tend to be quite specialised to particular situations and do not transfer easily from one context to another. There is a considerable body of research literature on situated learning, ‘the idea that much of what is learned is specific to the situation in which it is learned’. [33] To return to our car mechanic: though they may be good at solving problems to do with alternators, that doesn’t mean they will be equally good at solving chess problems, or solving the problem of tax avoidance by multinational corporations.

Similarly, skills like creativity are highly context-specific, and do not transfer easily from one domain to another. Kaufman and Baer (2002) made this argument forcefully when they asked ‘could Steven Spielberg manage the New York Yankees?’ and concluded, having considered the cognitive science, that the answer ‘appears to be no. Just as Joe Torre [the Yankees coach at the time] should probably restrict his camera activity to birthday parties, Spielberg should probably only enter Yankee Stadium as a fan.’ [34]

Learning to Learn is based on two assumptions: first, that it is possible to teach generic skills; and second, that skills transfer easily from one context to another. If both of these assumptions are false – and we hope to have persuaded you that they are – then the foundations on which Learning to Learn is built begin to look decidedly shaky.

Summing up, the prosecution declares: ‘Ladies and gentlemen of the jury. This is an open and shut case. We have seen that knowledge is foundational. We have seen that children are novices. And we have seen that generic skills cannot be taught, and do not transfer easily from one context to another. Our opponents will try to persuade you that all of this is blindingly obvious – that “of course knowledge is important, nobody ever said it wasn’t”. But in recent years, those on the progressive side of the aisle have claimed precisely this. They say things like “you can just Google it”, and “we are preparing children for jobs that don’t exist yet, so what’s the point in teaching knowledge?” You may very well shake your head, sir, but I kid you not! Thank goodness the teaching profession has come to its senses. The whole idea of Learning to Learn rests on the assumption that learning skills are generic – that once you become an effective learner, you are able to learn anything, in any context. But as we have just heard from our key witnesses, generic skills can’t be taught! It’s a misunderstanding, based on an ignorance of the cognitive architecture of the human mind. To be perfectly honest, I’m not really sure what Learning to Learn is; I doubt very much that proponents of Learning to Learn really understand it either. But I do know one thing: it doesn’t sound very much like teaching knowledge. And as we have heard: teaching knowledge is the thing! Members of the jury: let us not step back into the dark ages. It is incumbent upon you to lock up Learning to Learn – whatever it may be – and throw away the key!’

To find out how we defend Learning to Learn against these arguments… well, you’ll have to buy the book!



[1] Feynman, R. (2007) What Do You Care What Other People Think? Further adventures of a curious character. London: Penguin. p117-118.

[2] Gibb, N. (2017) The importance of vibrant and open debate in education. Speech at the researchED National Conference, London, September 9. Available at:

[3] Hargreaves, D. (2006). Personalising learning 6: the final gateway: school design and organisation. London: Specialist Schools and Academies Trust.

[4] Qualifications and Curriculum Authority (2007) Personal, Learning and Thinking Skills Framework. London: QCA.

[5] Department for Children, Schools and Families (2007). Social and emotional aspects of learning for secondary schools. Nottingham: DCSF Publications.

[6] Tyre, P. (2014). ‘I’ve Been a Pariah for So Long’. Politico Magazine. Available at: See also the #leftytrad hashtag on Twitter.

[7] Hirsch, E. D. (1987). Cultural Literacy: What Every American Needs to Know. Boston, MA: Houghton Mifflin Company.

[8] Hirsch, E. D. (1987). Cultural Literacy: What Every American Needs to Know. Boston, MA: Houghton Mifflin.

[9] Oakeshott, M. (1959) The Voice of Poetry in the Conversation of Mankind. London: Bowes and Bowes.

[10] Hirsch, E. D. (2007). The Knowledge Deficit: Closing the Shocking Education Gap for American Children. New York, NY: Houghton Mifflin, p.xi–22.

[11] Willingham, D. (2009) Why don’t students like school? San Francisco, CA: Jossey-Bass, p. 19.

[12] Gove, M. (2012). Secretary of State for Education Michael Gove gives speech to IAA, November 14. Available at:

[13] Should we fill 21st Century learners heads with pure facts? Debate from the 2017 Global Schools and Education Forum. Available at:

[14] Hirsch, E. D. (2000). ‘You can always look it up’ … or can you? American Educator (Spring), 4–9.

[15] Wiliam, D. (2017) ‘I’ve come to the conclusion Sweller’s Cognitive Load Theory is the single most important thing for teachers to know.’ Twitter, January 26. Available at:

[16] Baddeley, A., & Hitch, G. J. (1974). Working memory. In G. A. Bower (Ed.), Recent Advances in Learning and Motivation (Vol. 8, pp. 47-90). New York: Academic Press.

[17] Willingham, D. (2009) Why don’t students like school? San Francisco, CA: Jossey-Bass, p. 11.

[18] From Willingham, D. (2009) Why don’t students like school? San Francisco, CA: Jossey-Bass, p. 42.

[19] Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97, p81.

[20] Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–114.

[21] Sweller, J. (2009). What human cognitive architecture tells us about constructivism. In S. Tobias & T. M. Duffy (Eds.), Constructivist instruction: Success or failure? (p. 127–143). Routledge/Taylor & Francis Group.

[22] Larkin, J., McDermott, J., Simon, D.P. & Simon, H.A. (1980). Expert and novice performance in solving physics problems. Science, 208(4450), p1342.

[23] Tricot, A., & Sweller, J. (2014). Domain-specific knowledge and why teaching generic skills does not work. Educational Psychology Review, 26(2), 265-283.

[24] Geary, D. C. (2008). An evolutionarily informed education science. Educational Psychologist, 43, 279-295.

[25] Geary, D. C. (2012). Evolutionary educational psychology. In K. R. Harris, S. Graham, & T. Urdan, (Editors-in-chief), Educational psychology handbook: Vol. 1. Theories, constructs, and critical issues (pp. 595-620). Washington, DC: American Psychological Association.

[26] Tricot, A., & Sweller, J. (2014). Domain-specific knowledge and why teaching generic skills does not work. Educational Psychology Review, 26(2), 265-283.

[27] ibid., p269.

[28] ibid., p269.

[29] ibid., p268.

[30] ibid., p265.

[31] Sweller, J. (2016). Story of a Research Program. In S. Tobias, J. D. Fletcher, & D. C. Berliner (Series eds.), Acquired Wisdom Series. Education Review, 23, February 10, p.12.


[33] Anderson, J.R., Reder, L.M & Simon, H.A. (1996) Situated Learning and Education. Educational Researcher, 25(4), 5-11.

[34] Kaufman, J. C., & Baer, J. (2002). Could Steven Spielberg manage the Yankees?: Creative thinking in different domains. Korean Journal of Thinking & Problem Solving, 12, 5-14. (p12)