Notes to Associationist Theories of Thought

1. Empiricists who have wanted more than one type of learning mechanism have tended to be constructivists. The basic constructivist position is to posit a single mental process, the ability to associate ideas, and to construct new processes out of the single innate process (see Fodor 1983 for discussion). On pain of regress, no theorist, regardless of their orientation, can have every mental process be learned. Some must be innate, and some must be "built into the architecture" (e.g., not explicitly represented; see Quilty-Dunn and Mandelbaum 2018).

2. Though many later associationists, such as Pavlov and the behaviorists, had only one mental process, Hume also had the imagination. For discussion on how the imagination meshes with Hume’s empiricism and associationism see Fodor (2003).

3. That said, one can detect aspects of associationism in earlier writers, such as Descartes when discussing memory and Spinoza when discussing the emotions (see the entry on Descartes and on 17th and 18th Century Theories of Emotions).

4. Although Hume is generally acknowledged as laying the theoretical foundation of associationism, there is some evidence that Francis Hutcheson’s use of associations greatly influenced him. See the entry on Scottish Philosophy in the 18th Century.

5. “All our simple ideas in their first appearance are deriv’d from simple impressions, which are correspondent to them, and which they exactly represent” (T 1.1.1.7/4).

6. This is a bit of a loose formulation. Strictly speaking, Impressions themselves don’t instantiate any associative relation; rather, the contents of the Impressions do. For example, it isn’t that one’s Impression (understood as a vehicle of thought) of chickens resembles roosters; rather, it’s that the contents of one’s Impressions resemble one another. Presumably, all Impressions qua vehicles of thought resemble one another merely by being Impressions. What differs between Impressions is, e.g., whether the content they represent resembles other represented content. This distinction between vehicle and content is important for Hume’s overall architecture: it’s not the vehicle of the Impression that gets copied into an Idea, but rather the content of that vehicle. That said, the vast majority of associationist theories range over associated contents and not associated vehicles (even though there is no theoretical reason vehicles can't be associated, and some reason to think they sometimes are; see, e.g., Luka and Barsalou 2005). To ease exposition, the distinction between vehicles and contents is elided in the main text except where it is important to distinguish.

7. Although some contemporary associationist views still retain all three original Humean associative relations, the resemblance relation has come under the most scrutiny and is the least popular of the three. For discussions of the problem of the resemblance criterion see Field and Davey (1999) and De Houwer (2009). In the canonical Rescorla-Wagner model (Rescorla and Wagner 1972), both contiguity and resemblance are superseded by the contingency requirement. However, allowing cause and effect (as such) to be analyzed as associative greatly complicates the theory, so that it is no longer clearly any simpler than computational or propositional theories.

8. A variation on classical conditioning is evaluative conditioning, where one tries to transfer the valence of the US onto the CS (see, e.g., De Houwer et al. 2001 for an overview). For instance, one might pair a favorable flavor (e.g., sugar) with a novel neutral face stimulus, in order to transfer the positive valence to the previously neutral face.

9. There are many different ways of construing the details of Pavlovian conditioning. For example, some would restrict the usage further by arguing that the US must be biologically significant, or widen the usage, as Rescorla does (see section 7). Some anti-associationists even believe that Pavlovian conditioning is real, but not predicated on associations (Mitchell et al. 2009).

10. Classical conditioning also had some consequences that were a bit unpalatable for empiricists: if all learning was to be given as forming associative bonds between USs, CSs, and responses, then all of our learning had to bottom out in some behaviors that were preprogrammed to correspond to certain stimuli: in other words, certain instinctual patterns of behavior were innately set to be elicited by certain stimuli. Even more problematically, such instinctual patterns were apt to be species specific, so not generalizable to humans. Most problematically, the theory just doesn't seem true, as responses to CSs are often different from the responses to USs. When bells are swapped for food, dogs may still salivate, though not necessarily to the same degree as to the actual food (a fact which Pavlov himself knew; see Pavlov 1927, Lecture III). Moreover, if given the opportunity, dogs will try to eat the food but they won't try to eat the bell.

11. Note how Thorndike does not hesitate to speak of mental states like satisfaction and dissatisfaction, as opposed to the most famous practitioner of operant conditioning, the radical behaviorist B.F. Skinner (see the behaviorism entry).

12. From this level of abstraction, Pavlov and Skinner were united. Here’s Garcia on Skinnerian learning:

Any stimulus applied immediately after the response which, by empirical test, would increase response production was deemed a reinforcer…The general procedures were said to be applicable to any and all reflexes, in any and all organisms. There was no need to concern ourselves with species differences, with brain differences, or with reinforcer differences. The payoff schedule’s the thing wherein we’d capture control of the organism. (Garcia 1981: 155)

13. Some even question whether evaluative conditioning is a true form of learning, or is instead a version of propositional learning; see De Houwer 2018, and section 8 below.

14. Talk of storage implies the existence of memory. For associationists, the use of memory has been, at times, a tricky issue. Since memory is a cognitive faculty, if associations need to have memory then one cannot be an associationist while also denying the existence of mental processes, or minds for that matter (yet another problem for the radical behaviorist position). Insofar as associative learning implies memory (and it seems to even on 'model-free' learning models), some unintuitive conclusions may follow, as even plants have appeared to engage in associative learning (Gagliano et al. 2016).

15. Radical behaviorists such as Skinner (e.g., 1953) would deny this claim, but only because of their ontological objections to reifying mental states. But Eliminativism of the mental is a different thesis than associationism, although both fit together well (see section 6).

16. Hereafter I will use the forward slash to denote an associative bond between the entities on either side of the slash. Additionally, expressions written in small caps will be used to denote concepts, and I will assume that the concepts’ structural descriptions are given by the expressions. Thus RED BIRD is taken to be a complex concept consisting of two meaningful parts, the concept RED and the concept BIRD. However, BIRD will be assumed to be a simple concept with no semantically decomposable parts. The structural descriptions are stipulated for exegetical reasons and without commitment to the actual structure of the corresponding concepts.

17. The mediation parenthetical can get a bit complicated to state, for one might want to claim that, e.g., WRENCH and HAMMER are associated, even if the association is mediated via a link between those concepts and TOOL. In which case, it’s best to say that two concepts form a basic associative structure if the activation of one concept brings on the activation of another without there being any other mediating psychological variable.

18. This claim should be qualified in a few ways. First, the mapping might not be a full mapping of a single thinker as opposed to a subsystem of a single thinker (such as their intramodular representation of their lexicon; see Fodor 1983). Second, the mapping needn’t be between concepts per se, and can instead be between mental representations on which, for some reason or another, one needn’t bestow the honorific of “concepts” (because, for example, the mental representations are intramodular and thus not properly “general”; see Evans 1982).

19. “Experiencing Xs and Ys” generally means something such as “having formed representations of Xs and Ys based on their appearance in the ambient environment,” but needn’t necessarily mean that. If one just happened to keep thinking X followed by Y for any reason, even though Xs and Ys weren’t given in experience, that too could change the associative strength of the X/Y bond. Additionally, some theories allow “piggybacking” associations: associations formed from activated propositional structures. For example, constantly having the propositional thought MOLLY OWNS A DOG could affect the associative bond between MOLLY and DOG (see Mandelbaum 2016 for discussion).

20. Although bare-bones associationism provides a good approximation of Hume and Pavlov, it doesn’t quite capture the full theory of those working in operant conditioning paradigms, for it doesn’t involve any notion of reinforcement, or of updating one’s associative structure based on consequences. This isn’t accidental: how to square cognitive updating (i.e., association-based or belief-based updating) based on consequences with the spartan tenets of associationism has often been a point of difficulty (see, e.g., Festinger and Carlsmith 1959).

21. Curiously, it appears that extinction isn’t very effective in evaluative conditioning paradigms, though counterconditioning is (see De Houwer 2011 for many citations, such as Diaz et al. 2005 and Vansteenwegen et al. 2006).

22. Technically, reinstatement is the reappearance of the CR upon reexposure to the US after successful extinction, whereas spontaneous recovery is the name for the return of the associative pairing just due to the passage of time. Thus one is due to changes in spatial context, the other changes in temporal context. Both reinstatement and spontaneous recovery are related, and both provide difficulties for the traditional view of extinction.

23. Interestingly, Locke also seemed to understand the nature of taste aversions (see section 9.4):

A grown person surfeiting with honey no sooner hears the name of it, but his fancy immediately carries sickness and qualms to his stomach, and he cannot bear the very idea of it; other ideas of dislike, and sickness, and vomiting, presently accompany it, and he is disturbed; but he knows from whence to date this weakness, and can tell how he got this indisposition. Had this happened to him by an over-dose of honey when a child, all the same effects would have followed; but the cause would have been mistaken, and the antipathy counted natural. (Locke 1690: 2.23.7).

24. In the example of associative transitions offered above, we used associations between propositions. But of course a pure associationist view would not allow propositional structures. It is thus a bit more difficult for a pure associationist to distinguish associative transitions from associative structures. For the pure associationist, all transitions are associative transitions among associative structures, for association is the only available mental process and associative structures the only available mental structure. Thus, for the pure associationist, the only possible difference between an associative structure and an associative transition is a contingent temporal one (where an associative structure is ideally contemporaneous whereas an associative transition unfolds over time).

25. The situation is similar to what arises in numerical cognition. When we are children, we may explicitly add 2 to 2 to get 4, but over time 2 + 2 = 4 becomes an associated string, more phonetic than arithmetic, similar to the truths of the multiplication table. After memorizing the multiplication table we don't need to think to give the answer to 5 x 5. Compare 2 + 2 or 5 x 5 to calculating either 9 + 16 = 25 or 55 x 5. In these latter cases we don't have any rote overlearning, so we generally have to calculate the responses instead of merely parroting stored ones. Evidence for the distinction between these two ways of answering numerical questions comes from patients who lose access to their faculty of numerical reasoning (Dehaene 2011; Mandelbaum 2013b). These people can't do basic numerical tasks (e.g., tell you if 30 is between 20 and 40 on a number line, visually distinguish which sets have more members than others, calculate answers to arithmetical questions longhand, etc.) but they can still answer previously memorized equations, like the multiplication tables. In this sense, knowledge of the multiplication tables is more similar to one's knowledge of the state capitals (both being species of semantic memory) than it is to mathematical reasoning. In essence, answering via semantic memory is quicker and easier because the answers are associated with questions in a way most expressions of mathematical truths are not. Similarly, overlearned inferences may cease to be inferences and instead become associative strings because of the overlearning (e.g., perhaps this is true for some for the old chestnut All men are mortal, Socrates is a man, so Socrates is mortal).

26. The question of how many levels of explanation one allows in their cognitive architecture is wholly separate from the question of whether any of those architectures are associationistic. Generalizations here vary wildly from theorist to theorist. For example, many theorists, roughly following Marr (1982), assume there is just one algorithmic (psychological/representational) level which is then instantiated in a physical (neurological) level (see, e.g., Mitchell et al. 2009). Others generally assume that there are multiple psychological levels. For instance, Fodor writes, “psychological faculties at the nth level are typically implemented by psychological faculties at the n−1th level” (2003: 132; cf. Danks 2013).

27. In this context, “subsymbolic” just means that the node on its own has no semantic value. In other words, a single node wouldn’t represent any content.

28. What's seen as structural from one vantage point is seen as functional from another (Lycan 1990). Dendrites are structural from the vantage of intentional psychology, but functional from the vantage of particle physics.

29. There are no domain-specific associationists because associative learning is incompatible with domain specificity. Domain specificity assumes different mental processes for different domains, and associative learning presupposes the same learning mechanism regardless of domain (Mandelbaum 2017, 2019).

30. For example, in a “default-interventionist” model System 2 processes are not always engaged, though they are in “parallel competitive” models (both models include the constant automatic engagement of System 1). See Evans and Stanovich 2013 for discussion.

31. Systematicity arguments aren't about language, or even humans, per se. One can run the same style of argument for animals and learning: an animal that can learn that a is on top of b can learn that b is on top of a; an animal that can learn to associate a green light with a shock and a blue light with food can learn to associate a green light with food and a blue light with a shock. Of course there are well-known domain-specific constraints on this type of learning (see the Garcia effect discussion) but they are merely exceptions to the rule.

32. Gallistel and King (2009: 239) argue that there is no such window. Instead they argue that what matters for learning in place of contiguity is a ratio of the time between the presentation of the CS and the appearance of the US as compared to the time between different US presentations (in a given context). For example, speeding up the CS/US connection by a factor of two reduces the number of US presentations one needs by half.

33. It appears that content specificity of associations needn’t just be based on innate dispositions. For example, in an evaluative conditioning paradigm using odors as USs and faces as CSs, the evaluative conditioning only commenced when the odors were interpreted as plausibly human (Todrank et al. 1995). But “plausibly human” included learned information (such as the odors associated with soap). When the odors were typically associated with objects and not humans, no learning transpired. Additionally, there appear to be content-specific differences in associative learning at a greater level of abstraction: there is evidence that negative US/CS pairings are learned more quickly, and form stronger bonds, than positive US/CS pairings (Rozin 1986; Baeyens et al. 1990).

34. Blocking has been observed in humans (see Dickinson et al. 1984) but one needn’t delve into the empirical literature to feel the pull of the phenomenon. Imagine you’ve eaten an orange and immediately have an allergic reaction. If in your next meal you eat an orange and an apple and have the allergic reaction, you will be less likely to think the apple caused the reaction than you would were you to have never experienced the allergic reaction after eating the orange.

35. More problematically for associationists, blocking doesn’t always work, and when it fails isn’t predictable by associative theory. For example, if a weak odor is paired with a strong taste and the pairing is followed by gastrointestinal distress, the taste magnifies the sensitivity of the odor as a signal (Rusiniak et al. 1979). Relatedly, if a hawk eats a black mouse and gets sick, the hawk won’t just avoid black mice but will avoid all mice. However, if the black mouse tastes different from a white mouse, then the hawk will continue to eat white mice even after black mice make it sick (Brett et al. 1976).

36. Oddly enough, evaluative conditioning does not seem as sensitive to base rates or as susceptible to “occasion setting” as classical conditioning is (see De Houwer et al. 2001).

37. The problem metastasizes depending on how one interprets "location". For example, if the testing facility is in New Jersey, or the east coast, or on Earth, or in the Milky Way, why isn't that information also associated? Of course, the natural thing to say here is that the animal has the concept TESTING LOCATION but doesn't have the concept NEW JERSEY. This response is blocked off from behaviorists, but not associationists per se, though the latter still have to explain why these concepts remain unlearned.

38. The more one looks into how locational properties become associated, the more problems seem to mount. For example, if a rat has a strong preference for a particular drink but gets shocked while ingesting that drink, the rat will not change its preference for the flavor. Instead, the rat will just learn to avoid the drink when it encounters it in the experimental location. But when the rat is given a chance to ingest the drink anywhere else (e.g., back in its home cage) it will still continue to ingest the drink. Furthermore, in the case where the rat gets shocked while drinking the highly desirable flavor in the Skinner box on trial N, the rat will increase how much of the drink it will intake on trial N+1. This is a reasonable strategy, one that seems to indicate rational thought: assuming that one knows they are going to get shocked, they might as well intake as much as possible while getting shocked. For more on these points, see Garcia et al. (1970).

39. In other versions of the problem it is understood as the problem the organism faces in trying to figure out which of its behaviors produced the environmental change that interests the organism. It also appears in problems in Artificial Intelligence (see Minsky 1963).

40. For a pure associationist, one would phrase this as the organism learning to associate the lack of CS with the US. How the pure associationist analyzes the absence of a CS while using only associative structures can also be a tricky issue.
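
The blocking phenomenon described in note 34 can be illustrated with a minimal sketch of the Rescorla-Wagner update rule mentioned in note 7, in which every cue present on a trial changes its associative strength in proportion to a shared prediction error. This is only an illustration under stated assumptions, not a reconstruction of any experiment cited here: the learning rate, the asymptote, the trial counts, and the function names are all hypothetical choices made for exposition.

```python
# Minimal Rescorla-Wagner sketch of blocking.
# All parameter values and names are illustrative assumptions, not fitted to data.


def rescorla_wagner_trial(strengths, present, us_present, rate=0.3, lam=1.0):
    """Update associative strengths for the cues present on one trial.

    strengths  -- dict mapping cue name to its current associative strength
    present    -- cue names presented on this trial
    us_present -- True if the US (here, the allergic reaction) occurs
    rate       -- combined learning-rate parameter (assumed value)
    lam        -- maximum strength the US can support (assumed value)
    """
    target = lam if us_present else 0.0
    # The prediction error is shared by all cues present on the trial.
    error = target - sum(strengths[cue] for cue in present)
    for cue in present:
        strengths[cue] += rate * error
    return strengths


def train(phases, cues=("orange", "apple")):
    """Run a list of (present_cues, us_present, n_trials) phases from scratch."""
    strengths = {cue: 0.0 for cue in cues}
    for present, us_present, n_trials in phases:
        for _ in range(n_trials):
            rescorla_wagner_trial(strengths, present, us_present)
    return strengths


# Blocking sequence: orange alone is followed by the reaction, then orange + apple.
blocked = train([(("orange",), True, 20), (("orange", "apple"), True, 20)])

# Control sequence: only the orange + apple meals, with no prior orange-alone phase.
control = train([(("orange", "apple"), True, 20)])

print(f"apple strength after prior orange learning: {blocked['apple']:.3f}")
print(f"apple strength without prior learning:      {control['apple']:.3f}")
```

Under these assumed parameters the apple ends with a strength near zero in the blocking sequence and roughly half the asymptote in the control sequence: because the orange already predicts the reaction, little prediction error is left over for the apple to absorb.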

Copyright © 2020 by
Eric Mandelbaum <eric.mandelbaum@gmail.com>
