learning under all theoretical assumptions (Yu & Smith, 2012). The prior empirical study
used a “looking while listening” paradigm in which infants were presented with a series of
visual scenes and co-occurring words as illustrated in Figure 1. On one trial, the infant might
hear the words “regli” and “toma” in the context of seeing object a and object b. Without
other information, the learner cannot decide between the hypothesis that “regli” refers to
object a and “toma” to object b and the hypothesis that “regli” refers to b and “toma” to a.
However, if the next trial presents objects b and c in the context of the
words “regli” and “gasser,” and if the learner can remember the co-occurrences from trial to trial
and can combine the conditional probabilities of co-occurrences across trials, the learner
could be more certain that “regli” refers to object b, because b is the only candidate referent
that has co-occurred with “regli” on both trials. In the first experiment using this method
(Smith & Yu, 2008), 12- and 14-month-old infants were presented with a randomly ordered
stream of 30 such trials with 6 objects and 6 words to be learned across the trials. At the end
of this experience, infants were tested: two visual objects were presented in the context of
one spoken word and looking time was measured. The results showed that 12- and
14-month-old infants looked more to the correct referent than to the foil. To do this, they must have
attended to, stored and statistically evaluated the information across the individually
ambiguous training trials.
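The two-trial example above can be made concrete with a minimal sketch. It assumes the simplest possible learner, one that only tallies word-object co-occurrences and picks the most frequent pairing; this is an illustration, not the model used in the cited studies, and the function names `cooccurrence_counts` and `best_referents` are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(trials):
    """Count how often each word is heard while each object is in view."""
    counts = Counter()
    for words, objects in trials:
        # Within one ambiguous trial, every word co-occurs with every object.
        for word, obj in product(words, objects):
            counts[(word, obj)] += 1
    return counts


def best_referents(counts):
    """For each word, list the objects tied for the highest count."""
    by_word = {}
    for (word, obj), n in counts.items():
        by_word.setdefault(word, {})[obj] = n
    return {
        word: sorted(o for o, n in objs.items() if n == max(objs.values()))
        for word, objs in by_word.items()
    }


# The two trials described in the text (assumed encoding: words, objects).
trials = [
    ({"regli", "toma"}, {"a", "b"}),
    ({"regli", "gasser"}, {"b", "c"}),
]
counts = cooccurrence_counts(trials)
referents = best_referents(counts)
print(referents["regli"])  # only object b co-occurred with "regli" twice
```

Under this simplified counting rule, “regli” is resolved to object b after the second trial, whereas “toma” and “gasser” each remain ambiguous between two objects until further trials arrive.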
Yu and Smith (2010) added eye-tracking methodology and in this way tracked learning as it
occurred, examining the object to which the infant attended when each word was heard
during the ambiguous training trials. This method revealed marked individual differences in
looking behavior that were strongly related to whether or not individual infants learned the
underlying correspondences. At the beginning of training, looking was similar for all infants,
with many rapid shifts of attention from one object to the other within a trial and little
systematicity. Diffuse looking is potentially relevant to statistical learning, since infants
might benefit from an initially broad sampling of the data on the pairings. However, on later
training trials, the looking patterns of infants who actually learned the word-referent
associations as measured at test became more focused and different from those of
nonlearners. More specifically, by the middle of the training trials, the learners’ looking
patterns were systematic, selective, and sustained on individual objects, and they were often,
though not always, directed toward the correct referent for the just-heard word.
However, the learners’ attention, at least as the learning trials progressed, became more
controlled by the heard words, whereas the nonlearners’ looking
behavior did not. Looking at an object in the context of a heard word is both the means
through which infants pick up information about the word-object correspondences and also
the behavior experimentalists use to measure that learning. Because the differences in
looking behavior during the training emerged across those trials, these differences most
likely reflect differences in what infants had learned from the early trials about the word-
referent correspondences. However, because this early learning organizes visual attention
within trials, it may be essential to learning during later trials, for example, to the correction
of spurious correlations, and thus to the overall success of statistical learning.
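The contrast between diffuse and sustained looking can be sketched with two simple per-trial summaries. This is only an illustration under stated assumptions: the gaze encoding (one object label per eye-tracker sample, `None` for looks away), the function name `looking_summary`, and the two measures are hypothetical and are not the measures reported in the cited studies.

```python
from itertools import groupby


def looking_summary(samples):
    """Summarize one trial's gaze samples.

    samples: sequence of object labels, one per eye-tracker sample,
    with None marking samples on neither object (assumed encoding).
    """
    # Collapse consecutive identical samples into (label, length) looks.
    looks = [(label, sum(1 for _ in run)) for label, run in groupby(samples)]
    object_looks = [(label, n) for label, n in looks if label is not None]
    # A shift is a change of fixated object between successive object looks.
    shifts = sum(
        1
        for (prev, _), (cur, _) in zip(object_looks, object_looks[1:])
        if prev != cur
    )
    longest = max((n for _, n in object_looks), default=0)
    return {"shifts": shifts, "longest_look": longest}


# Made-up sample sequences for illustration only.
diffuse = looking_summary(["a", "b", "a", "b", "b", "a"])
focused = looking_summary(["a", "b", "b", "b", "b", "b"])
print(diffuse, focused)
```

In this toy encoding, a trial with many rapid shifts yields a high shift count and a short longest look, while a trial with sustained attention to one object yields the reverse.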
Importantly, both the infants who ultimately learned the correspondences and those who did
not looked at the objects on all trials, but the looking behavior was different. This fact
suggests that looking and listening is not enough to ensure statistical learning and raises the
possibility that different forms of visual attention are differentially supportive of statistical
word-referent learning. Recent advances in both theory and research suggest fundamentally
different forms of attention (see Talsma, Senkowski, Soto-Faraco & Woldorff, 2010, for
review) that operate over different time scales (see Smith, Colunga & Yoshida, 2010, for
review) and that support different cognitive functions (see Talsma et al., 2010; Wright &
Ward, 2008). In particular, studies of both adults (e.g., Fiebelkorn, Foxe & Molholm, 2010)
and infants (Wu & Kirkham, 2010; Benitez & Smith, 2012) suggest that association-based
Smith and Yu
Lang Learn Dev. Author manuscript; available in PMC 2014 January 06.