Learning to focus: How Stockholm and Skåne Swedish children produce and comprehend contrastive intonation
People use prosody – the melody and rhythm of speech – in order to highlight the most important part (focus) of an utterance, and listeners rely on prosody in order to process and comprehend a message. Prosodic focusing takes different forms in different languages or dialects, and this project investigates effects of such differences on children’s development toward adult mastery of focus prosody.
In this project we center on the relation between how a child produces focus prosody and how the child can make use of it in speech comprehension. Is one of these skills acquired before the other? And is the acquisition of these skills in some way influenced by the melodic shape of the focus prosody in a particular language variety? The project will add central missing pieces to our general understanding of how properties of the input affect native language acquisition. Phonological properties in particular have so far received only limited attention in this ongoing discussion.
We will elicit and analyze speech recordings from three- and five-year-old children (and adult controls) speaking Stockholm or Skåne Swedish, and test the same children’s (and adults’) comprehension of focus prosody in their respective variety, using the visual world eye tracking paradigm. Comparing Stockholm and Skåne Swedish makes a particularly good test case because the two varieties differ in prosodic typology with respect to the focus tone, while keeping other important linguistic features constant.
Final report
In this basic research project, we investigated Swedish preschool children’s ability to pronounce words with contrastive focal stress and to interpret such contrastive stress in the spoken language they hear. By contrastive focal stress (or focus prosody), we refer to the prosodic (i.e., melodic and rhythmic) means that speakers use to emphasize a word contrastively, as in phrases like “the GREEN ball” (“not the RED one”). The project aimed to deepen our understanding of children’s first language acquisition—particularly their ability to express and understand information structure through prosodic means. Our focus was on (1) the relationship between speech production and comprehension and (2) how prosodic-typological factors can influence the development of focus prosody in production and comprehension. Prosodic-typological factors refer to language- or dialect-specific ways of using prosodic cues.
To achieve this goal, we planned to record and analyze the speech of three- to five-year-old children speaking either the Scanian or Stockholm dialect and, using eye-tracking, examine how these children perceive focus prosody in their respective Swedish varieties. An adult control group for each dialect was also included. Comparing Stockholm and Scanian dialects is particularly relevant here, as they differ in prosodic typology concerning focus prosody while sharing other essential linguistic features. In short, we know that focus prosody in the Stockholm dialect is realized through a distinct melodic pattern that creates a categorical contrast between a focused and an unfocused word. In contrast, focus prosody in Scanian has been described as a more gradual phenomenon: a focused word is pronounced with essentially the same melodic pattern as an unfocused word but with a stronger acoustic realization, such as increased pitch range.
We planned two experiments for each participant (one speech production and one speech comprehension experiment). As an initial step, these experiments had to be developed in detail. This resulted in an improvement compared to the original project plan, as we replaced the initially sketched picture book setup for the production test with a game involving packing a (physical) toy suitcase with objects (picture cards) displayed on a screen, which the child had to describe to the experiment leader (e.g., “a green ball and a green lamp”). The comprehension experiment was developed as planned as an eye-tracking experiment, illustrated below in connection with the results. Additionally, children were tested with an existing language development test (The New Reynell Developmental Language Scales, NRDLS) to relate our experimental results to the children's language development level in both comprehension and production. Choosing NRDLS was another development from the original project plan, which initially intended to use two separate, independent tests for comprehension and production. NRDLS provides both in a cohesive test package with an official Swedish version.
Another instrument used was a questionnaire developed during the project's initial phase. This questionnaire focused primarily on participants' linguistic backgrounds (with different versions for children and adults). It was used for recruitment and will also be used in the analysis and to describe the participant group. It was also decided that data collection with children would take place in preschools. This had not been planned at the project's approval stage, but it was deemed logistically simpler than inviting children and guardians to a lab.
Nevertheless, the project's implementation was logistically challenging in several respects. The experiments and language tests took up to about 90 minutes per child, requiring two to four meetings per child during preschool hours, often spread over several days. The project was also organizationally demanding, as it involved data collection in two regions of Sweden, and recruiting preschool children as participants required multiple stages of written and oral contact with preschool management, staff, and guardians. This process typically took several weeks per preschool. Furthermore, the COVID-19 pandemic, which broke out during the recruitment of the first preschools, initially halted data collection. Later in 2020, we attempted to recruit participants outside the preschool setting, which proved difficult. However, during this attempt, we collected data from a few adult participants and refined the data collection procedure for children. The pandemic continued for an extended period, and even after it subsided, recruitment remained challenging, causing data collection to extend from 2022 to 2024. In addition to preschool collaborations, adult participants were recruited via social media and recruitment platforms, with data collection primarily taking place at Stockholm University and Lund University.
Despite these challenges, we successfully collected approximately the amount of data we had planned (a total of 130 children from 13 preschools and 48 adult participants). However, due to the significant delays caused by COVID-19, not all data have been analyzed at the time of writing. For the same reason, we have not yet published any results in scientific articles. Publication of preliminary analyses would not have been appropriate for this project, as all planned analyses must be based on a complete dataset for at least one of the investigated dialects, and no dataset was completed early enough. However, analysis has been ongoing alongside data collection and will continue after this final report, with results to be published in article form in due course.
We found our collaboration with preschools to be highly rewarding. We encountered great interest and willingness to participate in our research project, along with strong support from staff during implementation. To frame this collaboration as a mutually beneficial exchange, we offered to engage in preschool development work, for example, through lectures or professional development sessions for staff. For ethical reasons, we only proposed this opportunity after the preschool had agreed to participate in the study. In total, the project members based in Sweden conducted five short training sessions for preschool staff.
One preliminary result that has already emerged is that adult participants exhibit the same comprehension patterns in the eye-tracking data, regardless of dialect. For example, both Scanian and Stockholm-dialect-speaking participants are misled by an inappropriate focus prosody. In one of the four experimental conditions in the eye-tracking experiment, participants first heard a question like, “Where is the white lamp?” (six images were displayed on the screen, and the participant pointed to the correct one). Then, they heard the question, “And where is the GREEN ball?” with contrastive focus on “GREEN,” followed by the unexpected noun “ball.” If a participant correctly interprets this contrastive focus, they will typically anticipate that the next word will be “lamp,” just as in the first question (“white lamp” – “green lamp”). The expectation is that if focus prosody is interpreted “correctly,” it will lead to an eye movement toward the image of a green lamp, followed by a correction to the green ball, which is actually mentioned in the question (but is contextually inappropriate given the focal stress on the adjective). The fact that adult participants displayed this corrective eye movement after being misled by contrastive focus was expected, as contrastive focus prosody is assumed to exist in both examined dialects. This result thus also confirms the validity of the experimental design.
Regarding children, preliminary results suggest a difference between dialects: While children from the Stockholm region at around five years old exhibit comprehension patterns similar to those described for adults, and even 3- to 4-year-olds show signs of developing this pattern, no such signs are yet seen in Scanian-speaking children, even at age five. This may indicate that language- or dialect-specific phonological features (in this case, the phonetic realization of focus prosody) can develop at different rates and may influence how early in development children can interpret contrast and, consequently, information structure marked by prosody.
For speech production, a preliminary analysis of adult data revealed an unexpected result: contrastive focus prosody in Scanian is not always realized as gradually as anticipated. Instead, a more distinct realization involving post-focal deaccentuation also occurs relatively frequently, resulting in a clearer contrast between the focused and unfocused words. In our preliminary analyses of children's speech production data, we have already observed examples of contrastive focus prosody in some of the youngest children, but the analyses do not yet allow conclusions regarding general patterns, dialectal differences, or the relationship between speech production and comprehension. In our continued analyses, we will also integrate children's NRDLS test scores. Preliminary findings from this project have been presented at several internationally recognized peer-reviewed conferences.