Sofie Johansson

Measuring second language vocabulary


An increasing number of students in Swedish schools today have first languages other than Swedish and may therefore have difficulty following instruction in Swedish. Textbooks and other written sources play an important role in learning at school, and knowledge of the vocabulary encountered in textbook texts is therefore crucial.


Many researchers even claim that mastering the school-related vocabulary appropriate to each school subject is the single most important factor for school success. As for vocabulary development, considerable differences have been observed between children from different socio-economic and language backgrounds. Lexical limitations may thus also be an impediment to learning for students with Swedish as their mother tongue.


The aim of this project is to develop web-based tests for the assessment of Swedish school-related academic vocabulary, covering receptive as well as productive language skills and both quantitative and qualitative aspects of lexical knowledge. Such tests are of great relevance for research on lexical development in first and second languages and also have important educational applications in the planning and provision of more needs-based, effective and systematic language training.


The lack of relevant diagnostic tools for school-related language assessment is a well-documented problem in Swedish schools, and the assessment of second language proficiency is not always carried out in a reliable way. The development of vocabulary tests for a more valid and reliable assessment of school-related vocabulary knowledge in Swedish schools is therefore urgent.
Final report

Scientific report of the project "Assessing the vocabulary in a second language" / "Mätning av ordförrådet i andraspråket" (MOA)
Reference number: P2008-0185:1-E

This project focuses on evidence-based assessment of school-related vocabulary. The project ran from 2009 to 2013, including a one-year extension. It is as relevant today as it was 10 years ago, and its accumulated knowledge has been further developed and has benefited similar research projects, namely Science and Literacy Teaching (SALT) and Content and Language Integration in Swedish Schools (CLISS), both funded by VR.

Purpose, background and development
The purpose of the project was to develop and test web-based tools for multidimensional diagnostic assessment of school-related vocabulary in the second language. The main research question was what lexical challenges second-language students face in the school context with regard to subject-neutral and subject-related vocabulary in eight school subjects and their subject texts in upper secondary school. Lexical challenges here refer to the knowledge a student possesses in terms of depth and breadth of vocabulary. The model of linguistic knowledge on which the project's work and approach are based is inspired by Nation (2013). Additional aspects explored are lexical networks and relationships between words, so-called lexical cohesion, inspired by Hoey (1991).
The results from administering the developed vocabulary tests showed several differences in performance between L1 students and L2 students. In addition, significant differences were found between boys and girls.

Implementation
The project was implemented in a number of steps. The empirical material used to create an evidence-based assessment of vocabulary in a school context was a corpus of textbook texts totaling just over 1 million words, created in the research project OrdiL (Lindberg & Johansson Kokkinakis eds. 2007). The texts represented teaching materials in eight different school subjects: geography, history, religion, social studies, mathematics, physics, chemistry and biology. From this corpus, representative words, both subject-neutral and subject-related, were extracted and compiled on the basis of how frequently they appeared in the texts and across the subjects.
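The extraction step described above can be illustrated with a minimal sketch. It is not the project's actual procedure: the function name classify_words, the thresholds min_freq and neutral_min_subjects, and the simple tokenizer are all assumptions made for illustration. The idea is that a word occurring in many subjects is treated as subject-neutral, while a sufficiently frequent word found in only one subject is treated as subject-related.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def classify_words(corpus, min_freq=2, neutral_min_subjects=3):
    """Split frequent words into subject-neutral and subject-related sets.

    corpus maps a subject name to its text. The thresholds are
    hypothetical; the report does not state which values were used.
    """
    per_subject = {subject: Counter(tokenize(text))
                   for subject, text in corpus.items()}

    # Total frequency of each word over the whole corpus.
    total = Counter()
    for counts in per_subject.values():
        total.update(counts)

    neutral = set()
    related = {}
    for word, freq in total.items():
        if freq < min_freq:
            continue  # too rare to be considered representative
        subjects = [s for s, counts in per_subject.items() if word in counts]
        if len(subjects) >= neutral_min_subjects:
            neutral.add(word)
        elif len(subjects) == 1:
            related.setdefault(subjects[0], set()).add(word)
        # Words found in a few subjects only are left unclassified here.
    return neutral, related
```

With a real corpus, the raw word forms would first be lemmatized and the thresholds tuned to the corpus size; both steps are omitted in this sketch.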
Five different tests and tools were created: 1) a frequency-based vocabulary test (FOT), which tests the size/breadth of the vocabulary; 2-3) two tests of vocabulary depth (TORD); 4) a tool for lexical profile analysis (VALP); and 5) a tool for lexical cohesion analysis (Collex). The last two measure productive quantitative aspects of lexical competence.
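As an illustration of what a lexical profile is, the following minimal sketch computes the share of tokens in a text that fall into each of a set of word lists. It is not the VALP implementation: the function name lexical_profile, the band names and the word lists are hypothetical, and a real profile would work on lemmatized text with corpus-derived lists.

```python
import re
from collections import Counter


def lexical_profile(text, bands):
    """Return the proportion of tokens in each band, plus "off-list".

    bands maps a band name to a set of lowercase words. A token is
    assigned to the first band (in insertion order) that contains it.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return {}

    counts = Counter()
    for token in tokens:
        for name, words in bands.items():
            if token in words:
                counts[name] += 1
                break
        else:
            # No band contained the token.
            counts["off-list"] += 1

    names = list(bands) + ["off-list"]
    return {name: counts[name] / len(tokens) for name in names}
```

A profile of a student text could then be compared with the profile of the textbook texts for the same subject, for example to see how much of the student's production draws on subject-related words.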
As the theoretical framework for describing what is being tested, how it is tested and how the results are to be interpreted, we have used the Read & Chapelle framework for second language vocabulary assessment (Read & Chapelle 2001).

The three most important results of the project are the following:
1) Word knowledge cannot be assessed from a single perspective; assessment should be multidimensional. Therefore, both the breadth and the depth of the vocabulary should be analyzed.
2) Word knowledge can be seen from different perspectives, namely receptive and productive ability and knowledge of the form, meaning and use of words (Nation 2013), which means that the dimensions described in the previous point can be seen as different types of skills within one or more dimensions.
3) There are no homogeneous groups of students, which is more relevant today than ever. The results may differ between boys and girls, between students who have another mother tongue and those who have Swedish as their mother tongue, and between students whose parents have higher education and those whose parents do not.

Conclusions drawn
Given today's heterogeneous school context, greater demands than ever before are placed on teachers in language and subject teaching to adapt the level to the student's knowledge. It is therefore important to use assessment instruments in the school context to help establish where the student is in his or her language and knowledge development.
Since language in society and in the school context is constantly changing, assessment instruments are perishable and need regular updating. In this project, the developed assessment instruments have therefore been updated over a long period, in terms of both content and appearance.
Access to empirical data in the form of corpora of teaching materials has been crucial for studying school-related language and vocabulary. The intention is to continue working in this direction with new updates.

New research questions
Three main issues have arisen during the project.
1) The project uses a model to describe lexical knowledge, within which the tests developed in the project can also be placed. The model consists of a matrix with two axes, each spanning two extremes: productive-receptive and quantitative-qualitative. These concepts can be defined in different ways, and there are probably additional aspects to add to such a model and matrix, e.g. development and change over time. The model could then be considered three-dimensional.
2) How vocabulary and lexical knowledge should be described is a very complex issue. What is a word? We have spent a lot of time defining our view of words and concepts. In collaboration with other researchers at both national and international level, we have noted that the concept of "word families" is often used internationally to group words that can be associated with each other, e.g. morphologically related words such as derivations and compounds, for instance when describing and calculating the size of an individual's vocabulary. We prefer a more detailed description of the word concept in terms of lemmas and lexemes, i.e. words in their base form that differ in word class or inflectional pattern (lemmas) or in meaning (lexemes). We have concluded that word families are not an appropriate unit for non-advanced learners of a language, since morphological knowledge cannot be assumed.
3) How can we collaborate better with teachers so that they can use assessment instruments in an efficient and time-saving way in their teaching? Teachers' constant lack of time and high workload are problematic when new methods and ways of working are to be introduced.
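The effect of the counting unit discussed in the second issue above can be shown with a small worked sketch. The word forms and the two mappings below are invented for illustration only and are not taken from the project's data. The same set of known word forms yields a vocabulary size of three when counted in lemmas, but only one when counted in word families.

```python
def vocabulary_size(known_words, unit_of):
    """Count the distinct units (e.g. lemmas or word families) known.

    unit_of maps a word form to its unit; unmapped forms count as
    their own unit.
    """
    return len({unit_of.get(word, word) for word in known_words})


# Hypothetical mappings for a handful of English word forms.
LEMMA_OF = {
    "develop": "develop",
    "develops": "develop",
    "developed": "develop",
    "development": "development",
    "developments": "development",
    "developer": "developer",
}
# A word family groups all morphologically related forms together.
FAMILY_OF = {form: "develop" for form in LEMMA_OF}

known = ["develops", "development", "developer"]
size_in_lemmas = vocabulary_size(known, LEMMA_OF)
size_in_families = vocabulary_size(known, FAMILY_OF)
```

Counting in word families credits a learner who knows "develops" with the whole family, which presupposes that the learner can relate "developer" and "development" to it; counting in lemmas makes no such assumption.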

Dissemination of research and results
The results of the project are used continuously in teaching at the undergraduate and postgraduate levels, both through our increased subject competence and through the empirically based assessment instruments themselves.
We have presented the project at a number of different national, Nordic and international conferences, with the following contributions:
•    Johansson Kokkinakis S., Lindberg I., Eurosla conference 2009, Cork, Ireland, “Cross-disciplinary and Disciplinary-specific vocabulary in Secondary School Textbooks” 
•    Lindberg I., Johansson Kokkinakis S., Den 9:e konferensen om Nordens språk som andraspråk, 2009, Helsingør, Denmark, ”Diagnostiskt test av skolrelaterat ordförråd” 
•    Lindberg I., ”Mätning av ordförrådet i andraspråket”, Symposium 2009, Genrer och funktionellt språk i teori och praktik, Stockholm University. 
•    Johansson Kokkinakis S., Carlund C., Forum för textforskning 5, Lund, 7-8 June 2010, ”Grundskolans ämnestexter, ordförståelse och frekvensbaserat ordförrådstest” 
•    Lindberg I., Johansson Kokkinakis S., “Identification of lexical cohesive ties in secondary school textbooks”, The 16th World Congress of Applied Linguistics (AILA), 2011. 

In May 2010, we arranged a workshop together with colleagues in Romance languages at Stockholm University with similar research interests. We each presented our contribution from the project. We had also invited four international researchers: Tom Cobb, Birgit Henriksen, Henrik Gyllstad and James Milton.
The workshop, ”Developing multidimensional methods for vocabulary assessment”, was held at Stockholm University on 20-21 May 2010. It was attended by many researchers and students, and we received a very good response, especially from the international guests. Presentations: 
•    Tom Cobb, Université du Québec à Montréal, “DDL 2.0: The challenge of homographs and frequent multi-words” 
•    Birgit Henriksen, University of Copenhagen, “Collocational knowledge - a challenge for learners but also for SLA researchers” 
•    Henrik Gyllstad, Lund University, ”L2 vocabulary knowledge constructs and assessment - what is the way forward?” 
•    James Milton, Swansea University, “Comparing aural and orthographic word recognition” 
•    Camilla Bardel, Stockholm University, “How many and which words are known at the CEFR levels?” 
•    Christina Lindqvist, Stockholm University, “Creating frequency bands for spoken French and Italian” 
•    Sofie Johansson, University of Gothenburg, “Lexical profiling as means for characterising subject-specific school book texts” 
•    Inger Lindberg, Stockholm University, “Assessing depth of vocabulary knowledge in secondary school” 

References
Hoey, M. (1991). Patterns of lexis in text. Oxford: Oxford University Press.
Lindberg, I., Johansson Kokkinakis, S., Järborg, J., & Holmegaard, M. (2007). OrdiL – en korpusbaserad kartläggning av ordförrådet i läromedel för grundskolans senare år. Rapport nr. ROSA 8.
Nation, I. S. P. (2013). Learning vocabulary in another language. Cambridge University Press.
Read, J., & Chapelle, C. A. (2001). A framework for second language vocabulary assessment. Language Testing, 18(1), 1-32.

Grant administrator
University of Gothenburg
Reference number
P2008-0185:1-E
Amount
SEK 3,745,000
Funding
RJ Projects
Subject
General Language Studies and Linguistics
Year
2008