Labour's Memory. Digitization of annual and financial reports of blue-collar worker unions 1880-2020
The labour movement has played a crucial role in the development of Swedish democracy. Trade unions were especially important in building the welfare state and in shaping the structure of the labour market. The Swedish trade union federations and their confederation, Landsorganisationen, also play an important role internationally. Their archives offer a unique potential to create comprehensive time series on the mobilization and organization of the largest organization in Sweden’s civil society. In order to accomplish this goal, the material needs to be digitized and made searchable.
This project will create a collection of digitised annual and financial reports from local, regional, national and international trade union organisations from 1880 to 2020. The digitised material will be indexed and made searchable in collaboration with labour history experts and computational linguists. The database will be developed, stored and made available for researchers by the Swedish Labour Movement’s Archive and Library, the Popular Movements’ Archive in Uppsala, the Archiv der sozialen Demokratie and the International Institute of Social History. These four archival institutions will make the data easily and permanently available to the research community. Combining holdings enables us to create a unique infrastructure that connects the local, regional, national and international levels.
Final report
Final Report, Labour’s Memory (IN20-0040)
PURPOSE & DEVELOPMENT
The aim of this infrastructure project was to make historical source material from the trade union movement accessible, ranging from local organisations in Sweden to global federations. The number of organisations within this social movement is vast, and the archival material is complex. Through this infrastructure, we are creating access points to this material and enabling systematic searches that were previously unfeasible. These open up opportunities for both new research questions and access to historical source documents for trade union members and the general public. Furthermore, the aim has been to develop archival institutions’ workflows for the interpretation of handwritten text using language models (Large Language Models – LLMs) and computational linguistics methods for search optimisation.
PROJECT RESULTS TO DATE
Four archival institutions with varying technical capabilities participated in the project. The International Institute of Social History (IISH) in Amsterdam and the Archiv der sozialen Demokratie (AdSD) in Bonn both had workflows for digitising and making archive material accessible integrated into their operations. The Labour Movement Archive and Library (ARAB) in Huddinge and the Folkrörelsearkivet för Uppsala län (FAC) in Uppsala lacked such workflows when the project began. One result is therefore that digitisation and online publication are now part of the day-to-day work of these two institutions. Technical equipment, including workstations, servers, and software for processing the material using Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR), packaging and archiving it, and publishing it on the internet, is now in place.
The project also developed methods and software for handwritten text recognition (HTR) of image material, as well as lemmatisation and named entity recognition in large text corpora. These results have been presented at conferences and, to some extent, published. The open-source code is published on GitHub.
Furthermore, the project has resulted in a large number of digitised annual reports from the trade union movement. IISH and AdSD, whose archives hold material from international trade union organisations, have digitised all the material they intended to contribute to the project. They have published parts of it and will complete this final phase in spring 2026. FAC, which holds the Swedish local and regional material from Uppsala County, has digitised (re-photographed) all the material identified at the start of the project. The handwritten material has been transcribed and published; the typewritten and stencilled material is being transcribed on an ongoing basis. ARAB, which holds the archives of the Swedish trade union organisations at the national level, has focused on digitising and publishing material from a selection of organisations to broaden the collection's scope. Work on finalising all the material in ARAB’s collections will continue on an ongoing basis, as part of the institution’s activities.
The platform laboursmemory.org, which publishes material from the four archives, along with search tools and instructions, was launched on 7 November 2025.
USAGE & INITIATED RESEARCH
The research initiated to date using the infrastructure has primarily concerned methods for manuscript interpretation and computational linguistic methods for language normalisation, as well as Named Entity Recognition (NER). User studies have also been conducted to assess the infrastructure’s potential user groups. Work on the project has been presented at conferences on an ongoing basis, partly to inform potential users of the infrastructure’s existence and partly to gather feedback to fine-tune it. In addition, project participants have published articles, and a number of master’s-level student projects have been completed. Traffic measurements for the web service since its launch show that the material is being read by users. Users also contact the project about technical issues and requests for improvement.
PROBLEMS AND DEVIATIONS FROM THE PLAN
Work on the project got properly underway in 2021, but had to be rescheduled due to the COVID-19 pandemic. The initial workshops planned to bring the project group together and develop common working methods had to be held online. This meant that the project initially suffered from a degree of silo working. Project participants worked within their areas of expertise, but coordination between them suffered as joint meetings could not be arranged as planned, and there was not yet much experience of working online. This improved over time, but it initially affected the project work.
The work of scanning and processing the image material has required more staff resources than was planned in the application. Furthermore, personal circumstances, such as sickness and parental leave, led to a reallocation of resources within the project from the budget item ‘Salaries/LKP’ to ‘Operations’. Additionally, greater input from technical consultants was required to develop the server solutions we chose, which also affected the costs for ‘Operations’. However, this has not affected the project's actual objective. The material has been digitised, the infrastructure is in place and is being used.
The solution involving an IIIF viewer on a shared platform (Omeka) that calls upon local IIIF servers at the archives was not envisaged in the project application, which was based on local e-archiving and the delivery of display copies to a shared platform. The chosen solution emerged during the course of the work and proved to be the better one. It decentralises responsibility for what is presented to the individual archives that own the material or are accountable to the depositors. At the same time, it is a flexible solution that allows more archival institutions to join, provided they manage their own IIIF server.
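The decentralised arrangement described above can be illustrated with a minimal Python sketch of how a shared viewer might address images held on each archive's own IIIF server. The server base URLs are those listed in this report; the path prefix `iiif/image` and the identifier used in the usage note are hypothetical and are not taken from the project's actual configuration. The URL pattern itself follows the IIIF Image API (identifier, region, size, rotation, quality, format).

```python
from urllib.parse import quote

# Base URLs of the local IIIF servers named in this report.
IIIF_SERVERS = {
    "ARAB": "https://iiif.arbark.se",
    "FAC": "https://iiif.fauppsala.se",
}

# ASSUMPTION: the path prefix is hypothetical and used for illustration only.
HYPOTHETICAL_PREFIX = "iiif/image"


def image_url(archive, identifier, region="full", size="max",
              rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API URL pointing at the owning archive's server."""
    base = IIIF_SERVERS[archive]
    # The Image API requires the identifier to be percent-encoded,
    # including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return (f"{base}/{HYPOTHETICAL_PREFIX}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


def info_url(archive, identifier):
    """Build the URL of the image's info.json description document."""
    base = IIIF_SERVERS[archive]
    encoded_id = quote(identifier, safe="")
    return f"{base}/{HYPOTHETICAL_PREFIX}/{encoded_id}/info.json"
```

Under these assumptions, `image_url("FAC", "example-report/page-1")` would resolve against FAC's server only. Adding a further institution amounts to adding one more entry to the server table, which mirrors the point that new archives can join as long as they run their own IIIF server.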
One methodological problem that had, admittedly, been foreseen was the interpretation of the GDPR by individual archives and the handling of sensitive personal data (including information on trade union membership). This remained an issue longer than expected, as the National Archives’ clarifying regulation was issued only in the final phase of the project in autumn 2024 (RA-FS 2024:8) and is still being implemented across the archives. As a precaution, a 90-year time limit was set for the publication of clubs’ and branches’ annual reports on laboursmemory.org, and 70 years for national-level unions. However, the more recent material will be accessible via a login account for those who have been granted permission to view it. FAC exports to the Enskilda e-arkivet platform and ARAB is establishing a similar solution. The international material has been assessed by the respective responsible archival institution as not containing data subject to the GDPR and has therefore been published in its entirety.
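The publication time limits described above can be expressed as a simple rule. The following Python sketch is an illustration only: the 90-year and 70-year limits come from this report, while the way the age is counted (whole calendar years from the year the report covers, with the limit itself counting as open), the level labels `"local"` and `"national"`, and the function names are assumptions made for the example.

```python
from datetime import date

# Time limits in years, as stated in the report:
# clubs' and branches' annual reports (here "local") and national-level unions.
TIME_LIMIT_YEARS = {"local": 90, "national": 70}


def is_openly_publishable(report_year, level, today=None):
    """Return True if a report is old enough to be published openly.

    ASSUMPTION: age is counted in whole calendar years from the report year,
    and a report exactly at the limit is treated as open.
    """
    if today is None:
        today = date.today()
    return today.year - report_year >= TIME_LIMIT_YEARS[level]


def access_mode(report_year, level, today=None):
    """Return 'open' or 'login', mirroring the two access routes described."""
    return "open" if is_openly_publishable(report_year, level, today) else "login"
```

With the launch date of 7 November 2025 as the reference date, a branch report from 1935 would be classed as open and one from 1936 as login-only under these assumptions. The international material is outside this rule, since the responsible institutions assessed it as not containing data subject to the GDPR.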
INTEGRATION INTO THE ORGANISATION AND LONG-TERM MAINTENANCE
The work of digitisation, archiving and distribution of archive packages is now an integral part of the participating archival institutions’ operations. Each institution is responsible for its own collection, and through the IIIF standard (International Image Interoperability Framework, a standard for the transport and display of images on the web), it is made available on the web service laboursmemory.org. Through IIIF, the digitised material can also be made available in other contexts if the institutions so wish. All material is produced and published in accordance with the FAIR principles for open data. This ensures the project’s long-term viability.
The web service, containing the collection of material from the four participating institutions, is operated and updated by ARAB and is now part of the institution’s long-term work to make archival material available for research.
OPEN ACCESS & AVAILABILITY
The digital material and all metadata are published openly on laboursmemory.org. The platform is built on the open-source software Omeka S. The archiving and publishing system used by ARAB and FAC is the open-source software Archivematica. The IIIF server used to distribute the archive packages is Archival-IIIF, which is also open-source. LMming, the in-house developed software for metadata management, language normalisation and NER processes, is open-source and published on GitHub.
INTERNATIONAL COLLABORATIONS
From the outset, the project was an international collaboration, with both the IISH in Amsterdam and AdsD in Bonn forming part of the project group. Furthermore, during the course of the work, important lessons have been learned from the collaboration that ARAB has with other labour movement archives around the world, through the International Association of Labour History Institutions (IALHI). This resulted, amongst other things, in two of the archivists participating in the project being able to take part in a knowledge exchange with the Amsab-Instituut voor Sociale Geschiedenis in Ghent on behalf of the project.
PUBLICATIONS
See attached list of publications
LINKS TO WEB SERVICES DEVELOPED BY THE PROJECT
www.laboursmemory.org (The published collection and search service)
https://backstage.laboursmemory.org (user resource for ARAB and FAC staff)
https://iiif.arbark.se (ARAB’s IIIF server)
https://iiif.fauppsala.se (FAC’s IIIF server)