A complete dataset of roll-call votes in the Swedish parliament, 1925-2022
The purpose of the project is to make all roll-call votes in the Swedish Parliament from 1925 to 1993 accessible to researchers and to the public. Today, roll-call votes are available via the parliament's website for all yearly sessions since 1993/94. Roll-call votes before 1993 are kept in formats that make systematic analysis impossible. For the bicameral parliament (until 1970), we will register roll-call votes at the individual level in both the lower and upper house. From 1971, we will do the same for the unicameral parliament. We will also collect individual information about the sex, year of birth and partisanship of the members of parliament, as well as specific information for each vote. For the roll-call votes, we will use the Parliament Library's digital archive, where votes are stored in four different formats, and use computer-assisted MM and OCR techniques to compile the information. Data about the members of the Riksdag will come from biographical dictionaries and from the Riksdag's database "Rixlex". When this project is completed, researchers, journalists and the general public will get answers to questions about Swedish political history that have not previously been possible to answer. For example, it will be possible to study the government's agenda-setting power in the legislative process under different political conditions. Moreover, we will get better answers to questions about party discipline, ideological ideal positions, and polarization.
Final report
Purpose and Development of the Infrastructure
Each year, the Swedish Parliament (Riksdagen) makes approximately 2,500 decisions, about 25% of which follow a roll-call vote. Since 1925, each individual MP's vote—yes, no, or abstention, as well as whether the MP was absent—has been publicly recorded. Today, voting data at the individual MP level is available via the Parliament's website for the parliamentary sessions from 1993/94 onwards. However, voting records prior to 1993 are stored in formats that make systematic analysis impossible.
The purpose of this infrastructure project has been to make data from all votes in the Swedish Parliament between 1925 and 1992/93 accessible to the research community and the general public. In addition to data on how each individual MP voted, the project aimed to compile data on the subject of the vote (committee reports, etc.), as well as information on party affiliation, gender, and birth year of each MP.
Results Achieved and Analysis
In our application, we outlined four different "work packages" (WPs). The first WP involved manually encoding votes from the period 1925-1934, as these were recorded in handwritten ledgers unsuitable for systematic data reading. This task has been completed. The second WP involved collecting data on party affiliation, gender, birth year, and seat/bench numbers for all MPs from 1925 to 1993. This work is ongoing in collaboration with the RJ-funded infrastructure project Swerik and has been completed for the period 1971-1993. The third WP aimed to convert votes from 1935-1993 into database format through computer-assisted automated reading. This task was divided into three sub-packages: (a) OCR reading for the period 1983-1993, which has been completed; (b) machine reading of photographed voting boards from 1935-1983, which was outsourced to the external contractor Chalmers Teknologkonsulter AB (CTK); and (c) manual correction of errors in the automated data reading. Resources only allowed for these manual corrections to be carried out for the period 1971-1993. Finally, the fourth WP focused on validating and compiling the data. Validation based on sample data has already been performed for the 1971-1993 data with good results (see below).
In summary, the above means that we will soon be able to publish the complete roll-call data for the period 1971-1993, providing a comprehensive database for all votes in the unicameral Parliament. In total, we have compiled approximately 8,000,000 individual votes from about 22,000 roll-calls, along with metadata on what the votes concerned (committee reports, etc.). The voting data for the period 1925-1970 (the bicameral period) remains incomplete at the individual MP level due to common reading errors or incomplete linkage to individual data on party affiliation, gender, and birth year. However, a version of the bicameral votes containing totals for yes, no, abstentions, and absentees, as well as information on the subject of the vote, will also be published soon.
Use of the Infrastructure and Research Initiated
Since the infrastructure has not yet been made publicly available, no research has yet been conducted using it. However, two project participants have applied for additional funding based on the use of this data. Teorell has already been granted funding from the Swedish Research Council for a comparative study on how different institutional conditions influence legislative work in parliamentary democracies, while Holmgren is currently applying for funding to investigate the effects of government power on the legislative process in Sweden. Neither of these research proposals would have been possible without the preparatory work carried out under this infrastructure project, and we expect that the voting data will form the basis for many more future projects, both by us and other researchers.
Unforeseen Technical and Methodological Problems and Deviations from the Original Plan
Two of the project's work packages (the second and third) faced greater technical and methodological challenges than expected. Firstly, automated data reading, both from text (1983-1993) and from photographed voting boards (1935-1983), resulted in a high error rate. Based on random sampling, the error rate was as high as 14-15% for 1935-1970, around 4.5% for 1971-1983, and approximately 2.4% for 1983-1993. These figures refer to the percentage of individual votes with errors, which translates into a large number of misclassified votes. For instance, between 1971 and 1983, the Parliament conducted approximately 13,000 votes. With 349-350 MPs voting in each, this resulted in about 4,500,000 individual votes. A 4.5% error rate would thus amount to around 200,000 misclassified votes in this period alone.
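The arithmetic behind this estimate can be checked with a short sketch. The inputs are the approximate figures quoted in the paragraph above (using 349 MPs per vote as the lower of the two seat counts); the function name is purely illustrative and not part of the project's tooling.

```python
def expected_misclassified(roll_calls: int, mps_per_vote: int, error_rate: float):
    """Return (individual votes, expected misclassified votes).

    Illustrative helper only; all inputs are approximations from the report.
    """
    individual_votes = roll_calls * mps_per_vote
    return individual_votes, individual_votes * error_rate


# Approximate figures for 1971-1983 as quoted in the report.
votes, errors = expected_misclassified(13_000, 349, 0.045)
print(f"{votes:,} individual votes, about {errors:,.0f} misclassified")
```

With these inputs the product is roughly 4.5 million individual votes and just over 200,000 expected errors, in line with the rounded numbers in the text.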
To correct these errors, we had to switch to manual correction using two different methods. For 1983-1993, where vote totals could not be OCR-read, we estimated how the majority of a party's MPs voted and manually checked any votes that deviated from this "party line." A new sample test after applying this method showed an error rate of only about 0.13%, with an approximate 95% confidence interval of 0.03%. For 1971-1983, where known vote totals were available, votes with significant discrepancies were manually reviewed in their entirety, while smaller discrepancies were corrected using the same party-line method. A new sample test after applying this method showed zero (0) remaining errors, though the 95% confidence interval extends up to 0.33%.
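The two checks described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data layout (one tuple of MP identifier, party, and vote per MP), the function names, and the choice to ignore absences when estimating the party line are all assumptions made for the example.

```python
from collections import Counter, defaultdict

ABSENT = "absent"  # assumed label; the report does not specify an encoding


def flag_party_line_deviations(votes):
    """Return the MP ids whose vote differs from their party's modal vote.

    `votes` is a list of (mp_id, party, vote) tuples for ONE roll call.
    Assumption: absences are neither counted towards the party line nor flagged.
    Ties are broken by whichever option was encountered first.
    """
    by_party = defaultdict(list)
    for _, party, vote in votes:
        if vote != ABSENT:
            by_party[party].append(vote)
    party_line = {p: Counter(v).most_common(1)[0][0] for p, v in by_party.items()}
    return [
        mp_id
        for mp_id, party, vote in votes
        if vote != ABSENT and vote != party_line[party]
    ]


def totals_discrepancy(votes, known_totals):
    """Sum of absolute differences between counted and known vote totals."""
    counted = Counter(vote for _, _, vote in votes)
    options = set(counted) | set(known_totals)
    return sum(abs(counted.get(o, 0) - known_totals.get(o, 0)) for o in options)


# Hypothetical roll call with two parties, for illustration only.
roll_call = [
    ("a", "S", "yes"), ("b", "S", "yes"), ("c", "S", "no"),
    ("d", "M", "no"), ("e", "M", "no"), ("f", "M", ABSENT),
]
print(flag_party_line_deviations(roll_call))
print(totals_discrepancy(roll_call, {"yes": 2, "no": 3, ABSENT: 1}))
```

In a workflow like the one described, flagged MPs would be queued for manual inspection rather than corrected automatically, and a large totals discrepancy would trigger a full manual review of that roll call.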
The second unexpected challenge was systematically linking the Swerik MP database, which contains information on party affiliation, gender, and birth year, to seat/bench numbers. These seat/bench numbers were only partially available in database format on the Parliament's website and were incomplete regarding substitutes and mid-term changes. A significant manual effort was required to enter and correct this seating data, meaning resources only sufficed for the 1971-1993 period (the unicameral era).
Availability of the Infrastructure and Open Science Compliance
The plan remains to make the complete voting data for the unicameral period (1971-1993) available through the Parliament's open data platform. The incomplete voting data for 1925-1970 (the bicameral period) will instead be published on a project website at Stockholm University (see below). All data collected within this project will be made freely available to the public.
Integration Within the Organization and Long-term Maintenance
The data for the unicameral period is considered complete and requires no future maintenance beyond possible future error corrections. How this will be managed is to be discussed with those responsible for the Parliament's open data. Completing the data for the bicameral period will require new funding and manual corrections, including linking to MP data via seat/bench numbers, following the same model used for the unicameral period. However, it is worth noting that the Parliament was significantly less active during the bicameral period. For example, the period 1935-1970 covers only about 2,500,000 individual votes, whereas 1971-1993 covers approximately 8,000,000 votes. Consequently, the remaining correction work is significantly smaller in scale and should be manageable within a standard research project.
International Collaborations
None at present.
Publications Resulting from Research Using the Infrastructure
None at present.
Links to Project Websites
https://www.su.se/english/research/research-projects/h-data/datasets-1.610144