Development and assessment of tests for education: Analysing PISA data with RStudio

PISA – Programme for International Student Assessment

The Organisation for Economic Co-operation and Development (OECD) commissions the PISA study every three years (“PISA,” n.d.). PISA is an international test of the reading, mathematics, and science skills of 15-year-olds (“PISA,” n.d.). Students all over the world participate, which makes it possible to compare the quality of education across countries and to track how education develops over time (Feskens, 2020). Each cycle has one main domain, which means that there are more items on either reading, mathematics, or science (“PISA (Programma for International Student Assessment),” n.d.). The test usually has only a few items: it is a low-stakes test, with no personal consequences attached to the result, so it is kept short to keep motivation and attention high. In addition to the test, there is a survey that is filled out by the parents of the participating students to learn about the background of the test takers (Feskens, 2020).

A few analyses were done on the results gathered in 2018. First, the data were loaded; then a classical test theory (CTT) analysis was performed to obtain the item and test statistics; next, the results of the Netherlands and Germany were compared; and finally the performance of all countries was compared.

Loading data in R

Before any analysis can be done, the data need to be loaded into the program. The program used to analyse the data is RStudio. Figure 1 shows how the necessary libraries are loaded, the working directory is set, and the files are assigned to variables so that they can be referred to more easily. Lastly, a dexter project is started with the scoring rules and item responses that are needed for the analysis. Additionally, previews of the data file and the item responses are shown.

Figure 1. Code showing data being loaded in RStudio
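
The code in Figure 1 is R code using dexter. As a language-neutral illustration of what "scoring rules" do to raw item responses, here is a minimal Python sketch. It is not the code from the figure: the item names, answer options, and the choice to score unknown answers as 0 are all assumptions made for this example.

```python
# Hypothetical scoring rules: each (item_id, response) pair maps to an item score.
rules = {
    ("item1", "A"): 1, ("item1", "B"): 0,
    ("item2", "A"): 0, ("item2", "B"): 1,
}

# Hypothetical raw responses, one record per student (invented data).
responses = [
    {"person_id": "s1", "item1": "A", "item2": "B"},
    {"person_id": "s2", "item1": "B", "item2": "B"},
    {"person_id": "s3", "item1": "A", "item2": "C"},
]


def score_responses(responses, rules):
    """Replace every raw answer by its item score according to the rules.

    Assumption for this sketch: answers that do not occur in the rules
    (such as "C" above) are scored 0.
    """
    scored = []
    for row in responses:
        item_scores = {
            item: rules.get((item, answer), 0)
            for item, answer in row.items()
            if item != "person_id"
        }
        scored.append({"person_id": row["person_id"], **item_scores})
    return scored


scored = score_responses(responses, rules)
for row in scored:
    print(row)
```

Once every response has been turned into a score in this way, the test and item statistics in the next section can be computed from the scores alone.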

CTT analysis

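Classical test theory (CTT) is a first framework used to analyse test data (Feskens, 2020). Test and item statistics are shown in Figure 2. These statistics show the number of items (nItems), the alpha value, the mean p-, rit-, and rir-values, the maximum test score, and the number of responses (N). The p-value is the average item score expressed as a proportion of the maximum item score (the difficulty of the item), the rit-value is the correlation between the item score and the total test score, and the rir-value is the correlation between the item score and the rest score (the total score without that item). The test statistics summarise the test as a whole, while the item statistics show the values for each item in the data set. Cronbach’s alpha is a measure of the reliability (internal consistency) of the test; the value of 0.83 found here indicates that the test is reliable.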

Figure 2. CTT analysis of data
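
To make the statistics in Figure 2 more concrete, the sketch below computes the p-, rit-, and rir-values and Cronbach's alpha by hand for a tiny, invented score matrix. It is written in Python, uses only the standard library, and is not the dexter code from the figure. The data and function names are assumptions for this example, and the items are dichotomous (0/1), so the p-value is simply the mean item score.

```python
from statistics import mean, pvariance

# Invented item scores: rows are students, columns are four dichotomous items.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(scores):
    """alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)."""
    k = len(scores[0])
    item_variances = [pvariance(column) for column in zip(*scores)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def item_statistics(scores):
    """p-, rit-, and rir-value for every item (column) of the score matrix."""
    totals = [sum(row) for row in scores]
    stats = []
    for number, column in enumerate(zip(*scores), start=1):
        rest = [t - c for t, c in zip(totals, column)]  # total score without this item
        stats.append({
            "item": number,
            "p": mean(column),               # proportion correct (0/1 items)
            "rit": pearson(column, totals),  # item-total correlation
            "rir": pearson(column, rest),    # item-rest correlation
        })
    return stats


alpha = cronbach_alpha(scores)
stats = item_statistics(scores)
print(f"alpha = {alpha:.2f}")
for s in stats:
    print(f"item {s['item']}: p = {s['p']:.2f}, rit = {s['rit']:.2f}, rir = {s['rir']:.2f}")
```

In this toy data set the rit-value is higher than the rir-value for every item, because the item itself is part of the total score it is correlated with. That is why the rir-value is usually the more cautious indicator of how well an item fits the rest of the test.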

Comparing results

The data set can be analysed as a whole, but specific countries can also be analysed separately or compared. In this example, the test statistics of the Netherlands and Germany are compared. As shown in Figure 3, the alpha value differs between the two countries: it is 0.82 for the Netherlands and 0.67 for Germany. This means that the test scores of the Dutch students are more reliable than those of the German students.

Figure 3. Test statistics of the Netherlands and Germany

Comparing performances

To rank the performance of all countries that participated in the PISA study, the test scores need to be compared. Figure 4 shows part of the individual test scores of this PISA study. Figure 5 shows part of the test scores per country, and Figure 6 shows a graph of the test scores of all countries that participated in this PISA study. This last graph shows that Japan performed best.

Figure 4. Individual test scores

Figure 5. Test score per country (partly)

Figure 6. Test scores of all countries
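
The step from individual test scores (Figure 4) to scores per country (Figure 5) and a ranking (Figure 6) can be sketched as follows. This is a simplified Python illustration with invented scores and placeholder country labels, not the R code behind the figures. It takes a plain mean per country and leaves out refinements such as sampling weights.

```python
from collections import defaultdict
from statistics import mean

# Invented (country, test score) pairs, one per student. "A", "B" and "C"
# are placeholder labels, not real countries or real PISA results.
student_scores = [
    ("A", 14), ("A", 10),
    ("B", 12), ("B", 9),
    ("C", 15), ("C", 13),
]


def rank_countries(student_scores):
    """Return (country, mean score) pairs, best-performing country first."""
    by_country = defaultdict(list)
    for country, score in student_scores:
        by_country[country].append(score)
    means = {country: mean(values) for country, values in by_country.items()}
    return sorted(means.items(), key=lambda pair: pair[1], reverse=True)


ranking = rank_countries(student_scores)
for position, (country, avg) in enumerate(ranking, start=1):
    print(f"{position}. {country}: {avg}")
```

Printing the top and bottom of such a ranking as text also makes the highest and lowest scores easier to read than in a small graph.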

References

Feskens, R. (2020, June 8). Programme for International Student Assessment [Slides]. Retrieved from https://canvas.utwente.nl/courses/5049/pages/pisa?module_item_id=148084

PISA. (n.d.). Retrieved June 15, 2020, from https://www.oecd.org/pisa/

PISA (Programma for International Student Assessment). (n.d.). Retrieved June 12, 2020, from https://www.cito.nl/kennis-en-innovatie/onderzoek/in-opdracht/internationaal-pisa/

5 comments:

NIVI, 16 June 2020 at 21:42
Hi Birgit,

I really liked reading your blogpost. I thought it was succinct and shows your understanding of the topic well. A possible area of improvement in Exercise B could be elaborating on what the different test and item statistics shown mean (Rit, Rir, p).

Regards,
Niveditha

Nikola, 17 June 2020 at 19:57
Hi Birgit,

This is a great post! It clearly shows that you understand the topic well and that you know how to work in R. I personally appreciate that you included all the code and computed results in your post.

Well done!

Atayo's World, 20 June 2020 at 23:00
I enjoyed reading your post. It was straight to the point and you gave an indication that you mastered the art of conducting analysis on large scale within the CTT framework using R. Good job.

Anonymous, 22 June 2020 at 21:01
Hi Birgit,

You've written a nice blog where you show insight.
I see that we end up with the same p-, rit-, and rir-values, only we end up with a different country among the best-scoring countries. Because I found the program RStudio itself very difficult to handle, I dare not say whether this is correct.
A small detail: you have visualized the scores of each country nicely. However, this is very small and difficult to see. You could use text to name the lowest and highest scores.

In general: nice blog!

Kind regards,
Sjanne
Monday 15 June 2020
