Friday 24 April 2020

Analysing data with item response theory


To gain more insight into how a test measures someone’s knowledge, it is useful to analyse the relationship between the questions of a test and the ability of the respondents. This can be done with Item Response Theory (IRT). IRT models the relationship between the items in a test and a latent trait (e.g. someone’s ability in math). There are different models to describe this relationship, and for this assignment three were examined: the Rasch model, the two-parameter logistic (2PL) model, and the three-parameter logistic (3PL) model. The Rasch model only takes the difficulty of the items into account. The 2PL model adds discrimination, and the 3PL model considers difficulty, discrimination, and guessing. To learn how to find a fitting model for data, a Graduate Management Admission Test (GMAT) dataset from ShinyItemAnalysis was used (https://shiny.cs.cas.cz/ShinyItemAnalysis/).
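The three models can be written as one formula. The notation below is the standard logistic parameterisation, with a for discrimination, b for difficulty, c for guessing, and θ for the ability of the respondent. That ShinyItemAnalysis uses exactly this form, without an extra scaling constant, is an assumption.

```latex
% Probability that a respondent with ability \theta answers item i correctly (3PL)
P_i(\theta) = c_i + (1 - c_i)\,\frac{\exp\bigl(a_i(\theta - b_i)\bigr)}{1 + \exp\bigl(a_i(\theta - b_i)\bigr)}

% 2PL:   c_i = 0
% Rasch: c_i = 0 and a_i = 1
```

Setting c to 0 gives the 2PL model, and additionally fixing a at 1 gives the Rasch model.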
On this website, the GMAT2 dataset was first loaded in the Data tab. Next, in the IRT tab, the subtab ‘Rasch model’ was selected. This page shows the item characteristic curves, item information curves, test information function, table of estimated parameters, ability estimates, a scatter plot of factor scores and standardized total scores, and the Wright map. This page was inspected to learn about the characteristics of the items. The same was done for the 2PL and 3PL models. See Tables 1-3 for the estimated item parameters of the three models. In these tables, a is the discrimination, b the difficulty, and c the guessing parameter of an item, and SE is the standard error of the estimate.
Lastly, the subtab ‘model comparison’ was selected to view how ShinyItemAnalysis compares the three models examined above. This page shows a table of comparison statistics (see Figure 1) in which the models are compared and the best-fitting model is indicated. On four of the five comparison statistics, the 2PL model has the best fit, so the 2PL model was taken as the best-fitting model for these data. As said before, difficulty and discrimination are the two parameters of the 2PL model. Discrimination is defined as how well an item is able to differentiate between people whose ability is higher and people whose ability is lower than the difficulty of the item. Figure 2 shows the item characteristic curves (ICCs) of the 20 items under the 2PL model. An ICC shows the relationship between a person’s ability and the probability that this person answers the item correctly; the difficulty of the item determines where the curve lies along the ability scale. The steeper the ICC, the better the item discriminates and the more information it provides around its difficulty.
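Information criteria such as AIC and BIC are commonly used for this kind of model comparison. Whether they are among the five statistics in Figure 1 is an assumption, and the log-likelihoods, parameter counts, and sample size below are made-up placeholders, not values from the GMAT2 analysis. The sketch only shows how such a comparison works: a lower value means a better trade-off between fit and number of parameters.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_respondents: int) -> float:
    """Bayesian information criterion: penalises parameters more as the sample grows."""
    return n_params * math.log(n_respondents) - 2 * log_likelihood


# HYPOTHETICAL numbers for illustration only (not taken from Figure 1):
# model name -> (log-likelihood, assumed number of estimated parameters for 20 items)
models = {
    "Rasch": (-12000.0, 20),
    "2PL": (-11900.0, 40),
    "3PL": (-11895.0, 60),
}
n_respondents = 1000  # hypothetical sample size

aic_values = {name: aic(ll, k) for name, (ll, k) in models.items()}
bic_values = {name: bic(ll, k, n_respondents) for name, (ll, k) in models.items()}

best_by_aic = min(aic_values, key=aic_values.get)
best_by_bic = min(bic_values, key=bic_values.get)
print("Best by AIC:", best_by_aic)
print("Best by BIC:", best_by_bic)
```

With these placeholder numbers, the 3PL model fits slightly better than the 2PL model in terms of log-likelihood, but not enough to justify its 20 extra parameters, so both criteria prefer the 2PL model.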

Table 1

Item parameters for the Rasch model

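| Item | a | SE(a) | b | SE(b) | c | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 1.00 | - | -0.11 | 0.07 | 0.00 | - |
| Item 2 | 1.00 | - | -0.39 | 0.07 | 0.00 | - |
| Item 3 | 1.00 | - | -0.93 | 0.07 | 0.00 | - |
| Item 4 | 1.00 | - | -1.31 | 0.08 | 0.00 | - |
| Item 5 | 1.00 | - | -1.49 | 0.08 | 0.00 | - |
| Item 6 | 1.00 | - | -0.58 | 0.07 | 0.00 | - |
| Item 7 | 1.00 | - | -0.65 | 0.07 | 0.00 | - |
| Item 8 | 1.00 | - | -0.51 | 0.07 | 0.00 | - |
| Item 9 | 1.00 | - | -0.32 | 0.07 | 0.00 | - |
| Item 10 | 1.00 | - | -0.15 | 0.07 | 0.00 | - |
| Item 11 | 1.00 | - | -0.75 | 0.07 | 0.00 | - |
| Item 12 | 1.00 | - | -0.41 | 0.07 | 0.00 | - |
| Item 13 | 1.00 | - | -1.26 | 0.08 | 0.00 | - |
| Item 14 | 1.00 | - | 0.34 | 0.07 | 0.00 | - |
| Item 15 | 1.00 | - | 0.00 | 0.07 | 0.00 | - |
| Item 16 | 1.00 | - | 0.29 | 0.07 | 0.00 | - |
| Item 17 | 1.00 | - | 0.17 | 0.07 | 0.00 | - |
| Item 18 | 1.00 | - | 0.51 | 0.07 | 0.00 | - |
| Item 19 | 1.00 | - | 0.09 | 0.07 | 0.00 | - |
| Item 20 | 1.00 | - | 0.23 | 0.07 | 0.00 | - |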
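To make the difficulty values in Table 1 concrete, here is a minimal sketch that turns them into probabilities of a correct answer. It assumes the standard Rasch formula and a respondent of average ability (θ = 0, chosen for illustration); the function name is my own.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct answer under the Rasch model (a = 1, c = 0)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Difficulty estimates (b) copied from Table 1.
difficulty = {"Item 5": -1.49, "Item 15": 0.00, "Item 18": 0.51}

theta = 0.0  # assumed ability of the respondent, for illustration only

probabilities = {item: rasch_probability(theta, b) for item, b in difficulty.items()}
for item, p in probabilities.items():
    print(f"{item}: b = {difficulty[item]:+.2f}, P(correct) = {p:.2f}")
```

For this respondent the probability of a correct answer is about 0.82 for Item 5 (the easiest item), exactly 0.50 for Item 15, and about 0.38 for Item 18 (the hardest item). A difficulty value is therefore not a minimum required ability: it is the ability level at which the chance of answering the item correctly is 50%.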


Table 2

Item parameters for the 2PL model

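| Item | a | SE(a) | b | SE(b) | c | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 0.70 | 0.11 | -0.17 | 0.10 | 0.00 | - |
| Item 2 | 0.82 | 0.12 | -0.51 | 0.10 | 0.00 | - |
| Item 3 | 0.25 | 0.10 | -3.58 | 1.36 | 0.00 | - |
| Item 4 | 0.41 | 0.11 | -3.15 | 0.80 | 0.00 | - |
| Item 5 | 0.67 | 0.12 | -2.29 | 0.38 | 0.00 | - |
| Item 6 | 0.69 | 0.11 | -0.87 | 0.15 | 0.00 | - |
| Item 7 | 0.44 | 0.10 | -1.45 | 0.33 | 0.00 | - |
| Item 8 | 0.49 | 0.10 | -1.02 | 0.23 | 0.00 | - |
| Item 9 | 0.23 | 0.09 | -1.32 | 0.56 | 0.00 | - |
| Item 10 | 0.44 | 0.09 | -0.34 | 0.17 | 0.00 | - |
| Item 11 | 0.49 | 0.10 | -1.52 | 0.32 | 0.00 | - |
| Item 12 | 0.35 | 0.09 | -1.14 | 0.34 | 0.00 | - |
| Item 13 | 0.46 | 0.11 | -2.73 | 0.62 | 0.00 | - |
| Item 14 | 0.76 | 0.11 | 0.47 | 0.11 | 0.00 | - |
| Item 15 | 0.47 | 0.09 | 0.01 | 0.14 | 0.00 | - |
| Item 16 | 0.75 | 0.11 | 0.41 | 0.11 | 0.00 | - |
| Item 17 | 0.29 | 0.09 | 0.59 | 0.28 | 0.00 | - |
| Item 18 | 1.03 | 0.13 | 0.56 | 0.09 | 0.00 | - |
| Item 19 | 0.74 | 0.11 | 0.13 | 0.10 | 0.00 | - |
| Item 20 | 0.32 | 0.09 | 0.70 | 0.28 | 0.00 | - |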
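The link between discrimination and informativeness can be illustrated with the item information function of the 2PL model, I(θ) = a² · P(θ) · (1 − P(θ)), which peaks at θ = b with value a²/4. The sketch below applies this standard formula to the least and the most discriminating item in Table 2; that ShinyItemAnalysis draws its item information curves from exactly this formula is an assumption.

```python
import math


def p_2pl(theta: float, a: float, b: float) -> float:
    """Probability of a correct answer under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """2PL item information: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# (a, b) estimates copied from Table 2.
items = {"Item 9": (0.23, -1.32), "Item 18": (1.03, 0.56)}

# Information is highest where ability equals the item's difficulty (theta = b).
peak_information = {name: item_information(b, a, b) for name, (a, b) in items.items()}
for name, info in peak_information.items():
    print(f"{name}: peak information = {info:.3f}")
```

Item 18 (a = 1.03) reaches a peak information of about 0.27, roughly twenty times that of Item 9 (a = 0.23, about 0.01). This is the numerical counterpart of the steeper ICC.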


Table 3

Item parameters for the 3PL model

| Item | a | SE(a) | b | SE(b) | c | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 0.86 | 0.40 | 0.28 | 0.85 | 0.14 | 2.22 |
| Item 2 | 0.82 | 0.12 | -0.50 | 0.13 | 0.00 | 10.24 |
| Item 3 | 0.83 | 0.87 | 1.69 | 0.76 | 0.62 | 0.51 |
| Item 4 | 1.42 | 1.19 | 1.01 | 0.52 | 0.70 | 0.37 |
| Item 5 | 0.66 | 0.13 | -2.30 | 0.47 | 0.01 | 10.46 |
| Item 6 | 1.17 | 0.61 | 0.36 | 0.67 | 0.37 | 0.81 |
| Item 7 | 0.45 | 0.15 | -1.28 | 1.76 | 0.04 | 10.57 |
| Item 8 | 0.52 | 0.17 | -0.86 | 1.50 | 0.03 | 11.29 |
| Item 9 | 0.60 | 0.62 | 2.04 | 0.97 | 0.44 | 0.77 |
| Item 10 | 1.25 | 0.84 | 1.37 | 0.32 | 0.41 | 0.40 |
| Item 11 | 0.80 | 0.66 | 0.10 | 2.00 | 0.36 | 1.93 |
| Item 12 | 0.35 | 0.10 | -1.09 | 0.70 | 0.01 | 10.82 |
| Item 13 | 0.47 | 0.12 | -2.59 | 1.31 | 0.03 | 10.58 |
| Item 14 | 1.07 | 0.46 | 0.88 | 0.36 | 0.15 | 1.05 |
| Item 15 | 0.53 | 1.09 | 0.39 | 6.63 | 0.09 | 19.26 |
| Item 16 | 0.74 | 0.11 | 0.43 | 0.14 | 0.00 | 10.41 |
| Item 17 | 0.32 | 0.11 | 0.66 | 1.21 | 0.02 | 10.08 |
| Item 18 | 2.84 | 1.57 | 0.95 | 0.12 | 0.22 | 0.31 |
| Item 19 | 2.11 | 1.02 | 0.96 | 0.17 | 0.32 | 0.29 |
| Item 20 | 2.62 | 1.92 | 1.72 | 0.22 | 0.40 | 0.12 |
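The guessing parameter c in Table 3 acts as a lower bound on the probability of a correct answer. A minimal sketch with the estimates for Item 4, assuming the standard 3PL formula (the function name and the chosen ability values are mine):

```python
import math


def p_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct answer under the 3PL model."""
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic


# Estimates for Item 4 copied from Table 3.
a, b, c = 1.42, 1.01, 0.70

p_low = p_3pl(-4.0, a, b, c)   # respondent with very low ability
p_at_b = p_3pl(b, a, b, c)     # respondent whose ability equals the difficulty
p_high = p_3pl(4.0, a, b, c)   # respondent with very high ability

print(f"theta = -4: {p_low:.3f}")
print(f"theta =  b: {p_at_b:.3f}")
print(f"theta = +4: {p_high:.3f}")
```

Even a respondent with very low ability has a probability just above c = 0.70 of answering Item 4 correctly under this model, and at θ = b the probability is c + (1 − c)/2 = 0.85 rather than 0.50. In the 3PL model, b is therefore the point halfway between the guessing level and 1, not the 50% point.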





Figure 1. Screenshot of the comparison statistics table.



Figure 2. Item characteristic curves, 2PL model.

5 comments:

  1. Dear Birgit,

    I think your description is mostly a general account of what to do for the assignment rather than a description of the analysis itself. What is the difference between the three models, and how do they differ in describing the dataset? I see some tables and figures, but I miss a description in words of what the tables and figures are telling us. In the end, I still don't have an idea of what this dataset is about, which ability level of students it is for, etc.

    Best regards,
    Lin

  2. Hi Birgit,

    I agree with the comment of Lin. I miss a description of the figures and tables. Additionally, I also miss APA references (e.g. when you explain the theory).

    A few questions arose when I was reading your post:

    - What do a, b, and c mean in the tables?
    - (For example) What does it mean when the difficulty of an item is 0.7? What is the lowest ability a student needs in order to answer the item correctly?

    Have a nice day!
    Nikola

  3. This comment has been removed by the author.

  4. Hi Birgit!

    I read your blog. Overall, it is a nice, compact blog.
    But I still miss some elements. While reading your blog, I can see that you used the article and ShinyItemAnalysis, but I can't find them in APA references.
    Furthermore, I would like to have a little more information about the figures: how to read them, the information they give, and what this information means.

    Good luck with your next blog!

    Kind regards,
    Sjanne

  5. Hi Birgit,

    Your blog looks great and it is well written. The information you provide about the analyses you performed is very informative. You might have described the sample in more detail, and it would have been interesting if you had described all the tables and figures in more detail. As it is, I missed the interpretation of them. Finally, just a brief remark: names of tables and figures are usually capitalised.

    Best, Bernard Veldkamp

