To gain more insight into how a test measures someone’s knowledge, it is useful to analyse the relationship between the questions (items) of a test and the ability of the respondents. This can be done with Item Response Theory (IRT). IRT models the relationship between the responses to the items in a test and a latent trait (e.g. someone’s ability in math). There are different models to describe this relationship, and for this assignment three were examined: the Rasch model, the 2PL (two-parameter logistic) model, and the 3PL (three-parameter logistic) model. The Rasch model only takes the difficulty of the items into account. The 2PL model adds discrimination, and the 3PL model considers difficulty, discrimination, and guessing.
To learn how to find a fitting model for data, a Graduate Management Admission Test (GMAT) dataset from ShinyItemAnalysis was used (https://shiny.cs.cas.cz/ShinyItemAnalysis/).
On this website, the GMAT2 dataset was first loaded on the Data tab. Next, on the IRT tab, the subtab ‘Rasch model’ was selected. This page shows the item characteristic curves, item information curves, test information function, table of estimated parameters, ability estimates, scatter plot of factor scores and standardized total scores, and Wright map. This page was inspected to learn about the characteristics of the items. The same was done for the 2PL and 3PL models. See Tables 1-3 for the estimated item parameters of the three models.
Lastly, the subtab ‘model comparison’ was selected to view how ShinyItemAnalysis compares the three models examined above. This page shows a table of comparison statistics (see Figure 1) in which the models are compared and the best-fitting model is indicated. For four of the five comparison statistics, the table indicates that the 2PL model fits the data best, so the 2PL model was chosen.
As said before, difficulty and discrimination are the two parameters of the 2PL model. Discrimination is defined as how well an item is able to differentiate between people whose ability is higher than the difficulty of the item and people whose ability is lower. Figure 2 shows the item characteristic curves (ICCs) of the 20 items when they are analysed with the 2PL model. Each ICC shows the relationship between a person’s ability and the chance that this person answers the item correctly, with the item’s difficulty determining where the curve sits on the ability scale. The steeper the curve, the more the item discriminates and thus the more informative it is.
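
The link between the steepness of an ICC and how informative an item is can be sketched in a few lines of Python. This is only an illustration, not the code ShinyItemAnalysis runs: it assumes the standard logistic form of the models without the 1.7 scaling constant, the function names `icc` and `information_2pl` are made up for this example, and the two items use the 2PL estimates reported in Table 2.

```python
import math


def icc(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct answer at ability theta.

    a = discrimination, b = difficulty, c = guessing (3PL form).
    With c = 0 this is the 2PL model; with a = 1 and c = 0 the Rasch model.
    Assumption: plain logistic metric, no 1.7 scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def information_2pl(theta, a, b):
    """Item information for a 2PL item: a^2 * P * (1 - P)."""
    p = icc(theta, a, b)
    return a * a * p * (1.0 - p)


# 2PL estimates taken from Table 2: the most and the least discriminating item.
item18 = {"a": 1.03, "b": 0.56}
item9 = {"a": 0.23, "b": -1.32}

for theta in (-2.0, 0.0, 2.0):
    p18 = icc(theta, **item18)
    p9 = icc(theta, **item9)
    print(f"theta={theta:+.1f}  P(item 18)={p18:.2f}  P(item 9)={p9:.2f}")

# Information peaks where ability equals difficulty and grows with a^2.
print(information_2pl(item18["b"], **item18))
print(information_2pl(item9["b"], **item9))
```

At an ability equal to the item's difficulty the chance of a correct answer is exactly 0.5 in the 2PL model, and the information at that point is a²/4. That is why Item 18 (a = 1.03) has a much steeper curve and is far more informative than Item 9 (a = 0.23).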
Table 1
Item parameters for the Rasch model
| Item | a (discrimination) | SE(a) | b (difficulty) | SE(b) | c (guessing) | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 1.00 | - | -0.11 | 0.07 | 0.00 | - |
| Item 2 | 1.00 | - | -0.39 | 0.07 | 0.00 | - |
| Item 3 | 1.00 | - | -0.93 | 0.07 | 0.00 | - |
| Item 4 | 1.00 | - | -1.31 | 0.08 | 0.00 | - |
| Item 5 | 1.00 | - | -1.49 | 0.08 | 0.00 | - |
| Item 6 | 1.00 | - | -0.58 | 0.07 | 0.00 | - |
| Item 7 | 1.00 | - | -0.65 | 0.07 | 0.00 | - |
| Item 8 | 1.00 | - | -0.51 | 0.07 | 0.00 | - |
| Item 9 | 1.00 | - | -0.32 | 0.07 | 0.00 | - |
| Item 10 | 1.00 | - | -0.15 | 0.07 | 0.00 | - |
| Item 11 | 1.00 | - | -0.75 | 0.07 | 0.00 | - |
| Item 12 | 1.00 | - | -0.41 | 0.07 | 0.00 | - |
| Item 13 | 1.00 | - | -1.26 | 0.08 | 0.00 | - |
| Item 14 | 1.00 | - | 0.34 | 0.07 | 0.00 | - |
| Item 15 | 1.00 | - | 0.00 | 0.07 | 0.00 | - |
| Item 16 | 1.00 | - | 0.29 | 0.07 | 0.00 | - |
| Item 17 | 1.00 | - | 0.17 | 0.07 | 0.00 | - |
| Item 18 | 1.00 | - | 0.51 | 0.07 | 0.00 | - |
| Item 19 | 1.00 | - | 0.09 | 0.07 | 0.00 | - |
| Item 20 | 1.00 | - | 0.23 | 0.07 | 0.00 | - |
Table 2
Item parameters for the 2PL model
| Item | a (discrimination) | SE(a) | b (difficulty) | SE(b) | c (guessing) | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 0.70 | 0.11 | -0.17 | 0.10 | 0.00 | - |
| Item 2 | 0.82 | 0.12 | -0.51 | 0.10 | 0.00 | - |
| Item 3 | 0.25 | 0.10 | -3.58 | 1.36 | 0.00 | - |
| Item 4 | 0.41 | 0.11 | -3.15 | 0.80 | 0.00 | - |
| Item 5 | 0.67 | 0.12 | -2.29 | 0.38 | 0.00 | - |
| Item 6 | 0.69 | 0.11 | -0.87 | 0.15 | 0.00 | - |
| Item 7 | 0.44 | 0.10 | -1.45 | 0.33 | 0.00 | - |
| Item 8 | 0.49 | 0.10 | -1.02 | 0.23 | 0.00 | - |
| Item 9 | 0.23 | 0.09 | -1.32 | 0.56 | 0.00 | - |
| Item 10 | 0.44 | 0.09 | -0.34 | 0.17 | 0.00 | - |
| Item 11 | 0.49 | 0.10 | -1.52 | 0.32 | 0.00 | - |
| Item 12 | 0.35 | 0.09 | -1.14 | 0.34 | 0.00 | - |
| Item 13 | 0.46 | 0.11 | -2.73 | 0.62 | 0.00 | - |
| Item 14 | 0.76 | 0.11 | 0.47 | 0.11 | 0.00 | - |
| Item 15 | 0.47 | 0.09 | 0.01 | 0.14 | 0.00 | - |
| Item 16 | 0.75 | 0.11 | 0.41 | 0.11 | 0.00 | - |
| Item 17 | 0.29 | 0.09 | 0.59 | 0.28 | 0.00 | - |
| Item 18 | 1.03 | 0.13 | 0.56 | 0.09 | 0.00 | - |
| Item 19 | 0.74 | 0.11 | 0.13 | 0.10 | 0.00 | - |
| Item 20 | 0.32 | 0.09 | 0.70 | 0.28 | 0.00 | - |
Table 3
Item parameters for the 3PL model
| Item | a (discrimination) | SE(a) | b (difficulty) | SE(b) | c (guessing) | SE(c) |
|---|---|---|---|---|---|---|
| Item 1 | 0.86 | 0.40 | 0.28 | 0.85 | 0.14 | 2.22 |
| Item 2 | 0.82 | 0.12 | -0.50 | 0.13 | 0.00 | 10.24 |
| Item 3 | 0.83 | 0.87 | 1.69 | 0.76 | 0.62 | 0.51 |
| Item 4 | 1.42 | 1.19 | 1.01 | 0.52 | 0.70 | 0.37 |
| Item 5 | 0.66 | 0.13 | -2.30 | 0.47 | 0.01 | 10.46 |
| Item 6 | 1.17 | 0.61 | 0.36 | 0.67 | 0.37 | 0.81 |
| Item 7 | 0.45 | 0.15 | -1.28 | 1.76 | 0.04 | 10.57 |
| Item 8 | 0.52 | 0.17 | -0.86 | 1.50 | 0.03 | 11.29 |
| Item 9 | 0.60 | 0.62 | 2.04 | 0.97 | 0.44 | 0.77 |
| Item 10 | 1.25 | 0.84 | 1.37 | 0.32 | 0.41 | 0.40 |
| Item 11 | 0.80 | 0.66 | 0.10 | 2.00 | 0.36 | 1.93 |
| Item 12 | 0.35 | 0.10 | -1.09 | 0.70 | 0.01 | 10.82 |
| Item 13 | 0.47 | 0.12 | -2.59 | 1.31 | 0.03 | 10.58 |
| Item 14 | 1.07 | 0.46 | 0.88 | 0.36 | 0.15 | 1.05 |
| Item 15 | 0.53 | 1.09 | 0.39 | 6.63 | 0.09 | 19.26 |
| Item 16 | 0.74 | 0.11 | 0.43 | 0.14 | 0.00 | 10.41 |
| Item 17 | 0.32 | 0.11 | 0.66 | 1.21 | 0.02 | 10.08 |
| Item 18 | 2.84 | 1.57 | 0.95 | 0.12 | 0.22 | 0.31 |
| Item 19 | 2.11 | 1.02 | 0.96 | 0.17 | 0.32 | 0.29 |
| Item 20 | 2.62 | 1.92 | 1.72 | 0.22 | 0.40 | 0.12 |
Figure 1. Screenshot of the comparison statistics table.
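
The screenshot itself is not reproduced in the text, so the exact statistics in the comparison table are not known here. Information criteria such as AIC and BIC are a common way to compare IRT models, and the sketch below shows how such a comparison works under that assumption. The log-likelihoods, parameter counts, and sample size are made-up numbers chosen only for illustration; they are not the GMAT2 results.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion; a lower value indicates a better trade-off."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion; penalises extra parameters more strongly."""
    return n_params * math.log(n_obs) - 2 * log_lik


# HYPOTHETICAL inputs for illustration only (not taken from Figure 1):
# model name -> (log-likelihood, number of estimated parameters)
n_respondents = 1000
fits = {
    "Rasch": (-12500.0, 21),
    "2PL": (-12420.0, 40),
    "3PL": (-12410.0, 60),
}

aic_scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
bic_scores = {name: bic(ll, k, n_respondents) for name, (ll, k) in fits.items()}

# The preferred model is the one with the lowest value of each criterion.
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
print("Best by AIC:", best_by_aic)
print("Best by BIC:", best_by_bic)
```

The idea is that a more complex model always fits at least as well, so each criterion adds a penalty for the extra parameters. A model such as the 3PL is only preferred when its improvement in fit outweighs that penalty.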
Figure 2. Item characteristic curves, 2PL model.
Dear Birgit,
I think your description is very general about what to do for the assignment instead of describing the assignment itself. What is the difference between the three models, and how do they differ in describing the dataset? I see some tables and figures, but I miss a description in words of what the tables and figures are telling us. In the end, I still don't have an idea of what this dataset is about, which ability level of students it is for, etc.
Best regards,
Lin
Hi Birgit,
I agree with Lin's comment. I miss a description of the figures and tables. Additionally, I also miss APA references (e.g. when you explain the theory).
A few questions arose when I was reading your post:
- What do a, b, and c mean in the tables?
- (For example) What does it mean when the difficulty of an item is 0.7? What is the lowest ability a student needs in order to answer the item correctly?
Have a nice day!
Nikola
Hi Birgit!
I read your blog. Overall, it is a nice, compact blog.
But I still miss some elements. While reading your blog, I can see that you used the article and ShinyItemAnalysis, but I can't find them in your APA references.
Furthermore, I would like a little more information about the figures: how to read them, what information they give, and what this information means.
Good luck with your next blog!
Kind regards,
Sjanne
Hi Birgit,
Your blog looks great and it is well written. The information you provide about the analyses you performed is very informative. You might have described the sample in more detail, and it would have been interesting if you had described all the tables and figures in more detail. Now, I missed the interpretation of them. Finally, just a brief remark: names of tables and figures are usually capitalized.
Best, Bernard Veldkamp