Psychometrics is a field of study, linked to Psychology and Statistics, concerned with the measurement of psychological constructs such as intelligence, abilities, attitudes, personality and executive functions.

Why study Psychometrics?

To investigate the manifestation of psychological characteristics and traits in a systematic and standardised way, producing technical and scientific knowledge.

How does psychometrics benefit society?

It provides the foundations for the development, administration, analysis and interpretation of psychological assessment tools. A psychological test is an objective, standardised measure whose scores are interpreted against a statistically representative reference group drawn from the population of interest.

Classical Test Theory (CTT) vs Item Response Theory (IRT)

CTT focuses on the overall score on a test (t). In other words, manifest behaviour is treated as the only representation of a construct, with no explicit consideration of latent traits. CTT therefore develops statistical strategies to control or evaluate the magnitude of the measurement error (E). The unit of analysis is the whole test (the sum or mean of the item scores).
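As a minimal sketch of the whole-test view described above, the snippet below computes each person's sum score and Cronbach's alpha. Alpha is not named in the text; it is used here only as one widely used CTT reliability statistic. The response matrix and the function names are hypothetical.

```python
from statistics import pvariance


def total_scores(responses):
    """Sum score per person: the CTT unit of analysis."""
    return [sum(person) for person in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha from a persons-by-items matrix of item scores."""
    n_items = len(responses[0])
    # Variance of each item across persons (the columns of the matrix).
    item_vars = [
        pvariance([person[i] for person in responses]) for i in range(n_items)
    ]
    # Variance of the whole-test (sum) scores.
    total_var = pvariance(total_scores(responses))
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 persons x 4 dichotomous items (1 = correct).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

scores = total_scores(responses)
alpha = cronbach_alpha(responses)
print(scores, round(alpha, 3))
```

Note that nothing in this calculation looks at which items a person answered correctly; only the totals and variances enter.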

For IRT, the answer a subject gives to an item depends on his or her level on the latent trait, that is, the magnitude of his or her theta. IRT proposes the validation of items and not of tests. This favours the composition of large groups of independent items that can be used to create (or customize) different tests for different purposes.

Imagine that six items with different levels of complexity were ordered by difficulty and then administered to 15 students. According to CTT, all students would receive the same score, because each of them answered two of the six items correctly. Conversely, IRT would take into account the interaction between each student’s response pattern and the item parameters in order to calculate the probability of answering an item correctly and then estimate the student’s score on the latent trait (theta).

Embretson & Reise (2000)
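The contrast in the six-item example can be illustrated with a small sketch. It assumes a two-parameter logistic model (a simplification chosen for brevity), hypothetical item parameters, and a plain grid search for the maximum-likelihood theta; none of these values come from Embretson & Reise (2000).

```python
import math


def p_correct(theta, a, b):
    """2-PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, pattern, items):
    """Log-likelihood of a 0/1 response pattern at a given theta."""
    ll = 0.0
    for x, (a, b) in zip(pattern, items):
        p = p_correct(theta, a, b)
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll


def estimate_theta(pattern, items):
    """Maximum-likelihood theta via a simple grid search over [-4, 4]."""
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, pattern, items))


# Hypothetical (discrimination, difficulty) pairs, ordered by difficulty.
items = [(0.8, -2.0), (1.0, -1.0), (1.2, -0.5), (1.5, 0.5), (1.8, 1.0), (2.0, 2.0)]

student_a = [1, 1, 0, 0, 0, 0]  # two easiest items correct
student_b = [0, 0, 0, 0, 1, 1]  # two hardest items correct

theta_a = estimate_theta(student_a, items)
theta_b = estimate_theta(student_b, items)
print(sum(student_a), sum(student_b))  # identical CTT sum scores
print(theta_a, theta_b)                # different IRT trait estimates
```

Both students have the same sum score, yet their theta estimates differ because the items they answered correctly carry different parameters. Under a one-parameter model with equal discriminations the two estimates would coincide, since the sum score is then a sufficient statistic for theta.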

Dichotomous IRT models

For dichotomous items, four different logistic models can be used to estimate the probability of answering an item correctly, the most common being the three-parameter logistic model (3-PL model). For this model, IRT assumes that the interaction of a person with a test item can be adequately represented by the following mathematical expression:
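In its standard textbook form, the 3-PL model is commonly written as shown below. The subscripts (i for items, j for persons) are notation assumed here, and some presentations also include a scaling constant D ≈ 1.7 in the exponent.

```latex
\[
P(X_{ij} = 1 \mid \theta_j) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta_j - b_i)}}
\]
```

Here \(\theta_j\) is person j's level on the latent trait, and \(a_i\), \(b_i\) and \(c_i\) are the discrimination, difficulty and pseudo-guessing parameters of item i.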

This expression defines the item characteristic curve, which describes the relationship between the individual’s latent trait and performance on a test item.

The three item parameters are: discrimination (a), which determines the maximum slope of the curve; difficulty (b), the item’s location on the x-axis; and pseudo-guessing (c), the probability that low-ability individuals answer the item correctly by chance.
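To make the role of each parameter concrete, here is a minimal sketch that evaluates a 3-PL item characteristic curve in its standard form. The function name and the parameter values are hypothetical, chosen only for illustration.

```python
import math


def icc_3pl(theta, a, b, c):
    """Probability of a correct response under the 3-PL model.

    a: discrimination, b: difficulty (location), c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderately discriminating, average difficulty, 20% guessing.
a, b, c = 1.5, 0.0, 0.2

# Points on the curve for theta from -4 to 4 in steps of 0.5.
curve = [(t / 2.0, icc_3pl(t / 2.0, a, b, c)) for t in range(-8, 9)]

p_at_b = icc_3pl(b, a, b, c)     # halfway between c and 1 when theta equals b
p_low = icc_3pl(-10.0, a, b, c)  # approaches c for very low theta

for theta, p in curve:
    print(f"{theta:5.1f}  {p:.3f}")
```

At theta equal to b the probability is (1 + c) / 2, and for very low theta it approaches c, which is why c acts as the lower asymptote of the curve.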

Polytomous IRT models

Different IRT models can be used when polytomous items (e.g., Likert-scale items) are under assessment, such as the graded response model (GRM). One of the GRM’s main advantages is that it allows the number of items to be reduced in a systematic, statistically appropriate way. The GRM assesses the quality of an instrument based on two of the parameters mentioned above, which are calculated for each item: (a) discrimination, which represents how well an item differentiates between individuals with higher or lower agreement toward a construct (e.g., organisational climate), and (b) locations, which represent how difficult each of an item’s categories is for respondents to endorse. Items that fit the GRM appropriately display ordered response category curves, which means that all the points on the Likert scale have been endorsed. Disordered category curves indicate poorly functioning response categories. Low slopes suggest that an item does not discriminate accurately between persons with lower and higher levels of the attitude.
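The category curves mentioned above can be sketched as follows, assuming Samejima's graded response model in its usual cumulative-logit form. The function name, the five-point item and its parameter values are hypothetical.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    a: item discrimination.
    thresholds: ordered location parameters b_1 < ... < b_{K-1} for K categories.
    """
    # Cumulative probabilities of responding in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (above the highest).
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-point Likert item: one discrimination, four ordered locations.
probs = grm_category_probs(theta=0.5, a=1.7, thresholds=[-1.5, -0.5, 0.5, 1.5])
print([round(p, 3) for p in probs])
```

Evaluating this function over a range of theta values gives one curve per response category. With a low discrimination the curves flatten, which is the pattern the text associates with poorly discriminating items.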

When applied to organisational settings, modern psychometric techniques can contribute to the development and validation of higher-quality instruments to measure organisational behaviours, which in turn can support HRM practices such as recruitment and selection, and performance assessment.