“How hard could it be?”
We hear that a lot. Once they understood the importance of job fit and culture fit, the companies we talked to were always eager to start using psychometric assessments for hiring or development.
But some believe it would be easier for companies to simply build their own assessments in-house instead of dealing with external providers.
After all, many big companies developed their own tests.
How hard could it be?
How to build an assessment from scratch - properly
Anyone can copy down questions from tests online or in books, but that’s not the right way to create an assessment.
The risks of using poorly made assessments range from bad hires to legal repercussions, so it's important for companies to use (or build) proper assessments.
Building a proper, valid, and reliable assessment is a monumental undertaking that involves a lot of investment and risks. Trust us, we’ve been doing this for years — we know. It takes a lot of time, resources, and patience to get it right.
To show just how difficult it is, here is a much simplified version of how we create our assessments in Dreamtalent (not including the hundreds of hours doing statistics, gathering thousands of samples, getting it wrong, throwing everything out and starting over from zero).
1. Test Conceptualization
First, establish why you want to make an assessment. What is it that you’re trying to measure? Is it personality, intelligence, a specific skillset or behavior relevant to the job?
A lot of preliminary research is required. Has this been done before? Is there even a need for this assessment to be created? Without answering these questions, you might end up creating a test that is ultimately useless.
Then there’s the literature review. Which theory is the best to base this test on? How do other frameworks compare? Here you’ll spend many hours reading lots and lots of academic journals and books.
And that’s not all. There are many other questions to consider. What’s the format going to be, true-or-false or an essay? Why? Will it be administered individually or in groups? Is there a need for different versions? Will administrators and interpreters need special training?
Once you've done all the research to answer these questions, it's not uncommon to discover that there was no need to create that assessment after all, and all your research was for nothing.
But if the concept passes, you can get started with the pilot study. This is basically a pre-testing of your assessment. Come up with a bunch of items (questions), then get ahold of a sample group (usually a few hundred) to test them on and see if your items are on the right track to measure the purpose of this assessment.
Then, you can get started with making the prototype.
2. Test Construction
After solidifying the philosophical and theoretical bases, you can begin constructing the test itself. In this stage you'll need to do at least three things.
Choose the right scaling method depending on your assessment’s conceptualization. Should it be unidimensional or multidimensional? Comparative or categorical? A popular scaling method is the Likert scale where respondents select an answer from ‘Strongly Disagree’ to ‘Strongly Agree’. If you choose this scale too, what’s the reasoning behind it?
Then you need to write the actual questions in item creation, which entails plenty of considerations as well. Which area should each question measure? How many questions should there be? What's the format: multiple choice or essay? Did similar questions perform well during the pilot test?
Finally, there’s the scoring system to think about. How can you translate their answers into measurable, meaningful results? Should it be cumulative, categorical, or ipsative?
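To make the scoring idea concrete, here is a minimal sketch, in Python, of how a cumulative scoring system for a 5-point Likert scale might work. The item names and the reverse-keying convention are hypothetical, purely for illustration; this is not Dreamtalent's actual scoring method.

```python
# Map Likert labels to numeric values on a 5-point scale.
LIKERT = {"Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
          "Agree": 4, "Strongly Agree": 5}

def score_response(answers, reverse_keyed=()):
    """Cumulative score: sum the scale values, flipping reverse-keyed items.

    `answers` maps a (hypothetical) item id to a Likert label.
    `reverse_keyed` lists item ids where agreement indicates the
    *opposite* of the trait being measured, so their values are flipped.
    """
    total = 0
    for item, label in answers.items():
        value = LIKERT[label]
        if item in reverse_keyed:
            value = 6 - value  # flips 1<->5 and 2<->4 on a 5-point scale
        total += value
    return total
```

A real scoring system would go much further: handling skipped items, norming raw totals against a reference sample, and reporting per-dimension subscores rather than a single sum.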
With all that done, you have created the very first rough draft of your assessment. The next step is to see if this prototype can withstand real testing conditions.
3. Test Tryout
In this stage, you will see if the scaling method, scoring system, and everything you’ve built so far will actually work in a real testing environment.
To conduct a proper test tryout, you first need to create the testing environment. Who should your subjects be? How many people do you need for a sufficient analysis?
The environment of the test administration should be consistent as well. Are they monitored? Is it a sit-down test or a group discussion? Emulate the actual conditions this test aims for.
Passing this stage means your test works as a test, that the instructions are clear and the format is right. But this doesn’t say anything about the quality of your items (the questions).
4. Item Analysis
How can you tell if your items are any good? The only way is to run rigorous statistical analyses on every item. Since we can't possibly cover that without digressing into Statistics 101, here's the gist of what item analysis covers:
- Difficulty: Is this question too easy or too hard?
- Reliability: Are these questions measuring the same thing? Reliability concerns internal consistency to make sure your test is focused.
- Validity: The questions may indeed measure the same thing, but is it the correct thing that you want to measure? We don’t want a personality test to measure intelligence instead.
- Item Discrimination: Are these questions descriptive enough? Can they differentiate high scorers from low scorers?
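As a rough illustration (not our actual methodology), the most mechanical of these statistics can be sketched in a few lines of Python. Here each respondent is a list of item scores; the formulas are the textbook ones: item difficulty as the proportion answering correctly, Cronbach's alpha as k/(k-1) × (1 − sum of item variances / total-score variance), and an upper-lower discrimination index comparing the top and bottom slices of total scorers.

```python
from statistics import pvariance

def item_difficulty(correct_flags):
    """Proportion of respondents who answered the item correctly (p-value)."""
    return sum(correct_flags) / len(correct_flags)

def cronbach_alpha(scores):
    """Cronbach's alpha: internal consistency across items.

    `scores` is a list of respondents, each a list of k item scores.
    """
    k = len(scores[0])                        # number of items
    items = list(zip(*scores))                # transpose: one tuple per item
    item_vars = sum(pvariance(item) for item in items)
    total_var = pvariance([sum(resp) for resp in scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)

def item_discrimination(item_correct, total_scores, frac=0.27):
    """Upper-lower index: p(correct in top group) - p(correct in bottom group).

    The classic approach compares the top and bottom ~27% of total scorers;
    a well-discriminating item is answered correctly far more often by the
    top group than by the bottom group.
    """
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    n = max(1, round(frac * len(total_scores)))
    low, high = order[:n], order[-n:]
    p_high = sum(item_correct[i] for i in high) / n
    p_low = sum(item_correct[i] for i in low) / n
    return p_high - p_low
```

Running these on real pilot data is where the hundreds of hours go: interpreting the numbers, deciding which items to revise or drop, and re-testing after every change.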
Those are only the quantitative factors. There are also qualitative concerns that can affect an item's quality, such as guessing, bias, time limits, culture, language, question length, the test taker's mental state, and so on.
By now, all the flaws and shortcomings in your assessment prototype will be revealed. The questions could be confusing or misunderstood, the expert panels you interviewed might disagree, the validity and reliability numbers might be too low, or the theory base from your conceptualization stage may have been superseded by more recent research.
As time goes on, your questions will also need to be revised as culture evolves and words take on different meanings.
What do you do then? Scrap it, go back to square one, and do all of this over again.
Hopefully it’s clear by now that making assessments is a monumental feat.
The Hidden Costs
Making an assessment requires expertise. You'll need a dedicated team of researchers and psychologists qualified to design a psychological measurement tool, and someone with a PhD to validate it.
Even if you somehow manage to form this team and have them work full time, it'll take years to construct one test. And it's never truly finished, as there will always be revisions and updates to undertake.
Needless to say, running a team of psychology experts and financing focus groups would cost a fortune.
We often hear stories of those who were tempted to skip all the hassle and just copy someone else's test as their own. Not only is that illegal, it's also neither valid nor reliable.
The risk? At worst, the misleading tests will cause bad hires that destroy your company’s culture and performance.
Just as bad, you might face legal repercussions for running tests that are biased and discriminatory.
Why reinvent the wheel?
It's up to you to decide whether all the effort, time, resources, and risks of creating your own pre-employment assessment are worth it, especially when there's a much simpler, better solution: leave it to the experts.
We've done all the work for you. Our assessments at Dreamtalent were built through years of research, study, and revision by our expert psychology team, and they're scientifically proven to be valid, reliable, and predictive for pre-employment, development, and your other HR needs.
Why reinvent the wheel when we’ve got the entire car here ready for you?