4. Is e-assessment valid and reliable?
5. What are the advantages of e-assessment?
6. What are the disadvantages of e-assessment?
7. What types of assessment exist?
8. Can e-assessment be used for all types of assessment?
10. Can e-assessment be used for anything other than multiple choice questions?
11. Can e-assessment be used to measure higher order skills?
12. Are there quality assurance issues?
13. What is an item bank?
14. What’s the difference between a question’s “demand” and its “difficulty”?
16. What is blended assessment?
18. What new skills are needed?
19. Are there any standards relating to online assessment?
20. What is meant by interoperability?
21. What are the ‘e-assessment myths’?
Version 1.41, September 2003
This FAQ is maintained by Bobby Elliott (in a personal capacity).
It is located at http://www.bobbyelliott.com/faq.htm
Comments, suggestions and corrections should be notified to the author.
Assessment is the process of measuring a person’s knowledge or skills. It’s not a science; it doesn’t prove anything – but passing a test or completing a practical task implies a certain level of competency. A special type of assessment (called formative assessment) is used to aid the learning process (this is called “assessment for learning”).
E-assessment is any assessment activity which involves the use of computers. This broad definition includes simple uses of computers (such as word processing an essay) and complex uses (such as the use of simulation systems). Any use of IT for the purpose of assessment is ‘e-assessment’.
Usually, nothing. There are pedantic distinctions but most people use these terms interchangeably. Avoid the term “computer-based testing” since ‘
Is e-assessment valid and reliable?
There is nothing inherently invalid or unreliable about e-assessment – but the limited applications of e-assessment to date have resulted in the validity of e-assessment being questioned. The focus on multiple choice questions has meant that certain types of knowledge (the higher order cognitive skills) are sometimes neglected – but most types of knowledge and understanding can be assessed using objective questions. In fact, used properly, e-assessment can be more valid than traditional assessment (see below).
E-assessment can improve reliability. Research has shown that machine marking is more reliable than human marking – and much more consistent from year to year. In general, more care goes into the construction of e-assessments than their paper counterparts due, in part, to the greater scrutiny that e-assessment is subject to compared with traditional forms of assessment.
What are the advantages of e-assessment?
1. Improved validity: Well designed e-assessment can be more valid than traditional assessment. The use of text, sound and video can make questions clearer and more realistic than the narrative descriptions imposed by paper assessments. Research has shown that the reduced emphasis on language skills (through the increased use of multimedia) improves the performance of male candidates.
2. Increased reliability: E-assessments can be more reliable than paper assessments. There is more year-on-year standardisation in the questions and less variance in marking standards. The reliability of traditional assessment has been a recurring problem to awarding bodies, centres and candidates.
3. Improved feedback: Feedback is improved both in terms of its immediacy and level of detail. It is normally possible to provide instant feedback on candidate performance. Also, detailed analysis of candidate performance can usually be provided. This makes e-assessment particularly well suited to formative assessment when quick feedback can be used for remedial purposes.
4. More flexibility: Traditional assessment is synchronous – question papers, exam rooms and candidates’ attendance all have to be co-ordinated at specific times; e-assessment is asynchronous – candidates can undertake their assessment any-place-any-time. Once an appropriate infrastructure has been established, e-assessment is ideally suited to assessment-on-demand. This is particularly appropriate for diagnostic and formative assessment – and fits well with the flexible nature of e-learning.
5. Lower cost: In the short to medium term, traditional assessment is less expensive to produce and use. But in the longer term, e-assessment is less expensive since it can be re-used from year to year (assuming that tests are drawn from an item bank). E-assessment is expensive to produce but, once created, is relatively inexpensive to operate and mark – and has the potential to be re-used, year after year.
6. Increased motivation: The use of multimedia makes e-assessment more engaging and more realistic than paper-based assessment. E-assessment also has the potential to be adaptive – that is, it can be modified in the light of previous responses to present more appropriate questions to candidates. The inherent privacy of this medium can reduce candidates’ embarrassment and increase their confidence to attempt assessments.
7. More time to teach: E-assessment should give teachers more time to teach and less time spent assessing and marking – whilst simultaneously providing more precise feedback on students’ progress.
8. No alternative: Awarding Bodies are hardly coping with current demands for testing. The advent of the “learning society” will increase learning which, in turn, will increase the demand for assessment. The rigid nature of current assessment systems is inappropriate for this “any place, any time” learning.
What are the disadvantages of e-assessment?
1. E-assessments take time and money to produce: Traditional assessments are quick and simple to produce – but slow and expensive to use. E-assessment shifts the effort to the front of the process. E-assessments are time-consuming and expensive to create – but quick and simple to use. The pay-back for e-assessment is long term – but educational budgets are short/medium term.
2. E-assessment cannot be used for every type of assessment: Although e-assessment can be used to assess most types of knowledge and understanding, it has limitations. For example, essay marking is currently not well done by a computer; e-assessment is (currently) not very good at measuring creative skills.
3. E-assessment requires more support: Centres need to be ‘tooled-up’ for online assessment. The infrastructure for traditional assessment is already in place; the infrastructure for e-assessment is not. Depending on the specific e-assessment system in use, centres may require dedicated assessment centres and there is a requirement for on-going technical support throughout the assessment period.
4. E-assessment is new: New things take time to be accepted – and new skills need to be developed to produce high quality e-assessments. Many teachers fear that e-assessment will de-skill their profession. Some teachers suspect the motives for introducing e-learning and e-assessment (“weapons of mass instruction”).
What types of assessment exist?
Assessment that is designed for measuring achievement is called summative assessment. Summative assessment is used to measure progress and to grade students (for entry to university or employment). “Assessment for learning” is formative assessment. Formative assessment is used to aid the learning process by providing feedback to the student and their teacher – which should subsequently alter their learning. Self-assessment is a type of formative assessment. Diagnostic assessment sits between summative and formative assessment. A diagnostic assessment is used to measure a student’s current level of knowledge – with the purpose of identifying a suitable programme of learning. Diagnostic assessment is often used as a form of pre-entry test. Adaptive testing is a form of testing which changes as the test progresses. For example, the candidate’s response to the first few questions may alter the subsequent questions. Adaptive testing can be used for formative, diagnostic or (less commonly) summative purposes.
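The adaptive testing idea described above can be illustrated with a small sketch. The rule used here is an assumption made for illustration and does not come from any particular e-assessment system: the next question moves up one difficulty level after a correct response and down one level after an incorrect response. The function names and the level range (1 to 5) are hypothetical.

```python
from typing import List


def next_level(current_level: int, was_correct: bool,
               lowest: int = 1, highest: int = 5) -> int:
    """Return the difficulty level for the next question.

    Assumed rule (illustrative only): step up after a correct answer,
    step down after an incorrect one, staying within [lowest, highest].
    """
    step = 1 if was_correct else -1
    return max(lowest, min(highest, current_level + step))


def run_adaptive_test(responses: List[bool], start_level: int = 3) -> List[int]:
    """Return the level presented for each question, given the responses."""
    levels = []
    level = start_level
    for was_correct in responses:
        levels.append(level)                    # level shown for this question
        level = next_level(level, was_correct)  # level for the following one
    return levels


if __name__ == "__main__":
    # Candidate answers: correct, correct, incorrect, correct
    print(run_adaptive_test([True, True, False, True]))
```

Real adaptive systems usually choose questions with statistical models rather than a fixed step; the sketch only shows how earlier responses can alter later questions.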
Can e-assessment be used for all types of assessment?
Yes. In fact, e-assessment is ideally suited to formative and diagnostic assessment due to the detailed feedback made possible by machine marking.
However, the design of summative and formative e-assessment is different (since the purpose of each type of assessment is so different) – and it is time-consuming and expensive to create high quality formative assessments.
Objective testing is a form of assessment where each question has one (and only one) correct answer – and there is no ambiguity about what that correct answer should be. It encompasses a wide range of question types – including true/false questions, multiple choice questions (MCQs) and multiple response questions (MRQs). Objective questions are contrasted with subjective questions. For example, open-ended questions which require an extended response (such as an essay) are subjective since a wide range of responses can be expected and it’s not possible to define the “correct” answer in advance (for this reason, “marking guidelines” are normally provided). Computers are very limited at interpreting answers so most e-assessment systems use objective questions.
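Because each objective question has exactly one correct answer, machine marking can be reduced to comparing a response with an answer key. The sketch below assumes a hypothetical representation in which every answer is a set of selected options (one option for true/false and multiple choice questions, several for multiple response questions) and assumes all-or-nothing scoring; neither choice is taken from a specific e-assessment system.

```python
from typing import Dict, Set


def mark_test(answer_key: Dict[str, Set[str]],
              responses: Dict[str, Set[str]]) -> int:
    """Count the questions whose response exactly matches the key.

    A missing response, or a multiple response answer that is only
    partly right, scores zero (all-or-nothing is an assumption here).
    """
    return sum(1 for question_id, correct in answer_key.items()
               if responses.get(question_id) == correct)


if __name__ == "__main__":
    key = {"q1": {"B"}, "q2": {"A", "C"}, "q3": {"True"}}
    candidate = {"q1": {"B"}, "q2": {"A"}, "q3": {"True"}}
    print(mark_test(key, candidate))  # q2 is only partly right, so it scores zero
```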
Can e-assessment consist of anything other than multiple choice questions?
You would be forgiven for thinking that it can’t. The vast majority of existing e-assessment is multiple choice – but this reflects the software tools that have historically been used to generate e-assessments rather than any inherent limitations in technology. Contemporary e-assessment systems can generate a wide variety of (objective) question types including true/false, multiple selection, matching and ordering – and they add some unique formats of their own (such as hot-spots and drag-and-drop). Extended response questions cannot be reliably assessed using machines at this time – although progress has been made with short-response questions.
Can e-assessment be used for higher order skills?
Yes. Well designed objective questions can assess higher order skills. There is a myth that objective testing is only appropriate for low-level skills. Most of Bloom's hierarchy of cognitive skills can be assessed using objective questions. However, it’s not easy to create objective questions that assess the higher order skills (such as synthesis and analysis) – and there are few (good) examples of such questions.
Are there quality assurance issues?
Yes. Security is probably the biggest problem. Over the years, current systems of assessment have developed fairly robust (if not perfect) distribution and delivery systems. E-assessment has not yet developed similarly robust systems. So there are concerns about the storage and delivery of online assessments.
Authentication is another problem. Authentication means proving that the person who sat the assessment is the person they claim to be. While this can be easily overcome in face-to-face assessment (by invigilation), it makes it difficult to conduct remote online assessment. There have been cases of poor authentication procedures which have damaged the reputation of e-assessment.
Plagiarism has always existed – but it’s more acute with e-assessment because the medium makes it easier to cut-and-paste material from various digital sources and pass it off as your own.
What is an item bank?
An item bank is a paper or electronic repository (database) of test items. Each item includes: (1) the question; (2) the answer; (3) the metadata. An item bank does not include the delivery system (an item bank plus a delivery system is an e-assessment system). The item includes all of the resources required to present the question (such as text, sound and graphics).
Metadata is used to classify questions. Questions need to be classified so that they can be selected to create tests. To create a test you normally need to know (at a minimum) the following information about each question: (1) what part of the syllabus it covers; (2) what cognitive competence it relates to; (3) how difficult it is.
Each question in an item bank has to relate to an associated syllabus; there must be a clear mapping from the question to the specific outcome or performance criterion to which it relates. In addition to knowing the source of the question, you also need to know the type of cognitive skill that the question is seeking to assess. For example, a question might assess factual recall of knowledge (the lowest level of Bloom’s hierarchy) or it might require the candidates to evaluate a product or process (a much higher level skill). The question’s level of difficulty is called its facility value (FV) which is a numerical measure of the proportion of candidates who normally answer it correctly (for example, an FV of 0.6 means that 60% of the candidates would be expected to answer it correctly).
Although there is no global system for classifying items, some standards are beginning to emerge – such as the QTI Specification.
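The item structure and facility value described above can be sketched as a small data structure. This is a minimal illustration, not the layout of any real item bank or of the QTI Specification; the class, field and function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    question: str
    answer: str
    outcome: str           # metadata: part of the syllabus the item covers
    demand: str            # metadata: cognitive competence, e.g. a Bloom level
    facility_value: float  # metadata: proportion expected to answer correctly


def facility_value(num_correct: int, num_candidates: int) -> float:
    """Proportion of candidates who answered the item correctly."""
    if num_candidates <= 0:
        raise ValueError("num_candidates must be positive")
    return num_correct / num_candidates


def select_items(bank: List[Item], outcome: str, demand: str,
                 min_fv: float, max_fv: float) -> List[Item]:
    """Pick items for a test using the three pieces of metadata."""
    return [item for item in bank
            if item.outcome == outcome
            and item.demand == demand
            and min_fv <= item.facility_value <= max_fv]


if __name__ == "__main__":
    bank = [
        Item("Name the stages of the fetch-execute cycle.", "fetch, decode, execute",
             outcome="Outcome 1", demand="knowledge",
             facility_value=facility_value(60, 100)),
        Item("Which of two keyboards better suits a given user?", "keyboard A",
             outcome="Outcome 1", demand="evaluation",
             facility_value=facility_value(80, 100)),
    ]
    chosen = select_items(bank, "Outcome 1", "knowledge", 0.5, 0.7)
    print([item.question for item in chosen])
```

Here 60 correct answers from 100 candidates give a facility value of 0.6, matching the example in the text.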
What’s the difference between a question’s “difficulty” and its “demand”?
A question’s demand relates to the type of intellectual challenge it poses. It’s often (but not always) linked to a formal taxonomy (such as Bloom’s). For example, a factual recall question has a lower demand than an evaluative question. A question’s difficulty is a measure of the question’s complexity (or how “hard” the question is). Demand and difficulty are not the same things. You can have an easy high-demand question and a difficult low-demand question. For example, describing the stages in the fetch-execute cycle (which relates to factual recall) is more difficult than comparing the features of two keyboards (which relates to evaluation). So there is a spectrum ranging from low-demand, low-difficulty (“Which city is the capital of the
An electronic portfolio (“e-portfolio”) is an online repository of assessment material. This material might consist of word processed documents or electronic spreadsheets or computer databases. The repository (which is really a customised database) will be tailored to the requirements of a particular syllabus and will reflect the evidence requirements of that syllabus.
In addition to storing assessment evidence, the portfolio will record when the material was uploaded (by the candidate) and will provide access to assessors (to judge the candidates) and moderators (to judge the assessors).
Many awarding bodies use online tests to assess knowledge and understanding, and e-portfolios to assess practical skills – thereby covering the full range of knowledge and skills within a particular course.
What is blended assessment?
Blended assessment is a mix of traditional assessment with online tools. Most contemporary e-assessment is actually blended assessment since the overall assessment scheme normally involves computer-assisted assessment and manual assessment.
What is Bloom’s Taxonomy?
Benjamin Bloom wrote a book (in 1956) entitled Taxonomy of Educational Objectives which, over the years, has become widely adopted within the educational community as the de facto way of classifying cognitive competence. In spite of its reputation, Bloom’s Taxonomy is a simple way of categorising cognitive skills. The Taxonomy has six levels:
1. knowledge
2. comprehension
3. application
4. analysis
5. synthesis
6. evaluation.
There is renewed interest in Bloom’s Taxonomy because of e-assessment. The Taxonomy provides a way of classifying the demand of a question – which is an important feature of an item bank.
What new skills are needed?
Creating an assessment that is intended to be delivered online is not the same as creating one on paper. Although some of the skills are the same, creating a question for online delivery involves some new skills. The required skill-set includes:
Some specialist qualifications in e-assessment are beginning to appear. The Scottish Qualifications Authority offers a Diploma in E-Assessment which covers all of the above.
Are there standards relating to online assessment?
Standards are beginning to emerge. For example, the IMS Global Learning Consortium is working on an international standard for describing questions and tests using XML. The Question & Test Interoperability (QTI) Specification has been produced to allow the exchange of content within online assessment systems.
What is meant by ‘interoperability’?
In its simplest sense, ‘interoperability’ means the ability to take questions out of one system and re-use them in another. This is more important than it sounds. The initial investment in creating an item bank can be large – it takes a lot of time and money to create a large number of high quality items – so it’s important that this investment is not lost if/when the supplier of the e-assessment system goes bust or you choose to move to a different platform. Emerging standards (such as QTI) aim to ensure that you can export your questions from one system and import them to another.
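The export-and-import idea can be sketched with a neutral XML file that one system writes and another reads. The element names below (items, item, question, answer) are invented for this illustration and are not the QTI format; a real exchange would follow the published specification. The sketch uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def export_items(items: List[Dict[str, str]]) -> str:
    """Serialise items from the 'old' system to an XML string."""
    root = ET.Element("items")
    for item in items:
        node = ET.SubElement(root, "item", {"id": item["id"]})
        ET.SubElement(node, "question").text = item["question"]
        ET.SubElement(node, "answer").text = item["answer"]
    return ET.tostring(root, encoding="unicode")


def import_items(xml_text: str) -> List[Dict[str, str]]:
    """Rebuild the items in the 'new' system from the XML string."""
    root = ET.fromstring(xml_text)
    return [{"id": node.get("id"),
             "question": node.findtext("question"),
             "answer": node.findtext("answer")}
            for node in root.findall("item")]


if __name__ == "__main__":
    bank = [{"id": "q1", "question": "Is 2 < 3?", "answer": "yes"}]
    exported = export_items(bank)
    print(exported)
    print(import_items(exported) == bank)  # the items survive the round trip
```

The point of an agreed format is that the importing system needs no knowledge of the exporting system, only of the shared schema.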
What are the ‘e-assessment myths’?
1. E-assessment is a fad: In spite of several false starts, online assessment has arrived and is not going to go away. Recent developments relating to communication infrastructure, improved hardware and software, and widespread access to the Internet will enable the rapid adoption of e-assessment during the next decade.
2. Paper will never be replaced: It will – but not anytime soon. The next few years will be characterised by blended assessment – that is, a mix of traditional (paper-based) assessment and electronic assessment. But, in time, all assessment will be digitised – made possible by sophisticated simulation systems.
3. E-assessment is invalid: Badly designed e-assessment is invalid – and there are numerous examples of bad e-assessments. But there are bad paper assessments too – and we don’t conclude that this medium is invalid. E-assessment is as valid as we make it. There is nothing inherent in the medium that makes it invalid.
4. E-assessment is insecure: That depends on how it’s done. Online tests conducted within assessment centres are as secure as traditional assessments – but remote online assessment is a problem since there is currently no reliable way of authenticating the identity of remote candidates nor any simple way of ensuring that remote candidates don’t cheat.
5. E-assessment is expensive: Online assessments are expensive to design and create – but they are cheaper to use. Once a large, high quality item bank has been created, it can be re-used time and again. Over an extended period of time, online tests are less expensive than traditional means of generating assessments.
6. E-assessment can only be used to measure low-level skills: There are numerous examples of online tests which assess advanced skills. Many professional examinations use online tests and many under-graduate courses use objective tests throughout the degree programmes.
7. Objective questions can’t assess higher order skills: Most of Bloom’s Taxonomy can be assessed through (well designed) objective questions. For example, an objective question can provide a scenario about which the candidate can be asked evaluative questions (the highest level of Bloom’s Taxonomy). These questions are not easy to create – and there are not many examples of ‘good’ objective questions – but they can be done.
8. E-assessment encourages cheating: It does if candidates are simply sat at a computer and allowed to do as they please. Research has shown that students (even good students) will cheat in the right circumstances (interestingly, they will even cheat during formative assessment when the results are only seen by the candidates themselves!). But that’s true of all assessment – not just online. E-assessment is more prone to plagiarism and there are authentication issues with remote online assessment but, like-for-like, cheating is no more of a problem online than offline.
9. E-assessment will de-skill the teaching profession: This is the “weapons of mass instruction” argument. It’s argued that teachers’ role in the future will be to supervise the massed-ranks of students sitting at their PCs, cramming facts and figures, and being assessed every five minutes. Any teacher who has actually experienced existing e-learning systems will know that they needn’t put their dust-coats away yet. Current systems are crude and focus on low-level learning. In the longer term, e-learning/assessment holds out the promise of individualised learning – with each student receiving customised (online) learning and their progress measured through a variety of formative and summative e-assessments.
10. E-assessment will never deliver the expected benefits: Existing online assessments are characterised by blended assessment and crude forms of ‘pure’ online assessment. The present stage focuses on automating existing assessment systems and practices – and fails to realise many of the benefits of e-assessment. But it’s an important stage in the evolution of e-assessment since it will establish a technical infrastructure and (gradually) create an online friendly culture. Technical challenges are not the issue; in time, these will be resolved. The choice of e-assessment system is almost incidental. The real challenge is the creation of high quality item banks (in terms of their subject content, pedagogy and multimedia).
© Bobby Elliott, 2003
The contents of this document can be freely distributed so long as the source is acknowledged and the contents are not changed.