Thursday, March 26, 2020

Douglas Brown on "Designing Classroom Language Tests"

ASSIGNMENT 5: "SUMMARY"
Pages 42-62.

DESIGNING CLASSROOM LANGUAGE TESTS.

The previous chapters introduced a number of building blocks for designing language tests. You now have a sense of where tests belong in the larger domain of assessment. You have sorted through differences between formal and informal tests, formative and summative tests, and norm-referenced and criterion-referenced tests.

TEST TYPES.
We will look first at two test types that you will probably not have many opportunities to create as a classroom teacher (language aptitude tests and language proficiency tests) and three types that you will almost certainly need to create (placement tests, diagnostic tests, and achievement tests).

Language aptitude tests.
One type of test, although admittedly not a very common one, predicts a person's success prior to exposure to the second language. A language aptitude test is designed to measure capacity or general ability to learn a foreign language and ultimate success in that undertaking. Language aptitude tests are ostensibly designed to apply to the classroom learning of any language.

Proficiency tests.
If your aim is to test global competence in a language, then you are, in conventional terminology, testing proficiency. A proficiency test is not limited to any one course, curriculum, or single skill in the language; rather, it tests overall ability. Proficiency tests have traditionally consisted of standardized multiple-choice items on grammar, vocabulary, reading comprehension, and aural comprehension; some have also included oral production performance. A typical example of a standardized proficiency test is the Test of English as a Foreign Language (TOEFL), produced by the Educational Testing Service.

Placement tests.
Certain proficiency tests can act in the role of placement tests, the purpose of which is to place a student into a particular level or section of a language curriculum or school. A placement test usually, but not always, includes a sampling of the material to be covered in the various courses in a curriculum; a student's performance on the test should indicate the point at which the student will find material neither too easy nor too difficult but appropriately challenging.

Diagnostic tests.
A diagnostic test is designed to diagnose specified aspects of a language. A test in pronunciation, for example, might diagnose the phonological features of English that are difficult for learners and should therefore become part of a curriculum. Usually, such tests offer a checklist of features for the administrator to use in pinpointing difficulties. A writing diagnostic would elicit a writing sample from students that would allow the teacher to identify those rhetorical and linguistic features on which the course needed to focus special attention.
A typical diagnostic test of oral production was created by Clifford Prator (1972) to accompany a manual of English pronunciation. Test-takers are directed to read a 150-word passage while they are tape-recorded. The test administrator then refers to an inventory of phonological items for analyzing a learner's production. After multiple listenings, the administrator produces a checklist of errors in five separate categories, each of which has several subcategories. The main categories include:
1. Stress and rhythm,
2. Intonation,
3. Vowels,
4. Consonants, and
5. Other factors.

Achievement tests.
An achievement test is related directly to classroom lessons, units, or even a total curriculum. Achievement tests are (or should be) limited to particular material addressed in a curriculum within a particular time frame and are offered after a course has focused on the objectives in question. Achievement tests can also serve the diagnostic role of indicating what a student needs to continue to work on in the future, but the primary role of an achievement test is to determine whether course objectives have been met, and appropriate knowledge and skills acquired, by the end of a period of instruction.
The specifications for an achievement test should be determined by
The objectives of the lesson, unit, or course being assessed,
The relative importance (or weight) assigned to each objective,
The tasks employed in classroom lessons during the unit of time,
Practicality issues, such as the time frame for the test and turnaround time, and
The extent to which the test structure lends itself to formative washback.

SOME PRACTICAL STEPS TO TEST CONSTRUCTION.
The descriptions of types of tests in the preceding section are intended to help you understand how to answer the first question posed in this chapter: What is the purpose of the test? It is unlikely that you would be asked to design an aptitude test or a proficiency test, but for the purpose of interpreting those tests, it is important that you understand their nature. However, your opportunities to design placement, diagnostic, and achievement tests, especially the latter, will be plentiful. In the remainder of this chapter, we will focus on equipping you with the tools you need to create such classroom-oriented tests.

Assessing clear, unambiguous objectives.
In addition to knowing the purpose of the test you're creating, you need to know as specifically as possible what it is you want to test. Sometimes teachers give tests simply because it's Friday of the third week of the course; after hasty glances at the chapters covered during those three weeks, they dash off some test items so that students will have something to do during the class. This is no way to approach a test. Instead, begin by taking a careful look at everything that you think your students should "know" or be able to "do", based on the material that the students are responsible for. In other words, examine the objectives for the unit you are testing.

Drawing up test specifications.
In the unit discussed above, your specifications will simply comprise (a) a broad outline of the test, (b) what skills you will test, and (c) what the items will look like.
(a) Outline of the test and (b) skills to be included. Because of the constraints of your curriculum, your unit test must take no more than 30 minutes. This is an integrated curriculum, so you need to test all four skills. Since you have the luxury of teaching a small class (only 12 students), you decide to include an oral production component in the preceding period (taking students one by one into a separate room while the rest of the class reviews the unit individually and completes workbook exercises). (c) Item types and tasks. The next and potentially more complex choices involve the item types and tasks to use in this test. It is surprising that there is only a limited number of modes of eliciting responses (that is, prompting) and of responding on tests of any kind.

These informal, classroom-oriented specifications (sketched in code after this list) give you an indication of
The topics (objectives) you will cover,
The implied elicitation and response formats for items,
The number of items in each section, and
The time to be allocated for each.
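To make this concrete, here is a minimal sketch of what such an informal specification might look like in Python. The section names, formats, item counts, and timings are hypothetical illustrations, not Brown's actual figures; only the 30-minute limit, the four skills, and the separate oral component come from the example above.

    # Hypothetical specifications for the 30-minute, four-skills unit test
    # discussed above. Item counts and timings are illustrative only.
    SPECS = [
        # (section, skill, elicitation/response format, items, minutes)
        ("Oral interview", "speaking",  "question-and-answer prompts", 5,  5),
        ("Listening",      "listening", "multiple-choice items",      10, 10),
        ("Reading",        "reading",   "multiple-choice items",       8, 10),
        ("Writing",        "writing",   "short guided paragraph",      1,  5),
    ]

    # The curriculum constrains the whole test to 30 minutes.
    assert sum(minutes for *_, minutes in SPECS) <= 30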

Devising the tasks.
You are now ready to draft the test items. To provide a sense of authenticity and interest, you have decided to conform your items to the context of a recent TV sitcom that you used in class to illustrate certain discourse and form-focused factors. The sitcom depicted a loud, noisy party with lots of small talk. As you devise your test items, consider such factors as how students will perceive them (face validity), the extent to which authentic language and contexts are present, difficulty caused by cultural schemata, the length of the listening stimuli, how well a story line comes across, how things like the cloze testing format will work, and other practicalities.

Designing Multiple-Choice Test Items.
In the sample achievement test above, two of the five components (both of the listening sections) specified a multiple-choice format for items. This was a bold step to take. Multiple-choice items, which may appear to be the simplest kind of item to construct, are extremely difficult to design correctly. Hughes (2003, pp. 76-78) cautions against a number of weaknesses of multiple-choice items:
The technique tests only recognition knowledge.
Guessing may have a considerable effect on test scores.
The technique severely restricts what can be tested.
It is very difficult to write successful items.
Washback may be harmful.
Cheating may be facilitated.
Since there will be occasions when multiple-choice items are appropriate, consider the following four guidelines for designing multiple-choice items for both classroom-based and large-scale situations (adapted from Gronlund, pp. 60-75, and J. D. Brown, 1996, pp. 54-57).

1. Design each item to measure a specific objective.
The specific objective being tested here is comprehension of wh-questions. Distractor (a) is designed to ascertain that the student knows the difference between an answer to a wh-question and an answer to a yes/no question; distractors (b) and (d), as well as the key, (c), test comprehension of the meaning of where as opposed to why and when. The objective has been directly addressed.

2. State both stem and options as simply and directly as possible.
You might argue that the first two sentences of this item give it some authenticity and accomplish a bit of schema setting. But if you simply want a student to identify the type of medical professional who deals with eyesight issues, those sentences are superfluous. Moreover, by lengthening the stem, you have introduced a potentially confounding lexical item, deteriorate, that could distract the student unnecessarily.

3. Make certain that the intended answer is clearly the only correct one.
A quick consideration of distractor (d) reveals that it is a plausible answer, along with the intended key, (c). Eliminating unintended possible answers is often the most difficult problem of designing multiple-choice items. With only a minimum of context in each stem, a wide variety of responses may be perceived as correct.

4. Use item indices to accept, discard, or revise items.
The appropriate selection and arrangement of multiple-choice items on a test can be accomplished by measuring items against three indices: item facility (or item difficulty), item discrimination (sometimes called item differentiation), and distractor analysis. Although measuring these factors on classroom tests would be useful, you probably will have neither the time nor the expertise to do this for every classroom test you create, especially one-time tests. But they are a must for standardized norm-referenced tests that are designed to be administered a number of times and/or administered in multiple forms.

a. Item facility (or IF) is the extent to which an item is easy or difficult for the proposed group of test-takers. You may wonder why that is important if, in your estimation, the item achieves validity. The answer is that an item that is too easy (say, 99 percent of respondents get it right) or too difficult (99 percent get it wrong) really does nothing to separate high-ability and low-ability test-takers. It is not really performing much "work" for you on a test.
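The summary does not reproduce the formula, but item facility is conventionally computed as the proportion of test-takers who answer the item correctly. A minimal sketch in Python, assuming each response is recorded as correct or incorrect:

    def item_facility(responses):
        """Item facility (IF): the proportion of test-takers answering correctly.

        `responses` is a list of booleans, one per test-taker (True = correct).
        Items with IF near 0 or 1 do little to separate ability levels;
        values toward the middle are often considered to do the most "work".
        """
        return sum(responses) / len(responses)

    # Example: 9 of 12 students answered the item correctly -> IF = 0.75
    print(item_facility([True] * 9 + [False] * 3))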

b. Item discrimination (ID) is the extent to which an item differentiates between high-ability and low-ability test-takers. An item on which high-ability and low-ability students score equally well would have poor ID because it does not discriminate between the two groups. Conversely, an item that garners correct responses from most of the high-ability group and incorrect responses from most of the low-ability group has good discrimination power.
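The exact computation is not given in this summary; one common formulation splits test-takers into a high-scoring and a low-scoring group of equal size and divides the difference in correct answers by the size of one group. A sketch under that assumption:

    def item_discrimination(high_group, low_group):
        """Item discrimination (ID) for one item.

        `high_group` and `low_group` are equal-sized lists of booleans
        (True = correct) for the top- and bottom-scoring test-takers.
        ID ranges from -1 to +1; higher values mean better discrimination.
        """
        return (sum(high_group) - sum(low_group)) / len(high_group)

    # Example: 4 of 5 high scorers correct, 1 of 5 low scorers correct
    # -> ID = (4 - 1) / 5 = 0.60
    print(item_discrimination([True] * 4 + [False], [True] + [False] * 4))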

c. Distractor efficiency is one more important measure of a multiple-choice item's value in a test, and one that is related to item discrimination. The efficiency of distractors is the extent to which (a) the distractors "lure" a sufficient number of test-takers, especially lower-ability ones, and (b) those responses are somewhat evenly distributed across all distractors.
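No formula is given here; in practice distractor analysis starts with a tally of how many test-takers chose each option (a fuller analysis would break the tally down by high- and low-ability group). A simple sketch of the tallying step:

    from collections import Counter

    def distractor_analysis(choices, key):
        """Tally how often each option was chosen, marking the answer key.

        `choices` is a list of the options test-takers selected, e.g.
        ['a', 'c', 'c', 'd', ...]; `key` is the correct option. A distractor
        chosen by almost no one is doing no work; one chosen more often
        than the key may signal an ambiguous item.
        """
        counts = Counter(choices)
        return {opt: (n, "key" if opt == key else "distractor")
                for opt, n in sorted(counts.items())}

    # Example responses from 12 test-takers on an item whose key is 'c'
    print(distractor_analysis(list("accdccbccacd"), "c"))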

SCORING, GRADING, AND GIVING FEEDBACK.
Scoring.
As you design a classroom test, you must consider how the test will be scored and graded. Your scoring plan reflects the relative weight that you place on each section and on the items in each section. The integrated-skills class that we have been using as an example focuses on listening and speaking skills, with some attention to reading and writing.
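Brown's example assigns specific weights to each section; since those figures are not reproduced in this summary, the weights below are hypothetical, chosen only to reflect the stated emphasis on listening and speaking. A sketch of such a weighted scoring plan:

    # Hypothetical section weights (NOT Brown's actual figures) for an
    # integrated-skills test that emphasizes listening and speaking.
    WEIGHTS = {"listening": 0.30, "speaking": 0.30, "reading": 0.20, "writing": 0.20}

    def total_score(section_percentages):
        """Combine per-section percentage scores into one weighted total."""
        return sum(WEIGHTS[s] * pct for s, pct in section_percentages.items())

    print(total_score({"listening": 90, "speaking": 80, "reading": 75, "writing": 70}))
    # -> 80.0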

Grading.
Your first thought might be that assigning grades to student performance on this test would be easy: just give an "A" for 90-100 percent, a "B" for 80-89 percent, and so on. Not so fast! Grading is such a thorny issue that all of chapter 11 is devoted to the topic. How you assign letter grades to this test is a product of
The country, culture, and context of this English classroom,
Institutional expectations (most of them unwritten),
Explicit and implicit definitions of grades that you have set forth,
The relationship you have established with this class, and
Student expectations that have been engendered in previous tests and quizzes in this class.

Giving Feedback.
A section on scoring and grading would not be complete without some consideration of the forms in which you will offer feedback to your students. For a test like this one, which is not unusual in the universe of possible formats for periodic classroom tests, consider the multitude of options. You might choose to return the test to the students with one of, or a combination of, any of the possibilities below:
1. A letter grade
2. A total score
3. Four subscores (speaking, listening, reading, writing)
4. For the listening and reading sections
a. An indication of correct/incorrect responses
b. Marginal comments
5. For the oral interview
a. Scores for each element being rated
b. A checklist of areas needing work
c. Oral feedback after the interview
d. A post-interview conference to go over the results
6. On the essay
a. Scores for each element being rated
b. A checklist of areas needing work
c. Marginal and end-of-essay comments and suggestions
d. A post-test conference to go over the work
e. A self-assessment
7. On all or selected parts of the test, peer checking of results
8. A whole-class discussion of the results of the test
9. Individual conferences with each student to review the whole test.




REFERENCES:
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. New York: Longman.


