Tuesday, 12 May 2020

ASSESSING GRAMMAR AND ASSESSING VOCABULARY by James and John

Assignment of meeting 15.

“SUMMARY ASSESSING GRAMMAR 1-291”

 

Differing notions of ‘grammar’ for assessment.

Grammar and linguistics

When most language teachers, second language acquisition (SLA) researchers and language testers think of ‘grammar’, they call to mind one of the many paradigms (e.g., ‘traditional grammar’ or ‘universal grammar’) available for the study and analysis of language. Such linguistic grammars are typically derived from data taken from native speakers and minimally constructed to describe well-formed utterances within an individual framework. These grammars strive for internal consistency and are mainly accessible to those who have been trained in that particular paradigm.

 

Form-based perspectives of language

One of the oldest theories to describe the structure of language is traditional grammar. Originally based on the study of Latin and Greek, traditional grammar drew on data from literary texts to provide rich and lengthy descriptions of linguistic form. Unlike some other syntactocentric theories, traditional grammar also revealed the linguistic meanings of these forms and provided information on their usage in a sentence (Celce-Murcia and Larsen-Freeman, 1999). Traditional grammar supplied an extensive set of prescriptive rules along with the exceptions.

 

Form- and use-based perspectives of language

The three theories of linguistic analysis described thus far have provided insights to L2 educators on several grammatical forms. These insights provide information to explain what structures are theoretically possible in a language. Other linguistic theories, however, are better equipped to examine how speakers and writers actually exploit linguistic forms during language use. For example, if we wish to explain how seemingly similar structures like ‘I like to read’ and ‘I like reading’ connote different meanings, we might turn to those theories that study grammatical form and use interfaces.

 

Communication-based perspectives of language

Many of the communication-based approaches to language have had an extensive impact on how L2 teachers and testers currently view grammar. Given the goals of this book, I will look at only Austin’s (1962) speech act theory and Halliday and Hasan’s (1976) systemic-functional grammar as they might relate to a definition of grammar for assessment. In the early 1960s, linguistic philosophers looked at issues of language and meaning. Austin (1962) proposed that the action performed by pronouncing an utterance during interaction involved more than the literal conveyance of information. Utterances are also said to ‘do’ things in a language context; they have a language function. For example, when a person being invited to a dinner party says ‘I’ll be there at eight’, that person is not only conveying information on his or her expected arrival time (literal meaning), but also accepting the invitation and committing to do something (language function). Austin (1962) maintained that an utterance involves three related speech acts. First, the action of an utterance involves the production of a meaningful proposition, a locutionary act – the person’s arrival at eight.

 

What is pedagogical grammar?

A pedagogical grammar represents an eclectic, but principled description of the target-language forms, created for the express purpose of helping teachers understand the linguistic resources of communication. These grammars provide information about how language is organized and offer relatively accessible ways of describing complex, linguistic phenomena for pedagogical purposes.

 

Research on L2 grammar teaching, learning and assessment.

 

Research on L2 teaching and learning

Over the years, several of the questions mentioned above have intrigued language teachers, inspiring them to experiment with different methods, approaches and techniques in the teaching of grammar. To determine if students had actually learned under the different conditions, teachers have used diverse forms of assessment and drawn their own conclusions about their students. In so doing, these teachers have acquired a considerable amount of anecdotal evidence on the strengths and weaknesses of using different practices to implement L2 grammar instruction. These experiences have led most teachers nowadays to subscribe to an eclectic approach to grammar instruction, whereby they draw upon a variety of different instructional techniques, depending on the individual needs, goals and learning styles of their students.

 

Comparative methods studies

The comparative methods studies sought to compare the effects of different language-teaching methods on the acquisition of an L2. These studies occurred principally in the 1960s and 1970s, and stemmed from a reaction to the grammar-translation method, which had dominated language instruction during the first half of the twentieth century. More generally, these studies were in reaction to form-focused instruction (referred to as ‘focus on forms’ by Long, 1991), which used a traditional structural syllabus of grammatical forms as the organizing principle for L2 instruction. According to Ellis (1997), form-focused instruction contrasts with meaning-focused instruction in that meaning-focused instruction emphasizes the communication of messages (i.e., the act of making a suggestion and the content of such a suggestion) while form-focused instruction stresses the learning of linguistic forms.

 

Non-interventionist studies

While some language educators were examining different methods of teaching grammar in the 1960s, others were feeling a growing sense of dissatisfaction with the central role of grammar in the L2 curriculum. As a result, questions regarding the centrality of grammar were again raised by a small group of L2 teachers and syllabus designers who felt that the teaching of grammar in any form simply did not produce the desired classroom results. Newmark (1966), in fact, asserted that grammatical analysis and the systematic practice of grammatical forms were actually interfering with the process of L2 learning, rather than promoting it, and if left uninterrupted, second language acquisition, similar to first language acquisition, would proceed naturally.

 

Empirical studies in support of non-intervention

The non-interventionist position was examined empirically by Prabhu (1987) in a project known as the Communicational Teaching Project (CTP) in southern India. This study sought to demonstrate that the development of grammatical ability could be achieved through a task-based, rather than a form-focused, approach to language teaching, provided that the tasks required learners to engage in meaningful communication. In the CTP, Prabhu (1987) argued against the notion that the development of grammatical ability depended on a systematic presentation of grammar followed by planned practice. However, in an effort to evaluate the CTP program, Beretta and Davies (1985) compared classes involved in the CTP with classes outside the project taught with a structural-oral-situational method. They administered a battery of tests to the students, and found that the CTP learners outperformed the control group on a task-based test, whereas the non-CTP learners did better on a traditional structure test.

 

Possible implications of fixed developmental order to language assessment

The notion that structures appear to be acquired in a fixed developmental order and in a fixed developmental sequence might conceivably have some relevance to the assessment of grammatical ability. First of all, these findings could give language testers an empirical basis for constructing grammar tests that would account for the variability inherent in a learner’s interlanguage. In other words, information on the acquisitional order of grammatical items could conceivably serve as a basis for selecting grammatical content for tests that aim to measure different levels of developmental progression, such as Chang (2002, 2004) did in examining the underlying structure of a test that attempted to measure knowledge of the relative clauses.

 

Problems with the use of developmental sequences as a basis for assessment

Although developmental sequence research offers an intuitively appealing complement to accuracy-based assessments in terms of interpreting test scores, I believe this method is fraught with a number of serious problems, and language educators should use extreme caution in applying this method to language testing. This is because our understanding of natural acquisitional sequences is incomplete and at too early a stage of research to be the basis for concrete assessment recommendations (Lightbown, 1985; Hudson, 1993). First, the number of grammatical sequences that show a fixed order of acquisition is very limited, far too limited for all but the most restricted types of grammar tests.

 

Interventionist studies

Not all L2 educators are in agreement with the non-interventionist position on grammar instruction. In fact, several (e.g., Schmidt, 1983; Swain, 1991) have maintained that although some L2 learners are successful in acquiring selected linguistic features without explicit grammar instruction, the majority fail to do so. Testimony to this is the large number of non-native speakers who emigrate to countries around the world, live there all their lives and fail to learn the target language, or fail to learn it well enough to realize their personal, social and long-term career goals. In these situations, language teachers affirm that formal grammar instruction of some sort can be of benefit.

 

Empirical studies in support of intervention

Aside from anecdotal evidence, the non-interventionist position has come under intense attack on both theoretical and empirical grounds, with several SLA researchers affirming that efforts to teach L2 grammar typically result in the development of L2 grammatical ability. Hulstijn (1989) and Alanen (1995) investigated the effectiveness of L2 grammar instruction on SLA in comparison with no formal instruction. They found that when coupled with meaning-focused instruction, the formal instruction of grammar appears to be more effective than exposure to meaning or form alone. Long (1991) also argued for a focus on both meaning and form in classrooms that are organized around meaningful and sustained communicative interaction.

 

Research on instructional techniques and their effects on acquisition

Much of the recent research on teaching grammar has focused on four types of instructional techniques and their effects on acquisition. Although a complete discussion of teaching interventions is outside the purview of this book (see Ellis, 1997; Doughty and Williams, 1998), these techniques include form- or rule-based techniques, input-based techniques, feedback-based techniques and practice-based techniques (Norris and Ortega, 2000). Form- or rule-based techniques revolve around the instruction of grammatical forms.

 

Grammar processing and second language development

In the grammar-learning process, explicit grammatical knowledge refers to a conscious knowledge of grammatical forms and their meanings. Explicit knowledge is usually accessed slowly, even when it is almost fully automatized (Ellis, 2001b). DeKeyser (1995) characterizes grammatical instruction as ‘explicit’ when it involves the explanation of a rule or the request to focus on a grammatical feature. Instruction can be explicitly deductive, where learners are given rules and asked to apply them, or explicitly inductive, where they are given samples of language from which to generate rules and make generalizations.

 

Implications for assessing grammar

The studies investigating the effects of teaching and learning on grammatical performance present a number of challenges for language assessment. First of all, the notion that grammatical knowledge structures can be differentiated according to whether they are fully automatized (i.e., implicit) or not (i.e., explicit) raises important questions for the testing of grammatical ability (Ellis, 2001b). Given the many purposes of assessment, we might wish to test explicit knowledge of grammar, implicit knowledge of grammar or both. For example, in certain classroom contexts, we might want to assess the learners’ explicit knowledge of one or more grammatical forms, and could, therefore, ask learners to answer multiple-choice or short-answer questions related to these forms.

 

 

The role of grammar in models of communicative language ability

The role of grammar in models of communicative competence

Every language educator who has ever attempted to measure a student’s communicative language ability has wondered: ‘What exactly does a student need to “know” in terms of grammar to be able to use it well enough for some real-world purpose?’ In other words, they have been faced with the challenge of defining grammar for communicative purposes. To complicate matters further, linguistic notions of grammar have changed over time, as we have seen, and this has significantly increased the number of components that could be called ‘grammar’. In short, definitions of grammar and grammatical knowledge have changed over time and across context, and I expect this will be no different in the future.

 

Rea-Dickins’ definition of grammar

In discussing more specifically how grammatical knowledge might be tested within a communicative framework, Rea-Dickins (1991) defined ‘grammar’ as the single embodiment of syntax, semantics and pragmatics. She argued against Canale and Swain’s (1980) and Bachman’s (1990b) multi-componential view of communicative competence on the grounds that componential representations overlook the interdependence and interaction between and among the various components. She further stated that in Canale and Swain’s (1980) model, the notion of grammatical competence was limited since it defined grammar as ‘structure’ on the one hand and as ‘structure and semantics’ on the other, but ignored the notion of ‘structure as pragmatics’. Similarly, she added that in Bachman’s (1990b) model, grammar was defined as structure at the sentence level and as cohesion at the suprasentential level, but this model failed to account for the pragmatic dimension of communicative grammar.

 

Larsen-Freeman’s definition of grammar

Another conceptualization of grammar that merits attention is Larsen-Freeman’s (1991, 1997) framework for the teaching of grammar in communicative language teaching contexts. Drawing on several linguistic theories and influenced by language teaching pedagogy, she has also characterized grammatical knowledge along three dimensions: linguistic form, semantic meaning and pragmatic use. Form is defined as both morphology, or how words are formed, and syntactic patterns, or how words are strung together. This dimension is primarily concerned with linguistic accuracy. The meaning dimension describes the inherent or literal message conveyed by a lexical item or a lexico-grammatical feature. This dimension is mainly concerned with the meaningfulness of an utterance.

 

What is meant by ‘grammar’ for assessment purposes?

Regardless of the assessment purpose, if we wish to make inferences about grammatical ability on the basis of a grammar test or some other form of assessment, it is important to know what we mean by ‘grammar’ when attempting to specify components of grammatical knowledge for measurement purposes. With this goal in mind, we need a definition of grammatical knowledge that is broad enough to provide a theoretical basis for the construction and validation of tests in a number of contexts. At the same time, we need our definition to be precise enough to distinguish it from other areas of language ability.

 

 

Towards a definition of grammatical ability

Defining grammatical constructs

Although our basic underlying model of grammar will remain the same in all testing situations (i.e., grammatical form and meaning), what it means to ‘know’ grammar for different contexts will most likely change (see Chapelle, 1998). In other words, the type, range and scope of grammatical features required to communicate accurately and meaningfully will vary from one situation to another. For example, the type of grammatical knowledge needed to write a formal academic essay would be very different from that needed to make a train reservation. Given the many possible ways of interpreting what it means to ‘know’ grammar, it is important that we define what we mean by ‘grammatical knowledge’ for any given testing situation. A clear definition of what we believe it means to ‘know’ grammar for a particular testing context will then allow us to construct tests that measure grammatical ability.

 

Definition of key terms

Before continuing this discussion, it might be helpful if I clarified some of the key terms.

 

Knowledge of phonological or graphological form and meaning

Knowledge of phonological/graphological form enables us to understand and produce features of the sound or writing system, with the exception of meaning-based orthographies such as Chinese characters, as they are used to convey meaning in testing or language-use situations.

 

Knowledge of lexical form and meaning

Knowledge of lexical form enables us to understand and produce those features of words that encode grammar rather than those that reveal meaning. This includes words that mark gender (e.g., waitress), countability (e.g., people) or part of speech (e.g., relate, relation). For example, when the word ‘think’ in English is followed by the preposition ‘about’ before a noun, this is considered the grammatical dimension of lexis, representing a co-occurrence restriction with prepositions. One area of lexical form that poses a challenge to learners of some languages is word formation. This includes compounding in English with a noun + noun or a verb + particle pattern.

 

Knowledge of morphosyntactic form and meaning

Knowledge of morphosyntactic form permits us to understand and produce both the morphological and syntactic forms of the language. This includes the articles, prepositions, pronouns, affixes (e.g., -est), syntactic structures, word order, simple, compound and complex sentences, mood, voice and modality. A learner who knows the morphosyntactic form of the English conditionals would know that: (1) an if-clause sets up a condition and a result clause expresses the outcome; (2) both clauses can be in the sentence-initial position in English; (3) if can be deleted under certain conditions as long as the subject and operator are inverted; and (4) certain tense restrictions are imposed on if and result clauses.

 

Knowledge of cohesive form and meaning

Knowledge of cohesive form enables us to use the phonological, lexical and morphosyntactic features of the language in order to interpret and express cohesion on both the sentence and the discourse levels. Cohesive form is directly related to cohesive meaning through cohesive devices (e.g., she, this, here) which create links between cohesive forms and their referential meanings within the linguistic environment or the surrounding co-text. Halliday and Hasan (1976, 1989) list a number of grammatical forms for displaying cohesive meaning.

 

Knowledge of information management form and meaning

Knowledge of information management form allows us to use linguistic forms as a resource for interpreting and expressing the information structure of discourse. Some resources that help manage the presentation of information include, for example, prosody, word order, tense-aspect and parallel structures. These forms are used to create information management meaning.

 

Knowledge of interactional form and meaning

Knowledge of interactional form enables us to understand and use linguistic forms as a resource for understanding and managing talk-in-interaction. These forms include discourse markers and communication management strategies. Discourse markers consist of a set of adverbs, conjunctions and lexicalized expressions used to signal certain language functions.

 

Designing test tasks to measure L2 grammatical ability.

How does test development begin?

Every grammar-test development project begins with a desire to obtain (and often provide) information about how well a student knows grammar in order to convey meaning in some situation where the target language is used. The information obtained from this assessment then forms the basis for decision-making. Those situations in which we use the target language to communicate in real life or in which we use it for instruction or testing are referred to as the target language use (TLU) situations (Bachman and Palmer, 1996). Within these situations, the tasks or activities requiring language to achieve a communicative goal are called the target language use tasks.

 

What do we mean by ‘task’?

The notion of ‘task’ in language-learning contexts has been conceptualized in many different ways over the years. Traditionally, ‘task’ has referred to any activity that requires students to do something for the express purpose of learning the target language. A task then is any activity (e.g., short answers, role-plays) as long as it involves a linguistic or non-linguistic (e.g., circling the answer) response to input. Traditional learning or teaching tasks are characterized as having an intended pedagogical purpose – which may or may not be made explicit; they have a set of instructions that control the kind of activity to be performed; they contain input (e.g., questions); and they elicit a response.

 

What are the characteristics of grammatical test tasks?

As the goal of grammar assessment is to provide as useful a measurement as possible of our students’ grammatical ability, we need to design test tasks in which the variability of our students’ scores is attributed to the differences in their grammatical ability, and not to uncontrolled or irrelevant variability resulting from the types of tasks or the quality of the tasks that we have put on our tests. As all language teachers know, the kinds of tasks we use in tests and their quality can greatly influence how students will perform.

 

The Bachman and Palmer framework

Bachman and Palmer’s (1996) framework of task characteristics represents the most recent thinking in language assessment of the potential relationships between task characteristics and test performance. In this framework, they outline five general aspects of tasks, each of which is characterized by a set of distinctive features. These five aspects describe characteristics of (1) the setting, (2) the test rubrics, (3) the input, (4) the expected response and (5) the relationship between the input and response.

 

Describing grammar test tasks

When language teachers consider tasks for grammar tests, they call to mind a large repertoire of task types that have been commonly used in teaching and testing contexts. We now know that these holistic task types constitute collections of task characteristics for eliciting performance and that these holistic task types can vary on a number of dimensions. We also need to remember that the tasks we include on tests should strive to match the types of language-use tasks found in real-life or language instructional domains.

 

Selected-response task types

Selected-response tasks present input in the form of an item, and test takers are expected to select the response. Other than that, all other task characteristics can vary. For example, the form of the input can be language, non-language or both, and the length of the input can vary from a word to larger pieces of discourse. In terms of the response, selected-response tasks are intended to measure recognition or recall of grammatical form and/or meaning.

 

Limited-production task types

Limited-production tasks are intended to assess one or more areas of grammatical knowledge depending on the construct definition. Unlike selected-response items, which usually have only one possible answer, the range of possible answers for limited-production tasks can, at times, be large – even when the response involves a single word.

 

Developing tests to measure L2 grammatical ability

What makes a grammar test ‘useful’?

We concluded in the last chapter that the goal of every grammar test was to obtain (and provide) information on how well a student knows or can use grammar to convey meaning in some situation where the target language is used. The responses to the test items can then be used as a basis for assigning scores and for making inferences about the student’s underlying grammatical ability. We discussed these responses in terms of inferences because it is not possible to observe a person’s grammatical ability directly; rather, we must infer the underlying ability from responses to questions or from samples of actual performance.

 

The quality of reliability

The scores from tests or components of tests can be characterized as reliable when the tests provide the same results every time we administer them, regardless of the conditions under which they are administered.
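One common way to put a number on this idea is a test-retest check: the same students sit the same test twice, and the two sets of scores are correlated. The sketch below is only an illustration and is not taken from the book. The function name `pearson_r` and all of the scores are hypothetical, and the Pearson correlation is an assumed choice of statistic. A value close to 1 would suggest that the test ranks students consistently across administrations.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_a = sum(first) / len(first)
    mean_b = sum(second) / len(second)
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for five students on two sittings of the same grammar test.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 19, 13]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

A check like this covers only one kind of consistency, namely stability over time. It says nothing about whether the scores mean what we want them to mean, which is the concern of construct validity.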

 

The quality of construct validity

The second quality that all ‘useful’ tests possess is construct validity. Bachman and Palmer (1996) define construct validity as ‘the extent to which we can interpret a given test score as an indicator of the ability(ies), or construct(s), we want to measure. Construct validity also has to do with the domain of generalization to which our score interpretations generalize’ (p. 21). In other words, construct validity not only refers to the meaningfulness and appropriateness of the interpretations we make based on test scores, but it also pertains to the degree to which the score-based interpretations can be extrapolated beyond the testing situation to a particular TLU domain (Messick, 1993).

 

The quality of authenticity

A third quality of test usefulness is authenticity, a notion much discussed in language testing since the late 1970s, when communicative approaches to language teaching were first taking root. Building on these discussions, Bachman and Palmer (1996) refer to ‘authenticity’ as the degree of correspondence between the test-task characteristics and the TLU task characteristics. Given the framework for test-task characteristics discussed in Chapter 5, they provide a systematic way of matching test tasks with TLU tasks in terms of the features of the test setting, rubrics, input, expected response and the relationship between the input and response.

 

The quality of interactiveness

A fourth quality of test usefulness outlined by Bachman and Palmer (1996) is interactiveness. This quality refers to the degree to which the aspects of the test-taker’s language ability we want to measure (e.g., grammatical knowledge, language knowledge) are engaged by the test-task characteristics (e.g., the input, response, and relationship between the input and response) based on the test constructs. In other words, the task should engage the characteristics we want to measure (e.g., grammatical knowledge) given the test purpose, and nothing else (e.g., topical knowledge, affective schemata); otherwise, this may mask the very constructs we are trying to measure. In the case of grammar assessment, test tasks can be characterized as ‘interactive’ to the extent that they require individuals to draw on and manage their cognitive and metacognitive strategies (i.e., their strategic competence) in order to use grammatical knowledge accurately and meaningfully.

 

The quality of impact

Testing plays an important role in society. Tests serve as gate-keeping devices or doors to opportunity. They can be used to punish and to praise. It is, therefore, important to recognize that tests reflect and represent the social, cultural and political values of any given society, and in the evaluation of test usefulness, we must take into consideration the possible consequences that may ensue from the decision to use test results for decision-making. Bachman and Palmer (1996) refer to the degree to which testing and test score decisions influence all aspects of society and the individuals within that society as test impact.

 

The quality of practicality

Scores from a grammar test could be highly reliable and provide a basis for making valid inferences, yet the test itself could be completely lacking in practicality. It may be completely beyond our means with respect to the available human, material or time resources. Test practicality is not a quality of a test itself, but is a function of the extent to which we are able to balance the costs associated with designing, developing, administering, and scoring a test in light of the available resources (Bachman, personal communication, 2002).

 

Overview of grammar-test construction

Each testing situation is specific unto itself, with a specific purpose, a specific audience and a specific set of parameters that will affect the test design and development process. As a result, there is no one ‘right’ way to develop a test; nor are there any recipes for ‘good’ tests that could generalize to all situations. There are, however, several frameworks of test development that have been proposed (e.g., Alderson, Clapham and Wall, 1995; Bachman and Palmer, 1996; Brown, 1996; Davidson and Lynch, 2002) which serve to guide the test-development process so that the qualities of test usefulness will not be ignored.

 

Illustrative tests of grammatical ability

The First Certificate in English Language Test (FCE)

Purpose

The First Certificate in English (FCE) exam was first developed by the University of Cambridge Local Examinations Syndicate (UCLES, now Cambridge ESOL) in 1939 and has been revised periodically ever since. This exam is the most widely taken Cambridge ESOL examination with an annual candidature of over 270,000 (see http://www.cambridgeesol.org/exam/index.cfm). The purpose of the FCE (Cambridge ESOL, 2001a) is to assess the general English language proficiency of learners as measured by their abilities in reading, writing, speaking, listening, and knowledge of the lexical and grammatical systems of English (Cambridge ESOL, 1995, p. 4). More specifically, the FCE is a level-three exam in the Cambridge main suite of exams, and consists of five compulsory subtests or ‘papers’: reading, writing, use of English, listening and speaking (Cambridge ESOL, 1996, p. 8). Students who pass the FCE are assumed to have sufficient proficiency to handle routine office jobs (clerical, managerial) and to take courses given in English (Cambridge ESOL, 2001a, p. 6). Given that the FCE can be used as certification of English language proficiency for certain types of jobs, it is considered a high-stakes test.

 

The Comprehensive English Language Test (CELT)

Purpose

The Comprehensive English Language Test (CELT) (Harris and Palmer, 1970a, 1986) was designed to measure the English language ability of nonnative speakers of English. The authors claim in the technical manual (Harris and Palmer, 1970b) that this test is most appropriate for students at the intermediate or advanced levels of proficiency. English language proficiency is measured by means of a structure subtest, a vocabulary subtest and a listening subtest. According to the authors, these subtests can be used alone or in combination (p. 1). Scores from the CELT have been used to make decisions related to placement in a language program, acceptance into a university and achievement in a language course (Harris and Palmer, 1970b, p. 1), and for this reason, it may be considered a high-stakes test. One or more subtests of the CELT have also been used as a measure of English language proficiency in SLA research.

 

Learning-oriented assessments of grammatical ability

What is learning-oriented assessment of grammar?

In reaction to conventional testing practices typified by large-scale, discrete-point, multiple-choice tests of language ability, several educators (e.g., Herman, Aschbacher and Winters, 1992; Short, 1993; Shohamy, 1995; Shepard, 2000) have advocated reforms so that assessment practices might better capture educational outcomes and might be more consistent with classroom goals, curricula and instruction.

 

Implementing learning-oriented assessment of grammar

Considerations from grammar-testing theory

The development procedures for constructing large-scale assessments of grammatical ability discussed in Chapter 6 are similar to those needed to develop learning-oriented assessments of grammar for classroom purposes with the exception that the decisions made from classroom assessments will be somewhat different due to the learning-oriented mandate of classroom assessment. Also, given the usual low-stakes nature of the decisions in classroom assessment, the amount of resources that needs to be expended is generally less than that required for large-scale assessment. In this section, without repeating what was discussed in Chapter 6, I will highlight some of the implications this mandate might have for test design and operationalization.

 

Considerations from L2 learning theory

Given that learning-oriented assessment involves the collection and interpretation of evidence about performance so that judgments can be made about further language development, learning-oriented assessment of grammar needs to be rooted not only in a theory of grammar testing or language proficiency, but also in a theory of L2 learning. What is striking in the literature is that models of language ability rarely refer to models of language learning, and models of language learning rarely make reference to models of language ability. In learning-oriented assessment, the consideration of both perspectives is critical.

 

Illustrative example of learning-oriented assessment

Let us now turn to an illustration of a learning-oriented achievement test of grammatical ability.

 

Making assessment learning-oriented

The On Target achievement tests were designed with a clear learning mandate. The content of the tests had to be strictly aligned with the content of the curriculum. This obviously had several implications for the test design and its operationalization. From a testing perspective, the primary purpose of the Unit 7 achievement test was to measure the students’ explicit as well as their implicit knowledge of grammatical form and meaning on both the sentence and discourse levels.

 

Challenges and new directions in assessing grammatical ability

The state of grammar assessment

In the last fifty years, language testers have dedicated a great deal of time to discussing the nature of language proficiency and the testing of the four skills, the qualities of test usefulness (i.e., reliability, authenticity), the relationships between test-taker or task characteristics and performance, and numerous statistical procedures for examining data and providing evidence of test validity.

 

Challenge 1: Defining grammatical ability

One major challenge revolves around how grammatical ability has been defined both theoretically and operationally in language testing. As we saw in Chapters 3 and 4, in the 1960s and 1970s language teaching and language testing maintained a strong syntactocentric view of language rooted largely in linguistic structuralism. Moreover, models of language ability, such as those proposed by Lado (1961) and Carroll (1961), had a clear linguistic focus, and assessment concentrated on measuring language elements – defined in terms of morphosyntactic forms on the sentence level – while performing language skills.

 

Challenge 2: Scoring grammatical ability

A second challenge relates to scoring, as the specification of both form and meaning is likely to influence the ways in which grammar assessments are scored. As we discussed in Chapter 6, responses with multiple criteria for correctness may necessitate different scoring procedures. For example, the use of dichotomous scoring, even with certain selected-response items, might need to give way to partial-credit scoring, since some wrong answers may reflect partial development either in form or meaning. As a result, language educators might need to adapt their scoring procedures to reflect the two dimensions of grammatical knowledge.
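
The contrast between dichotomous and partial-credit scoring can be illustrated with a minimal sketch. It is not a procedure taken from the book: the `Response` class, the two rater judgments it holds and the one-point-per-dimension weighting are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One rated response; both fields are hypothetical rater judgments."""
    form_correct: bool     # is the grammatical form accurate?
    meaning_correct: bool  # is the intended meaning conveyed?


def dichotomous_score(r: Response) -> int:
    """Right/wrong scoring: credit only if form and meaning are both correct."""
    return 1 if r.form_correct and r.meaning_correct else 0


def partial_credit_score(r: Response) -> int:
    """Partial-credit scoring: one point per dimension, giving 0, 1 or 2."""
    return int(r.form_correct) + int(r.meaning_correct)


responses = [
    Response(form_correct=True, meaning_correct=True),
    Response(form_correct=False, meaning_correct=True),   # meaning conveyed, form inaccurate
    Response(form_correct=False, meaning_correct=False),
]

dichotomous = [dichotomous_score(r) for r in responses]
partial = [partial_credit_score(r) for r in responses]
```

Under dichotomous scoring the second and third responses both receive zero, whereas partial-credit scoring separates them, which is the kind of partial development in form or meaning that the paragraph above has in mind.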

 

Challenge 3: Assessing meanings

The third challenge revolves around ‘meaning’ and how ‘meaning’ in a model of communicative language ability can be defined and assessed. The ‘communicative’ in communicative language teaching, communicative language testing, communicative language ability, or communicative competence refers to the conveyance of ideas, information, feelings, attitudes and other intangible meanings (e.g., social status) through language.

Challenge 4: Reconsidering grammar-test tasks

The fourth challenge relates to the design of test tasks that are capable of both measuring grammatical ability and providing authentic and engaging measures of grammatical performance. Since the early 1960s, language educators have associated grammar tests with discrete-point, multiple-choice tests of grammatical form. These and other ‘traditional’ test tasks (e.g., grammaticality judgments) have been severely criticized for lacking in authenticity, for not engaging test-takers in language use, and for promoting behaviors that are not readily consistent with communicative language teaching.

 

Challenge 5: Assessing the development of grammatical ability

The fifth challenge revolves around the argument, made by some researchers, that grammatical assessments should be constructed, scored and interpreted with developmental proficiency levels in mind. This notion stems from the work of several SLA researchers (e.g., Clahsen, 1985; Pienemann and Johnson, 1987; Ellis, 2001b) who maintain that the principal finding from years of SLA research is that structures appear to be acquired in a fixed order and a fixed developmental sequence. Furthermore, instruction on forms in non-contiguous stages appears to be ineffective. As a result, the acquisitional development of learners, they argue, should be a major consideration in L2 grammar testing.

 

Final remarks

Despite loud claims in the 1970s and 1980s by a few influential SLA researchers that instruction, and in particular explicit grammar instruction, had no effect on language learning, most language teachers around the world never really gave up grammar teaching. Furthermore, these claims have instigated an explosion of empirical research in SLA, the results of which have made a compelling case for the effectiveness of certain types of both explicit and implicit grammar instruction. This research has also highlighted the important role that meaning plays in learning grammatical forms.

 

 

“SUMMARY ASSESSING VOCABULARY 1-146”

 

The place of vocabulary in language assessment

Recent trends in language testing

However, scholars in the field of language testing have a rather different perspective on vocabulary-test items of the conventional kind. Such items fit neatly into what language testers call the discrete point approach to testing. This involves designing tests to assess whether learners have knowledge of particular structural elements of the language: word meanings, word forms, sentence patterns, sound contrasts and so on. In the last thirty years of the twentieth century, language testers progressively moved away from this approach, to the extent that such tests are now quite out of step with current thinking about how to design language tests, especially for proficiency assessment.

 

Three dimensions of vocabulary assessment

Up to this point, I have outlined two contrasting perspectives on the role of vocabulary in language assessment. One point of view is that it is perfectly sensible to write tests that measure whether learners know the meaning and usage of a set of words, taken as independent semantic units. The other view is that vocabulary must always be assessed in the context of a language-use task, where it interacts in a natural way with other components of language knowledge. To some extent, the two views are complementary in that they relate to different purposes of assessment.

 

Discrete - embedded

The first dimension focuses on the construct which underlies the assessment instrument. In language testing, the term construct refers to the mental attribute or ability that a test is designed to measure. In the case of a traditional vocabulary test, the construct can usually be labelled as ‘vocabulary knowledge’ of some kind. The practical significance of defining the construct is that it allows us to clarify the meaning of the test results. Normally we want to interpret the scores on a vocabulary test as a measure of some aspect of the learners' vocabulary knowledge, such as their progress in learning words from the last several units in the course book, their ability to supply derived forms of base words (like scientist and scientific, from science), or their skill at inferring the meaning of unknown words in a reading passage. Thus, a discrete test takes vocabulary knowledge as a distinct construct, separated from other components of language competence.

 

Selective - comprehensive

The second dimension concerns the range of vocabulary to be included in the assessment. A conventional vocabulary test is based on a set of target words selected by the test-writer, and the test-takers are assessed according to how well they demonstrate their knowledge of the meaning or use of those words. This is what I call a selective vocabulary measure. The target words may either be selected as individual words and then incorporated into separate test items, or alternatively the test-writer first chooses a suitable text and then uses certain words from it as the basis for the vocabulary assessment.

 

Context-independent - context-dependent

The role of context, which is an old issue in vocabulary testing, is the basis for the third dimension. Traditionally contextualisation has meant that a word is presented to test-takers in a sentence rather than as an isolated element. From a contemporary perspective, it is necessary to broaden the notion of context to include whole texts and, more generally, discourse. In addition, we need to recognise that contextualisation is more than just a matter of the way in which vocabulary is presented. The key question is to what extent the test takers are being assessed on the basis of their ability to engage with the context provided in the test.

 

An overview of the book

The three dimensions are not intended to form a comprehensive model of vocabulary assessment. Rather, they provide a basis for locating the variety of assessment procedures currently in use within a common framework and, in particular, they offer points of contact between tests which treat words as discrete units and ones that assess vocabulary more integratively in a task-based testing context. At various points through the book I refer to the dimensions and exemplify them. Since a large proportion of work on vocabulary assessment to date has involved instruments which are relatively discrete, selective and context independent in nature, this approach may seem to be predominant in several of the following chapters. However, my aim is to present a balanced view of the subject, and I discuss measures that are more embedded, comprehensive and context dependent wherever the opportunity arises, and especially in the last two chapters of the book.

Chapter 5 presents case studies of four vocabulary tests:

• Nation's Vocabulary Levels Test;

• Meara and Jones's Eurocentres Vocabulary Size Test;

• Paribakht and Wesche's Vocabulary Knowledge Scale; and

• the vocabulary items in the Test of English as a Foreign Language (TOEFL).

 

REFERENCES:

 

Purpura, James. 2004. ASSESSING GRAMMAR. United Kingdom: Cambridge University Press.

 

Read, John. 2000. ASSESSING VOCABULARY. United Kingdom: Cambridge University Press.

 
