Critical Issue: Ensuring Equity with Alternative Assessments

ISSUE: If students are to be held responsible for achieving high educational standards, it is ethically imperative that educators develop assessment strategies that ensure equity in assessing and interpreting student performance. In order to protect students from unfair and damaging interpretations and to provide parents and communities with an accurate overall picture of student achievement, educators need to be aware of the promise and the challenges inherent in using alternative assessment practices for high-stakes decisions (such as student retention, promotion, graduation, and assignment to particular instructional groups), which have profound consequences for the students affected. Only then will educators be able to build and use an assessment system that is a vehicle for eliminating, as opposed to underscoring, educational inequities. Although alternative assessments can help ensure ethnic, racial, economic, and gender fairness, equity cannot be achieved by reforms to assessment alone. Change will result only from a trio of reform initiatives aimed at ongoing professional development in curriculum and instruction, improved pedagogy, and quality assessment.

OVERVIEW: One of the reasons for the current national disenchantment with standardized multiple-choice tests, secured tests, and other norm-referenced assessments has been the gross inequities that have resulted from inferences based solely on these tests. In many schools, districts, and states, interpretations based on a single test score have been used to place students in low-track classes, to require students to repeat grades, and to deny high school graduation diplomas. The negative personal and societal effects for students are well-documented: exposure to a less challenging curriculum, significantly increased dropout rates, and lives of unemployment and welfare dependency (Oakes, 1986a; Oakes, 1986b; Shepard & Smith, 1986; Jaeger, 1991). Clearly, using testing as a mechanism for sorting and selecting students for access to educational and economic opportunities is antithetical to achieving equity.

At all levels, educators are turning to alternative, performance-based assessments that are backed by criterion-referenced standards. Such assessments help educators gain a deeper understanding of student learning, and enable them to communicate evidence of that learning to parents, employers, and the community at large. These new alternative assessments and standards have been heralded as the answer to a whole host of education ills, including the apparent or real gap in performance between students of different ethnic, socioeconomic, and language backgrounds. Research on learning and assessment and on the prevailing practice of shaping instruction to meet test requirements helps build the case for alternative assessment.

Findings from cognitive psychology on the nature of meaningful, engaged learning support the use of alternative assessments that are tied to curriculum and instruction and that emphasize higher-order thinking skills and authentic tasks. Alternative assessments often have high fidelity for the goals of instruction and require students to solve complex, real-life problems. Some educators believe that alternative assessments motivate students to show their best performance--performance that may have been masked in the past by standardized fixed-response tests and by unmotivating content. However, the biggest mistake that schools, districts, and states can make is thinking that exchanging one high-stakes test for another will result in equitable assessment or elimination of the performance gap between students. Darling-Hammond (1994) believes that if new forms of assessment are to support real and lasting reforms and to close--as opposed to accentuate--the achievement gap between students, they must be developed carefully and used for different purposes than the norm-referenced tests that have preceded them. These purposes must be made explicit before the assessment system is built.

It is true that new forms of assessment are powerful tools for understanding student performance, particularly in areas that require critical thinking and complex problem solving. However, until high expectations for success, sufficient opportunity to learn, and challenging instruction are the standard educational fare for all children, some evidence (Elliott, 1993; LeMahieu, Eresh, & Wallace, 1992) suggests that alternative assessments may reveal even greater achievement gaps than standardized assessments.

One of the most exciting and liberating things about the current interest in assessment is the recognition that numerous assessment tools are available to schools, districts, and states that are developing new assessment systems. These tools range from standardized fixed-response tests to alternatives such as performance assessment, exhibitions, portfolios, and observation scales. Each type of assessment brings different strengths and weaknesses to the problem of fair and equitable assessment. Given the complexity of understanding performance or success for individuals, it is virtually impossible for any single tool to do the job of fairly assessing student performance. Instead, the National Center for Research on Evaluation, Standards, and Student Testing (1996) suggests that an assessment system made up of multiple assessments (including norm-referenced or criterion-referenced assessments, alternative assessments, and classroom assessments) can produce "comprehensive, credible, dependable information upon which important decisions can be made about students, schools, districts, or states." Koelsch, Estrin, and Farr (1995) note that multiple assessment indicators are especially important for assessing the performance of ethnic-minority and language-minority students. The real challenge comes in selecting or developing a combination of assessments that work together as part of a comprehensive assessment system to assess all students equitably within the school community.

The first and most critical step in assessing with equity is determining the purposes for assessing and clarifying whether those purposes are low stakes or high stakes (Winking & Bond, 1995). In many cases, schools, districts, and states have not a single purpose, but multiple purposes--some low stakes and some high stakes--for assessing student performance.

In the low-stakes case of classroom-based assessment, where the primary purpose is determining content coverage and conceptual understanding or diagnosing learning styles, teachers are able to take into account the student's culture, prior knowledge, experiences, and language differences. When preparing and administering assessments, teachers can follow guidelines for equitable assessment in the classroom and make use of accommodations and adaptations to the assessment to ensure that all students have an equal opportunity to demonstrate their abilities and achievement. Teachers also are able to make inferences about student performance and how they must refine their instruction to increase or maintain high performance without calling into question the technical adequacy of the assessment.

However, when tests have high-stakes consequences (such as student retention, promotion, or graduation), it is important to understand ways to maximize equity while not compromising the technical quality of alternative assessments. In high-stakes situations, the technical adequacy of the assessment affects the validity of inferences made regarding the performance of all students. When alternative tests are used for high-stakes purposes, schools--in addition to being concerned about equity when selecting or developing assessments--must take advantage of methods for maximizing fairness in administering and scoring them. Of utmost importance is ensuring that students have had adequate opportunity to learn the material on which they are being tested.

 

Regardless of the level of the assessment effort, equity will never be achieved as long as everyone involved in educating children sees the assessment tools themselves as responsible for ensuring fairness. It is not just the tools, but also the curriculum, instruction, professional development, parent and community involvement, and leadership practices that affect the fairness of assessments and the inferences based on them. Using alternative assessment to assess with equity requires the comprehensive inclusion of each of these elements of the equity equation. Without these supporting systems, new forms of assessment are likely to maintain and perhaps magnify educational inequities.

IMPLEMENTATION PITFALLS:

Some types of alternative assessment require teachers to devote considerable time to planning and administering the assessment as well as interpreting student achievement.

Schools may think that the substitution of one high-stakes test for another will result in equitable assessment or the elimination of performance gaps. Yet performance gaps are likely to continue if teaching and assessment strategies remain unchanged. Linn, Baker, and Dunbar (1991) note:

"Gaps in performance among groups exist because of difference in familiarity, exposure, and motivation on the tasks of interest. Substantial changes in instructional strategy and resource allocation are required to give students adequate preparation for complex, time-consuming, open-ended assessments." (p. 18)

Schools may develop and use alternative assessments with the expectation that a better monitoring system or new forms of assessment alone will address inequitable learning outcomes for students. In actuality, assessment must be integrated with curriculum and instruction in order to promote equity in student learning.

In an effort to address higher-order cognitive skills, schools may develop assessments that have ambiguous performance tasks or requirements. Such tasks or requirements may be interpreted very differently by different cultural groups.

Schools may attempt to use alternative assessments for sorting and classifying students according to ability level instead of for improving instruction and raising student achievement. Darling-Hammond (1994) notes that in order to close the achievement gap, new forms of assessment must be developed carefully and be used for different purposes than norm-referenced tests.

Schools and districts may fail to develop policies for using alternative assessment information to improve instruction. They also may not provide ongoing professional development in alternative assessment for teachers. Winfield and Woodard (1994) note: "Merely setting high standards and developing a new assessment system will not ensure changes in teacher behavior or student performance unless professional development activities and capacity building at the school level are given equal priority" (p. 8).

Bond, Moss, and Carr (1996) caution that assessments--even those deemed to be unbiased--may be used to support a policy or program that does not promote equity:

"Concerns about equity spill over the consensual bonds of validity and bias to include questions about the educational system in which the assessment was used. It is possible for an assessment to be considered unbiased in a technical sense--in the sense that the intended interpretation is equally valid across various groups of concern--and yet be used in service of a policy that fails to promote equity....The question for assessment evaluators is whether an assessment is contributing to or detracting from the fairness of the educational system of which it is a part." (p. 118)

Some teachers, parents, and community members may express resistance to any form of alternative assessment. Teachers, in particular, may object to the additional time necessary for developing and grading performance assessments, and may have difficulty in specifying criteria for judging student work.

Schools, districts, and states may exempt from assessments students who traditionally have not performed well (e.g., second-language learners), thereby avoiding the problem of developing fair measures that provide a picture of the entire school community (Phillips, 1996).

Educators may administer alternative assessments and then rush to blame the test or the children for performance gaps. Instead, educators need to be accountable for student achievement. They also must align assessment with curriculum and instruction in order to improve student learning.

When reporting assessment results, educators must learn to use opportunity-to-learn data with care. Some schools and districts report scores for subgroups of students in the absence of opportunity-to-learn data; other schools develop opportunity-to-learn standards that measure only easy-to-access variables that are ancillary to good instruction (e.g., number of books in the library).

When analyzing test results, pairing isolated opportunity-to-learn variables with subgroup data can lead to erroneous cause-and-effect interpretations. For example, comparing the performance of Hispanic and non-Hispanic students along with the amount of reading assigned outside of school is inappropriate because of the lack of information on other important contextualizing factors.

DIFFERENT POINTS OF VIEW:

Although no educator would say that equitable assessment is not important, there are emerging schools of thought about the nature of equity and how it relates to assessment. In particular, these viewpoints relate to achieving a level playing field for assessing student work. Most researchers and practitioners agree that equity must be a major consideration when planning, developing, and administering assessment systems. Some researchers (Garcia & Pearson, 1994; Johnston, 1992; Estrin, 1993), however, believe that students' cultural learnings and interpretations of the world around them are so tied to their responses that it is unfair not to address these learnings and interpretations directly. These researchers feel that the only way to truly understand a student's performance is through assessments that are situated in the local realities of schools, classrooms, teachers, and students. Proponents of situated assessment argue that it is unlikely that large-scale, high-stakes assessment could ever equitably measure student performance. They see familiar raters (the student's teacher or panels of individuals) as best able to judge a student's work because familiarity is necessary to understand the response patterns and culturally tied conceptions of testing and learning that each student brings to the assessment.
