3.2.3. Self-Assessment
3.2.3.1. Introduction
Based on work carried out since the late 1970s, various authors and researchers agree on self-assessment as a vital part of learner autonomy (Henner-Stanchina & Holec 1985:98; Dickinson 1987:16; Blanche 1988:75; Harris 1997:12), providing the opportunity for learners to assess their own progress and thus helping them to focus on their own learning. Hunt, Gow & Barnes even claim that without learner self-evaluation and self-assessment "... there can be no real autonomy" (1989:207). Rea (1981) sees self-appraisal as helping the learner become aware of his/her responsibilities in planning, executing and monitoring his/her language learning activities, and Oscarson agrees on such a formative prime aim, adding a more summative secondary aim of enabling the learner "to assess his total achievement at the end of a course or course unit" (1978). Dickinson points out that this does not necessarily devalue or conflict with external evaluation, which still has relevance for supplying official certification of learning (1987:136). Rather, as Dickinson & Carver observe:

A language course can only deal with a small fraction of the foreign language; therefore one objective of language courses should be to teach learners how to carry on learning the language independently. Part of the training learners need for this purpose is training in self-assessment and self-monitoring. (1980)

3.2.3.2. Self-Assessment: History
Self-assessment research in language education has had two main goals (cf. Heidt 1979; Oscarson 1984): i) the investigation of possible ways of realising the goal of learner participation in matters of assessment and evaluation; ii) the investigation of the degree to which self-assessment instruments and procedures yield relevant and dependable results. The following brief survey of results in these fields (table XIX, below) is based on Oscarson's article on self-assessment of foreign and second language proficiency (1997:175-87):

TABLE XIX: SURVEY OF FINDINGS IN SELF-ASSESSMENT RESEARCH (BASED ON OSCARSON 1997:175-187).

Researcher

Results: General

Dickinson (1978)

Self-assessment is an area of second language learning for which the learner can potentially take responsibility

Lewkowicz & Moon (1985)

Considerable physical constraints may hinder effective learner-centred methodology, including self-assessment.

Blanche (1988)

"...most learners would be likely to find it comparatively easier to assess their communicative skills."

Blanche (1988)

In an examination of eight studies (Lee 1981; Low 1981; Fok 1981; Bournemouth Eurocentre 1982; von Elek 1981; 1982; Heindler 1980; Heidt 1979), results showed that self-evaluation practices appeared to have increased learners' motivation, and that more proficient subjects tended to underrate their linguistic abilities (cf. Ferguson 1978; Evers 1981; Achara 1980; Heindler 1980). Overestimation involved weak students to a greater extent than high achievers (cf. Ferguson 1978; Heindler 1980; Blanche 1985).

Strong-Krause (1997)

The predictive value of self-assessment placement instruments increases with the specificity of the task.

Results: Validation studies

Shrauger & Osberg (1981)

The relative accuracy of self-assessment is at least comparable to other assessment methods.

Oscarson (1978)

Adult learners studying EFL are able to make fairly accurate appraisals of their linguistic ability using a variety of scaled descriptions of performance as rating instruments.

Clarke (1981)

Most of the correlations between the self-assessment scores of U.S. college students and their FSI (Foreign Service Institute) interview scores and listening and reading test scores proved sufficient to warrant the use of the self-estimates in the survey of proficiency in designated foreign languages for U.S. college students.

LeBlanc & Painchaud (1985)

High correlations (c. 0.80) achieved by first-year undergraduate students led to the use of self-assessment as a placement instrument in the University of Ottawa's second language programmes.

Blanche (1988)

There is an emerging pattern of consistent overall agreement between self-assessments and ratings based on a variety of external criteria (Raasch 1979; 1980; Palmer & Bachman 1981; Evers 1981; LeBlanc & Painchaud 1985; Achara 1980; von Elek 1981; Rea 1981; Blanche 1985; Oscarson 1978; 1980).

Bachman & Palmer (1989)

Language users are more aware of the areas in which they have difficulty than of the areas they find easy. Self-ratings can be reliable and valid measures of communicative language abilities.

Pierce, Swain & Hart (1993)

Only a weak relationship was found between self-assessment by French Grade 8 immersion students and language test results.

Blue (1994)

There was a poor match between teachers' and students' assessments of the students' English for Academic Purposes learning at university level.

Hargan (1994)

A self-assessment procedure at university level resulted in much the same level placements as indicated by a traditional multiple choice test. Krauser (1991) reports similar findings.

Moritz (1995)

It seems unreasonable to employ self-assessment as a measuring tool in any situation which entails a comparison of students' abilities.

Wilson (1996)

Using English, German and French language self-assessment adaptations of the Foreign Service Institute/Interagency Language Roundtable (FSI/ILR) oral proficiency rating scale, in conjunction with an objective norm-referenced test, Wilson found that participants were capable of placing themselves "as they probably would have been placed, on the average, by professional raters using the (FSI-type) Language Proficiency Interview procedure".

Smith & Baldauf (1982)

Close association between the concurrent validity of self- and trained interviewer ratings for immigrants to Australia.

 

 

Results: Migrant studies

Von Elek (1985)

Strong agreement between adult immigrant students' assessments and those made by their Swedish teachers.

Coombe (1992)

Evidence of a strong relationship between self-assessment ratings and functional literacy skills in Russian, Vietnamese and Cambodian refugees learning ESL in the US.

Janssen-van Dieten (1992)

Moderate relationship between self-assessment and a test of Dutch as a second language for adult immigrants.

Latoma (1996)

Comparison of variability on background factors and self-assessment proficiency levels among adult migrants in the Nordic countries showed more favourable results than in Janssen-van Dieten's study (1992).

 

Results: Research on relationships and affect

Blanche (1990)

No systematic relationship between prior exposure to foreign languages and self-assessment 'error rates'.

Erickson (1996)

Girls (more than boys) held what proved to be unfounded pessimistic views of their possible success in a language test which they had taken before the self-appraisal.

MacIntyre et al. (1997)

Language anxiety was negatively correlated with perceived proficiency.

 

Results: Self-assessment applied

Oskarsson (1978)

Designed a number of simple self-assessment questionnaires, using behavioural specifications as the general frame of reference.

Oskarsson (1984)

Designed further samples of self-assessment tools, including a proposed form of 'continuous self-assessment' conceived as a possible model for an instrument intended to be used on a regular recurrent basis.

Von Elek (1982)

Developed and published a self-diagnostic test of Swedish as a second language.

Lewkowicz & Moon (1985)

Practical and useful presentation of learner-centred evaluative materials and activities.

LeBlanc & Painchaud (1985)

Substitution of self-assessment questionnaires for the previously-used standardised proficiency tests.

Heilenman (1991)

Description of steps that may be taken in the practical development of self-assessment placement materials.

Cram (1997)

Practical illustration of self-assessment applied in the second language classroom.

Harris (1997)

Further examples of the role of self-assessment in formal settings.

 

3.2.3.3. Justifications
The favourable correlations between self-rating scores and external test scores reported in most findings support the use of self-assessment (SA) in second language learning (but cf. Blue 1994; Pierce et al. 1993), and Oscarson's "rationale of self-assessment procedures in language learning" (1989:3) serves as a framework for the various justifications for self-assessment that have been proposed:

1.      promotion of learning: "Self-rating requires the student to exercise a variety of learning strategies and higher order thinking skills that not only provide feedback to the student but also provide direction for future learning" (Chamot & O'Malley, 1994:119). Assessment leading towards evaluation is an important educational objective in its own right; training learners in this is beneficial to learning (Dickinson 1987:136);

2.      raised level of awareness: "Students need to know what their abilities are, how much progress they are making and what they can (or cannot yet) do with the skills they have acquired" (Blanche 1988:75);

3.      improved goal orientation: "Engaging the learner actively in the evaluation of learning effects will probably lead to greater interest in techniques for continuous assessment, as opposed to terminal or 'end-of-unit' assessment" (Oscarson 1978:2);

4.      expansion of range of assessment: If learners can appraise their own performance accurately enough, "they do not have to depend entirely on the opinion of teachers and at the same time they can make teachers aware of their individual learning needs" (Blanche 1988:75);

5.      shared assessment burden: Self-assessment is one way of alleviating the assessment burden on the teacher (Dickinson 1987:136). "Combining self-assessment with teacher assessment means that the latter can become more effective" (Harris 1997:17);

6.      beneficial postcourse effects: Self-assessment is a necessary part of self-direction (Dickinson 1987:136).

Much of the self-assessment debate focuses on its feasibility and practicality for self-directed individuals, often in self-access study situations. Harris (1997:13) also sees it as appropriate in test-driven secondary and tertiary education, claiming that self-assessment can help learners in such environments to become more active, to locate their own strengths and weaknesses, and to realise that they have the ultimate responsibility for learning. By encouraging individual reflection, "self-assessment can begin to make students see their learning in personal terms [and] can help learners get better marks" (Harris 1997:13). Peer assessment (a form of self-assessment [Tudor 1996:182], justified largely by the same arguments) is especially applicable to the classroom setting, aiming to encourage students to take increased responsibility for their own curricula and to become active participants in the learning process (Hill 1994:214; Miller & Ng 1996:134). Tudor adds that critical reflection on the abilities of other learners with respect to a shared goal is a practical form of learner training which helps individuals to assess their own performance, and which reduces the stress of error correction by identifying errors in others (Tudor 1996:182). Thus Assinder (1991:218-28) reports increased motivation, participation, real communication, in-depth understanding, commitment, confidence, meaningful practice and accuracy when students prepare learning tasks for each other.

Haughton & Dickinson (1989) (cited in Miller & Ng 1996:135) set out to test nine hypotheses about peer assessment in their study of a collaborative post-writing assessment. Five hypotheses (items 1 to 5, below) dealt with the practicality of peer assessment, and four (6 to 9) with the benefits of the scheme:

  1. students are sincere and do not use the scheme as a means of obtaining higher grades than they themselves think they deserve;
  2. students are or become able to assess themselves at about the same level as their tutors, i.e. they can interpret the criteria in the same way;
  3. students are or become able to negotiate with tutors on the appropriate level of criteria;
  4. students are or become able to negotiate grades in a meaningful and mutually satisfying manner;
  5. the scheme does not result in a lowering of standards on the course;
  6. students perceive collaborative assessment as fairer than other (traditional) forms of assessment;
  7. students benefit in enhanced understanding of and attitude towards assessment;
  8. students become more self-directed as a result;
  9. the scheme demands more thoroughly worked out criteria of assessment and hence results in fairer assessment.

This study showed "a relatively high level of agreement between the peer assessments and the marks given by the lecturers" (Miller & Ng 1996:139), and similar reliability of results was reported by Bachman & Palmer (1982) with the self-rating of communicative language ability of ESL learners in the USA (aged 17-67). Fok (1981), looking at a group of university students in Hong Kong, also found a high degree of similarity between the students' self-assessment and past academic records for Reading and Speaking. Thus Haughton & Dickinson (1989) claim that to a large extent the scheme worked and that the students were able to assess their own work realistically, even though most students felt inexperienced as testers (lack of reliability) and were not comfortable with being tested by classmates (fear of losing face) (Miller & Ng 1996:141). Despite this, Miller & Ng considered that: i) the students were sincere; ii) they demonstrated a similar level of assessment to that of the lecturers; iii) the scheme did not result in a lowering of standards; and iv) the students benefited in their understanding of and attitude towards assessment by taking part in the study, stating that "language students are able to make a realistic assessment of each others' oral language ability" (Miller & Ng 1996:142). However, they also concluded that peer assessment requires "certain circumstances" in order to be effective:

  1. the students must be high proficiency language learners;
  2. the group should be homogeneous;
  3. the group should have had previous exposure to each other's oral language ability;
  4. the tests must be conducted in an unthreatening environment;
  5. the students should be given some assistance in preparing their tests (Miller & Ng 1996:142).

Given these conditions, peer assessment can be seen as an effective means of involving learners in formative self-assessment (Miller & Ng 1996:134), with the presence of an audience in general having a positive influence on performance (Lynch 1988). Lynch also makes the important observation that:

... tutors can differ widely in their response to assessment of the same oral presentation, ... we need to experiment with peer-based evaluation ... to complement conventional tutor- and self-based assessment. (Lynch 1988:124)

3.2.3.4. Problems
A number of problems in language pedagogy and evaluation have arisen with the concept of self-assessment as an evaluation tool. Some of these are mentioned below:

  1. doubts about the reliability and feasibility of learners assessing their own self-directed learning and carrying out individual needs analysis (Dickinson 1987:150; Blue 1988);
  2. doubts about the sincerity of the learners (Dickinson 1987:150-1);
  3. doubts about the reliability and feasibility of self-assessment in formal education (Blue 1988:100; Janssen-van Dieten 1989:31; Pierce et al. 1993:38);
  4. reluctance of teachers to lose control of assessment (Blue 1988:100);
  5. conflict between the need for students to be in control of aspects of evaluation and the demands of external imperatives (Dickinson 1978);
  6. mismatch between goals of learning as conceived by the learner and the educationalist (Blanche 1988; Oscarson 1997);
  7. cultural factors, which need further investigation: many (adult) students did not share the same value systems as their instructors (Blanche 1988);
  8. learners need training and practice in assessing their own performances, and pass through a number of stages of support (Oscarson 1997):
     a. dependent stage: full dependence on external assessment;
     b. co-operative stage: collaborative self- and external assessment;
     c. independent stage: full reliance on independent self-assessment;
  9. question of whether self-assessment is both formative and summative, or whether it should only be seen as a process-oriented, integrative, and ongoing (i.e. formative) activity (Oscarson 1997).

A number of responses have been made to these problems:

 

1.      Reliability: There is evidence that learners can make satisfactorily accurate self-assessments (Oscarson 1978; 1984; Heidt 1979; Blanche 1988:85; LeBlanc et al. 1985; Blue 1988:100) and that there is a fairly consistent overall agreement between self-assessment and external criteria (Oscarson 1978; Dickinson, 1987:150).

2.      Sincerity: One reason put forward by teachers for not sharing responsibility for assessment (problem 4) is that students will 'cheat' and produce unrealistic scores. Dickinson (1987), however, points out that 'cheating' (a process in which a learner seeks to obtain personal advantage by unfair means [Dickinson 1987:150]) is not about learning but about demonstrating the results of learning to someone else, usually in learning situations which value scores and rank orders over actual success in learning. "Where the learner is concerned with real learning objectives, and where self-assessment is mainly used, cheating offers no advantages" (1987:151).

3.      SA in formal education: Work on peer assessment has shown that SA has an important place in formal education. Blanche also mentions that self-evaluation focuses attention on communicative competence levels in the classroom (Blanche 1988:85).

4.      Teacher-control of assessment: This raises the issue of teacher training as part of the preparation for student autonomy: "Relevant training of teachers may actually constitute a prerequisite for the effective realization of student-centred evaluation techniques" (Oscarson 1989:11).

5.      Conflict of internal/external evaluation needs: see item 9, below.

6.      Mismatch of goals: Learner training for self-assessment can help learners successfully identify their needs, which should not only enhance learning, but should also free the teacher to concentrate on developing learning materials and giving help in other parts of the learning process (Blue 1988:101).

7.      Cultural factors: Blue derives four nationality groupings for students, concluding that self-assessment is more difficult with multi-cultural groups (Blue 1988:109).

8.      Learner training: The adoption of autonomy as a goal of language learning has necessitated attention being given to learner training (cf. section 3.2.2).

9.      Formative/summative nature: Dickinson sees self-assessment used for formative self-monitoring purposes as "both possible and desirable" (1987:151), and also considers it feasible for other purposes, including testing for placement and diagnostic testing. Oscarson also sees self-assessment as enabling the learner "to assess his total achievement at the end of a course or course unit" (1978:3).

 

3.2.3.5. Conclusions
The lack of research into self-assessment to date has meant that most justifications (as for autonomy in language learning) have been a mixture of the educational, humanistic, philosophical, sociological and psychological. Thus Dickinson (1987) invokes learning theory, claiming that "the ability to evaluate the effectiveness of one's own performance in a foreign language is an important skill in learning, and particularly important when the learning becomes autonomous" (1987:136) (cf. Trim in Oscarson 1978:ix; Council of Europe document 1974:7). Harris (1997:19) stresses the psychological benefits of self-assessment: "Above all, they [learners] can be helped to perceive their own progress and encouraged to see the value of what they are learning. ... The best motive to learn is a perception of the value of the thing learned." Van Lier voices the humanist perspective: "In addition to 'normal' testing, we need to pay attention to the basic moral purpose of education: promoting the self-actualization of every learner, to the fullest" (van Lier 1996:120), and Harris draws attention to the importance of affect: "If we attend to the affective and cognitive components of students' attitudes ... we may be able to increase the length of time students commit to language study and their chances of success in it" (1997:20). Harris sees self-assessment as a way of attending to such attitudes, since it encourages the student to become part of the whole process of language learning, and to be aware of his/her progress. Dickinson associates self-assessment with the process paradigm in language teaching (1987:151; cf. Breen 1987a), and a number of authors stress the learner-centred nature of self-assessment (Harris 1997; van Lier 1996:119; Oscarson 1978:1).
Of particular significance for the present study, Harris (1997:19) sees self-assessment as a practical tool that should be integrated into everyday classroom activities, and Blanche proposes that self-appraisal "would be particularly helpful in the case of false beginners" (1988:85).

Oscarson (1997) offers a "tentative" summary of the current state of self-assessment research relating to language education, warning that it would be premature to draw far-reaching conclusions about findings at this stage:

 

1.      There is no consensus on self-assessment as an evaluation tool, but a clear majority of studies surveyed report generally favourable results.

2.      The question of accuracy depends to a great degree on context and on the intended purpose of the assessment.

3.      Decoding skills (reading, listening) tend to be assessed higher than encoding skills (speaking, writing).

4.      SA is more accurate when based on task content closely tied to students' situations as potential users of the language in question.

5.      It is easier for learners to assess their ability in relation to concrete descriptions of more narrowly defined linguistic situations than in relation to descriptions of broad behavioural objectives and 'macro-skills'.

6.      Self-assessment seems more accurate when the assessment tools are written in the L1 rather than the target language.

7.      There do not seem to be any clear-cut gender effects in SA data (Shrauger & Osberg 1981; Smith & Baldauf 1982; Coombe 1992; Strong-Krause 1997) (Oscarson 1997:182-3).

Oscarson (1997) predicts that future research will probably centre on two fields: i) a theoretical or conceptual area (psychological and other factors); and ii) a procedural or didactic area. He suggests that the following tasks and questions seem particularly relevant:

 

1.      investigation of the relationship between cultural and educational conditions and feasibility;

2.      investigation of the relationship between psychological or developmental variables and prospects for meaningful and accurate measurement, together with a closer look at the effects of self-directed assessment on motivation and goal-achievement;

3.      empirical study of the role of supportive environments;

4.      further studies on the relationship between perceived and more objectively determined levels of performance, as well as on the incidence and possible causes of over- and under-estimation of ability;

5.      attempts to determine the proper balance between self-managed and 'other-managed' assessment under various circumstances, including consideration of the role of teacher-learner collaborative assessment;

6.      experiments trying to determine the extent to which the ability to make sound assessments improves with practice;

7.      practically oriented development work on how SA can be incorporated into courses and individual study in order to provide continuous formative feedback (Oscarson 1997:184-5).
