Evaluation of Student Performance

John Jordi; Genavie Cueman; Jennifer K. Smith


Determining Evaluative Criteria

Adapted with permission from Farris, 1985

Students are very sensitive to grades and the criteria on which they are based: “Will this be on the test? How much does the quiz count toward the final grade? Do you consider attendance and participation?” Grading is a thankless job but somebody has to do it, and you may as well be prepared to answer such questions on the first day of class; that means, of course, that you must have answered them for yourself well in advance.

Before constructing an exam or assignment, you need to decide exactly what you expect your students to demonstrate that they have learned. Reviewing the instructional objectives you established at the beginning of the term may be a good way to begin. The first step is to think carefully about the goals that you have set for the students. Should students have mastered basic terminology and working principles? Should they have developed a broad understanding of the subject? Should they be able to use the principles and concepts taught in the course to solve problems in the field? The next question is how you can best evaluate the extent to which students have achieved these goals. Perhaps a certain type of test will suggest itself immediately (multiple choice, matching, fill in the blanks, short answer, problem solving, or essay). If you know what you want to assess and why, then writing the actual questions will be much less frustrating.

The UF Handbook for Testing and Grading is an excellent resource. It focuses on how to construct and assess machine-scored multiple-choice items, short-answer items, and essay questions. Also included are chapters on procedures for successfully administering tests, analyzing test questions, and assigning grades.

Test Construction

Objective Tests

Although by definition no test can be truly “objective” (existing as an object of fact, independent of the mind), an objective test in this handbook refers to a test made up of multiple-choice, matching, fill-in, true/false, or short-answer items. Objective tests have two advantages: they allow an instructor to assess a large and potentially representative sample of course material, and they can be scored reliably and efficiently. The disadvantages of objective tests include a tendency to emphasize only “recognition” skills, the ease with which correct answers can be guessed on many item types, and the inability to measure students’ organization and synthesis of material. (Adapted with permission: Yonge, 1977)

Since the practical arguments for giving objective exams are compelling, we offer a few suggestions for writing multiple-choice items. The first is to avoid writing them if you can. If you cannot, there are numerous ways of generating objective test items. Many textbooks are accompanied by teachers’ manuals containing collections of items, and professors or other TAs who have taught the same course may be willing to share items with you. In either case, however, the general rule is to adapt rather than adopt. Existing items will rarely fit your specific needs, so you should tailor them to reflect your objectives more adequately.

Second, design multiple choice items so that students who know the subject or material adequately are more likely to choose the correct alternative and students with less adequate knowledge are more likely to choose a wrong alternative. That sounds simple enough, but you want to avoid writing items which lead students to choose the right answer for the wrong reasons. For instance, avoid making the correct alternative the longest or most qualified one, or the only one that is grammatically appropriate to the stem. Even a careless shift in tense or subject-verb agreement can often suggest the correct answer.

Finally, it is very easy to disregard the above advice and slip into writing items which require only rote recall, but are nonetheless difficult because they are taken from obscure passages (footnotes, for instance). Some items requiring only recall might be appropriate, but try to design most of the items to tap the students’ understanding of the subject. (Adapted with permission: Farris, 1985)

Here are a few additional guidelines to keep in mind when writing multiple-choice tests: (Adapted with permission: Yonge, 1977)

The item-stem (the lead-in to the choices) should clearly formulate a problem.
As much of the question as possible should be included in the stem.
Randomize occurrence of the correct response (i.e., you don’t always want “C” to be the right answer).
Make sure there is only one clearly correct answer (unless you are instructing students to select more than one).
Make the wording in the response choices consistent with the item stem.
Don’t load the stem down with irrelevant material.
Beware of using answers such as “none of these” or “all of the above.”
Use negatives or double negatives sparingly in the question or stem.
Beware of using sets of opposite answers unless more than one pair is presented (e.g., go to work, not go to work).
Beware of providing irrelevant grammatical cues.
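
The guideline above about randomizing the position of the correct response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item format (stem, list of options, index of the correct option) and the function names shuffle_options and build_answer_key are hypothetical, not part of any testing service described in this handbook.

```python
import random
import string


def shuffle_options(options, correct_index, rng):
    """Shuffle one item's options and return (shuffled_options, new_correct_index)."""
    order = list(range(len(options)))
    rng.shuffle(order)  # order[j] is the original index now shown at position j
    shuffled = [options[i] for i in order]
    return shuffled, order.index(correct_index)


def build_answer_key(items, seed=0):
    """Shuffle every item and return (shuffled_items, answer_key_letters).

    `items` is a list of (stem, options, correct_index) tuples; this layout
    is an assumption made for the example.
    """
    rng = random.Random(seed)  # a fixed seed makes the key reproducible
    shuffled_items, key = [], []
    for stem, options, correct_index in items:
        shuffled, new_index = shuffle_options(options, correct_index, rng)
        shuffled_items.append((stem, shuffled))
        key.append(string.ascii_uppercase[new_index])
    return shuffled_items, key


if __name__ == "__main__":
    sample_items = [
        ("The item-stem is", ["a distractor", "the lead-in to the choices", "the key", "a rubric"], 1),
        ("A holistic rater", ["scores each point", "counts errors", "makes an overall judgment", "ranks by length"], 2),
    ]
    shuffled, key = build_answer_key(sample_items, seed=42)
    for (stem, options), letter in zip(shuffled, key):
        print(stem, options, "correct:", letter)
```

Keeping the answer key alongside the shuffled items matters more than the shuffle itself: if the key is rebuilt by hand after reordering, errors in the key are easy to introduce.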

Essay Tests

Conventional wisdom accurately portrays short-answer and essay examinations as the easiest to write and the most difficult to grade, particularly if they are graded well. However, essay items are also considered the most effective means of assessing students’ mastery of a subject. If it is crucial that students understand a particular concept, you can force them to respond to a single question, but you might consider asking them to write on one or two of several options. TAs generally expect a great deal from students, but remember that mastery of a subject depends as much on prior preparation and experience as it does on diligence and intelligence; even at the end of the semester some students will be struggling to understand the material. Design your questions so that all students can answer at their own levels. (Adapted with permission: Farris, 1985)

The following are some suggestions which may enhance the quality of the essay tests that you produce: (Adapted with permission: Ronkowski, 1986)

Have in mind the processes that you want measured (e.g., analysis, synthesis).
Start questions with words such as “compare,” “contrast,” “explain why.” Don’t use “what,” “who,” “when,” or “list.” (These latter types are better measured with objective-type items.)
Write items so as to define the parameters of expected answers as clearly as possible.
Don’t require more answers than students can write in the time available.

Responding to Student Writing

Writing is a tool for communication, and it is reasonable for you to expect coherent, lucid prose from your students. However, writing is also a mode of learning and a way for students to discover what they think about a subject, and you should be willing to participate in this learning and discovery process as well as grade the product. (Adapted with permission: Farris, 1985)

The quality of student writing is often far below acceptable standards. Many instructors try to ignore the problem by insisting that writing skills are not part of their assigned subject area. This attitude results in further problems for both instructors and their students. If you demand good writing, make your expectations known and offer help to those who need it (or refer students to tutorial services; see Part Three for information on the Reading and Writing Center and other available services). Students will try to meet your demands — make your standards worth meeting.

More and more, instructors are involving themselves in students’ writing (and learning) processes rather than simply “correcting” the final product. They have students submit first drafts, which receive constructive criticism on content, organization, and presentation. One-on-one conferences after the student has read the critique and perhaps begun a second draft are invaluable. The second draft is graded and usually demonstrates improvement on all fronts, especially in the depth of analysis and support for an argument so often found lacking in one-draft student papers.

Also popular with both students and instructors are peer feedback groups in which students read their first drafts to each other for critique. These groups work best when a protocol is observed; generally the instructor creates a guide sheet.

Time permitting, each draft is read twice. The first time through group members listen only; on the second reading they write comments on their photocopy and/or fill out a form designed to address problems specific to the assignment. Then, one at a time, the group members offer their comments and suggestions to the writer. One advantage to the peer feedback method is that you, the instructor, are not the only audience for the students’ writing. They hear suggestions for improving their drafts from others prior to your reading of the papers. (Adapted with permission: Farris, 1985)

Grading

Reading 50 papers or 200 essay exams presents special problems, especially when all 50 or 200 are responses to the same topic or question. How do you maintain consistency? You are more likely to be thorough with the first few papers you read than with the rest and less likely to be careful with the comments when you are tired. To avoid such problems, read five or six papers before you start grading to get an idea of the range of quality (some instructors rank-order the papers in groups before they assign grades), and stop grading when you get tired, irritable, or bored. When you start again, read over the last couple of papers you graded to make sure you were fair. Some instructors select “range finder” papers — middle range A, B, C and D papers to which they refer for comparison.

Depending upon the number of students you have, you may have to spend anywhere from five to twenty minutes on a three-to-four page paper. Try to select only the most insightful passages for praise and only the most shallow responses or repeated errors for comment; in other words, don’t turn a neatly typed paper into a case of the measles. Avoid the temptation to edit the paper for the student. Remember, also, that if you comment on and correct everything, a student loses a sense of where priorities lie. Do not give the impression that semicolons are as important to good writing and to a grade as, say, adequate support for an argument. (Adapted with permission: Farris, 1985)

In assigning grades to essay questions you may want to use one of the following methods: (Adapted with permission: Cashin, 1987)

Analytic (point-score) Method – In this method, the ideal or model answer is broken down into several specific points regarding content. A specific subtotal point value is assigned to each. When reading the exam, you need to decide how much of each maximum subtotal you judge the student’s answer to have earned. When using this method, be sure to outline the model (ideal or acceptable) answer BEFORE you begin to read the essays.
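
As a rough sketch of the analytic method, the model answer can be written down as a set of content points with maximum subtotals, and each essay scored against it. The rubric entries, point values, and the function name analytic_score below are hypothetical and only illustrate the bookkeeping; deciding how much of each subtotal an answer has earned remains the reader's judgment.

```python
def analytic_score(max_points, awarded):
    """Total an essay's score under the analytic (point-score) method.

    max_points: dict mapping each content point of the model answer to its
                maximum subtotal (outlined BEFORE reading the essays).
    awarded:    dict mapping content points to the points the grader judged
                the answer to have earned; missing points count as 0.
    Returns (points_earned, points_possible).
    """
    unknown = set(awarded) - set(max_points)
    if unknown:
        raise ValueError(f"not in the model answer: {sorted(unknown)}")

    earned_total = 0.0
    for point, maximum in max_points.items():
        earned = awarded.get(point, 0)
        if not 0 <= earned <= maximum:
            raise ValueError(f"{point!r}: {earned} is outside 0..{maximum}")
        earned_total += earned
    return earned_total, sum(max_points.values())


if __name__ == "__main__":
    # Hypothetical model answer for a 10-point essay question.
    model_answer = {
        "states a clear thesis": 3,
        "supports thesis with two cases": 4,
        "addresses a counterargument": 3,
    }
    one_essay = {
        "states a clear thesis": 3,
        "supports thesis with two cases": 2.5,
    }
    earned, possible = analytic_score(model_answer, one_essay)
    print(f"{earned} / {possible}")  # 5.5 / 10
```

Rejecting subtotals above the maximum is a small safeguard for consistency when many essays are scored against the same outline.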

Global (holistic) Method – In this method, the rater reads the entire essay and makes an overall judgment about how successfully the student has covered everything that was expected in the answer and assigns the paper to a category (grade). Generally, five to nine categories are sufficient. Ideally, all of the essays should be read quickly and sorted into five to nine piles, then each pile reread to check that every essay has been accurately (fairly) assigned to that pile which will be given a specific score or letter grade.

The Learning Analytics and Assessment team (within the UFIT Center for Instructional Technology and Training) assists instructors with student assessment technology, including the scanning and processing of optical mark reader (OMR) Scantron answer sheets. Scanning and scoring of answer bubble sheets is included in the cost of purchasing the answer documents through our department. Additional scoring services include item analysis, arrange analysis, and providing additional lists or data in a file format compatible with the Canvas e-Learning system. Consultations are highly encouraged as we can often provide customized service and assessment technology support upon request.

At the time of the exam it is helpful to write on the chalkboard/marker board all pertinent information required on the answer sheet (course name, course number, section number, your name, etc.). Also, remind students to fill in their University identification numbers completely to ensure that their answers will be properly graded by the computer.

Troubleshooting

Adapted with permission from Northwestern University

What if a student’s work is illegible? Consider photocopying the work, giving it back to the student to be typed, and then grading the typescript. Be sure the photocopy corresponds to the typed copy.

What if a problem emerges in the wording of a question? Give the student the benefit of the doubt. It is possible to have a poorly phrased question. Make a note in your grade book of what has happened and what action is taken so that when the final grade is calculated the error can be taken into account.

What numerical grade should be assigned to a failing grade or E? How low should a failing grade be? Should a grade of 0 be given only when the student literally handed in nothing? If a student had some correct parts, should the grade be a 50%? Since there is a large distinction between 0 and 50 when calculating a final grade, the question is vital. All instructors have their own preference. As with any grade, be sure that the grade you assign for a failing assignment or test is fair and justified.
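
To see why the choice between 0 and 50 matters, consider a small worked example. The scores and the equal weighting of four assignments are assumptions made only for illustration; your own weighting scheme will change the numbers but not the point.

```python
def final_average(scores):
    """Unweighted mean of assignment scores (equal weights are an assumption)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Three passing scores plus one failing assignment, recorded two ways.
    with_zero = [85, 90, 80, 0]    # failing work recorded as 0
    with_fifty = [85, 90, 80, 50]  # failing work recorded as 50

    print(final_average(with_zero))   # 63.75
    print(final_average(with_fifty))  # 76.25
```

Under these assumptions a single failing assignment moves the final average by 12.5 points depending on how it is recorded, which can easily span a full letter grade.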

Ten Tips to Help You Get Through Your Grading

Use a scoring rubric.
Meet with other graders to determine grading criteria.
Use “range finder” papers.
Read the paper before you begin marking.
Use pencil for comments. Crossing out your own mistakes or changing your response halfway through can be messy.
Choose the appropriate level of feedback for the assignment.
Use short comments in the margins, and elaborate comments at the end.
Use marking symbols, but make sure students have a key for these symbols.
Keep your allotted time per paper.
Take breaks! You will be more efficient if you give your mind a rest at regular intervals.

Records and Distribution of Grades

The Family Educational Rights and Privacy Act of 1974, better known as the “Buckley Amendment” (20 U.S. Code 1232g), prohibits the dissemination of a student’s educational records without the written consent of the student, if 18 years of age or older, or of the parents. Educational records consist of records, files, documents, and other materials containing information directly related to a student.

As a result, public posting of student grades using complete social security numbers or university student identification numbers (the UFID number), or any portion thereof, violates the Family Educational Rights and Privacy Act. A student’s social security number is part of the educational record and is a personal identifier of the student. Grades must not be posted by social security number or by the UFID number.

Grades may be shared confidentially through e-Learning in Canvas. Instructors can post grades using the “Grades” tab of the Canvas course, where students can see only their own grades. Instructors should make certain that grades are calculated correctly in Canvas, as any miscalculation is likely to confuse students.

Handing back papers or essays to a large class can be a time-consuming task. Some instructors deal with this by leaving time at the end of class to hand back assignments or tests, or they may ask students to come to their office to pick up papers. The latter alternative may provide an opportunity for students to get more personal feedback from you about their papers.

The University Grading System

The University’s grading system is explained in full in the undergraduate catalog. The grading system is available online at http://www.registrar.ufl.edu/staff/grades.html.

Essentially, the system consists of a range of letter grades and corresponding grade points:

A = 4.0

A- = 3.67

B+ = 3.33

B = 3.0

B- = 2.67

C+ = 2.33

C = 2.0

C- = 1.67

D+ = 1.33

D = 1.0

D- = 0.67

In addition, there are non-punitive grades and symbols, such as “W” for withdraw, “H” for a deferred grade assigned only in approved sequential courses, “N” for no grade reported, and “I” for incomplete. There are also failing grades with no grade points: “E” for failure, “U” for unsatisfactory, and “WF” for withdraw failing. “I” grades are reserved for students who have satisfactory reasons for not completing the course work, such as a major illness. If an “N” or an “I” is not changed by the end of the next term, it will be computed as a failing grade in the student’s GPA. If you feel a student should receive an “I” rather than a final grade at the end of the term, check with your teaching advisor or departmental chair.
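
The grade points listed above can be combined into a grade point average along the following lines. This is a minimal sketch, not the Registrar's procedure: the credit-hour weighting, the treatment of “E” as zero points, the omission of “U” and “WF”, and the function name grade_point_average are all assumptions made for illustration. Consult the undergraduate catalog for the authoritative rules.

```python
# Grade points as listed in this section; "E" as 0.0 is an assumption
# based on its description as a failing grade with no grade points.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.67, "B+": 3.33, "B": 3.0, "B-": 2.67,
    "C+": 2.33, "C": 2.0, "C-": 1.67, "D+": 1.33, "D": 1.0,
    "D-": 0.67, "E": 0.0,
}

# Non-punitive grades and symbols named in this section; skipped here.
NON_PUNITIVE = {"W", "H", "N", "I"}


def grade_point_average(records):
    """Credit-weighted average over (letter_grade, credit_hours) pairs.

    Returns None when no record carries grade points.
    """
    total_points = 0.0
    total_credits = 0.0
    for letter, credits in records:
        if letter in NON_PUNITIVE:
            continue  # assumed not to enter the average
        if letter not in GRADE_POINTS:
            raise ValueError(f"grade {letter!r} is not handled in this sketch")
        total_points += GRADE_POINTS[letter] * credits
        total_credits += credits
    if total_credits == 0:
        return None
    return total_points / total_credits


if __name__ == "__main__":
    term = [("A", 3), ("B+", 4), ("C", 3), ("W", 3)]
    print(round(grade_point_average(term), 2))  # 3.13
```

The sketch does not model the rule that an unchanged “N” or “I” is later computed as a failing grade; handling that would require tracking terms, which is beyond this illustration.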

Submitting Grades

As an instructor/grader, you may send your grades directly or export grades from your Canvas gradebook to upload them into myUFL for final approval and posting. The guide on Finalizing your Canvas Gradebook will walk you through the process of preparing your Canvas gradebook to export so that what is displayed in Canvas will match what is sent/uploaded into ONE.UF.

Useful Resources on Evaluation of Student Performance

Carlson, S. B. (1992). Creative classroom testing. Princeton: Educational Testing Service.

Eble, K. E. (1988). The craft of teaching: A guide to mastering the professor’s art (2nd ed.). San Francisco: Jossey-Bass, Publishers.

Legg, S. M. (1991). Handbook on testing and grading. Office of Instructional Resources, University of Florida.

McKeachie, W. J. (1986). Teaching tips: A guidebook for the beginning college teacher (8th ed.). Lexington, MA: D.C. Heath and Company.

Office of Instructional Resources (UF). (1991). PC gradebook version 1.2.

Popham, W. J. (2013). Classroom assessment: What teachers need to know (7th ed.). Allyn and Bacon.

License

Creative Commons Attribution-NonCommercial 4.0 International License