This section on assessment is part of the Math Methodology series on instruction, assessment, and curriculum design. The short essay that follows, The Role of Assessment, is part 2 of the essay, Teaching and Math Methodology, which includes:
Part 1: Math Methodology: Instruction
Part 2: The Role of Assessment
Part 3: Curriculum: Content and Mapping
Anne Davies (2004) stated key tenets on the role of assessment, which illustrate the partnership that should exist between teaching and learning. "Keeping students informed about the learning objectives or standards they are working toward helps support their success. Quality and success also become clearer for students when we engage them in setting criteria" (p. 2). Educators who need help with writing learning objectives or behavioral objectives might consult Behavioral Objectives and How to Write Them from Florida State University.
Assessment is more than collecting data on test performance. Davies (2004) indicated that assessment is a process of triangulation: gathering evidence over time, from multiple sources, that agreed-upon criteria have been met. Those sources are artifacts that students produce, observation notes on the process of students' learning, and documentation from talking with students about their learning. Assessment includes guiding students to self-assess their learning, involving parents and students in discussions of progress, and students showing evidence of their learning to audiences they care about. It is a complex process because of differences in learning styles, multiple intelligences, and the diverse backgrounds that students bring to classrooms (Davies, 2004).
With the current focus on assessment using standardized testing, educators might have overlooked the value of performance assessments, which also provide evidence of what students can do. The kind of evidence Davies (2004) described, student artifacts gathered over time, is illustrated in Edutopia's video: Assessment Overview: Beyond Standardized Testing. The video offers ideas that teachers can adapt for their own classrooms.
The following sections address systems for assessment (diagnostic, formative, summative), more on self-assessment, teacher-made tests, and vendor-made tests.
There are two kinds of tests that may or may not help a teacher to do a better instructional job: teacher-made classroom tests and externally imposed tests, which are "those tests required by state or district authorities and designed by professional test developers to measure student mastery of the sets of objectives experts have deemed essential" (Popham, 2003, Preface section). The accountability movement has placed a great deal of stress upon teachers to prepare students for state standardized tests and even greater stress upon students to perform well on those tests, which were mandated by the No Child Left Behind legislation. Performance assessments will also be included in assessments related to the implementation of the Common Core State Standards. To this end, W. James Popham (2007) suggested that schools also need interim tests that they "can administer every few months to predict students' performances on upcoming accountability tests" (p. 80).
ASCD (Carter, 2004) supports using multiple measures in assessment systems, rather than reliance on the outcome of a single test, to accurately measure achievement and to hold stakeholders accountable. Such assessment systems are
Fair, balanced, and grounded in the art and science of learning and teaching;
Reflective of curricular and developmental goals and representative of content that students have had an opportunity to learn;
Used to inform and improve instruction;
Designed to accommodate nonnative speakers and special needs students; and
Valid, reliable, and supported by professional, scientific, and ethical standards designed to fairly assess the unique and diverse abilities and knowledge base of all students (para. 9).
What does balanced mean?
Classroom assessments, interim assessments, and large-scale assessments are often associated with a balanced assessment program; however, the value of interim assessments is questionable. Interim assessments are also standardized tests that, according to Popham (2011), are "usually purchased from commercial vendors, but are sometimes created locally" (p. 14). They can be used in the following ways:
A caveat to Popham's 2007 statement is that interim assessments are "supported neither by research evidence nor by societal demand" (Popham, 2011, p. 15). The problem with their use is a potential unwanted or undesirable level of regimentation to instruction. For results of interim tests to be useful, "the timing of a teacher's instruction must mesh with what's covered in a given interim test" (p. 15).
Paul Black and Dylan Wiliam (1998) quoted the following from the TIMSS video study: "A focus on standards and accountability that ignores the processes of teaching and learning in classrooms will not provide the direction that teachers need in their quest to improve" (para. 2). Those processes involve teachers making assessment decisions, which Popham (2003) indicated can be made based on the structure of the tests themselves or on students' performance on those tests. Teachers can make decisions about the nature and purpose of the curriculum, about students' prior knowledge, about how long to teach something, and about the effectiveness of instruction (Chapter 1, section: What Sorts of Teaching Decisions Can Tests Help?).
Sometimes state and district content standards are not worded clearly enough for use at the classroom level and lead to possible multiple interpretations. Hence, Popham (2003, Chapter 1) pointed out the need for teachers to examine test sample items to clarify the intent of a particular curricular goal. They can then focus instruction appropriately on that intent.
How teachers use assessment plays a major role in achieving standards. Assessments can be diagnostic, formative, and summative. As you read about those categories in what follows, consider the seven assessment and grading practices for effective learning suggested by Jay McTighe and Ken O'Connor (2006):
Use summative assessments to frame meaningful performance goals. ... To avoid the danger of viewing the standards and benchmarks as inert content to “cover,” educators should frame the standards and benchmarks in terms of desired performances and ensure that the performances are as authentic as possible. Present those tasks at the beginning of a new unit.
Show criteria and models in advance. Rubrics and multiple models showing both strong and weak work help learners judge their own performances.
Assess before teaching.
Offer appropriate choices. While keeping goals in mind, options judiciously offered enable students different opportunities for best demonstrating their learning.
Provide feedback early and often. Learners will benefit from opportunities to act on the feedback—to refine, revise, practice, and retry.
Encourage self-assessment and goal setting.
Allow new evidence of achievement to replace old evidence. (pp. 13-19)
Also consider the role of motivation in assessments you provide. Richard Curwin (2014) defined educational motivation as "the desire to learn" and believes that "the evaluation process is one of the most formidable killers of motivation in education. Rewards, punishment, incentives, threats, or any external strategy might get students to do their work, but they rarely influence whether children want to learn. These externals create finishers, not learners" (p. 38). He believes effort should be part of the grading process and that there are legitimate ways of evaluating effort without lowering standards. For example, count improvement, count seeking help, count offers to help other students, and count extra work. He offered the following evaluation strategies to help encourage students to try harder and increase their internal motivation:
Diagnostic assessments, typically given at the beginning of an instructional unit or school year, will determine students' prior knowledge, strengths, weaknesses, and skill level. They help educators to adjust curriculum or provide for remediation. According to Tomlinson and McTighe (2006) in Integrating Differentiated Instruction & Understanding by Design, they can also help "identify misconceptions, interests, or learning style preferences," and help with planning for differentiated instruction. Assessments might take the forms of "skill-checks, knowledge surveys, nongraded pre-tests, interest or learning preference checks, and checks for misconceptions" (p. 71). Thus, pretests help "to isolate the things your new students already know as well as the things you will need to teach them" (Popham, 2003, Chapter 1, section: Using Tests to Determine Students' Entry Status). Further, "A pretest/post-test evaluative approach ... can contribute meaningfully to how teachers determine their own instructional impact" (Chapter 1, section: Using Tests to Determine the Effectiveness of Instruction).
The point of a diagnostic is not just to assess, but to do something with test results leading to improved learning. Thus, progress monitoring with individual students or an entire class makes sense. According to the National Center on Student Progress Monitoring (NCSPM), progress monitoring is "a scientifically based practice." The term is relatively new, and educators might be more familiar with Curriculum-Based Measurement and Curriculum-Based Assessment. An implementation involves determining a student’s current levels of performance and setting goals for learning that will take place over time. "The student’s academic performance is measured on a regular basis (weekly or monthly). Progress toward meeting the student’s goals is measured by comparing expected and actual rates of learning. Based on these measurements, teaching is adjusted as needed. Thus, the student’s progression of achievement is monitored and instructional techniques are adjusted to meet the individual students learning needs" (NCSPM, Common Questions section).
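The comparison of expected and actual rates of learning described above can be sketched in a few lines of Python. This is a minimal illustration, not an NCSPM procedure: the function names, the sample scores, the goal, and the use of a least-squares slope as the "actual rate" are all assumptions made for the example.

```python
def expected_rate(baseline, goal, weeks):
    """Points per week the student must gain to reach the goal on time."""
    return (goal - baseline) / weeks


def actual_rate(scores):
    """Least-squares slope of scores from equally spaced weekly probes.

    Assumption: one score per week, listed in order, starting at week 0.
    """
    n = len(scores)
    mean_week = (n - 1) / 2
    mean_score = sum(scores) / n
    numerator = sum((week - mean_week) * (score - mean_score)
                    for week, score in enumerate(scores))
    denominator = sum((week - mean_week) ** 2 for week in range(n))
    return numerator / denominator


# Hypothetical data: five weekly probe scores for one student.
weekly_scores = [10, 12, 13, 15, 16]

# Hypothetical goal: move from 10 to 30 correct items over 10 weeks.
expected = expected_rate(baseline=10, goal=30, weeks=10)
actual = actual_rate(weekly_scores)

# If the student is gaining more slowly than planned, flag for adjustment.
needs_adjustment = actual < expected

print(f"expected: {expected:.2f} per week, actual: {actual:.2f} per week")
print("adjust instruction" if needs_adjustment else "on track")
```

In this made-up case the student gains about 1.5 points per week against an expected 2.0, so the sketch flags the instruction for adjustment. A teacher or a commercial tool may use a different rule for deciding when to change instruction.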
Although NCSPM does not endorse specific products, it has identified tools (based on its annual reviews) that demonstrate sufficient evidence for its progress monitoring standards. Among those for math are Renaissance Learning's Accelerated Math and STAR Math, CTB/McGraw-Hill's Yearly Progress Pro, and Monitoring Basic Skills Progress (MBSP Basic Math) from Pro-Ed. These are representative of progress monitoring products.
Progress monitoring is one component of Response to Intervention (RTI), which is an education model for early identification of students at risk for learning disabilities. Of equal importance is the emphasis on providing appropriate learning experiences for all students by ensuring current levels of skill and ability are aligned with the instructional and curricular choices provided within their classroom (RTI Action Network, n.d., What is RTI? section). The RTI Action Network provides extensive information on RTI and the how to's of progress monitoring.
Formative assessment is assessment for learning of current students. It "is an essential component of classroom work and its development can raise standards of achievement" (Black & Wiliam, 1998, Are We Serious About Standards? section, para. 2). Formative assessments provide immediate evidence of student learning, and can be used to help improve upon quality of instruction and to monitor progress in achieving learning outcomes (Burns, Fager, Gummet, et al., n.d.).
Per Moss and Brookhart (2009) in Advancing Formative Assessment in Every Classroom, when used appropriately, it's also strongly linked to increasing intrinsic student motivation to learn. Motivation to learn has four components: self-efficacy (one's belief in ability to succeed), self-regulation, self-assessment, and self-attribution (one's perceptions and explanations of success or failure that influence the amount of effort that an individual will put into an activity in the future) (Moss & Brookhart, 2009, chapter 1).
"Formative assessment includes both formal and informal methods, such as ungraded quizzes, oral questioning, observations, draft work, think-alouds, student constructed concept maps, dress rehearsals, peer response groups, and portfolio reviews" (Tomlinson & McTighe, 2006, p. 71). Homework and conferences also fall into this category.
Students might write their understanding of vocabulary or concepts before and after instruction, or summarize the main ideas they've taken away from a lecture, discussion, or assigned reading. They can complete a few problems or questions at the end of instruction and check answers. Teachers can interview students individually or in groups about their thinking as they solve problems, or assign brief, in-class writing assignments (Brown, 2002, Examples of Formative Assessment section).
These writing assignments, accompanied by peer group discussions, are essential, as "Knowledge and thinking must go hand in hand" (McConachie et al., 2006, p. 8). Embedding writing in performance tasks enables teachers to "guide students to deeper levels of understanding" (p. 12). McConachie et al. provided the following example, appropriate for a grade 7 math unit on percents:
To celebrate your election to the student council, your grandparents take you shopping. You have a 20-percent-off coupon. The cashier takes 20 percent off the $68.79 bill. Your grandmother remembers that she has an additional coupon for 10 percent off. The cashier takes the 10 percent off what the cash register shows. Does this result in the same amount as 30 percent off the original bill? Explain why or why not? (p. 12)
In determining if students truly understand percents in the above example, teachers are assessing whether students know what a percent is, whether they can use percents in a real-world application, and whether they can interpret their answers appropriately.
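The arithmetic behind the coupon task can be checked with a short Python sketch. The helper name `apply_discount` and the choice to round to the nearest cent after each coupon are assumptions for illustration; the task itself does not say how the register rounds.

```python
from decimal import Decimal, ROUND_HALF_UP


def apply_discount(amount, percent):
    """Return the amount after taking `percent` off, rounded to cents."""
    remaining = amount * (Decimal(100) - Decimal(percent)) / Decimal(100)
    return remaining.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


bill = Decimal("68.79")

# Two coupons in sequence: 20 percent off, then 10 percent off the new total.
after_first = apply_discount(bill, 20)
after_both = apply_discount(after_first, 10)

# A single 30 percent discount on the original bill, for comparison.
single_30 = apply_discount(bill, 30)

print("20% then 10%:", after_both)
print("30% at once: ", single_30)
print("difference:  ", after_both - single_30)
```

The two results differ because the second coupon applies to the already reduced total. Multiplying the remaining fractions, 0.80 × 0.90 = 0.72, shows that the two coupons together take 28 percent off the original bill, not 30 percent.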
The RAFT method is a particularly useful formative assessment writing strategy for checking understanding. According to Douglas Fisher and Nancy Frey (2007) in Checking for Understanding: Formative Assessment Techniques for Your Classroom, RAFT prompts ask students to consider the role of the writer (R), the audience (A) to whom the response is written, the format (F) of the writing, and the topic (T) of the writing. For example, to determine if students understand characteristics of triangles, one such prompt might be:
R: Scalene triangle
A: Your angles
F: Text message
T: Our unequal relationship (p. 69).
The RAFT Model in the following table includes other examples:
|Role||Audience||Format||Topic|
|Zero||Whole Numbers||Campaign speech||Importance of 0|
|Scale factor||Architect||Directions for a blueprint||Scale drawings|
|Repeating decimal||Customers||Petition||Proof/check for set membership|
|Exponent||Jury||Instructions to the jury||Laws of exponents|
|Variable||Equations||Letter||Role of variables|
|Content adapted from RAFT examples:
There are multiple resources of value to learn more about formative assessment. Dylan Wiliam (2011) provided over 50 formative assessment techniques for classroom use in Embedded Formative Assessment. Page Keeley and Cheryl Rose Tobey's (2011) Mathematics Formative Assessment: 75 Practical Strategies for Linking Assessment, Instruction, and Learning is also noteworthy.
Homework is one method for students to take responsibility for their learning. It also falls into the category of formative assessment, as it "typically supports learning in one of four ways: pre-learning, checking for understanding, practice, and processing" (Vatterott, 2009, p. 96). However, educators have varying opinions on homework ranging from how much to assign to what kind (e.g., acquisition or reinforcement of facts, principles, concepts, attitudes, or skills), for whom, when to assign it, and whether or not it should be graded.
Guidelines for an appropriate total amount of homework each day in consideration of all subjects are provided in Helping Your Students with Homework: A Guide for Teachers from the U.S. Department of Education (Paulu, 2003):
Educators believe that homework is most effective for children in first through third grades when it does not exceed 20 minutes each school day. From fourth through sixth grades, many educators recommend from 20 to 40 minutes a school day for most students. For students in 7th- through 9th-grades, generally, up to 2 hours a school day is suitable. Ninety minutes to 2 hours per night are appropriate for grades 10 through 12. Amounts that vary from these guidelines are fine for some students. (p. 16)
Not all of teachers' homework practices are grounded in research. In Rethinking Homework: Best Practices That Support Diverse Needs, Cathy Vatterott (2009) stated:
Viewing homework as formative feedback changes our perspective on the grading of homework. Grading becomes not only unnecessary for feedback, but possibly even detrimental to the student's continued motivation to learn. With this new perspective, incomplete homework is not punished with failing grades but is viewed as a symptom of a learning problem that requires investigation, diagnosis, and support. (p. 124)
Teachers cling to grading homework for a variety of reasons. Vatterott (2011) noted teachers fear that if they do not grade homework, students will not do it; they believe students' hard work should be rewarded, and homework grades help students who test poorly (p. 61). The issue is that homework should still be assessed because students need feedback on their learning; but if homework is assigned for learning purposes, then teachers need to rethink when homework should be counted as part of a grade.
Per Vatterott (2011), homework should be separated into two categories: formative and summative. Homework that is formative should not be factored into the overall course grade. Practice with math problems would fall into the formative category. Homework assignments as summative assessments, such as research papers or portfolios of student work, may be considered. If teachers wish to tie homework to assessments, the easiest way "in students' minds is to allow them to use homework assignments and notes when taking a test. Another method is to correlate the amount of homework completed with test scores" perhaps by placing two numbers on a test paper--the test score itself and "the student's number of missing homework assignments" (p. 63). If teachers allow retakes for failing summative assessments, they might also require learners who "don't complete a set of homework assignments and then fail the related summative assessment" to "go back and complete all the formative tasks before they can retake the assessment" (p. 64).
Black and Wiliam (1998) provided the following suggestions for improving formative assessment (How can we improve formative assessment? section):
In terms of building self-esteem in pupils, “feedback to any pupil should be about the particular qualities of his or her work, with advice on what he or she can do to improve, and should avoid comparisons with other pupils” (para. 3).
Self-assessment by pupils is an essential component in formative assessment, which involves three components: students must recognize the desired goal, have evidence about their present position, and have some understanding of a way to close the gap between the two. “[I]f formative assessment is to be productive, pupils should be trained in self-assessment so that they can understand the main purposes of their learning and thereby grasp what they need to do to achieve” (para. 7).
In terms of effective teaching:
“[O]pportunities for pupils to express their understanding should be designed into any piece of teaching, for this will initiate the interaction through which formative assessment aids learning” (para. 9).
“[T]he dialogue between pupils and a teacher should be thoughtful, reflective, focused to evoke and explore understanding, and conducted so that all pupils have an opportunity to think and to express their ideas” (para. 13).
“[F]eedback on tests, seatwork, and homework should give each pupil guidance on how to improve, and each pupil must be given help and an opportunity to work on the improvement” (para. 15).
Stephen Chappuis and Jan Chappuis (2007/2008) said that a key point on the nature of formative assessment is that "there is no final mark on the paper and no summative grade in the gradebook" (p. 17). The intent of this type of assessment for learning is for students to know where they are going in terms of learning targets they are responsible for mastering, where they are now, and how they can close any gap. "It functions as a global positioning system, offering descriptive information about the work, product, or performance relative to the intended learning goals" (p. 17). Such descriptive feedback identifies specific strengths, then areas where improvement is needed, and suggests specific corrective actions to take. For example, in a study of graphing, an appropriate descriptive feedback statement might be "You have interpreted the bars on this graph correctly, but you need to make sure the marks on the x and y axes are placed at equal intervals" (p. 17). Notice that the statement does not overwhelm the student with more than he/she can act on at one time.
Teachers, however, must help students to understand the role of formative assessment, as in the minds of many learners assessment of any kind "equals test equals grade equals judgment" (Tomlinson, 2014, p. 11). Hence, they easily become discouraged, rather than appreciating that such assessments do not call for perfection. Tomlinson suggested that educators might reinforce this message, telling learners something like the following:
When we're mastering new things, it's important to feel safe making mistakes. Mistakes are how we figure out how to get better at what we are doing. They help us understand our thinking. Therefore, many assessments in this class will not be graded. We'll analyze the assessments so we can make improvements in our work, but they won't go into the grade book. When you've had time to practice, then we'll talk about tests and grades. (p. 11)
Thomas Guskey (2007) pointed out that formative assessments will not necessarily lead to improved student learning or teacher quality without appropriate follow-up corrective activities after the assessments. These activities have three essential characteristics. They present concepts differently, engage students differently in learning, and provide students with successful learning experiences. For example, if a concept was originally taught using a deductive approach, a corrective activity might employ an inductive approach. An initial group activity might be replaced by an individual activity, or vice versa. Corrective activities can be done with the teacher, with a student's friend, or by the student working alone. As learning styles vary, providing several types of such activities to give students some choice will reinforce learning (pp. 29-30).
Guskey (2007) suggested several possible corrective activities, which are included in the following table. He recommended these be done during class time to ensure those who need them the most will take part.
How to Use Corrective Activities
|Activity||Helpful Characteristic||With Teacher||With Friend||By Oneself|
|Reteaching||Use different approach; different examples.||X|
|Individual Tutoring||Tutors can also include older students, teacher aides, classroom volunteers.||X||X|
|Peer Tutoring*||Avoid mismatched students, as this can be counterproductive.||X|
|Cooperative Teams||Teachers group 3-5 students to help one another by pooling knowledge of group members. Teams are heterogeneous and might work together for several units.||X|
|Course Textbooks||Reread relevant content, which corresponds to problem areas. Provide students with exact sections or examples so they can go directly to it.||X||X||X|
|Alternative Textbooks||These might offer a different presentation, explanation, or examples.||X||X||X|
|Workbooks/Study Guides||Includes videotapes, audiotapes, DVDs, hands-on material, manipulatives, Web resources, and so on.||X||X||X|
|Academic Games||Can promote learning via cooperation and collaboration.||X||X||X|
|Learning Kits||Usually include visual presentations and tools, models, manipulatives, interactive multimedia content. Can be commercial or teacher made.||X||X|
|Learning Centers/Labs||Include hands-on and manipulative tasks. Involve structured activity with specific assignment to complete.||X||X|
|Computer Activities||Can be effective when students become familiar with how a program works and when software matches learning goals.||X||X|
Source: Adapted from Guskey, T. (2007). The rest of the story. Educational Leadership, 65(4), 31.
*The IRIS Center at Vanderbilt University provides commentary on the use of peer tutoring and a video illustrating two students engaged in peer tutoring.
According to Guskey (2007), some students will demonstrate mastery of concepts on an initial formative assessment. These students are ideal candidates for enrichment activities while others are engaged in corrective activities. "Rather than being narrowly restricted to the content of specific instructional units, enrichment activities should be broadly construed to cover a wide range of related topics" (p. 32). As with corrective activities, students should have some freedom to choose an activity that interests them. Teachers might consider having students produce a product of some kind summarizing their work. This enhances the experience so that students don't construe the time spent as busy work.
Unlike formative assessment, which is assessment for learning, summative assessment is assessment of learning. According to Burns and colleagues (n.d.), these assessments are comprehensive, typically given at the end of a program, and provide for accountability. Such judgments include grading an essay or test, for example.
Traditional assessments might include multiple choice, true/false, and matching. Alternative assessments take the form of short answer questions, essays, electronic or paper-based portfolios, journal writing, oral presentations, demonstrations, creation of a product, student self-assessment and reflections, and performance tasks that are assessed by predetermined criteria noted within rubrics. Self and peer assessments can be both formative and summative in nature, and help students to take responsibility for and to become critical of their own work.
Assessments related to the Common Core State Standards in Mathematics will also include performance assessments and extended-response questions. Christina Brown and Pascale Mevs (2012) noted the following definition of performance assessment, as developed by the Quality Performance Assessment Initiative of the Center for Collaborative Education and the Nellie Mae Education Foundation:
Quality Performance Assessments are multi-step assignments with clear criteria, expectations and processes that measure how well a student transfers knowledge and applies complex skills to create or refine an original product. (p. 1)
In Overcoming Textbook Fatigue, ReLeah Lent (2012) noted that performance-based assessments take many forms. They might include project exhibits, oral presentations, debates, panel discussions on open-ended questions, lab experiments, multimedia presentations, and demonstrations of experiments or solutions to problems. Learners might conduct interviews, create visual displays (e.g., graphs, charts, posters, illustrations, storyboards, cartoons) or photo essays, construct models, or contribute to blogs, wikis or other electronic projects (Lent, 2012, p. 137). [Note: You will find more about performance tasks in CT4ME's section on curriculum mapping.]
In Assessment to the Core, a webinar hosted by the School Improvement Network, Jay McTighe (2014) presented a Framework of Assessment Approaches and Methods (slides handout, p. 4), which highlights potential forms of performance assessments in the following table.
Framework of Assessment Approaches and Methods
How might we assess student learning in the classroom?
|SELECTED RESPONSE ITEMS||PERFORMANCE-BASED ASSESSMENTS|
|multiple-choice||fill in the blank: word(s), phrases(s)||essay||oral presentation||oral questioning|
|true-false||short answer: sentence(s), paragraph(s)||research paper||dance/movement||observation ("kid watching")|
|matching||label a diagram||blog/journal||science lab demonstration||interview|
| ||Tweets||lab report||athletic skills performance||conference|
| ||"show your work"||story/play||dramatic reading||process description|
| ||representation(s): e.g., fill in a flowchart or matrix||concept map||enactment||"think aloud"|
| || ||science project*||Prezi/Power Point|| |
| || ||3-D model||music performance|| |
*McTighe's example was a science project. Consider a math project in the math classroom.
Source: School Improvement Network (2014, January). Assessment to the Core: Assessments that enhance learning [Webinar featuring Jay McTighe]. Retrieved from http://www.schoolimprovement.com/webinars/assessment-to-the-core/
In Engaging the Online Learner: Activities and Resources for Creative Instruction, Conrad and Donaldson (2004, 2011) characterized an authentic activity as one that "simulates an actual situation" and "draws on the previous experiences of the learners" (Conrad & Donaldson, 2004, p. 85). They posed six questions to guide educators who design such activities:
Is the activity authentic?
Does it require learners to work collaboratively and use their experiences as a starting point?
Are learners allowed to learn from their mistakes?
Does the activity have value beyond the learning setting?
Does the activity build skills that can be used beyond the life of the course?
Do learners have a way to implement their outcomes in a meaningful way? (p. 86).
The School of Education at the University of Wisconsin-Stout has an extensive collection of authentic assessment resources and rubrics.
Consider having students use an e-portfolio for documenting their progress toward mastering state standards and to show off their best work.
"A portfolio is a collection of items that is assembled by students or their teachers to show a range of work in a subject." It can be used for both formative and summative assessments. For a structured portfolio, educators might provide activities and assignments to include. Learners would then create a table of contents (Lent, 2012, p. 134). George Lorenzo and John Ittelson (2005) defined an e-portfolio as "a digitized collection of artifacts, including demonstrations, resources, and accomplishments that represent an individual, group, community, organization, or institution. The collection can be comprised of text-based, graphic, or multimedia elements archived on a Web site or on other electronic media such as CD-ROM or DVD" (p. 2).
In Digital-Age Assessment, Harry Tuttle (2007) recommended using e-portfolios as a method to look beyond traditional assessment. "A common e-portfolio format includes a title page; a standards' grid; a space for each individual standard with accompanying artifacts and information on how each artifact addresses the standard; an area for the student's overall reflection on the standard; and a teacher formative feedback section for each standard. Within the e-portfolio, the evidence of student learning may be in diverse formats such as Web pages, e-movies, visuals, audio recordings, and text" (Getting Started section).
Portfolio Assessment from Prince George's County Public Schools (MD) contains a collection of resources linking portfolios to instruction. Learn what a portfolio is and why to use it, characteristics of an effective portfolio, the different types of portfolios, phases of development, and how to evaluate. Get resources for assessment and learn how to get started.
Using Technology to Support Alternative Assessment and Electronic Portfolios is a collection of online videos on e-portfolios, online articles and conference presentations on electronic portfolios by Dr. Helen Barrett. Consult Dr. Barrett's tips for Creating ePortfolios with Web 2.0 Tools. She also has extensive resources at her site: http://electronicportfolios.org/
Learning how to self-assess is an incremental process that can begin with the elementary grades. "During self-assessment, students reflect on the quality of their work, judge the degree to which it reflects explicitly stated goals or criteria, and revise. Self-assessment is formative...Self-evaluation, in contrast, is summative--it involves students giving themselves a grade" (Andrade, 2007, p. 60).
For self-assessment to be meaningful to students, they must prove to themselves that it can make a difference in their learning. In a differentiated classroom, self-assessment "also enables student and teacher to focus both on nonnegotiable goals for the class and personal or individual goals that are important for the development of each learner" (Tomlinson & McTighe, 2006, p. 80). Alison Preece (1995) provided eight tips for success. Teachers might point out the payoff, start small and keep things simple, build self-[assessment] into day-to-day activities, make it useful, clarify criteria, focus on strengths, encourage variety and integrate self-[assessment] strategies with peer and teacher [assessment], and grant it a high profile (p. 30).
As an example of building self-assessment into day-to-day activities, John Bond (2013) noted three easy-to-implement reflective strategies. Students might complete "I learned" statements. They could also use a think-aloud in which they reflect on what has just been taught and share what they learned with a partner. They might complete "Clear and Unclear Windows." For this last strategy, students divide a sheet of paper into two columns headed "clear" and "unclear." Then they fill in those columns, respectively, with what they understood about a lesson and what they did not understand (Bond, 2013).
To ease students into the process of evaluation, Preece (1995) also noted that students might evaluate the materials they use and the activities in which they are involved. Teachers might ask for suggestions for improving lessons they have presented. Peers might comment on the work of others by acknowledging what was good and suggesting a change or addition. Eventually students would "try a variety of strategies such as learning logs, conference records, response journals, self-report sheets, attitude surveys, and portfolio annotations" (p. 33). Teachers might encourage them to come up with questions on "attitudes, strategies, stumbling blocks, and indicators of progress or achievement" (p. 35). Students might write one or two statements of meaningful goals for themselves with some strategies for achieving them. The key to success with goal setting is follow-up: teachers monitor progress toward those goals at negotiated check-in times for discussion, including possible refinement or replacement of the goals.
To help students develop their personal accountability for learning, Preece (1995) suggested that teachers might require students to keep a record book with books/content read, completed assignments, projects, personal goals, accomplishments and what is working well, challenges to learning, and difficulties encountered. This serves as a basis for conferences, either with the teacher or with parents. In either case, students use this tool and lead those conferences to report their progress on learning.
Rubrics are not just for evaluation (i.e., assigning a grade) of student work. They are excellent tools to use for self- and peer-assessment, orienting students to what constitutes quality from the viewpoint of experts and serving as guides for revision and improvement. They are particularly valuable when students have input into their construction. When students use rubrics to monitor their progress on an assignment, they might underline key phrases in the rubric, perhaps with a colored pencil, and then use that same color to underline or circle those parts in their draft work that meet the standard identified in the rubric. If they can't find where in their work they have met the standard, they will immediately know that revision is needed (Andrade, 2007). The key to success when using rubrics is to build time for revision into the learning plan.
The design of the rubric is also crucial. Rubrics add the objective component to assessment and evaluation. Caution should be exercised in their use for evaluation. If a typical rubric has five to seven categories, some criteria that a grader values (e.g., originality) might not be among them. The unique perspective of students and their creativity might be thwarted if they self-assess their work using only the standards on the rubric. Maja Wilson (2007) pointed out the importance of dialogue as an assessment tool. Where "ideas, expertise, intent and audience matter...a conversation is the only process responsive enough to expose the human mind's complex interactions with language" (p. 80). Dialogue is just as important as using the rubric in assessment, and may lead to changes in the rubric itself as teachers collaborate with their students.
According to Popham (2003), the purposes of classroom tests vary, but prior to constructing any test, teachers should first identify the kinds of instructional decisions that will be made based on test results, and the kinds of score-based inferences needed to support those decisions. Teachers would be most interested in the content validity of their tests and how well their test items represent the curriculum to be assessed, which is essential to make accurate inferences on students' cognitive or affective status. "There should be no obvious content gaps, and the number and weighting of items on a test should be representative of the importance of the content standards being measured" (Chapter 5, All About Inferences section).
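One way to act on Popham's point about content gaps and weighting is to compare each standard's share of test points with its intended share. The following is only a minimal sketch: the function names, the sample standards, the intended weights, and the 5-percentage-point tolerance are all hypothetical choices made for illustration and do not come from Popham (2003).

```python
# Hypothetical blueprint check. Standards, weights, and the tolerance
# are illustrative assumptions, not taken from Popham (2003).

def point_shares(items):
    """Return each standard's share (0 to 1) of the total test points."""
    totals = {}
    for standard, points in items:
        totals[standard] = totals.get(standard, 0) + points
    grand_total = sum(totals.values())
    return {standard: pts / grand_total for standard, pts in totals.items()}


def blueprint_gaps(items, intended, tolerance=0.05):
    """Report standards that have no items or are weighted off target.

    `items` is a list of (standard, points) pairs, one per test item.
    `intended` maps each standard to its intended share of the test.
    """
    shares = point_shares(items)
    report = {}
    for standard, target in intended.items():
        actual = shares.get(standard, 0.0)
        if actual == 0.0:
            report[standard] = "content gap: no items"
        elif abs(actual - target) > tolerance:
            report[standard] = f"weighted {actual:.0%}, intended {target:.0%}"
    return report


# Example test: one (standard, points) pair per item.
test_items = [
    ("fractions", 10), ("fractions", 10),
    ("ratios", 20),
    ("geometry", 60),
]
intended_weights = {
    "fractions": 0.40, "ratios": 0.20, "geometry": 0.20, "statistics": 0.20,
}

gaps = blueprint_gaps(test_items, intended_weights)
for standard, message in gaps.items():
    print(f"{standard}: {message}")
```

In this made-up example, geometry is over-weighted, fractions is under-weighted, and statistics has no items at all. A teacher would still need to judge whether the intended weights themselves reflect the importance of each content standard.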
Test items can be classified as selected-response (e.g., multiple-choice or true-false) or constructed-response (e.g., essay or short-answer). To better foster validity, Salend (2011) recommended matching the testing items to how content was taught. "Essay questions are usually best for assessing content taught through role-plays, simulations, cooperative learning, and problem solving; objective test items (such as multiple choice) tend to be more appropriate for assessing factual knowledge taught through teacher-directed activities" (p. 54). When constructing either type, Popham (2003) offered five pitfalls to avoid, all of which interfere with making accurate inferences of students' status. They are "(1) unclear directions, (2) ambiguous statements, (3) unintentional clues, (4) complex phrasing, and (5) difficult vocabulary" (Chapter 5, Roadblocks to Good Item-Writing section). Students would benefit from directions that state the differential weighting of questions and the time limits. Ambiguity would be lessened by using pronouns only with clear referents and by choosing phrases that have a single meaning. Items should be written without obvious clues as to the correct answer. Examples of unintentional clues include a correct answer option that is longer than the incorrect options, or grammatical tip-offs (e.g., never, always).
Illustrating Popham's (2003) pitfalls to avoid, Fisher and Frey (2007) provided an example showing the difficulty in writing test stems for multiple-choice items. A student looks at a right triangle with legs marked as 5 cm each and the hypotenuse marked X. The intention is for the student to find the length of the hypotenuse. Consider the following stems: "Find X." and "Calculate the hypotenuse (X) of the right triangle." A middle school student who is at the beginning stages of learning English might circle the X. He has found it! However, the second stem states what was intended and is, therefore, the better, unambiguous stem for the question (p. 107).
Fisher and Frey's (2007) example illustrated the importance of creating tests that are accessible to all learners. Salend (2011) noted four elements to consider: directions, format, readability, and legibility. Directions should be concise and free of irrelevant information, state the precision that students should present in their answers, and provide point totals for items and sections. They often feature numerals for sequenced information, bullet points for information that does not have a specific order, direction reminders throughout the test, and symbols to prompt students to pay attention to directions. Directions enclosed within text boxes call attention to them and enhance the format of the test. The format also features test items in a numbered sequence, a reasonable number of test items on a page, similar question types grouped together, adequate space for answering each question on the test itself rather than on a separate scoring sheet, and numbered test pages. For improved readability, test questions contain only the words necessary, keeping sentences short, and the words chosen are consistent with those used in class. Rather than using pronouns, questions refer directly to the important ideas or facts needed to answer them. For improved legibility, Salend recommended using familiar fonts (e.g., Times New Roman or Arial) and type size 12 to 14 points for most test takers, and at least 18-point type for visually-impaired learners or beginning readers. Avoid text in all capital letters; avoid underlining text; use left-justified text and a line length of about four inches; use highlighting features (e.g., boldface or italics) only for calling attention to certain words in a sentence (pp. 54-55).
According to Popham (2007), assessments for the most part should be supplied to teachers, rather than having them create their own. However, "many vendors are not providing the sorts of assessments that educators need" (p. 80). For classroom use, formative diagnostic tests and interim tests that predict performance on upcoming accountability tests are most in demand, as well as "instructionally sensitive accountability tests that can accurately evaluate school quality" (p. 80). Teachers must be able to evaluate a vendor's test to determine if it fulfills the role that it is intended to serve. In doing so, Popham suggested that teachers keep the following questions in mind:
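For reference, the calculation that the second stem asks for follows from the Pythagorean theorem, using the 5 cm legs given in the example. The rounding to two decimal places is our own choice, not part of Fisher and Frey's example:

```latex
X = \sqrt{5^2 + 5^2} = \sqrt{50} = 5\sqrt{2} \approx 7.07\ \text{cm}
```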
Does the test measure a manageable number of instructionally meaningful curricular aims?
Do the descriptive materials accompanying the test clearly communicate the test's assessment targets?
Are there sufficient items on the test that measure each assessed curricular aim to let teachers and students know whether a student has mastered each skill or body of knowledge?
Are the items on the test more likely to assess what a student has been taught in school rather than what that student might have learned elsewhere? (p. 82).
Whether a test is teacher-made or vendor-made, the Test Accessibility and Modification Inventory is a valuable evaluation tool for facilitating a comprehensive analysis of tests and test items, including analysis of computer-based tests. It was written by Peter Beddow, Ryan Kettler, and Stephen Elliott (2010) of Vanderbilt University. Analysis considers the passage/item stimulus, the item stem, visuals, answer choices, page/item layout, fairness, and depth of knowledge required to answer a question. Computer-based test analysis also considers the test delivery system, test layout, test-taker training, and audio. These computer-based considerations will be of particular importance as computer-based assessments are planned for the Common Core State Standards in Mathematics.
Read Patricia Deubel's article in T.H.E. Journal
Deubel, P. (2010, September 15). Are we ready for testing under common core state standards? T.H.E. Journal. Available: http://thejournal.com/articles/2010/09/15/are-we-ready-for-testing-under-common-core-state-standards.aspx
Everyone makes mistakes. In viewing assessment for learning, "One of the best ways to encourage students to learn from their mistakes is to allow them to redo their work" for full credit (Lent, 2012, p. 141). However, there are some guidelines to consider so that redos neither become a logistical nightmare nor are used inappropriately just to swap grades. The goal for redos is to engage learners in deeper learning. ReLeah Lent provided tips to help educators develop their policy on redos. Key ideas included:
The bottom line, according to Lent (2012), is that "It is time to move from an unrealistic system where students have only one chance to get it right to a system where they understand that redos are not only OK but expected" (p. 141).
The following sources provide additional information on assessment:
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140-145. Retrieved from http://easlinstitute.org/wp-content/uploads/Heritage_formative_assessment.pdf
Heritage, M. (2010). Formative assessment and next-generation assessment systems: Are we losing an opportunity? Washington, DC: Council of Chief State School Officers. Retrieved from http://www.ccsso.org/Documents/2010/Formative_Assessment_Next_Generation_2010.pdf
Popham, W. J. (2009). A process – not a test. Educational Leadership, 66(7), 85-86. Retrieved from http://www.ascd.org/publications/educational-leadership/apr09/vol66/num07/A-Process%E2%80%94Not-a-Test.aspx
Stiggins, R. J. (2002). Assessment crisis: The absence of assessment FOR learning. Phi Delta Kappan, 83, 758-765. Retrieved from http://www.edtechpolicy.org/CourseInfo/edhd485/AssessmentCrisis.pdf
Andrade, H. (2007). Self-assessment through rubrics. Educational Leadership, 65(4), 60-63.
Beddow, P., Kettler, R., & Elliott, S. (2010). Test accessibility and modification inventory. Retrieved from http://peabody.vanderbilt.edu/docs/pdf/PRO/TAMI_Technical_Manual.pdf
Black, P., & Wiliam, D. (1998, October). Inside the black box: Raising standards through classroom assessment [Online]. Phi Delta Kappan, 80(2), 139-144, 146-148. [Note: also see the article at http://weaeducation.typepad.co.uk/files/blackbox-1.pdf].
Bond, J. (2013, October 24). Strategies for reflective assessment. ASCD Express, 9(2). Retrieved from http://www.ascd.org/ascd-express/vol9/902-bond.aspx
Boston, C. (2002). The concept of formative assessment. Practical Assessment, Research & Evaluation, 8(9). Retrieved from http://pareonline.net/getvn.asp?v=8&n=9
Brown, C., & Mevs, P. (2012). Quality performance assessments: Harnessing the power of teacher and student learning. Quincy and Boston, MA: Nellie Mae Education Foundation and Center for Collaborative Education. Retrieved from http://www.ccebos.org/qpa/wp-content/uploads/2012/02/qpa_report.pdf
Burns, M., Fager, J., Gumm, A., Haley, A., Krider, D., Linrud, J., et al. (n.d.). CMU assessment toolkit. Central Michigan University. Retrieved from https://www.cmich.edu/office_provost/AcademicAffairs/CAA/Assessment/Pages/Resources.aspx
Carter, G. (2004, December). When assessment defies best practice [Online editorial]. Alexandria, VA: ASCD. Retrieved from http://www.ascd.org/portal/site/ascd/ menuitem.ef397d712ea0a4a0a89ad324d3108a0c/
Chappuis, S., & Chappuis, J. (2007/2008, December/January). The best value in formative assessment. Educational Leadership, 65(4), 14-18.
Conrad, R., & Donaldson, J. (2004). Engaging the online learner: Activities and resources for creative instruction. San Francisco: Jossey-Bass.
Coppola, C. (2006). Understanding the open source portfolio. Retrieved from http://www.osportfolio.org/index.php?option=com_content&task=view&id=15&Itemid=30
Curwin, R. L. (2014). Can assessments motivate? Educational Leadership, 72(1), 38-40.
Davies, A. (2004). Transforming learning and teaching through quality classroom assessment: What does the research say? National Council of Teachers of English. School Talk, 10(1), 2-3.
Fisher, D., & Frey, N. (2007). Checking for understanding: Formative assessment techniques for your classroom. Alexandria, VA: ASCD.
Guskey, T. (2007). The rest of the story. Educational Leadership, 65(4), 28-35.
Lent, R. C. (2012). Overcoming textbook fatigue. Alexandria, VA: ASCD.
Lorenzo, G., & Ittelson, J. (2005). An overview of e-portfolios. Educause Learning Initiative. Retrieved from http://net.educause.edu/ir/library/pdf/ELI3001.pdf
McConachie, S., Hall, M., Resnick, L., Ravi, A., Bill, V., Bintz, J., & Taylor, J. (2006). Task, text, and talk: Literacy for all subjects. Educational Leadership, 64(2), 8-14.
McTighe, J., & O'Connor, K. (2006, Summer). Seven practices for effective learning. Educational Leadership, 63(10), 13-19.
Moss, C. M., & Brookhart, S. M. (2009). Advancing formative assessment in every classroom. Alexandria, VA: ASCD. Retrieved from http://www.ascd.org/publications/books/109031.aspx
National Center for Student Progress Monitoring (n.d.). Common questions for progress monitoring. Retrieved from http://www.studentprogress.org/default.asp
Paulu, N. (2003). Helping your students with homework: A guide for teachers. Washington, DC: U.S. Department of Education. Retrieved from http://www2.ed.gov/PDFDocs/hyc.pdf
Popham, W. J. (2011, Spring). Exposing the imbalance in 'balanced assessment'. Baltimore, MD: Johns Hopkins University, Better: Evidence-based Education, 14-15.
Popham, W. J. (2007). Who should make the test? Educational Leadership, 65(1), 80-82.
Popham, W. J. (2003). Test better, teach better: The instructional role of assessment. Alexandria, VA: ASCD. Retrieved from http://www.ascd.org/publications/books/102088.aspx
Preece, A. (1995). Self-evaluation: Making it matter. In A. Costa, & B. Kallick, Assessment in the Learning Organization (pp. 30-48). Alexandria, VA: ASCD.
RTI Action Network (n.d.). What is RTI? Retrieved from http://www.rtinetwork.org/
Salend, S. (2011). Creating student-friendly tests. Educational Leadership, 69(3), 52-58.
School Improvement Network (2014, January). Assessment to the Core: Assessments that enhance learning [Webinar featuring Jay McTighe]. Retrieved from http://www.schoolimprovement.com/webinars/assessment-to-the-core/
Tomlinson, C. (2014). The bridge between today's lesson and tomorrow's. Educational Leadership, 71(6), 11-14.
Tomlinson, C., & McTighe, J. (2006). Integrating differentiated instruction & Understanding by Design. Alexandria, VA: ASCD.
Tuttle, H. G. (2007, February 15). Digital-age assessment. Technology & Learning. Retrieved from http://www.techlearning.com/features/0039/digital-age-assessment/44127
Vatterott, C. (2011). Making homework central to learning. Educational Leadership, 69(3), 60-64.
Vatterott, C. (2009). Rethinking homework: Best practices that support diverse needs. Alexandria, VA: ASCD.
Wilson, M. (2007). The view from somewhere. Educational Leadership, 65(4), 76-80.