NEWL Quality Assurance

Overview

NEWL measures students’ functional proficiency in one of four target languages. NEWL results can be used to predict success in language learning beyond high school. The exam does not assess specific curricular achievement or the accumulation of grammatical, literary, or historical knowledge. Exam specifications reflect the most recent draft of the World Languages Framework (College Board, 2019) and the proficiency guidelines developed by the American Council on the Teaching of Foreign Languages (ACTFL).

Quality Assurance

NEWL is administered via American Councils’ Language Assessment Support System (ACLASS), which supports the administration of exams to approximately 1,000 examinees annually. The platform facilitates two-way communication between the American Councils administration and the many field experts who develop NEWL. The platform enables on-line item and exam development, data collection, scoring, and certification production. All exam forms are stored in the on-line system for quick assembly, sharing, analysis, and quality control.

ACLASS enables standardized exam conditions for all: Examinees take the same exam, administered under the same conditions and protocols, across various locales, via an internet connection. The ACLASS system also allows for efficient examinee registration and real-time monitoring during exam administration through a chat navigator. Customization of exams is also possible when an examinee has unique circumstances such as a need for make-up testing or special needs accommodations.

In sum, the ACLASS computer-based testing system supports NEWL through:

  • Collaborative exam development
  • Workflow management system
  • Online examinee registration
  • Easy exam center set-up and proctor certification
  • Real time, on-line exam proctoring and communication
  • Online constructed response scoring platform (writing and speaking tasks)
  • Efficient score reporting and data organization

Exam Development

NEWL assesses four language skills: reading, listening, writing, and speaking. The reading and listening comprehension sections consist entirely of multiple-choice items; the writing and speaking sections have open-ended response tasks. The content development team consists of five assessment specialists and a broad network of professional linguists, exam item writers, editors, and cultural experts. The full-time American Councils assessment team has expertise in second language acquisition, assessment design, foreign language standards, item and exam development, exam administration, psychometrics, and statistics. See the Key Personnel section below for more information about the assessment team at American Councils.

Peer Review in Item Development

American Councils’ Quality Assurance starts with a rigorous peer review of all exam items in their development phase. American Councils employs detailed protocols and assigns all personnel clear roles in the item development process. Trained specialists analyze and edit all passages, audios, and their accompanying translations and/or transcripts, as well as their items, through a process that requires several rounds of exchanges between Target Language Experts, Item Writers, Item Editors, and American Councils’ Item Development Managers. The stages of American Councils’ item development process are outlined below in more detail.

Stage One: Passage and Audio Selection for Reading and Listening Sections

American Councils selects culturally authentic reading materials and listening audios for NEWL that are level-appropriate and meet specific content and quality criteria. “Culturally authentic” means that the passages and audios are actual open-source media, taken from sources in the target culture and meant for native speakers. Target Language Experts align them to the ACTFL proficiency scale to ensure adequate coverage at all assessed performance levels. A different group of language experts conducts multiple rounds of language reviews to confirm the levels of the passages and audios as well as the accuracy of their English translations. In the final stage of passage or audio selection, a program manager confirms the passage level and quality, or brings in an additional expert to review if needed, before forwarding the materials to an exam item writer.

Stage Two: Item Writing

Item writers submit their work on a rolling basis according to targets in an item development work plan. Once items are submitted, item development managers review them and confirm that each passage or audio has the appropriate number of items for its content and ACTFL level. Managers also examine the structure and parts of the submitted items and send any problematic items to editors for further review and revision. Once the items are revised by editors, managers ensure that the item sets need no further editing before they are sent to a different expert for a blind review. In a blind review, the reviewer sees the passage and items exactly as a student would, with no additional information about the item’s history, edits, or expected answer key.

An expert who has not participated in the development of that item set provides the blind review to ensure that the set is level-appropriate, that no distractor is also a correct answer, that no item can be answered correctly through background knowledge alone, and that the item set meets other NEWL quality criteria. Managers then check the blind review comments and send the item set back to the original target language expert or editor for further review, if necessary. Once this is completed, an item development manager reviews and approves the item set in its entirety.

Stage Three: Psychometric Review

An American Councils psychometrician conducts psychometric analyses on the exam items and NEWL test forms. Quantitative item statistics are calculated from the raw examinee item response data to confirm that (1) items, keys, and distractors are functioning appropriately, (2) the items discriminate between high and low performers, and (3) there is no evidence of differential item functioning (DIF), which could bias results between genders or between heritage and non-heritage students. If psychometric quality criteria are not met, items are flagged for further review and revision.
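The first two checks above can be illustrated with classical item statistics. The following is a minimal sketch, not American Councils’ actual procedure: it assumes responses are already scored 0/1, uses the proportion correct as item difficulty and the corrected item-total (point-biserial) correlation as discrimination, and flags items against threshold values that are purely illustrative. The function names, thresholds, and sample data are hypothetical. DIF analysis is not covered here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item):
    """Correlation between the item score and the score on all other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)


def flag_items(responses, min_p=0.2, max_p=0.9, min_disc=0.2):
    """Return indices of items outside the (assumed, illustrative) thresholds."""
    flagged = []
    for item in range(len(responses[0])):
        p = item_difficulty(responses, item)
        disc = item_discrimination(responses, item)
        if not (min_p <= p <= max_p) or disc < min_disc:
            flagged.append(item)
    return flagged


# Hypothetical scored data: one row per examinee, one column per item.
sample_responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]

if __name__ == "__main__":
    print("Items flagged for review:", flag_items(sample_responses))
```

In this made-up data set, the last item is answered correctly mainly by low scorers, so its discrimination is negative and it would be sent back for review. A possible cause in practice is a miskeyed answer or a distractor that is defensible as correct.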

Personnel Performance Monitoring

American Councils ensures item quality through rigorous performance monitoring of target language experts and item writers. We allocate a larger proportion of item development tasks to high performers and train or release poor performers. Through this process, American Councils also identifies the specific strengths of each expert, such as a particular talent for item development at a lower or higher ACTFL level.

Secure Administration

NEWL is administered by trained proctors in computer labs or school classrooms in the United States and abroad. On exam days, proctors are required to sign each of their examinees in and out by entering their proctor information at each student’s computer. American Councils provides detailed directions on how to administer NEWL through a Proctor Instruction Manual as well as a detailed script to be read aloud to examinees on the day of administration. American Councils also provides online monitoring support for all exam administrations and an emergency telephone number for students experiencing connectivity or login issues.

The ACTFL Performance Scale and NEWL Scoring

NEWL is a criterion-referenced exam. NEWL assesses four ACTFL proficiency levels on reading and listening comprehension: Novice High, Intermediate Low, Intermediate Mid, and Intermediate High. The figure below shows the entire proficiency scale from Novice Low up to Distinguished. Note how NEWL assesses the low to middle levels of the ACTFL scale.

Empirical data are used in the standard-setting sessions to determine the cut scores for each examined performance level on the reading and listening sections. The writing and speaking sections are rated holistically by trained language specialists using the rating rubrics. Twenty percent (20%) of the writing and speaking responses are double rated, and the inter-rater reliability is computed to ensure acceptable consistency across raters.
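The double-rating check described above can be sketched as follows. This is an illustration under stated assumptions, not the documented NEWL procedure: the source does not say how the 20% sample is drawn or which reliability statistic is used, so the sketch assumes simple random selection and reports exact agreement together with Cohen’s kappa. All function names and the fixed seed are hypothetical.

```python
import random
from collections import Counter


def select_for_double_rating(response_ids, fraction=0.20, seed=0):
    """Randomly choose a fraction of responses to receive a second rating.

    Random selection and the fixed seed are assumptions for this sketch.
    """
    ids = list(response_ids)
    rng = random.Random(seed)
    k = max(1, round(len(ids) * fraction))
    return sorted(rng.sample(ids, k))


def exact_agreement(ratings_a, ratings_b):
    """Share of responses on which both raters gave the same score."""
    if not ratings_a or len(ratings_a) != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(ratings_a)
    observed = exact_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical scores from two raters on the same ten responses.
    first = [1, 2, 3, 3, 2, 1, 4, 4, 5, 5]
    second = [1, 2, 3, 3, 2, 2, 4, 4, 5, 4]
    print("Exact agreement:", exact_agreement(first, second))
    print("Cohen's kappa:", cohens_kappa(first, second))
```

Because rubric scores are ordered, a weighted kappa or a correlation-based index may be a better fit in practice; plain kappa is used here only to keep the example short.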

American Councils developed NEWL to be comparable to the standards of the College Board AP Exams. NEWL sub-scores are reported in terms of language proficiency on the ACTFL scale, along with a five-point scale comparable to that of the College Board AP® exams. At most universities, a score of 5, 4, 3, 2, or 1 on NEWL earns the same credit and/or placement as the same score on an AP exam in a World Language or any other subject.

The ACTFL Proficiency Guidelines are described by skill on the following page:

          ACTFL Level Descriptions | Language Center | College of Liberal Arts

Key Personnel

Mrs. Huma Shamsi is the NEWL Senior Program Coordinator with over a decade of experience in non-profit organizations, including many years in the field of assessment and developing proficiency exams in the less commonly taught languages. She is multilingual and trained in both ILR and ACTFL proficiency levels. She leads all aspects of the NEWL program. She has been involved with NEWL since its inception in 2017 and is passionate about linguistic and cultural competency.

Mr. Elias Wright is the Test Administration Coordinator for American Councils for International Education. As Test Administration Coordinator, Elias is responsible for coordinating exam administration, including registering examinees, deploying tests, generating score certificates, and performing quality assurance tests. He also provides remote monitoring and technical support for proctors during exams.

Mr. Ken Petersen, Ph.D., is the Technical Director for Online Learning and Assessment at American Councils, where he has overseen the development of web-based language learning and assessment systems for the past 20 years. Prior to his work with American Councils, he spent many years as an ESL teacher, researcher, and learner of languages throughout South and West Asia. He holds an M.A. in Near Eastern Languages from the University of Washington and a Ph.D. in Applied/Computational Linguistics from Georgetown University. His areas of expertise include instructional design, language testing, natural language processing, computer systems architecture, and second language acquisition. On NEWL, Dr. Petersen manages American Councils’ on-line, computer-based exam system.

Mrs. Nur Karatas, Ph.D., is the Lead Item Development Manager at American Councils and holds a doctorate in Second Language Acquisition from the University of Maryland. Since she joined American Councils in 2020, Dr. Karatas has monitored and coordinated the flow of item development per the ILR and ACTFL performance scales in multiple foreign language testing projects, such as DLPT5, Flagship, ProjectGo and NEWL. On NEWL, Dr. Karatas manages the process of ACTFL-based item development and exam form assembly. She provides regular trainings for target language experts, item writers/editors and raters of speaking and writing tasks to improve quality assurance of test items and scoring practices.

Mrs. Ying Li, Ph.D., is the Principal Psychometrician at American Councils and holds a doctorate in Educational Measurement and Assessment from the University of Maryland. Since joining American Councils in 2016, Dr. Li has planned, implemented, and managed test development, scoring, standard setting, and report delivery processes for proficiency-based language assessments, including NEWL. Prior to her position at American Councils, Dr. Li worked as a psychometrician on large-scale statewide assessments as well as national certification exams, with extensive experience in creating test blueprints, reviewing examinees’ data, and facilitating standard-setting meetings to determine cut scores.

Mr. Todd Drummond, Ph.D., has a doctorate in Education Policy from Michigan State University. He has worked in educational assessment, standards, test development, and statistics for over 20 years. Dr. Drummond has worked as a foreign language educator and test item development trainer on contract with the U.S. Agency for International Development (USAID) and the World Bank. He has conducted assessment and test item development and standard setting workshops in Bangladesh, Ethiopia, Kyrgyzstan, Moldova, Georgia, Tajikistan, Mozambique, Honduras, and Ukraine, and led educational assessment initiatives in several other countries. On NEWL, he supports the Program Coordinator in all aspects of her work.