College Board

AP Central

Statistical Specifications

A key step in ensuring that AP Exams are comparable to college-level exams and reasonably parallel from year to year is the creation of "statistical specifications" that define the target distribution of item difficulties on an exam. The Development Committees and content experts work closely with ETS statisticians to apply these specifications when they develop AP Exams.

The delta is an index of item difficulty used at ETS. For the multiple-choice sections of AP Exams, the statistical specifications consist of a desired distribution of deltas with a particular mean and standard deviation. The calculation of deltas, the process of delta equating, and the differences between observed and equated deltas are described in detail in the analyzing section of this Technical Corner.
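As a rough illustration of how a delta relates to the proportion of students who answer an item correctly, the sketch below uses the conventional ETS definition, delta = 13 - 4z, where z is the standard normal deviate for the proportion correct. That formula is not stated on this page, so treat it as an assumption here; the function names and sample proportions are hypothetical, and delta equating is not modeled.

```python
from statistics import NormalDist, mean, pstdev


def observed_delta(p_correct: float) -> float:
    """Convert a proportion correct to a delta.

    Assumes the conventional delta scale (mean 13, SD 4), which this
    page does not spell out.
    """
    if not 0.0 < p_correct < 1.0:
        raise ValueError("p_correct must be strictly between 0 and 1")
    # Harder items (lower proportion correct) receive higher deltas.
    return 13.0 - 4.0 * NormalDist().inv_cdf(p_correct)


def summarize(deltas):
    """Return the mean and (population) standard deviation of a set of deltas."""
    return mean(deltas), pstdev(deltas)


# Hypothetical proportions correct for six multiple-choice items.
proportions = [0.85, 0.70, 0.55, 0.50, 0.40, 0.25]
deltas = [observed_delta(p) for p in proportions]
form_mean, form_sd = summarize(deltas)
print(f"mean delta = {form_mean:.2f}, SD = {form_sd:.2f}")
```

A draft form's mean and standard deviation computed this way could then be compared with the values called for in the statistical specifications.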

There are two different models for constructing exams:
  1. Model I. This model, illustrated in Table 2.2, makes use of item response theory (IRT); detailed information on using IRT in the development of statistical specifications can be found in Marco (1977). Results of IRT analyses of multiple-choice questions can be translated directly into distributions of deltas suitable for use as statistical specifications. The AP Program recommends granting credit and/or advanced placement to students who receive grades of 3, 4, or 5. These delta distributions were chosen because they provided excellent discrimination among students at the 2 to 3 cut-off point (i.e., the boundary that defines "qualified students") and more than adequate discrimination at the 3 to 4 cut-off point. A detailed discussion of the assignment of AP grades is presented in the grading section of this Technical Corner.

    The Model I distributions were developed using one specific exam administered to a group of AP students in a particular year. For that reason, the specifications are reexamined periodically and adjusted as necessary to keep them relevant to the ability levels of current groups of AP students.

  2. Model II. For the smaller-volume exams (that is, those taken by fewer students), IRT is not used to develop statistical specifications. Instead, ETS statisticians develop separate distributions of observed deltas for exams that have four-choice items and for exams that have five-choice items (see Table 2.4). These distributions are centered on middle difficulty and are approximately normal around the mean. Slightly more items are specified for delta intervals below the mean than above it in order to maximize, to the extent possible, discrimination around the 2 to 3 cut-off point.

    Subjects that use Model II are: AP Art History, Computer Science, Economics (mean 11.9), Environmental Science (mean 12.0), French Literature, German Language, International English, Latin, Music Theory, Spanish Literature, Statistics, and World History.
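The shape of a Model II specification can be sketched as follows: allocate a fixed number of items to delta intervals in proportion to a normal curve, then move an item from just above the mean to just below it. This is only an illustration, not the ETS procedure. The mean of 12.0 comes from the list above, but the standard deviation, the interval edges, the item count, and the one-item shift are all hypothetical.

```python
from statistics import NormalDist


def target_counts(n_items, mean, sd, edges, shift=1):
    """Allocate n_items to delta intervals in proportion to a normal curve."""
    dist = NormalDist(mean, sd)
    intervals = list(zip(edges, edges[1:]))
    # Probability mass of each interval, renormalized to the covered range.
    masses = [dist.cdf(hi) - dist.cdf(lo) for lo, hi in intervals]
    total = sum(masses)
    exact = [n_items * m / total for m in masses]
    counts = [int(x) for x in exact]
    # Largest-remainder rounding so the counts sum to n_items.
    order = sorted(range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[: n_items - sum(counts)]:
        counts[i] += 1
    # Hypothetical adjustment: move `shift` items from the interval just
    # above the mean to the interval just below it.
    mids = [(lo + hi) / 2 for lo, hi in intervals]
    below = max(i for i, m in enumerate(mids) if m < mean)
    above = min(i for i, m in enumerate(mids) if m > mean)
    moved = min(shift, counts[above])
    counts[above] -= moved
    counts[below] += moved
    return counts


# Hypothetical 60-item form; SD of 2.0 and one-point intervals are assumptions.
edges = list(range(6, 19))  # delta intervals 6-7, 7-8, ..., 17-18
counts = target_counts(60, mean=12.0, sd=2.0, edges=edges)
for (lo, hi), c in zip(zip(edges, edges[1:]), counts):
    print(f"delta {lo:>2}-{hi:<2}: {c} items")
```

The resulting counts remain roughly bell-shaped around the mean while placing a few more items below it than above it, which is the property the Model II description calls for.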




