
After the Exam: The Gini Index Activity (Statistics and Calculus)

The following is a joint post-exam project for AP Calculus and AP Statistics students.

I recommend a project based on the Gini Index. The Gini Index is a measure of the inequity of a distribution (income, land ownership, use of energy, etc.). This is a project I have my students do in calculus each year, and I have also had my statistics students work on part of it. I think it would be great as a joint project.

Daniel J. Teague
NC School of Science and Mathematics
Durham, North Carolina
Imagine lining up all the households in the U.S., from the household with the smallest income to the household with the largest income (that's the Gates household). Now, divide this linear ordering into five equally sized groups. Each group contains 20 percent of U.S. households. What fraction of the total income of the U.S. does each fifth have?

In 2000, the distribution of income reported by the Census Bureau was the following:

The first fifth of the households held 3.6 percent of the total income.
The second fifth of the households held 8.9 percent of the total income.
The third fifth of the households held 14.9 percent of the total income.
The fourth fifth of the households held 23.0 percent of the total income.
The highest fifth of the households held 49.6 percent of the total income.

(A list of historical income distributions since 1967 can be found in "See also," below.)

We would like to develop a method of "measuring" how inequitable this distribution of income is (calculus) and use this measure to compare the inequity for other years, other countries, or Democratic versus Republican presidencies (statistics).

First we need a way to measure inequity. Imagine a country in which the distribution of income is perfectly equitable. In this case, the bottom 20 percent of the households will have 20 percent of the income. The bottom 40 percent will have 40 percent of the income, the bottom 60 percent of the households will have 60 percent of the income, and the bottom 80 percent will have 80 percent of the income. Of course, 100 percent of the households have 100 percent of the income. The cumulative distribution will be represented by the line y = x. The area under y = x from 0 to 1 is 1/2.

Now consider a country in which the distribution is perfectly inequitable. In this case, the bottom 20 percent of the households will have 0 percent of the income. The bottom 40 percent will have 0 percent of the income, the bottom 60 percent of the households will have 0 percent of the income, and the bottom 80 percent will have 0 percent of the income. Only one household has all the income and every other household has nothing. The cumulative distribution will be represented by the line y = 0. The area under y = 0 from 0 to 1 is 0.

In 2000, we have:

The lowest fifth of the households held 3.6 percent of the total income.
The lowest two-fifths of the households held 3.6 percent + 8.9 percent = 12.5 percent of the total income.
The lowest three-fifths of the households held 12.5 percent + 14.9 percent = 27.4 percent of the total income.
The lowest four-fifths of the households held 27.4 percent + 23.0 percent = 50.4 percent of the total income.
Of course, the lowest five-fifths of the households (all of them) held 50.4 percent + 49.6 percent = 100 percent of the total income.
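
The running totals above can be reproduced with a short script. This is a minimal sketch in Python (the project itself does not prescribe a language); the variable names `shares`, `cumulative`, and `points` are chosen here for illustration.

```python
from itertools import accumulate

# Income shares by fifth for 2000, in percent, lowest fifth first (from the text).
shares = [3.6, 8.9, 14.9, 23.0, 49.6]

# Running totals give the cumulative share held by the lowest k fifths.
# Rounding to one decimal removes floating-point noise from the additions.
cumulative = [round(total, 1) for total in accumulate(shares)]

# Pair each cumulative household fraction with its cumulative income fraction,
# starting from (0, 0): no households hold no income.
points = [(0.0, 0.0)] + [
    (round(0.2 * k, 1), round(c / 100, 3))
    for k, c in enumerate(cumulative, start=1)
]

for x, y in points:
    print(f"lowest {x:.0%} of households: {y:.1%} of income")
```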

This gives the ordered pairs (0.2, 0.036), (0.4, 0.125), (0.6, 0.274), (0.8, 0.504), and (1.0, 1.0). Of course (0,0) is also a point in this measure. These points define a function that lies between y = 0 and y = x. This function is known as the Lorenz curve, L(x). The more inequitable the distribution, the greater the area between y = x and y = L(x).

Since the largest possible area is 1/2, perfect inequity, we will define the Gini Index as the ratio of the area between y = x and y = L(x) to 1/2. (This also answers the student question, "When will we ever be interested in knowing the area between two curves?")

Now, how do we find the Lorenz curve? We know that the function must pass through (0,0) and (1,1), so we choose a power function y = x^n as our model. We cannot just use our calculators to find the power function y = ax^n, since this does not necessarily pass through (1,1). (While Lorenz curves can have many different shapes, depending on what is being modeled, in this setting models of the form y = x^n are typically used.)

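Calculus students must develop a method for finding L(x) by minimizing the sum of squared residuals for the function:
S(n) = (y1 - 0.2^n)^2 + (y2 - 0.4^n)^2 + (y3 - 0.6^n)^2 + (y4 - 0.8^n)^2,
where y1, y2, y3, and y4 are the cumulative income fractions at x = 0.2, 0.4, 0.6, and 0.8.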

Notice that (0,0) and (1,1) give you no information for this function, but the form y = x^n guarantees these points will be on the curve. Students will need to use numerical techniques to find the zero of the derivative S'(n).
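
One possible numerical technique is bisection on the derivative. The sketch below is an illustration, not the method students are required to use: it assumes the 2000 cumulative fractions (0.036, 0.125, 0.274, 0.504), and the helper names `S`, `dS`, and `bisect_root`, along with the search bracket [1, 5], are choices made here.

```python
import math

# Cumulative points from the 2000 data:
# x = fraction of households, y = fraction of total income.
xs = [0.2, 0.4, 0.6, 0.8]
ys = [0.036, 0.125, 0.274, 0.504]

def S(n):
    """Sum of squared residuals for the model y = x**n."""
    return sum((y - x**n) ** 2 for x, y in zip(xs, ys))

def dS(n):
    """Derivative of S with respect to n, using d/dn x**n = x**n * ln(x)."""
    return sum(-2.0 * (y - x**n) * x**n * math.log(x) for x, y in zip(xs, ys))

def bisect_root(f, lo, hi, tol=1e-12):
    """Find a zero of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid          # sign change is in the left half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the right half
    return 0.5 * (lo + hi)

# Assumed bracket: S is decreasing at n = 1 and increasing at n = 5 for these data.
n_hat = bisect_root(dS, 1.0, 5.0)
print(f"n = {n_hat:.4f}, S(n) = {S(n_hat):.6f}")
```

Newton's method or a calculator's solver would serve equally well; bisection is used here only because it needs nothing beyond a sign change in S'(n).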

They could also re-express the data and find the least squares solution to:
S(n) = [ln(y1) - n ln(0.2)]^2 + [ln(y2) - n ln(0.4)]^2 + [ln(y3) - n ln(0.6)]^2 + [ln(y4) - n ln(0.8)]^2
They can do this analytically.
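
For the re-expressed problem, setting S'(n) = 0 gives the closed form n = sum(ln(xi) ln(yi)) / sum(ln(xi)^2), which is the slope of a least squares line through the origin in (ln x, ln y). The following sketch evaluates that formula for the 2000 data; the name `n_log` is chosen here for illustration.

```python
import math

# Cumulative points from the 2000 data:
# x = fraction of households, y = fraction of total income.
xs = [0.2, 0.4, 0.6, 0.8]
ys = [0.036, 0.125, 0.274, 0.504]

# Work in log-log coordinates, where y = x**n becomes ln(y) = n * ln(x).
log_x = [math.log(x) for x in xs]
log_y = [math.log(y) for y in ys]

# Closed-form least squares slope for a line through the origin.
n_log = sum(lx * ly for lx, ly in zip(log_x, log_y)) / sum(lx * lx for lx in log_x)

print(f"n from the log-log fit = {n_log:.4f}")
```

Because the logarithm reweights the residuals, this exponent will generally differ from the one that minimizes the untransformed sum of squares, which makes a useful point of discussion.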

Once they have found a method for generating a Lorenz curve from the data, they need to find the Gini Index.

This is a simple computation: GI = defint(x - L(x), x, 0, 1)/0.5, that is, the definite integral of x - L(x) from 0 to 1, divided by 1/2.
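
For the power model L(x) = x^n, the integral of x - x^n from 0 to 1 is 1/2 - 1/(n + 1), so the Gini Index reduces to (n - 1)/(n + 1). The sketch below checks that closed form against a midpoint-rule integration; the function names `gini_power` and `gini_numeric` and the step count are assumptions made for this illustration.

```python
def gini_power(n):
    """Gini Index for the Lorenz model L(x) = x**n: 2 * (1/2 - 1/(n + 1))."""
    return (n - 1.0) / (n + 1.0)

def gini_numeric(L, steps=100_000):
    """Approximate (integral of x - L(x) on [0, 1]) / 0.5 with the midpoint rule."""
    h = 1.0 / steps
    area = sum(((i + 0.5) * h) - L((i + 0.5) * h) for i in range(steps)) * h
    return area / 0.5

# Example exponent only; substitute the n obtained from the fitting step.
n = 2.0
print(gini_power(n), gini_numeric(lambda x: x**n))
```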

(A Gini Index can also be computed directly from the data using a trapezoid rule, without first finding the Lorenz curve, but that removes some of the calculus from the problem.)
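
As a sketch of that direct approach, the trapezoid rule can be applied to the six cumulative points for 2000. The function name `gini_trapezoid` is chosen here for illustration.

```python
# Cumulative points for 2000, including the endpoints (0, 0) and (1, 1).
points = [
    (0.0, 0.0),
    (0.2, 0.036),
    (0.4, 0.125),
    (0.6, 0.274),
    (0.8, 0.504),
    (1.0, 1.0),
]

def gini_trapezoid(pts):
    """Gini Index from cumulative points, using trapezoids for the area under L(x)."""
    area_under_lorenz = sum(
        0.5 * (y0 + y1) * (x1 - x0)
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )
    # Area between y = x and the Lorenz curve, relative to the maximum area 1/2.
    return (0.5 - area_under_lorenz) / 0.5

print(f"Gini Index (trapezoid rule) = {gini_trapezoid(points):.4f}")
```

Since a Lorenz curve is convex, the trapezoids sit on or above it, so this estimate tends to run somewhat low compared with one based on a smooth curve through the same points.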

Once the Gini Indices have been computed, we can proceed to the statistical questions, such as:

  1. Are the Gini Indices lower for Republican administrations than for Democratic administrations? Interpret the result of this investigation.
  2. Are the Gini Indices higher for industrialized countries than for agrarian countries? Interpret the result of this investigation.
  3. Is there a relationship between the Gini Index and unemployment levels?
  4. Is there a relationship between the Gini Index and race, etc.?
A more detailed explanation of the Gini Index can be found in "See also," below.






