# Compute Descriptive Statistics for Earnings by Gender for the Full Sample and by Race/Ethnicity

Data

For this project, we will use a sample of full-time, year-round (FTYR) workers. Individuals who worked at least 40 of the past 52 weeks and at least 35 hours during a typical week are classified as FTYR workers. Our data file includes three variables: annual earnings (dollars), gender (male or female), and race/ethnicity (non-Hispanic white, non-Hispanic Black, non-Hispanic Asian, and Hispanic).

Statistical Analysis

Compute descriptive statistics for earnings by gender for the full sample and by race/ethnicity. Examine the data for suspicious values.
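One way to organize this step is sketched below in Python, using only the standard library. The record layout, the function names (`describe`, `summarize`), and every earnings value are hypothetical and chosen for illustration; they are not taken from the project data file. The minimum and maximum are included because they are a quick first check for suspicious values.

```python
from collections import defaultdict
from statistics import mean, median, stdev

# Hypothetical records: (annual earnings in dollars, gender, race/ethnicity).
# These values are made up for illustration, not drawn from the project data.
records = [
    (52000, "male", "Hispanic"),
    (41000, "female", "Hispanic"),
    (38000, "male", "Hispanic"),
    (36000, "female", "Hispanic"),
    (75000, "male", "non-Hispanic white"),
    (60000, "female", "non-Hispanic white"),
    (58000, "male", "non-Hispanic white"),
    (49000, "female", "non-Hispanic white"),
]

def describe(values):
    """Return n, mean, median, sample standard deviation, min, and max."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values) if len(values) > 1 else float("nan"),
        "min": min(values),
        "max": max(values),
    }

def summarize(rows):
    """Group earnings by gender for the full sample and by (race/ethnicity, gender)."""
    groups = defaultdict(list)
    for earnings, gender, race in rows:
        groups[("all", gender)].append(earnings)   # full sample
        groups[(race, gender)].append(earnings)    # within race/ethnicity
    return {key: describe(vals) for key, vals in groups.items()}

summary = summarize(records)
for (race, gender), s in sorted(summary.items()):
    print(f"{race:20s} {gender:7s} n={s['n']} mean={s['mean']:.0f} "
          f"median={s['median']:.0f} sd={s['sd']:.0f} "
          f"min={s['min']} max={s['max']}")
```

The same grouping can be done in a spreadsheet or statistical package; the point of the sketch is only that each statistic is computed separately for each gender within the full sample and within each racial/ethnic group.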

Using the full sample (i.e., for now do not exclude individuals with unrealistic earnings), compute the gender earnings gap overall and by race/ethnicity. The gender earnings gap (GEG) is defined as follows:

GEG = 1 – (median earnings of women working FTYR / median earnings of men working FTYR).
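As a minimal sketch of the definition above, the following Python function computes the GEG from two lists of earnings. The function name `gender_earnings_gap` and the earnings values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import median

def gender_earnings_gap(women_earnings, men_earnings):
    """GEG = 1 - (median earnings of women / median earnings of men)."""
    return 1 - median(women_earnings) / median(men_earnings)

# Hypothetical FTYR earnings, for illustration only
women = [36000, 41000, 49000, 60000]
men = [38000, 52000, 58000, 75000]

geg = gender_earnings_gap(women, men)
print(f"GEG = {geg:.3f}")
```

In this made-up example the median for women is 45,000 and the median for men is 55,000, so the GEG is 1 - 45,000/55,000, or about 0.18. To obtain the gap by race/ethnicity, apply the same function to the earnings of women and men within each group.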

Test whether the population mean earnings of women working FTYR differ from the population mean earnings of men working FTYR. Repeat this test for each of the four racial/ethnic groups. For example, test whether the population mean earnings of Hispanic women working FTYR differ from the population mean earnings of Hispanic men working FTYR. For each of the hypothesis tests, determine whether the difference in sample mean earnings across the two groups is statistically significant at the 1% level, statistically significant at the 5% level, or not statistically significant at the 5% level.
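The sketch below shows one possible form of such a test in Python: a two-sided, unequal-variance (Welch) test of the difference in means. It rests on two assumptions that are not stated in the assignment. First, the choice of the unequal-variance form is ours; your course may specify a different version. Second, the p-value uses the standard normal approximation, which is reasonable only for large samples; statistical software would normally use the t distribution instead. The function names and the earnings values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_test(sample_a, sample_b):
    """Two-sided test of equal population means, allowing unequal variances.

    Returns (test statistic, p-value). The p-value uses the standard normal
    approximation, an assumption suited to large samples only.
    """
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value

def significance_label(p_value):
    """Classify the result using the three categories named in the assignment."""
    if p_value < 0.01:
        return "statistically significant at the 1% level"
    if p_value < 0.05:
        return "statistically significant at the 5% level"
    return "not statistically significant at the 5% level"

# Hypothetical earnings for 50 women and 50 men, for illustration only
women = [30000 + 500 * i for i in range(50)]
men = [33000 + 600 * i for i in range(50)]

t_stat, p_value = welch_test(women, men)
label = significance_label(p_value)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}: {label}")
```

Running the same function once on the full sample and once within each of the four racial/ethnic groups yields the five tests.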

Optional: Repeat the analysis above after excluding individuals with unrealistic earnings. Indicate which observations were excluded for this second analysis. To help you determine which observations to exclude, note that the 2014 federal minimum wage was $7.25/hour and that all individuals in the sample reported working at least 40 of the past 52 weeks and at least 35 hours in a typical week. Report the results of these hypothesis tests and/or discuss whether the implications differ from the results based on all observations. Be sure to report results based on all observations even if you also report these additional results.
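One plausible way to turn the figures above into an exclusion rule is to compute the lowest annual earnings consistent with them: the minimum wage times the minimum hours per week times the minimum weeks worked. The Python sketch below does this. Treating that product as a cutoff is an assumption, not a rule given in the assignment, and some workers may legitimately earn less, so the choice remains a judgment call. The function name `split_by_floor` and the sample values are hypothetical.

```python
# Values stated in the assignment text
FEDERAL_MIN_WAGE = 7.25      # dollars per hour (2014)
MIN_WEEKS_WORKED = 40        # of the past 52 weeks
MIN_HOURS_PER_WEEK = 35      # in a typical week

# Lowest annual earnings consistent with those three figures
earnings_floor = FEDERAL_MIN_WAGE * MIN_HOURS_PER_WEEK * MIN_WEEKS_WORKED

def split_by_floor(earnings, floor=earnings_floor):
    """Return (kept, excluded); excluded values fall below the floor."""
    kept = [e for e in earnings if e >= floor]
    excluded = [e for e in earnings if e < floor]
    return kept, excluded

# Hypothetical earnings values, for illustration only
sample = [500, 9000, 10150, 24000, 61000]
kept, excluded = split_by_floor(sample)

print(f"floor = ${earnings_floor:,.2f}")
print("excluded:", excluded)
```

The floor works out to $7.25 × 35 × 40 = $10,150 per year. This rule addresses only implausibly low values; whether any very high values are also unrealistic is a separate judgment.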

Please note that the GEG is a function of the median earnings but the hypothesis tests involve mean earnings.

Report

Write a brief report for the Institute for Women’s Policy Research.

Open with a brief motivation for estimating gender earnings gaps.

Provide a table reporting the mean, median, and standard deviation of earnings by gender (for all four racial/ethnic groups combined) and by gender and race/ethnicity. Also report the overall gender earnings gap as well as the gender earnings gap for each race/ethnicity. Provide a brief discussion of these descriptive statistics.

Report and interpret the results of each of your five hypothesis tests. Be sure to discuss the statistical significance of each difference.

Discuss the implications as well as the limitations of your findings.

Scoring Criteria

Background and Motivation

The purpose of the analysis is clearly stated near the beginning of the report.

The report provides background on the data including the source of data and the subsample sizes (number of men and women).

The report engages the reader’s interest by explaining the importance of the topic.

____ 4 possible points

Descriptive Statistics

The report provides accurate and relevant descriptive statistics, namely the mean, median, and standard deviation of men’s earnings overall, women’s earnings overall, men’s earnings by race/ethnicity, and women’s earnings by race/ethnicity. The report also indicates the gender earnings gap overall and by race/ethnicity.

These results are presented in a table. The table looks professional.

The discussion of the descriptive statistics and gender earnings gap is appropriate and interesting.

____ 8 possible points

Hypothesis Tests

The report presents the results of hypothesis tests concerning differences in mean earnings by gender overall and by race/ethnicity.

The description of the tests is clear and accurate.

These results are presented in a table. The table looks professional.

The t statistics, p-values, and hypothesis-testing conclusions are correct.

The interpretation of the findings is correct.

____ 18 possible points (up to 2 extra credit points for this section if optional results are also presented and/or discussed)

Implications and Limitations

The report provides a good discussion of the implications of the findings including discussions of both the statistical and economic significance of the findings.

The report addresses at least two limitations; one of these could relate to the possibility of incorrect earnings values for some observations.

____ 6 possible points

Quality of Writing

The report is clear. There are no sentence-level errors. The writing is professional.

____ 4 possible points

____ 40 Possible Points (Up to 42 Possible Points if Optional Results Presented)