Describing the Books Data Set
27 sixth graders reported books read over summer.
Context report:
- $n = 27$ observations
- Attribute: books read over summer
- Units: books
- Method: self-reported survey
Mean Measures Equal Distribution of Values
The mean is what each person would get if the total were divided equally:
- Add every value; divide by the number of values, $n$
- Interpretation: "the fair share" — equal portion for each
- Sensitive to outliers — extreme values pull the mean
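As a quick sketch of the fair-share idea, the Python snippet below computes the mean of the quiz scores {70, 72, 74, 76, 78} that appear in this lesson, then repeats the calculation with one score swapped for an extreme value. The function name `fair_share_mean` and the outlier value 178 are illustrative assumptions, not part of the lesson data.

```python
def fair_share_mean(values):
    """Add every value, then divide by how many values there are."""
    return sum(values) / len(values)

quiz_scores = [70, 72, 74, 76, 78]
print(fair_share_mean(quiz_scores))   # 74.0

# Hypothetical outlier: replace 78 with 178 to see the mean get pulled upward.
with_outlier = [70, 72, 74, 76, 178]
print(fair_share_mean(with_outlier))  # 94.0
```

One changed value moves the mean by 20 points, even though four of the five scores are the same as before.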
Computing the Mean: Books Data
But wait — most students read 5 or fewer books. One student read 18.
The outlier (18) pulled the mean upward — 6.4 doesn't feel representative.
Finding the Median: Sort Then Locate
The median is the middle value when data is sorted:
- Odd $n$: exact middle → position $\frac{n+1}{2}$
- Even $n$: average the two middle values
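The sort-then-locate steps could be sketched in Python as follows. The function name `find_median` is hypothetical, and the example lists are the lesson's quiz scores (shuffled, then with one score dropped to show the even case).

```python
def find_median(values):
    """Sort the data, then take the middle value (or average the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 in 1-based counting is index n // 2 in Python.
        return ordered[middle]
    # Even n: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(find_median([78, 70, 74, 72, 76]))  # 74
print(find_median([70, 72, 74, 76]))      # 73.0
```

Sorting comes first: taking the middle of the unsorted list `[78, 70, 74, 72, 76]` would also give 74 here, but only by coincidence.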
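For our data: $n = 27$ is odd, so the median sits at position $\frac{27 + 1}{2} = 14$. The 14th sorted value is 5.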
What the Gap Between Mean and Median Reveals
| Scenario | Mean | Median | What it means |
|---|---|---|---|
| Books data (with outlier 18) | 6.4 | 5 | Right-skewed; mean pulled up |
| Quiz scores {70, 72, 74, 76, 78} | 74 | 74 | Symmetric; mean ≈ median |
Key insight: When mean and median are close → symmetric distribution.
When mean > median → right-skewed (or high outlier).
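To make the comparison concrete, here is a small Python sketch that reports the mean, the median, and the gap between them. The helper `center_gap` is a hypothetical name; the two lists are the quiz scores from the table and the seven-value set {2, 3, 5, 5, 7, 8, 12} used elsewhere in this lesson.

```python
from statistics import median

def center_gap(values):
    """Return (mean, median, mean - median) for a list of numbers."""
    mean = sum(values) / len(values)
    mid = median(values)
    return mean, mid, mean - mid

print(center_gap([70, 72, 74, 76, 78]))    # (74.0, 74, 0.0)
print(center_gap([2, 3, 5, 5, 7, 8, 12]))  # (6.0, 5, 1.0)
```

A gap of 0 matches the symmetric case. A positive gap means the mean sits above the median, which is the right-skewed pattern described above.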
Quick Check: Which Measure Fits?
Scenario A: Heights of 20 students — mean = 62 in, median = 62 in
Scenario B: Monthly allowances — one student gets $500, rest get $10–30
For each scenario: which measure better represents a "typical" value?
Think before the next slide...
Three Tools to Measure Data Spread
We know the center. Now: how spread out are the values?
Three measures of variation:
- Range — the total span
- IQR — the middle 50%
- MAD — average distance from the mean
Each tells a different story about variability.
Range: Maximum Minus Minimum Value
For books data:
Pro: Simple to compute
Con: One outlier inflates it — range only reports extremes
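A minimal Python sketch of the range calculation, using the seven-value set {2, 3, 5, 5, 7, 8, 12} from this lesson and the same set with 12 changed to 30. The function name `data_range` is an assumption chosen to avoid clashing with Python's built-in `range`.

```python
def data_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

print(data_range([2, 3, 5, 5, 7, 8, 12]))  # 10
print(data_range([2, 3, 5, 5, 7, 8, 30]))  # 28
```

Only the largest and smallest values enter the calculation, so changing any value in between leaves the range untouched.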
IQR: Spread of the Middle 50%
The IQR (interquartile range) measures the span of the middle half of the data:
$\text{IQR} = Q_3 - Q_1$
- $Q_1$ = median of the lower half
- $Q_3$ = median of the upper half
- Resistant to outliers: ignores the top and bottom 25%
Computing IQR Step by Step
Median = 14th value = 5. Split remaining 26 values into halves.
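- Lower half (13 values): $Q_1$ = 7th value of the lower half = 3
- Upper half (13 values): $Q_3$ = 7th value of the upper half = 8
- $\text{IQR} = 8 - 3 = 5$ books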
Middle 50% of students: between 3 and 8 books.
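The split-into-halves procedure can be sketched in Python as shown below. Because the full books data is not listed here, the example runs on the seven-value set {2, 3, 5, 5, 7, 8, 12} from this lesson. The function name `iqr_split_halves` is hypothetical, and the sketch assumes at least two values.

```python
from statistics import median

def iqr_split_halves(values):
    """Return (Q1, Q3, IQR), leaving the median out of both halves when n is odd."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    q1 = median(lower)
    q3 = median(upper)
    return q1, q3, q3 - q1

print(iqr_split_halves([2, 3, 5, 5, 7, 8, 12]))  # (3, 8, 5)
```

Spreadsheets and statistics libraries often use other quartile conventions (for example, interpolating between values), so their IQR can differ slightly from this classroom method.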
MAD Measures Typical Distance from Mean
The MAD is the average of how far each value is from the mean:
- Find the mean
- Compute $|x_i - \bar{x}|$ for each value
- Average those absolute deviations: $\text{MAD} = \frac{1}{n}\sum |x_i - \bar{x}|$
MAD Computation Using a Deviation Table
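Data set: {2, 3, 5, 5, 7, 8, 12}, mean $\bar{x} = 42 \div 7 = 6$
| Value $x$ | Deviation $x - \bar{x}$ | Absolute deviation $\lvert x - \bar{x} \rvert$ |
|---|---|---|
| 2 | −4 | 4 |
| 3 | −3 | 3 |
| 5 | −1 | 1 |
| 5 | −1 | 1 |
| 7 | 1 | 1 |
| 8 | 2 | 2 |
| 12 | 6 | 6 |
| Total | 0 | 18 |
$\text{MAD} = 18 \div 7 \approx 2.6$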
Values in this set typically vary by about 2.6 books from the mean of 6.
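For anyone who wants to check the arithmetic, the three MAD steps could be sketched in Python like this. The function name `mean_absolute_deviation` is an illustrative choice, not a library function.

```python
def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    mean = sum(values) / len(values)                # step 1: find the mean
    deviations = [abs(x - mean) for x in values]    # step 2: absolute deviations
    return sum(deviations) / len(deviations)        # step 3: average them

data = [2, 3, 5, 5, 7, 8, 12]
print(round(mean_absolute_deviation(data), 1))  # 2.6
```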
Outlier Effect: Comparing All Three Measures
Add an outlier — change 12 to 30 in our data set. What changes?
| Measure | Without outlier | With outlier (30) | Sensitive? |
|---|---|---|---|
| Range | 10 | 28 | Yes: huge change |
| IQR | 5 | 5 | No: unchanged |
| MAD | 2.6 | ~6.1 | Yes: increases |
IQR is resistant. Range and MAD are sensitive to outliers.
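The comparison can be reproduced with a short Python sketch. It assumes the split-halves method for IQR (median left out of both halves when $n$ is odd) and at least two data values; the function names are hypothetical.

```python
from statistics import median

def data_range(values):
    return max(values) - min(values)

def iqr(values):
    """IQR by the split-halves method; the median itself is left out when n is odd."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

def mad(values):
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

without_outlier = [2, 3, 5, 5, 7, 8, 12]
with_outlier = [2, 3, 5, 5, 7, 8, 30]

# Print each measure before and after the 12 is changed to 30.
for name, measure in [("Range", data_range), ("IQR", iqr), ("MAD", mad)]:
    before = measure(without_outlier)
    after = measure(with_outlier)
    print(f"{name}: {before:.1f} -> {after:.1f}")
```

Running it shows which measures move when the single value changes and which one stays put.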
Center and Spread: Choosing the Right Pair
We have our center and variation. Now: which pair should we report?
- Symmetric, no outliers → mean + MAD
- Skewed, or has outliers → median + IQR
The choice isn't arbitrary — it's about honest communication.
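The rule above is simple enough to write as a tiny Python sketch. The function `choose_summary_pair` and its two yes/no inputs are illustrative assumptions; deciding whether a data set is symmetric or has outliers still comes from looking at the display.

```python
def choose_summary_pair(is_symmetric, has_outliers):
    """Return (center, spread) following the rule above."""
    if is_symmetric and not has_outliers:
        return ("mean", "MAD")
    # Skewed data, or any data with outliers, gets the resistant pair.
    return ("median", "IQR")

print(choose_summary_pair(is_symmetric=True, has_outliers=False))  # ('mean', 'MAD')
print(choose_summary_pair(is_symmetric=False, has_outliers=True))  # ('median', 'IQR')
```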
Two Paths for Choosing Summary Measures
Look at the display first. Then decide.
Writing a Summary for Symmetric Data
Quiz scores (25 students) — shape: roughly symmetric, no outliers → mean + MAD
- Mean = 74 points; MAD ≈ 2.1 points
- Summary: "25 quiz scores. Mean = 74 pts. Roughly symmetric. Values vary ~2 pts from the mean."
Writing a Summary for Skewed Data
House prices (20 homes): mostly $150K–$250K, one at $1.2M → right-skewed
→ Choose median + IQR
- Mean = $342K (pulled by outlier); Median = $198K; IQR = $62K
- Summary: "20 homes. Median = $198K. Right-skewed. Middle 50% within $62K range."
Mean of $342K overstates a typical home.
Your Turn: Complete the Summary
Given: 18 plant heights (cm) measured in a school garden.
Shape: right-skewed (a few very tall plants)
| Statistic | Value |
|---|---|
| Mean | 24.3 cm |
| Median | 19.5 cm |
| IQR | 8.2 cm |
| MAD | 5.1 cm |
Which measure pair should you report? Write the summary statement.
Plant Heights: Correct Summary Statement
Right-skewed → median + IQR
- Median = 19.5 cm; IQR = 8.2 cm
Summary: "18 plant heights in cm. Right-skewed with tall outliers. Median = 19.5 cm. Middle 50% within 8.2 cm range."
Mean of 24.3 cm overstates a typical plant.
What's Wrong with This Summary?
Error to identify:
A student summarized skewed income data as:
"The mean household income is $87,000 with a MAD of $31,000."
The data has several households earning over $500,000.
What's wrong? What should they have reported instead?
Lesson Summary and Four Common Warnings
✓ Always describe context first: number of observations, attribute, units, method
✓ Mean = fair share; Median = middle value — compute both and compare
✓ Range, IQR, MAD each measure variation differently — IQR and MAD are more informative than range
✓ Symmetric/no outliers → mean + MAD; Skewed/outliers → median + IQR
Watch out: For even $n$, the median is the average of the two middle values
Watch out: MAD requires absolute values — signed deviations always sum to 0
Watch out: Skewed data or outliers → median, not mean
Watch out: Always interpret MAD with units — "values vary by ___ [units] from the mean"
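The absolute-value warning is easy to see in a short Python sketch using the lesson's seven-value set {2, 3, 5, 5, 7, 8, 12}. The variable names are illustrative.

```python
data = [2, 3, 5, 5, 7, 8, 12]
mean = sum(data) / len(data)            # 6.0

signed = [x - mean for x in data]       # negative below the mean, positive above
absolute = [abs(d) for d in signed]     # distances only

print(sum(signed))                          # 0.0
print(round(sum(absolute) / len(data), 1))  # 2.6
```

Averaging the signed deviations would report a spread of 0 for every data set, which is why the MAD uses absolute values. With other data, floating-point rounding can make the signed sum a tiny nonzero number instead of exactly 0.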
Coming Up Next: Seventh Grade Statistics
You've mastered 6.SP.B.5 — summarizing data in context.
In 7th grade statistics, you'll use these skills to:
- Compare two populations using mean and MAD
- Draw inferences from random samples
- Explore overlapping distributions