2022 AP® Statistics
Sample Student Responses and Scoring Commentary
Inside:
Free-Response Question 1
Scoring Guidelines
Student Samples
Scoring Commentary
© 2022 College Board. College Board, Advanced Placement, AP, AP Central, and the acorn logo are registered
trademarks of College Board. Visit College Board on the web: collegeboard.org.
AP Central is the ocial online home for the AP Program: apcentral.collegeboard.org.
AP® Statistics 2022 Scoring Guidelines
Question 1: Focus on Exploring Data
4 points
General Scoring Notes
Each part of the question (indicated by a letter) is initially scored by determining if it meets the criteria for
essentially correct (E), partially correct (P), or incorrect (I). The response is then categorized based on the
scores assigned to each letter part and awarded an integer score between 0 and 4 (see the table at the end
of the question).
The model solution represents an ideal response to each part of the question, and the scoring criteria
identify the specific components of the model solution that are used to determine the score.
Model Solution Scoring
(a)
The scatterplot reveals a strong, positive, roughly
linear association between the mass and length of
bullfrogs. There are no points that seriously
deviate from the straight-line pattern of the points
in the plot.
Essentially correct (E) if the response provides
a description that includes at least three of
components 1-4 and component 5:
1. Direction of association (positive or increasing)
2. Strength of association (strong)
3. Form of association (linear or approximately linear)
4. Unusual features (no points with large discrepancies from the pattern (straight line)
exhibited by most of the points on the plot)
5. Context (association between length and mass of bullfrogs)
Partially correct (P) if the response satisfies
only one or two components out of components
1-4 and component 5
OR
if the response satisfies at least three out of
components 1-4 but does not satisfy
component 5.
Incorrect (I) if the response does not meet the
criteria for E or P.
Additional Notes:
To satisfy component 4, it is sufficient to simply indicate that there are no unusual features.
To satisfy component 5, it is minimally sufficient for the response to refer to the association or
relationship between mass and length without explicitly mentioning bullfrogs.
The strength of the response in part (a) may be considered if holistic scoring is needed.
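The E/P/I rule for part (a) can be sketched as a small function. This is only an illustrative encoding of the criteria above, not an official scoring tool; the function name `score_part_a` and its two parameters are hypothetical.

```python
def score_part_a(components_1_to_4_met: int, context_met: bool) -> str:
    """Return 'E', 'P', or 'I' for part (a).

    components_1_to_4_met: how many of direction, strength, form, and
        unusual features the response satisfies (0 to 4).
    context_met: whether component 5 (context) is satisfied.
    """
    if not 0 <= components_1_to_4_met <= 4:
        raise ValueError("components_1_to_4_met must be between 0 and 4")
    if components_1_to_4_met >= 3:
        # At least three descriptive components: E with context, P without.
        return "E" if context_met else "P"
    if components_1_to_4_met >= 1 and context_met:
        # Only one or two descriptive components, but context is present.
        return "P"
    return "I"


print(score_part_a(3, True))   # three of components 1-4 plus context
print(score_part_a(2, True))   # two of components 1-4 plus context
print(score_part_a(4, False))  # all of components 1-4 but no context
```

The three calls correspond to an E, a P, and a P under the criteria listed for part (a).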
Model Solution Scoring
(b)
The value of the slope of the least-squares
regression line is 6.086. This value indicates that
the predicted mass of a bullfrog increases by
6.086 grams for each additional millimeter of
length.
Essentially correct (E) if the response satisfies
the following three components:
1. Identifies the value of the slope as 6.086
2. Provides an interpretation that references an
increase of a number of grams of mass for
each one-millimeter increase in length
3. Indicates that the slope represents a change in
a prediction using non-deterministic
language such as “predicted,” “estimated,”
“expected,” or “average”
Partially correct (P) if the response satisfies
only two of the three components.
Incorrect (I) if the response does not meet the
criteria for E or P.
Additional Notes:
The value of the slope, 6.086, may be rounded to 6.09 or 6.1, but not to 6, to satisfy the numerical
requirement in component 1.
A response that only contains 6.086 in the interpretation satisfies component 1.
A calculation of slope may satisfy component 1, provided that two points from the line are used in the
calculation.
Units of measurement must be correctly specified for both mass and length to satisfy component 2.
It is not required to refer specifically to the “least-squares regression line.”
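The note about calculating the slope from two points on the line can be illustrated with a short sketch. The slope 6.086 comes from the model solution; the intercept of -546 is an assumption, chosen because it reproduces the 439.9-gram prediction quoted for a 162-millimeter bullfrog in part (d). The two lengths and both function names are hypothetical.

```python
SLOPE = 6.086       # grams per millimeter, from the model solution
INTERCEPT = -546.0  # assumption: inferred from the part (d) calculation


def predicted_mass(length_mm: float) -> float:
    """Predicted mass in grams for a bullfrog of the given length."""
    return INTERCEPT + SLOPE * length_mm


def slope_from_points(x1: float, y1: float, x2: float, y2: float) -> float:
    """Rise over run between two points that lie on the line."""
    return (y2 - y1) / (x2 - x1)


# Two hypothetical lengths; any two distinct points on the line would do.
x1, x2 = 150.0, 160.0
slope = slope_from_points(x1, predicted_mass(x1), x2, predicted_mass(x2))
print(round(slope, 3))  # 6.086
```

Because both points are taken from the line itself, the calculation recovers the slope regardless of which two lengths are chosen.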
Model Solution Scoring
(c)
The coefficient of determination is r² ≈ 0.819.
This value indicates that 81.9% of the variation
in bullfrog mass can be explained by variation in
bullfrog length as described by the least-squares
line.
Essentially correct (E) if the response provides
a correct interpretation of r² in context.
Partially correct (P) if the response provides a
generic interpretation (no context)
OR
if the response provides a reasonable but
incorrect interpretation of r² in context.
Incorrect (I) if the response does not satisfy the
criteria for E or P.
Additional Notes:
Correct interpretations of r² include the concept that part of the variation in the response (dependent or y)
variable is explained by the linear relationship with the explanatory (independent or x) variable. The
response can take any of several equivalent forms, such as:
• The proportion of the total variability in the dependent (response) variable y that is explained by the
  independent (explanatory) variable x.
• The proportion of variation in y that is accounted for by the linear model.
• The proportionate reduction of the total variation of the y-values that is associated with the use of the
  independent variable x.
• The proportionate reduction in the sum of the squares of vertical deviations obtained by using the
  least-squares line instead of the sample mean to predict values of y.
Correct interpretation of r² must explicitly relate to the dependent variable. Mention of the data,
predicted values, or no mention of the dependent variable are incorrect interpretations. Common incorrect
interpretations include:
• The percent (or proportion or part of the total) variability in the predicted y-values that is explained by
  the linear relationship between y and x.
• The percent (or proportion or part of the total) variability in the data that is explained by the linear
  relationship between y and x.
• The percent (or proportion or part of the total) variability that is explained by the linear relationship
  between y and x.
• The percent (or proportion or part of the total) variability in y that is on average explained by the
  linear relationship between y and x.
A reasonable but incorrect interpretation of r² with context might include the following responses:
• 81.9% of the variation in mass and length can be accounted for by the least-squares regression line.
• 81.9% of the variability in predicted mass is accounted for by the length.
For context, the response variable (y) must be identified as mass, and the explanatory variable (x) must be
identified as length.
An interpretation of the correlation between mass and length, r = √0.819 = 0.905, is not considered a
reasonable interpretation of r².
The value of the percentage (81.9%) or proportion (0.819) of variation does not need to be specified, but
if an incorrect value is specified, the score is lowered by one level, from E to P or from P to I.
The strength of the response in part (c) may be considered if holistic scoring is needed.
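The "proportionate reduction in the sum of the squares" form of the interpretation can be checked numerically. The sketch below is an assumption-laden illustration: the six (length, mass) pairs are invented and are not the bullfrog data from the question. It fits a least-squares line by hand, computes r² as 1 minus the ratio of the two sums of squares, and confirms that this equals the square of the correlation. The last line reproduces the relationship between r² = 0.819 and r quoted in the notes.

```python
from math import sqrt

# Hypothetical (length in mm, mass in g) pairs, invented for illustration only.
lengths = [140.0, 150.0, 155.0, 162.0, 170.0, 180.0]
masses = [300.0, 380.0, 390.0, 356.0, 500.0, 540.0]

n = len(lengths)
mean_x = sum(lengths) / n
mean_y = sum(masses) / n

# Least-squares slope and intercept.
sxx = sum((x - mean_x) ** 2 for x in lengths)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, masses))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Squared vertical deviations from the sample mean (sst) and from the line (sse).
sst = sum((y - mean_y) ** 2 for y in masses)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(lengths, masses))

r_squared = 1 - sse / sst     # proportionate reduction in squared deviations
r = sxy / sqrt(sxx * sst)     # correlation coefficient

print(abs(r_squared - r ** 2) < 1e-9)  # True
print(round(sqrt(0.819), 3))           # 0.905
```

The identity holds for any data set fitted by a least-squares line with an intercept, which is why r² is read as a statement about variation in the response variable rather than in the predicted values.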
Model Solution
(d)
(i) The largest residual in absolute value belongs
to the bullfrog with length 162 millimeters and
mass 356 grams.
(ii) The least-squares regression line overestimates
the mass of the bullfrog with length 162
millimeters. Plot 2 shows that the point for the
bullfrog with length 162 millimeters is below
the least-squares regression line.
Scoring
Essentially correct (E) if the response satisfies the
following two components:
1. The response to part (d-i) identifies the correct
bullfrog (length between 160 and 165 millimeters,
mass between 350 and 375 grams)
2. The response to part (d-ii) explicitly indicates
whether the linear model overestimates or
underestimates mass for the bullfrog identified in
part (d-i) and provides a correct justification based
on a comparison of the identified observation to
the least-squares regression line
Partially correct (P) if the response satisfies only
one of the two components.
Incorrect (I) if the response does not satisfy the
criteria for E or P.
Additional Notes:
The comparison of the observation to the regression line in the response to part (d-ii) is satisfied if the response
does one of the following:
• Correctly indicates if the observation is below (above) the least-squares regression line in Plot 2.
• Notes that observed mass is smaller (larger) than the mass predicted by the least-squares regression line.
• Marks the observation selected in part (d-i) on Plot 2 with an indication of the vertical distance from the
  least-squares regression line.
• Notes the correct sign of the residual.
Numerical values are not required in the response to part (d-ii). If a numerical value is given for the predicted
mass, however, it must be reasonable. A numerical value for the predicted mass could be computed with the
formula given in the stem, e.g., -546 + (6.086)(162) = 439.9 grams, for a bullfrog of length 162 millimeters,
or a value can be read from the line shown in Plot 2. Any value between 425 and 450 should be considered a
reasonable value. Showing work is not required.
The word overestimate with the calculated predicted value of mass is enough to satisfy component 2.
If the wrong observation is identified in part (d-i), the response to part (d) may be scored P if the response to
part (d-ii) correctly compares that observation to the least-squares regression line and states the correct
conclusion about overestimating or underestimating mass with justification.
It is not required to refer specifically to the “least-squares regression line.”
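The overestimate conclusion in part (d-ii) follows from the sign of the residual, which can be worked through in a few lines. The slope, the length of 162 millimeters, and the observed mass of 356 grams come from the model solution; the intercept of -546 is an assumption, chosen because it is the value that yields the 439.9-gram prediction stated in the notes. The variable names are hypothetical.

```python
SLOPE = 6.086       # grams per millimeter, from the model solution
INTERCEPT = -546.0  # assumption: the value that yields the 439.9 g prediction

length_mm = 162.0        # bullfrog identified in part (d-i)
observed_mass_g = 356.0  # its observed mass

predicted_mass_g = INTERCEPT + SLOPE * length_mm
residual = observed_mass_g - predicted_mass_g  # observed minus predicted

if residual < 0:
    verdict = "overestimates"   # point lies below the line
elif residual > 0:
    verdict = "underestimates"  # point lies above the line
else:
    verdict = "matches"

print(round(predicted_mass_g, 1))  # 439.9
print(round(residual, 1))          # -83.9
print(verdict)                     # overestimates
```

The predicted value falls inside the 425 to 450 range that the notes treat as reasonable, and the negative residual matches the observation lying below the line in Plot 2.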
Scoring for Question 1
Each essentially correct (E) part counts as 1 point, and each partially correct (P) part counts as ½ point.
Score
Complete Response 4
Substantial Response 3
Developing Response 2
Minimal Response 1
If a response is between two scores (for example, 2 ½ points), use a holistic approach to decide whether to
score up or down, depending on the strength of the response and quality of the communication.
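The point arithmetic can be sketched as follows. This is an illustrative encoding only: the names `POINTS`, `question_points`, and `needs_holistic_call` are hypothetical, and the holistic decision itself is a reader's judgment that the code merely flags. The part scores used for Samples 1A, 1B, and 1C are those reported in the scoring commentary; the "hypothetical" entry is invented to show a half-point case.

```python
POINTS = {"E": 1.0, "P": 0.5, "I": 0.0}


def question_points(part_scores: str) -> float:
    """Sum the points for parts (a) through (d), given a string such as 'EEIE'."""
    return sum(POINTS[score] for score in part_scores)


def needs_holistic_call(points: float) -> bool:
    """True when the total falls between two integer scores."""
    return points != int(points)


samples = {"1A": "EEEE", "1B": "EEIE", "1C": "PPIE", "hypothetical": "PEIE"}
for name, parts in samples.items():
    total = question_points(parts)
    print(name, total, needs_holistic_call(total))
```

Only the invented "PEIE" response totals 2.5 points and would call for a holistic decision; the three published samples land on whole numbers.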
Sample 1A (2 pages)
Sample 1B (2 pages)
Sample 1C (2 pages)
AP® Statistics 2022 Scoring Commentary
Question 1
Note: Student samples are quoted verbatim and may contain spelling and grammatical errors.
Overview
The primary goals of this question were to assess a student’s ability to (1) use data presented as a scatterplot to
describe a relationship between two variables within the context of a study; (2) identify and interpret the slope of
a least-squares regression line; (3) interpret the coefficient of determination with respect to the proportion of
variation in values of the response variable that can be explained by variation in the values of the explanatory
variable; (4) identify the observation with the largest absolute residual using a scatterplot of the data with the
least-squares regression line inserted; and (5) determine if the least-squares regression line overestimates or
underestimates the value of the response for the identified observation and provide a justification based on a
comparison of the identified observation to the least-squares regression line.
This question primarily assesses skills in skill category 2: Data Analysis, and skill category 4: Statistical
Argumentation. Skills required for responding to this question include (2.A) Describe data presented numerically
or graphically, (2.C) Calculate summary statistics, relative positions of points within a distribution, correlation,
and predicted response, and (4.B) Interpret statistical calculations and findings to assign meaning or assess a
claim.
This question covers content from Unit 2: Exploring Two-Variable Data of the course framework in the AP
Statistics Course and Exam Description. Refer to topics 2.4, 2.6, 2.7, and 2.8, and learning objectives DAT-1.A,
DAT-1.D, DAT-1.F, DAT-1.G, and DAT-1.H.
Sample: 1A
Score: 4
The response earned the following: Part (a) – E; Part (b) – E; Part (c) – E; Part (d) – E.
In part (a) the response includes strength, form, and unusual features, satisfying three of the first four components,
and satisfies component 5, context. Part (a) was scored essentially correct (E).
In part (b) the response identifies the value of the slope by providing an interpretation with context and includes
“expected” for non-deterministic language, satisfying all three components. Part (b) was scored essentially
correct (E).
In part (c) the response includes a correct interpretation of r² in context. Part (c) was scored essentially
correct (E).
In part (d) the response identifies the correct bullfrog and indicates that it will be an overestimate with “negative
residual” for justification, satisfying both components. Part (d) was scored essentially correct (E).
Sample: 1B
Score: 3
The response earned the following: Part (a) – E; Part (b) – E; Part (c) – I; Part (d) – E.
In part (a) the response includes direction, form, and unusual features, satisfying three of the first four
components, and satisfies component 5, context. The response does not satisfy component 2, strength. Part (a)
was scored essentially correct (E).
In part (b) the response identifies the value of the slope, provides an interpretation with context, and includes “on
average” for non-deterministic language. All three components are satisfied. Part (b) was scored essentially
correct (E).
In part (c) the response does not include a correct interpretation of r² because it refers to the “predicted mass.”
The response contains mass but does not contain length; therefore, the context is not met. Part (c) was scored
incorrect (I).
In part (d) the response identifies the correct bullfrog and indicates that it will be an overestimate with
justification by calculating the predicted point. Both components are met. Part (d) was scored essentially
correct (E).
Sample: 1C
Score: 2
The response earned the following: Part (a) – P; Part (b) – P; Part (c) – I; Part (d) – E.
In part (a) the response includes direction and strength, satisfying components 1 and 2. The response includes
context, satisfying component 5. Part (a) was scored partially correct (P).
In part (b) the response identifies the value of the slope, satisfying component 1. The response provides a correct
interpretation with units, satisfying component 2. However, the response does not include any non-deterministic
language, so it does not satisfy component 3. Part (b) was scored partially correct (P).
In part (c) the response does not include a correct interpretation of r² and no context is included. Part (c) was
scored incorrect (I).
In part (d) the response identifies the correct bullfrog and indicates that it will be an overestimate with
justification as “the line is above the point.” Both components are satisfied. Part (d) was scored essentially
correct (E).