A Community Court Grows in Brooklyn 11
data series, or in the estimated mean of the distribution from which monthly observations are drawn
(Barry and Hartigan 1993). The problem addressed by Barry and Hartigan is identifying the
position(s) of one or more partitions that divide a set of ordered observations into contiguous sets or
blocks with constant mean within each block (Barry and Hartigan 1993, 309). Given a series of
n observed values, the values could be divided into any number of contiguous blocks from one (the
distribution of the series does not change) to n (each observed value is its own block). The change
point problem, essentially, is arriving at a set of points dividing the series into blocks that produce
the best “fit” of the observed data within each block to the mean of that block, without breaking the
series into too many blocks.
In contrast to other approaches to analyzing time series data, change
point analysis allows us to identify if and when a series undergoes a significant change, without
having to specify where we believe the change has taken place.
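The trade-off between fit and the number of blocks can be illustrated with a minimal Python sketch. It is not the report's analysis or the product partition model itself: the function name `within_block_ss` and the toy series are assumptions introduced only for illustration. The sketch computes the within-block sum of squares for a given set of change points and shows that adding blocks can only improve the fit, reaching zero when every observation is its own block.

```python
def within_block_ss(values, change_points):
    """Sum of squared deviations of each value from its block mean.

    change_points: sorted indices at which a new block starts
    (hypothetical convention chosen for this sketch).
    """
    bounds = [0, *change_points, len(values)]
    total = 0.0
    for start, end in zip(bounds, bounds[1:]):
        block = values[start:end]
        mean = sum(block) / len(block)
        total += sum((x - mean) ** 2 for x in block)
    return total


# Toy series (made up for illustration): a level shift after the 4th value.
series = [10, 11, 9, 10, 20, 21, 19, 20]

one_block = within_block_ss(series, [])        # no change point
two_blocks = within_block_ss(series, [4])      # one change point
every_point = within_block_ss(series, list(range(1, len(series))))  # n blocks

print(one_block, two_blocks, every_point)
```

Because the n-block partition always fits perfectly, a usable method has to penalize or place a prior on the number of blocks rather than minimize this quantity alone.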
The product partition model, one of several possible solutions to change point problems,
arrives at estimates of the number and location of change points, as well as the mean at each point
in time, by iteratively sampling from a distribution of partition indicators given the data, the current
partitions, and the product of prior “cohesions,” representing the similarities between contiguous
observations. The draws form a Markov chain whose transition probabilities (the likelihood of
observing a change point at a given month, given a null prior) are a function of the cohesions and the
relative sizes of the between-block and within-block sums of squares.
Estimated means, which function as a smoothed representation of the data series similar to local
regression, and other parameters at each point in the series are updated from these sums of squares
(Erdman and Emerson 2008).
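How draws of partition indicators turn into change point probabilities and smoothed means can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the bcp implementation: the draws are hard-coded rather than sampled, the function name `summarize_draws` is hypothetical, and the estimated mean uses the raw block mean in each draw, without the shrinkage that the Bayesian model applies.

```python
def summarize_draws(values, draws):
    """Summarize partition-indicator draws from a sampler.

    draws: list of 0/1 lists of length len(values) - 1; indicators[i] == 1
    means a change point falls between position i and position i + 1
    (hypothetical convention chosen for this sketch).
    Returns (change_prob, est_mean).
    """
    n = len(values)
    k = len(draws)
    change_prob = [0.0] * (n - 1)
    est_mean = [0.0] * n
    for indicators in draws:
        start = 0
        for i in range(n):
            is_last = i == n - 1
            # A block ends at a change point or at the end of the series.
            if is_last or indicators[i] == 1:
                block = values[start:i + 1]
                block_mean = sum(block) / len(block)
                for j in range(start, i + 1):
                    est_mean[j] += block_mean / k
                start = i + 1
            if not is_last:
                change_prob[i] += indicators[i] / k
    return change_prob, est_mean


# Toy inputs (made up): three draws put the change after the 4th value,
# one draw puts it after the 3rd.
series = [10, 11, 9, 10, 20, 21, 19, 20]
draws = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
]
change_prob, est_mean = summarize_draws(series, draws)
print(change_prob)
print(est_mean)
```

In this sketch the probability at each position is the share of draws that place a change point there, and the estimated mean at each position is the average of the block means it was assigned across draws, which is why the estimated series looks like a smoothed version of the data.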
Change point analyses were run for each of the eight precincts, entering the felony and
misdemeanor arrest series together. Two further analyses used total arrest series: one for the
three RHCJC catchment precincts as a group and one for the five non-RHCJC precincts. This
permits us to compare overall trends in the Red Hook districts with whatever trends are apparent in
the collection of comparison precincts. Figures D5, D6, and D7 present the estimated means over
time (the line graphs indexed on the left axis) for the felony and misdemeanor arrest series and the
change point probabilities (the shaded areas rising from the x-axis and indexed on the right axis).
Turning attention to these RHCJC precincts, some clear patterns emerge. All three analyses
uncover a change point with 100% probability in early 2000, accompanied by dramatic declines in
estimated average arrests. Precincts 76 and 78 both demonstrate such a spike in March 2000, the
month before the RHCJC opened, while Precinct 72 has such an indicator in May of that year.
Since the probabilities reflect the likelihood of a change in the following time interval, Precincts 76
and 78 appear to have experienced a substantial change in average arrests in the month the court
Obviously, the best fit of change points to the data would be obtained by treating each time period as its own block, so that every observation equals its block mean.
The Bayesian product partition model has several advantages over similar methods. First, it allows one to adjust the
prior expected likelihood of observed changes (set higher if a large number of change points is anticipated) and the prior
for the signal-to-noise ratio (set higher if change points are indicated by smaller absolute changes in value). The model,
estimated via Markov chain Monte Carlo (MCMC), produces as output the probability of a change point at every time
interval in the series (characterized by the proportion of iterations of the Markov chain in which a change point is fitted
at each position). Thus, the researcher can decide what threshold to use when identifying a mean shift in the series.
Also, the product partition model has been extended to multivariate series by Erdman and Emerson (2011), so
information from more than one series, such as misdemeanor and felony arrests for a precinct or the total arrests from
several precincts, can be used simultaneously to identify when a change in arrest patterns occurs. The product partition
model for change point problems has been implemented in R by the package bcp (Erdman and Emerson 2011). Priors
for the signal-to-noise ratio and change point probabilities were set to the default values (.20, .20) recommended in Barry
and Hartigan (1993), based on MCMC simulations. Default values were also used for the burn-in iterations (the number
of links in the Markov chain discarded to allow for convergence) and the links used to characterize the posterior distributions.