The decline of the death penalty

I just finished reading ‘The Decline of the Death Penalty and the Discovery of Innocence’ (link, link to book’s website) by Frank Baumgartner, Suzanna De Boef and Amber Boydstun. It is a fine study of the rise of the ‘innocence’ frame and the decline of the use of capital punishment in the US (I have recently posted about the death penalty). The book has received well-deserved praise from several academic corners (list of reviews here). In this post I want to focus on several issues that, in my opinion, deserve further discussion.

One of the major contributions of the book is methodological. The systematic study of policy frames (‘discourse’ is a related concept that seems to be going out of fashion) is in many ways the holy grail of policy analysis: while we all intuitively feel that words, arguments and ideas matter more than standard models of collective decision making allow, it is quite tricky to demonstrate when and how they matter. Policy frame analysis became something of a fad during the late 1970s and the 1980s, but it delivered less than it promised, so people started to look elsewhere (as this Google Ngram shows). Baumgartner, De Boef and Boydstun have produced a book with the potential to re-invigorate research into the impact of policy frames.

So far, the usual way to analyze policy frames quantitatively has been to count the number of newspaper articles on a topic, measure their tone (pro/anti) and classify the arguments into some predefined clusters (the frames). This is what the authors do with respect to the death penalty in Chapter 4. They collected all articles on capital punishment listed in the New York Times Index from 1960 to 2005, coded each article for its pro- or anti-death penalty orientation, and classified the arguments found in each article into a pre-defined set of 65 possible arguments, clustered along seven dimensions (efficacy, morality, cost, constitutionality, fairness, mode of execution, and international issues) (p.107). The approach allows one to track total attention to capital punishment, the net tone, and the relative attention to each of the seven dimensions over time. This is useful to identify, for example, the surge in attention after 1995 to issues of innocence and evidence in stories on the death penalty (p.120), issues which have ‘come to dominate’ the debate. Existing studies of policy frames usually stop here. But as the authors argue:

[The frequency of attention] matters, of course, but also important is the extent to which these arguments are used in conjunction with one another to form a larger cohesive frame. (p.136)
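Before turning to cohesion, note that the frequency-and-tone measures of Chapter 4 are mechanically simple aggregations over the coded articles. Here is a minimal sketch in R, with a hypothetical coded-article table (the column names and values are my invention, not the authors’ coding scheme):

```r
# Sketch: total attention and net tone from a table of coded articles.
# 'articles' is hypothetical; the real data are the authors' NYT codings.
articles <- data.frame(
  year = c(1995, 1995, 1996, 1996, 1996),
  tone = c("pro", "anti", "anti", "anti", "pro")  # article orientation
)

attention <- table(articles$year)  # total articles per year
net_tone  <- tapply(articles$tone == "pro",  articles$year, sum) -
             tapply(articles$tone == "anti", articles$year, sum)
attention
net_tone   # pro minus anti articles per year
```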

Enter evolutionary factor analysis (Chapter 5). The technique is essentially a series of factor analyses performed on overlapping 5-year time windows. Factor analysis identifies inductively (from the data) which arguments tend to go together. So you start with a factor analysis of the arguments contained in the articles published in 1960 to 1965, treating each year as a single observation. You repeat for each 5-year period (1961 to 1966, 1962 to 1967, etc.), track the clusters of arguments (the frames) that seem stable, and trace how they move and change over time. Using this approach, the book claims that a set of 16 arguments centered around ‘innocence’ (the frame) emerged in 1992, captured the debate and is still going strong. Since 13 of these arguments are anti-death penalty, the rise of the innocence frame is responsible for the increasingly anti-death penalty tone of the newspaper coverage. As I said, the approach is path-breaking and holds lots of promise, but I have one concern. Currently, each factor analysis is based on 65 variables (since the authors ignore all arguments that appeared less than five times in any five-year period, the effective number of variables is much smaller, but still usually greater than the number of observations) and only 5 observations (the years). This introduces lots of noise in the data (as the authors themselves acknowledge) and necessitates a series of more or less arbitrary decisions to get rid of statistical flukes. Rules of thumb about sample size in factor analysis often recommend a minimum of 100 observations and at least twice as many observations as variables [factanal in R even refuses to perform a factor analysis with more variables than observations; SPSS will run it anyway]. So there is a potential problem, but what is a bit puzzling to me is that there seems to be a pretty obvious way to address it; a way which the authors do not discuss:
Why not run the factor analyses on all articles that appear in a year, taking the individual article as the unit of observation?
True, many articles are coded as featuring only one argument, but the median number of arguments per article is two, and there are 1635 articles (more than 40% of the sample) that feature more than two arguments (based on my quick-and-dirty calculations from the replication dataset available here). Apart from providing more observations, taking the article as the unit of observation makes theoretical sense as well: we want to know whether frames dominate individual contributions (articles), as well as the macro-debate in a given year.
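To make the article-level proposal concrete, here is a minimal sketch in R of the kind of factor analysis I have in mind, on simulated data. Everything here is hypothetical: the argument labels, the number of arguments, and the single latent ‘innocence’ propensity are my assumptions for illustration, not the book’s coding scheme.

```r
# Sketch: factor-analyzing argument co-occurrence across individual articles
# within one time window, instead of across years. All data are simulated.
set.seed(42)
n_articles <- 200                 # hypothetical number of articles in a window
p <- 10                           # hypothetical number of coded arguments

# A latent 'innocence' propensity drives arguments 1-4; the rest are noise.
innocence <- rnorm(n_articles)
articles <- sapply(1:p, function(j) {
  signal <- if (j <= 4) innocence else 0
  as.numeric(signal + rnorm(n_articles) > 0.5)  # 1 if the article uses arg j
})
colnames(articles) <- paste0("arg", 1:p)

# With articles as observations, n = 200 >> p = 10, so factanal() runs;
# with years as observations (n = 5), it would refuse to.
fa <- factanal(articles, factors = 1)
print(fa$loadings, cutoff = 0.3)  # arguments 1-4 should load together
```

Rolling this over successive overlapping windows would then recover the evolutionary part of the design, with far less noise per window.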

Having demonstrated the rise and the recent dominance of the ‘innocence’ frame, in Chapter 6 Baumgartner, De Boef and Boydstun proceed to estimate the impact of ‘net tone’ on public opinion. As explained above, the book attributes the major changes in ‘net tone’ (the pro- vs. anti- sentiment of the newspaper articles) to the changing frames, so indirectly this tests the impact of frames as well. Using a vector error-correction model, the authors argue that levels of public opinion are ‘positively related to levels of homicides [a control variable] and pro-death penalty media coverage’ (p.187). Chapter 7 turns to explaining the number of annual death sentences and concludes that both media ‘net tone’ and public opinion are significantly associated with this policy indicator. I wouldn’t be too quick to attribute any causal power to media tone, however. If one takes the first part of the book seriously, then the policy frame emerges as a potential confounding variable that works both directly (by framing the thinking of policy makers, jurors and judges) and indirectly through the media. If that were the case, the effect of media tone would be exaggerated in the statistical models, as it would pick up the direct effect of the policy frame as well. One can make a similar case for the effect of public opinion. I also prefer to investigate more directly the direction of causality in such systems of variables that move together over time (using Granger causality tests or vector autoregression): I see little theoretical reason why the number of death sentences cannot have an impact on public opinion, for example.
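On that last point, a first pass need not be elaborate. Below is a minimal sketch in R using grangertest() from the lmtest package, on simulated placeholder series; the real exercise would use the authors’ annual measures of net tone, opinion and death sentences, with proper lag selection.

```r
# Sketch: two-way Granger causality between media net tone and annual
# death sentences. Both series are simulated placeholders; by construction,
# tone drives sentences here, matching the direction the book reports.
library(lmtest)  # provides grangertest()

set.seed(7)
n <- 46                                       # 1960-2005, matching the book
net_tone <- as.numeric(arima.sim(list(ar = 0.7), n))
sentences <- numeric(n)
sentences[1] <- 300
for (t in 2:n) {
  sentences[t] <- 0.8 * sentences[t - 1] + 10 * net_tone[t - 1] + rnorm(1, sd = 5)
}

grangertest(sentences ~ net_tone, order = 2)  # does tone predict sentences?
grangertest(net_tone ~ sentences, order = 2)  # and the reverse?
```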

A bigger threat to the integrity of the story about the rise of ‘innocence’ and the decline of the death penalty, however, is the persistence of important differences among states in public opinion and in the number of death sentences and executions. Since the book focuses on the tone and framing of the death penalty debate in a single national outlet (the NYT), it cannot address the question of cross-state variation. But it is a valid question, and one that deserves more research, whether the population and policy makers in some states are less sensitive (immune?) than others to the effects of framing, or whether they are exposed to different media with a different net tone and a different frame. A recent paper by Kenneth Shirley and Andrew Gelman shows that black males, and to a lesser extent black females, “have shown the sharpest decline in support” over time (p.31, see also Figure 8), while the net change in support for the death penalty among non-black men and women is quite small (Figure 9). It would seem that the ‘innocence’ frame has resonated much more (only?) with black people, who have responded more strongly, and faster, to the arguments put forward by the frame. Perhaps the fact that many of the individuals exonerated from death row have been black can explain the differentiated impact of the innocence frame. In any case, there are interesting synergies between Shirley and Gelman’s study, with its emphasis on individual and state differences, and Baumgartner et al.’s focus on variation over time.

To conclude this rather lengthy post, ‘The Decline of the Death Penalty and the Discovery of Innocence’ uncovers an exciting new direction for policy frames research. In fact, I am already starting a project that attempts to apply the evolutionary factor analysis approach to policy framing in the context of anti-smoking policy in Europe.

The deterrent effect of the death penalty

Does the death penalty lead to fewer homicides? A recent paper by Charles Manski and John Pepper argues that, on the basis of existing US data, we simply do not know: both positive and negative effects of the application of the death penalty are consistent with the observed homicide rates in the US. The argument itself is not new (see for example this 2006 paper by John J. Donohue III and Justin Wolfers), but Manski and Pepper’s text is still very interesting and highly instructive.

Manski and Pepper strip the problem to its core. Say we only have four observation points: the average yearly homicide rates for 1975 and 1977 in two sets of states that either did (A) or did not (B) reinstate the death penalty after the moratorium was lifted with the 1976 Gregg decision. So in 1975 neither set of states had a death penalty, while in 1977 group A had reinstated it. I reworked the table with the rates into the figure below. The red dots show the homicide rates in the ‘death penalty’ states and the blue dots those in the remaining states.

The authors show that, on the basis of these four numbers, there are at least three point estimates of the effect of the death penalty that we can derive from the data, depending on the assumptions we are willing to make (a small worked example follows the three cases below).

First, we can assume that the selection of individual states into the two groups (‘death penalty’ or not) is independent of the potential outcomes (as if assignment were random). In that case we can simply compare the contemporaneous rates of the two groups in 1977, and we will conclude that the effect of the death penalty is +2.8 (meaning that the death penalty increases the homicide rate).

Second, we can assume that the mean ‘treatment’ (having the death penalty) response is the same for the treated (A) and the untreated (B) groups. In that case we can look at the before/after change in the homicide rate for group A between 1975 and 1977, and we will conclude that the effect is -0.6 (meaning that the death penalty decreases the homicide rate).

Third, we can assume that the ‘treatment’ response is linear and homogeneous across states and dates. In that case we arrive at the difference-in-differences estimate, which compares the change between 1975 and 1977 in group A with the change between 1975 and 1977 in group B, and which in this particular case leads to the conclusion that the effect of the death penalty is +0.5 (in the ‘treated’ group A the change is a reduction of 0.6, but in the ‘untreated’ group B the reduction is greater, at 1.1).
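The arithmetic behind the three numbers is easy to reproduce. The four rates below are illustrative placeholders: I have chosen levels only so that the differences match the estimates discussed above (+2.8, -0.6, +0.5), not copied them from the paper’s table.

```r
# Four illustrative average homicide rates (per 100,000 population).
A_1975 <- 10.3; A_1977 <- 9.7   # group A: reinstated the death penalty
B_1975 <-  8.0; B_1977 <- 6.9   # group B: did not reinstate

# 1. Selection into groups as-if random: compare the two groups in 1977.
A_1977 - B_1977                          # +2.8

# 2. Same mean treatment response in (A) and (B): before/after change in A.
A_1977 - A_1975                          # -0.6

# 3. Linear, homogeneous response: difference-in-differences.
(A_1977 - A_1975) - (B_1977 - B_1975)    # -0.6 - (-1.1) = +0.5
```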

Not only are these three estimates quite different, but each of them is based on easily challenged assumptions. The paper proceeds to present an alternative ‘partial identification’ approach which relies on weaker assumptions, but this comes at a price: we can only identify bounds on the potential effect of the death penalty rather than precise numbers. The authors examine a number of alternatives, but in all cases the conclusion is that the possible deterrent effect of the death penalty is contained within wide intervals which inevitably contain zero. In short, under weaker assumptions we cannot even say whether the death penalty has a positive or a negative effect on homicide rates.
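To give a flavour of how such bounds arise (this is not the paper’s actual estimator, just the simplest ‘no assumptions’ version associated with Manski’s approach), suppose we only assume that homicide rates lie in a known range. A sketch in R, with hypothetical numbers throughout:

```r
# Worst-case bounds on the average treatment effect of the death penalty,
# assuming only that the outcome lies in [y_min, y_max]. A stylized
# illustration of partial identification, not the paper's estimator.
worst_case_bounds <- function(y_treated, y_untreated, y_min, y_max) {
  p <- length(y_treated) / (length(y_treated) + length(y_untreated))
  # E[Y(1)] is observed for the treated; for the untreated share it can be
  # anything in [y_min, y_max]. Symmetrically for E[Y(0)].
  ey1 <- c(p * mean(y_treated) + (1 - p) * y_min,
           p * mean(y_treated) + (1 - p) * y_max)
  ey0 <- c((1 - p) * mean(y_untreated) + p * y_min,
           (1 - p) * mean(y_untreated) + p * y_max)
  c(lower = ey1[1] - ey0[2], upper = ey1[2] - ey0[1])
}

# Hypothetical state-level rates, assumed bounded between 0 and 20 per 100,000.
worst_case_bounds(y_treated = c(9.7, 10.2, 8.9),
                  y_untreated = c(6.9, 7.4),
                  y_min = 0, y_max = 20)
```

The resulting interval always has width (y_max - y_min) and always straddles zero, which illustrates why weak assumptions alone cannot even sign the effect.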

The paper does a remarkable job of showing clearly how the seeming precision of point estimates is only possible after the researcher makes a series of questionable assumptions. I am also convinced that in many policy settings having a reliable interval of possible values for the effect of a treatment (policy) is more useful than having a seemingly precise but unreliable and ultimately invalid point estimate. I know too little about the exact method the paper applies to arrive at these intervals (this 2007 book by Manski seems like a good introduction) to assess the details, but the approach seems more than reasonable.

One of the best things about the paper is that it deliberately sidelines statistical issues in order to focus squarely on the problem of inference. The problems of the uncertainty of point estimates, correct standard errors, etc. come on top of the general problem of inference. In many presentations the statistical issues are conflated with the more, uhm, fundamental problem of deriving a valid statement about the effect of a treatment (policy, institution, etc.) on the basis of observational data.

Finally, it is worth drawing some policy implications from the fact that scientific research cannot provide an answer to the question about the potential deterrent effect of the death penalty. I guess the current message, ‘it could be negative or positive, but in any case it is likely to be small’, implies that deterrence should not play a major role in the debates about the future of the death penalty. Different arguments, and different frames, should be the basis of discussion and decision, since we simply don’t know whether a deterrent effect exists. In fact, in their recent book Frank R. Baumgartner, Suzanna De Boef, and Amber Boydstun leave only a very minor role (if any) for the deterrence frame as a driver of death penalty policy change. Instead, they argue that it is the ‘discovery of innocence’ that is responsible for the ongoing transformation of the policy in the US, but their book and their approach are worth a separate post.