July 11, 2012

Reading and Revolutions, or, What’s the Matter with Google Books?

Michael Witmore and Robin Valenza have a fascinating post up this morning at Wine Dark Sea asking, “What Do People Read During a Revolution?” They ran a visualization based on Google Books’ massive database (as of 2010), categorized through Library of Congress subject headings, and found massive spikes in the publication of histories during the 1640s/1650s and the last quarter of the eighteenth century, or, as you might otherwise know those periods, the English Civil War and the American and French Revolutions. Smaller spikes occur during the years around the European upheavals of 1848 and 1914. Based on that, they offer a suggestion:

What are people reading during a revolution? Poetry? Books on military technology? Theology? No. If we take the first spike, the years leading up to the English Revolution, the answer in the years leading up to the 1642 regicide seems to be “Old World History.” The second chronological peak—in the decades around the American (1776) and French (1789) Revolutions—shows the same pattern. In periods that historians would link to major political upheaval, the world of print shows similar disruptions: publishers are offering more history for readers who, perhaps, think of themselves as living through important historical changes.

As a scholar of publishing and (the American) Revolution, I found such a conclusion troublesome, for reasons I’ll elucidate below. Having read the post over again after my initial concerns, I want to emphasize that Witmore and Valenza were careful to add several contextualizing questions that need further exploration:

We should be precise: these data don’t indicate that more people are reading history, but that a higher proportion of books published by presses can be classed by cataloguers as history. There are many follow up questions one might ask here. Does publication tie strongly to actual reading, or are these only loosely connected? Are publishers reducing the number of books in other subject areas because of scarcity of resources or some other factor, which would again lead to the proportional spikes seen above? Are the cataloguing definitions of what counts as Old World History or history in general themselves modeled on the books published during the spike years?

Even allowing for those additional questions, however, I have a number of concerns with the correlations that the graphs imply, and thus want to argue that the graphs are not nearly as illustrative or helpful as they at first seem.

First, I’ll simply repeat the hesitation that others have discussed more eloquently (most notably Ben Schmidt at Sapping Attention) about the limitations of the Google Books database as a representative set of literature. In this particular case, the most pressing limitation is that I’m not sure of the national origins of the set used in the graph, which is labeled as “all books published.” That’s fine as far as it goes, but does it include publications in English as well as other languages? If only English, both British and American? How about Ireland? Without knowing even the language of the corpus, it’s difficult to project, for example, the significance of the spike around the French Revolution, the 1848 European revolutions, or the outbreak of World War I (and subsequent Russian Revolution).

Second, I’m concerned about the use of book publication by itself as the unit of measurement. For one thing, giving each publication equal weight elides the importance of popularity. Ten unique titles on the natural history of the South Seas, each published in a run of 200 copies, would register as ten entries while amounting to only 2,000 copies in total; they matter less if in the same year (say in the 1760s) one publisher put out a single edition of several thousand copies of Pamela. While this is a hypothetical (I don’t have British numbers handy), the example is certainly plausible given the appeal of history and fiction as genres for sale, and the ways in which they were frequently published in terms of size and quality of editions. Furthermore, as William St. Clair has argued, readers didn’t read in order of publication. So even as new histories were published in the 1770s and 1780s, more distantly published works remained popular. (One presumes that this last would be accounted for by unique entries for each edition, but I’m not sure whether each is entered separately.)

Third, one more comment about the books themselves. The graph appears very suggestive of the correlation between history publishing and revolution, but there’s another possible interpretation. That’s because the historical periods that the graphs identify (the mid-seventeenth century and the late eighteenth century in particular) also coincide with periods when the publication of travel and exploration narratives flourished. So did the publication of books about history spike in the 1770s because of the American Revolution, or because Captain Cook was in the midst of his voyages to the Pacific? The level of granularity offered in the graphs doesn’t allow for that kind of analysis.

Fourth (and here I’m finally getting to my real area of expertise), using books as the unit of measurement seriously underestimates the importance of all other kinds of publishing and reading from these periods. For each of the historical upheavals that see a spike, it was in non-book publications (pamphlets, newspapers, almanacs, broadsides, ephemera) that much of the intellectual and political work occurred. As it happens, at least a few editions of, for example, Common Sense and Letters from a Farmer in Pennsylvania seem to be in the Google Books database, but again, there’s no allowance for popularity. Furthermore, because of the way in which Google Books assigns a publication date, both of those publications appear as twentieth- and twenty-first-century editions as frequently as they do in the period when they first appeared. More importantly, there’s no way to account for newspaper publication of the Farmer’s Letters or excerpts of Common Sense, or of the hundreds of other essays working out political questions during the American Revolution. Both during and after the Revolution, magazines were also important sites for the publication of politics, science, and yes, history, but they aren’t catalogued that way by the Library of Congress. Other scholars have done great work examining the impact of such publications in England during the Civil War (see here and here) and France (see here and here) during its Revolution, just to start.

Last, it’s important to remember that the American colonies during the era of the American Revolution were not saturated with books, by a long stretch. It’s a bit of a simplification, but few books were published in North America because of the expense, and not that many were imported, again because of the expense. So even histories published in Britain that circulated to North America did not do so in great numbers—except, perhaps, in excerpted form in British magazines or American newspapers.

These are some initial thoughts, but I make them to suggest that I find the graphs, attractive as they are, far more problematic than suggestive in what they can show us about reading, publishing, and revolutions.



  For those who might be interested, Ben Schmidt also commented on the Wine Dark Sea post. He argued that the data could very well show not that people read more history during times of upheaval, but that librarians classify more work from those periods as history.

    Comment by Joseph M. Adelman — July 12, 2012 @ 10:14 am

