Friday, April 12, 2013

The hope of digital humanities

Next week I return to Texas A&M University, where I started my academic career and spent twenty indifferent years of it, to deliver a lecture on the digital humanities. The subject is an appropriate one for me, I guess, since I was a pioneer of the digital humanities a good decade before they were even called that. Along with the late Denis Dutton, I founded the listserv discussion group PHIL-LIT in the summer of 1994, just a few weeks after L-Soft launched its first version of listserv software. I moderated PHIL-LIT for nine years until, sick unto death of the partisan politics that had crowded out any discussion of philosophy and literature, I pulled the plug on it.

All talk about the digital humanities is pretty evenly divided between those who are skeptical that computers will ever do anything more than lighten the drudgery of humanistic scholarship by speeding up its more mechanical tasks and those, like Alan Liu of the University of California at Santa Barbara, who are excited by the prospect of a “uniquely contemporary kind of discourse”:

Seen one way, such projects make the transmission of academic knowledge more efficient and flexible. . . . But, viewed differently, they also prepare the academy to refract such technologic through its own values, which are not always on the same page with the business master plan.[1]

Just as long as the computer-ready humanities are not on the same page as business!

Number me among the skeptics. My suspicion is that what Liu calls the “structured encoding of knowledge” is really only another way—a newer way, I grant you, and for now a stranger way—of preparing copy for the printer. The printer has been replaced by a machine; our copy must now be machine readable. But the copy itself remains unchanged fundamentally (I apologize for the swear word). The encoding is a superaddition to it.

One reason for my skepticism is that the digital humanities have been around for nearly half a century now, and the hoped-for breakthrough has yet to occur. Jerome McGann, a well-known scholar of romanticism, expresses the hope succinctly when he predicts that computers will be able to “expose textual features that lie outside the usual purview of human readers.”[2] But even the most successful work in the digital humanities (like the four-author paper “The Expression of Emotion in 20th Century Books,” with its impressive equations and graphs) has produced what scientists call results of low statistical power—small sample sizes, small effects being studied.

In 1965, IBM awarded a grant to Yale University to investigate the promise of computers in humanistic research. At the inevitable conference that ensued, the late Jacques Barzun was optimistic about the promise of computers for indexing, collating, verifying, drawing up concordances, and similar attention-to-detail work, but he warned that humanists who hope to rely upon the computer for more far-reaching results will only “reduce wholes to discrete parts that are disconnected from the value or nature of the whole.”[3]

Barzun’s warning is even more timely now that digitalization has opened up archives and library collections that were once closed to everyone outside a small elite. By means of topic modeling, a humanistic scholar can now search more text in an afternoon than he previously could in a lifetime. But the problem—the problem as defined by Barzun—remains. The excited advocates of the digital humanities, which they familiarly call DH (they don’t mean Lawrence), are worried about a different problem altogether, which they are confident the new computer-backed methods and conventions will solve:

What galvanizes many of us working in cultural heritage is how DH tools and practices will enable us to move beyond the traditional methodologies of description of, and access to, archival or cultural collections. These traditional practices, holdovers from a world of physical materials and all the attendant requirements of arrangement, bulk, and storage, have also been fundamentally subjective. Catalogs, finding aids, L[ibrary of] C[ongress] S[ubject] H[eadings]—all are products of interpretive biases.

So too, for that matter, are topic models. There is no escaping the undertow of subjectivity, which is simply another way of saying that data are not self-interpreting: a mind must interpose between machine and meaning. And this is the scandal of the digital humanities. They have been unsuccessful at their fondest hope—eliminating the mind from humanistic scholarship.

Barzun’s warning is a reminder that mind, the moisture in the robot, is forever indispensable to human knowledge, including the humanities. The connection of discrete parts to the value and nature of the whole is an operation that can only be performed by a human being who is capable of judgment in addition to designing search protocols.

Let me brag for a moment. Perhaps my only substantive contribution to humanistic learning is the discovery that Ralph Waldo Emerson coined the term creative writing, which he first used in “The American Scholar” (a discovery that has been incorporated, without attribution, into the third edition of the OED, by the way, thus giving the lie to Cassio’s claim that his reputation is a man’s immortal part). Without question, the selection of archival materials that I plowed through to study the history of the idea of creative writing was a product of my interpretive bias. But the mistake is to assume that my bias illegitimately skewed the search results somehow. You are not permitted to ignore the fact that I was right about the origin of the term. My bias (namely, that creative writing reeks of American romanticism) led me to the right materials.

The confidence that DH tools and practices “will enable us to move beyond the traditional methodologies” might be called the Great White Hope of the digital humanities. It is overweight, overhyped, an expression of superstition and prejudice.

The real promise of the digital humanities is at once less exciting and more liberating. What the digital humanities promise is the death of the credential. Anyone at all can now undertake an inquiry into the human heritage, and anyone at all can now publish her findings. No one need any longer submit her research for prior approval to a figure in a position of institutional power. She is free to follow her inclinations and talents—free to follow them as far as they will carry her. This is what political conservatives, who complain incessantly about the “liberal bias” in academe, fail to understand. No one is in control of humanistic scholarship any longer, no party, no league of prestigious institutions, no system of acceptance and rejection. When credentials have lost their cultural influence, the only influence in the humanities will be the influence of brilliant undeterred minds. And that is the final hope of the digital humanities.

[1] Alan Liu, “Transcendental Data: Toward a Cultural History and Aesthetics of the New Encoded Discourse,” Critical Inquiry 31 (Autumn 2004).

[2] Jerome McGann, Radiant Textuality: Literature after the World Wide Web (New York: Palgrave, 2001), p. 190.

[3] Jacob Leed, Review of Computers for the Humanities? A Record of the Conference Sponsored by Yale University on a Grant from IBM, January 22–23, 1965, Computers and the Humanities 1 (September 1966): 13.


George said...

"they also prepare the academy to refract such technologic through its own values"

As long as the academy publishes its index of refraction, who could complain?

"holdovers from a world of physical materials and all the attendant requirements of arrangement, bulk, and storage, have also been fundamentally subjective."

I would hate to shock a scholar, but digital information is subject to requirements of arrangement, bulk, and storage. I spent about an hour of my day talking with co-workers about how we might retrieve some files from 2011 that had recently been deleted--in part because of their "bulk".

Unknown said...

Digital technology will enhance the humanities (and literature) but--I hope--will never replace the printed word on paper. Luddite that I am, I remain thankful that my advancing age will allow me to escape the complete disappearance of ink and paper. BTW, I hope you enjoy your return to College Stage. Give 'em hell, David!

Unknown said...

Of course, I meant to say "College Station." I need a proof-reader!

Aonghus Fallon said...

I guess one of the dangers of having instantaneous access to information is that you miss out on a journey that was often an education in itself, even if this wasn’t immediately apparent. I’m reminded of a book that came out a few years ago, ‘Hare Brain, Tortoise Mind: How Intelligence Increases When You Think Less’ (which – in a sublime irony – I’ve never actually read). The book hypothesises an undermind which absorbs information more slowly and intuitively, but which is ultimately more effective than its logical, hare-brained counterpart.

Jonathan Chant said...

Very interesting. And, finally, uplifting in its hopeful conclusion.

Troy Camplin said...

I think the digital humanities will find their place in the conjunction between the ability of computers to discover complex patterns and human interpretive abilities. I, for one, was able to find out that novels may demonstrate fractal patterns of meaningful words. I will note, however, that I then had to go in and actually read where the words were clumped in order to discover the shifting meanings.

Alberto said...

Dear David,

I am the first author of the paper "The Expression of Emotion in 20th Century Books". First of all, thank you for citing it and for calling it "successful"! I am more puzzled, though, about what exactly you refer to with "results of low statistical power—small sample sizes, small effects". In fact we intentionally did not use much stats in that paper, since the effects were extremely evident! Also, I cannot imagine how it would be possible (for this kind of analysis) to have a sample size bigger than the one we used (the Ngram Corpus of Google Books)!

Of course our analysis was quite rough - we are working now on refining and checking the robustness of the results by using other techniques and different ways to score emotions in the text (BTW we are quite happy that the original results hold, and we hope to publish a follow-up soon).

I cannot really comment on the rest of your post since I am not in the business of Digital Humanities - I am an anthropologist interested in the quantitative study of cultural evolution, and for this research language in books was - so to speak - a cultural artifact "easy to quantify".

All the best,