Distant Reading
Franco Moretti, an Italian literary scholar, pioneered a new research technique within the digital humanities known as “distant reading.” Distant reading, as Moretti told the New York Times, is “understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data” (Schulz 3). It is also referred to as “textual/text analysis.”
Distant reading lifts the text off the page and represents it through digital tools. This can give an overall picture that is not always evident through close reading, the traditional practice of reading a text carefully. Distant reading draws our attention away from what traditional reading teaches and uncovers patterns that emerge at a distance as well as close up. Tanya Clement compares the process to turning a magnifying glass upside down. As Clement states, it is a method to “defamiliarize texts, making them unrecognizable in a way…that helps scholars identify features they might not otherwise have seen, make hypotheses, generate research questions, and figure out prevalent patterns and how to read them” (Clement 3). It is important to emphasize the new outlook that text analysis can give us on an otherwise standard reading.
The practice of distant reading is becoming increasingly popular as new technologies emerge and scholars ask new questions of texts.
Distant Reading Changing Perspectives
Pretend you have just finished reading all of Shakespeare’s plays. You find each fascinating in its own right, but you have forgotten the main concepts. Distant reading allows you to pull each play apart and extract hidden information without having to read every play a second time. Then you can compare Hamlet, Macbeth, Othello, and others, and draw conclusions about each text or about Shakespeare’s ideas.
A digital humanist, Ted Underwood, notes that you can “identify distinctive vocabulary” (Underwood 15). This would give you an opportunity to pick out Shakespeare’s most used words and analyze his diction in each play. From there, you can make inferences about why Shakespeare wrote the way he did or what his ultimate goal was in a piece of writing.
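One simple way to approximate “distinctive vocabulary” is to rank words by how much more often they occur in one text than in another. The sketch below is only an illustration, not Underwood’s method: the function names, the smoothing choice, and the two made-up sample strings are all assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(target, reference, top_n=5):
    """Rank words by how much more often they occur in target than in reference.

    Scores are ratios of relative frequencies, with add-one smoothing so that
    words missing from the reference text do not cause division by zero.
    """
    target_counts = Counter(tokenize(target))
    reference_counts = Counter(tokenize(reference))
    target_total = sum(target_counts.values())
    reference_total = sum(reference_counts.values())
    vocab_size = len(set(target_counts) | set(reference_counts))

    scores = {}
    for word, count in target_counts.items():
        target_rate = (count + 1) / (target_total + vocab_size)
        reference_rate = (reference_counts[word] + 1) / (reference_total + vocab_size)
        scores[word] = target_rate / reference_rate
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Made-up sample strings standing in for two plays (not quotations).
sample_a = "the ghost the ghost the crown"
sample_b = "the dagger the dagger the crown"
print(distinctive_words(sample_a, sample_b, top_n=1))
```

With full play texts in place of the samples, the top-ranked words would be candidates for closer reading rather than conclusions in themselves.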
For example, this is my Wordle on Emily Dickinson’s poem entitled “I cannot live with You.”
This picture shows all of the words in her poem except commonly used English words. The most dominant word in the Wordle appears to be “Life.” From this I can make some inferences about the poem; for instance, according to the Wordle, its main idea is “Life.” However, looking at the words without context makes it difficult to understand the tone of the text. That is where “differential reading” becomes important.
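The counting behind a word cloud can be sketched in a few lines: drop the common words, then count what is left. This is a minimal sketch and does not reproduce Wordle’s own stop-word list or sizing rules; the `STOP_WORDS` set, the function name, and the sample sentence are assumptions for illustration.

```python
import re
from collections import Counter

# A deliberately tiny stop-word list for illustration; real tools use longer ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "i", "you", "with"}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase words in text, ignoring any word in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stop_words)


# Made-up sample sentence, not a line from the poem.
frequencies = word_frequencies("The sea and the sky and the sea")
print(frequencies.most_common(2))
```

The most frequent words would be drawn largest in the cloud. Note that the counts carry no information about tone or context.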
Differential Reading
Clement explains differential reading as “…close and distant reading practices as both subjective and objective methodologies” (Clement 2). Thus, close reading allows a critical analysis of a literary work, while distant reading acts as an “upside down magnifying glass,” illustrating hidden patterns to scholars.
For instance, with my Wordle of Emily Dickinson’s poem, it would help to read the poem myself and synthesize my own interpretation. Then I can combine my ideas with the digital techniques to make a more accurate hypothesis.
Another digital humanist, David Hoover, points to the possibility of “Investigating how and the extent to which authors differentiate the voices of characters or narrators…” (Hoover 3). This could be done in a novel, play, poem, or other work. Taking Hoover’s point into consideration, I could take multiple Dickinson poems and compare the tone and rhetoric of each. Furthermore, I could break a single poem into stanzas and look for differences in tone between them. The possibilities are endless with distant reading; there are always new information and new approaches to discover.
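Breaking a poem into stanzas for comparison can be sketched as follows. The code only prepares per-stanza word counts; judging tone from them remains an interpretive step. The function names and the placeholder text are assumptions, and the sketch assumes stanzas are separated by blank lines.

```python
import re
from collections import Counter


def split_stanzas(poem):
    """Split a poem into stanzas wherever one or more blank lines occur."""
    return [s.strip() for s in re.split(r"\n\s*\n", poem.strip()) if s.strip()]


def stanza_word_counts(poem):
    """Return one Counter of lowercase words per stanza, in order."""
    return [Counter(re.findall(r"[a-z']+", s.lower())) for s in split_stanzas(poem)]


# Placeholder text, not a real poem.
placeholder = "first line here\nsecond line\n\nthird line\n"
for number, counts in enumerate(stanza_word_counts(placeholder), start=1):
    print(number, counts.most_common(3))
```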
Challenges to Distant Reading
There are some significant disadvantages to distant reading.
- Copyright Laws – As Hoover stated, “For texts not available in digital form, an electronic text can be created by scanning and OCR. Unfortunately, it is not entirely clear that this is legal for texts in copyright” (Hoover 13).
- Finding the Text – It is incredibly hard to find some texts online. Even if you are fortunate enough to find your text, there are sometimes different editions, authors, and publishers. It can be extremely difficult to choose the text that best suits your research.
- Texts not in Digital Form – In this case, you can perform an OCR scan, although you must keep copyright laws in mind. Moreover, any drawings or markings on the original might not be captured.
- Sentiment – It is difficult for a computer to distinguish between emotions. As the reader, you have your own perspective and develop emotions from that.
- Expansiveness of Archives – The collections of certain digital archives may be too small for a complete analysis.
This is only a short list of disadvantages. There are others, but in most cases the pros outweigh the cons.
Example of a Challenge to Distant Reading
First, I did a Wordle of the Preamble of the United States Constitution.
Next, I used the three most dominant terms: “establish,” “United,” and “States.” However, when I entered them into the Google Ngram Viewer, I combined “United” and “States” into a single phrase, because close reading shows that the two words belong together.
I set the date range from 1700 to 2008 to see how frequently the terms were used in literature.
Then, I used Culturomics, or Bookworm: ChronAm, to plot the same terms over time. However, Bookworm: ChronAm was not reading “United States,” even though it said you could enter a 2-gram (a two-word phrase). So I graphed “establish.”
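The difference between counting single words and counting a two-word phrase can be sketched on the Preamble itself. This is a minimal sketch, not how Wordle or the Ngram Viewer works internally; the helper name `ngram_counts` is an assumption, and punctuation is simply discarded before counting.

```python
import re
from collections import Counter

PREAMBLE = (
    "We the People of the United States, in Order to form a more perfect "
    "Union, establish Justice, insure domestic Tranquility, provide for the "
    "common defence, promote the general Welfare, and secure the Blessings "
    "of Liberty to ourselves and our Posterity, do ordain and establish "
    "this Constitution for the United States of America."
)


def ngram_counts(text, n):
    """Count every run of n consecutive lowercase words in text."""
    words = re.findall(r"[a-z]+", text.lower())
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


unigrams = ngram_counts(PREAMBLE, 1)
bigrams = ngram_counts(PREAMBLE, 2)

print(unigrams["establish"], unigrams["united"], unigrams["states"])  # 2 2 2
print(bigrams["united states"])  # 2
```

Counted separately, “united” and “states” look like two independent terms; counted as a 2-gram, they show up as one phrase that occurs twice.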
Then I tried to graph “UnitedStates.” I did get a graph, but when I looked at the source texts, the only word highlighted was “United” in the articles. This shows that not all digital tools will work properly with what you want to do.
Finally, I tried “The United States” and received an oddly shaped graph. The articles highlighted words like “here” and “mistakes” which have nothing to do with “The United States.”
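Why the tool highlighted unrelated words is not clear from the outside. When the source text is available locally, though, one way to double-check a tool’s hits is to search for the phrase as whole words. The sketch below assumes that setting; the function name and sample sentences are made up for illustration.

```python
import re


def find_phrase(text, phrase):
    """Return every whole-word, case-insensitive occurrence of phrase in text."""
    words = [re.escape(word) for word in phrase.split()]
    # \b keeps "here" from matching inside "there"; \s+ tolerates line breaks.
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.findall(pattern, text, flags=re.IGNORECASE)


# Made-up sample sentence, not taken from a newspaper archive.
sample = "Mistakes were made here in the United\nStates."
print(find_phrase(sample, "United States"))
print(find_phrase("There is nothing", "here"))
```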
Summary of Distant Reading
Text analysis is based on the use of both subjective and objective practices. While the objective practices produce mathematical output such as word frequencies, there is a certain subjectivity in interpreting the meaning of a graph, which draws on knowledge of history, philosophy, and other unquantifiable subjects.
Underwood describes it as “…an interdisciplinary conversation about methods…” (Underwood 5). He also states that you may get sucked in and come across new territory not yet discovered. Fortunately, that is where the fun lies: daring to climb to new heights and to make new breakthroughs.
Works Cited
Schulz, Kathryn. “What Is Distant Reading?” The New York Times. The New York Times, 24 June 2011. Web. 31 Jan. 2016.
Clement, Tanya. “Literary Studies in the Digital Age.” Literary Studies in the Digital Age. 2013. Web. 31 Jan. 2016.
Hoover, David L. “Literary Studies in the Digital Age.” Literary Studies in the Digital Age. 2013. Web. 31 Jan. 2016.
Underwood, Ted. “Seven Ways Humanists Are Using Computers to Understand Text.” The Stone and the Shell. 04 June 2015. Web. 31 Jan. 2016.
“Bookworm.” Bookworm. Web. 31 Jan. 2016.
“Google Ngram Viewer.” Google Ngram Viewer. Web. 31 Jan. 2016.
“Wordle – Beautiful Word Clouds.” Wordle – Beautiful Word Clouds. Web. 31 Jan. 2016.