What is Distant Reading and how does it help us?

People have always dreamed of time travel as a way to see what happened in a particular era of the past. Unfortunately, time travel is extremely hard to achieve, even though numerous scientists have devoted their time to researching how to go back in history. Time travel is still only a theory. So how can we learn about the world our ancestors lived in? One effective way is to analyze our ancestors’ literature. By analyzing their articles and books, we can get a brief view of the era they lived in. Yet analyzing only a few articles could lead us to biased or conflicting conclusions, and it is far too time-consuming for a human to “close-read” thousands of books and articles. This raises another issue: how can we analyze literature in a more time-efficient way and still get solid analytical results?

Mr. Franco Moretti has a surprisingly simple solution to the issue I mentioned above: “don’t read them”. He went on to introduce the term “distant reading”: understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data.[1] He claimed that “distant reading” is necessary for analyzing literature. When you close-read books, you might get distracted by unimportant stories or points and, consequently, miss some hidden key aspects of the literature you are analyzing. Close reading can therefore often give you a biased result. “Distant reading”, by contrast, allows us to “uncover the true scope and nature of literature”.[1] It helps us look at literature from a bird’s-eye view and form a more general understanding of its nature. Since we are not reading the texts ourselves, our own interpretation of their content is less likely to mislead us. At the same time, with the development of all kinds of text analysis technology, it has never been so easy to apply “distant reading”.

We can also use computational methods to apply “distant reading”, and as a result these methods are changing the textual humanities. This kind of “distant reading” is building connections between the humanities and many other fields, such as social science, sociology, and computer science. Mr. Ted Underwood wrote that “In short, computational analysis of text is not a specific new technology or a subfield of digital humanities; it’s a wide-open conversation in the space between several different disciplines.”[2] We can now understand the humanities from all kinds of perspectives. Consequently, we might also be able to discover what has been hidden from us in the literature, and the results of our analysis will become more diverse as well.

Mr. David Hoover also agreed that “almost any literary study can benefit from at least some modest and basic kinds of computer assistance”.[3] We cannot avoid investigating large collections of texts or frequently occurring items, and in these cases it is almost impossible for a human to process everything by hand. We have to use a computer or other computational tools to help us apply “distant reading”. Here is another scenario: you are trying to sort all the adverbs used in a body of literature by year. Done by hand, it would probably take you weeks or months. A computer, by contrast, needs only seconds or minutes to accomplish such a task, because it can run efficient sorting algorithms at a scale that human brains cannot match. “Distant reading” can save us an immeasurable amount of time.
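To make the adverb scenario concrete, here is a minimal Python sketch. It assumes the adverbs have already been extracted and paired with a publication year; the `records` list and the `adverbs_by_year` helper are invented for illustration and do not come from any of the tools or sources cited in this post.

```python
from collections import defaultdict

# Hypothetical input: (year, adverb) pairs already extracted from a corpus.
# These sample values are invented for illustration only.
records = [
    (1963, "freely"),
    (1851, "slowly"),
    (1963, "boldly"),
    (1851, "quietly"),
    (1920, "swiftly"),
]


def adverbs_by_year(pairs):
    """Group adverbs under their year, with years and words both sorted."""
    grouped = defaultdict(set)
    for year, adverb in pairs:
        grouped[year].add(adverb.lower())
    # Sort the years first, then sort the adverbs within each year.
    return {year: sorted(words) for year, words in sorted(grouped.items())}


result = adverbs_by_year(records)
for year, words in result.items():
    print(year, ", ".join(words))
```

The same few lines would work on millions of pairs, which is where the time savings over sorting by hand would come from.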

“Distant reading” can also help us visualize the texts in literature. From a visualization, we can find patterns in texts that are hard to see through “close reading”. For example, here is a graph generated by a computational tool called Wordle:

[Screenshot: Wordle word cloud of Martin Luther King’s “I Have a Dream” speech]

This graph shows the dominant terms in Martin Luther King’s “I Have a Dream” speech. Even if I had never read the speech, it would be quite obvious that Dr. King’s main topic is freedom. By looking further at the other dominant terms, it is easy to work out that this is a speech about civil rights and social justice. Next, you can plug one of the dominant terms (“freedom”, for example) into another tool called Google Ngram to get a general idea of that time period.

[Screenshot: Google Ngram chart for the term “freedom”]

From the graph, we can tell that from the 1950s to the 1970s people were eager for freedom. There might have been numerous unfair incidents in that era. By “distant reading”, you can get these results in just a few minutes. By contrast, you would probably spend days getting this general picture if you read all the literature from 1840 to 2000.
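The idea behind a chart like this can be sketched in a few lines of Python: for each year, count how large a share of all the words is the term you care about. This is only a rough illustration of the kind of figure an Ngram-style chart plots, not how Google Ngram itself is implemented. The `corpus` dictionary holds invented sample sentences, and `relative_frequency` is a hypothetical helper.

```python
import re
from collections import Counter

# Hypothetical toy corpus: year -> text published that year.
# The sentences are invented samples, not real historical data.
corpus = {
    1900: "the harvest was good and the town was quiet",
    1960: "freedom now they said and freedom is the dream",
}


def relative_frequency(term, texts_by_year):
    """Return {year: share of that year's words that equal `term`}."""
    shares = {}
    for year, text in sorted(texts_by_year.items()):
        words = re.findall(r"[a-z']+", text.lower())
        counts = Counter(words)
        # Guard against empty texts to avoid dividing by zero.
        shares[year] = counts[term] / len(words) if words else 0.0
    return shares


trend = relative_frequency("freedom", corpus)
for year, share in trend.items():
    print(year, f"{share:.2%}")
```

A rising share for a term over the years is what shows up as a climbing line in the chart.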

In one sentence, “distant reading” is a necessary and time-efficient approach to investigating the humanities of a certain era without the need for time travel. It has transformed the humanities into a more diverse subject.

[1] Schulz, Kathryn. “What Is Distant Reading?” The New York Times. 2011. Web. 30 Jan. 2016.

[2] Underwood, Ted. “Seven Ways Humanists Are Using Computers to Understand Text.” The Stone and the Shell. 2015. Web. 31 Jan. 2016.

[3] Hoover, David L. “Text Analysis.” Literary Studies in the Digital Age. 2013. Web. 30 Jan. 2016.

 

One reply on “What is Distant Reading and how does it help us?”

AC,

What an incredible comparison between distant reading and time traveling! That gave me a whole new perspective. It reminds me of looking through the digital newspapers and imagining reading them back in the 1800s. It is an amazing experience to have access to such rich historical pieces.

Also, I like how you pointed out that, while close reading, people can easily get “distracted” by several points. I would agree; however, I think it has a lot to do with the experiences a person has had. They will have a certain sentiment towards ideas that they have experienced in their lifetime.

Overall wonderful job!
