Author: AC Li
People have always dreamed of time travel as a way to see what happened in a certain era of the past. Unfortunately, time travel is very hard to achieve, even though numerous scientists have devoted their time to researching how to go back in history. For now, time travel is still only a theory. So how can we learn about the world our ancestors lived in? One effective way is to analyze our ancestors’ literature. By analyzing their articles and books, we can get a brief view of the era they lived in. Yet analyzing only a few articles could lead us to biased or conflicting conclusions, and it is far too time-consuming for a human to “close-read” thousands of books and articles. This raises another issue: how can we analyze literature in a more time-efficient way and still get solid analytical results?
Mr. Franco Moretti has a surprisingly simple solution to the issue I mentioned above: “don’t read them”. He introduced the term “distant reading”: understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data.[1] He claimed that “distant reading” is necessary for analyzing literature. When you close-read books, you might get distracted by unimportant stories or points and, as a result, miss some hidden key aspects of the literature you are analyzing. You may often end up with a biased result from close reading. “Distant reading”, by contrast, allows us to “uncover the true scope and nature of literature”.[1] It helps us look at literature from a bird’s-eye view and form a more general understanding of its nature. Since we are not reading the individual texts, our own interpretation of particular passages cannot mislead us. At the same time, with the development of all kinds of text-analysis technology, it has never been so easy to apply “distant reading”.
We can also use computational methods to apply “distant reading”. As a result, the textual humanities are being changed by computational methods. This kind of “distant reading” is building connections between the humanities and many other fields, such as social science, sociology, and computer science. Mr. Ted Underwood claimed that “In short, computational analysis of text is not a specific new technology or a subfield of digital humanities; it’s a wide-open conversation in the space between several different disciplines.”[2] We can now understand the humanities from all kinds of perspectives. Consequently, we might also be able to discover what has been hidden from us in the literature. The results we get from our analysis will become more diverse as well.
Mr. David Hoover also agreed that “almost any literary study can benefit from at least some modest and basic kinds of computer assistance”.[3] Sometimes we cannot avoid investigating a large collection of texts or frequently occurring items. In these cases, the work is almost impossible for a human to process, and we have to use a computer or other computational tools to help us apply “distant reading”. Here is another scenario: you are trying to sort all the adverbs used in a body of literature by year. By hand, it would probably take you weeks or months, no matter how you go about the sorting. A computer, however, needs only seconds or minutes to accomplish such a task. It can run all sorts of sorting algorithms that a human brain cannot carry out at that scale, and these algorithms have excellent run-time performance. “Distant reading” can save us an immeasurable amount of time.
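To make the adverb scenario concrete, here is a minimal sketch in Python. The records and the function name adverbs_by_year are made up for illustration; a real study would first have to extract the adverbs and their years from a tagged corpus. The sketch only shows the sorting and grouping step.

```python
from collections import defaultdict

# Hypothetical (year, adverb) records, typed by hand for illustration.
# A real study would extract these from a tagged corpus.
records = [
    (1850, "slowly"),
    (1790, "verily"),
    (1850, "quietly"),
    (1920, "quickly"),
    (1790, "hence"),
]

def adverbs_by_year(pairs):
    """Group adverbs by year, with years and adverbs both in sorted order."""
    grouped = defaultdict(list)
    for year, adverb in pairs:
        grouped[year].append(adverb)
    # Sort the years, then sort the adverbs within each year alphabetically.
    return {year: sorted(words) for year, words in sorted(grouped.items())}

result = adverbs_by_year(records)
for year, words in result.items():
    print(year, words)
```

Python's built-in sorting handles lists with millions of entries in seconds on an ordinary laptop, which is the kind of speed-up described above.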
“Distant reading” can also help us visualize the texts we study. Through visualization, we are able to find patterns in texts that are hard to see through “close reading”. For example, here is a graph generated by the visualization tool Wordle:
This is a graph showing all the dominant terms in Martin Luther King’s “I Have a Dream” speech. It is quite obvious that the main topic of Dr. King’s speech is freedom, even to someone who has never read it. By looking further at the other dominant terms, it is easy to find out that this is a speech about civil rights and social justice. Next, you can plug one of the dominant terms (“freedom”, for example) into another tool called Google Ngram to get a general idea of that time period.
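A word cloud is built on word counts: the more often a term appears, the larger it is drawn. The sketch below is my own rough approximation of that counting step in Python, not Wordle's actual code. The stop-word list is a tiny illustrative one, the function name dominant_terms is made up, and the sample text is a placeholder rather than the transcript of the speech.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; real tools use much longer ones.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is",
    "for", "we", "will", "let", "from", "every",
}

def dominant_terms(text, top_n=3):
    """Return the top_n most frequent non-stop-words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Placeholder text standing in for a full speech transcript.
sample = (
    "Freedom is the dream. We dream of justice and freedom. "
    "Let freedom ring for every town."
)
top = dominant_terms(sample, top_n=2)
print(top)  # [('freedom', 3), ('dream', 2)]
```

Filtering out stop words matters here: without it, words like “the” and “of” would dominate almost any English text and hide the terms that carry the topic.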
From the Ngram graph, we can tell that from the 1950s to the 1970s people were eager for freedom. There might have been numerous unfair incidents in that era. You are able to get these results in just several minutes by “distant reading”. By contrast, you would probably spend far longer to get this general picture if you had to read all the literature from 1840 to 2000.
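As I understand it, an Ngram graph plots, for each year, how often a term appears relative to all the words printed that year. The Python sketch below imitates that idea on a made-up mini-corpus. The texts, the years, and the function name relative_frequency are all hypothetical, and the real tool works on millions of scanned books with its own processing.

```python
import re

# Hypothetical mini-corpus: year -> text published that year.
corpus = {
    1950: "freedom and order",
    1960: "freedom freedom now",
    1970: "peace and quiet and rest",
}

def relative_frequency(term, texts_by_year):
    """Map each year to the share of its words that equal `term`."""
    result = {}
    for year, text in sorted(texts_by_year.items()):
        words = re.findall(r"[a-z']+", text.lower())
        # Guard against empty texts to avoid dividing by zero.
        result[year] = words.count(term) / len(words) if words else 0.0
    return result

freq = relative_frequency("freedom", corpus)
for year, share in freq.items():
    print(year, round(share, 3))
```

Dividing by the total number of words in each year is what makes years comparable: a year with more printed text would otherwise look like a peak for every word.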
In one sentence, “distant reading” is a necessary and time-efficient approach to investigating the humanities of a certain era without the need for time travel. It has transformed the humanities into a more diverse subject.
[1] Schulz, Kathryn. “What Is Distant Reading?” The New York Times. 2011. Web. 30 Jan. 2016.
[2] Underwood, Ted. “Seven Ways Humanists Are Using Computers to Understand Text.” The Stone and the Shell. 2015. Web. 31 Jan. 2016.
[3] Hoover, David L. “Text Analysis.” Literary Studies in the Digital Age. 2013. Web. 30 Jan. 2016.
Language Games
Which words are dominant? Which are subordinate? What cultural and conceptual expectations does this visualization of King’s speech raise?
In Dr. King’s speech, the dominant terms I found are “freedom”, “dream”, and “justice”. The subordinate terms include “motels”, “symphony”, “hamlet”, and “seared”. The visualization suggests that social justice had not been achieved at the time of King’s speech, and that many people were eager for their freedom.
Look at the time periods underneath and click on the peak periods. What are the source texts?
The graph tells me that “freedom” has been a frequently used word from 1800 to the present. It is used most frequently during the 20th century. The source texts include Freedom and Community and Freedom and Nature: The Voluntary and the Involuntary, among others.
Now, enter other dominant terms. What does the graph show you? Can you think of some explanations for this change?
The graph shows me that “dream” has been used more and more often in people’s writing over time. I think that, with the development of technology, the quality of people’s lives has dramatically improved. People now have the ability to achieve their dreams, and everyone is able to have a dream.
The first two paragraphs of The Declaration of Independence:
What terms are dominant?
“Government” and “powers” are dominant terms.
Now create an Ngram with those two terms. What does the graph show you? Can you think of some explanations for this change?
The graph shows me that “government” has been used very frequently for decades. I think the reason is that government is such an important element in people’s lives. At the same time, the heads of governments keep rotating and policies keep changing, so government is always a big topic.
In the 19th century, “powers” was a dominant term because it was not a peaceful era. Many countries were at war, and in order to survive, a country had to have enough power.
What prosodic elements does the author of this site identify? In what way does this add to the power of King’s speech?
Repetitions. They make King’s speech flow like a stream, which makes it easier to stir the audience’s sympathy.