What is Distant Reading and how does it help us?

Distant reading is a concept first raised by Franco Moretti, which Kathryn Schulz summarizes as “understanding literature not by studying particular texts, but by aggregating and analyzing massive amounts of data.”[1] Moretti believes that literary scholars cannot see the whole picture of literature by reading only a small number of existing books. Traditionally, they would have to split into groups, spend large amounts of time reading different texts, and then pool their analyses to draw conclusions about patterns in literary development. With distant reading, scholars can explore new ways to identify and understand texts.

Distant reading is not designed to replace our current understanding of literary works but to offer different insights that human readers might overlook or disregard. As Hoover states in his Text Analysis article, “Computer-assisted textual analysis is neither a panacea nor substitute for sound literary judgment, but its ability to refine, support and augment that judgement makes it an important analytic method for literary studies in the digital age.”[2] Human brains and computers work in completely different ways. Each uses different methods and focuses on different aspects of a text, so their results can differ substantially. Human reading and distant reading (reading by computers) can therefore complement each other extremely well: the “big data” analysis of literature and the personal “small reading” can collaborate and communicate. This is why computer-assisted text analysis has become one of the important branches of the digital humanities, a field that has grown quickly in recent years.

Distant reading gives us a well-rounded picture of larger bodies of literature, even across different backgrounds and genres. Clement finds that “data-mining procedures proved to be productive in initially illuminating complex structural patterns that helped [her] discern those underlying patterns.”[3] Text analysis is based on word frequencies and word collocations. Readers usually pay little attention to prepositions, pronouns, and articles, which seem to have little bearing on the meaning of a work. Even readers who do notice such words will not remember the patterns of their occurrences, concordances, or frequencies, let alone understand their roles in a work’s diction, syntax, and structure. It is extremely difficult to analyze syntax or writing style by hand. As Hoover states, “words have the advantage of being meaningful in themselves and in their significance to larger issues like theme, characterization, plot, gender, race, and ideology.”[2] Distant reading and text analysis also reduce the subjectivity of the analysis. It “[defamiliarizes] texts, making them unrecognizable in a way (putting them at a distance) that helps scholars identify features they might not otherwise have seen, make hypotheses, generate research questions, and figure out prevalent patterns and how to read them.”[3] This idea matches what I found when I used Wordle, along with Google Ngram, to plot the frequencies of words in Martin Luther King’s “I Have a Dream” speech and the Declaration of Independence. The two graphs generated by Wordle give clear and vivid images of the texts’ main topics and of their authors’ strongest intentions. They reinforce our understanding of the texts with direct statistical support and also reflect cultural and conceptual expectations.
The main topic of King’s speech is a call for freedom, justice, and equality in American society, and the main purpose of the Declaration of Independence is to stress the importance of the government. Even with a small number of words as input, the program produces these strong demonstrations. Given a much larger input, Wordle might reveal even more striking results. Similarly, Google Ngram tracks subtle changes in word occurrences over time and produces interesting results that call for reflection and explanation.
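
To make the idea of word frequencies and collocations concrete, here is a minimal Python sketch of the kind of counting that sits behind a word cloud. It is not how Wordle or Google Ngram is actually implemented; the sample sentence, the short list of function words, and the function names are all assumptions made up for illustration.

```python
import re
from collections import Counter

# A tiny, assumed list of function words; real stop lists are much longer.
FUNCTION_WORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "it"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, drop_function_words=True):
    """Count how often each word occurs, optionally skipping function words."""
    tokens = tokenize(text)
    if drop_function_words:
        tokens = [t for t in tokens if t not in FUNCTION_WORDS]
    return Counter(tokens)


def bigram_counts(text):
    """Count adjacent word pairs, a very simple form of collocation."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


# An invented sample sentence, not a quotation from either text.
sample = "the dream of freedom and the dream of justice"

print(word_frequencies(sample).most_common(3))
print(word_frequencies(sample, drop_function_words=False).most_common(3))
print(bigram_counts(sample).most_common(2))
```

Dropping the function words leaves the content words a word cloud would display, while keeping them shows how heavily the small words a human reader skips over actually dominate the counts.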

[Image: Screen Shot 2016-01-27 at 1.35.42 PM] [Image: Independence]

[Image: Screen Shot 2016-01-29 at 1.51.01 PM]

Compared to traditional reading (close reading), which focuses on a small amount of literature from the same genre or era, distant reading helps scholars identify many more patterns, similarities, and differences. Text analysis makes it possible to answer questions about the history of literature: how to distinguish American literature from British literature and what the most notable stylistic difference between them is, how to distinguish novels from poems more quickly, how to distinguish the work of male authors from that of female authors, and how to identify the work of authors who are anonymous or unknown because records have been lost. In the readings, Hoover reports differences in diction between male and female authors [2], and Underwood presents a simple imaginary statistical model that distinguishes pages of poetry from pages of prose [4]. These examples suggest that distant reading is powerful enough to detect patterns that might seem impossible for scholars to find or might take them years to discover.
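
As a rough illustration of how a computer might separate pages of poetry from pages of prose, the Python sketch below uses one crude feature: the average number of words per line. This is not Underwood's model, which is statistical and learned from data. The feature, the threshold of 8 words, the function names, and both sample pages are assumptions invented for this example.

```python
def mean_words_per_line(page):
    """Average number of words on the non-empty lines of a page."""
    lines = [line for line in page.splitlines() if line.strip()]
    if not lines:
        return 0.0
    return sum(len(line.split()) for line in lines) / len(lines)


def looks_like_poetry(page, threshold=8.0):
    """Toy rule: short lines suggest verse. The threshold is an arbitrary assumption."""
    return 0 < mean_words_per_line(page) < threshold


# Two invented sample pages, written only for this illustration.
verse_page = "The river runs\nbeneath the stone\nand keeps its song\nalone"
prose_page = (
    "The river ran beneath the old stone bridge and kept on toward the town,\n"
    "where nobody had thought about it for a good many years."
)

print(looks_like_poetry(verse_page))
print(looks_like_poetry(prose_page))
```

A real model would combine many such features and estimate its thresholds from labeled pages rather than fixing them by hand, but the principle of turning a page into numbers and comparing them is the same.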

Distant reading makes the humanities dynamic and energetic. According to Underwood, it is an interdisciplinary conversation among fields such as the social sciences, the humanities, computer science, sociology, statistics, and literary history [4]. As more fields join the discussion, the findings will become more well-rounded and diverse. In the past, only humanists and social scientists worked on how to analyze and understand texts, but now scholars from many different fields are gradually coming together to focus on one topic and contribute their ideas, as if the whole world were collaborating toward one goal. I believe that such diverse collaboration will bring great changes, not only in our understanding of texts but also in cultural equality.

Although distant reading has promising long-term prospects, it also brings a number of challenges and shortcomings. First, copyright restricts access to digital texts, and because OCR technology is limited, text recognition is imperfect. Second, text analysis can help scholars find problems quickly, but it cannot provide rational explanations or corresponding solutions, which severely limits its application. Third, individual literary works are often the focus of humanities scholars and social scientists, and heavy use of big data analysis could obscure the important patterns of each individual work. Finally, it could hold back the development of critical thinking skills and innovation. There is no doubt that distant reading will remain a popular trend in academia, but it will not replace traditional humanistic study. Instead, it can complement traditional methods to bring new ideas and discoveries.

 

[1] Schulz, Kathryn. “What Is Distant Reading?” The New York Times. 2011. Web. 30 Jan. 2016.
[2] Hoover, David L. “Text Analysis.” Literary Studies in the Digital Age. 2013. Web. 30 Jan. 2016.
[3] Clement, Tanya. “Text Analysis, Data Mining, and Visualizations in Literary Scholarship.” Literary Studies in the Digital Age. 2013. Web. 31 Jan. 2016.
[4] Underwood, Ted. “Seven Ways Humanists Are Using Computers to Understand Text.” The Stone and the Shell. 2015. Web. 31 Jan. 2016.

One reply on “What is Distant Reading and how does it help us?”

Taylor,

Great job at distinguishing the differences between distant and close reading! I especially liked the point you made on the difficulty of noticing patterns in a text without textual analysis. It is true that it could be done with close reading, but not with much accuracy. Especially looking at the sentences syntactically!

Great Work!
