Corpus Creation: Feminist Speeches

My corpus creation began with consolidating my interests into overarching topics that I could start with and then narrow down. I began by playing around with the idea of Civil Rights in the past and present and narrowed my ideas down to two: a look at women’s speeches throughout history, concentrating on how their focus and patterns changed over time as equality increased, or a look at students’ reactions to the BlackLivesMatter movement and campus protests, focusing on the use of different hashtags by location and other demographics. After researching both topics I decided on women’s speeches and began to search for the most influential and important speeches in history. Many of the lists I found were similar, with certain speeches showing up in most if not all of them. I finally decided to change my focus from women’s speeches throughout history to the speeches in one “top 10” list, to better understand why these were chosen as the most influential speeches and to see whether there was a trend among them despite the different time periods and settings in which they were given.

The final corpus I decided on was a list from Marie Claire, ‘The 10 Greatest Speeches Of All Time By 10 Inspirational Women’:

  1. Virginia Woolf, ‘A Room of One’s Own’ (1928)
  2. Emmeline Pankhurst, ‘Freedom or Death’ (1913)
  3. Elizabeth I, ‘Speech to the Troops at Tilbury’ (1588)
  4. Hillary Clinton, ‘Women’s Rights are Human Rights’ (1995)
  5. Sojourner Truth, ‘Ain’t I a Woman’ (1851)
  6. Nora Ephron, ‘Commencement Address to Wellesley Class of 1996’ (1996)
  7. Aung San Suu Kyi, ‘Freedom From Fear’ (1990)
  8. Gloria Steinem, ‘Address to the Women of America’ (1971)
  9. Julia Gillard, ‘The Misogyny Speech’ (2012)
  10. Maya Angelou, ‘On the Pulse of Morning’ (1993)

I chose this corpus because it is a mix of different types of writing by a diverse group of women: from a freed slave to a queen heading into battle. Some of these texts are lengthy published works in their own right, while others are speeches of no more than 350 words. I wanted to know what exactly ties them all together to make them influential speeches and how they used their language to make an impact. I also want to look at the historical and geographical context of each speech to see what influenced its style and arguments.

Because my corpus is made up of famous speeches, it was not too difficult to find transcripts of the texts I needed, and after downloading them I was able to convert them into txt files. My biggest trouble came with Virginia Woolf’s ‘A Room of One’s Own’, because it is a book. I had to make a transcript of Gloria Steinem’s ‘Address to the Women of America’ myself from recordings of her 30-second speech. I have begun to use a few tools to analyze the texts: removing common words, looking at word frequencies, and organizing the texts by date as well as by country of origin.
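The three steps named above (removing common words, counting word frequencies and ordering the texts by date) can be sketched in a few lines of Python. This is only a minimal illustration and not the tool I actually used: the stop list is a short assumed one, the function names are hypothetical, and the sample texts are placeholders rather than quotations from the speeches.

```python
import re
from collections import Counter

# Assumed stop list for illustration only; tools such as Voyant use much longer lists.
STOP_WORDS = {"a", "an", "and", "the", "it", "is", "of", "to", "in", "that", "i", "as"}

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count every token that is not in the stop list."""
    return Counter(tok for tok in tokenize(text) if tok not in stop_words)

def sort_by_year(corpus):
    """Order (title, year, text) records chronologically."""
    return sorted(corpus, key=lambda record: record[1])

# Placeholder records, not real transcripts.
sample_corpus = [
    ("Speech B", 1995, "The rights of women are the rights of all."),
    ("Speech A", 1851, "The women and the men and the rights of women."),
]

for title, year, text in sort_by_year(sample_corpus):
    print(year, title, word_frequencies(text).most_common(3))
```

Sorting by country of origin would work the same way with an extra field in each record.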

Figure 1. Voyant analysis of ‘Ain’t I a Woman’, Sojourner Truth

Figure 1 displays the beginning of my analysis: after cleaning up the text files, I put them into Voyant to look for any patterns. Considering the topic, it is not surprising that some of the most common words across all the speeches were “men” and “women”. Religion (“God”) was mentioned often, as was “rights”.

Distant Reading

Ted Underwood defines distant reading as “an interdisciplinary conversation about methods”, referring to computational methods of literary analysis. Although scholars disagree on the accuracy and dependability of computational analysis, there are various programs that help readers look at a text in a new light through visual representation (charts, graphs, pictures, etc.). Distant reading looks at patterns in writing, word frequencies, punctuation and a plethora of other small details, and quantifies them for insight into an author’s writing style, the time period, the subject and the culture of the place or time. It does not necessarily replace reading: to understand the patterns further, it is important to understand the context of the story and characters. Another way to establish that the data being collected is insightful is to look at other works from the same time period or place. A pattern discovered in an author’s writing might not be a characteristic unique to that author but rather a norm in the writing style of the period.

Davidic Chiasmus and Parallelisms
This website finds patterns in text and can be used as a part of the rhetoric approach to literature.

 

A Wordle: a data program that shows the most used words in a text

Tanya Clement writes about how distant reading can help analyze a text in conjunction with close reading. She argues that both types of reading can be important in all four approaches to literature: as rhetoric, as art, as philosophy and as cultural production. One program that can help with these methods is pictured to the left: a program that finds the patterns in a text and gives insight into the author’s style and meaning. For MLK’s speech this program served both the artistic and the rhetorical approach: it showed patterns in the writing, and these patterns also added to the writing as art by highlighting the imagery in the language he used throughout his speech.

An example of a Google Ngram

Hoover also mentioned using data analysis for various investigation techniques. One of the ways he considered for investigating a text is looking at characters’ speech patterns and how one character’s voice may differ from another’s. Clement explored this phenomenon by studying word frequencies in a text. Most programs that look at word frequency, like Wordle, ignore the most common words such as “it”, “and” and “the”, but Clement wrote about a program that actually included these words to learn about the different social classes of the characters in Jane Austen’s novels. To take things a step further, if a scholar finds that an author used a particular word a lot more than any other, as is the case with Wordsworth and the word “alone”, the scholar may also want to see whether this is that author’s habit or a feature of the time period. To figure that out, the scholar would have to check the frequency of the word in other books of that time. One way to do this is to use Google Ngrams, which shows the frequency of words across different time periods.
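The comparison described above, checking whether a word is an author’s habit or a norm of the period, comes down to comparing relative frequencies. The sketch below is a hedged illustration of that idea, not the program Clement or Hoover describe: the function names are hypothetical, the texts are made-up placeholders, and common words are deliberately left in, since those are the words Clement’s example relies on.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def rate_per_thousand(word, text):
    """Occurrences of `word` per 1,000 tokens; common words are NOT filtered out."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000 * Counter(tokens)[word.lower()] / len(tokens)

def compare_to_reference(word, author_text, reference_texts):
    """Return (author's rate, mean rate across the reference texts)."""
    author_rate = rate_per_thousand(word, author_text)
    rates = [rate_per_thousand(word, t) for t in reference_texts]
    reference_mean = sum(rates) / len(rates) if rates else 0.0
    return author_rate, reference_mean

# Made-up placeholder texts, not real quotations.
author_text = "alone alone on a wide sea"
reference_texts = [
    "the sea was wide and the day was long",
    "she sat alone by the sea",
]

author_rate, reference_mean = compare_to_reference("alone", author_text, reference_texts)
print(f"author: {author_rate:.1f} per 1,000 words; reference mean: {reference_mean:.1f}")
```

Using a rate per 1,000 words rather than a raw count matters for a corpus like mine, where a book-length text sits next to speeches of a few hundred words. If the author’s rate is far above the reference mean, the word looks like a personal habit; if the two are close, it is more likely a norm of the period.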

Bookworm – Culturomics
Culturomics – analysis of words in Presidents’ State of the Union Addresses

The final program we looked at was Bookworm’s Culturomics. These are programs that look at word use in texts, movies, speeches, songs, etc. throughout history. The one I specifically looked at sorted presidents based on their use of various words in their State of the Union Addresses. It was interesting to see what different presidents were focusing on in light of the historical events happening in the United States and around the world during their presidencies. Comparing it to the most common words in Martin Luther King’s speech suggested that the president at the time, Kennedy, was not talking about the same thing despite the movement happening in the country. This program also demonstrated to me the importance of programs that take into account conjugations and words that may have more than one meaning, such as “right” versus “rights”, or “right” as a direction. A search that differs by only one letter can return unwanted results or unnecessary data, which affects the accuracy of the conclusions drawn from the data.

Bookworm Analysis

I began by searching for the word “right” in Google Ngrams and the Bookworm program because it was the first word I noticed in Obama’s State of the Union speech, and it gave me the following results:

Ngram of the word “right”
Bookworm representation of “right” in State of the Union speeches

Both charts show the same recent increase in the use of the word “right”: Obama used the word more than any other president, and the Ngram slopes upward. However, Kennedy, who was president when Martin Luther King gave his speech, is fairly low on the list, giving the impression that he was not talking about people’s rights during his presidency.

After seeing these results, however, I decided to test whether I would get different results by searching “rights”, and was surprised when my figures did change:

Ngram of the word “rights”
Bookworm representation of the word “rights” in State of the Union Addresses

The data is completely different. Obama is in the bottom five and Kennedy is much higher on the list. The Ngram shows a recent decline in the use of “rights”. This demonstrates how a small alteration to a search can produce drastically different outputs and can affect the conclusions drawn from the data.
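The same effect can be reproduced on a small scale. The sketch below is an illustration under my own assumptions, not how Bookworm or Google Ngrams work internally: it counts exact tokens, so “right” and “rights” are tallied as unrelated words unless the forms are grouped by hand. The function names and the sample sentence are made up for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens, keeping internal apostrophes."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def count_exact(term, text):
    """Count only tokens identical to `term`, as an exact-match search would."""
    return Counter(tokenize(text))[term.lower()]

def count_forms(forms, text):
    """Count several hand-picked forms of a word together."""
    counts = Counter(tokenize(text))
    return sum(counts[form.lower()] for form in forms)

# Made-up sample text for illustration.
sample = (
    "Turn right at the gate. It is the right thing to do. "
    "Women's rights are human rights."
)

print("right: ", count_exact("right", sample))
print("rights:", count_exact("rights", sample))
print("both:  ", count_forms(["right", "rights"], sample))
```

Even the grouped count is only a partial fix: it still cannot tell “right” as a direction from “right” as a moral claim, which is why the context of each hit has to be checked before drawing conclusions.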