Battle of the Digital Tools!

 

My Corpus Construction

 

I created my corpus using older translations of The Odyssey, The Iliad, The Aeneid, and The Argonautica, and I cleaned it manually.  I chose to stick with the old translations because they give me a different view of the texts.  Because I have read several of these epics, I am very familiar with what happens in them and with their trends.  One of the things we talked about earlier in the semester was de-familiarizing texts, and I think that using these older translations will do just that.

 

Using Voyant and AntConc

 

I looked at three things: a word link, a word cloud, and a word concordance.  The word cloud told me which words appeared most often.  Using that information, I then moved on to the word link.  The word link, as seen below:

[Image: theimportanceofson]

showed me what words were connected.  However, the connection was out of context.  All that was left to do was look at the context using AntConc, which gave me what I needed to properly analyze the text.  Seeing which words come before other words allowed me to better see their importance.  A word without context means nothing; you only get the true meaning of a word when you see how it is used.  Looking at “son,” for example, you see that it can be used to inspire different things.  Below is the AntConc word concordance.

[Image: theimportanceofson2]
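To make the idea of a concordance concrete, here is a minimal Python sketch of a keyword-in-context listing. It is not how AntConc works internally; the function name `concordance`, the `width` parameter, and the sample sentence are all illustrative assumptions rather than anything taken from my corpus.

```python
import re


def concordance(text, keyword, width=4):
    """Return keyword-in-context lines with `width` words on each side of every hit."""
    # Lowercase the text and keep only runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Made-up sample sentence, not a quotation from any translation.
sample = "Then the son of Aeson spoke, and the son of Peleus answered him."
for line in concordance(sample, "son", width=3):
    print(line)
```

Each printed line shows one use of the word with its neighbors on either side, which is the same basic view the concordance gives: the word plus the context that gives it meaning.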

A Brief Comparison

 

Voyant and AntConc are two different programs, and each is meant to show you a different thing.  Voyant gives a broader overview of the texts using several different tools embedded in the engine.  Its interface is brighter and easier to navigate.  Both of those traits are common in something that is only meant to skim the surface.  AntConc is meant to go deeper.  It is more specific and has a clearer purpose: to show the keyness of a word, or its value to a text, meaning how much more often it is used there than in a comparison text.  Voyant shows keyness in a very basic form, but AntConc is much more adept at showcasing it.  Voyant, however, is very good at showing which words are unique through all of its tools and mini applications.  They are both good programs, but they have their specific uses.  I am more likely to frequent AntConc because of its simplicity.
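One statistic commonly used to measure keyness is log-likelihood, which compares how often a word appears in a target text against a reference text. The sketch below assumes that measure; whether it matches the settings I used in AntConc is an assumption, the function name is hypothetical, and the counts are invented purely for illustration.

```python
import math


def log_likelihood(freq_target, size_target, freq_reference, size_reference):
    """Log-likelihood keyness of one word, from its counts in two corpora."""
    total_freq = freq_target + freq_reference
    total_size = size_target + size_reference
    # Expected counts if the word were spread evenly across both corpora.
    expected_target = size_target * total_freq / total_size
    expected_reference = size_reference * total_freq / total_size
    score = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_reference, expected_reference),
    ):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Invented counts: a word seen 50 times in a 1,000-word target text
# and 10 times in a 1,000-word reference text.
print(log_likelihood(50, 1000, 10, 1000))
```

A score of zero means the word is used at the same rate in both texts; the further the score climbs, the more the word stands out in one of them.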

 

Pragmatics of my corpus

 

Studying these programs allowed me to further understand the importance of the word “son” in epic poetry and the importance of lineage.  A huge part of the ancient world is maintaining honor through lineage, and it is interesting to see that carry over into their poetry.  You see that very specifically in the Argonautica, where Jason is often referred to not as Jason but as son of Aeson.  That might be specific to this translation, but even then it still shows what the translation implies, and from that you get the importance.  Although it happens the most in the Argonautica, it happens in all of them, and I would never have guessed that it was so instrumental in understanding the epics.

 

My Corpus Creation

When I started creating my corpus, I knew exactly what I wanted to include in it.  I plan to eventually write my own heroic epic poem, and in order to do so I must first analyze a few earlier works to see what they have in common.  I found copies of four epic poems that I think will suit my needs nicely: The Argonautica, The Aeneid, The Iliad, and The Odyssey.  I found a version of each translated into English, but where I find passages that match across the texts, I also plan to check whether they match in the original language as well.  Although my corpus may seem small, it is mighty.  Each of these texts contains several subsections, and each subsection is of substantial length.

 

When I first started looking, I went to Project Gutenberg to select my translations, but those are in an older form of English, and because they are all from different time periods, they may not line up vocabulary-wise.  Currently I am searching for a more current translation, and I hope to find one soon.

 

I cannot say with certainty that I am looking for one specific thing, because I am not.  What I am looking for is a trend, though I am not sure what kind of trend I will find.  So I plan to use all of the tools that we have available to see what trends turn up.  For example, if I choose to work with word density, I would be able to see which words are most commonly used, and from that infer information about the theme of each part of the text.  However, because this is a translated text, I will have to check the English against the original language, whether it be Ancient Greek or Latin.  N-grams will probably not be particularly helpful to me, because this is a translated text.  Seeing how the English words were used over time wouldn’t really help me further my understanding of these authors and their writing techniques.  Using a website like nlp.stanford.edu:8080/corenlp/ to try to understand which words go with one another only helps me better understand the translator, not what the original author meant.  The only way that I can use a tool like that is if the translator did not take any liberties when translating and stayed very true to the text.  If that were the case, I think I could use this tool, but I would have to be careful when extrapolating from the data.

 

All things said, I think that my corpus fulfills all of the requirements.  However, analyzing my work might be slightly difficult, and I will need to work hard to get to the true meaning.  That is one of the problems with working with a translated text.  My hope is that, even though all of my texts are translations, they have been translated enough times that the most up-to-date translations have caught the mistakes of previous ones and stay very close to the original language.

What is Distant Reading?

Distant reading is a practice that is becoming more and more popular as our technologies and resources evolve.  It allows us to take a text that we may have already read in a traditional sense and see it in a new light.  Imagine that you have already read the novel Harry Potter and the Deathly Hallows, and you think you are the resident expert on that book.  Now imagine that I told you there were new ways to read your book that you had not thought about.  Perhaps this new way of reading would also allow you to compare Harry Potter and the Deathly Hallows to Harry Potter and the Goblet of Fire.  That is what distant reading allows us to do.  It allows us to use many different digital tools to analyze and compare many different forms of literature.  Each tool has its strengths and weaknesses, but they all have the same end goal: to defamiliarize texts in a way that allows us as readers to more efficiently understand what the author is trying to teach us.

As I mentioned before, there are many different tools that allow us to reach this goal.  The tool that I think is most instrumental to understanding patterns, as well as to comparing different texts, is the word cloud.  There are many sites that allow you to take texts and make a word cloud out of them, but the one I would like to focus on is wordle.net.  Its interface allows you to make a word cloud with just a few clicks.  Word clouds will always look different depending on what texts you use, but the idea behind them is the same.  Here are two examples of word clouds that I have made using Martin Luther King’s “I Have a Dream” speech and an excerpt from the Declaration of Independence:

[Images: wordle 2, wordle]

As you can see, although the words are different, you can look through the most common words for ones that are similar, or for themes that may occur across those writers’ work.  Doing this allows one to more easily grasp the overarching theme of a book or passage, and to easily compare themes from two separate books.  This process is much less time-consuming than trying to do the same thing by reading and analyzing both books.
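Underneath the picture, a word cloud is really just a table of word counts, so the comparison described above can be roughed out in a few lines of Python. This is only a sketch: the stopword list is a tiny hand-made one, the function names are hypothetical, and the two sample strings are invented stand-ins, not the actual speech or Declaration text.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "that", "we", "our"}


def top_words(text, n=10):
    """Return the n most frequent words in the text, ignoring stopwords."""
    tokens = [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(n)]


def shared_top_words(text_a, text_b, n=10):
    """Return the words that appear in the top-n lists of both texts."""
    return sorted(set(top_words(text_a, n)) & set(top_words(text_b, n)))


# Invented sample strings standing in for two longer texts.
text_a = "freedom freedom justice dream"
text_b = "freedom liberty rights justice"
print(shared_top_words(text_a, text_b))
```

The words that survive in both lists are the ones that would show up large in both clouds, which is where a shared theme is most likely to be hiding.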

 

Another tool that we can use to practice distant reading is ngrams.  Below is an example of what one may look like using the same two excerpts I used previously:

[Image: ngrams]

These graphs allow us to look through several hundred texts that span a period of time and see how frequently a chosen word was used.  They can show you a lot about how the meaning of that word, or what that word signifies, has changed over time.  They also allow you to see whether words were used in tandem with one another as time went on.  Much like wordle, they allow you to view themes.  On a very different note, though, they add a layer showing how time affects a word, which helps us understand why an author may have chosen that word over any other to get their meaning across.  That is not a thought someone normally has while reading a book.  By defamiliarizing the word and looking at it through a tool like an ngram graph, we discover new meaning, allowing us as readers to more deeply understand what the author is trying to say.
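Strictly speaking, an n-gram is just a run of n consecutive words, and the graphs chart how often such runs turn up over time. The counting step behind "words used in tandem" can be sketched as follows; the function name `ngram_counts` and the sample phrase are illustrative assumptions, and the sketch counts within a single text rather than across years.

```python
import re
from collections import Counter


def ngram_counts(text, n=2):
    """Count every run of n consecutive words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Zip the token list against shifted copies of itself to form the runs.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Made-up phrase, not a quotation from any text.
counts = ngram_counts("son of Aeson and son of Peleus", n=2)
print(counts.most_common(3))
```

Running the same count separately on texts from different years, and plotting the results, would give the kind of curve shown in the graph.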

Although there are perks to distant reading, it is not necessarily the way that things should be carried out henceforth.  As Clements says, distant reading should be used in conjunction with more traditional close reading to get a fuller understanding of the text you have just read.  Although all three of the authors that we have read have said that we are advancing in a way that makes distant reading and computational understanding more viable, we should not throw away the close reading of the past.  If we combine both of these types of reading as we continue to read, I think we may find as a society that we get more from our reading experiences and understand more of what we read.