My Corpus Creation

When I started creating my corpus, I knew exactly what I wanted to include in it.  I plan to eventually write my own heroic epic poem, and to do so I must first analyze several earlier works to see what they have in common.  I found copies of four epic poems that I think will suit my needs nicely: The Argonautica, The Aeneid, The Iliad, and The Odyssey.  I found an English translation of each, and where passages match across the translations, I also plan to check whether they match in the original languages as well.  Although my corpus may seem small, it is mighty.  Each of these texts contains several subsections, each of substantial length.

 

When I first started looking, I went to Project Gutenberg to select my translations, but those are written in an older style of English, and because they all come from different time periods, their vocabulary may not line up.  I am currently searching for more recent translations and hope to find some soon.

 

I cannot say with certainty that I am looking for one specific thing, because I am not.  What I am looking for is a trend.  However, I am not sure what kind of trend I will find, so I plan to use all of the tools we have available to see what turns up.  For example, if I choose to work with word density, I would be able to see which words are most commonly used and, from that, infer information about the theme of each part of the text.  However, because these are translated texts, I will have to check the English against the original language, whether Ancient Greek or Latin.  N-grams will probably not be particularly helpful to me, again because these are translations.  Seeing how the English words were used over time wouldn’t really further my understanding of these authors and their writing techniques.  Using a website like nlp.stanford.edu:8080/corenlp/ to try to understand which words go with one another would only help me better understand the translator, not what the original author meant.  The only way I could use a tool like that is if the translator took no liberties and stayed very true to the text.  If that were the case, I think I could use this tool, but I would have to be careful when extrapolating from the data.
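As a rough illustration of the word-density idea, here is a minimal Python sketch that counts the most common words in a passage.  The stop-word list, the function names, and the sample sentence are all placeholders I made up for illustration; none of them come from an actual translation or from any particular tool.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption; a real analysis
# would want a much fuller list).
STOPWORDS = {"the", "of", "and", "a", "to", "in", "that", "his", "he", "on"}


def word_counts(text, stopwords=STOPWORDS):
    """Count lowercase alphabetic tokens in text, skipping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


def relative_frequency(counts, word):
    """Share of all counted tokens that are `word` (0.0 if nothing was counted)."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


# Placeholder sentence, not a quotation from any translation.
sample = "The hero sailed the sea, and the sea tested the hero."
counts = word_counts(sample)

print(counts.most_common(3))
print(relative_frequency(counts, "sea"))
```

Running the same count on each book of each poem would give one frequency table per subsection to compare.  Since the counts describe the translator's English word choices, any pattern would still need to be checked against the Greek or Latin before saying anything about the original author.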

 

All things said, I think that my corpus fulfills all of the requirements.  However, analyzing it might be slightly difficult, and I will need to work hard to get to the true meaning.  That is one of the problems with working with translated texts.  My hope is that, even though all of my texts are translations, they have been translated enough times that the most up-to-date versions have caught the mistakes of earlier ones and stay very close to the original language.
