Presentation Reflection

My Motivations

I have always been a huge fan of food. I love trying cuisines from different cultures. It is amazing that different cultures have created so many ways to cook one specific ingredient. Driven by the love of food, I decided to do an analysis on recipes from different cultures.

My Research Questions

  • Which cooking techniques do different cultures prefer to use?
  • Do ingredients vary from culture to culture?
  • Is there a spice that people from a given culture especially love to use?

With these research questions in mind, I went on to collect cookbooks from different cultures and build my corpus.

My Corpus

When I was searching for cookbooks that I was interested in, I kept in mind that I needed to find cookbooks from cultures that are representative and diverse.

My corpus contained these texts:

  • French cuisine cookbook
  • Chinese and Japanese cuisine cookbook
  • Egyptian cuisine cookbook
  • Australian cuisine cookbook
  • Indian cuisine cookbook
  • Italian cuisine cookbook

I picked the French, Chinese, Japanese, and Italian cuisine cookbooks because these cultures have always been known for excellent cooking. They are also very representative of the countries in their regions, and they share one common characteristic: they all have long histories. Longer histories can lead to more mature and distinctive cooking techniques.

I picked the Egyptian cuisine cookbook because Egypt has always been well known for its history and its pyramids. I thought it would also be interesting to find out how people in Egypt cook.

I picked the Australian cuisine cookbook because Australia is geographically far from all the other countries, and it has some ingredients that other countries don’t have. My grandfather traveled to Australia nine years ago, and he told me that kangaroos there were so overpopulated that Australians had to eat them to slow the population’s growth. I was curious whether I could actually find kangaroo as an ingredient in the Australian cuisine cookbook.

I picked the Indian cuisine cookbook because of India’s distinctive religious traditions. I wanted to see whether religion has any impact on how people cook.

Modifying Corpus

All the texts I collected were in .pdf format, except the Chinese and Japanese cuisine cookbook I got from Prof. Faull, which was in .txt format. Unfortunately, most of the text analysis tools I used only supported .txt, so I had to convert all the .pdf files. Since cookbooks usually contain images of dishes, some strange characters appeared in the text files after conversion. These characters noticeably distorted the results generated by the text analysis tools.
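
The kind of cleanup described here can be partly automated. The following is a minimal Python sketch, assuming the converted cookbooks are ordinary text strings; the function name `clean_converted_text` and the sample string are made up for illustration, and the rules are only one reasonable guess at what counts as a "strange character".

```python
import re
import unicodedata


def clean_converted_text(raw: str) -> str:
    """Strip control characters and similar debris left by PDF conversion."""
    # Normalize so that equivalent characters share one representation.
    text = unicodedata.normalize("NFC", raw)
    kept = []
    for ch in text:
        if ch in "\n\t":
            # Keep line breaks and tabs even though they are control characters.
            kept.append(ch)
        elif unicodedata.category(ch).startswith("C") or ch == "\ufffd":
            # Drop control, format, private-use and unassigned characters,
            # plus the Unicode replacement character.
            continue
        else:
            kept.append(ch)
    text = "".join(kept)
    # Collapse runs of spaces or tabs, then runs of three or more newlines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


# Hypothetical sample: a form feed and a replacement character from conversion.
sample = "Add 2 cups\x0c of  flour\ufffd\n\n\n\nStir"
print(clean_converted_text(sample))
```

A script like this only removes characters that are clearly not text; leftover fragments of image captions or page headers would still need to be checked by hand.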

There were also many terms unrelated to my research questions, such as “add”, “place”, and “heat”. I had to take them out when I was trying to generate an informative word cloud.
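
Filtering out such terms before counting words can be sketched as follows. This is only an illustration in Python, not what Voyant does internally; the `STOP_WORDS` set is a small hypothetical list built from the examples above plus a few common English words, and `top_terms` is a made-up helper name.

```python
import re
from collections import Counter

# Hypothetical stop list: a few common English words plus recipe verbs
# such as "add", "place" and "heat" that say nothing about ingredients.
STOP_WORDS = {
    "the", "a", "an", "and", "of", "in", "to", "with", "until",
    "add", "place", "heat",
}


def top_terms(text, stop_words=STOP_WORDS, n=10):
    """Return the n most frequent words that are not in the stop list."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


recipe = "Heat the oil. Add chicken and onion. Add the chicken to the pan."
print(top_terms(recipe, n=3))
```

The remaining counts are the kind of data a word cloud is drawn from, so extending the stop list changes which words appear largest.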

In general, cleaning my corpus was such a tedious process.

Voyant

Once I finally had a clean collection of cookbooks, I first used Voyant to analyze them. I mainly used two tools: the word cloud and Bubblelines.

Capture1

This is the word cloud generated by Voyant for the whole corpus. From this word cloud, it’s easy to tell that chicken is the most popular protein source across all the cultures I picked. Onion is a popular vegetable among them, and salt and pepper are the seasonings used most often.

Screen Shot 2016-03-23 at 12.59.01 AM

This is the Bubblelines graph generated by Voyant. I searched for “pork”, “chicken”, “butter”, “pepper”, “fish”, and “curry” in the corpus to see how often these ingredients are used in each culture’s cookbook. It is surprising to see that people from Australia and India do not eat pork very often.
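
The same comparison can be approximated with a short script. The Python sketch below counts each search term per cookbook and scales the count by the cookbook's length, on the assumption that raw counts would favor longer cookbooks; whether Bubblelines normalizes in this way is not something I have checked. The function name `term_rates` and the tiny sample texts are hypothetical.

```python
import re
from collections import Counter


def term_rates(cookbooks, terms, per=10_000):
    """Return {cookbook name: {term: occurrences per `per` words}}."""
    rates = {}
    for name, text in cookbooks.items():
        words = re.findall(r"[a-z]+", text.lower())
        counts = Counter(words)
        total = len(words) or 1  # avoid dividing by zero for an empty text
        rates[name] = {t: counts[t] * per / total for t in terms}
    return rates


# Hypothetical miniature "cookbooks" standing in for the real files.
cookbooks = {
    "A": "pork pork chicken rice",
    "B": "chicken curry curry curry",
}
for name, row in term_rates(cookbooks, ["pork", "curry"], per=100).items():
    print(name, row)
```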

Voyant is an ideal tool to display my corpus in a fancy and colorful way.

AntConc

Next, I moved on to AntConc for further analysis of my corpus. I mainly used the keyword-in-context search tool and the corpus comparison tool.

I searched for “butter” across the whole corpus to see the contexts around it, so that I could see how people use butter and what they cook with it.

I used the corpus comparison tool to compare the Chinese and Japanese cuisine cookbook with the Egyptian cuisine cookbook. The results showed me that Chinese and Japanese people eat many more mushrooms than Egyptian people do.

AntConc showed me some very useful information about the relationships and differences between cultures.

Jigsaw

I was very happy when I moved on to Jigsaw. Finally, a tool that supports the .pdf format! I didn’t have to do any cleanup or conversion when I used Jigsaw. Jigsaw allowed me to create my own entity list, which was very helpful since I was looking for certain ingredients in the text. I therefore created a file containing a list of the most frequently used ingredients, based on the word clouds generated by Voyant.

I used the List tool, which serves a similar purpose to Bubblelines in Voyant. It showed me whether a certain ingredient appeared in a specific culture’s cookbook.

I found out that Australian people don’t cook pork or mushrooms very often.

Then I used the Graph tool to find out which dishes people cook with certain ingredients.

It was fascinating that Jigsaw was able to generate such detailed information.

Reflections on the Results

The results were surprising and informative. I was expecting to see kangaroo, but I didn’t find any recipes that use it. I also wasn’t expecting Australian people to cook mushrooms so seldom. In general, I was quite satisfied with the results.

Nevertheless, I also realized that the results could be biased. Several factors could cause this bias:

  • The cookbooks I collected might have been written in different years.
  • The cultures I selected are representative, but the cookbooks might be too short to be representative.
  • The cookbooks might have been translated, and the translators might have misunderstood the original authors.
  • The recipes might be adjusted according to where the cookbooks were published.
  • The size of my corpus might not be large enough.

As for the analysis tools, no single one of them is the best. It is better to use them together so that they complement one another.

My Future Research

  1. I plan to further clean my corpus so that I can get more accurate results.
  2. I will expand my corpus. I will not only select more representative cultures but also collect more cookbooks for each culture at the same time.
  3. I will try to look for more new and interesting relationships between cultures and within cultures.
  4. I will try to establish connections between food and other elements of each culture.

Comparison of my corpus and text analysis in Voyant and AntConc

Corpus construction 

Collecting recipes from various backgrounds has been a tedious process for me. The online resources are surprisingly limited and unorganized. When I first started building my corpus, I had to find all the recipes I was interested in and combine them into my own customized “cookbook”. As a result, the texts I collected were not general or representative; they were biased by my personal tastes. The corpus was also so small that I didn’t get as much information as I needed. I then did more research in order to expand it.

Fortunately, I found some cookbooks from different backgrounds on various websites. The recipes in these cookbooks are as representative as I had hoped. In my corpus, there is an Italian Cookbook, a French Cookbook, a Chinese-Japanese Cookbook, an Indian Cookbook, and an Egyptian Cookbook. Although none of these texts is long, there are still many words and phrases I want to get rid of.

Voyant vs. AntConc: Interface

When I first started using the two text analysis tools, I was pretty impressed by the level of analysis they were able to deliver. As for the interface, I really like Voyant because it is so colorful and user-friendly. Voyant also looks more polished compared to AntConc. Its layout is clearer, and I was able to locate the tools I needed to analyze my corpus. I also like that Voyant can handle different languages.

Although Voyant looks fancier than AntConc in some ways, AntConc is still more powerful and useful for certain kinds of analysis. Many of the terms in the Cirrus word cloud provided by Voyant were clearly useless: Voyant was showing words I was not interested in instead of the terms I was looking for. I had to modify the stop word list many times before I could make any reasonable analysis, and it was a painful process. I had to add some terms and confirm the new list. Then other useless words would pop up, so I needed to go back to the list I had just modified and add more words. Finally, I got this cleaner result:

Screen Shot 2016-02-24 at 11.24.02 AM

Nevertheless, I have to admit that Voyant does a good job of constructing a word cloud, which AntConc does not offer. In addition, AntConc’s display always looks blurry.

Voyant vs. AntConc: Searching

AntConc did a good job of delivering well-organized search results. I was able to look at the text surrounding a keyword I was searching for, and it also let me see which text a certain line came from. Consequently, I was able to perform some comparative searches. For example, by searching for “pork” I was able to find out not only which cultures cook pork less often than others but also how each culture prepares it.
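
A keyword-in-context search of this kind can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not how AntConc is implemented; the function name `kwic`, the `width` parameter, and the two sample sentences are made up for the example.

```python
import re


def kwic(texts, keyword, width=3):
    """Return (source, left context, keyword, right context) for each hit."""
    rows = []
    for name, text in texts.items():
        tokens = re.findall(r"[A-Za-z']+", text)
        for i, tok in enumerate(tokens):
            if tok.lower() == keyword.lower():
                # Take up to `width` words on each side of the match.
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                rows.append((name, left, tok, right))
    return rows


# Hypothetical sample texts keyed by the cookbook they come from.
texts = {
    "french": "Melt the butter in a pan",
    "chinese": "Stir fry the pork",
}
for source, left, word, right in kwic(texts, "butter", width=2):
    print(f"{source}: {left} [{word}] {right}")
```

Keeping the source name in every row is what makes it possible to compare how different cookbooks use the same ingredient.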

As for Voyant, there is a useful tool called Bubblelines, which lets me see the frequency of each key term in each text by looking at the size of the bubbles.

Yet Bubblelines didn’t tell me the location of each keyword in the text.

Voyant vs. AntConc: Specialties

I found that Voyant and AntConc each have some features that the other lacks.

Voyant provides a tool that creates links between key terms. From the links, I was able to tell which terms were associated with certain words in my corpus. The relationships between words are very useful information for my analysis.

AntConc allows me to compare two corpora to find out what is distinctive about each. It lists the keyness of the words in the corpus I want to compare. “Keyness” is a measurement that tells us how much more frequently a word appears in one corpus than in another. Therefore, I’m able to see how two cultures diverge in terms of cooking.
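
One common way to compute keyness is the log-likelihood statistic, which compares a word's observed count in each corpus with the count expected if both corpora used the word at the same rate. The Python sketch below assumes that statistic; whether it matches the setting used in AntConc for this comparison is an assumption, and the function name and the counts are invented for illustration.

```python
import math


def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in c words versus b times in d words."""
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero count contributes nothing (0 * log 0 is treated as 0).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


# Hypothetical counts: "mushroom" appears 30 times in one 1,000-word
# cookbook and 5 times in another cookbook of the same length.
print(log_likelihood(30, 5, 1000, 1000))
```

A higher score means the difference in frequency is less likely to be due to chance. The score itself does not say which corpus uses the word more, so the two relative frequencies still have to be compared.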

Screen Shot 2016-02-28 at 10.33.31 PM

This is the result I got from comparing the Australian Cookbook and the Chinese-Japanese Cookbook. The result is very interesting.

Reflection

This whole process of corpus construction and analysis has impressed upon me how diverse cultures are in terms of cooking. Even people from the same continent cook differently. I am excited to see more contrasts and analogies between cultures.

Corpus Creation

Food has always been my biggest interest. I love the joy of trying delicious dishes from different cultures. Whenever I visit a new city, I spend hours on Yelp researching the best-rated restaurants in town. I never let my stomach down. I also thought it would be very interesting to compare how differently people cook in various cultures. It would also be fascinating to find a relationship between a culture and the ingredients it uses most, and to see how different cultures cook different ingredients. Thus I decided to collect recipes for several different courses from a variety of backgrounds.

Before I started searching for recipes, I realized that recipes are very special compared to other types of text like academic journals or public speeches. They are usually formatted as a list of instructions. There is another issue with collecting recipes: the angle of analysis is crucial in deciding which texts to collect. Cooking comes in such a wide variety of styles and ingredients that it is easy to lose focus, or hard to find one in the first place. Unlike academic papers or political speeches, recipes are rarely digitized. Thus there is not much material to collect, even though what exists is varied.

In order to obtain the most accurate results, I decided first to select some representative cultures from all the major cultures. From the European cultures, I chose French cuisine. France is a country with a long history, and French culture is attractive. French food has also always been regarded as some of the best-tasting food in the world. I then selected Chinese cuisine from all the Asian cuisines. I chose China because it has one of the longest histories and a very diverse culture that blends 56 different ethnic cultures. In addition, many other East Asian countries developed their cooking techniques with reference to China. Among the African cultures, I chose Morocco for the same reason: Morocco has a relatively long history, and its recipes have been developed over many years. From the Middle Eastern cultures, I chose Egyptian culture, since Egypt is also a country with one of the longest histories. It would be interesting to find out how Egyptians cook. From the Pacific cultures, I selected Australia, since it is isolated from all the other countries. Thus Australian culture must be special in some way.

My next step is to select representative recipes from each selected culture. Recipes are not very long, especially after all the stop words are removed, so it is possible for me to collect a large number of them from each culture. I chose recipes using different proteins as well as vegetarian recipes. Finally, I also collected two dessert recipes from each culture.

Although my plan for collecting texts is reasonable, it is still tricky to find all the materials I expected. Cookbook resources are surprisingly limited. I found that Wikibooks is an excellent and helpful resource that exceeded my expectations.

I will also try to find some extra information about the recipes, for example, the year in which a certain recipe was created. If I can associate more information with the recipes, I should be able to identify more precise relationships between different factors and recipes.

After I finish collecting my corpus, I will use Jigsaw as my distant-reading tool for textual analysis. I will be looking for connections between cultures and cooking techniques, as well as connections between ingredients and the cooking techniques applied to them.