Machines can read emotions if their sentiment lists and grading are tailored toward a certain type of work. A program cannot be trusted wholeheartedly or blindly; the user has to have prior knowledge of the corpus before running it through a program in case the machine does make a mistake. Prior knowledge allows the user to recognize when a result is skewed by a writing style or time period, and it also helps the user understand the context of why a text has a certain sentiment. It is also important to ensure that the lists of words the program uses for “happy” and “sad” or “calm” and “angry” analysis are relevant to the specific corpus, and to create a separate list or modify the existing one to get the best results. For students who may be working with translated or older texts, it is necessary to find or create a list that includes words from that time period, or one that accounts for a translator’s choice to render the text in an older, now-unused English when a program’s list only contains “new” English.
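To make this concrete, here is a minimal sketch of what a tailored word-list approach might look like in Python. The word lists and the `score_text` helper are purely illustrative assumptions, not any particular tool’s method; a real project would build the lists from the corpus itself, adding period-specific or translated vocabulary.

```python
# A minimal sketch of lexicon-based sentiment scoring with a custom word list.
# The word lists below are illustrative placeholders only; in practice they
# would be built from (or adjusted to) the specific corpus being studied.

HAPPY_WORDS = {"joy", "merry", "delight", "gladness", "mirth"}   # example entries
SAD_WORDS = {"sorrow", "grief", "woe", "mourn", "lament"}        # example entries

def score_text(text, positive=HAPPY_WORDS, negative=SAD_WORDS):
    """Return a simple (happy, sad) word count for one document."""
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    happy = sum(1 for w in words if w in positive)
    sad = sum(1 for w in words if w in negative)
    return happy, sad

if __name__ == "__main__":
    sample = "Her mirth turned to woe, and grief followed gladness."
    print(score_text(sample))  # -> (2, 2)
```

Even a toy example like this shows why the lists matter so much: a nineteenth-century word like “woe” or “mirth” only registers if someone has put it on the list.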
Programmers can write programs that take into account the different ways sentiment can be read in a text. A program can analyze a text using word frequency alone, or word frequency together with collocations, which may make it more accurate. In class we spoke about taking the word “not” into account: a document can say “not tragic” and “not angry,” but if a program treats “not” as a stop word, it will not read the opposite meaning of the sentence (a simple sketch of this follows below). The programmer may also want to take into account colloquial sayings that do not have one explicit meaning: a computer without a specific program is as clueless about American sayings as a foreigner who misunderstands being told to “break a leg.” Ramsey discusses creating a program that can “even distinguish [a] noun from the verb” for a word like “love,” as well as using programs to measure the richness of documents in a corpus and to “rank them according to ‘vocabulary richness’ (defined as the largest number of different words per fifty-thousand-word block)”.
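As a rough illustration of the “not” problem, the sketch below keeps negation words instead of discarding them as stop words and flips the polarity of the sentiment word that follows. The word lists and the one-word negation rule are simplified assumptions for demonstration, not how any particular sentiment tool actually works.

```python
# A simplified sketch of negation-aware scoring: instead of dropping "not"
# as a stop word, flip the polarity of the next sentiment word encountered.
# Word lists are illustrative placeholders.

POSITIVE = {"calm", "happy", "joyful"}
NEGATIVE = {"tragic", "angry", "sad"}
NEGATORS = {"not", "never", "no"}

def negation_aware_score(text):
    words = [w.strip(".,;:!?\"'").lower() for w in text.split()]
    score = 0
    negate = False
    for w in words:
        if w in NEGATORS:
            negate = True          # remember that the next word is negated
            continue
        polarity = 1 if w in POSITIVE else -1 if w in NEGATIVE else 0
        if polarity:
            score += -polarity if negate else polarity
        negate = False             # negation only applies to the very next word here
    return score

if __name__ == "__main__":
    print(negation_aware_score("The ending was not tragic and not angry."))  # -> 2
    print(negation_aware_score("The ending was tragic and angry."))          # -> -2
```

A program that simply deleted “not” would score both sentences the same way, which is exactly the kind of mistake a reader with prior knowledge of the text would catch.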
Another possibility is creating or using a program that “learns” from experience. Just as Mallet works best and is more accurate with larger texts, a sentiment analysis program that adds to its database with every search might be useful, although this may result in disasters like Microsoft’s “Tay” Twitter personality. Tay was a Microsoft project that was supposed to imitate a teenage girl in order to make customer service seem more personable. She was quickly corrupted by internet trolls, and in the course of a day went from tweeting her love for humans to posting racist, Nazi-supporting tweets. Her story shows that even the best program can have difficulty mimicking human thought and conscience. This is an argument used against sentiment analysis: it is hard to trust the results of something as cold and unemotional as a computer when it is analyzing something as “human” as emotion. We trust our computers to give us accurate results in math and science, shown clearly in a student’s dependence on his or her calculator, but it is difficult for people to accept the same calculations for emotion. According to Ramsey, this is because of “fears of an inhumanistic technology” and the fact “that we might ‘lose the text’ frightens many”.