I have been gathering a corpus of song lyrics. I intend to analyze differences in word choice, word frequency, word dependence, and anything else I can measure between songs that are meant to provoke particular emotions. I decided to use Spotify as a resource for grouping songs, since one section of Spotify organizes playlists by mood. The initial playlists that I chose were called “Don’t Worry Be Happy” and “Down in the Dumps.” One thing I noticed while looking for these playlists was that there were many more “happy” playlists than “sad” playlists, and of the few playlists that were “sad” themed, none actually had the word “sad” in its name. This made me think about social standards and whether Spotify was encouraging happiness and discouraging sadness. I will keep this in mind as I gather my results.
When choosing songs, I decided to leave out some from the “Don’t Worry, Be Happy” playlist. The songs that I did not include are: Don’t Worry Be Happy, Happy, Oh Happy Day, Shiny Happy People, If It Makes You Happy, and Happy Go Lucky Me. I thought that these songs, which have the word “happy” in the title, would skew the results by using the word “happy” too often. All of the songs from the “Down in the Dumps” playlist were used because none of their titles or choruses overused particular words.
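I made these exclusions by hand, but the same rule could be sketched in Python. This is only an illustration: the function name `drop_titles_with_word` is hypothetical, matching on whole words is my assumption about how the rule should work, and “Placeholder Song” is a made-up title standing in for a song that would be kept.

```python
import re

def drop_titles_with_word(titles, word="happy"):
    """Return the titles that do not contain `word` as a whole word (case-insensitive)."""
    word = word.lower()
    return [t for t in titles if word not in re.findall(r"\w+", t.lower())]

titles = [
    "Don’t Worry Be Happy",
    "Happy",
    "Oh Happy Day",
    "Shiny Happy People",
    "If It Makes You Happy",
    "Happy Go Lucky Me",
    "Placeholder Song",  # hypothetical title, not from the playlist
]

kept = drop_titles_with_word(titles)
print(kept)
```

Because the match is on whole words, a title containing “unhappy” would be kept. Whether that is the right choice depends on the analysis.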
I searched Google for the lyrics to every song on each playlist and pasted them into a document. After that, I cleaned them. Lyric sheets often put a heading before the chorus to label it as the chorus. The same goes for verses, and if a different singer takes over, that is noted too. I deleted all of these headings. In most of the lyrics, repetitions were marked with a number. For example, if a chorus is repeated twice, the sheet would say [2x]. When this happened, I deleted the marker and repeated the chorus manually by copying and pasting it. At first, I was unsure whether I should keep repetitions. I decided to keep them because repetition is clearly something the artist thought was important, and it could be a good indicator of what creates the mood of the song. Because the lyrics are written records of words that are sung, they often include grunts or slang (e.g., “cause” instead of “because”). These slang words are spelled consistently across the lyric sheets, so I decided to leave them unchanged to keep the lyrics as natural as possible.
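I did this cleaning by hand, but the steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the process I actually used: it assumes headings appear on their own line in square brackets (such as [Chorus] or [Verse 1: Singer]), that a repeat marker such as [2x] or [x2] sits on its own line inside the stanza it applies to, and that stanzas are separated by blank lines. The function name `clean_lyrics` and the sample lyrics are made up for illustration.

```python
import re

# Assumed formats (real lyric sheets vary):
#   repeat marker: "[2x]" or "[x2]" on its own line, inside the stanza it applies to
#   heading:       any other bracketed line, e.g. "[Chorus]" or "[Verse 1: Singer]"
REPEAT = re.compile(r"^\[\s*(?:(\d+)\s*x|x\s*(\d+))\s*\]$", re.IGNORECASE)
HEADING = re.compile(r"^\[[^\]]*\]$")

def clean_lyrics(raw: str) -> str:
    """Drop heading lines and write out repeated stanzas in full."""
    cleaned = []
    # Stanzas are assumed to be separated by one or more blank lines.
    for stanza in re.split(r"\n\s*\n", raw.strip()):
        times = 1
        lines = []
        for line in stanza.splitlines():
            line = line.strip()
            match = REPEAT.match(line)  # checked first: "[2x]" also looks like a heading
            if match:
                times = int(match.group(1) or match.group(2))
            elif HEADING.match(line):
                continue  # drop "[Chorus]", "[Verse 1]", singer labels, etc.
            elif line:
                lines.append(line)
        if lines:
            # Repeat the stanza as many times as the marker said (once if no marker).
            cleaned.extend(["\n".join(lines)] * times)
    return "\n\n".join(cleaned)

# Made-up sample text, not taken from any song in the corpus.
sample = "[Verse 1]\nI woke up today\n\n[Chorus]\n[2x]\nLa la la\nOh yeah"
print(clean_lyrics(sample))
```

Slang spellings such as “cause” pass through untouched, which matches the decision to leave them as they appear on the lyric sheets.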
I am really interested in seeing the results of the textual analysis of these lyrics. As of right now, I am not sure what results I will get. I am hoping to find some surprising correlation between words and emotion. I will try to analyze these lyrics in as many different ways as possible to catch every trend that is there.