A Comparison of Popular Song Lyrics Using Voyant

Comparison of Lyrics in Popular Song

This small study compares common themes in two collections of popular songs chosen by participants in two community music groups in Ireland. For the purpose of this comparison, the two groups are:

  • Group A: residents of a nursing home in Kerry, all aged 70 or over and all Irish by birth.
  • Group B: an adult education group of mixed age and ethnicity attending a community education course at a community centre in Cork city.

Both groups worked over a period of weeks with a community music facilitator, developing skills in singing together as a group. Over the duration of the course the groups selected favourite songs to sing as their repertoire. There was no pre-selection by the facilitator; the participants suggested songs from their own repertoires and backgrounds.

The purpose of the analysis in this blog is to examine these songs using the text analysis tool Voyant, in order to see whether themes emerge from each song collection and to draw conclusions based on that information. Do the themes emerging from the two groups overlap or correlate? Does this give some insight into the commonest themes in popular song in Ireland, or in popular song generally?

Each group came up with a ‘songbook’ of selected songs. In order to examine the findings it is useful to interrogate each separately and then to make comparisons. Group A’s songbook was made up of the following songs, largely traditional ballads:

GALWAY BAY

MY LOVELY ROSE OF CLARE

RED IS THE ROSE

COUNTRY ROADS

WHEN YOU’RE SMILING

KATE’S SONG

THE HOUSE OF THE RISING SUN

WHISKY IN THE JAR

WILL YOU GO, LASSIE, GO

TOORA LOORA LOORA

SALLY GARDENS

BLUE MOON

ALWAYS ON MY MIND

YOU ARE MY SUNSHINE

OH SUSANNA

DIRTY OLD TOWN

THE MOUNTAINS OF MOURNE

THE CLIFFS OF DOONEEN

WHEN I’M SIXTY FOUR

THE STAR OF THE COUNTY DOWN

THE SHORES OF AMERIKAY

SPANCIL HILL

CLARE TO HERE

NORA

COME FROM THE HEART

When the text from these songs was entered into Voyant the following word cloud result emerged:

This illustrates that the four most frequently occurring words are: chorus, love, oh and old. Already this gives some insight into common themes of the songs and even into song form, as it is clear that choruses occur frequently, a characteristic that might be expected in Irish songs designed for joining in and ‘sing-along’. However, if we are to concentrate on themes rather than form, then the word ‘chorus’ must be eliminated by instructing Voyant to treat it as a stop word.
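For readers who want to see what treating a word as a stop word amounts to, here is a minimal Python sketch. It is not Voyant's own code: the function name `top_words`, the simple tokenising rule and the sample text are all assumptions made for illustration, and unlike Voyant it applies no default stop list unless one is passed in.

```python
import re
from collections import Counter

def top_words(text, stop_words=frozenset(), n=5):
    """Return the n most frequent words in text, skipping stop_words."""
    # Lowercase, then keep runs of letters; apostrophes inside a word are
    # kept so that a contraction such as "I'm" counts as a single token.
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return counts.most_common(n)

# Invented sample text, not taken from either songbook.
sample = "Chorus: my love, my old love. Chorus: oh, the old town. Chorus: my love."

print(top_words(sample))                         # 'chorus' is counted
print(top_words(sample, stop_words={"chorus"}))  # 'chorus' is excluded
```

Passing a larger set of stop words (for example common function words) would bring the output closer to what a word cloud normally shows.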

With ‘chorus’ treated as a stop word, Voyant yields the following result, which is also available as an interactive view.

 

Now the most frequently occurring words are: love (27), I’m (21), old (19), oh (18), it’s (17). It is usual in text analysis to ignore pronouns, but in this case keeping them gives a useful insight. From this information it seems likely that most of the songs in this corpus are sung from a first person perspective and that thematically they are nostalgic and romantic in nature, as indicated by the references to ‘love’ and ‘old’.

As far as trends go, the raw frequencies of these individual words (or tokens, as they are called in text analysis) throw up some interesting information in the line graph above. Although there are clear peaks in the document where certain words are more dominant due to their prominence in individual songs, such as ‘old’ in Dirty Old Town, the top words do appear to occur reasonably consistently throughout the songbook. This can be examined further by looking at relative frequencies. The word ‘loved’ has been included as it occurs frequently (13 times) and is, of course, linked to ‘love’. The graph below shows that these words recur throughout the document, even if not evenly. Certainly ‘love’ is a recurring theme, which comes as no big surprise.
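The idea behind a trend line of this kind can be sketched in a few lines of Python: cut the songbook into equal-sized segments and count a word in each one. This is only an illustration under stated assumptions; the function name `segment_counts`, the default of ten segments and the toy text are invented here, and Voyant's own segmentation may differ in detail.

```python
import re

def segment_counts(text, word, segments=10):
    """Count occurrences of word in each of `segments` equal slices of text."""
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    n = len(tokens)
    # Segment boundaries as token positions: 0, n/segments, 2n/segments, ..., n
    bounds = [n * k // segments for k in range(segments + 1)]
    return [tokens[a:b].count(word) for a, b in zip(bounds, bounds[1:])]

# Invented toy text: 'old' is bunched at the start, 'love' is spread out.
toy = "old old old town love " + "the road love " * 3 + "home love again"

print(segment_counts(toy, "old", segments=4))
print(segment_counts(toy, "love", segments=4))
```

A word that is prominent in only one song shows up as a single high segment, while a word that recurs across the songbook gives non-zero counts in most segments.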

 

So if we conclude from this analysis that the important themes in the songs chosen by the nursing home group are love and nostalgia, how does this compare with the songs chosen by Group B, a group more mixed in age and ethnicity?

The songs chosen by Group B are:

BOTH SIDES NOW – JONI MITCHELL

IMAGINE – JOHN LENNON

ANNIE’S SONG – JOHN DENVER

HAPPY – PHARRELL WILLIAMS

MY HEART WILL GO ON – CELINE DION

SALLY GARDENS

RED IS THE ROSE

SOMEONE LIKE YOU – ADELE

MAKE YOU FEEL MY LOVE – BOB DYLAN/ADELE

THANK YOU FOR THE MUSIC – ABBA

Already we can see that there is a much higher incidence of contemporary music (1970s to present) than in the previous sample group, and only two traditional songs: Sally Gardens and Red is the Rose, both of which were also chosen by Group A. It is also clear that the two corpora are of different lengths, which is why it is important to pay attention to relative frequency (a word's occurrences as a proportion of the size of the corpus) rather than raw frequency, which gives only the total number of occurrences of a word. What else can text analysis tell us about emergent themes in this second corpus?
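The difference between raw and relative frequency can be shown with a short Python sketch. The function name `relative_frequency`, the choice of expressing the result per 1,000 words and the two toy corpora are assumptions made for illustration; Voyant may scale its relative frequencies differently.

```python
import re

def relative_frequency(text, word, per=1000):
    """Occurrences of word per `per` tokens of text (0.0 for empty text)."""
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word) / len(tokens) * per

# Invented toy corpora of different lengths, not the real songbooks.
long_corpus = "love and the old road home " * 20   # 120 words, 'love' x 20
short_corpus = "love is all you feel " * 5         # 25 words, 'love' x 5

print(relative_frequency(long_corpus, "love"))
print(relative_frequency(short_corpus, "love"))
```

In this toy case the longer corpus has the higher raw count of ‘love’ (20 against 5), but the shorter one has the higher relative frequency, which is the reason for comparing songbooks of unequal size on the relative measure.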

Below is an interactive representation of the Voyant analysis, which allows the user to search for different tokens and view the findings accordingly. A full-screen, and therefore more user-friendly, interactive view is also available.

By using the search panel in the Reader section of this report it is possible to see quickly how a word’s occurrences are distributed. The word shows up highlighted in the text as well as in a line graph below. From this we can see that the song ‘Happy’ contains many instances of its own title word, as well as of the word ‘clap’, and so it makes sense to add both to the stop words to avoid skewing the resulting word frequencies.

‘Like’ is also removed, as in these lyrics it is usually used to make a comparison or as part of a metaphor rather than to express liking. Having assessed the frequent words, it is helpful to pick ‘love’ and ‘feel’ and represent them in the graph, as they appear to be the dominant ones throughout the document.

Does this text analysis offer any insight into these song themes? Love and feelings are always going to be dominant themes in popular song, and so in a sense the analysis merely backs up what we already know. The fact that ‘old’ appears throughout Group A’s songbook, even allowing for the peak around Dirty Old Town, does tell us something about the nostalgic style of these more traditional songs compared to more recent songs. Both groups threw up ‘I’m’ as a high frequency word, which would indicate that most of the songs from both groups are written from a first person perspective. That offers an interesting pointer for investigating a larger corpus to see whether most popular songs are indeed written as first person narrative. Casual observation suggests that they are. Does this then make it easier for people to identify with the singer or writer, and is this why people select these songs above third person narratives?
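The first person question raised above could be explored on a larger corpus along the following lines. This is a rough sketch only: the word list `FIRST_PERSON`, the function names and the two toy lyrics are assumptions made for illustration, and flagging a song because it contains a single ‘I’ or ‘my’ is a crude test that would need refining (for instance, it cannot tell narration from quoted speech).

```python
import re

# Assumed list of first person singular markers; extend as needed.
FIRST_PERSON = {"i", "i'm", "i've", "i'll", "i'd", "me", "my", "mine", "myself"}

def is_first_person(lyrics):
    """True if the lyrics contain at least one first person marker."""
    # Normalise curly apostrophes so that I’m and I'm are treated alike.
    text = lyrics.lower().replace("’", "'")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text)
    return any(t in FIRST_PERSON for t in tokens)

def first_person_share(songs):
    """songs maps title -> lyrics; returns the fraction flagged first person."""
    if not songs:
        return 0.0
    return sum(is_first_person(text) for text in songs.values()) / len(songs)

# Invented one-line stand-ins for lyrics, not real song texts.
toy_songs = {
    "Song one": "I’m walking the old road home",
    "Song two": "The town is quiet and the river is wide",
}

print(first_person_share(toy_songs))
```

Run over a few hundred songs, a share close to 1 would support the casual observation, while a lower figure would suggest that third person narrative is more common than it seems.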

Voyant certainly offers many opportunities as a starting point for analysing textual data, and in this case it helps to discover and validate information not only about themes but also about song form and perspective.