Topic Modeling: Graphing the Results

The first topic is Travel:

Screen Shot 2015-04-01 at 1.12.09 PM


In this graph, we see an increase in travel words around 1893. The only other spike occurs in 1904, and it is not as high as the one in 1893, so I decided to research why that might be. I found that a new method of transportation became available by the end of the 19th century. According to the website Primary Homework Help The Victorians, “In the 1890s they could travel by motor car.” Based on this research, I think people may have decided to travel more after the invention of the motor car, which could explain the spike in 1893.

The second and third topics are Writing with Business:

Screen Shot 2015-04-01 at 1.12.31 PM

In this graph, I decided to compare the topics Writing and Business. Both topics spike at about the same time: Writing in 1903 and Business in 1904. I decided to research this further to find out why. Writing words appear most often in “The Adventure of the Three Students”. According to the plot summary in the Wikipedia article, there is a lot of writing in the story because it deals with students and a university. However, that does not explain why business words showed up so often, so I looked at another story that was published in 1904. Based on the Wikipedia article, business words appear fairly frequently in “The Adventure of the Abbey Grange” because it talks about how a man has been killed by the Randall gang. It is interesting that these words tend to rise and fall together; knowing this helps us understand the stories better, because it suggests that the stories from these years will be about writing or business.

The fourth topic is Detective Case:

Screen Shot 2015-04-01 at 2.25.32 PM

In this graph, we see a spike in detective case words around 1891 that is higher than the other peak. I decided to research why this spike happened when it did. According to the Wikipedia article about the Whitechapel murders, “The Whitechapel murders were committed in or near the impoverished Whitechapel district in the East End of London between 3 April 1888 and 13 February 1891.” Based on this research, it is possible that the Jack the Ripper case influenced the amount of detecting words in the Holmes stories in 1891.

The fifth topic is Death:

Screen Shot 2015-04-01 at 2.25.47 PM

In this graph, we see a spike in death words around 1903, and there is no other spike like it anywhere else in the graph. According to The Guardian, “During the 1880s and 1890s, local authorities, the LCC and the Metropolitan Public Gardens, Boulevard and Playground Association began to clean up and reopen old burial sites.” It is possible that the actions of these authorities influenced the amount of death words in the Holmes stories, since there is a steady rise in death words from 1893 onward. However, after researching, I am still not able to explain the sudden peak in 1903.

The sixth and seventh topics are Time with Crime:

Screen Shot 2015-04-02 at 10.55.17 AM

In this graph, we see a spike for both time and crime words in 1904. According to the Wikipedia article on “The Adventure of Charles Augustus Milverton”, which was published in 1904, the story is about the crime of blackmail. The article explains that, in order to help solve the case, Holmes visits Milverton’s Hampstead house, disguised as a plumber, to learn the plan of the house and Milverton’s daily routine. The daily routine refers to time, which may explain why time words rise along with crime words. Even so, it is evident that crime and time words appear in every Sherlock Holmes story.

The eighth and ninth topics are Physical Description with Building:

Screen Shot 2015-04-02 at 11.06.51 AM

I decided to pair these two topics because I wanted to see whether there is a correlation between them, and also because they are both kinds of description. Based on this graph, the amounts of physical description words and building words tend to rise and fall together. The exception is 1904, when the amount of building words increases and the amount of physical description words is not as high. Then, after 1905, they do the complete opposite of each other: when the amount of building words rises, the amount of physical description words falls, and vice versa. It’s possible that this kind of correlation tells us that a story will have either more building words or more physical description words.
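One way to check the "rise and fall together" impression with numbers instead of by eye is to compute a correlation coefficient for the two topics, once for the earlier years and once for the later years. The sketch below is only an illustration: the `pearson` helper is my own, and the yearly values are made-up placeholders, not numbers read from the graph above. A result near +1 means the two topics move together, and a result near -1 means they move in opposite directions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# PLACEHOLDER values for illustration only; replace them with the real
# topic weights per year exported from the topic model.
years = [1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908]
physical = [0.10, 0.14, 0.12, 0.13, 0.15, 0.11, 0.16, 0.10]
building = [0.08, 0.12, 0.09, 0.18, 0.13, 0.17, 0.10, 0.16]

split_year = 1905  # assumption: the year where the pattern seems to change
cut = years.index(split_year)

# Correlation up to and including the split year, then after it.
print("up to", split_year, pearson(physical[: cut + 1], building[: cut + 1]))
print("after", split_year, pearson(physical[cut + 1 :], building[cut + 1 :]))
```

With only a handful of years on each side of the split, the coefficient can swing a lot, so it should be read as a rough check on what the graph shows and not as proof.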

The tenth topic is Emotion:

Screen Shot 2015-04-02 at 11.19.53 AM

In this graph, we see spikes in the years where emotion words show up most frequently: 1893, 1904, 1913, and 1924. I have come to the conclusion that the stories published in these years all contained female characters, based on the Wikipedia articles for “The Adventure of the Cardboard Box”, “The Adventure of Charles Augustus Milverton”, and “The Adventure of the Sussex Vampire”, and on the bubble news article. Since they all contained female characters, it’s possible that the amount of emotion words increased in these years because, according to the Wikipedia article, women were not considered equal in Victorian times. This helps us understand the stories better because we can connect them to how the past really was.


