Topic Modeling Graphs: An Investigation

Welcome to my topic modeling project! While researching the trends in these three graphs, I came across some rather interesting finds. Much like a topic modeling project we reviewed in class, I was really interested in the historical events that may have inspired Sir Arthur Conan Doyle to include certain topics in his many stories. Here we go!


“Writing” topic model. Note: the blue line represents stationery/paper products

The first topic that I created was Writing, with three subcategories: stationery/paper products, secret letters and sending mail.

If we look at the left-hand side of my graph, we can see that all three topics had a huge spike around 1903, so that was the year I focused my research on.

According to The New York Times’ “On This Day” archive, a cartoon about a “major post office scandal” was published in Harper’s Weekly in September 1903. It exposed violations by a corrupt postmaster in the United States that an earlier story had touched upon in March of the same year. I’m not certain whether this would have had any effect on Doyle’s work, since he was in a different country, but news travels fast, especially news about scandals.

Speaking of scandals, I found a rather interesting English scandal that relates to the topic with the highest peak – secret letters.

I stumbled upon a Daily Mail UK article that provided “never before seen” photos of Edward VII’s mistress, a woman named Lillie Langtry. According to a caption underneath one of her photos, “Langtry was a regular in high society- and counted Oscar Wilde and Arthur Conan Doyle as close friends.” Ah, such a small detail in this particular article, but a huge win in terms of my topic model research! If he was in fact friends with this woman, I’m sure that her scandalous personal relationship with a married man was an inspiration for his writing, which would explain why “Secret Letters” is the largest peak on this graph.
Here are the words that fell under my “Secret Letters” classification, for reference:

word men american message words english short picture affair change give single letters copy criminal figures meaning agony dancing hilton

Compare that list with Langtry’s story: “english,” “affair” and “american” all appear in it. Langtry was English, as was Edward; they had an affair; and she eventually immigrated to America following their secret romance. Coincidence? (I hope not, because that is a pretty interesting find if I do say so myself!)

According to the article, “Langtry is rumoured to have been the inspiration for the character of Irene Adler in Arthur Conan Doyle’s Sherlock Holmes tale, A Scandal In Bohemia.”


Crime topic model graph

Now, onto a rather complicated-looking graph on crime! This graph is divided into four topics: homicide investigation, house fire/arson, stabbing and detective. There are about four peaks on this graph between the end of 1903 and October of 1904, and I set out to see whether there were any reasons behind them, aside from the possibility that they simply reflect publication dates. Here are my findings:

I wasn’t really sure where to start with this, so I began with a general Google search of “1903 crime UK.” I then stumbled upon a Wikipedia page on gun control laws in the United Kingdom, one of which, passed in 1903, dealt with pistols. From there, I left Wikipedia and searched “1903 Pistol Act UK” and found a VERY helpful resource page that may in fact show why crime, homicide and police activity were so prevalent around the time a gun control law went into effect. Gun violence must have happened before then in order to prompt an act to control guns in the first place.

According to the Dunblane Resource sheet, the act required that each gun be registered and not be carried by a minor or felon. As we know, most criminals do not follow rules, so maybe this is why there’s a rise in all of the categories in my topic model.

Another famous case that we may be able to connect to “homicide investigation” being the largest of all the peaks in 1903: George Chapman, a convicted poisoner whom some suspected of being “Jack the Ripper,” was hanged on April 7, 1903. (The Ripper himself was never conclusively identified.)

Physical Descriptions

Physical descriptions topic model graph

After a bit of intense research on scandal and crime, we come to my final topic modeling graph, on physical appearances. This was a softer topic, which made it harder for me to find connections. The trends weren’t very in sync with one another. Apparel peaks high twice, around 1891-1892. Those are the years in which the stories collected in “The Adventures of Sherlock Holmes” first appeared; the collection itself was published in 1892. Given the subject matter of those stories, I can infer that descriptions of people’s apparel spiked here simply because the stories themselves were published then.

Well folks, there you have my take on topic modeling with graphs! Thanks for reading.

