Topic Modeling trends – Using Google Fusion Tables

I have chosen abstract topics that are not closely related to history. Nonetheless, I observed thematic connections between them, so I divided them into four groups.

The related topics in each group tend to peak during the same periods, suggesting that Arthur Conan Doyle was writing about related themes at each point in time. Particular concentrations can be seen in 1891-1893 and 1904-1905. After 1908, stories were released at a steady rate until the 1920s.

Chart 1: topics 4, 10 and 15 – Investigation, Mystery and Violence

In February 1892, we can see the highest peak of the whole graph, which belongs to the topic “mystery”. This was the release date of The Speckled Band, a story full of words related to mystery, as our class well knows. The peak of “violence” (April 21, 1893) is the release date of The Gloria Scott, a story that ends with a death, and death-related words fall within the “violence” topic. The peak of “investigation” (September 16, 1893) corresponds to The Greek Interpreter, a story involving kidnapping and intimidation, which are material for “investigation”. “Mystery” seems to be the most important topic in the eight stories from 1904, as it stands out from the other topics.
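For readers who would rather locate peaks like these programmatically than by eye, here is a minimal Python sketch. It assumes the topic weights have been exported as rows of (release date, topic, weight). The function name peak_per_topic, the row layout, and every sample value below are hypothetical placeholders, not data from my charts.

```python
from datetime import date

# Hypothetical rows: (release date, topic label, topic weight).
# All dates and weights are invented placeholders for illustration only.
rows = [
    (date(1900, 1, 1), "mystery", 0.20),
    (date(1900, 2, 1), "mystery", 0.90),
    (date(1900, 3, 1), "mystery", 0.40),
    (date(1900, 1, 1), "violence", 0.10),
    (date(1900, 3, 1), "violence", 0.70),
]


def peak_per_topic(rows):
    """Return {topic: (date, weight)} for the highest weight of each topic."""
    peaks = {}
    for when, topic, weight in rows:
        best = peaks.get(topic)
        # Keep the first row seen for a topic, then any row that beats it.
        if best is None or weight > best[1]:
            peaks[topic] = (when, weight)
    return peaks


for topic, (when, weight) in sorted(peak_per_topic(rows).items()):
    print(topic, when.isoformat(), weight)
```

Each peak date can then be matched against a list of story release dates to see which story it corresponds to.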

Chart 2: topics 14, 16 and 26 – Time, Location and House

The most notable features here are the peaks of “Time” on March 16, 1892 (release of The Adventure of the Engineer’s Thumb) and “House” on February 1, 1911 (release of The Disappearance of Lady Frances Carfax). The first story takes place over the summer (a time aspect), and the second involves a pursuit through residential settings.

Chart 3: topics 5, 8 and 29 – Conversation, Relationship and Appearance

The principal trends in this graph are a large peak of “Relationship” on September 1, 1891 (A Case of Identity, a story about marriage and the relationship between a stepdaughter and her stepfather) and a growing presence of “Conversation” in the stories between 1893 and 1903.

I have added topic 27, “Sitting”, from my 40 topics to my list of 10 favorite topics.

I have chosen to leave this topic, the one most different from the others, on its own in the fourth chart. “Sitting” includes words such as “chair sat room fire bell laid asked lit lamp”.

The first peak corresponds to The Boscombe Valley Mystery (October 16, 1891), a story that involves traveling by train and by carriage, actions that might bring in terms around “Sitting”. The second peak coincides with The Adventure of Wisteria Lodge (September 1908), a story that takes place inside a house and so contains terms related to “Sitting”.

All the charts in:

