MALLET Topic Modeling (Part Two)


I experimented with several different settings to find my topics. In all I used five combinations (50 topics, 100 iterations, 10 words; 50 topics, 2000 iterations, 10 words; 50 topics, 1000 iterations, 20 words; 100 topics, 2000 iterations, 20 words; 100 topics, 1000 iterations, 20 words) to find my 10 topics. The topics I chose seemed very similar and gave a basic overview of what the Sherlock Holmes series could be about. Some words in each general topic came from one specific story, but they still gave insight into what the series as a whole is about.
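For anyone who wants to repeat these runs, here is a rough sketch of how the five settings could be turned into MALLET `train-topics` commands. It only builds and prints the commands; it does not run MALLET. The path `bin/mallet`, the input file `sherlock.mallet`, and the output file names are my own placeholders, not the files actually used for this post.

```python
# The five (topics, iterations, top words) combinations listed above.
SETTINGS = [
    (50, 100, 10),
    (50, 2000, 10),
    (50, 1000, 20),
    (100, 2000, 20),
    (100, 1000, 20),
]


def build_command(num_topics, num_iterations, num_top_words,
                  mallet_bin="bin/mallet", input_file="sherlock.mallet"):
    """Build one train-topics command as a list of arguments.

    mallet_bin, input_file, and the keys file name are assumed
    placeholders; adjust them to match your own setup.
    """
    keys_file = f"keys_{num_topics}t_{num_iterations}i_{num_top_words}w.txt"
    return [
        mallet_bin, "train-topics",
        "--input", input_file,
        "--num-topics", str(num_topics),
        "--num-iterations", str(num_iterations),
        "--num-top-words", str(num_top_words),
        "--output-topic-keys", keys_file,
    ]


COMMANDS = [build_command(*setting) for setting in SETTINGS]

if __name__ == "__main__":
    # Print each command so it can be pasted into a terminal.
    for command in COMMANDS:
        print(" ".join(command))
```

Writing each run to its own keys file makes it easier to compare the word lists side by side when deciding which topics to keep.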

The more obvious topics that relate to the Sherlock Holmes stories include Murder, Investigation, Home/Baker Street, and Observation. It makes sense that terms relating to these topics would be found most often in the stories and be grouped together. Holmes investigates murders (as well as other crimes), lives on Baker Street (where he meets with almost all of his clients when they ask him to investigate for them), and has very advanced powers of observation, which he uses to solve the cases he takes on. These four topics provide a basic premise of the Sherlock Holmes stories: who is involved, where they are set, and what the key plots are.

The other topics that I chose may not at first glance seem related to Sherlock Holmes, but they are very important in understanding Sherlock and his methods. Money is a topic that may seem out of place, but money is seen as being very important to Sherlock in the stories. Holmes would always like to know in advance how much he will be paid. One can infer that he is so confident in his abilities that he thinks anyone can pay him in advance. Another topic I found was Actions, most interestingly the words “sit” and “laid.” This shows that Sherlock is very comfortable with his surroundings as he listens (another action word) to his new clients' stories. A topic that may seem like a stretch to connect to Sherlock Holmes is Married Life. The terms in this topic start out positive but soon become negative, with words like “spite” and “hate,” which suggests that Sherlock could deal with marriages gone wrong. Clothing (and its descriptors) and Writing can almost be lumped together because they are the two things that Sherlock observes the most when he is trying to solve a case. The last topic, Time Measures, included the words “time,” “week,” and “days,” so the reader can tell that even though Sherlock is a great detective, it may take him some time to figure out the cases that he takes on.

The MALLET tool was easy to use and let me change the settings as often as I wanted until I got results I could work with. I believe the topics that I found, even the ones that don’t seem obviously connected to the Sherlock Holmes stories, provide useful background for someone trying to get the general themes and ideas behind the stories, whether they be Digital Humanities students like us or just first-time readers.

