Text Analysis Tools and their Silences


*Note: throughout this blog post I will be discussing text analysis tools, and their ability (or lack of it) to connect with a text, mainly in terms of Monk. Since I worked with Monk for the year, it is the tool I feel most comfortable basing this post and its assumptions on. I make no assumptions about the other tools, since I have not worked with them in depth and am not as familiar with them.*

Well, this is the final blog post, the last one of Hamlet in the Digital Humanities. Since the final blog post is a “biggie”, I figured I should write on something that I have been thinking about constantly since the beginning of the course: quantitative and qualitative analysis. The Digital Humanities is all about looking at texts from a computer-based, data-driven perspective to find more ways of locating information, otherwise known as quantitative analysis. That is a different perspective from the one people are used to, which is qualitative analysis. This year we analyzed Hamlet both qualitatively and quantitatively and looked at how the two relate and compare to one another. Looking at texts through computers seems to be a growing trend, and I was wondering what effect this may have on the original qualitative analysis of texts and how the two differ altogether. I believe that new technology in the Digital Humanities may create “silences” in the meaning and understanding of a text, similar to how digital texts have created “archival silences”: the more digital we become with text analysis, the less involved we seem to get with the text and with understanding it from its original roots.

Lost Voice

“Archival Silences”. Where do I begin to explain the tricky term of Archival Silences… I guess that depends on which definition you are looking at. In Kate Theimer’s blog post “The Two Meanings of Archival Silences and Their Implications”, she describes archival silences in several different ways:

1. Gaps or “silences” in a body of original records

2. Reference to materials that are not represented in the digital collections that have been marked up in ways that make them useful for research

3. Ways in which voices from the past are silenced

4. Those materials that have been digitized and made available online

After looking at these definitions and at what I wish to write about, I decided that, for this post, the silences I refer to are the gaps or voices that are lost in Digital Humanities text analysis tools, and the implications digitizing has had for our bodies of work.

Overall, I felt that with the Digital Humanities and text analysis tools there was an absence, or silence, in the connection to the original text. Even though our class spent time looking at Hamlet both on paper and through a text analysis tool, I still had a lot of difficulty connecting the two. After phase two I was able to relate them to each other, but I found that each could stand on its own. What I mean is that when you read Hamlet on paper you are able to understand it and pick up on certain themes and ideas, and you don’t need the digital version of Hamlet to grasp them. The same goes for the digital version: a text analysis tool takes the text and picks it apart, but it looks at it in a completely different way than a human does, through quantitative measures such as numbers, language and how often something appears.

Because the digitized version looks at the text on a different level than a person would, the original story and the themes that we pick up can be silenced. The digitized version only looks at the text in terms of words and numbers, not with the thought-provoking questioning or understanding that we get from reading it. If you follow this logic, we can say that digital text analysis tools have created their own silences, in that they are unable to pick up on human perception.

Hamlet and Text

To further test my question about archival silences that are created within text analysis tools I decided to look at the text that we have looked at throughout the whole year: Hamlet. I decided to do something basic and look at a common theme found throughout Hamlet which is “madness”. I wanted to see the ways in which human interpretation or qualitative analysis found this theme.

The theme of madness can easily be seen through one of Shakespeare’s best attributes: language. Shakespeare’s language is very rich, layered with meaning on top of meaning. The ability to look at language, make associations and read into its many meanings can be seen as a humanistic, qualitative feature, in that human emotion and understanding are able to see these layered meanings as well as the associations and feelings behind them. An example of madness can be seen when Ophelia has lost herself in Act 4 and the King states: “Poor Ophelia/Divided from herself and her fair judgment,/Without the which we are pictures or mere beasts” (4.5.80-81). This quote shows how madness has “divided” Ophelia, meaning that she is split into two separate parts: her body and “her fair judgment”. Her “fair judgment” refers to her senses, which she is no longer connected to. In turn this has made her a “beast”: without her judgment Ophelia is seen as nothing more than an animal. This also touches on the idea of humanity and what makes us a person. Here King Claudius suggests that Ophelia’s reason and “judgment” make her human, and that without them she is nothing more than an animal. We must also consider the idea of her being “lost”, in that it implies that she had her judgment and now it is gone. It also brings to light the idea that she may find herself again, and that there is hope that Ophelia may return to the girl she was and no longer be considered animalistic. This may also convey a sense of remorse in King Claudius’s choice of words, in that he is hopeful that Ophelia will get better and feels sorry for what has happened to her and the condition that she is now in.
This can also be seen in how King Claudius uses the word “Poor”, which conveys a sense of sorrow and sympathy towards her for what has become of her, in that the madness has turned her into something she is not. Overall, this quote shows not only where “madness” is found but also how it was described and related to in that day and age.

We can see through this brief analysis that there are many layers within Hamlet. By taking the language apart bit by bit we gain a greater understanding not only of the character Ophelia, but also of how people were viewed if they were seen as mentally unstable, and of the characters’ feelings towards one another. These few lines of Shakespeare speak volumes in references and meaning. They also give the reader a feeling, an understanding and a sense of emotion tied to his words. A computer, on the other hand, would have no way of analyzing text at this depth.

Gaps Created Because of the Digital Humanities

For this past semester I have been looking at Hamlet through the text analysis tool Monk. If you have read any of my previous blog posts you have probably realized that Monk is a frustrating tool which doesn’t tend to cooperate often, but it is still a text analysis tool.

Text analysis tools are used to gain a stronger grasp of the ideas within a text. They basically do what the name says: they analyze text. Each tool took the text of Hamlet, analyzed it in its own specific way and found some interesting things about it. This is where I think the silences begin. For me, working with Monk, it was difficult to connect the findings back to the text. This can be seen with the concordances: Monk shows you the word that you are looking for, but it does not indicate where in the text it was said or who said it.

Here I have looked up the word “madness”, and Monk displays how often the word appears in the text. This shows that madness appears 22 times in the play, and it also shows the phrase in which each instance was found.

To me this is a big problem, since as a class we were relying on these tools to give us information about the text. When Monk did give information, I found that it had little connection to the text itself. I could see the phrase in which Monk had found the word, but since I had no idea of its origin I had no concept of its meaning, and thus I had not gained a stronger grasp of the text itself. Even though you are able to look at the words and see the context which they fit in, you can’t relate it back to Hamlet because you don’t know its origin or its speaker. This shows the silences: there is a gap of information that is not being received or understood, and the tool just gives you data. To fully understand what Monk is trying to say about madness, I know I will have to go back to the text and sift through it myself to find out who said it and where it was said in the play.
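To make this gap concrete, here is a minimal sketch in Python of a concordance that keeps the speaker and the act and scene next to every hit. It is not how Monk works internally; the `Line` and `Hit` records, the `concordance` function and the four hand-typed sample lines are all assumptions made for illustration, and the wording of the lines may differ between editions.

```python
import re
from typing import NamedTuple


class Line(NamedTuple):
    """One spoken line, with the context a bare concordance drops."""
    act: int
    scene: int
    speaker: str
    text: str


class Hit(NamedTuple):
    act: int
    scene: int
    speaker: str
    left: str
    keyword: str
    right: str


# Hand-typed sample lines (an assumption for illustration, not a full text).
SAMPLE_LINES = [
    Line(2, 2, "Polonius", "Though this be madness, yet there is method in't."),
    Line(3, 1, "Claudius", "Madness in great ones must not unwatched go."),
    Line(3, 4, "Hamlet", "That I essentially am not in madness, but mad in craft."),
    Line(4, 5, "Laertes", "A document in madness, thoughts and remembrance fitted."),
]


def concordance(lines, word, width=30):
    """Return every occurrence of `word` together with speaker and location."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = []
    for line in lines:
        for match in pattern.finditer(line.text):
            left = line.text[max(0, match.start() - width):match.start()]
            right = line.text[match.end():match.end() + width]
            hits.append(Hit(line.act, line.scene, line.speaker,
                            left, match.group(0), right))
    return hits


def format_hit(hit, width=30):
    """Lay out one hit in keyword-in-context style, location first."""
    return (f"{hit.act}.{hit.scene} {hit.speaker:<9} "
            f"{hit.left:>{width}} [{hit.keyword}] {hit.right}")


for hit in concordance(SAMPLE_LINES, "madness"):
    print(format_hit(hit))
```

With the speaker and location carried along, each hit can be traced straight back to the play instead of standing alone as a phrase.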

This is also shown with Monk’s unique tool, Naive Bayes (found with the decision tree), which takes the theme that you have chosen and shows how confidently it can be found throughout the text.

Once again I am shown data and information, but I have no idea why these words are associated with the theme or the context that they are spoken in. Even though Naive Bayes does show you the common words associated with a theme, it does not show you how the tool picked up that idea or theme. To me this means a lack of proof of what Monk actually found and how it can be useful. It leads once more to a silence: data and information without an actual connection to the text, and thus to the reader.

In phase two my group and I decided to try to connect the text to the theme of spying and surveillance, which we found extremely prevalent in Act 2 through reading. This was a way for us to bridge qualitative and quantitative analysis while trying to erase the silences that some tools created. I found this to be an extremely helpful way of bridging the gap between human and computer understanding. I did feel, though, that our presentation mainly used the tools to strengthen a theme we had already picked up on. I believe that with the combination of all 5 tools we still would have picked up the theme of spying and surveillance even if none of us had ever read Hamlet or understood the themes found within the text. However, I am unsure of how well we would have been able to understand and grasp the rich language that Shakespeare uses. As mentioned in one of my previous blog posts, some words which we think of as conveying strong themes throughout the play don’t even show up. The language used in a specific context seems to hold importance for human understanding, where a computer may miss it or add “silences”.

Overall, I felt that the text analysis tools did create “silences”, leaving out things that belonged to the text. When looking at both text and tool I felt that there were still some aspects that were not being fully understood, even though the computer gave me an answer. The best way I can describe it is the difference between solving a math problem by hand and understanding all of its parts and particulars, and picking up a calculator and having the answer staring me straight in the face with no idea of how it got there. This makes me wonder about our future understanding of books and novels. Will there be a “calculator phase” that will just show us the answers while we have no idea of how they got there? Can we really consider this diving deeper into a text when it just shows us the answer or tells us how often something appears?

The In-Between

A main question I kept coming back to throughout the whole experience was the point and validity of quantitative analysis. Yes, it is interesting to see things broken down from a purely numbers-based perspective, but I still felt that you had to rely heavily on the text itself to fill in the “gaps” or “silences” that are created by taking a text that we understand qualitatively and putting it into a quantitative format.

Now don’t get me wrong and think I am saying that the tools are all useless and evil; that is definitely not the case. I am merely suggesting that as we move into this new era of digitizing we still need our original texts to fill in the blanks that we don’t really understand, so that text analysis tools become a help to our understanding but not something we depend on. I am a strong believer in the original form of understanding text. I think it is important to go through a text and pick it apart the old-fashioned way, similar to how you won’t understand someone until you have walked in their shoes. To me the text is the shoes, and to fully understand and comprehend something you need that text. Without it you may grasp a concept or idea that is being presented, but there will be gaps.

What the future Holds

Kate Theimer writes about how these archival silences have implications for us, and how one day “that which is not available digitally will become equated with that which does not exist”. I guess you can say we have similar fears, in that there may come a time when text analysis tools have taken over the concept of reading and understanding a book the old-fashioned way.

This also relates to technology today, in that there are so many different ways that people communicate with one another and yet, at the same time, don’t. If you look at texting, Facebook chat, or even talking on the phone (which seems old-fashioned nowadays), they are all about finding and passing information to one another fast while missing the human connection of emotion. There are countless times when someone will get mad over some computerized message because the human connection and emotion behind it is lost. This is similar to the silences found within text analysis, in that the deeper meaning and context is essentially lost in this phase of understanding and processing at a fast pace.

I can’t help but wonder if this will happen to our books and if they will “be or not to be” (3.1.58), meaning whether they will be able to survive in an ever-expanding digital world. I hope we don’t lose sight of the text and what it has to offer us, because without it there will be a million unheard, unrecognized voices that will one day go silent forevermore.



Works Cited List

Shakespeare, William. Hamlet. Stephen Greenblatt, Walter Cohen, Jean E. Howard and Katharine Eisaman Maus: Oxford University Press, 2009. Print. The Second Norton Edition.

Hamlet & Monk (and my brain) in Hibernation

I’ve decided to start this blog off on a completely negative note (something completely unusual for me, I know) by stating that this will probably be my worst entry to date and that it may lack in all things relating to making sense. I’ve gotten lots of really positive feedback concerning my last post, which has been awesome. However, I honestly feel like there is nothing intelligent left in my head to put down on ‘paper’ today. An overload of essays and papers and presentations has simply put my brain in a state of hibernation. As much as I am trying to focus, I am consistently finding myself looking at the wall with a blank stare on my face. That being said, I will try my absolute best to give everyone an update on the wonderful world of Monk and its progress with Act 4 and phase 2 as a whole!

Like I said in my last blog post, I had a rough idea as to what I, and the rest of my group, had planned on doing in regards to incorporating Monk into a hopefully helpful position for this new phase. After a little more research, it seems as though this may be achievable! I am still having problems, though, with getting Monk and its workset comparisons tool to work. I find it positively frustrating that, unlike other analysis programs that we learned about in class, we have no way of communicating our issues or concerns to the creators of Monk. They truly did abandon ship on this project. ’Tis quite saddening. But there is really nothing we can do about that, especially at this stage of the game. At least from all of this I have become an expert on finding ways around issues! Or in other words, completely disregarding the original idea and moving on to something that is actually accomplishable.

My trusty phase 2 group has decided that it will work best to resort to a nice ol’ reliable flow chart. Everyone’s programs were strategically placed so that each may do its part and then give its findings to the next in line so that more results will be produced. We begin the chart with Tapor. This program is able to define its own ‘worksets’ (pardon the Monk lingo) by specifically stating exactly what it wants to examine, whether it be a full act or simply a speech. Kira then hands these documents off to Katy, who is able to grab hold of word frequencies for specific characters. Finally, Wordhoard, Wordseer, and Monk (Allison, Ayesha, and myself) are all able to take these word frequencies and see the context in which they arise in regards to the particular characters that we are taking closer looks at. More specifically, we will compare the commonly used language between characters in different plays. I displayed an example of this in my last blog, but just for a refresher, we will be comparing the relationship and the language used between the pair Gertrude and Claudius in Act 4, scene 1 and Emilia and Iago in Act 5, scene 2.

If only we could make all difficult tasks and challenges in life into nice little flowcharts! Hopefully our chart in regards to our research and eventually our presentation works just as smoothly…

I realize this is still a very rough draft, but I do feel like we have made a decent amount of progress. Everything is sort of at a standstill while we continue to figure things out individually. We at least know the direction we are heading in and what we are looking to eventually accomplish. I also feel that as we use our programs more to reach these initial goals, we will be able to discover other things or tools that may prove useful for our final presentation. Am I trying too hard to end this all on a positive note? That is for me to know, and you to ponder…

Digimon and Divination?

It is a grey day. Warm with snowflakes like glitter. Someone down the hall seems to be having a workroom party, which they are all quite content with; you can just tell by the laughter. But we are instead lost in a different world; a digital world, if you will. One with so much information compiled and cross-linked that it encompasses the realm of human experience, and encodes the most significant events, works, and experiences as data. It is a place where you can indulge in the works of a man who lived in the 1600’s, and divine new secrets 400 years later. When you really think about it, it is fantastic; unbelievable almost.
Yet, at the same time, it is another new day in the Humanities, with a lot of planning done. Today was devoted to pre-project-planning (say that three times fast). Although there were not too many new discoveries, there was the exploration and expansion of the old ones. Monk, of course, is a mining tool, meaning that the more you work the more you will discover. As it is, I have been finding more uses for the NaiveBayes and decision tree tools. They might be unconventional, and a little hit-or-miss, but the results are pretty exciting!
In the classification tool you can find NaiveBayes, under which you load your worksets and rate them. I found that rating each scene with a theme will give me the words that make the predicted theme true or false. Thus, searching for confirmation of the theme “madness” elicits words that have some cryptic connection with that theme, such as the word “armour,” which has to do with the armour of the mind… From there, you have to make some good old-fashioned English major connections and argue your findings; something that we are all experts at. My idea is that the armour of the mind refers to its sanity… which is slowly broken down by lies. Etc.
Anyway, you get the point. This is what my program is best at in comparison to the other programs. They have the frequency, concordance, and description tools, but this seems to be a unique feature of Monk. The biggest question now is if it can be useful enough to present. That is the question for next time.
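For anyone curious how rating scenes can produce a list of suggestive words, here is a minimal Python sketch of the general Naive Bayes idea. It is not Monk’s actual implementation, which I can’t see into; the function names, the two labels and the tiny hand-made word lists are all assumptions for illustration, and the toy “scenes” are not real Monk output.

```python
import math
from collections import Counter

# Toy "scenes", each rated with a theme by hand (made-up word lists).
RATED_SCENES = [
    ("madness", ["mad", "wit", "armour", "mind", "mad"]),
    ("madness", ["mind", "distracted", "mad", "armour"]),
    ("other", ["king", "crown", "sword", "mind"]),
    ("other", ["king", "queen", "crown", "love"]),
]


def train(docs):
    """Count words per label and how many scenes carry each label."""
    word_counts = {}
    doc_counts = Counter()
    for label, tokens in docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
    return word_counts, doc_counts


def log_likelihood(word, label, word_counts, vocab_size):
    """log P(word | label) with add-one (Laplace) smoothing."""
    counts = word_counts[label]
    return math.log((counts[word] + 1) / (sum(counts.values()) + vocab_size))


def informative_words(word_counts, positive, negative, top=3):
    """Rank words by how strongly they point to `positive` over `negative`."""
    vocab = set(word_counts[positive]) | set(word_counts[negative])
    size = len(vocab)
    scores = {
        word: log_likelihood(word, positive, word_counts, size)
        - log_likelihood(word, negative, word_counts, size)
        for word in vocab
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]


def classify(tokens, word_counts, doc_counts):
    """Return the most probable label and the log score of every label."""
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in word_counts:
        score = math.log(doc_counts[label] / total_docs)
        score += sum(log_likelihood(t, label, word_counts, len(vocab))
                     for t in tokens if t in vocab)
        scores[label] = score
    return max(scores, key=scores.get), scores


word_counts, doc_counts = train(RATED_SCENES)
print(informative_words(word_counts, "madness", "other"))
print(classify(["mad", "armour", "king"], word_counts, doc_counts)[0])
```

The ranked list is only a statement about which words turn up more often under one rating than the other. Why “armour” belongs with “madness” is still left for the reader to argue.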

The words are supposed to be suggestive in conjunction to "madness"

It is not the most succinct method of analysis, but there is still time to work with it, and it does prove to be interesting every time. For example, “black” appears five times in Act 3, and it is always in a very negative context:

Results for "Black"

In case you were wondering about the title and the bit of writing at the beginning, it just occurred to me that the premise of one of my absolute favourite childhood shows has an abstract relation to the Digital Humanities. That show was Digimon (I know, I know), in which there existed an alternate dimension that housed a world made in the image of the earth, with fictional monster-type inhabitants. If you know the show you might remember that the digital world was created by the compilation of data that is stored in computers and over the internet. First the foundations were laid, and “Over the ensuing years, through the continued growth of the electronic communications network on Earth, the Digital World continued to expand and grow” (http://digimon.wikia.com/wiki/Digital_World). It’s a little bit silly, but it is an accurate depiction of not only the information amassed on the internet, but of the Digital Humanities itself, which must hold significant portions of the literature that shapes the world we live in. Literature is made in the image of the earth and of human experience, and the characters that inhabit it are in the image of its creatures. The depth that it reaches is too great to measure. Is it too far a stretch to say that the universe of data is alternate to the universe of reality?
Just a thought.

Where does the world end and data begin?

Simple is Best- Well… at least for Monk it is…

Looking back on phase two, I find it neat to look at the text and analyze it through different tools and methods. I thought the combination of tools was very helpful, in that each tool looks at things from a different perspective, similar to how a different person looks at a text. I decided for my final blog post to go simple, and look at the words and themes found in Act 2.

I decided to go back to the old-fashioned way of analyzing and read! I re-read the text to try to find some other themes, and a common one I found throughout was public and private actions. The act is about how individuals try to intrude on private moments and actions in order to make them public. This also seems to be centered around one character: Polonius.

Polonius seems to be that annoyingly pompous guy who always has to know what is going on, and you know that when Polonius is around trouble is going to happen. Why does Polonius have this insistent need to be a know-it-all? His desire to know it all and be in the middle of everything can be seen as the thing that brings him down and kills him. This is seen in the fact that he is killed behind a curtain, spying. However, his outward appearance seems to be the opposite, given that the King describes him as “a man faithful and honourable”.

From here we can see that Polonius’s outward appearance to the King is one of high and noble status. This makes me think that Polonius cares about how others perceive him, maybe to make himself feel better.

I also looked at “truth” and found that it is mentioned 3 times within Act 2. All 3 times it was used in relation to finding or seeking the truth. It always came back to the idea of knowing the truth and being aware of what was real and what was not.

This relates to Polonius in his constant need to seek and find the truth. It also relates to the ways he goes about finding it, which mostly involve sneaking around and using spies. The aspect of truth also relates to the King and Queen and how they feel that they must know the truth about Hamlet and his current state of being, whether he is mad or not, and his relationship with Ophelia. I also looked at the word “hid” and found that it relates closely to the word “truth”, in that it was used for covering up the truth and keeping it secret and hidden. This once again touches on the idea of things being kept public and private, in that everyone wants to keep their personal views private and everyone else’s views public.

It seems like within Hamlet it is a constant power struggle of knowledge and who knows the most and how they can use this information to their gain and knowledge while keeping their views private and away from what everyone else thinks. Act 2 seems to revolve around this idea of knowledge and power, who has it and how can it be used to your personal advantage.

Even though Monk isn’t that fancy or considered a great tool, sometimes simple is better, and with Monk it is either simple or really complicated and complex. In either situation, however, I find that you have to know and relate to the text thoroughly. Having Monk as a tool has really shown me not to rely fully on a tool for pure information, and I find that the work ends up being equal parts Monk and my own knowledge.

Collaboration Time! — Monk, You’re Not Invited

We decided in our meeting today that we would try to combine our tools in order to discover more about our characters and how they develop throughout Hamlet. After we re-familiarized ourselves with each other’s tools, we began our collaboration. Monk unfortunately didn’t seem to be of much help (sorry guys…), so I spent my time trying to figure out how everyone else’s tools could help me.

I began with Richelle’s tool Wordseer, and was intrigued by her visualization tool, the Heat Map.  Since I’m studying Claudius and Gertrude’s development, I thought a good place to start would be to search up when the words Queen, Gertrude, King and Claudius are used.

As I understand it, this only shows me when these words are said; it does not include the speakers. Nonetheless I found it to be interesting.

An issue that I have with Monk (well, one of the issues) is that when I look up a lemma or a concordance, it doesn’t tell me the speaker, or where the word is used within the play.  It only gives me this:

I wanted to find out who in the play uses the word brother. I was hoping it would be Claudius speaking about his brother, but Monk won’t show me that.

So I asked my group members if any of their tools could do that, and Dayna said that WordHoard can. Excellent! Another tool that I can use.  So I decided to look up the same word (brother) as I had in Monk, so I could get more accurate results.

I filled in the criteria in WordHoard:

And got my results!

I guessed correctly! The only time the word brother is used in Act 1 Scene 2 is when Claudius is making his first speech. I’m definitely planning on using this function when I look deeper into my characters’ development.

In sticking to my ‘brother’ theme, I moved on over to Voyeur, to see what it could do for me. I remembered from the Voyeur presentation that this tool could compare word frequencies, and I knew I wanted to use this feature. I asked Ruby, the Voyeur expert, how I could do this. After she gave me a rundown of the tool I was able to work on my own and search for words that I felt were relevant to my characters. I decided to look up brother and guilt, in relation to Claudius’ guilt about killing his brother:

I am pleased with these results, but I would like to be able to find the moments in the text where these two words overlap.  Which actually I think I might be able to do, but I’m going to have to ask Ruby to help me out on that.

Finally, I decided to give Monk another try, and see what it would give me. The two scenes that I wanted to focus on in terms of Claudius’ development were Act 1 Scene 2 and Act 3 Scene 3, the first being his opening speech, when he talks about his brother’s death, and the latter being when he confesses to murdering his brother. I created a workset of both these acts, and rated them as love or tragedy. The reason was that I wanted to see if Monk classified my two scenes as tragedies, in comparison to the other scenes in those acts.

Based on the words used in these scenes, Monk is more confident that Act 3 Scene 3 is a tragedy than Act 1 Scene 2, but Act 1 Scene 2 is still considered to be more of a tragedy than other scenes in these acts. Well done Monk, you’ve actually given me results that can help me.

I feel that now that I have a better understanding of everyone else’s tools (well except for Tapor…although I feel it might be as unhelpful as Monk, no offense Tapor experts) I will be able to become even more focused on my individual characters.  I hope to be able to learn more about Claudius and Gertrude, crossing my fingers that these tools will let me do that.

Marry, this’ miching malicho; it means mischief.

This is easily one of my favourite lines in any Shakespeare play. Why? Because the words befit the meaning in a style that is all their own. And I cannot help but think that if Shakespeare himself knew that twenty-five young adults had been set free with the power of technology to analyze his plays, he might think that a mischief all its own.
In our own little sect of madness we got off to a bumpy start. We were all “masters” of our respective programs, but how do we compare them? How can we link each advantage and rate them? How many of the tools overlap in use? And what becomes overshadowed by a newer, better tool?
Most of all, how can we find out?
We needed a common ground. Something inside Hamlet that every person can identify. Which is of course madness, something every hard-working university student has met with at least once, but besides all that it is a theme within Hamlet that everyone will decipher differently. Is he sane and acting? Is he crazy from the start? Is he driven mad by his own efforts? Hamlet will always be a mystery so long as space-time continues.

Where we are now:
Since we had a goal in mind, we were able to find the means. Within the different programs, frequency searches, Naive Bayes, concordance searches and “described as” searches have all proved useful. We are able to track down suggestive words through Naive Bayes, and then put them into other searches to divine meanings. The other cool thing that we have been finding is the ability to compare Hamlet to other Shakespeare tragedies. “Madness” appears in Hamlet 22 times! The next most frequent is probably Romeo and Juliet at 11 times. That is a huge jump. So we know that Hamlet is focused on madness; now we just need to find subtle hints, recurring themes and general meanings that can help to indicate the true madness of Hamlet, or the play he puts on for everyone.
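The kind of cross-play comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what MONK or the other tools do internally; `compare_frequency`, `tokenize` and the two placeholder “plays” are made up for the example, and you would swap in the real play texts to reproduce counts like the ones we found.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def compare_frequency(texts, word):
    """For each title, return (title, raw count, count per 1000 words),
    sorted so the highest raw count comes first."""
    rows = []
    for title, text in texts.items():
        tokens = tokenize(text)
        count = Counter(tokens)[word.lower()]
        per_thousand = 1000 * count / len(tokens) if tokens else 0.0
        rows.append((title, count, per_thousand))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder strings standing in for full play texts (not real data).
TEXTS = {
    "Play A": "madness and more madness in the mind",
    "Play B": "love and madness",
}

for title, count, rate in compare_frequency(TEXTS, "madness"):
    print(f"{title}: {count} occurrences, {rate:.1f} per 1000 words")
```

Reporting a rate next to the raw count is a deliberate choice: plays differ in length, so a longer play can show more occurrences without being any more focused on the word.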

The uniqueness of our Act has been coming out slowly as well. We know (not necessarily because of the digital humanities) that our Act contains much of the most important action in the play. The “To Be or Not To Be” speech appears, as well as “Get thee to a nunnery,” the play performed for Claudius, the confrontation of Gertrude, the murder of Polonius, etc! There is simply a ton of stuff to research and a lot to discover.
Most importantly for next time we must study:
The use of “poison in the ear” as a metaphor.
Any reference to the mind such as:

Every instance that describes a character as “mad.”
And really anything else we can think of.
So that is about it for part 1 of Phase 2. We have a strong Act 3 team, with only a few hiccups, and some illness 🙁 and hopefully there will be more success to report on the next post. Right now there are just too many questions! It’s pretty amazing what we can do though. What has taken minutes on MONK or Voyeur, etc, would have taken months in the traditional way. Could you imagine going through every Shakespeare tragedy and noting the use of the words: “mad” or “madness?” It sounds crazy, and yet that is what the creators of these programs have done for us. We are grateful 🙂

MONK’s “Tragic” Words: A continuation

As a continuation of my last post

In my attempt to discover words that may participate in MONK’s classification of Act V as more tragic, I found myself being led in another direction of attempting to figure out why MONK insisted on classifying Hamlet as a ‘half-tragedy’ in comparison to the other words. My discoveries in individual word frequencies were interesting, as it would seem that they would contradict the ‘half-tragedy’ classification that MONK previously made. In other words, MONK seems to have contradicted itself.

In comparing the tragedies to all of Shakespeare’s plays, MONK has returned me with the following data:


The first verb that MONK provided on the list as appearing most frequently in the tragedies in comparison to the rest of Shakespeare’s plays was “swear.”

Upon selecting the word to see the break down of frequencies, I was provided with the following information:

“Swear,” as it appears in all of Shakespeare’s tragedies, appears most frequently in Hamlet.



To satisfy my own curiosity, I scrolled further down the list and selected a word that seemed less likely to appear in a tragedy, but still one I did not remember reading that frequently when I did my own reading of the Hamlet text. Selecting ‘smile,’ I was provided with the following chart:

In terms of the number of times the word “smile” appears in the tragedies, it appears most frequently in Hamlet.


I assure you, this pattern remains consistent throughout the list of frequencies that MONK has provided me.

I remain uncertain whether these results are being affected by the glitches and malfunctions that MONK has been experiencing as of late, but this does raise an interesting question:

If MONK’s data hasn’t been affected by its recent problems, where does this leave us with understanding Hamlet as being classified as a tragedy? 

If the words being provided by MONK as most frequently occurring in Shakespeare’s tragedies in comparison to the rest of his plays all appear most frequently in Hamlet, why is it then, that Hamlet is the play that is most frequently classified only as a ‘half-tragedy?’

This is a question that is beyond MONK or my own understanding to fully grasp, and so, it is my hope that the tools of my group members can take this information and further analyze it to bring us closer to an understanding of what this all means for Hamlet as a whole.

Perhaps it is not these tragic words that can be the basis for our classification of Hamlet as a tragedy. Perhaps we must take the comedic words used in Hamlet to understand why MONK refuses to accept it fully as a tragedy?

These are all questions I hope to have answered in my next blog post, as I believe that these answers will guide me to an interesting discovery about Act V in relation to Hamlet as a whole.




MONK: To be, or not to be?

In all of the discoveries that I have almost made, it seems that MONK has made its decision to ‘not be.’

Unable to create worksets that could be compared for word frequencies, which my group discussed as a good initial focus today, I have found myself at a loss of anything useful to blog about other than how this program has refused to co-operate with me. However, it occurred to me today, that perhaps for the sake of my group I shall force MONK to hand me something useful.

Yes, I do mean force.

In the interest of figuring out what classifies Act V as ‘more tragic’ than Hamlet, I began to use the preset corpus and genre worksets in order to determine which words were frequently used by Shakespeare in his tragedies. The following is what I learned in this endeavour.

It is worth mentioning, I think, for those of you that are familiar with MONK, you know that it has this irritating stubborn thing where it just refuses to remember the options that you have selected to search with when you hit previous, so this process was a long and arduous one.


To begin, I chose to compare the preset workset of all of Shakespeare’s plays with the preset workset of his tragedies, in order to determine which words were unique to his tragedies. I was returned with these:

The words provided in this list are those words that appear most frequently in the comparison between all of Shakespeare’s plays and all of the tragedies.

When I select the word “justify” I am provided with a graph of the frequency of that word across the time span of Shakespeare’s writings:

I found it interesting that the year the word “justify” peaked was roughly around the time when Hamlet was written, and so I hit ‘continue’ in order to see the plays in which this word occurs and in which play it occurred most frequently.

The circulation period I was most interested in was between the year 1600-1610. Finding that time frame on the list, this is what I discovered:

The word ‘justify’ occurs more in Hamlet than it does in any other play in this time period.

It also appears more in Hamlet than it does in any other play, and all the plays on this list in all the time periods, were tragedies.

Going through the list, I found similar words of interest to tragedies (not just in Hamlet). For example, the word ‘rehearse’ appears only, or most frequently in this comparison, in tragedies.

Using words like this, I think it will be of interest to our group in analyzing Act V.


I believe that because Act V was classified by MONK as more tragic than the rest of the play, these words will be helpful in assessing why MONK has made this classification and it will provide a starting point for the other frequency analyzing tools in gathering further interesting analysis about Act V.

The building similarities of Ophelia and I.

Apparently you can run from the problems that arise with Monk, but you most certainly can’t
hide. My old enemy ‘frustration’ was presented to me once again after I began
looking for ways in which my program could prove itself to be useful in the
final stages of researching our text analysis programs. Of course when I expect
things to go slightly better than they previously have, they never do. Last night
I settled in at my desk to do some exploring of my program. I wanted to find
even an ounce of value from Monk that I could present before my group the
following morning. I knew this may be a difficult task, but I never expected it
to be as excruciating as it was. I came across a problem that was brand new to
me. I had never experienced this before, although I have since then discovered
that others in my Phase 1 group had.

It began with me trying to define a few new worksets that I could take a closer look
at, and eventually be able to compare different acts from Hamlet in hopes that
this would perhaps come in use for Phase 2. But shockingly (note my sarcasm)
Monk has decided it is no longer allowing me to have the ability of defining my
own worksets. More specifically, I am able to create a workset labelled “Act
One” but when I go into the compare worksets option, it tells me that I haven’t
created anything new. I honestly tried doing it about 100 times before I gave
up all hope. I called in for reinforcement, and my old trusty phase 1 friend,
Hayley, was there with a helping hand. Unfortunately our combined brain power
was not enough to make it work. We tried everything we could think of, but
regardless after downloading a new browser and countless different log-in
attempts, we sadly hung our heads in shame. Ok, not quite. But it was exasperating
to say the least! In the end, I decided I will give it a try on my grandpa’s
computer in the morning. If this doesn’t work, you’ll probably never see me
again as I will probably resort to the same fate as Ophelia. Hey non nonny

With the help of my group I was able to construct some fairly useful ideas of what
my stubbornly difficult program Monk can do. Well, at least I am hoping it will
be able to do. But for the time being I can still talk about what I PLAN on
doing. Since Monk’s main original purpose was comparisons, we figure that it
may be able to help us compare relationships between characters in separate tragedies.
To name an example, we can try and relate Hamlet to a fellow revenge-filled
character in Othello; Iago. Both are plotting murderous acts upon someone who
they feel has done them wrong. What I am thinking I might be able to do is
examine a specific speech of one of the characters, take note of words that
represent what I would assume could appear in the other play, and use the concordances
option in Monk to see if my assumptions were right and see if my list of words
appear in the other character’s speech. I should also be able to use the Naive
Bayes tool and see if the overall tone of Act 4 compared to an act with Iago in
it (in Othello) has similar results.

What was done in the above screen shot would then be repeated in a specific act in Othello, and the scenes in which Iago is most prominent would be the one that is analyzed; same goes for whatever scene Hamlet appears in.
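The concordance step in this plan (pick a word from one character's speech, then see where it turns up in the other's) can be sketched in a few lines of Python. This is only an illustration of the idea, not Monk's concordance tool: the `concordance` function and its `width` parameter are hypothetical, and the two snippets are single quoted lines rather than whole acts.

```python
import re


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of `keyword`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# One-line stand-ins for each character's speech (assumption: real use
# would load the whole scene or act instead).
hamlet_snippet = "How all occasions do inform against me, and spur my dull revenge!"
iago_snippet = "But partly led to diet my revenge, for that I do suspect the lusty Moor"

for name, snippet in (("Hamlet", hamlet_snippet), ("Iago", iago_snippet)):
    for left, word, right in concordance(snippet, "revenge"):
        print(f"{name}: {left} [{word}] {right}")
```

Running the same word list over both characters' speeches would show at a glance whether the assumed shared vocabulary is really there.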

I have this quiet nagging feeling in the back of my head that is telling me that
none of this will actually work, but I figure I should at least give it a shot.
I mean, it sounds like a decent idea, right? I can at least pretend like I have
some hope left in me.

POA Part 2: the development of the…character…..development…..

After meeting with my group today, I’ve gotten a better sense of how I can personally contribute to the group.  As I mentioned in my last blog post we decided to focus on character development.  Today we came up with the idea for each of us to focus on an individual character(s) with our tool.  Once we’ve come up with some results we hope to collaborate with each other’s tools in order to get a more well rounded sense of how our characters developed in Hamlet.  I got lucky and am focusing on the characters that I was originally interested in: Claudius and Gertrude.

I knew right off the bat that this wasn’t going to be the easiest task.  If I had trouble trying to get some results by looking at a small workset like Act 3 Scene 4, then how the heck am I going to get results by looking at individual characters?  Especially since Monk doesn’t show me the speaker or line numbers when I search up lemmas.  It seems that my best bet at this point is to try and be more creative in my searches, in hopes that Monk will give me something.

First off, I tried to look up Claudius’ moments of speech.  In the classification tool I’m able to look at the text of an individual scene (thanks Kelsey!), which can help me isolate concordances that I might find interesting or relevant.  I thought I’d try to outsmart Monk, and searched up the concordance ‘King’ hoping that it would isolate his moments of speech:

Well, that doesn’t work.  Monk does not recognize the speaker King as a concordance, but only when it is used by another character.

Alright, next.

During our meeting today I decided that in order to figure out how a character has developed, I’m going to need to focus on significant moments in the play that have a direct effect on my characters.  For Claudius, I decided that I wanted to compare his opening speech in Act 1, Scene 2 with his speech in Act 3, Scene 3 where he admits to murdering his brother.  For Gertrude, I wanted to do a more general comparison of the words that she uses when she speaks to Hamlet in Act 1 (specifically Act 1, Scene 2) compared to the words that she uses when she speaks to him again in Act 3, Scene 4.  To do this I created a workset for each scene and used my compare worksets toolset:

Huh, so I created my worksets and tried to use my compare worksets toolset, and this is what I got:

In the main menu I had selected the compare worksets toolset and my workset that was Act 1, Scene 2.  Instead of this workset appearing in the First workset selection box, I got this error message.

Monk teammates, help? Have you guys gotten this error message before?

My interpretation is that the workset is so small that Monk is unable to recognize it as usable data.  I really hope this isn’t the case because I felt that this would’ve been a really good way of trying to figure out how Claudius and Gertrude develop as characters.

Well, I’m hoping that my next blog post will contain more results and success, as opposed to brick walls and frustration.  For now I shall go back to the drawing board and try to figure out how else I can use my tool to my advantage.

Who will win?! Will it be Monk: the visually appealing text-analysis tool with too many limitations and pointless help buttons? Or will it be Kate: the angered but determined student who REFUSES TO BACK DOWN.  Find out in her next blog post!



I Get By With A Little Help From My Friends- and Monk…

Today was the second meeting with my group on Act 2 and we spent our time trying to figure out ways in which our tool can be helpful to others and ourselves. I am pretty sure that Monk won’t be very helpful with picking up the slack compared to other tools, but I find myself at a large advantage in that every other tool will be helpful to me. Thus I will learn new things about all the other tools, and I can teach my group my frustrations.

I am having difficulty once again just comparing or looking at Act 2 effectively. So I have decided to branch out further and look at Act 2 alongside much larger parts of Shakespeare, mostly focusing on the tragedies.  I have found common words associated with Act 2 and with tragedies such as Richard III, Macbeth and Hamlet as a whole, in comparison to Act 2 and its different parts. I have looked at Richard III, Macbeth and Hamlet while sifting Hamlet Act 2 words throughout them. I can also see where these words are mentioned in comparison to other plays.



The findings show up as the words most commonly seen throughout Shakespeare, and then in the other two plays, as follows. I found that Hamlet is a noun that shows up most often, which does seem obvious since you are comparing words in Hamlet 2.1 to Hamlet itself; these words show up in black. The words which are more commonly seen in other plays than in Hamlet are shown as underuse, in grey.



Some words which I found interesting would be the underuse ones. Words such as God and grace appear in high numbers, but when you look at the comparison with other plays they occur much more often there. The word heaven, however, can be seen as an overuse word. This is odd because these three words seem connected, and yet there is such a strong disconnect between them as overuse and underuse. This makes me think once again about the context in which these words are used. In this case I would like to ask someone in my group who is able to look at these particular words to see who is speaking them, when they are spoken and the context that they are said in. Once again Monk has done a good job at showing you something interesting but it has left it up to you to decide how to handle the information.
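To make the overuse and underuse idea a little more concrete, here is a rough sketch of one common way such comparisons are scored: Dunning's log-likelihood, in its simple two-term form. Whether Monk computes its black and grey words exactly this way is an assumption on my part, and every count below is invented purely for illustration; none of them come from Monk.

```python
import math


def log_likelihood(count_a, total_a, count_b, total_b):
    """Dunning's G2 for one word: analysis set (a) versus reference set (b)."""
    combined = count_a + count_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


def usage(count_a, total_a, count_b, total_b):
    """Label a word as overuse or underuse in the analysis set."""
    rate_a, rate_b = count_a / total_a, count_b / total_b
    if rate_a == rate_b:
        return "same"
    return "overuse" if rate_a > rate_b else "underuse"


# Invented counts: (hits in the scene, scene size, hits in reference, reference size).
examples = {
    "heaven": (6, 2000, 20, 30000),
    "grace": (1, 2000, 60, 30000),
}

for word, numbers in examples.items():
    print(word, usage(*numbers), round(log_likelihood(*numbers), 2))
```

The label says which direction the difference goes, and the score says how surprising it is: a word with the same rate in both sets scores 0, and the score grows as the rates pull apart.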


I then wondered if these overuse or underuse words could have been noted in the Naive Bayes decision tree. I decided to look up the underuse words and see if the language could have interpreted them as something with a strong confidence or a weak confidence. At first I looked at God, grace and brother, looking at Hamlet, 2.2, and 2.1 as follows. I was surprised to find that a strong confidence showed up for the word grace in 2.2. I believe this means that the language used in 2.2 can be seen as language which strongly refers to grace and other words associated with it. There was also a soft pink shade in relation to brother when looking at 2.1, which means the language used could be found to relate to the word brother.



Afterwards I switched to the more common words seen throughout the text and I decided to look at matter, passion and heaven. I found that heaven has a very strong confidence towards 2.1 and matter has a weak relation and passion has no relation.


I find it very odd that some words that were seen as an underuse, such as grace, had such a strong relation to the words in the text. It is just as odd that words commonly found throughout the text showed up as weak, and that a common word such as passion had no relation to the words within the text.


After my group meeting I met with my fellow Monk friend Hannah. We compared the ways in which we are trying to be helpful with the tool and some issues that have suddenly come up. I know I can speak for the both of us that sometimes the saved worksets that you make won’t let you compare them with other worksets that you have made; it just shows up as a blank. We have tried switching computers, logging off and on, switching internet browsers, making a new project but nothing seems to fix this issue. I am happy, though, that it isn’t just me having this issue but other Monk individuals as well.


I hope my relation to words within the text will be helpful in my group. I know I will still be dependent on my fellow Monk individuals to help overcome my struggles and see if I am the only one having these issues or if it is others as well. I am very thankful that I am not the only one using Monk and I am not the only one analyzing Act 2. I think for anyone to be effective we have to rely on one another and help others to understand our findings and help push others forward.

Naive and Decisive actually sums up a lot of MONK!

Phase 2, and a new light… hopefully.

Being the expert on MONK is a tough job. Luckily the bond that comes from quizzically hitting buttons and keys for 9 hours is not an easy one to break. My project screen looks well used and familiar-

The results go on and on. Do we know what all of them mean? Not really 🙂 but we like them.

Meeting the new group in person really revealed how much the other groups liked or disagreed with their tools as well, and the hope is that what one tool lacks, the others will fill. So far we have had an easy time agreeing on regulations and sharing stories, so things are looking good for acing this presentation in a different way than the first, (though my phase one group was completely amazing, and I will miss them).

As for MONK – let’s just say not much has changed, except – the Act! Act 3 is my personal favourite act. Insanity, insults, murder, confrontation, blood, more ghosts, and much more! Really though, it just always seems like the most action packed of all the Acts!
Monk is doing its best to help me support this idea. The word “madness” shows up nine times alone in this Act! Although I did discover a slight annoyance again. I could not get the program to look through a whole act, only through the scenes. So far this is only in “Edit Worksets,” so it could just be a glitch.

Other words that show up quite a bit? “Time” shows up 10 times, and “Heaven” shows up 10 times! “Action” – 6. “Go” – 17. Death and murder show up quite a bit too, but of course the words pertaining to the future, and action-y words show up more often, which at the very least could tell us that this Act appears in the middle of the play.

I had a very cool discovery too! In the classification tool with NaiveBayes and Decision tree (which you either understand or you do not, there is not much in between) I was able to load my Act 3 workset, which features each scene of act 3 as a different document meaning I can compare them! This is perfect for this Phase!

I rated each scene as either comedy or tragedy:

As you can see, scene 1 and 2 have slightly comedic tendencies, and scene 3, being of course about sending a man to heaven or hell, is not a comedy at all… and scene 4 is an absolutely confirmed tragedy, go figure. Anyway, I think this is brilliant! Let us continue…

Now all I have changed is scene 3 from comedy to tragedy:

This is amazing because it seems like Naive Bayes uses the documents as points of comparison. Scene 1 is supposed to be less of a comedy than before if scene 3 is a strong tragedy. That makes sense! In conjunction with plotting the murder of the “King,” the word “King” in the first scene seems to be associated with much deeper, darker meanings… Intriguing…
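Here is a small sketch of why re-rating one scene can shift the score of another. It is a plain multinomial Naive Bayes with add-one smoothing written from scratch, which is an assumption about the general technique and not MONK's actual code. The four "scenes" are made-up bags of a few words each, scene 3.1 is deliberately left unrated so it acts as the test document, and the function name `tragedy_probability` is hypothetical.

```python
import math
from collections import Counter


def tragedy_probability(docs, labels, test_name):
    """Multinomial Naive Bayes with add-one smoothing: P(tragedy | test document).

    Assumes at least one rated document for each of the two classes.
    """
    vocab = {w for text in docs.values() for w in text.split()}
    log_scores = {}
    for cls in ("comedy", "tragedy"):
        train = [name for name, label in labels.items() if label == cls]
        counts = Counter(w for name in train for w in docs[name].split())
        total = sum(counts.values())
        score = math.log(len(train) / len(labels))  # class prior
        for w in docs[test_name].split():
            score += math.log((counts[w] + 1) / (total + len(vocab)))
        log_scores[cls] = score
    # Turn the two log scores into a probability for "tragedy".
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    return weights["tragedy"] / sum(weights.values())


# Made-up word bags standing in for the four scenes (illustration only).
docs = {
    "3.1": "king love play madness",
    "3.2": "play jest king players",
    "3.3": "king murder heaven prayer",
    "3.4": "murder blood death ghost",
}

# First rating: scene 3 as comedy. Second rating: scene 3 as tragedy.
before = tragedy_probability(
    docs, {"3.2": "comedy", "3.3": "comedy", "3.4": "tragedy"}, "3.1")
after = tragedy_probability(
    docs, {"3.2": "comedy", "3.3": "tragedy", "3.4": "tragedy"}, "3.1")

print(f"P(tragedy | 3.1) with 3.3 rated comedy:  {before:.3f}")
print(f"P(tragedy | 3.1) with 3.3 rated tragedy: {after:.3f}")
```

In this toy setup the word "king" appears in both 3.1 and 3.3, so once 3.3 is rated a tragedy, "king" counts as evidence for tragedy and 3.1 drifts in that direction too.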

I could honestly go on about this forever, but I doubt every one of my findings would be as interesting for everyone. In summary this just means that I have a way to directly look at all of the scenes together, and that is worth a lot! Anyway, our self assigned homework for the weekend was to read all of each other’s blog posts, and see what we deduce from them, what would work with each other’s programs, etc. The hope is that through self-education we will have a breakthrough in compatibility capabilities… if that makes any sense. I am looking forward to exploring more of my new discovery, and am really going to think about how it can help my group members; that is my self assigned homework for the week. At the very least I can show off my new discovery next time and hope that they think it is as cool as I do.

Until next time, Kelsey ^.^

Let Us Commence!

I am planning on going into Phase 2 with a more optimistic mind set, instead of the angry frustrated version of myself.  Monk and I didn’t get off to the best start (and I do admit I’m still not the biggest fan of it), but it isn’t fair to me or my teammates if I just close off and don’t try to take advantage of what Monk does offer.  One of my teammates asked me today “what exactly does your tool do?”, “Nothing” was my immediate response.  Well we all know from my group’s presentation that that isn’t true!  It does do SOME things, and I shouldn’t disregard them.

For this phase, my teammates and I decided to focus on character development.  Since we were assigned Act 1, we thought this theme would work the best.  That being said we are unable to see how a character develops if we don’t look ahead to later acts in Hamlet, so it seems we’re going to have to dip in to other acts in order to help us get a better idea of how the characters that are introduced in our act will develop.

Hang on a second, this sounds familiar…..

If I wanted to analyze Act 1, and use other acts (or the rest of the play as a whole) as a reference….why yes! This sounds like my Compare Worksets Toolset!

These results actually look exactly like the ones I got when I compared Scene 3.4 to all of Hamlet, so it seems some tweaking may be in order.  In any case this is a decent jumping off point to begin Phase 2.  I plan on working hard to become even more of an expert of my tool, in order to make a good contribution to my group project.

And on that note, here is a list of questions that I would like to answer by the time Phase 2 is complete:

  • Figure out EXACTLY how the Decision Tree works (this will take multiple readings of April’s blog and many trials)
  • Answer the question, how can my tool contribute to my group’s project?
  • Answer another question, how can my tool work with other tools in order to get more in-depth results?  For this one I’m going to have to re-familiarize myself with the other tools in attempts to find a link between mine and another.
  • How to get the knot toolset to work (Come on Monk, at least allow me to use the visually appealing tool, is that too much to ask?).
  • Try and figure out if I can maybe focus on individual characters with my tool.  This will be especially challenging because not only would that be too small of a workset, but my tool is not accommodating to showing me who says what words.
    • The reason I would like to try this out is because that is my personal interest in Hamlet.  If I could focus on the character development of Gertrude in attempts to figure out if she actually knew if Claudius killed the King then I would be such a happy camper!

I think this is a good start.  Throughout the rest of my posts for this phase I hope to answer or develop the questions and tasks that I have stated above, and perhaps come up with new ones.  Overall I’m looking forward to this phase.  I think that my team and I are going to be able to come up with some interesting results that we wouldn’t have been able to accomplish by simply reading the text (reading? What is this archaic method you speak of?).

MONK: Hilarious Hamlet

In the first stages of phase 2 of our group projects, I find I am more intrigued by MONK than I had been initially in phase 1, to say in earnest (but not unfounded) honesty. As promoted by the blog posts of the MONK group and throughout our presentation, MONK, as a text mining tool that focuses on statistical analysis and word frequencies, appears to be more cooperative in answering questions about a broader range of data. Though Act V is not as broad as MONK seems to wish it could be, I have found that I am indeed learning more about Hamlet, Act V than I had known before.

My initial purpose in embarking on my analyzing journey was to discover what was unique about Act V, that I could not deduce from reading, but could learn from using the analytics of MONK.

In my blog posts from phase 1, I was left pondering the question of, “why does MONK, in comparison to all other tragedies, continuously notify me that it is only half confident that Hamlet is a tragedy?” With this question in mind, I endeavoured to determine if perhaps Act V participated in this strange inconsistency.

To begin, I defined my workset to contain As You Like It, The Rape of Lucrece, Hamlet, Julius Caesar, Much Ado About Nothing, and Act V.

Then, selecting my classification toolset and the newly created workset, I began to rate the training and test sets. As can be seen in the image below, I rated As You Like It and Much Ado About Nothing as the comedy training sets, and The Rape of Lucrece and Julius Caesar as the tragedy training sets. I left Hamlet and Act V with blank ratings, thus making them my test sets.

This is what I was returned with:


From this image it is easy to have the attention redirected to the fact that according to these queries, Julius Caesar is not a tragedy.

However, MONK’s lack of confidence in Julius Caesar being classified as a tragedy notwithstanding, where the attention must be drawn (as it took me a while to do so), is toward the fact that in a statistical analysis of the plays that are present, MONK has classified both Hamlet and Act V as comedies.

Feeling uneasy about my results, I went back to the user ratings, and removed those anomalies that MONK was picking up, and forced MONK to recognize Hamlet as a tragedy by rating it so.

These were the results I was returned with:

Both analyses were conducted on the basis of nouns.

In classifying Hamlet as a tragedy, and leaving Act V as the test set, MONK returned me with its classification that, with a 0% probability of error and 100% confidence, Hamlet is not a tragedy.

However, MONK does believe, that Act V is a tragedy.

The words I was most interested in among the data it used in determining its confidence in the ratings, however, were words like ‘blood.’

The first number displayed, 26.1241, represents the average frequency that the word appeared every 10000 features in the test set, Act V. The second number is the average frequency that the word occurred every 10000 features in the training sets.

From words such as ‘blood,’ MONK has determined that, based on average frequency, act V can be classified as a tragedy.
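The "per 10000 features" figure is just a rate, and a short sketch shows the arithmetic. The function name `per_10000` and the counts below are hypothetical, chosen only to show the calculation; they are not the numbers behind MONK's 26.1241.

```python
def per_10000(count, total_features):
    """Occurrences of a word per 10,000 features (here, nouns) in a text."""
    return count * 10000 / total_features


# Hypothetical counts for illustration only.
test_rate = per_10000(5, 2000)        # e.g. 5 hits among 2,000 nouns in the test set
training_rate = per_10000(12, 16000)  # e.g. 12 hits among 16,000 nouns in the training sets

print(f"test set:      {test_rate:.4f} per 10,000")
print(f"training sets: {training_rate:.4f} per 10,000")
```

Scaling both counts to the same base is what lets a single act be compared with several whole plays even though their lengths are very different.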


It was interesting for me to find that based on word frequencies and statistical analysis of noun features, in comparison to other works of Shakespeare, Act V can be classified as a tragedy and Hamlet cannot. Though it would be a worthwhile endeavour to attempt to figure out why MONK refuses to agree that Hamlet is definitely a tragedy, I find (it being my responsibility as a member of the Act V group for phase 2), I am led to research the cause of Act V being classified as more of a tragedy than Hamlet itself.

Because of the subject matter and the words Shakespeare uses in telling the tale of Hamlet’s tragic story, it is difficult for me to understand its classification as anything but a tragedy. Therefore,  I have reached another understanding of MONK that I did not previously have in attempting to analyze 3.4. I wanted, so desperately, for MONK to see and understand Hamlet 3.4 the way I read it. I wanted to force it to read the words on the page in the order that they are in, and take the sentence for what it means.

However, it is this reading that we do as sensible, and feeling people, that leads to an analysis that is incomplete without tools such as MONK, and it is that reading that completes the pure numerical data, which is literally meaningless to any symbolic possibilities that exist in literature.

I digress.


MONK being a tool that uses pure data (and not emotion) in providing a classification, has yet to reveal to me the statistical reasoning for Act V being more tragic than Hamlet as a whole. Although my point here is not that MONK is unable to show me, it is that I have yet to fully understand the reasons it has provided me.

Reading the subject matter, it is rather simple for me to determine why Act V is tragic. The entire cast being wiped out is indeed quite tragic. However, from reading that same subject matter in Hamlet, I cannot comprehend the reason why the play ISN’T tragic. From Hamlet losing his father, to having his mother marry his uncle, to finding out that his uncle-father murdered his father, and much more, it is devastatingly tragic! My point here then, is that my reading and comprehension is not, and cannot always be correct. I would assume, as a university student living in Canada where all people have equal rights, that Othello is a tragedy. However, the audience that Shakespeare wrote for, not knowing a thing about racial equality, would consider Othello a comedy.

The evidence of this is in the words, and in the probabilities that MONK discovers. It will classify Othello as a comedy on the basis of words, and in that same way it will classify Hamlet as a ‘half-tragedy.’

It is my hope that we, as the group analyzing Act V, can determine (undeterred by emotional bias) the true nature of Act V in relation to Hamlet by collaborating the various data we get from our digital tools.

I will from here, endeavour to determine why MONK tells me that Act V is so significantly more tragic than the entire text of Hamlet.




New Hopes For New Beginnings!

As sad as I am to leave my previous group members, I was pleasantly surprised at how well my new group meshed together! I have nothing but high hopes for us, as we all seem to have the same ambitions and goals in regards to how we will do with this project. It was no surprise that upon reaching the question in the contract for our anticipated mark, we all said with big smiles, “A+!” I mean, who isn’t aiming for the best possible grade?
Organization seems to come easily to the other girls, which, coming from someone who doesn’t naturally have that skill, pleases me very much, to say the least. An agenda that we plan on making before each meeting time is sure to keep the ball rolling, and it will make sure we use our time together to the best of our abilities. Procrastination being my middle name, I am thankful to have the necessary pressure of a timeline to keep me focused. Ironically, after I wrote that sentence I focused my attention on “Ellen” for a solid ten minutes. Tsk tsk, will I ever learn?
After rereading Act 4, our designated area of study, I analyzed it more carefully and began to see sort of a pattern within the text. This act is all about anger and harsh tones spoken amongst the characters. Gertrude begins the first scene by explaining the murder of Polonius to Claudius, and how basically there is no hope left for Hamlet. The exasperated feel we get from Gertrude is passed on to Laertes when we see him learn about not only his father’s death but that his only sister has gone completely mad. Exasperation turns to anger, which is followed by the intense need to get revenge on Hamlet for what he has done to him and his family. Claudius participates in Laertes’ anger by expressing his suspicions against Hamlet, who he feels is trying to take the throne from him. A murderous plan is developed between the two characters, and the scene ends with the same amount of anger and anxiety as it began with. After seeing the continuing displays of anger, I figured this may be a good start to using my good ol’ faithful (ha!) program Monk to analyze the act more deeply.
Potentially we could use our programs, more specifically we could use Monk, to compare the amount of negative and heated words that are found in act 4 and see if the same feeling is evident amongst other scenes. Although Monk is designated to be used for larger texts, I feel that since we will be comparing this specific act to another act, they will roughly be the same size which should help with producing useful results.

In regards to what my fellow group members may be able to do, I suppose it depends on the specialty that their programs revolve around. Either way, I feel like we will be able to get a well-rounded set of results which will help us get right to the bottom of analyzing Act 4.

Monk- A Fresh Start


It is a new beginning and I thought a good way to start it off would be to read Act 2 and pick apart some common themes that I found were represented throughout the text. I then thought that I could use the themes that I found to try and see if I could gain more information about them through some analysis tools that Monk has to offer.

While I was reading Act 2 I found that a common theme seen in the text was spying and trying to figure out secrets.  This is seen when Polonius asks Reynaldo to go spy on his son while he is away.  It is also mentioned later in 2.2 when the King and Queen ask Rosencrantz and Guildenstern to spy on Hamlet for them.

After this I wanted to see what information Monk might offer me on the theme of spying. I looked up the word concordances and found that there weren’t any matches for the word spy or spying in Act 2.



I looked back at the language used within the text and found that the word spy was never used, and neither were a few of its synonyms. I found this very odd: when you read the text you know that Reynaldo is sent to spy on Laertes and that Rosencrantz and Guildenstern are sent to spy on Hamlet, and yet it is never outwardly mentioned. This made me think about the language that Shakespeare himself uses to get across concepts that we understand today but would word differently.
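To make the idea of a concordance search concrete, here is a minimal sketch of a keyword-in-context lookup. It is not how Monk itself works internally (that is not documented here); the function name `concordance` and the `width` parameter are made up for illustration. The sample passage is Polonius describing his plan to Reynaldo in 2.1, which talks about spying without ever using the word.

```python
import re

def concordance(text, keyword, width=30):
    """Return a snippet of surrounding text for every whole-word match of keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        hits.append(text[start:end].replace("\n", " "))
    return hits

# Illustrative passage only; any plain text could be searched the same way.
sample = (
    "Your bait of falsehood takes this carp of truth; "
    "and thus do we of wisdom and of reach, "
    "with windlasses and with assays of bias, "
    "by indirections find directions out."
)

print(concordance(sample, "spy"))           # empty list: the word never appears
print(concordance(sample, "indirections"))  # one snippet showing the word in context
```

A search like this can only find the words it is given, which is why a theme that a reader recognizes immediately can return no matches at all.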

I then tried to tackle April’s concept of Naive Bayes and look at the language within Act 2 to see if it is compatible with the themes that are easily noted within it. I looked at Act 2 as a whole and then separately at 2.1 and 2.2, and looked up common words that connect to themes seen throughout the play to see if the language itself would identify them. I decided to look up “revenge” for Act 2 as a whole, “spy” in 2.1 and “madness” for 2.2.



As the results show, the idea of revenge was noticeable in the language of Act 2, but was not very prominent, as shown by the lack of a deeper shade of red. Madness was easily seen within the language, as shown by its darker shade of red. Strangely enough, the word spy had no language noted in 2.1 even though the scene is about Polonius asking Reynaldo to spy on his son for him.

I thought it was very odd how some of the common themes that are easily noted within Hamlet are not even noticed or picked up through the language. I may be using the tool wrong or I might not be giving it enough information that I should, but I thought this was very strange.

I think I will find two different kinds of information while working with Monk: one through my own analysis of the text and the other through the analysis in word hoard. Although I wish that some of my personal findings would transfer over to what I find in Monk, I think that it puts things in a whole new perspective.

I find that I am still going to have to work closely with my fellow Monk friends just to fully understand concepts and ideas, and to see if what I am finding may be somewhat correct or if I am going off on something completely wrong. I think it is very helpful to first learn how to use the tool and develop a stronger understanding of it. With Monk you seem to learn more the more you fiddle and play with it; however, it is very tedious.

I hope that I will be helpful to my group; I know Monk isn’t the easiest thing to work with, especially on a smaller scale. However, I do hope that working with the other tools will help pick up where Monk seems to fall short.

MONK: Truly, “more matter with less art!”

The last time I wrote, the content of my post focused on frustrations that I had experienced with the limitations on the capabilities of MONK, and the difficulty I experienced even approaching my starting question of what our tool could do to provide us with insight about Hamlet 3.4, that we couldn’t get from just reading the text. Needless to say, the content of this post is very different.

For the duration of our team meeting today, we prepared to deliver our presentation on MONK and its capabilities and explain how it led us to new understandings of Hamlet 3.4. When dividing the topics to be discussed, I found myself assigned with the task of explaining the classification methods that MONK uses, Naive Bayes and decision tree induction, and how MONK uses them to provide useful knowledge. These, being concepts that I had a grasp of (a slippery grasp at that), I felt comfortable in explaining to my fellow team members the information I had absorbed from reading the night before.

Well, as I began talking and explaining my findings by referring to the actual process of using the methods, I realized I hardly understood exactly what I was talking about or where my vague and unconfident sentences were taking me. It was after that meeting that I sat down and furiously (or with committed fervour rather), researched, practiced, and practiced again until I understood exactly how these were to be helpful to our analysis. The following is what I found.

Text mining, also called data mining, is, in its shortest possible form of explanation, a process that revolves around purely mathematical data analytics in order to return statistical data and probabilities based on patterns and sequences observed in the data. Naive Bayes and decision tree induction, both of which MONK uses, are among these text mining methods.

The tutorials for Naive Bayes and decision tree induction provide detailed, technical explanations of what they are and the processes of these analytics. In my attempt to get a better understanding of these analytics, I started with these tutorials. For those of you who read them, you will see that when I say detailed and technical, I mean that it looks like English but there were moments when I doubted that it really was.

This section (below) is only half English.

This one is most definitely not English.

So, I turned where all students turn for short and quick explanations: Wikipedia. In my brief descriptions to follow, there are terms that I must first address in order for the explanations to be coherent.

  • Training sets– sets of data used to discover parameters that can provide a probability of predictable relationships between two or more sets of data.
  • Test set– A set of data used to assess the strength of the probability that was given by the training sets.
  • Over fitting– Crucial to training sets, this is when statistical models (such as those in MONK) emphasize and display the minor fluctuations and random errors in the data instead of the relevant relationship, because there are more parameters than there are potential observations.
Naive Bayes is a classification method that uses two or more “classes” that are assigned to training sets. It builds knowledge and “learns” comparisons between the two classes, and applies them to classify an unknown text. It is useful for 3 things:
  1. Categorizing a text.
  2. Finding features that stand out in a text.
  3. Finding characteristics of one text that are common to a large body of texts, like a genre.
The MONK tutorial points out that the interesting aspects that can be seen using Naive Bayes, are those that we would consider “misclassifications.” In this way, Naive Bayes is useful for making a hypothesis and testing it, or going through the process to confirm something you believe you already know.
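The “learning” described above can be sketched in a few lines. This is a generic word-count Naive Bayes with add-one smoothing, not MONK’s own implementation, and the tiny training set of word lists is invented purely for illustration; the names `train_naive_bayes` and `classify` are likewise hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_naive_bayes(labelled_texts):
    """Count words per class from (text, label) pairs: the 'training set'."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for text, label in labelled_texts:
        words = tokenize(text)
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the most probable label and the log-score of every label."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # prior for the class
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get), scores

# Invented toy training set: two classes, two short "texts" each.
training = [
    ("blood death grave revenge murder", "tragedy"),
    ("death poison ghost grave sword", "tragedy"),
    ("love wedding jest disguise dance", "comedy"),
    ("love fool jest wedding song", "comedy"),
]
model = train_naive_bayes(training)
label, scores = classify("ghost revenge poison love", model)
print(label, scores)
```

The method is “naive” because it treats every word as independent of every other word, which is also why it needs a large amount of text before its probabilities mean much.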
Decision tree induction takes the classifications provided by Naive Bayes, and uses them to determine the attributes or characteristics that made them so. Below is a simplified and understandable image of the basic concept of a decision tree, provided by the MONK tutorial.
This is the process that is applied to the data analytics of the decision tree. It determines which aspects are present and which are not, and then logically produces a ‘tree’ of information that leads to probabilities.
This is where over fitting is a crucial aspect. When this model grows too complex, it fits the training data in too much detail and is therefore essentially useless for analyzing texts other than the training set. Instead of ‘learning’ the general relationship between the ideas, it memorizes that particular training set and attempts to apply it elsewhere.
I explain the analytics behind the tool because, once I understood what the tool was searching for, and how it searched, it became far easier for me to understand how to use the tool. With a body of text, and a tool that compares one body of texts to one or more other bodies of texts, it is extremely difficult to determine what to look for that could be significant. Being given the probability and frequencies of words in texts is, despite how simple it may sound, a difficult place to start because there are just too many words.
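A deliberately extreme sketch of the difference between memorizing and learning: the “memoriser” below stores each training text exactly, while the keyword rule looks at a single word. Both the data and the rule are invented for illustration, and the fallback guess of “comedy” is an arbitrary assumption.

```python
def memoriser(training):
    """An over-fitted 'model': it remembers every training text word for word."""
    table = {frozenset(text.split()): label for text, label in training}
    def predict(text):
        # Any text not seen during training gets a fixed fallback guess.
        return table.get(frozenset(text.split()), "comedy")
    return predict

def keyword_rule(text):
    """A simple, general rule based on one word."""
    return "tragedy" if "death" in text.split() else "comedy"

def accuracy(predict, labelled):
    correct = sum(1 for text, label in labelled if predict(text) == label)
    return correct / len(labelled)

training_set = [
    ("death grave revenge", "tragedy"),
    ("death poison sword", "tragedy"),
    ("love jest wedding", "comedy"),
    ("love fool song", "comedy"),
]
test_set = [
    ("death ghost crown", "tragedy"),
    ("grave death blood", "tragedy"),
    ("wedding dance jest", "comedy"),
    ("fool love letter", "comedy"),
]

memo = memoriser(training_set)
print("memoriser:", accuracy(memo, training_set), accuracy(memo, test_set))
print("keyword:  ", accuracy(keyword_rule, training_set), accuracy(keyword_rule, test_set))
```

The memoriser is perfect on the training set and no better than a coin toss on the test set, which is exactly why a separate test set is needed to judge a model.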
Nevertheless, this is what I learned.
In general, using the classification tools that MONK had to offer, and practicing using them correctly, did not further my understanding of Hamlet 3.4 as much as I had hoped. However, it did confirm what I believed, surprise me with things I believed that were wrong, and open for me a door into the digital humanities by showing me its vast capabilities. For example:
In terms of Hamlet 3.4, I attempted to analyze the scene in comparison to all the tragedies in order to find what in this scene was characteristically tragic in Shakespeare’s language. Unfortunately, because of the way that worksets are defined, the closest I could get to this kind of analysis was Hamlet compared to all Shakespeare’s tragedies, and 3.4 compared to the remainder of Act 3. There I was faced with another problem: what parameters do I assign each scene in order to find out something useful about 3.4?
In the section where it says “click to rate” there is a certain parameter that you are setting. If you filled in “love,” “death,” and “betrayal” as themes of the first three scenes in the first three spaces, and hit ‘continue,’ then it would return to you the conclusion of which theme scene 4 best fit according to the probability determined by Naive Bayes. Doing this unfortunately returned no substantial results, as the interactions within the individual scenes were too varied from scene to scene.
In attempting to compare the nature of Hamlet to the tragedies, I did the following:
After hitting continue, I set the following parameters:
These parameters returned to me the following classifications using Naive Bayes algorithm:
The intensity of the red next to the title of the play indicates the level of confidence, or the lowest probability of error, that its classification is correct. The predicted rating is the classification that Naive Bayes provides, based on the 2 classes (historical and fictional) that I have set for it. From this, Naive Bayes shows me that it is fairly certain that, based on the data I have provided and the data that it has analyzed, there is a certain % probability that it is a fictional play.
When I click Hamlet and then continue, MONK shows me the data that it has found which explains its confidence level.
The nouns that appear in the far right column are those that have given the Naive Bayes algorithm reason for the presence of probable confidence. The “Avg. Freq. Training” column is the number of times that the word appears in the ‘parameter’ plays that I labelled before, and the “Avg Freq Test” column is the number of times that the word appears in the plays that I left to be classified.
The reason that the confidence is not vibrant red in the predictions however, is because of the infrequent words that appear below:
When I click “Decision Tree,” the image that pops up displays the process by which the analytics flipped the tree over to determine what word could act as a classification.
The results displayed above provide the probability of error of the word “unkindness” as the basis of that classification. This decision tree states that in terms of probability, this word had the lowest error rate, and highest predictive performance.
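The idea of one word having “the lowest error rate” can be sketched as a one-level decision tree (sometimes called a decision stump). This is only an illustration of the general technique under the assumption of exactly two classes; it is not MONK’s algorithm, the function name `best_split_word` is hypothetical, and the four word lists are invented rather than taken from the plays.

```python
def best_split_word(labelled_texts):
    """Find the word whose presence or absence best separates two classes.

    Returns (error_rate, word, label_if_present, label_if_absent).
    Assumes exactly two distinct labels in the training data.
    """
    docs = [(set(text.lower().split()), label) for text, label in labelled_texts]
    vocab = set().union(*(words for words, _ in docs))
    labels = sorted({label for _, label in docs})
    best = None
    for word in sorted(vocab):
        for present_label in labels:
            absent_label = next(l for l in labels if l != present_label)
            # Count training texts that this one-word rule would get wrong.
            errors = sum(
                1 for words, label in docs
                if (present_label if word in words else absent_label) != label
            )
            rate = errors / len(docs)
            if best is None or rate < best[0]:
                best = (rate, word, present_label, absent_label)
    return best

# Invented toy data using the two classes from the post.
plays = [
    ("king crown battle england", "historical"),
    ("king war crown throne", "historical"),
    ("king ghost crown unkindness", "fictional"),
    ("love unkindness storm island", "fictional"),
]
rate, word, if_present, if_absent = best_split_word(plays)
print(word, rate, if_present, if_absent)
```

A full decision tree repeats this step on each branch, choosing a new word every time, which is how it can grow complex enough to over fit.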
Therefore, from this data, I can conclude that Naive Bayes and the decision tree have determined that there is a higher probability that Hamlet is a play of fiction, rather than history.
In conclusion, despite the various frustrations the group has experienced and the little bits that we picked up about 3.4 in specific, through Naive Bayes and decision tree induction, I have learned that classifications are a great place to start. Comparing texts in order to determine aspects of one based off another CAN show you something you never knew, or prove you wrong, in order to provide you with some idea of what you need to look for or what research criteria you need to change.
In terms of research, as we’re doing in ENGL203, learning and being wrong…I think that’s a great way to start.




Monk: A Little Less Anger, A Little More Results

As seen in our first blog posts, my teammates and I were having difficulties trying to work our tool.  Since then we have come together and rallied against the odds and have been able to scrape up some surprising results that I never expected us to get.

In my first blog post I mentioned my frustrations with Monk not allowing me to use many of the toolsets offered.  I showed a screenshot of the Analysis Tool denying me access because I hadn’t identified any “training data” or hadn’t “rated any items in the worksets”.  When I first read this I almost threw my computer against the wall in frustration. Thankfully I held back and took the more mature approach of trying to figure out what this limitation means.  If Monk is asking me to create a training set and to rate my worksets, there must be a way of doing so.

In the Compare Worksets toolset, I had become familiar with differentiating between the Dunnings analysis methods.  In that drop down menu I had two other options that I hadn’t played around with: IDF First workset as training set, and IDF Second workset as training set.

Hey now, don’t I need to create a training set to use the analysis tool?

After this revelation I went ahead and created my training set.  Based on discussions with my teammates (Monk seems to be a tool that is made for larger amounts of data, i.e. an entire genre of Shakespeare’s plays) I used Shakespeare’s corpus of comedies as my first workset, and his tragedies as my second workset.  An interesting tidbit I found while I was filling out the requirements was this:

These boxes were left blank when I used the Dunnings analysis method, but as soon as I selected the option to create a training set they were automatically filled in for me.  Why? I’m sure understanding this will help me gain more knowledge in figuring out what I can learn from this toolset, but for now it’s a mystery.

Once I’ve filled in the requirements I click compare, then continue. I save my results and return to my main menu.

Now that I’ve created a training set I need to rate my workset.  I use my Classification tool to individually rate my worksets.

NOTE: I can write down ANY WORD I WANT in the user rating column.  I have divided them into their genres, but my teammates and I have put down random words like love, death, and blood.  I originally thought that rating these plays helped the tool in its process, but since I am able to put any word I want, it seems that the classification is more for my sake.  I hope to make more sense of this in the near future.

Finally after all these steps I am brought back to the analysis tool. Now is the time to see what this tool can actually do!

Here is what I got:

Ah the sweet screenshot of results.

Now, I’m going to be honest here.  Although my teammates and I have been working to try and understand what the point of the decision tree is, I still don’t have a full understanding of it.  April has been working a lot with it, and she has grasped the statistical side of the decision tree’s purpose.  I personally plan on spending more time with this tool, trying to figure out how I can use it for Phase 2.  So please don’t be wary, my Phase 2 teammates, you’re not stuck with “the girl who doesn’t know anything about her tool”. I’m going to figure it out, be patient.

Patience.  This leads me to my final comments about Monk.  Working with this tool wasn’t easy.  It took a lot of trial and error to figure out how to make what seemed like simple tools work.  With some research we were able to discover that Monk is meant to work with larger sets of data, which would be great if that’s what we were assigned to do.  More information on my team’s efforts to work with Act 3 Scene 4 and Monk will be discussed in our presentation, but for now I leave with a warning:


Monk: A Greater Understanding and a Bigger Hurdle

Since the last post, the Monk group has met twice. We have made significant advances with the tools of the program, but have also made a crucial and unfortunate discovery to humble our success.

First, however, our discovery. In the “compare” toolset there is an analysis method that we had not managed to figure out before. It is called “IDF” and it allows you to select a training set. Once you manage to fulfill all of the options to the program’s liking, you are advanced to a screen much like any other one, where you can select a work, view it and type in the concordance you desire. Most of the toolsets get to this page and end there. However, for this tool, you are allowed to take the workset you nominated as a “training set” (we recommend selecting the all-encompassing “plays: tragedies” and “plays: comedies” or something similar for the most options) and from there to re-select a mix of both full plays and even individual scenes and save it as its own workset. (Minimum 3 selections).

As usual you hit a dead end on the concordance page, but uniquely, your saved workset becomes useful. Take your new workset with its many parts and load it into the “Classification” Toolset.

From here you must give each document a rating and follow the continue button…

This is the part of the program that Monk specializes in: Naive Bayes and decision trees, the explanation of which will be one of the major parts of our presentation. After selecting your method you can insert a prediction if desired and…. Voila! You get a complicated rating system of “confidence” and “frequency.”

Very cool – now for the sad part. This tool, from what we understand, is basically used for the identification and classification of authors’ works. It particularly focuses on entire plays and their characteristics. Poor little Act 3, Scene 4 does not much register on the scale, and for the part that does, we of course already know its origin and its characteristics as a Shakespeare play. So how can we use Monk’s most defining tool as an aid in discovering Act 3, Scene 4? That is our current mission, as well as explaining to you all this lovely piece of analysis:

Also since our last posts we have done more research into the purpose and uses of Monk…
We found out that Monk is one of the first of the Digital Humanities programs, almost a prototype for Wordhoard. Through different group members’ findings we have determined that the Classification, Frequency of words and Concordance searches are specifically meant for analyzing large scale works such as entire plays or collections, to find themes throughout historical moments, between writers, or characteristics of the writers themselves. As it is, we are not sure how useful it is as a tool to analyze one scene in one play. Our greater understanding of the tool itself has further clarified this. Monk is great at finding certain things within a text, any text, of any size. When it comes to comparing texts, however, it is harder for a small document such as a scene to provide enough information to represent itself against other documents.

For the remaining days, our work will be centered on figuring out how Monk can directly provide insight into Act 3, Scene 4 specifically, and on seeing if it is possible to use the tool in any depth without comparing the scene to the entire works of Shakespeare’s tragedies –
Because as interesting as our tool can get, our focus must be on the one scene, and we are trying to be optimistic about getting it to work for us!

So, till next time, I leave you with this excerpt from the Monk help buttons.

Monk… One Step Forward Two Steps Back

Monk Blog Post-#2

Since my last association with Monk we have gotten off on better terms. I have learned that Monk is a limited tool and not to expect it to do extravagant things, because that is not what it is built for.  The primary use of Monk is as a word counter: it locates words, notes how frequent they are and shows the concordances in which they occur. It is also used to compare words across other parts of a text on a larger scale; the larger the comparison, the better the results you get from it.

I have tried to upload the text that other classmates have extracted from their tools; however, I have been unable to do so. Monk freezes up and does not let me bring it up myself. I have tried switching internet browsers (Google Chrome, Firefox and Internet Explorer) and that did not work. It was another letdown, because I thought that it would be a neat experience to upload texts and look at them as a whole while focusing primarily on what was said in the speeches that we needed. Unfortunately that could not be done, so it was best to try and work with what I had got.

I have learned the use of Monk and how it can be helpful if you compare on a larger scale. If you look at three different works and combine them all together you can save them as a workset. From there you can compare the concordance between all three of them and see which words are common among them.  I have been able to look at Hamlet 3.4 in comparison to other texts as well as other groups of plays and works written by other authors.


About the Developers

A few people have asked about the contact information for the developers of our various tools. As I said in class, remember a few things before you contact people for help:

  1. Describe your problem in detail, and ask clear and focused questions. Tell them what steps you have taken to try to resolve it yourself.
  2. Be polite and deferential. They are not customer service agents, but professors and experts who have devoted a lot of time to developing these tools and making them freely available to us.
  3. Give them at least 48 hours to respond; if you have nothing by then, take that as your answer or just keep waiting. Don’t send a follow-up for at least a week.
  4. Thank them for their time.
  5. Link to the course blog in your e-mail.

The Developers

Feel free to add other names of helpful people you’ve contacted in the comments; just make sure you tell us which program they were helpful about.


  • Geoffrey Rockwell has a contact page on his blog. He is also on Twitter.
  • Stéfan Sinclair also has a contact page with a form, and here is his Twitter profile.
  • Martin Mueller is the contact person; you can e-mail him directly from the home page.
  • Rockwell, above, is listed as their main/only contact.
  • Kamal Ranaweera <kamal.ranaweera {at} ualberta.ca> manages user accounts.
  • Aditi Muralidharan’s blog has her e-mail and Twitter details.

MONK’s “pranks…too broad to bear with”

Polonius’ sentiments about Hamlet’s ‘recent behaviour’ were perhaps approached in our MONK group today.

Being met with frustration on the first day of collective contribution to learning and mastering MONK, I believe, though my teammates may disagree, was both beneficial and disconcerting. MONK, amongst other capabilities (albeit extremely limited capabilities), immediately bonded us in the united effort to overcome its barricades of text analysis. A united effort that made a modest amount of progress, but progress nevertheless. Our processes, and the obstacles that MONK hurled our way, as depicted and described below, have revealed to us the limitations of MONK’s capabilities.

To begin, I depart from my emphasis on the limitations of MONK’s capabilities to explain what those capabilities are. Then the limitations which are to follow will be of much more significance and clarity. In a general overview, MONK is an acronym for “Metadata Offer New Knowledge.” It functions on a ‘bag of words model’ in which it takes a digital text and interprets the words in the entire text as numerical counts, ignoring the order in which they appear. The ‘bags of words’ (called worksets from here) are compared with other kinds of bags in order to provide a frequency comparison with other texts. It is an analytic tool, where we enter data so that the tool can give data back. Thus, in summary, MONK is able to search concordances such as lemmas, parts of speech, and spelling, which are all inputs for Dunnings. It is also able to compare the frequency of any of these three between two worksets through the use of toolsets. Those who are interested in further details, or feel that my explanation leaves much to be desired, may proceed to the Monk Tutorial. Those who are interested in Dunnings, and the analytics of it, may proceed here.
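As a rough illustration of the ‘bag of words’ idea, the sketch below reduces a text to word counts and compares relative frequencies between two bags. It is a generic example, not MONK’s code; the helper names `bag_of_words` and `relative_frequency` are invented, and the two “texts” are just the opening exchange of 3.4.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Reduce a text to word counts; word order is thrown away."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def relative_frequency(bag, word):
    """Share of all words in the bag that are this word."""
    total = sum(bag.values())
    return bag[word] / total if total else 0.0

# Two tiny 'worksets' for illustration.
one_line = "Mother, you have my father much offended."
two_lines = (
    "Hamlet, thou hast thy father much offended. "
    "Mother, you have my father much offended."
)

small_bag = bag_of_words(one_line)
large_bag = bag_of_words(two_lines)

print(large_bag["father"])                      # raw count in the larger bag
print(relative_frequency(small_bag, "mother"))  # proportion in the smaller bag
print(relative_frequency(large_bag, "mother"))  # proportion in the larger bag
```

Because only counts survive, the bag cannot say where in the text a word occurs or who speaks it, which matches the missing line numbers and act references described in this post.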

We defined our worksets as chunks of text, instead of as lemmas, parts of speech or spelling, as to suit our purposes of analyzing Hamlet 3.4. The worksets that I am currently attempting to work with are the complete text of Hamlet, Act III, and Act III scene iv.

We began our first session by exploring our tool in an attempt to grasp its full potential in analytic capabilities. Though not verbally stated, I imagine the question we sought to answer was, ‘what can MONK do to provide me with more insight than what I could get from simply reading the text?’ With this general aim in mind, we started by searching general concordances in Hamlet just to practice using it. We entered, in the concordance search bar, “mother n” in order to search for the frequency at which mother appears throughout the text as a noun:

As you would guess, “mother” as a noun, does not appear this many times in sequences throughout Hamlet. The problem presented here that we continued to experience, was that the findings do not provide us with any line numbers or references to acts. We are left with the general picture of how many times we see the word “mother.”

Regardless, we continued on to see if perhaps the toolset “compare worksets” would provide us more insight into the significance of frequencies in Dunnings as opposed to concordance of just an isolated text. So, upon saving our worksets, we entered into the tool and before starting to even use the tool, we were already faced with another problem: what could we compare Hamlet 3.4 with in order to obtain useful results?

Because MONK is a comparison tool, we determined that the best ways in which we could establish the significance of Hamlet 3.4 to Hamlet in general were to compare 3.4 to the entire text of Hamlet, and 3.4 to Act 3 (excluding 3.4). At this point we took our own experimental paths, continuing to share with one another what we found, what problems we experienced, and questioning what we could do to take that result to further analysis. The following is what I found in my own attempts to use MONK. (However, the problems that are described here are ones that all five of us encountered.)


First, the feature comparison has several analysis methods available in the drop menu:

On the left-hand side of the screen, I have set the first workset as Act 3.4 and the full Hamlet text as the second. The ‘Analysis Methods’ drop menu contains the options “Dunnings: First workset as analysis; Dunnings: Second workset as analysis; and Frequency Comparison.” The remaining two I have yet to venture into.


The results on the right were the result of selecting “Dunnings: First work set as analysis” and then selecting ‘Lemma’ as a feature, 30 as the minimum frequency, and ‘nouns’ for feature class. These data inputs returned to me the data results on the right, in which the left hand column displays numerical values of the frequencies, and the right displays a visual guide in which grey words are under used, and black overused. The size of the font used reflects the extent of over or under use; the bigger the grey text, the greater the under use and vice versa.

This is where my problems began. To stop myself from rambling, I will just mention in brief that the problems that I experienced in comparing 3.4 to Act 3 workset were the same, if not worse.

In comparing 3.4 to Hamlet as a whole, whether altering the analysis method, changing the minimum frequency, or switching from lemma to spelling in the feature drop menu, there was very little change to be noticed in the frequencies on the right-hand side.

For example:

This was the result of  the following parameters:

  • First Workset: Hamlet 3.4
  • Second Workset: Hamlet (full)
  • Analysis Method: Dunnings: First workset as analysis
  • Minimum Frequency: 20
  • Feature: Lemma

**Please note the bold grey letters, as the list reflects those letters



The parameters set for this second analysis:

  • First Workset: Hamlet 3.4
  • Second Workset: Hamlet (full)
  • Analysis Method: Dunnings: Second workset as analysis
  • Minimum Frequency: 20
  • Feature: Lemma

As you can see, the words are exactly the same, whether you are using the first or second workset as analysis. I assure you, the results are equally baffling. The logic behind our thinking here, was that 3.4 as a significantly smaller body of text, would return different results whether it was the text being analyzed, or the text being compared.
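One possible reason the two lists match can be seen from the statistic itself. The sketch below computes the standard two-corpus log-likelihood ratio usually associated with Dunning; whether MONK’s “Dunnings” option uses exactly this form is an assumption, the function names are hypothetical, and the counts are made up for illustration rather than taken from Hamlet.

```python
import math

def dunning_g2(count_a, total_a, count_b, total_b):
    """Log-likelihood ratio (G2) for one word in corpus A versus corpus B.

    count_*: times the word occurs in the corpus; total_*: all words in the corpus.
    """
    combined = count_a + count_b
    # Counts expected in each corpus if the word were equally common in both.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    if count_a > 0:
        g2 += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        g2 += count_b * math.log(count_b / expected_b)
    return 2 * g2

def direction(count_a, total_a, count_b, total_b):
    """Say whether the word is over- or under-used in A relative to B."""
    return "overused" if count_a / total_a > count_b / total_b else "underused"

# Invented counts: a word seen 12 times in a 1,600-word scene
# and 60 times in a 30,000-word play.
scene_first = dunning_g2(12, 1600, 60, 30000)
play_first = dunning_g2(60, 30000, 12, 1600)
print(scene_first, direction(12, 1600, 60, 30000))
print(play_first, direction(60, 30000, 12, 1600))
```

In this form the score has the same size whichever corpus is listed first; only the label of over- or under-use flips. If MONK works this way, identical word lists for both settings would be expected rather than baffling.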

This was just one example of the various parameters I manipulated in order to generate results. This was a problem that we all experienced as a group. In an attempt to determine if we were missing something or otherwise incorrect, we used the same tool to compare Hamlet to the genre of tragedies available in the MONK database. The results varied greatly with this search.


This is what we realized:

MONK is capable of establishing very interesting data on the frequencies of words and lemmas within texts, but only if it is a large and substantial amount of text. This comparison technique is useful for the comparison of genre to genre, as it looks to the general significances of frequencies. However, the frequencies that exist within one scene, one act, or even one play, are difficult to use in establishing an argument. MONK is designed to be used in the broad spectrum of language that Shakespeare employs.

Because of this, when we tried to analyze smaller bodies of text, it became increasingly hard to establish results as significant.
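The effect of text size on a Dunning score can be sketched in a few lines of Python. This is not MONK's own code: it assumes the standard two-corpus log-likelihood formula, the function name `dunning_g2` is hypothetical, and every count is invented for illustration. In both comparisons the word is exactly twice as frequent (per word) in the first text, yet the score grows in step with the size of the texts.

```python
import math

def dunning_g2(count_a, total_a, count_b, total_b):
    """Standard two-corpus log-likelihood (G2) for a single word."""
    combined_rate = (count_a + count_b) / (total_a + total_b)
    g2 = 0.0
    for observed, total in ((count_a, total_a), (count_b, total_b)):
        expected = total * combined_rate
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: same proportions, but the second pair of texts
# is ten times longer than the first.
small = dunning_g2(4, 2_000, 30, 30_000)
large = dunning_g2(40, 20_000, 300, 300_000)

print(small, large)  # large is ten times small
```

On this formula, a scene-sized text simply offers less evidence than a genre-sized one, so the same difference in usage earns a much lower score.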

In the MONK tutorial, the section titled “Basic Facts on Common and Rare Words” explains the concept of Zipf’s Law, and explains that the rarest words are the ones that will be the most interesting and significant, as opposed to the more common ones.
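Zipf's Law says, roughly, that a word's frequency is inversely proportional to its rank: a handful of words account for most of a text, while most distinct words appear only once or twice. A minimal Python sketch of the counting involved, using a short, unpunctuated snippet of Hamlet's best-known soliloquy as sample text (far too short to show the full pattern, but enough to show the shape):

```python
from collections import Counter

# Sample text only; a real test of Zipf's Law needs a much larger corpus.
text = ("to be or not to be that is the question "
        "whether tis nobler in the mind to suffer")

counts = Counter(text.split())

# Rank words from most to least frequent.
ranked = counts.most_common()

# Words that occur exactly once (often called hapax legomena).
hapaxes = [word for word, n in ranked if n == 1]

for rank, (word, n) in enumerate(ranked[:3], start=1):
    print(rank, word, n)
print(len(hapaxes), "of", len(counts), "distinct words occur only once")
```

Even in this tiny sample, most of the distinct words sit in the long tail of words that occur once, and that tail is where the tutorial suggests looking.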

This being the case, it has been difficult (as of now) for us to look past the limitations and difficulties of MONK and embrace the potential it may have, as the words that stand out in 3.4 compared to Hamlet as a whole are bound to be among the rare ones, due to the difference in content.

Nevertheless, as Hamlet says, “There is nothing either good or bad, but thinking makes it so.”

I believe our next step is to question: “In what innovative ways can we manipulate MONK to draw insight from Dunning’s analysis and workset comparisons in studying Hamlet 3.4?”

Perhaps there are some ideas here.

Innovation: that’s what the Digital Humanities is all about right?


Monk; the bad and the beautiful.

My initial reaction to the text analysis program “Monk” was that, compared to the others, it seemed to have a fairly modern look and feel. I instantly had high hopes that it would be the most up-to-date program of the five we had examined. Unfortunately, my optimism didn’t last as long as I would have liked. I desperately spent the entire TDFL session trying to log in to the website but had no luck. Thus began the hell we now call Monk.
Besides discussing the troubles and frustrations we had all individually encountered, our group did manage to sort out most of the kinks of the program. That being said, there were a lot of glitches discovered in the program as well as an overall sense of confusion. It seems as though the designers/creators of Monk decided it was necessary to make a maze of disaster to get to the final outcome of what you were looking for. To put it simply, nothing comes easy in regards to Monk.
Getting back to the introductory problem, Monk has its fair share of problems when it comes to logging in. If you try to log on to Monk in Internet Explorer, it won’t work. Simple as that. A fellow group member suggested I try using Google Chrome or Firefox, and only then did it complete the login process. My big question here… why?! I can’t even begin to understand why Monk chooses not to run in the most basic internet option. Frustrating doesn’t even begin to describe how I was feeling.
Not only does Monk choose to be difficult when it comes to logging in, there is also no save option anywhere on the site, although it claims that there is one. This means you basically have to start from scratch every time you want to begin your research on specific lemmas, or anything else you have done for that matter.
There are also numerous annoying glitches such as it telling me that I haven’t clicked a workset to work on but I so clearly have. You just have to wrestle with it for a bit before it finally decides to accept the fact that you truly indeed have chosen the workset.

It became very clear that Monk is mostly designed for comparisons. While it allows you to create a workset (a document including texts that you want to analyze), this really only comes in useful when you are comparing two texts. If you are looking to examine one piece, for example Act 3 Scene 4, it allows you to click the workset you have previously uploaded and saved, but after you choose it, it makes you do it all over again. It truly makes no sense, and it is a continuing hassle to constantly have to re-choose what you are looking to analyze. The worst part is that it makes you think you won’t have to, but you do! They could at least acknowledge the present problems rather than act as if they don’t exist.
Despite the havoc we encountered while trying to sort things out with Monk, we did manage to get some good ideas rolling for what we want to specifically look at and what we can achieve and discover through the program. Despite the general hatred we feel for Monk, I still feel positive that we will be able to make it work…somehow.

Monk Workbench: Either the most simple or the most complicated tool in the Digital Humanities.

Kelsey Judd, First post.

Today was our first group confrontation of the program MONK.
It began well, with each member contributing what they had learned over the last week, and with all of us piecing together our separate knowledge to unravel the mysteries of the work tools. Within an hour we had discovered all the ins and outs of the program’s most useful components, which I will try to explain: “Define Worksets” for finding concordances in lemmas or spelling, and “Compare” for finding frequency and Dunning’s analysis. Unfortunately, soon after this we hit something like an impassable brick wall. Either due to our lack of experience or to something we cannot quite figure out in the program, there does not seem to be all that much more to it beyond “Define” and “Compare”…

The define feature is fairly straightforward once you realize one main point: it does not seem to keep an actual record of your “worksets.”

You can choose a tool on its own, or add a workset to work with.

We found that when you choose the “define worksets” tool, it makes no difference whether you choose a workset to go with it. Either way you come up with this page:

It goes here whether you have a workset selected or not.

From here there are only two options. You can create a workset, which is basically searching Shakespeare’s works, or various works of American fiction, and then saving your search and naming it. The second option is to search for lemmas, spelling or parts of speech; however, this does not seem to do anything. Whenever we try it, it still asks which work you are searching in, even if you previously defined Hamlet or act 3.4 as your workset on the main work page or within the tool.

From this page, once you have selected Act 3, scene 4, comes a very simple little tool where you can search the concordance. All you need is for the text to appear in the “advanced viewer” and, of course, to search on the concordance tab below it. Simple and straightforward. The only problem with this was that while it shows you all of the passages in which the word or lemma appears, and tells you how often it appears, it does not provide the speaker or location of the line, so it is mostly up to context. Now, I am sure there must be more to use in the define/edit worksets tools, but for some reason the five of us could not find it. Sounds like we still have a lot of exploring to do.
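To show what a concordance that keeps the speaker and position might look like, here is a small Python sketch. It is not how MONK works: the data layout (a list of speaker and line pairs), the function name `concordance`, and the position numbers (simply the order within this sample, not edition line numbers) are all assumptions made for illustration. The four lines are from the opening exchange of 3.4.

```python
import re

# A tiny hand-typed sample of 3.4, stored as (speaker, line) pairs.
lines = [
    ("QUEEN", "Hamlet, thou hast thy father much offended."),
    ("HAMLET", "Mother, you have my father much offended."),
    ("QUEEN", "Come, come, you answer with an idle tongue."),
    ("HAMLET", "Go, go, you question with a wicked tongue."),
]

def concordance(lines, keyword):
    """Return (position, speaker, text) for every line containing keyword."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for position, (speaker, text) in enumerate(lines, start=1):
        if pattern.search(text):
            hits.append((position, speaker, text))
    return hits

father_hits = concordance(lines, "father")
for position, speaker, text in father_hits:
    print(position, speaker, text)
```

With the speaker attached to each hit, a question like "which words does Hamlet use when speaking to Gertrude?" becomes a matter of filtering the hits rather than paging back through the play.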

The other very useful tool is the “compare worksets” tool. It allows you to pull up specific texts, for example Hamlet as a whole, compared to just Act 3, scene 4.
It allows you to see the frequency or do a Dunning’s analysis of a word or a lemma, with the two variables being Shakespeare’s other works or works within a text. We found this works much better when used on a larger scale, such as comparing Hamlet to another play, or the whole of Shakespeare’s works.

Beware: the words on the far right run together sometimes, so you end up getting excited on finding the new word "actairbed."

As you can see, the strange feature of this is that the words sometimes run together, so you think you have found a cool word, “actairbed,” when it’s really just three words close together. Amateur mistake, of course. Clicking on the words will take you back to the spelling search, and you will once again see the context and frequency with which they are used. The frequencies are quite a neat discovery; I think one of our next projects will be on how to use this tool to discover new and exciting themes in Hamlet act 3, scene 4.

End of the line?

Overall, the experience with MONK has been a lot of trial and error, but rewarding when we do manage to find something new. The biggest problem we are having is the feeling that we are missing something crucial; we just seem to be going in circles. Upwards of three hours may not seem like a lot, but it has been quite a journey despite the time. Of course we will be pretty excited when we can successfully report back about new findings, most of all when we figure out how to save results… but for now figuring out the concordance and frequency tools has been rewarding.

Monk: Why Can’t We All Just Get Along?

As I left room 440a of the Taylor Family Digital Library after the in-class Monk workshop, I was pleased that I had been assigned to work with this tool.  Initially, out of all the tools, I found Monk to be the most visually appealing, based on its simple layout and lack of clutter.  During the workshop I learned that I was able to compare Shakespeare’s comedies and tragedies, and more specifically identify the nouns used in both.

This toolset, entitled Compare Workset, seemed to have a lot of potential and I was looking forward to applying it to Hamlet and seeing what new information I could learn about this play. As I began to work with Monk on my own time, I discovered that there were many other tools just waiting to be used, and I was eager to try them out.

And now, two weeks later, I find myself at a standstill.

Monk provides default toolsets, which can be used to analyze a workset.  A workset is a text, however big or small, that the user has saved to his/her project.  My team mates and I created two worksets, the first being the entire play of Hamlet and the second being Act 3 Scene 4 of Hamlet.  We wanted to use the toolsets to study each workset individually and compare them to each other, and see what results we would get.  I personally had an immediate interest in taking advantage of all the extra toolsets that I could add to my own project.  Toolsets such as Text Viewer, Analysis Tool, and Knot seemed to be helpful and relevant tools for my project, so I tried them out.  When I tried to use Knot I got this:

When I tried to use Analysis Tool, I got this:

When I tried to use the Text Viewer, I got this:

I had barely even started using Monk, and already it was limiting me.  How was I supposed to get the most out of using Monk if it wouldn’t even let me have access to all the tools that it offers?  As I made more and more attempts to work with these tools, I became more and more frustrated.  A more detailed explanation of my group’s attempts and aggravations can be seen in my team mate Hayley Dunmire’s blog: http://engl203.ucalgaryblogs.ca/2012/03/02/monk-my-frustration-and-lack-of-findings/.  I want to focus on one of the limitations that my team mates and I discovered.  The toolset Compare Workset allows me to compare my workset of Act 3 Scene 4 to my workset of Hamlet.  I selected the Analysis Method “Dunnings: First workset as analysis set,” filled out the rest of the requirements and clicked compare.  I got a listing of words from most commonly to most rarely used.  By clicking on one of the most commonly used words, I was brought to a page that listed all the times “death” was used in Hamlet.

I guess that’s interesting.  But what am I supposed to do with that?  I’m able to use the toolset Classification to isolate the use of “death” in Act 3 Scene 4 to just one time, but that’s all.  The more I used Monk, the more I realized that on a smaller scale it’s difficult to make any major progress in discovering anything of significance.  My team mates and I bounced around ideas such as looking at the type of words Hamlet uses to speak to Gertrude in this scene as compared to the rest of Hamlet.  But the results (as seen above) do not show the speaker or line number.

As angry as I am at Monk and how much it’s limiting me, I don’t want to give up on it.  My group and I came to the conclusion in our meeting today that Monk seems to be a tool that is better for large scale comparisons as opposed to small scale analysis.  For  Phase 1 this puts us in a tight position.  But I’m planning on working with Monk (yes with, not against) over the weekend hoping to come up with new ideas on how we can work with the toolsets that Monk has to offer, instead of trying to fit the toolsets into what we think they can provide.


Monk- My Frustration and Lack of Findings…

My experience with Monk has been a love-hate relationship. At the beginning, with the in-class workshop, I thought that my tool and I would have a pleasant bond, but I was wrong. I have found that Monk sets you up with the basic steps and shows you how to analyze things on a very broad spectrum through the comparison of texts. However, if you wish to dig deeper and analyze smaller parts of plays, then you are limited.

The workset itself is very easy and welcoming; it makes it easy to find works, pick them apart and make them your own by modifying them in whatever way you wish. In our group each one of us has a workset that is everything in Hamlet but 3.4. This makes it interesting to see what is there and what isn’t when you take away a section of the text. I like how you can go back and easily tweak and re-work worksets, and how they are easily available for ready use.


However you can only use the worksets that you have created to combine with one another for compare worksets and combine worksets.  Classification only allows you to use one workset at a time which I find very frustrating because I have to write down my findings then switch over to the other workset, do the same thing then manually compare the two together. I also found with my group that if you wish to compare worksets on a smaller level it doesn’t make much sense. We found that when we were looking at the whole play compared with 3.4 the analysis from the compare worksets from looking at both did not seem to make sense and the data would be relatively the same.


I found that the concordance was helpful for picking out certain words in a body of works or a scene; however, it does not tell you who the speaker is or where the word was found. This is also frustrating, because if I would like to have a further grasp of the scene and what it means, I have to go through the play and find out where the word is mentioned to see who said it and in what context it was said. This makes the concordance useless: if I have to go through and look for the specific word anyway, I can easily find it on my own. From this I can see that its only practical job is to count words.

You would think that Monk would be willing to help you along the way with all the question buttons available to look at; however, those are utterly useless themselves.

I found myself and my group looking on Wikipedia and Google in search of the answers you think the program would offer you, but even that came up short.

I was thrilled to know that Monk had a bunch of fancy work set tools that you can play and make your own into whatever you wish. Unfortunately I have found that they do not work at all, I have tried countless times logging on and off, restarting my computer and switching the file of comparison but nothing seems to make these files work.  It is sad because it would be so neat to see the findings that you get on a different view point or aspect.  It does seem to be another thing Monk fails to do.

I would like to try to see if I am able to upload  entire works said by certain characters or pivotal conversations to compare to Hamlet or Shakespeare’s Tragedies.  I think that this would be very helpful in looking deeper into the text. However this seems like a lot of work to conform to the regulations made by Monk when other tools do this for you.

Overall Monk is handy if you would like to know the word count of a certain body of text or the comparison of two things on a much larger scale. From my point of view it is not made to dissect texts on a small scale but rather a very large one when looking at massive bodies of work. Anything that can bring you details seems to not match up or make sense when you try to pick it apart, and that makes me unsure of any information that I may be receiving from it.  To me Monk is like the tool that sets you up for the much larger tools, it seems to show you the basics and stop there. It then becomes up to you to work with other tools to conform to Monk. That seems a bit unfair since I think the tool should have this option already. I think Monk has failed to meet its standards that it presents and it is all flash and no substance to it, it has left me confused, frustrated and with more questions than answers.

MONK basics

MONK (an acronym for Metadata Offer New Knowledge) is by the developers of WordHoard.

Start with this tutorial. Here are some notes:

  • MONK’s capabilities are summed up in the word “metadata,” which essentially means data about data. Parts of speech and lemmas are different examples of metadata.
  • For example, in the phrase “the Thames ran softly,” we know that ran is a verb (specifically, the past tense of to run); that softly is an adverb, modifying ran; that the Thames is a noun (specifically, a river in southern England).
  • The tutorial tells us that MONK treats all texts as “bags of words.” Think of these like bags of Scrabble tiles, but where every word is copied onto multiple tiles.
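The “bag of words” idea can be sketched in Python with the standard library's `collections.Counter`. This is an illustration only, not MONK's code, and the helper name `bag_of_words` is hypothetical. A bag records how many times each word occurs and nothing about word order, so two phrases built from the same words produce identical bags.

```python
from collections import Counter

def bag_of_words(text):
    """Count each word in the text, ignoring case and word order."""
    return Counter(text.lower().split())

original = bag_of_words("the Thames ran softly")
shuffled = bag_of_words("softly ran the Thames")

print(original)
print(original == shuffled)  # word order is lost, so the bags are equal
```

This is why a tool built on bags of words can count and compare vocabulary, but cannot by itself say who spoke a word or where in the scene it fell.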