Digimon and Divination?

It is a grey day. Warm with snowflakes like glitter. Someone down the hall seems to be having a workroom party, which they are all quite content with; you can just tell by the laughter. But we are instead lost in a different world; a digital world, if you will. One with so much information compiled and cross-linked that it encompasses the realm of human experience, and encodes the most significant events, works, and experiences as data. It is a place where you can indulge in the works of a man who lived in the 1600’s, and divine new secrets 400 years later. When you really think about it, it is fantastic; unbelievable almost.
Yet, at the same time, it is another new day in the Humanities, and a lot of planning got done. Today was devoted to pre-project-planning (say that three times fast). Although there were not too many new discoveries, there was exploration and expansion of the old ones. Monk, of course, is a mining tool, meaning that the more you work the more you will discover. As it is, I have been finding more uses for the NaiveBayes and decision tree tools. They might be unconventional, and a little hit-or-miss, but the results are pretty exciting!
In the classification tool you can find NaiveBayes, under which you load your worksets and rate them. I found that rating each scene with a theme will give me the words that make the predicted theme true or false. Thus, searching for confirmation of the theme “madness” elicits words that have some cryptic connection with that theme, such as the word “armour,” which has to do with the armour of the mind… From there, you have to make some good old-fashioned English major connections and argue your findings; something that we are all experts at. My idea is that the armour of the mind refers to its sanity, which is slowly broken down by lies. Etc.
Anyway, you get the point. This is what my program is best at in comparison to the other programs. They have the frequency, concordance, and description tools, but this seems to be a unique feature of Monk. The biggest question now is if it can be useful enough to present. That is the question for next time.
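For anyone curious about what sits behind a list of "suggestive" words, here is a minimal sketch of the general Naive Bayes idea. It is not Monk's actual code: the scene snippets, the labels, and the function name `word_log_odds` are all made up for illustration. Each word gets a score showing how much more likely it is in scenes rated "madness" than in the others.

```python
import math
from collections import Counter

# Toy stand-ins for rated scenes: (text, label). Invented text, not real MONK data.
rated_scenes = [
    ("the armour of the mind is broken by lies", "madness"),
    ("mad as the sea and wind when both contend", "madness"),
    ("the king shall drink to the health of all", "other"),
    ("go bid the players make haste to the king", "other"),
]

def word_log_odds(docs, target, smoothing=1.0):
    """Return {word: log P(word | target) - log P(word | not target)}."""
    in_counts, out_counts = Counter(), Counter()
    for text, label in docs:
        bucket = in_counts if label == target else out_counts
        bucket.update(text.lower().split())
    vocab = set(in_counts) | set(out_counts)
    # Add-one style smoothing so unseen words do not produce log(0)
    in_total = sum(in_counts.values()) + smoothing * len(vocab)
    out_total = sum(out_counts.values()) + smoothing * len(vocab)
    return {
        w: math.log((in_counts[w] + smoothing) / in_total)
        - math.log((out_counts[w] + smoothing) / out_total)
        for w in vocab
    }

scores = word_log_odds(rated_scenes, "madness")
# Words with the highest scores lean most strongly towards "madness"
top_words = sorted(scores, key=scores.get, reverse=True)[:5]
print(top_words)
```

A positive score means the word leans towards the rated theme and a negative one means it leans away. The interpretation (what "armour" has to do with madness) is still up to the English major.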

The words are supposed to be suggestive in conjunction to "madness"

It is not the most succinct method of analysis, but there is still time to work with it, and it does prove to be interesting every time. For example, “black” appears five times in Act 3, and it is always in a very negative context:

Results for "Black"

In case you were wondering about the title and the bit of writing at the beginning, it just occurred to me that the premise of one of my absolute favourite childhood shows has an abstract relation to the Digital Humanities. That show was Digimon (I know, I know), which is set in an alternate dimension housing a world made in the image of the earth, with fictional-type-monster inhabitants. If you know the show you might remember that the digital world was created by the compilation of data that is stored in computers and over the internet. First the foundations were laid, and “Over the ensuing years, through the continued growth of the electronic communications network on Earth, the Digital World continued to expand and grow” (http://digimon.wikia.com/wiki/Digital_World). It’s a little bit silly, but it is an accurate depiction of not only the information amassed on the internet, but of the Digital Humanities itself, which must hold significant portions of the literature that shapes the world we live in. Literature is made in the image of the earth and of human experience, and the characters that inhabit it are in the image of its creatures. The depth that it reaches is too great to measure. Is it too far a stretch to say that the universe of data is an alternate to the universe of reality?
Just a thought.

Where does the world end and data begin?

Marry, this’ miching malicho; it means mischief.

This is easily one of my favourite lines in any Shakespeare play. Why? Because the words befit the meaning in a style that is all their own. And I cannot help but think that if Shakespeare himself knew that twenty-five young adults were set free with the power of technology to analyze his plays, he might think that a mischief all its own.
In our own little sect of madness we got off to a bumpy start. We were all “masters” of our respective programs, but how do we compare them? How can we link each advantage and rate them? How many of the tools overlap in use? And what becomes overshadowed by a newer, better tool?
Most of all, how can we find out?
We needed a common ground: something inside Hamlet that every person can identify. That is of course madness, something every hard-working university student has met with at least once; but besides all that, it is a theme within Hamlet that everyone will decipher differently. Is he sane and acting? Is he crazy from the start? Is he driven mad by his own efforts? Hamlet will always be a mystery so long as space-time continues.

Where we are now:
Since we had a goal in mind, we were able to find the means. Across the different programs, frequency searches, Naive Bayes, concordance searches, and “described as” searches have all proved useful. We are able to track down suggestive words through Naive Bayes, and then put them into other searches to divine meanings. The other cool thing that we have been finding is the ability to compare Hamlet to other Shakespeare tragedies. “Madness” appears in Hamlet 22 times! The next most frequent is probably Romeo and Juliet at 11 times. That is a huge jump. So we know that Hamlet is focused on madness; now we just need to find subtle hints, recurring themes and general meanings that can help to indicate the true madness of Hamlet, or the play he puts on for everyone.
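One caution when comparing counts between plays: the plays are not the same length, so a raw count and a rate per so-many words can tell slightly different stories. Below is a small sketch of how both could be computed. The function name `rate_per_10k` is hypothetical, and the two sample strings are single quoted lines standing in for full play texts, so the numbers they produce mean nothing on their own.

```python
import re
from collections import Counter

def rate_per_10k(text, word):
    """Return (raw count, occurrences per 10,000 words) for `word` in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    count = Counter(tokens)[word.lower()]
    rate = 10_000 * count / len(tokens) if tokens else 0.0
    return count, rate

# Placeholder snippets only; a real comparison would load the whole plays
samples = {
    "hamlet_snippet": "Though this be madness, yet there is method in't.",
    "romeo_snippet": "These violent delights have violent ends.",
}
for name, text in samples.items():
    print(name, rate_per_10k(text, "madness"))
```

With the full texts loaded in place of the snippets, the second number would show whether a jump like 22 versus 11 holds up once length is taken into account.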

The uniqueness of our Act has been coming out slowly as well. We know (not necessarily because of the digital humanities) that our Act contains much of the most important action in the play. The “To Be or Not To Be” speech appears, as well as “Get thee to a nunnery,” the play performed for Claudius, the confrontation of Gertrude, the murder of Polonius, etc! There is simply a ton of stuff to research and a lot to discover.
Most importantly for next time we must study:
The use of “poison in the ear” as a metaphor.
Any reference to the mind such as:

Every instance that describes a character as “mad.”
And really anything else we can think of.
So that is about it for part 1 of Phase 2. We have a strong Act 3 team, with only a few hiccups, and some illness 🙁 and hopefully there will be more success to report on the next post. Right now there are just too many questions! It’s pretty amazing what we can do though. What has taken minutes on MONK or Voyeur, etc, would have taken months in the traditional way. Could you imagine going through every Shakespeare tragedy and noting the use of the words “mad” or “madness?” It sounds crazy, and yet that is what the creators of these programs have done for us. We are grateful 🙂

Naive and Decisive actually sums up a lot of MONK!

Phase 2, and a new light… hopefully.

Being the expert on MONK is a tough job. Luckily the bond that comes from quizzically hitting buttons and keys for 9 hours is not an easy one to break. My project screen looks well used and familiar-

The results go on and on. Do we know what all of them mean? Not really 🙂 but we like them.

Meeting the new group in person really revealed how much the other groups liked or disagreed with their tools as well, and the hope is that what one tool lacks, the others will fill. So far we have had an easy time agreeing on regulations and sharing stories, so things are looking good for acing this presentation in a different way than the first, (though my phase one group was completely amazing, and I will miss them).

As for MONK – let’s just say not much has changed, except – the Act! Act 3 is my personal favourite act. Insanity, insults, murder, confrontation, blood, more ghosts, and much more! Really though, it just always seems like the most action packed of all the Acts!
Monk is doing its best to help me support this idea. The word “madness” shows up nine times alone in this Act! Although I did discover a slight annoyance again. I could not get the program to look through a whole act, only through the scenes. So far this is only in “Edit Worksets,” so it could just be a glitch.

Other words that show up quite a bit? “Time,” which shows up 10 times, and “Heaven,” which also shows up 10 times! “Action” – 6. “Go” – 17. Death and murder show up quite a bit too, but of course the words pertaining to the future, and action-y words, show up more often, which at the very least could tell us that this Act appears in the middle of the play.

I had a very cool discovery too! In the classification tool with NaiveBayes and Decision tree (which you either understand or you do not, there is not much in between) I was able to load my Act 3 workset, which features each scene of act 3 as a different document meaning I can compare them! This is perfect for this Phase!

I rated each scene as either comedy or tragedy:

As you can see, scene 1 and 2 have slightly comedic tendencies, and scene 3, being of course about sending a man to heaven or hell, is not a comedy at all… and scene 4 is an absolutely confirmed tragedy, go figure. Anyway, I think this is brilliant! Let us continue…

Now all I have changed is scene 3 from comedy to tragedy:

This is amazing because it seems like Naive Bayes uses the documents as points of comparison. Scene 1 is supposed to be less of a comedy than before if scene 3 is a strong tragedy. That makes sense! In conjunction with plotting the murder of the “King,” the word “King” in the first scene seems to be associated with much deeper, darker meanings… Intriguing…
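Why would changing one scene's rating move another scene's score? A rough sketch of a multinomial Naive Bayes classifier shows the mechanism: every rated document feeds the word statistics for its class, so relabelling one document changes what "comedy" and "tragedy" look like. This is only an illustration under my own assumptions. The scene texts are invented, `prob_tragedy` is a hypothetical name, and Monk's real implementation may differ.

```python
import math
from collections import Counter

def prob_tragedy(text, labelled_docs):
    """Return P(tragedy | text) under multinomial Naive Bayes with add-one smoothing."""
    word_counts, doc_counts = {}, Counter()
    for doc_text, label in labelled_docs:
        word_counts.setdefault(label, Counter()).update(doc_text.lower().split())
        doc_counts[label] += 1
    vocab = {w for counts in word_counts.values() for w in counts}

    log_scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / len(labelled_docs))  # class prior
        for w in text.lower().split():
            if w in vocab:  # skip words never seen in the rated documents
                score += math.log((counts[w] + 1) / total)
        log_scores[label] = score

    # Turn the log scores into a probability that sums to 1 across labels
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights.get("tragedy", 0.0) / sum(weights.values())

scene_1 = "the king and the players jest"  # invented stand-in text
before = [
    ("players jest and sing", "comedy"),
    ("the king prays the king kneels", "comedy"),  # stand-in for scene 3
    ("blood and murder in the closet", "tragedy"),
]
# Same documents, but the scene 3 stand-in is now rated as tragedy
after = [before[0], (before[1][0], "tragedy"), before[2]]

p_before = prob_tragedy(scene_1, before)
p_after = prob_tragedy(scene_1, after)
print(round(p_before, 3), round(p_after, 3))
```

In this toy setup the word "king" moves from the comedy pile to the tragedy pile along with the relabelled scene, so any other scene that mentions the king gets pulled towards tragedy too.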

I could honestly go on about this forever, but I doubt every one of my findings would be as interesting for everyone. In summary this just means that I have a way to directly look at all of the scenes together, and that is worth a lot! Anyway, our self assigned homework for the weekend was to read all of each other’s blog posts, and see what we deduce from them, what would work with each other’s programs, etc. The hope is that through self-education we will have a breakthrough in compatibility capabilities… if that makes any sense. I am looking forward to exploring more of my new discovery, and am really going to think about how it can help my group members; that is my self assigned homework for the week. At the very least I can show off my new discovery next time and hope that they think it is as cool as I do.

Until next time, Kelsey ^.^

Monk: A Greater Understanding and a Bigger Hurdle

Since the last post, the Monk group has met twice. We have made significant advances with the tools of the program, but have also made a crucial and unfortunate discovery to humble our success.

First, however, our discovery. In the “compare” toolset there is an analysis method that we had not managed to figure out before. It is called “IDF” and it allows you to select a training set. Once you manage to fulfill all of the options to the program’s liking, you are advanced to a screen much like any other one, where you can select a work, view it and type in the concordance you desire. Most of the toolsets get to this page and end there. However, for this tool, you are allowed to take the workset you nominated as a “training set” (we recommend selecting the all-encompassing “plays: tragedies” and “plays: comedies” or something similar for the most options) and from there re-select a mix of both full plays and even individual scenes and save it as its own workset (minimum 3 selections).
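A side note on the name: I am assuming "IDF" stands for inverse document frequency, a standard text-mining weight, although the Monk screens do not spell it out. Under that assumption, the idea can be sketched in a few lines. The three mini "documents" and the function name below are invented for illustration only.

```python
import math

def inverse_document_frequency(documents):
    """Return {word: log(N / df)}, where df is how many documents contain the word."""
    n = len(documents)
    doc_freq = {}
    for text in documents:
        for word in set(text.lower().split()):  # count each word once per document
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n / df) for word, df in doc_freq.items()}

# Three toy documents, echoing the three-selection minimum for a workset
workset = [
    "the queen and the king",
    "the king and the ghost",
    "the closet and the arras",
]
idf = inverse_document_frequency(workset)
print(sorted(idf.items(), key=lambda item: item[1], reverse=True))
```

Words that appear in every document (like "the") score zero, while words found in only one document score highest, which is why this kind of weight is useful for telling documents apart.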

As usual you hit a dead end on the concordance page, but uniquely, your saved workset becomes useful. Take your new workset with its many parts and load it into the “Classification” Toolset.

From here you must give each document a rating and follow the continue button…

This is the part of the program that Monk specializes in. Naive Bayes and Decision trees. The explanation of which will be one of the major parts of our presentation. After selecting your method you can insert a prediction if desired and…. Voila! You get a complicated rating system of “confidence” and “frequency.”

Very cool – now for the sad part. This tool, from what we understand, is basically used for the identification and classification of authors’ works. It particularly focuses on entire plays and their characteristics. Poor little Act 3, Scene 4 does not register much on the scale, and for the part that does, we of course already know its origin and its characteristics as a Shakespeare play. So how can we use Monk’s most defining tool as an aid in discovering Act 3, scene 4? That is our current mission, as well as explaining to you all this lovely piece of analysis:

Also since our last posts we have done more research into the purpose and uses of Monk…
We found out that Monk is one of the first of the Digital Humanities programs, almost a prototype for Wordhoard. Through different group members’ findings we have determined that the Classification, Frequency of words and the Concordance searches are specifically meant for analyzing large scale works such as entire plays or collections to find themes throughout historical moments, between writers or characteristics of the writers themselves. As it is, we are not sure how useful it is as a tool to analyze one scene in one play. Our greater understanding of the tool itself has further clarified this. Monk is great at finding certain things within a text, any text, of any size. When it comes to comparing texts, though, it is harder for a small document such as a scene to provide enough information to represent itself against other documents.

For the remaining days, our work will be centered on figuring out how Monk can directly provide insight into Act 3, Scene 4 specifically, and on seeing if it is possible to use the tool in any depth without comparing the scene to the entire works of Shakespeare’s tragedies –
Because as interesting as our tool can get, our focus must be on the one scene, and we are trying to be optimistic about getting it to work for us!

So, till next time, I leave you with this excerpt from the Monk help buttons.

Monk Workbench: Either the most simple or the most complicated tool in the Digital Humanities.

Kelsey Judd, First post.

Today was our first group confrontation of the program MONK.
http://monkpublic.library.illinois.edu/monkmiddleware/apps/workflow/
It began well, with each member contributing what they had learned over the last week, and with all of us piecing together our separate knowledge to unravel the mysteries of the work tools. Within an hour we had discovered all the ins and outs of the program’s most useful components which I will try to explain: “Define Worksets” for finding concordances in lemmas or spelling, and “Compare” for finding frequency and Dunning’s analysis. Unfortunately soon after this we hit something like an impassable brick wall. Either due to our lack of experience or to something we cannot quite figure out in the program there does not seem to be all that much more to it beyond “Define” and “Compare,”…

The define feature is fairly straightforward once you realize one main point: it does not seem to keep an actual record of your “worksets.”

You can choose a tool on its own, or add a workset to work with.

We found that when you choose the “define worksets” tool it does not affect a tool if you choose a workset to go with it. Either way you come up with this page

It goes here whether you have a workset selected or not.

From here there are only two options. You can create a workset, which is basically searching Shakespeare’s works, or various works of American fiction and then saving your search and naming it. The second option is to search for lemmas, spelling or parts of speech; however, this does not seem to do anything. Whenever we try it, it will still ask you for which work you are searching in, even if you defined Hamlet or act 3.4 as your workset on the main work page or within the tool previously.

From this page, once you have selected Act 3, scene 4, comes a very simple little tool where you can search concordance. All you need is for the text to appear in the “advanced viewer” and then, of course, to search on the concordance tab below it. Simple and straightforward. The only problem with this was that while it tells you all of the words or lemmas in which the word appears, and tells you how often they appear, it does not provide the speaker or location of the line, so it is mostly up to context. Now, I am sure there must be more to use in the define/edit worksets tools, but for some reason the five of us could not find it. Sounds like we still have a lot of exploring to do.
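To show what a concordance search with the missing speaker and location might look like, here is a rough sketch outside of Monk. Everything about it is an assumption on my part: the `concordance` function is hypothetical, the three sample lines are typed in by hand, and the "line numbers" are just positions in that little list, not real line numbers from the play.

```python
import re

def concordance(lines, keyword, width=30):
    """Return (position, speaker, context) for each whole-word match of `keyword`."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits = []
    for position, (speaker, text) in enumerate(lines, start=1):
        for match in pattern.finditer(text):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            hits.append((position, speaker, text[start:end].strip()))
    return hits

# Hand-typed sample lines as (speaker, text) pairs; positions are list order only
sample = [
    ("GERTRUDE", "O, what a rash and bloody deed is this!"),
    ("HAMLET", "A bloody deed! almost as bad, good mother,"),
    ("HAMLET", "As kill a king, and marry with his brother."),
]
for hit in concordance(sample, "deed"):
    print(hit)
```

Keeping the speaker next to each hit means less guessing from context, which was exactly what we were missing.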

The other very useful tool is the “compare worksets” tool. It allows you to pull up specific texts, for example Hamlet as a whole, compared to just Act 3, scene 4.
It allows you to see the frequency or do a Dunning’s analysis of a word or a lemma, with the two variables being Shakespeare’s other works or works within a text. We found this works much better when used on a larger scale, such as comparing Hamlet to another play, or the whole of Shakespeare’s works.
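For the curious, Dunning's analysis is a log-likelihood test of whether a word is unusually frequent in one text compared with another. The sketch below uses a common two-term form of that statistic; I am not certain it matches what Monk computes internally, and the counts and text sizes in the example are invented purely to show the call.

```python
import math

def log_likelihood(count_a, total_a, count_b, total_b):
    """Two-term log-likelihood (G2) for a word's counts in two texts.

    count_a, count_b: occurrences of the word in text A and text B.
    total_a, total_b: total number of words in each text (assumed non-zero).
    """
    combined = count_a + count_b
    # Counts we would expect if the word were equally common in both texts
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Invented numbers: a word seen 9 times in a 6,000-word text
# versus 13 times in a 24,000-word text
print(log_likelihood(9, 6000, 13, 24000))
```

A score near zero means the word is about equally common in both texts, and larger scores mean a bigger imbalance. Tiny texts have tiny counts, which keeps scores small; that fits our finding that the tool works better on a larger scale.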

Beware: the words on the far right run together sometimes, so you end up getting excited on finding the new word "actairbed."

As you can see the strange feature of this is that the words sometimes run together, so you think you have found a cool word: “actairbed,” when it’s really just the three close together. Amateur mistake of course. Clicking on the words will take you back to the spelling search and you will once again see the context and frequency with which they are used. The frequencies are quite a neat discovery, I think one of our next projects will be on how to use this tool to discover new and exciting themes in Hamlet act 3, scene 4.

End of the line?

Overall the experience with MONK has been a lot of trial and error, but rewarding when we do manage to find something new. The biggest problem we are having is the feeling that we are missing something crucial; we just seem to be going in circles. Upwards of three hours may not seem like a lot, but it has been quite a journey despite the time. Of course we will be pretty excited when we can successfully report back about new findings, most of all when we figure out how to save results… but for now figuring out the concordance and frequency tools has been rewarding.