Fleshing Out a Thesis and Scalar Searches: Part 1

I’ve been wondering recently how I’m going to approach my final blog post, or final paper, for this class. I’m not sure what kind of questions I could be asking that would be important enough that it could make up an essay of up to 2500 words. It’s a daunting enough task to come up with a paper this big, but it also counts for a huge chunk of my grade, a chunk of a size I care not to see.

Thinking about what subject I could come up with, I thought to simply build on the work I’ve done so far. This idea seemed simple enough, so I went to the blog site and read my blogs again. Now, for those who’ve not seen my blog posts, I’ve written a blog on the relations search in Wordseer, a blog describing the problems Wordseer faces and things that can remedy them, and a blog describing how the context in which you look at a tool limits and affects the results you get.

Now, the idea behind this last blog post really interests me as a possible starting point for finding an argument to make in my last blog post. So now, through all the rest of my posts I’ll flesh this idea out a little bit more so that I can be prepared for my final paper / post / phase 3.

In Hamlet I’ve begun to experiment with this idea. I’ve searched the word ‘die’ in two different collections of documents of varying sizes. I start in the context of Act 3, Scene 1, which includes Hamlet’s “to be, or not to be” speech.

 Searching Hamlet 3.1 with Wordseer

In this scene I found 4 results, which does not offer a very wide view of how death and dying are seen by Hamlet. Instead, it offers a result very specific to the point in time at which Hamlet is saying these things. In this case, the results show that ‘die’ is used with the word ‘sleep’ twice. This is very useful for generating hypotheses or finding points to inspect closely within Hamlet, but that is not the point here. For now it’s enough to know that these two uses of ‘sleep’ occur in Hamlet’s ‘to be, or not to be’ speech and give a clue as to how Hamlet is viewing death at the time he is giving the speech.

Now, I’ve searched this same word in the larger context of Hamlet, the play, as a whole, and I’ve come up with 17 results.


It is important to note that, besides there being more results, and therefore more views on the word itself, the results are far more varied. These results can be used to flesh out the views of death that Hamlet, the play, portrays, more easily and with more accuracy. Showing more aspects of the particular personality of the play allows someone to come to understand the play better and more easily.
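The idea of running one search at two scales can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `excerpt` dictionary is a tiny hand-typed sample of lines (not the full play, so its counts will not match the 4 and 17 above), and the `search` function and its `scope` parameter are hypothetical names of my own, not anything Wordseer exposes.

```python
import re

# A tiny hand-typed excerpt keyed by (act, scene); NOT the full text of the play.
excerpt = {
    (1, 2): ["Thou know'st 'tis common; all that lives must die,",
             "Passing through nature to eternity."],
    (3, 1): ["To die, to sleep;",
             "To sleep: perchance to dream: ay, there's the rub;"],
}

def search(corpus, word, scope=None):
    """Return (scene, line) pairs containing `word` as a whole word.

    `scope` is an optional set of (act, scene) keys; None searches everything.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = []
    for scene, lines in corpus.items():
        if scope is not None and scene not in scope:
            continue  # skip scenes outside the chosen scale
        for line in lines:
            if pattern.search(line):
                hits.append((scene, line))
    return hits

scene_hits = search(excerpt, "die", scope={(3, 1)})  # the small scale
play_hits = search(excerpt, "die")                   # the large scale
print(len(scene_hits), len(play_hits))
```

The only thing that changes between the two calls is the scope, which is the point: the same question returns a narrower or a wider set of answers depending on how much of the text is asked.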

Now, I’ve done more searches on the play than this, but I’ve run out of time to analyze them. Instead, I intend to come back to this subject in my next blog, where I’ll better explain some of the differences that I’ve found while looking at different scales of a search.

In the Context of Things: How One Act May Be a Limited View

The third act of Shakespeare’s Hamlet is full of action, energy and great writing. It has strong character dilemmas, some death, powerful speeches and a play within a play. To most people with some interest and experience with Shakespeare’s works, this would seem like an excellent act and play to work with, but is it really enough to base writing on?

Until this point we’ve all been working with larger documents and even more diverse works, with work collections as big as the entirety of Shakespeare’s known works. I most often used the entire work of Hamlet as the basis of my searches on Wordseer, and with that I often got thorough and useful results. But when I started sizing down to searches focusing only on one act, even the incredibly diverse and action-filled act that I and my group get to focus on, I’ve been getting far fewer results than I would like, and fewer than I care to admit.

One possibility is that this will be fixed when I can start to look at the collective tools working together, where whatever small results one tool can find will begin to raise questions for other tools to answer. I think that this will happen, but even this approach limits the possibilities: no matter how effective a method you have for deriving information from data, and no matter how intensely you scrutinize that data, the results you can attain are corrupt if the data is corrupt.

I say this because I think that looking at only one act might corrupt the data that we receive from doing so. For the uncaring this next part might be a bit technical, so I’ll use point form to make it clearer.

  • A digital humanities tool is a survey tool that takes polls from texts to see if such and such a word fits under a certain description.

    • Imagine a text as a nation that we want to ask a question to, and all the words in that text as voting or polled individuals.

    • Every time I enter a search into Wordseer, I ask the individual words of the word population of the text nation “Hamlet” whether they apply to such and such a query. For example, I would be asking them “do you describe the word ‘Ophelia’?” and, if they do, they show up in the results of the poll.

  • A survey tool has less accuracy with a smaller polled group.

    • So, if I don’t poll the entire nation of Hamlet, but rather, I ask the constituency “Act 3” or “Scene 1 of Act 5” I’ll get a less accurate result.
    • Within this constituency there are those that abstain from voting (a specific word is not used in that scene/act, but several synonyms appear in its stead) and those that are running for mayor and are going to influence their friends and family into voting for them (an artistic use of repetition overpowers the results), as well as many, many other small things that would be less apparent and would skew the results less if the polling group were bigger.
  • These same quirks and others like them occur all over the place in texts. They make small changes that affect the interpretation of a text more as the text becomes smaller, and no one can anticipate or identify all of those problems.
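The claim that a smaller polled group gives a less accurate result can be checked with a small simulation. This is a rough sketch, not a model of any real text: the "nation" below is made-up data (10,000 words, 5% of which answer "yes" to a query), and `poll_spread` is a hypothetical helper of my own naming. It polls the nation many times at a given sample size and reports how much the "yes" share wobbles from poll to poll.

```python
import random
import statistics

def poll_spread(population, sample_size, trials=200, seed=0):
    """Standard deviation of the 'yes' share across repeated random polls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    shares = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # poll without replacement
        shares.append(sum(sample) / sample_size)
    return statistics.stdev(shares)

# Made-up "nation": 1 means the word answers "yes" to the query, 0 means "no".
nation = [1] * 500 + [0] * 9500

small = poll_spread(nation, 100)    # a "constituency" about the size of a short scene
large = poll_spread(nation, 5000)   # half of the whole "nation"
print(f"small poll spread: {small:.4f}")
print(f"large poll spread: {large:.4f}")
```

The larger poll should wobble far less than the smaller one. Note that random sampling only captures the sample-size part of the argument; a real act is not a random sample of the play, so quirks like deliberate repetition would skew things further in ways this sketch does not show.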

However, in the writing of this post, I have found that there are positives to polling a smaller sample size or to analyzing with a smaller text. For one, it clearly and effectively shows an opinion or result specific to that group or text, although that is clear in itself. For another, it clearly outlines the smaller, more specific quirks that I mentioned before, allowing for a clearer interpretation of literary methods.

Wordseer: The Problems and the Possibilities

So I was at my group meeting on Friday, and, wouldn’t you guess, our tool, Wordseer, wasn’t up. That’s to be expected occasionally with any program you find hosted on the internet, because servers crash, updates are installed and tested, etc., but then it happened again today.



When, over the course of a week, the tool is down at least twice, it starts to indicate, at least to me, that it has some technical issues to solve. Now, I’m a computer-science major as well as an English one, so I understand technical difficulties, and accept that there are plenty of tools out there with such difficulties… But not all their problems are technical.


For the purpose and remainder of this blog, I’m going to assume a hypothetical next-generation Wordseer, and to this Wordseer 2.0 I’m going to attribute as many things that would be helpful as possible. This way I will be suggesting improvements as opposed to criticizing Wordseer for what it is not.


The first and most useful thing that is missing and could be included in a new iteration is a text uploader. This way you could analyze any text that you want. Currently the selection is a) written by Shakespeare, b) written by Stephen Crane, or c) related to slaves. An uploader would give users a far broader range of texts, and would also allow someone to take a text and easily use a tool like TAPoR to extract pieces of it, for a more versatile analysis. For example, Dr. Ullyot wanted us to try and find a way to analyse Hamlet 3.4, but lacking any function to do so, our group was incapable of analysing any one portion of Hamlet. If we could upload an xml, text, or html file to be read, we could then upload just 3.4 and analyse the document. With this theoretical addition, one could also upload just one speech, or the lines of one character, or a section that the user has found that is written in a certain meter. Any of these and just about any other selection of text would help a user find more specific, varying, and interesting results.
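To make the upload idea concrete, here is a minimal sketch of cutting one scene out of an XML file before handing it to a tool. Everything about the markup is an assumption: the `play`, `act`, `scene` and `line` elements and the `n` and `speaker` attributes are a made-up layout for illustration (real editions use their own schemas), and `extract_scene` is a hypothetical helper, not part of Wordseer or TAPoR.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with a few hand-typed lines; real editions differ.
SAMPLE = """
<play title="Hamlet">
  <act n="3">
    <scene n="1">
      <line speaker="HAMLET">To be, or not to be: that is the question:</line>
    </scene>
    <scene n="4">
      <line speaker="HAMLET">Now, mother, what's the matter?</line>
      <line speaker="QUEEN">Hamlet, thou hast thy father much offended.</line>
    </scene>
  </act>
</play>
"""

def extract_scene(xml_text, act, scene):
    """Return (speaker, text) pairs for every line in one act and scene."""
    root = ET.fromstring(xml_text)
    # ElementTree supports this limited XPath form: [@attribute='value'].
    path = f"./act[@n='{act}']/scene[@n='{scene}']/line"
    return [(el.get("speaker"), el.text) for el in root.findall(path)]

scene_3_4 = extract_scene(SAMPLE, 3, 4)
for speaker, text in scene_3_4:
    print(f"{speaker}: {text}")
```

The same approach would cover the other selections mentioned above: filtering on the `speaker` attribute instead of the scene number would pull out the lines of one character.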



Another function that could be included would be a way to report bugs in the software that searches for relations, because these do, occasionally, pop up. These things happen, and it’s easier to report a bug if you can just press a button on the one search result that turned up when it doesn’t apply. This would help the creator of the software to better understand and develop the tool so that it becomes more accurate over time, which in turn would give users more varied and more appropriate search results and make their experience simpler and more effective.


The last addition I think could be made is the possibility of private and public settings, which would apply to such current functions as tags, annotations, and collections. Things not already included that could have both private and public attributes could be saved search results, documents that the user uploaded (as per earlier in this same blog) or even forums or chat. This would enable collaborative work through a) the entirety of the digital humanities field, b) a small group of students or researchers working on a research paper or project, or c) just the one user. It would enable the users in the necessary groups to have access to everything they need or want and eliminate the unnecessary annotations and documents.


There are currently 3 ENGL203 and 4 Hamlet-related documents, all of them public.


Now, I realize that these are largely the criticisms of a computer-science student, but they are also the opinions of a Wordseer user and English student. I think Wordseer has potential as a fun and intensely useful tool that could help students come up with theses for their papers. But, well, let’s be honest: no one’s going to search the relationships between words in works about slaves, and not too many digital humanists will be interested in Stephen Crane’s works. Right now it’s limited to Shakespeare, and limited within it by subdividing walls at that.

Words and Their Relations: Wordseer and One of Its Uses

In English 203 I’ve been working with Wordseer as part of a group specializing in that tool. Because I’m new to the digital humanities field, I am also new to the tool Wordseer. In order to better understand Wordseer and how it helps me study the digital humanities, as well as to help the other students in my class understand it, I came up with a couple of questions.

The first question I asked was “what is one use of Wordseer?”. What I found was that Wordseer is unique, among the tools in the digital humanities that I’m familiar with, in that it has a search function to find how words interact with each other. This is helpful in finding the opinions of characters towards certain things or other people. It works best with specific people, places or things. Using Hamlet as the text, I entered “Ophelia described as blank” so as to find how the characters felt about or viewed Ophelia.

Ophelia described as blank

The results show that Ophelia was fair, poor and sweet. I can see this as a very useful and important tool because it gives me a good idea of how Shakespeare intended us to view Ophelia, as well as the overall opinion that the other characters have of her.

We can also go to the bottom of the page and look at the results in a better context.

This section of the tool is useful because it helps the user to understand the specific situations that the word is used in. The word is shown with a few lines around it, which allows the user to get the mood and the tone of the situation the word is used in. One problem with this section, though, is that it tends to be a limited view of the word. By clicking on the indicated icon, however, you can read the full section of the text that the word is used in, and the text from the search page is highlighted to let the user know where the word is in the text.

This allows the user to know who is speaking, and also how that character feels about the word the user searched for. In my case, from this search and only a few lines of text surrounding each use of the word Ophelia, I can find out these things about her in a very short amount of time:

  • She is fair of appearance.
  • She grows mad sometime during the play.
  • She drowns sometime during the play.
  • Hamlet in particular thinks her beautiful.
  • When she dies she is deeply missed by Laertes.

From this narrative, I’ve learned one excellent way to interpret a text with Wordseer. Using the search function, a user can interpret what a character, place or thing is like. This is a very helpful function in literary analysis in that it can help define a character.