Wordseer: The Problems and the Possibilities

So I was at my group meeting on Friday, and, wouldn’t you guess, our tool, Wordseer, wasn’t up. That’s to be expected occasionally with any program hosted on the internet, because servers crash, updates get installed and tested, and so on. But then it happened again today.



When a tool is down at least twice over the course of a week, it starts to indicate, at least to me, that it has some technical issues to solve. Now, I’m a computer-science major as well as an English one, so I understand technical difficulties, and I accept that there are plenty of tools out there with such difficulties… But not all the problems are technical.


For the remainder of this post, I’m going to imagine a hypothetical next-generation Wordseer, and to this Wordseer 2.0 I’m going to attribute as many helpful features as possible. This way I am suggesting improvements as opposed to criticizing Wordseer for what it is not.


The first and most useful thing that is missing and could be included in a new iteration is a text uploader, so that you could analyse any text you want. Currently the selection is a) written by Shakespeare, b) written by Stephen Crane, or c) related to slaves. An uploader would give users a far broader range of texts, and would also allow someone to take a text and easily use a tool like TAPoR to extract pieces of it for a more versatile analysis. For example, Dr. Ullyot wanted us to try to find a way to analyse Hamlet 3.4, but lacking any function to do so, our group was incapable of analysing any one portion of Hamlet. If we could upload an XML, text, or HTML file to be read, we could then upload just 3.4 and analyse that. With this theoretical addition, one could also upload just one speech, or the lines of one character, or a section that the user has found to be written in a certain meter. Any of these, and just about any other selection of text, would help a user find more specific, varied, and interesting results.
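To make the idea a bit more concrete, here is a minimal sketch of what pulling one scene (or one character’s lines) out of an uploaded XML file could look like. I don’t know what format Wordseer actually uses internally, so everything here is an assumption for illustration: the element names (`play`, `act`, `scene`, `speech`, `line`), the `n` and `speaker` attributes, and the `extract_scene` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup standing in for an uploaded play file.
# The tag and attribute names are assumptions, not Wordseer's real format.
SAMPLE_XML = """
<play title="Hamlet">
  <act n="3">
    <scene n="3">
      <speech speaker="CLAUDIUS"><line>O, my offence is rank</line></speech>
    </scene>
    <scene n="4">
      <speech speaker="HAMLET"><line>Now, mother, what's the matter?</line></speech>
      <speech speaker="GERTRUDE"><line>Hamlet, thou hast thy father much offended.</line></speech>
    </scene>
  </act>
</play>
"""


def extract_scene(xml_text, act, scene, speaker=None):
    """Return the lines of one act/scene, optionally for a single speaker."""
    root = ET.fromstring(xml_text)
    # ElementTree supports a small XPath subset, including [@attr='value'].
    path = f"./act[@n='{act}']/scene[@n='{scene}']/speech"
    lines = []
    for speech in root.findall(path):
        # Skip speeches by other characters when a speaker filter is given.
        if speaker is not None and speech.get("speaker") != speaker:
            continue
        lines.extend(line.text or "" for line in speech.findall("line"))
    return lines


if __name__ == "__main__":
    for text in extract_scene(SAMPLE_XML, 3, 4):
        print(text)
```

With this sample, `extract_scene(SAMPLE_XML, 3, 4)` gives back only the two lines from 3.4, and adding `speaker="GERTRUDE"` narrows it to her line alone. A real uploader would have to cope with whatever markup the user’s file actually has, which is surely part of why this is harder than it sounds.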



Another function that could be included is a way to report bugs in the software’s search for relations, because these do, occasionally, pop up. These things happen, and it’s easier to report a bug if you can just press a button next to the one search result that turned up when it doesn’t apply. This would help the creator of the software understand and develop the tool so that it becomes more accurate over time, which in turn would give users more varied and more appropriate search results and make their experience simpler and more effective.


The last addition I would suggest is private and public settings, which would apply to current functions such as tags, annotations, and collections. Things not already included that could also be private or public are saved search results, documents that the user uploaded (as per earlier in this same post), or even forums or chat. This would enable collaborative work across a) the entirety of the digital humanities field, b) a small group of students or researchers working on a research paper or project, or c) just the one user. It would give the users in the necessary groups access to everything they need or want and eliminate the unnecessary annotations and documents.


There are currently 3 ENGL203 and 4 Hamlet-related documents, all of them public.


Now, I realize that these are largely the criticisms of a computer-science student, but they are also the opinions of a Wordseer user and English student. I think Wordseer has potential as a fun and intensely useful tool that could help students come up with theses for their papers. But let’s be honest: no one’s going to search the relationships between words in works about slaves, and not too many digital humanists will be interested in Stephen Crane’s works. Right now it’s limited to Shakespeare, and limited within that by subdividing walls.

6 thoughts on “Wordseer: The Problems and the Possibilities”

  1. Wow, I’m really impressed with the imagination of your blog post topic despite your tool being down, way to make the best out of your situation!
    On that note, I’d have to say that this was one of the most interesting blog posts this week because of the dual perspective you presented! Both fields of study offer the rest of the English student community (outside of the computer-science discipline, such as my technically challenged self) an honest understanding of the computer side of digital analysis. I would also like to commend you on your understanding of your tool’s creators: your criticisms are well stated and, I would hope, will be considered if reviewed by the creators of Wordseer.

    Have you tried emailing them at all to ask what problems they are encountering that result in the site being down so frequently? Perhaps the problems which you have addressed could be fixed during site maintenance… but this is coming from someone who lacks site-building knowledge.

    Great post!

  2. Hi Jesse,

    Sorry about the site being down on Friday. If it happens again, please just email me (aditi@cs.berkeley.edu) with something like “WordSeer is down again” in the subject line, and I’ll get to it. I wasn’t messing with it on Friday, so I’m not sure what happened there.

    A bug reporting feature is a fantastic idea, as is being able to share saved search results.

    As for being able to upload any text you want: we’ve wanted this for WordSeer all along, but it’s been complicated because there’s a lot of computational processing that goes on behind the scenes, and different text collections have different formats, etc., so it’s been put on the back burner for now.

    The most annoying limitation all of you seem to be facing is how to restrict to just the sub-section Hamlet 3.4. Honestly, I have no good idea how to do this with WordSeer, which just goes to show how little I, as a computer scientist, really know about the kinds of analyses English scholars need to do. Technically, I created the snippets feature to deal with situations like this, but as nobody can figure out how to use it (me included), I’ll have to think of a better way.

    Thanks for all this — keep up the good work.


  3. I also wanted to say, in defense of the Slave Narratives collection and the Stephen Crane collection, that WordSeer was first developed and conceived *around* the slave narratives collection in close collaboration with the English scholars who were working on it.

    Similarly, the Stephen Crane collection was developed for another scholar at Emory University, who has a specific case study in mind, and the Shakespeare version was developed for your professor, Michael.

    As you’re discovering, it’s still very much in Beta — lots of bugs, not enough features, which is why it’s not really very public-friendly in terms of the text collections it supports.

  4. Writing this post through the eyes of a student who has knowledge in both English and Computer Science made for a very interesting read! It gave me a completely different perspective on how to look at technical difficulties and potential improvements that could be made on all of the tools we are using in general. I really liked the way you phrased the technical difficulties issue; you were very understanding and not too quick to criticize the tool. I found with my group we were easily irritated and frustrated with the problems we faced, and we were quick to judge the tool’s flaws rather than appreciate the positive aspects. The way you chose to make your post about how to make the tool better instead of just tearing it down was a good reminder for me. Things can always be improved, so your positive perspective and choice to offer some helpful opinions to WordSeer was really nice to see. Thank you!
