As an English major, a lover of the literary, historical, and symbolic, I walked away with a celebratory slide from anything that involved numbers in any shape or form. In my mind, at least, it was a celebratory slide; to my math, physics, and chemistry teachers, it must have resembled a frantic scramble for the door. This is, I think, something that the majority of my classmates in ENGL203 can attest to: the mistrust of anything that would take a piece of literature and suggest, “Sometimes, a river is just a river. The river moves with this speed, this velocity, because the water demonstrates this amount of viscosity, and it moves in this direction.” As students of the literary, I suppose we would respond with our rants and tangents about the river representing the winding and continuous process of life. My point here is that some of us, if not most of us, have harbored an inherent hatred toward the mathematical and statistical aspects of the world, and toward how those aspects take away from the symbolic values that have been metaphorically scattered throughout the universe.
Throughout the course of ENGL203, however, in the midst of my introduction to the world of the Digital Humanities, my understanding of the statistical, quantitative aspects of a literary text such as Hamlet has enriched my qualitative findings about the text. “Digital Humanities” was, in my mind, the best example of an oxymoron I had ever heard. I began this course with the question, “What could I possibly gain from knowing how many times a word shows up in a text?” I have concluded the course with the question, “In what different ways could these statistics and probabilities be applied to this text, or a wide array of texts, to provide me with the best kind of data to answer a series of research questions?”
Working with MONK throughout the semester in analyzing Hamlet, I have acquired a new appreciation for the mathematical aspects of the world. I say ‘appreciation’ not to imply that I have begun to enjoy mathematics, toward which I still have a lingering suspicion, but to mean that I can see the value it can provide in analyzing a text such as Hamlet. Ben Schmidt’s article “Treating Texts as Individuals vs. Lumping Them Together” has given me additional insight into the tools that can be used to analyze texts such as Hamlet in the Digital Humanities.
It is my perspective, and my argument, that although the traditional close reading we have been taught over the years as lovers of the literary has much to offer in an analysis of texts such as Hamlet, the tools available in the Digital Humanities, which provide us with statistical data and probabilities, complete our qualitative understanding with a quantitative one. I believe that the precedence we place on the qualitative, though understandable, is misguided. The numerical values that our tools provide, though frightening and confusing for us English majors, complete our analysis in a way that makes the digital a valuable and effective method of text analysis.
MONK, despite its glitches and imperfections, did not fail to teach me a lesson about the Digital Humanities and the value of statistical data. In the beginning, I suppose I felt much the way Queen Gertrude did when she responded to Polonius’ melodramatic ramblings by saying, “More matter with less art” (2.2.95). I found MONK to be spewing numbers, statistics, and probabilities at me that provided me with nothing valuable whatsoever.
The images below provide a pretty clear picture of what I have been ‘fleeing’ from since the start of my university career:
THIS, after the entire course, is still lost to me:
I initially believed that I was going to understand nothing about these tools and flunk out of the course. It was comforting to find that I was wrong.
An aspect of MONK that I found particularly interesting in the way it contributed to my analysis of Hamlet was the classification tool, with Naive Bayes and decision trees as its methods of analysis. By using the word frequencies of a variety of texts, MONK is able to classify texts into categories.
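As a rough illustration of how classifying a text from its word frequencies can work, here is a minimal Naive Bayes sketch in Python. It is not MONK's actual implementation: the two tiny training "texts", the genre labels, and the function names `train` and `classify` are all hypothetical, and add-one smoothing is my own assumption about how to handle words a genre has never used.

```python
import math
from collections import Counter

def train(labelled_texts):
    """Count words per label from (label, text) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for label, text in labelled_texts:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Prior: how common this label is among the training texts.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing so an unseen word does not zero out the label.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training data, far smaller than any real corpus.
training = [
    ("comedy", "love marry jest fool love"),
    ("tragedy", "death blood revenge grave death"),
]
model = train(training)
print(classify("revenge and death", model))
print(classify("a fool in love", model))
```

The classifier never reads the plot. It only asks which genre's word counts make the new text's words most probable, which is how a play that reads as tragic can still be labelled comedic by its vocabulary.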
My immediate understanding of Hamlet, just from reading it, is that it is particularly tragic in its subject matter. Hamlet mopes around for the entire text and quips like a madman with incredible mood swings, while everyone around him schemes against one another, only for everyone to die in the end. This plot, as ridiculous as I have made it seem in my summary, can be read as nothing but tragic. However, from the classification tool that MONK provides, I discovered that Hamlet’s word frequencies were more comedic than tragic. By comparing it to a wide array of different texts, I was able to discover that Hamlet, like other texts such as Othello, is anomalous within the tragic genre of Shakespeare’s texts. The question to be considered here is, would I have reached these conclusions from just a traditional reading of the text? I doubt it.
The emphasis here is not on my lack of ability in close-reading texts, but on the acute abilities of the text-mining strategies of tools such as MONK. From word frequencies, or the quantitative values of Hamlet, I was able to discern a qualitative aspect of it: that it is less tragic than the classic tragedy in Shakespearean texts.
In his article “Treating Texts as Individuals vs. Lumping Them Together,” Ben Schmidt explores and describes the strengths and weaknesses of various methods of analytics, and their use in answering questions in text analysis. He states that the key question in using tools that employ these methods is “how to treat the two corpuses we want to compare. Are they a single long text? Or are they a collection of shorter texts, which have common elements we wish to uncover?” Interested in analyzing hundreds of texts, Schmidt is aware of the imperfections that arise from any division of such a large number of texts. He poses the question, “how far can we ignore traditional limits between texts and create what are, essentially new documents to be analyzed?” At the end of the article, he provides lists of the appropriate uses of Dunning’s log-likelihood, Mann-Whitney, and TF-IDF comparisons in texts.
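To make the “single long text” approach a little more concrete, here is a minimal sketch of Dunning's log-likelihood for one word across two corpora, using the common two-term form of the statistic. It is my own illustration rather than code from Schmidt's article or from MONK; the function name `dunning_g2` and the example counts are hypothetical.

```python
import math

def dunning_g2(count_a, count_b, total_a, total_b):
    """Log-likelihood (G2) for one word's counts in corpus A and corpus B.

    count_a, count_b: occurrences of the word in each corpus.
    total_a, total_b: total number of words in each corpus.
    Each corpus is treated as one long text.
    """
    combined = count_a + count_b
    # Expected counts if the word were used at the same rate in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    # A zero count contributes nothing (0 * ln 0 is taken as 0).
    if count_a > 0:
        g2 += count_a * math.log(count_a / expected_a)
    if count_b > 0:
        g2 += count_b * math.log(count_b / expected_b)
    return 2 * g2

# Hypothetical counts: a word appearing 20 times in one 100-word corpus
# and 5 times in another 100-word corpus.
print(dunning_g2(20, 5, 100, 100))
# The same rate in both corpora gives a score of zero.
print(dunning_g2(10, 10, 100, 100))
```

A larger score means the word's rate differs more between the two corpora than chance would suggest. The score alone does not say which corpus overuses the word; that comes from comparing the two rates directly. Because every text in a corpus is pooled into one set of counts, a single unusual text can drive the result, which is the risk Schmidt's question about lumping texts together points to.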
From working with TF-IDF as well as Dunning’s log-likelihood in MONK, it was interesting to find that I reached the same conclusions that Ben Schmidt reaches in his analysis of the tools. Attempting to use these analytics in MONK to analyze Hamlet alone was a difficult and arduous task, as the text being analyzed was simply too small. Hamlet as an individual text, in comparison to the huge array of texts available in MONK, hardly returned information that could prove useful in a text-mining analysis. As many MONK users have noted, Hamlet on its own was too narrow a data set to yield any meaningful results from a broad, wide-scale analysis method such as MONK. As suggested in Schmidt’s article:
Each tool that uses and provides quantitative data has individual strengths and weaknesses. The valuable lesson to be taken away from Ben Schmidt’s article is that a certain amount of care must be put into using tools such as Dunning’s log-likelihood and TF-IDF comparisons, and that even with that care, sometimes these tools cannot be applied to the line of inquiry being pursued. In short, these tools cannot always be relied on, and should not be the absolute basis of argumentation when it comes to text analysis. The mistrust that all of us share toward the numeric values that can pervade the literary, though extreme at times, is not unfounded. There is value in the qualitative meaning that we gather from traditional readings of texts when the quantitative simply does not make sense.
I have learned that, in a sense, neither the traditional reading nor the digital statistics of texts are completely trustworthy.
With the traditional reading, I concluded, without being entirely correct, that Hamlet was completely a tragedy, and that there was simply no other type of text that it could be.
With the digital statistics, I discovered that, although I was given data, the methods I was attempting to use were very picky about the type of data I was inputting, and could lead me to skewed conclusions if I did not use them with the utmost care. (Which I don’t believe I did all the time.)
However, in both circumstances, I was able to use the digital to correct my traditional reading, and use the traditional reading to double-check my digital findings.
My purpose in writing all of the above is, therefore, to show that there is much value to be gained from both methods of analysis. Each method on its own is, in some sense, incomplete. The Digital Humanities, in all of the tools it offers for a statistical analysis of texts through methods such as word frequencies, has provided not only a valuable but a legitimate method of analyzing literary texts such as Hamlet. Our fear of the numbers in statistics and probabilities, and the automatic assumption that they will not be useful in a literary analysis of a text, though understandable, is misguided. As Hamlet begs of his friends, “Nay then, I have an eye of you. If you love me, hold not off” (2.2.255-257). It is a request that many would make of the digital tools in their own endeavours: that they not hesitate to reveal the value they have uncovered beneath the text. The trick is in recognizing, to begin with, that there is in fact value; it simply must be uncovered and laid in plain view for analysts to use.
However, once it is found, there is a great amount of valuable knowledge to be gained that can be contributed to our analyses as a whole.
The river does indeed represent the continuous winding and progression of life, and the numerical values of its speed, direction, and viscosity tell me that this metaphorical river of life flows at a rapid pace, in one direction decided by destiny, at a speed determined by the hardships and challenges innate to its path. Together, the symbolic qualitative meaning and the numerical quantitative data provide me with a well-rounded, complete analysis of the way of life.
Shakespeare, William. Hamlet. Ed. Ann Thompson and Neil Taylor. London: Arden Shakespeare, 2006. Print.