In each act examined (Acts 1 and 2 of the Klingon Hamlet), each speech event was marked with an sp element carrying attributes that code for speaker and addressee. Each instance of an honorific (neS) or fear (vIp) affix was then marked with an affix element whose type attribute specifies whether it is an honorific or a fear affix. A single element with a type attribute may seem inefficient when only two affixes are tracked, but a future project may wish to code for any of the numerous Klingon verbal affixes, in which case this attribute system will scale much better. Note that only 10-16 tokens of a given affix appeared in each act.

Sample Markup

<sp speaker='Polonius' addressee='multi'>polonyuS pemej<affix type='honorific'>neS</affix>, joH. Satlhob, pemej<affix type='honorific'>neS</affix>, cha'.

SIbI' vIDorchoH. Ha' HIchaw'<affix type='honorific'>neS</affix>, joH.

<stage>mej TLHAW'DIYUS TA', GHERTLHUD TA'BE', toy'wI'pu' je</stage>

<stage>'el HAMLET laDtaHvIS</stage>

DotlhlIj nuq, joHwI'?</sp>

In this sample, each speech is enclosed in an <sp> element with @speaker and @addressee attributes. When more than one character spoke or was addressed, @speaker or @addressee was given the value 'multi'. All affixes were encoded as <affix> elements with a @type attribute whose value is either 'fear' or 'honorific'. Stage directions were encoded as <stage> elements; conveniently, no affix tokens occurred within stage directions.


Once the encoding was complete, the number of instances of each affix was counted, along with how often each appeared with a given pairing of speaker and addressee. It quickly became clear that these counts alone would not provide interesting data, so the number of instances in which an affix appeared with a given addressee, regardless of speaker, was also counted. These counts, along with the totals, were imported into a separate XML file for ease of translation into the visualizations discussed below.
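The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the project's own tooling: the function name count_affixes is hypothetical, and the input is simply the sample markup shown earlier, so the tallies cover that one speech and not the full data set.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The sample markup from this write-up, reproduced verbatim as test input.
SAMPLE = """<sp speaker='Polonius' addressee='multi'>polonyuS pemej<affix type='honorific'>neS</affix>, joH. Satlhob, pemej<affix type='honorific'>neS</affix>, cha'.

SIbI' vIDorchoH. Ha' HIchaw'<affix type='honorific'>neS</affix>, joH.

<stage>mej TLHAW'DIYUS TA', GHERTLHUD TA'BE', toy'wI'pu' je</stage>

<stage>'el HAMLET laDtaHvIS</stage>

DotlhlIj nuq, joHwI'?</sp>"""


def count_affixes(xml_text):
    """Tally affixes by (speaker, addressee, type) and by (addressee, type)."""
    root = ET.fromstring(xml_text)
    by_pair = Counter()
    by_addressee = Counter()
    # Element.iter("sp") also yields the root itself when the root is an <sp>.
    for sp in root.iter("sp"):
        speaker = sp.get("speaker")
        addressee = sp.get("addressee")
        for affix in sp.iter("affix"):
            kind = affix.get("type")
            by_pair[(speaker, addressee, kind)] += 1
            by_addressee[(addressee, kind)] += 1
    return by_pair, by_addressee


by_pair, by_addressee = count_affixes(SAMPLE)
print(by_pair)
print(by_addressee)
```

Keeping two separate counters mirrors the two tallies described above: one keyed on the speaker-addressee pairing and one keyed on the addressee alone.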


Tokens for each type of affix were counted and grouped by the speaker and addressee of their host sp elements. Using XSLT and SVG, comparative bar graphs were created both for speaker-addressee pairings and for addressees alone. The second visualization was undertaken after it was noted that addressee was a considerably better predictor of when an affix might be used than speaker or pairing. The graphs show the percentage of the total number of either fear or honorific affixes that a given character or character pairing accounted for. All visualizations involving fear were colored orange, and those involving honorifics were colored blue.
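The project produced its graphs with XSLT and SVG; the Python sketch below is a substitute that only illustrates the same idea of turning counts into percentage shares and drawing one SVG bar per share. The function names, the layout constants, and the demo counts are all invented for illustration and are not project data.

```python
from xml.sax.saxutils import escape

# Color convention from the write-up: fear is orange, honorific is blue.
COLORS = {"fear": "orange", "honorific": "blue"}


def percentages(counts):
    """Convert raw counts into each label's share of the total, in percent."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: 100.0 * n / total for label, n in counts.items()}


def bar_chart_svg(counts, kind, bar_height=20, gap=5, scale=3):
    """Build a horizontal bar chart as an SVG string (layout is an assumption)."""
    shares = percentages(counts)
    parts = []
    for i, (label, pct) in enumerate(shares.items()):
        y = i * (bar_height + gap)
        parts.append(
            f'<text x="0" y="{y + bar_height - 5}">{escape(label)} ({pct:.0f}%)</text>'
        )
        parts.append(
            f'<rect x="150" y="{y}" width="{pct * scale:.1f}" '
            f'height="{bar_height}" fill="{COLORS[kind]}"/>'
        )
    height = len(shares) * (bar_height + gap)
    width = 150 + 100 * scale
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )


# Invented counts for three placeholder addressees; NOT project data.
demo_counts = {"Addressee A": 6, "Addressee B": 3, "Addressee C": 1}
demo_shares = percentages(demo_counts)
svg = bar_chart_svg(demo_counts, "honorific")
print(svg)
```

Plotting shares in place of raw counts keeps the fear and honorific graphs comparable even though the two affixes occur in different total numbers.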


Direct statistical analysis of a sample this small is not really feasible, or at least will not yield significant results. Because no speaker or addressee ever used or received (respectively) a given affix more than 10 times, the counts are too small for a chi-squared test of significance to be reliable. This is a proof of concept rather than a full research project, but the limitation is still notable. The relative frequency of the affixes is expected to remain the same across acts, so if these methods were applied to data from the entire play, rather than the first third of it, they could potentially yield statistically significant results.
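The small-sample problem can be made concrete by computing the expected cell counts that a chi-squared test of independence relies on. A common rule of thumb asks for every expected count to be at least 5. The sketch below assumes that rule, uses hypothetical function names, and runs on an invented table (rows are addressees, columns are honorific and fear); none of the numbers come from the project.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_rule_of_thumb(table, minimum=5.0):
    """True when every expected count reaches the assumed minimum of 5."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


# Invented counts, NOT project data: three addressees by (honorific, fear).
small_table = [[6, 2], [3, 4], [1, 4]]
small_expected = expected_counts(small_table)
small_ok = meets_rule_of_thumb(small_table)
print(small_expected, small_ok)
```

With cell counts this low, several expected counts fall below the threshold, which is why a larger sample, such as the entire play, would be needed before the test could be trusted.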

This does not stop the results from being interesting; it merely means that the conclusions drawn should be taken with a grain of salt. The conclusions do point to noticeable trends in the data, but they apply only to the first third of this particular translation of this particular play and are by no means universal or necessarily representative of a hypothetical "true" Klingon.