Emotions are a crucial part of compelling narratives: literature tells us about people with goals, desires, passions, and intentions. In the past, the affective dimension of literature was mainly studied in the context of literary hermeneutics. However, with the emergence of the research field known as Digital Humanities (DH), some studies of emotions in a literary context have taken a computational turn. Given the fact that DH is still being formed as a field, this direction of research can be rendered relatively new. In this survey, we offer an overview of the existing body of research on sentiment and emotion analysis as applied to literature. The research under review deals with a variety of topics including tracking dramatic changes of a plot development, network analysis of a literary text, and understanding the emotionality of texts, among other topics.
Emotionen sind ein wichtiger Bestandteil überzeugender Erzählungen, Literatur beschreibt schließlich Menschen und ihre Ziele, Wünsche, Leidenschaften und Absichten. In der Vergangenheit wurde diese affektive Dimension hauptsächlich im Rahmen der literarischen Hermeneutik untersucht. Mit dem Aufkommen des Forschungsfeldes Digital Humanities (DH) wurde jedoch in einigen Studien bezüglich des Aspekts der Emotionen im literarischen Kontext eine Wende hin zu komputationellen Methoden vorgenommen. Diese Forschungsrichtung ist aktuell durch die Prozesse in den DH in einer Neugestaltung. In diesem Artikel berichten wir über den aktuellen Forschungsstand zur Sentiment- und Emotionsanalyse zur Analyse von Literatur. Wir behandeln eine Vielzahl von Themen, wie zum Beispiel die Veränderungen der emotionalen Konnotation im Verlauf eines Texts, der Netzwerkanalyse eines literarischen Textes und dem Verständnis der Emotionalität von Texten.
- 1 Introduction and Motivation
- 1.1 Emotions and Arts
- 2 Affect and Emotion
- 2.1 Ekman’s Theory of Basic Emotions
- 2.2 Plutchik’s Wheel of Emotions
- 2.3 Russel’s Circumplex Model
- 3 Emotion Analysis in Non-computational Literary Studies
- 4 Emotion and Sentiment Analysis in Computational Literary Studies
- 4.1 Emotion Classification
- 4.1.1 Classification based on emotions
- 4.1.2 Classification of happy ending vs. non-happy endings
- 4.2 Genre and Story-type Classification
- 4.2.1 Story-type clustering
- 4.2.2 Genre classification
- 4.3 Temporal Change of Sentiment
- 4.3.1 Topography of emotions
- 4.3.2 Tracking sentiment
- 4.3.3 Sentiment recognition in historical texts
- 4.4 Character Network Analysis and Relationship Extraction
- 4.4.1 Sentiment dynamics between characters
- 4.4.2 Character analysis and character relationships
- 4.5 Other Types of Emotion Analysis
- 4.5.1 Emotion flow analysis and visualization
- 4.5.2 Miscellaneous
- 5 Discussion and Conclusion
- Bibliographic References
- List of Figures with Captions
1 Introduction and Motivation
This article deals with emotion and sentiment analysis in computational literary studies. Following Liu, we define sentiment as a positive or negative feeling underlying the opinion. The term opinion in this sense is close to attitude in psychology and both sentiment analysis and opinion mining are often used interchangeably. Sentiment analysis is an area of computational linguistics that analyzes people’s sentiments and opinions regarding different objects or topics. Though sentiment analysis is primarily text-oriented, there are multimodal approaches as well.
Defining the concept of emotion is a challenging task. As Scherer puts it, defining emotion is a notorious problem. Indeed, different methodological and conceptual approaches to dealing with emotions lead to different definitions. However, the majority of emotion theorists agree that emotions involve a set of expressive, behavioral, physiological, and phenomenological features. In this view, an emotion can be defined as an integrated feeling state involving physiological changes, motor-preparedness, cognitions about action, and inner experiences that emerges from an appraisal of the self or situation.
Similar to sentiment, emotions can be analyzed computationally. However, the goal of emotion analysis is to recognize the emotion, rather than sentiment, which makes it a more difficult task as differences between emotions are subtler than those between positive and negative.
Although sentiment and emotion analysis are different tasks, our review of the literature shows that the use of either term is not always consistent. There are cases where researchers analyze only positive and negative aspects of a text but refer to their analysis as emotion analysis. Likewise, there are cases where researchers look into a set of subjective feelings including emotions but call it sentiment analysis. Hence, to avoid confusion, in this survey, we use the terms emotion analysis and sentiment analysis interchangeably. In most cases, we follow the terminology used by the authors of the papers we discuss (i.e., if they call emotions sentiments, we do the same).
Finally, we talk about sentiment and emotion analysis in the context of computational literary studies. Da defines computational literary studies as the statistical representation of patterns discovered in text mining fitted to currently existing knowledge about literature, literary history, and textual production. Computational literary studies are synonymous to distant reading and digital literary studies, each of which refers to the practice of running a textual analysis on a computer to yield quantitative results. In this survey, we use all of these terms interchangeably and when we refer to digital humanities as a field, we refer to those groups of researchers whose primary objects of study are texts.
1.1 Emotions and Arts
Much of our daily experiences influence and are influenced by the emotions we experience. This experience is not limited to real events. People can feel emotions because they are reading a novel or watching a play or a movie. There is a growing body of literature that pinpoints the importance of emotions for literary comprehension,  as well as research that recognizes the deliberate choices people make with regard to their emotional states when seeking narrative enjoyment such as a book or a film The link between emotions and arts in general is a matter of debate that dates back to the Ancient period, particularly to Plato, who viewed passions and desires as the lowest kind of knowledge and treated poets as undesirable members in his ideal society. In contrast, Aristotle’s view on emotive components of poetry expressed in his Poetics differed from Plato’s in that emotions do have great importance, particularly in the moral life of a person. In the late nineteenth century the emotion theory of arts stepped into the spotlight of philosophers. One of the first accounts on the topic is given by Leo Tolstoy in 1898 in his essay What is Art?. Tolstoy argues that art can express emotions experienced in fictitious context and the degree to which the audience is convinced of them defines the success of the artistic work.
New methods of quantitative research emerged in humanities scholarship bringing forth the so-called digital revolution and the transformation of the field into what we know as digital humanities. The adoption of computational methods of text analysis and data mining from the fields of then fast-growing areas of computational linguistics and artificial intelligence provided humanities scholars with new tools of text analytics and data-driven approaches to theory formulation.
To the best of our knowledge, the first work on a computer-assisted modeling of emotions in literature appeared in 1982. Challenged by the question of why some texts are more interesting than others, Anderson and McMaster concluded that the emotional tone of a story can be responsible for the reader’s interest. The results of their study suggest that a large-scale analysis of the emotional tone of a collection of texts is possible with the help of a computer program. There are two implications of this finding. First, they suggested that by identifying emotional tones of text passages one can model affective patterns of a given text or a collection of texts, which in turn can be used to challenge or test existing literary theories. Second, their approach to affect modeling demonstrates that the stylistic properties of texts can be defined on the basis of their emotional interest and not only their linguistic characteristics. With regard to these implications, this work is an important early piece as it laid out a roadmap for some of the basic applications of sentiment and emotion analysis of texts, namely sentiment and emotion pattern recognition from text and computational text characterization based on sentiment and emotion.
With the development of research methods used by digital humanities researchers, the number of approaches and goals of emotion and sentiment analysis in literature has grown. The goal of this survey is to provide an overview of these recent methods of emotion and sentiment analysis as applied to a text. The survey is directed at researchers looking for an introduction to the existing research in the field of sentiment and emotion analysis of a (primarily, literary) text. The survey does not cover applications of emotion and sentiment analysis in the areas of digital humanities that are not focused on text. Neither does it provide an in-depth overview of all possible applications of emotion analysis in the computational context outside of the DH line of research.
2 Affect and Emotion
The history of emotion research has a long and rich tradition that followed Darwin’s 1872 publication of The Expression of the Emotions in Man and Animals. The subject of emotion theories is vast and diverse. We refer the reader to Maria Gendron’s paper for a brief history of ideas about emotion in psychology. Here, we will focus on three views on emotion that are popular in computational analysis of emotions: Ekman’s theory of basic emotions, Plutchik’s wheel of emotion, and Russel’s circumplex model.
2.1 Ekman’s Theory of Basic Emotions
The basic emotion theory was first articulated by Silvan Tomkins in the early 1960s. Tomkins postulated that each instance of a certain emotion is biologically similar to other instances of the same emotion or shares a common trigger. One of Tomkins’ mentees, Paul Ekman, put in question the existing emotion theories that proclaimed that facial expressions of emotion are socially learned and therefore vary from culture to culture. Ekman, Sorenson and Friesen challenged this view in a field study with the outcome that facial displays of fundamental emotions are not learned but innate. However, there are culture-specific prescriptions about how and in which situations emotions are displayed.
Based on the observation of facial behavior in early development or social interaction, Ekman’s theory also postulates that emotions should be considered discrete categories rather than continuous. Though this view allows for conceiving of emotions as having different intensities, it does not allow emotions to blend and leaves no room for more complex affective states in which individuals report the co-occurrence of like-valenced discrete emotions. This and other theory postulates were widely criticized and disputed in literature.
2.2 Plutchik’s Wheel of Emotions
Another influential model of emotions was proposed by Robert Plutchik in the early 1980s. The important difference between Plutchik’s theory and Ekman’s theory is that apart from a small set of basic emotions, all other emotions are mixed and derived from the various combinations of basic ones. He further categorized these other emotions into the primary dyads (very likely to co-occur), secondary dyads (less likely to co-occur) and tertiary dyads (seldom co-occur).
In order to represent the organization and properties of emotions as defined by his theory, Plutchik proposed a structural model of emotions known nowadays as Plutchik’s wheel of emotions. The wheel Figure 1 is constructed in the fashion of a color wheel, with similar emotions placed closer together and opposite emotions 180 degrees apart. The intensity of an emotion in the wheel depends on how far from the center a part of a petal is, i.e., emotions become less distinguishable the further they are from the center of the wheel. Essentially, the wheel is constructed from eight basic bipolar emotions: joy versus sorrow, anger versus fear, trust versus disgust, and surprise versus anticipation. The blank spaces between the leaves are so-called primary dyads – emotions that are mixtures of two of the primary emotions.
The wheel model of emotions proposed by Plutchik had a great impact on the field of affective computing being primarily used as a basis for emotion categorization in emotion recognition from text. However, some postulates of the theory are criticized, for example, there is no empirical support for the wheel structure. Another criticism is that Plutchik’s model of emotion does not explain the mechanisms by which love, hate, relief, pride, and other everyday emotions emerge from the basic emotions, nor does it provide reliable measurements of these emotions.
2.3 Russel’s Circumplex Model
Attempts to overcome the shortcomings of basic emotions theory and its unfitness for clinical studies led researchers to suggest various dimensional models, the most prominent of which is the circumplex model of affect proposed by James Russel. The word circumplex in the name of the model refers to the fact that emotional episodes do not cluster at the axes but rather at the periphery of a circle Figure 2. At the core of the circumplex model is the notion of two dimensions plotted on a circle along horizontal and vertical axes. These dimensions are valence (how pleasant or unpleasant one feels) and arousal (the degree of calmness or excitement). The number of dimensions is not strictly fixed and there are adaptations of the model that incorporate more dimensions. One example of this is the Valence-Arousal-Dominance model that adds an additional dimension of dominance, the degree of control one feels over the situation that causes an emotion.
By moving from discrete categories to a dimensional representation, the researchers are able to account for subjective experiences that do not fit nicely into the isolated non-overlapping categories. Accordingly, each affective experience can be depicted as a point in a circumplex that is described by only two parameters – valence and arousal – without need for labeling or reference to emotion concepts for which a name might only exist in particular subcommunities or which are difficult to describe. However, the strengths of the model turned out to be its weaknesses: for example, it is not clear whether there are basic dimensions in the model nor is it clear what should be done with qualitatively different events of fear, anger, embarrassment and disgust that fall in identical places in the circumplex structure. Despite these shortcomings, the circumplex model of affect is widely used in psychologic and psycholinguistic studies. In computational linguistics, the circumplex model is applied when the interest is in continuous measurements of valence and arousal rather than in the specific discrete emotional categories.
3 Emotion Analysis in Non-computational Literary Studies
Until the end of the twentieth century, literary and art theories often disregarded the importance of the aesthetic and affective dimension of literature, which in part stemmed from the rejection of old-fashioned literary history that had explained the meaning of art works by the biography of the author. However, the affective turn taken by a wide range of disciplines in the past two decades – from political and sociological sciences to neurosciences or media studies – has refueled the interest of literary critics in human affects and sentiments.
We said in Section 1 that there seems to be a consensus among literary critics that literary art and emotions go hand in hand. However, one might be challenged to define the specific way in which emotions come into play in the text. The exploration of this problem is presented by van Meel. Underpinning the centrality of human destiny, hopes, and feelings in the themes of many artworks – from painting to literature – van Meel explores how emotions are involved in the production of arts. Pointing out big differences between the two media in their attempts to depict human emotions (painting conveys nonverbal behavior directly, but lacks temporal dimensions that novels have and use to describe emotions), van Meel provides an analysis of the nonverbal descriptions used by the writers to convey their characters’ emotional behavior. Description of visual characteristics, van Meel speculates, responds to a fundamental need of a reader to build an image of a person and their behavior. Moreover, nonverbal descriptions add important information that can in some cases play a crucial hermeneutical role, such as in Kafka’s Der Prozess, where the fatal decisions for K. are made clear by gestures rather than words. His verdict is not announced, but is implied by the judge who refuses a handshake. The same applies to his death sentence that is conveyed to him by his executioners playing with a butcher’s knife above his head.
A hermeneutic approach through the lense of emotions is presented by Kuivalainen and provides a detailed analysis of linguistic features that contribute to the characters’ emotional involvement in Mansfield’s prose. The study shows how, through the extensive use of adjectives, adverbs, deictic markers, and orthography, Mansfield steers the reader towards the protagonist’s climax. Subtly shifting between psycho-narration and free indirect discourse, Mansfield is making use of evaluative and emotive descriptors in psycho-narrative sections, often marking the internal discourse with dashes, exclamation marks, intensifiers, and repetition that thus trigger an emotional climax. Various deictic features introduced in the text are used to pinpoint the source of emotions, which helps in creating a picture of characters’ emotional world. Verbs (especially in the present tense), adjectives, and adverbs serve the same goal in Mansfield’s prose of describing the characters’ emotional world. Going back and forth from psycho-narration to free indirect discourse provides Mansfield with a tool to point out the significant moments in the protagonists’ lives and establish a separation between characters and narration.
Both van Meel’s and Kuivalainen’s works, separated from each other by more than a decade, underpin the importance of emotions in the interpretation of characters’ traits, hopes, and tragedy. Other authors find these connections as well. For example, Barton proposes instructional approaches to teach school-level readers to interpret character’s emotions and use this information for story interpretation. Van Horn shows that understanding characters emotionally or trying to help them with their problems made reading and writing more meaningful for middle school students.
Emotions in text are often conveyed with emotion-bearing words. At the same time their role in the creation and depiction of emotion should not be overestimated. That is, saying that someone looked angry or fearful or sad, as well as directly expressing characters’ emotions, are not the only ways authors build believable fictional spaces filled with characters, action, and emotions. In fact, many novelists strive to express emotions indirectly by way of figures of speech or catachresis, first of all because emotional language can be ambiguous and vague, and, second, to avoid any allusions to Victorian emotionalism and pathos.
How can an author convey emotions indirectly? A book chapter by Hillis Miller in Exploring Text and Emotions seeks the answer to exactly this question. Using Conrad’s Nostromo opening scenes as material, Hillis Miller shows how Conrad’s descriptions of an imaginary space generate emotions in readers without direct communication of emotions. Conrad’s Nostromo opening chapter is an objective description of Sulaco, an imaginary land. The description is mainly topographical and includes occasional architectural metaphors, but it combines wide expanse with hermetically sealed enclosure, which generates depthless emotional detachment. Through the use of present tense, Conrad makes the readers suggest that the whole scene is timeless and does not change. The topographical descriptions are given in a pure materialist way: there is nothing behind clouds, mountains, rocks, and sea that would matter to humankind, not a single feature of the landscape is personified, and not a single topographical shape is symbolic. Knowingly or unknowingly, Miller argues, by telling the readers what they should see – with no deviations from truth – Conrad employs a trope that perfectly matches Kant’s concept of the sublime. Kant’s view of poetry was that true poets tell the truth without interpretation; they do not deviate from what their eyes see. Conrad, or to be more specific, his narrator in Nostromo, is an example of sublime seeing with a latent presence of strong emotions. On the one hand, Conrad’s descriptions are cool and detached. This coolness is caused by the indifference of the elements in the scene. On the other hand, by dehumanizing sea and sky, Conrad generates awe, fear, and a dark foreboding about the kinds of life stories that are likely to be enacted against such a backdrop.
Hillis Miller’s analysis resonates with some premises from emotion theory that we have discussed previously, namely, Plutchik’s belief that emotions should be studied not by a certain way of expression but by the overall behavior of a person. Considering that such a formula cannot be applied to all literary theory studies about emotions (as not all authors choose to convey emotions indirectly, as well as not all authors tend to comment on characters’ nonverbal emotional behavior), it seems that one should search for a balance between low-level linguistic feature analysis of emotional language and a rigorous high-level hermeneutic inquiry dissecting the form of the novel and its under-covered philosophical layers.
4 Emotion and Sentiment Analysis in Computational Literary Studies
With this section, we proceed to an overview of the existing body of research on computational analysis of emotion and sentiment in computational literary studies. An overview of the papers including their properties is shown in Table 1. The table, as well as this section, is divided into several subsections, each of which corresponds to a specific application of emotion and sentiment analysis to literature. Section 4.1 reviews the papers that deal with the classification of literary texts in terms of emotions they convey; Section 4.2 examines the papers that address text classification by genre or other story-types based on sentiment and emotion features; Section 4.3 is dedicated to research in modeling sentiments and emotions in texts from previous centuries, as well as research dealing with applications of sentiment analysis to texts written in the past; Section 4.4 provides an overview of sentiment analysis applications to character analysis and character network construction, and Section 4.5 is dedicated to more general applications of sentiment and emotion analysis to literature.
4.1 Emotion Classification
A straightforward approach to sentiment and emotion analysis is phrasing them as a text classification. A fundamental question of such a classification is how to find the best features and algorithms to classify the data (sentences, paragraphs, entire documents) into predefined classes. When applied to literature, such a classification may be of use for grouping different literary texts in digital collections based on the emotional properties of the stories. For example, books or poems can be grouped based on the emotions they convey or based on whether or not they have happy endings or not.
4.1.1 Classification based on emotions
Barros et al. aim at answering two research questions: 1) is the classification of Quevedo’s works proposed by the literary scholars consistent with the sentiment reflected by the corresponding poems?; and 2) which learning algorithms are the best for the classification? To that end, they perform a set of experiments on the classification of 185 Francisco de Quevedo’s poems that are divided by literary scholars into four categories and that Barros et al. map to emotions of joy, anger, fear, and sadness. Using the terms joy, anger, fear, and sadness as points of reference, Barros et al. construct a list of emotion words by looking up the synonyms of English emotion words and adjectives associated with these four emotions and translating them into Spanish. Each poem is converted into a vector where each item is a normalized count of words relating to a certain emotion. The experiments with different algorithms show the superiority of decision trees achieving accuracy of almost 60%. However, this result is biased by an unbalanced distribution of classes. To avoid the bias, Barros et al. apply a resampling strategy that leads to a more balanced distribution and repeat the classification experiments. After resampling, the accuracy of decision trees in a 10-fold cross validation achieves 75,13%, thus demonstrating an improvement over the previous classification performance. Based on these results the authors conclude that a meaningful classification of the literary pieces based only on the emotion information is possible.
Reed offers a proof-of-concept for performing sentiment analysis on a corpus of twentieth-century American poetry. Specifically, Reed analyzes the expression of emotions in the poetry of the Black Arts Movement of the 1960s and 1970s. The paper describes the project Measured Unrest in the Poetry of the Black Arts Movement whose goal is to understand 1) how the feelings associated with injustice are coded in terms of race and gender, and 2) what sentiment analysis can show us about the relations between affect and gender in poetry. Reed notes that surface affective value of the words does not always align with their more nuanced affective meaning shaped by poetic, social, and political contexts.
Yu explores what linguistic patterns characterize the genre of sentimentalism in early American novels. To that end, they construct a collection of five novels from the mid-nineteenth century and annotate the emotionality of each of the chapters as high or low. The respective chapters are then classified using support-vector machines and naïve Bayes classifiers as highly emotional or the opposite. The results of the evaluation suggest that arbitrary feature reduction steps such as stemming and stopword removal should be taken very carefully, as they may affect the prediction. For example, Yu shows that no stemming leads to better classification results. A possible explanation is that stemming conflates and neutralizes a large number of discriminative features. The author provides an example of such a conflation with the words wilderness and wild. While the latter can appear anywhere in the text, the former one is primarily encountered in the chapters filled with emotions.
4.1.2 Classification of happy ending vs. non-happy endings
Zehe et al. argue that automatically recognizing a happy ending as a major plot element could help to better understand a plot structure as a whole. To show that this is possible, they classify 212 German novels written between 1750 and 1920 as having happy or non-happy endings. A novel is considered to have a happy ending if the situation of the main characters in the novel improves towards the end or is constantly favorable. The novels were manually annotated with this information by domain experts. For feature extraction, the authors first split each novel into n segments of the same length. They then calculate sentiment values for each of the segments by counting the occurrences of words that appear in the respective segment and that are found in the German version of the NRC Word-Emotion Association Lexicon and divide this number by the length of the dictionary. Finally, they calculate the sentiment score for the sections by taking the average of all sentiment scores in the segments that are part of the section. These steps are then followed by classification with a support-vector machine and the F1 score of 0.73, which the authors consider a good starting point for future work.
4.2 Genre and Story-type Classification
The papers we have discussed so far focus on understanding the emotion associated with units of texts. This extracted information can further be used for downstream tasks and also for downstream evaluations. We discuss the following downstream classification cases here. The papers in this category use sentiment and emotion features for a higher-level classification, namely story-type clustering and literary genre classification. The assumption behind these works is that different types of literary text may show different composition and distribution of emotion vocabulary and thus can be classified based on this information. The hypothesis that different literary genres convey different emotions stems from common knowledge: we know that horror stories instill fear and that mysteries evoke anticipation and anger while romances are filled with joy and love. However as we will see in this section, the task of automatic classification of these genres is not always that straightforward and reliable.
4.2.1 Story-type clustering
Similarly to Zehe et al., Reagan et al. are interested in automatically understanding a plot structure as a whole, not limited to a book ending. The inspiration for their work comes from Kurt Vonnegut’s lecture on emotional arcs of stories. Reagan et al. test the idea that the plot of each story can be plotted as an emotional arc, i.e. a time series graph, where the x-axis represents a time point in a story, and the y-axis represents the events happening to the main characters that can be favorable (peaks on a graph) or unfavorable (troughs on a graph). As Vonnegut puts it, the stories can be grouped by these arcs and the number of such groupings is limited. To test this idea, Reagan et al. collect the 1,327 most popular books from the Project Gutenberg. Each book is then split into segments for which sentiment scores (happy vs. sad) are calculated and compared. The results of the analysis show support for six emotional patterns that are shared between subgroupings of the corpus:
- Rise: the arc starts at a low point and steadily increases towards the end;
- Fall: the arc starts at a high point and steadily decreases towards the end;
- Fall-rise: the arc drops in the middle of the story but increases towards the end;
- Rise-fall: the arc hits the high point in the middle of the story and decreases towards the end;
- Rise-fall-rise: the arc fluctuates between high and low points but ends with an increase;
- Fall-rise-fall: the arc fluctuates between high and low points but ends with a decrease.
Additionally, Reagan et al. find that Icarus, Oedipus, and Man in the hole arcs are the three most popular emotional arcs among readers, based on download counts.
4.2.2 Genre classification
There are other studies that are similar in spirit to the work done by Reagan. Samothrakis and Fasli examine the hypothesis that different genres clearly have different emotion patterns to reliably classify them with machine learning. To that end, they collect works of the genres mystery, humor, fantasy, horror, science fiction and western from the Project Gutenberg.
Using WordNet-Affect to detect emotion words as categorized by Ekman’s fundamental emotion classes, they calculate an emotion score for each sentence in the text. Each work is then transformed into six vectors, one for each basic emotion. A random forest classifier achieves a classification accuracy of 0.52. This is significantly higher than a random baseline, which allows the authors to conclude that such a classification is feasible.
A study by Kim et al. originates from the same premise as the work by Samothrakis and Fasli but puts emphasis on finding genre-specific correlations of emotion developments. Extending the set of tracked emotions to Plutchik’s classification, Kim et al. collect 2,000 books from the Project Gutenberg that belong to five genres found in the Brown corpus, namely adventure, science fiction, mystery, humor and romance. The authors extend the set of classification algorithms beyond random forests using a multi-layer perceptron and convolutional neural networks, which achieves the best performance (0.59 F1-score). To understand how uniform the emotion patterns in different genres are, the authors introduce the notion of prototypicality, which is computed as average of all emotion scores. Using this as a point of reference for each genre Kim et al. use Spearman correlation to calculate the uniformity of emotions per genre. The results of this analysis suggest that fear and anger are the most salient plot devices in fiction, while joy is only of mediocre stability, which is in line with findings of Samothrakis and Fasli.
The study by Henny-Krahmer pursues two goals: 1), to test whether different subgenres of Spanish American literature differ in degree and kind of emotionality, and 2), whether emotions in the novels are expressed in direct speech of characters or in narrated text. To that end, they conduct a subgenre classification experiment on a corpus of Spanish American novels using sentiment values as features. To answer the first question, each novel is split into five segments and for each sentence in the segment the emotion score (polarity values + Plutchik’s basic emotions) is calculated using SentiWordNet and NRC dictionaries. The classifier achieves an average F1 of 0.52, which is higher than the most-frequent class baseline and, hence, provides a support for emotion-based features in subgenre classification. The analysis of feature importance shows that the most salient features come from the sentiment scores calculated from the characters’ direct speech and that novels with higher values of positive speech are more likely to be sentimental novels.
There are some limitations to the studies presented in this section. On the one hand, it is questionable how reliable coarse emotion scoring is that takes into account only presence or absence of words found in specialized dictionaries and overlooks negations and modifiers that can either negate an emotion word or increase/decrease its intensity. On the other hand, a limited view of the emotional content as a sum of emotion bearing words reserves no room for qualitative interpretation of the texts – it is not clear how one can distinguish between emotion words used by the author to express their sentiment, between words used to describe characters’ feelings, and emotion words that characters use to address or describe other characters in a story.
4.3 Temporal Change of Sentiment
The papers that we have reviewed so far approach the problem of sentiment and emotion analysis as a classification task. However, applications of sentiment analysis are not only limited to classification. In other fields, for example computational social sciences, sentiment analysis can be used for analyzing political preferences of the electorate or for mining opinions about different products or topics. Similarly, several digital humanities studies incorporate sentiment analysis methods in a task of mining sentiments and emotions of people who lived in the past. The goal of these studies is not only to recognize sentiments, but also to understand how they were formed.
4.3.1 Topography of emotions
Heuser et al. start with a premise that emotions occur at a specific moment in time and space, thus making it possible to link emotions to specific geographical locations. Consequently, having such information at hand, one can understand which emotions are hidden behind certain landmarks. As a proof-of-concept, Heuser et al. build an interactive map, Mapping emotions in Victorian London, where each location is tagged with emotion labels. To construct a corpus for their analysis, Heuser et al. collect a large corpus of English books from the eighteenth and nineteenth century and extract 383 geographical locations of London that have at least ten mentions each. The resulting corpus includes 15,000 passages, each of which has a toponym in the middle and 100 words directly preceding and following the location mention. The data is then given to annotators who are asked to define whether each of the passages expressed happiness or fear, or neutrality. The same data is also analyzed by a custom sentiment analysis program that would assign each passage one of these emotion categories.
Some striking observations are made with regard to the data analysis. First, there is a clear discrepancy between fiction and reality – while toponyms from the West End with Westminster and the City are over-represented in the books, the same does not hold true for the East End with Tower Hamlets, Southwark, and Hackney. Hence, there is less information about emotions pertaining to these particular London locations. Another striking detail is that the resulting map is dominated by the neutral emotion. Heuser et al. argue that this has nothing to do with the absence of emotions but rather stems from the fact that emotions tend to be silenced in public domain, which influenced the annotators decision.
The space and time context are also used by Bruggman and Fabrikant who model sentiments of Swiss historians towards places in Switzerland in different historical periods. As the authors note, it is unlikely that a historian will directly express attitudes towards certain toponyms, but it is very likely that words they use to describe those can bear some negative connotation (e.g. cholera, death). Correspondingly, such places should be identified as bearing negative sentiment by a sentiment analysis tool. Additionally, they study the changes of sentiment towards a particular place over time. Using the General Inquirer (GI) lexicon to identify positive and negative terms in the document, they assign each document a sentiment score by summing up the weights of negative and positive words and normalizing them by the document length. The authors conclude that the results of their analysis look promising, especially regarding negatively scored articles. However, the authors find difficulties in interpreting positively ranked documents, which may be due to the fact that negative information is more salient.
4.3.2 Tracking sentiment
Other papers in this category link sentiment and emotion to certain groups, rather than geographical locations. The goal of these studies is to understand how sentiment within and towards these groups was formed.
Taboada et al. aim at tracking the literary reputation of six authors writing in the first half of the twentieth century. The research questions raised in the project are how the reputation is made or lost, and how to find correlation between what is written about the author and their work to the author’s reputation and subsequent canonicity. To that end, the project’s goal is to examine critical reviews of six authors’ writing and to map information contained in texts critical to the author’s reputation. The material they work with includes not only reviews, but also press notes, press articles, and letters to editors (including from the authors themselves). For the pilot project with Galsworthy and Lawrence they collected and scanned 330 documents (480,000 words). The documents are tagged for the parts of speech and relevant words (positive and negative) are extracted using custom-made sentiment dictionaries. The sentiment orientation of rhetorically important parts of the texts is then measured.
Chen et al. aim to understand personal narratives of Korean comfort women who had been forced into sexual slavery by Japanese military during World War II. Adapting the WordNet-Affect lexicon, Chen et al. build their own emotion dictionary to spot emotional keywords in women’s stories and map the sentences to emotion categories. By adding variables of time and space, Chen et al. provide a unified framework of collective remembering of this historical event as witnessed by the victims.
Finally, an interesting project to follow is the Oceanic Exchanges project that started in late 2017. One goal of the project is to trace information exchange in nineteenth-century newspapers and journals using sentiment as one of the variables under analysis.
4.3.3 Sentiment recognition in historical texts
Other papers put emphasis not so much on the sentiments expressed by writers but instead focus on the particularities of historical language.
Marchetti et al. and Sprugnoli et al.  present the integration of sentiment analysis in the ALCIDE (Analysis of Language and Content In a Digital Environment) project. The sentiment analysis module is based on WordNet-Affect, SentiWordNet and MultiWordNet. Each document is assigned a polarity score by summing up the words with prior polarity and dividing by the number of words in the document. A positive global score leads to a positive document polarity and a negative global score leads to a negative document polarity. The overall conclusion of their work is that the assignment of a polarity in the historical domain is a challenging task largely due to lack of agreement on polarity of historical sources between human annotators.
Challenged by the problem of applicability of existing emotion lexicons to historical texts, Buechel et al. propose a new method of constructing affective lexicons that would adapt well to German texts written up to three centuries ago. In their study, Buechel et al. use the representation of affect based on the Valence-Arousal-Dominance model (an adaptation of Russel’s circumplex model, see Section 2.3). Presumably, such a representation provides a finer-grained insight into the literary text, which is more expressive than discrete categories, as it quantifies the emotion along three different dimensions. As a basis for the analysis, they collect German texts from the Deutsches Textarchiv written between 1690 and 1899. The corpus is split into seven slices, each spanning 30 years. For each slice they compute word similarities and obtain seven distinct emotion lexicons, each corresponding to specific time period. This allows for, the authors argue, the tracing of the shift in emotion association of words over time.
Finally, Leemans et al. aim to trace historical changes in emotion expressions and to develop methods to trace these changes in a corpus of 29 Dutch language theatre plays written between 1600 and 1800. Expanding the Dutch version of Linguistic Inquiry and Word Count (LIWC) dictionary with historical terms, the authors are able to increase the recall of emotion recognition with a dictionary. In addition, they develop a fine-grained vocabulary mapping body terms to emotions, and show that a combination of LIWC and their lexicon lead to improvement in the emotion recognition.
4.4 Character Network Analysis and Relationship Extraction
The papers reviewed above address sentiment analysis of literary texts mainly on a document level. This abstraction is warranted if the goal is to get an insight into the distribution of emotions in a corpus of books. However, emotions depicted in books do not exist in isolation but are associated with characters who are at the core of any literary narrative. This leads us to ask what sentiment and emotion analysis can tell us about the characters. How emotional are they? And what role do emotions play in their interaction?
Character relationships have been analyzed in computational linguistics from a graph theoretic perspective, particularly using social network analysis. Fewer works, however, address the problem of modeling character relationships in terms of sentiment. Below we provide an overview of several papers that propose the methodology for extracting this information.
4.4.1 Sentiment dynamics between characters
Several studies present automatic methods for analyzing sentiment dynamics between plays’ characters. The goal of the study by Nalisnick and Baird is to track the emotional trajectories of interpersonal relationships. The structured format of a dialog allows them to identify who is speaking to whom, which makes it possible to mine character-to-character sentiment by summing the valence values of words that appear in the continuous direct speech and are found in the lexicon of affective norms. The extension of the previous research from the same authors introduces the concept of a sentiment network, a dynamic social network of characters. Changing polarities between characters are modeled as edge weights in the network. Motivated by the desire to explain such networks in terms of a general sociological model, Nalisnick and Baird test whether Shakespeare’s plays obey the Structural Balance Theory by Marvel et al. that postulates that a friend of a friend is also your friend. Using the procedure proposed by Marvel et al. on their Shakespearean sentiment networks, Nalisnick and Baird test whether they can predict how a play’s characters will split into factions using only information about the state of the sentiment network after Act II. The results of their analysis are varied and do not provide adequate support for the Structural Balance Theory as a benchmark for network analysis in Shakespeare’s plays. One reason for that, as the authors state, is inadequacy of their shallow sentiment analysis methods that cannot detect such elements of speech as irony and deceit that play a pivotal role in many literary works.
4.4.2 Character analysis and character relationships
Elsner aims at answering the question of how to represent a plot structure for summarization and generation tools. To that end, Elsner presents a kernel for comparing novelistic plots at the level of character interactions and their relationships. Using sentiment as one of the characteristics of a character, Elsner demonstrates that the kernel approach leads to meaningful plot representation that can be used for a higher-level processing.
Kim and Klinger aim at understanding the causes of emotions experienced by literary characters. To that end, they contribute the REMAN corpus of literary texts with annotations of emotions, experiencers, causes and targets of the emotions. The goal of the project is to enable the automatic extraction of emotions and causes of emotions experienced by the characters. The authors suggest that the results of coarse-grained emotion classification in literary text are not readily interpretable as they do not tell much about who the experiencer of the emotion is. Indeed, if a text mentions two characters, one of whom is angry and another one who is scared because of that, text classification models will only tell us that the text is about anger and fear. Hence, a finer-grained approach towards character relationship extraction is warranted. Kim and Klinger conduct experiments on the annotated dataset showing that the fine-grained approach to emotion prediction with long short-term memory networks outperforms bag-of-words models (an increase in F1 by 12 pp). At the same time, the results of their experiments suggest that joint prediction of emotions and experiencers can be more beneficial than studying these categories separately.
Barth et al. develop the character relation analysis tool rCAT with the goal of visualization and analysis of character networks in a literary text. The tool implements a distance parameter (based on token space) for finding pairs of interacting characters. In addition to the general context words that characterize each pair of characters, the tool provides an emotion filter to restrict character relationship analysis to emotions only.
A tool presented by Jhavar and Mirza provides a similar functionality: given an input of two character names from the Harry Potter series, the EMoFiel tool identifies the emotion flow between a given directed pair of story characters. These emotions are identified using categorical and continuous emotion models.
Egloff et al. present an ongoing work on the Ontology of Literary Characters (OLC) that allows us to capture and infer characters’ psychological traits from their linguistic descriptions. The OLC incorporates the Ontology of Emotion that is based on both Plutchik’s and Hourglass’s models of emotions. The ontology encodes 32 emotion concepts. Based on their natural language description, characters are attributed to a psychological profile along the classes of Openness to experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. The ontology links each of these profiles to one or more archetypal categories of hero, anti-hero, and villain. Egloff et al. argue that, by using the semantic connections of the OLC, it is possible to infer the characters’ psychological profiles and the role they play in the plot.
Kim and Klinger propose a new task of emotion relationship classification between fictional characters. They argue that joining character network analysis with sentiment and emotion analysis may contribute to a computational understanding of narrative structures, as characters are at the center of any plot development. Building a corpus of 19 fan fiction short stories and annotating it with emotions, Kim and Klinger propose several models to classify emotion relations of characters. They show that a deep learning architecture with character position indicators is the best for the task of predicting both directed and undirected emotion relations in the associated social network graph. As an extension to this study, Kim and Klinger explore how emotions are expressed between characters in the same corpus via various non-verbal communication channels. They find that facial expressions are predominantly associated with joy while gestures and body postures are more likely to occur with trust.
Finally, a small body of work focuses on mathematical modeling of character relationships. Rinaldi et al. contribute a model that describes the love story between the Beauty and the Beast through ordinary differential equations. Zhuravlev et al. introduce a distance function to model the relationship between the protagonist and other characters in two masochistic short novels by Ivan Turgenev and Sacher-Masoch. Borrowing some instruments from the literary criticism and using ordinary differential equations, Zhuravlev et al. are able to reproduce the temporal and spatial dynamics of the love plot in the two novellas more precisely than it had been done in previous research. Jafari et al. present a dynamic model describing the development of character relationships based on differential equations. The proposed model is enriched with complex variables that can represent complex emotions such as coexisting love and hate.
4.5 Other Types of Emotion Analysis
We have seen that sentiment analysis as applied to literature can be used for a number of downstream tasks, such as classification of texts based on the emotions they convey, genre classification based on emotions, and sentiment analysis in the historical domain. However, the application of sentiment analysis is not limited to these tasks. In this concluding part of the survey, we review some papers that do not formulate their approach to sentiment analysis as a downstream task. Often, the goal of these works is to understand how sentiments and emotions are represented in literary texts in general, and how sentiment or emotion content varies across specific documents or a collection of them with time, where time can be either relative to the text in question (from beginning to end) or to the historical changes in language (from past to present). Such information is valuable for gaining a deeper insight into how sentiments and emotions change over time, allowing us to bring forward new theories or shed more light onto existing literary or sociological theories.
4.5.1 Emotion flow analysis and visualization
A set of authors aimed to visualize the change of emotion content through texts or across time. One of the earliest works in this direction is a paper by Anderson and McMaster that starts from the premise that reading enjoyment stems from the affective tones of a text. These affective tones create a conflict that can rise to a climax through a series of crises, which is necessary for a work of fiction to be attractive to the reader. Using a list of 1,000 of the most common English words annotated with valence, arousal, and dominance ratings, they calculate the conflict score by taking the mean of the ratings for each word in a text passage. The more negative the score is, the higher the conflict is, and vice versa. Additionally, they plot conflict scores for each consecutive 100 words of a test story and provide qualitative analysis of the peaks. They argue that a reader who has access to the text would be able to find correlation between events in the story and peaks on the graph. However, the authors still stress that such interpretation remains dependent upon the judgement of the reader. Further, other contributions by the authors are based on the same premises.
Alm and Sproat present the results of the emotion annotation task of 22 tales by the Grimm brothers and evaluate patterns of emotional story development. They split emotions into positive and negative categories and divide each story into five parts from which aggregate frequency counts of combined emotion categories are computed. The resulting numbers are plotted on a graph that shows a wave-shaped pattern. From this graph, Alm and Sproat argue, one can see that the first part of the fairy tales is the least emotional, which is probably due to scene setting, while the last part shows an increase in positive emotions, which may signify the happy ending.
Two other studies by Mohammad focus on differences in emotion word density as well as emotional trajectories between books of different genres. Emotion word density is defined as a number of times a reader will encounter an emotion word on reading every X words. In addition, each text is assigned several emotion scores for each emotion that are calculated as a ratio of words associated with one emotion to the total number of emotion words occurring in a text. Both metrics use the NRC Affective Lexicon to find occurrences of emotion words. They find that fairy tales have significantly higher anticipation, disgust, joy and surprise word densities, but lower trust word densities when compared to novels.
A work by Klinger et al. is a case study in an automatic emotion analysis of Kafka’s Amerika and Das Schloss. The goal of the work is to analyze the development of emotions in both texts as well as to provide a character-oriented emotion analysis that would reveal specific character traits in both texts. To that end, Klinger et al. develop German dictionaries of words associated with Ekman’s fundamental emotions plus contempt and apply them to both texts in question to automatically detect emotion words. The results of their analysis for Das Schloss show a striking increase of surprise towards the end and a peak of fear shortly after start of chapter 3. In the case of Amerika, the analysis shows that there is a decrease in enjoyment after a peak in chapter 4.
Yet another work that tracks the flow of emotions in a collection of texts is presented by Kim et al. The authors hypothesize that literary genres can be linked to the development of emotions over the course of text. To test this, they collect more than 2,000 books from five genres (adventure, science fiction, mystery, humor and romance) from Project Gutenberg and identify prototypical emotion shapes for each genre. Each novel in the corpus is split into five consecutive equally-sized segments (following the five-act theory of dramatic acts). All five genres show close correspondence with regard to sadness, anger, fear and disgust, i.e., a consistent increase of these emotions from Act 1 to Act 5, which may correspond to an entertaining narrative. Mystery and science fiction books show increase in anger towards the end, and joy shows an inverse decreasing pattern from Act 1 to Act 2, with the exception of humor.
The work by Kakkonen and Galic Kakkonen aims at supporting the literary analysis of Gothic texts at the sentiment level. The authors introduce a system called SentiProfiler that generates visual representations of affective content in such texts and outlines similarities and differences between them, however, without considering the temporal dimension. The SentiProfiler uses WordNet-Affect to derive a list of emotion-bearing words that will be used for analysis. The resulting sentiment profiles for the books are used to visualize the presence of sentiment in a particular document and to compare two different texts.
In this section, we review studies that are different in goals and research questions from the papers presented in previous sections and do not constitute a category on their own.
Koolen claims that there is a bias among readers that put works by female authors on par with »women’s books«, which, as stated by the author, tend to be perceived as of lower literary quality. She investigates how much »women’s books« (here, romantic novels written by women) differ from novels perceived as literary (female and male-authored literary fiction). The corpus used in the study is a collection of European and North-American novels translated into Dutch. Koolen uses a Dutch version of the Linguistic Inquiry and Word Count, a dictionary that contains content and sentiment-related categories of words to count the number of words from different categories in each type of fiction. Her analysis shows that romantic novels contain more positive emotions and words pertaining to friendship than in literary fiction. However, female-authored literary novels and male-authored ones do not significantly differ on any category.
Kraicer and Piper explore the women’s place within contemporary fiction starting from the premise that there is a near ubiquitous underrepresentation and decentralization of women. As a part of their analysis, Kraicer and Piper use sentiment scores to look at social balance and »antagonism«, i.e., how different gender pairings influence positive and negative language surrounding the co-occurrence of characters (using the sentiment dictionary presented by Liu to calculate a sentiment score for a character pair). Having analyzed a set of 26,450 characters from 1,333 novels published between 2001 and 2015, the authors find that sentiment scores give little indication that the character’s gender has an effect on the state of social balance.
Morin and Acerbi focus on larger-scale data spanning a hundred thousand of books. The goal of their study is to understand how emotionality of written texts changed throughout the centuries. Having collected 307,527 books written between 1900 and 2000 from the Google Books corpus they collect, for each year, the total number of case-insensitive occurrences of emotion terms that are found under positive and negative taxonomies of LIWC dictionary. The main findings of their research show that emotionality (both positive and negative emotions) declines with time, and this decline is driven by the decrease in usage of positive vocabulary. Morin and Acerbi remind us that the Romantic period was dominated by emotionality in writing, which could be the effect of a group of writers who wrote above the mean. If one assumes that each new writer tends to copy the emotional style of their predecessors, then writers at one point of time are disproportionally influenced by this group of above-the-mean writers. However, this trend does not last forever and, sooner or later, the trend reverts to the mean, as each writer reverts to a normal level of emotionality.
An earlier work written in collaboration with Acerbi provides a somewhat different approach and interpretation of the problem of the decline in positive vocabulary in English books of the twentieth century. Using the same dataset and lexical resources (plus WordNet-Affect) Bentley et al. find a strong correlation between expressed negative emotions and the U.S. economic misery index, which is especially strong for the books written during and after the World War I (1918), the Great Depression (1935), and the energy crisis (1975). However, in the present study, the authors argue that the extent to which positive emotionality correlates with subjective well-being is a debatable issue. Morin and Acerbi provide more possible reasons for this effect as well as detailed statistical analysis of the data, so we refer the reader to the original paper for more information.
5 Discussion and Conclusion
We have shown throughout this survey that there is a growing interest in sentiment and emotion analysis within digital humanities. Given the fact that DH have emerged into a thriving science within the past decade, it may safely be said that this direction of research is relatively new. At the same time, the research in sentiment analysis started in computational linguistic more than two decades ago and is nowadays an established field that has dedicated workshops and tracks in the main computational linguistics conferences. Moreover, a recent meta-study by Mäntylä et al. shows that the number of papers in sentiment analysis is rapidly increasing each year. Indeed, the topic has not yet outrun itself and we should not expect to see it vanishing within the next decade or two, provided that no significant paradigm shift in the computational sciences takes place. One may wonder whether the same applies to sentiment analysis in digital humanities scholarship. Will the interest in the topic grow continuously or will it rally to the peak and vanish in a few years?
There is no decisive answer. The popularity of sentiment analysis may have reached a peak but is far from fading. Application-wise, not a lot has changed during the past years: researchers are still interested in predicting sentiment and emotion from text for different purposes. If anything has changed, it is methodology. Early research in sentiment analysis relied on word polarity and specific dictionaries. Modern state-of-the-art approaches rely on word embeddings and deep learning architectures. Having started with simple polarity detection, contemporary sentiment analysis has advanced to a more nuanced analysis of sentiments and emotions.
The situation is somewhat different in digital humanities research. Most of the works rely on affective lexicons and word counts, a technique for detecting emotions in literary text first used by Anderson and McMaster in 1982. Even the most recent works base the interpretation of the results on the use of dictionaries and counts of emotion-bearing words in a text, passage, or sentence. In fact, around 70% of the papers we discussed in Section 4 substantially rely on the use of various lexical resources for detecting emotions (see Table 1 for a summary of methods used in the reviewed papers). We have discussed some limitations of this approach in Section 4.2. Let us reiterate its weakness with the following small example. Consider the sentence ›Jack was afraid of John because John held a knife in his hand‹. Assuming a dictionary of emotion-bearing words is used, the sentence can be categorized as expressing fear, because of the two strong fear markers, afraid and knife. Indeed, the sentence does express fear. But does it do it equally for Jack and John? The answer is no: Jack is the one who is afraid and John holding a knife is the reason for Jack being afraid. Let us assume that a researcher is interested in the emotion analysis of a book that contains thousands of sentences expressing emotions in different ways: some sentences describe characters who feel emotions just as in the sentence above, some are narrator’s digressions filled with emotions, some contain emotion-bearing words (knife, baby) but do not in fact express the same emotion in any given context. No doubt, a dictionary and count-based approach will be helpful in understanding the distribution of the emotion lexicon throughout the story. But is it enough for the interpretation? Can hermeneutics, in its traditional form, make use of such knowledge? Barely. In fact, some of the works that we reviewed pinpoint that the surface affective value of the words does not always align with their more nuanced affective meaning and that sentiment analysis tools make mistakes when classifying a text as emotional or not. If so, how reliable is the interpretation? In other words, what kind of interpretation should we expect from the sentiment and emotion analysis research in the DH community?
We do not have a ready answer to that question. At the one extreme, there is traditional hermeneutics, the examples of which are presented in a Section 3. At the other extreme, there is interpretation in the form of ›Author A writes with more emotion than author B because the numbers say so‹. We do, however, suggest that a balance should be made somewhere between these two extremes. Even as simple as it is, the approach of detecting sentiment and emotion-related words can be used to deliver a high-quality interpretation such as in Heuser et al. or Morin and Acerbi. However, we note again that there are still limits posed by the simplicity of this approach.
This leads us to an outline of the reality of sentiment analysis research in digital humanities: the methods of sentiment analysis used by some of the DH scholars nowadays have gone or are almost extinct among computational linguists. This in turn affects the quality of the interpretation.
However, we admit that this criticism may be unfair. In fact, there is a possible reason why DH researchers have taken this approach to sentiment analysis. Digital humanities are still being formed as an independent discipline and it is easier to form something new in a step-by-step fashion. Resorting to a metaphor from the construction world, one should first learn how to stack single bricks to build a wall rather than starting from the design of a communications system. It is necessary to make sure that appropriate tools and methods are chosen instead of using what proved to be successful in other domains without reflection. It is true that much digital humanities research (especially dealing with text) uses the methods of text analysis that were in fashion in computational linguistic twenty years ago. One may argue that new research in digital humanities should start with the state-of-the-art methods. Indeed, some arguments that methodology is at the root of the interpretation have already been made. So, if there is anything that digital humanities can learn from computational linguistics, it is that methodology cannot stall. What really matters for digital humanities is interpretation, and if methodology is not going forward, the interpretation is not either.
We thank Laura Ana Maria Bostan, Sebastian Padó, and Enrica Troiano for fruitful discussions and the ZfDG team for their help in preparation of this article. This research has been conducted within the CRETA project which is funded by the German Ministry for Education and Research (BMBF) and partially funded by the German Research Council (DFG), projects SEAT (Structured Multi-Domain Emotion Analysis from Text, KL 2869/1-1).