In addition to classifying comments by mood, we also used the Reasoning Through Search system to characterize each comment's valence and intensity (on a scale from -2 to +2). Over the entire corpus of comments, we found that 54.5% of the comments were negative, 2.7% were neutral, and 42.8% were positive. The average valence of comments in the corpus was -0.211. This automated analysis indicates that the comments tend to be slightly negative overall.

An interesting trend emerged as we examined valence/intensity changes within individual threads. We found that the most commonly occurring trend (occurring in 54% of the threads) was for a thread to start out positive and end negative (as seen in the example thread in Figure 5), while only 43% made a negative-to-positive change and 3% stayed roughly the same throughout the course of the thread. This emphasizes our need to detect negativity, as it seems that these comments bring the discussion thread to an end, perhaps creating a boundary to participation.

Figure 5: Graphing comments in a thread by valence. Conversations in general start positive and end negative; this is true of the comments on the story in this illustration.

As these examples illustrate, the sentiment classification algorithms fared well in classifying the comments. We conducted a sample test to compare our own performance as human classifiers with that of the classification algorithms. We took 20 comments selected at random from our whole dataset of 168,095 threads and 782,934 comments, and conducted our own manual classification of these comments. For each comment, we (each of the authors) answered two questions: 1) Is this comment positive or negative? 2) Is this comment happy, sad, or angry? We worked alone and then compared our classifications with those of the two automated sentiment classifiers (the RTS valence and mood classification systems). For question #1, our answers agreed for 95% of the documents (disagreeing on only 1 comment out of 20). When compared to the RTS-classified valence of each comment, each of us agreed with the system output for 75% of the comments.

5. DISCUSSION AND FUTURE WORK
In this paper we have outlined our approach to surfacing negative comments in online comment forums. We were impressed with the relative success of the relevance-plus-sentiment detection system.

Our approach reveals not just negative and positive valence, but offers a richer palette of affect, classifying comments as happy, sad, or angry. We believe that introducing this additional nuance offers greater insight into the overall sentiment felt by an aggregate group, and can also help us identify inappropriate affect levels (by establishing an anger scale). We conjecture that a post which is off-topic and very angry is clearly salient for further investigation by community and site managers. Once conversational and trial posts (e.g., "hi") are taken into account, we also find that highly off-topic neutral posts are often spam. Our technique will also allow us to identify people who are consistently posters of angry content across different stories.

Given that our mood classification system was trained on LiveJournal blog posts, it is not surprising that there were some detection errors. In future work we will train our sentiment classifier with a larger dataset of comment threads taken from the site we are studying, utilizing this unlabeled data in an Expectation Maximization approach. This will also address the challenge of how well a sentiment detection system can perform when trained on short snippets or comments rather than blocks of prose.

We have also begun a hand-coded classification of comments according to an expanded notion of relevance: on-topic relevance is where a comment relates to the original text, whereas conversational relevance denotes a comment that refers to a previously posted comment by another community member. In the latter case, we have found through our analyses that conversationally relevant comments can be statements directed at something someone said, or at the person themselves. A negative, off-topic, conversationally relevant comment, where there is no relevance to the original text, is usually an insult directed at a person: the content refers to a community member (e.g., by name, or using "You are") and is combined with known insulting terms.
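As a rough illustration of that pattern, a keyword heuristic could flag such comments. This is only a sketch, not the hand-coded classification used in the study; the `INSULT_TERMS` lexicon, the `is_directed_insult` helper, and the example inputs are all hypothetical:

```python
import re

# Hypothetical insult lexicon; a deployed system would use a curated list.
INSULT_TERMS = {"idiot", "stupid", "loser"}

def is_directed_insult(comment, member_names, off_topic=True):
    """Flag a comment as a likely directed insult: it is off-topic,
    addresses a community member (by name or with "you are"),
    and contains a known insulting term."""
    text = comment.lower()
    addresses_member = any(name.lower() in text for name in member_names)
    second_person = re.search(r"\byou\s+are\b|\byou're\b", text) is not None
    insulting = any(term in text for term in INSULT_TERMS)
    return bool(off_topic and (addresses_member or second_person) and insulting)

print(is_directed_insult("You are such an idiot.", ["alice"]))      # True
print(is_directed_insult("Great story, thanks alice!", ["alice"]))  # False
```

Plain substring matching like this will of course over- and under-trigger; it only illustrates the combination of address detection plus an insult lexicon described above.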
We are also interested in whether the emotional valence of the original text is correlated with the emotional valence of the overall comment thread. We are validating our model by hand before investigating what kinds of linguistic disambiguation will need to be used in combination with our existing sentiment detection model.
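That correlation test could be run, for instance, as a plain Pearson coefficient over per-story pairs (story valence vs. mean comment-thread valence). The data below is purely illustrative:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: valence of each original story, and the mean
# valence of the comments in its thread (scale -2 to +2).
story_valence  = [1.5, 0.5, -1.0, -2.0, 2.0]
thread_valence = [0.3, -0.2, -1.2, -1.5, 0.8]
r = pearson(story_valence, thread_valence)
```

A value of `r` near +1 would indicate that thread sentiment tracks story sentiment; values near 0 would suggest thread dynamics dominate.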
In addition, we intend to use our method to model emotional trajectories through comment threads. Specifically, we wish to address whether negative comments (crossed by whether they are on or off topic) have an impact on the comments that follow: does negativity beget more negativity? Do conversations go south? And if they do, what are the characteristics of a conversation that escalates versus one that does not? Through this means we intend to address the issue of whether undesirable behavior does or does not model undesirable behavior in others, and if so, what effective in-thread remediation strategies may be better than simple deletion. We believe these local strategies over content are the way to effective community management. With tools that help surface where non-socio-normative behaviors are occurring, we can support human community managers' work more effectively by automatically finding and filtering potential violations.
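A first cut at the "does negativity beget negativity?" question is to compare the rate at which negative comments follow negative comments against the corpus base rate. A minimal sketch, over hypothetical valence-scored threads:

```python
def follows_negative_rate(threads):
    """P(comment negative | previous comment negative), pooled over
    all threads. Each thread is a list of valence scores on the
    -2..+2 scale; "negative" means valence < 0."""
    after_neg = neg_after_neg = 0
    for thread in threads:
        for prev, cur in zip(thread, thread[1:]):
            if prev < 0:
                after_neg += 1
                if cur < 0:
                    neg_after_neg += 1
    return neg_after_neg / after_neg if after_neg else 0.0

# Hypothetical threads of per-comment valence scores.
threads = [[1.0, 0.5, -0.5, -1.0, -2.0],
           [0.5, -1.0, 0.5],
           [-0.5, -1.5, -2.0]]
rate = follows_negative_rate(threads)
print(rate)  # 0.8
```

Comparing `rate` against the overall fraction of negative comments would indicate whether negativity is self-reinforcing; splitting the counts by on-/off-topic status would give the crossed analysis described above.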
There are, of course, also open questions, especially when dealing with people who are regular community visitors. For example, an open question is how a single person behaves over time: that is, what is the history of an author's sentiment about a topic over time? What is their authority in the social group, and from that, what is their influence? Clearly, some people's negative comments may carry more weight than others'. Do we see the emergence of groups who all share the same sentiment on certain topics? These questions represent a valuable, but fine-grained and socially oriented, research program. We believe our approach of combining relevance, affect/sentiment, and descriptions of posting patterns is a good starting point.
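One way to begin answering the question of how a single person behaves over time is to aggregate each author's comment valences chronologically, per topic. The `sentiment_history` helper and the records below are hypothetical, a minimal sketch of that bookkeeping:

```python
from collections import defaultdict

def sentiment_history(records):
    """Map (author, topic) -> time-ordered list of (timestamp, valence).
    records: iterable of (author, topic, timestamp, valence) tuples."""
    history = defaultdict(list)
    for author, topic, ts, valence in records:
        history[(author, topic)].append((ts, valence))
    for series in history.values():
        series.sort()  # chronological order
    return dict(history)

# Hypothetical comment records.
records = [("ann", "politics", 2, -1.5),
           ("ann", "politics", 1, -0.5),
           ("bob", "sports",   1,  1.0)]
hist = sentiment_history(records)
print(hist[("ann", "politics")])  # [(1, -0.5), (2, -1.5)]
```

From such series one could look for authors whose valence on a topic drifts or stays consistently angry, the "consistent posters of angry content" mentioned earlier.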
6. ACKNOWLEDGMENTS
We thank our colleagues at Yahoo! for their help with these analyses.