Sentiment Analysis in Social Networks
Ebook, 632 pages, 11 hours

About this ebook

The aim of Sentiment Analysis is to define automatic tools able to extract subjective information from texts in natural language, such as opinions and sentiments, in order to create structured and actionable knowledge to be used by either a decision support system or a decision maker. Sentiment analysis has gained even more value with the advent and growth of social networking.

Sentiment Analysis in Social Networks begins with an overview of the latest research trends in the field. It then discusses the sociological and psychological processes underlying social network interactions. The book explores both semantic and machine learning models and methods that address context-dependent and dynamic text in online social networks, showing how social network streams pose numerous challenges due to their large-scale, short, noisy, context-dependent and dynamic nature.

Further, this volume:

  • Takes an interdisciplinary approach from a number of computing domains, including natural language processing, machine learning, big data, and statistical methodologies
  • Provides insights into opinion spamming, reasoning, and social network analysis
  • Shows how to apply sentiment analysis tools for a particular application and domain, and how to get the best results for understanding the consequences
  • Serves as a one-stop reference for the state-of-the-art in social media analytics
Language: English
Release date: Oct 6, 2016
ISBN: 9780128044384
Author

Federico Alberto Pozzi

Dr. Federico Alberto Pozzi received his Ph.D. in Computer Science from the University of Milano-Bicocca (Italy). His Ph.D. thesis focused on Probabilistic Relational Models for Sentiment Analysis in Social Networks. His research interests primarily focus on Data Mining, Text Mining, Machine Learning, Natural Language Processing and Social Network Analysis, in particular applied to Sentiment Analysis and Community Discovery in Social Networks. He currently works at SAS Institute (Italy) as Senior Solutions Specialist - Integrated Marketing Management & Analytics.

    Book preview

    Sentiment Analysis in Social Networks - Federico Alberto Pozzi

    Chapter 1

    Challenges of Sentiment Analysis in Social Networks

    An Overview

    F.A. Pozzi (a); E. Fersini (b); E. Messina (b); B. Liu (c)

    a SAS Institute Srl, Milan, Italy

    b University of Milano-Bicocca, Milan, Italy

    c University of Illinois at Chicago, Chicago, IL, United States

    Abstract

    In this chapter we provide some background knowledge for the sentiment analysis research field, subsequently providing an overview of the current challenges related to the social network environment. The main content of the chapter is devoted to introducing the reader to some preliminary concepts, which are further detailed in the subsequent chapters.

    Keywords

    Sentiment analysis; Opinion mining; Social networks; Objective sentences; Subjective sentences; Explicit opinions; Implicit opinions

    1 Background

    Sentiment analysis, which is also called opinion mining, has been one of the most active research areas in natural language processing since early 2000 [1]. The aim of sentiment analysis is to define automatic tools able to extract subjective information from texts in natural language, such as opinions and sentiments, so as to create structured and actionable knowledge to be used by either a decision support system or a decision maker.

    Unsurprisingly, there has been some confusion among researchers about the difference between sentiment and opinion, and a debate about whether the field should be called sentiment analysis or opinion mining. In Merriam-Webster’s Collegiate Dictionary, sentiment is defined as "an attitude, thought, or judgment prompted by feeling," whereas opinion is defined as "a view, judgment, or appraisal formed in the mind about a particular matter." The difference is quite subtle, and each of them contains some elements of the other. The definitions indicate that an opinion is more of a person’s concrete view about something, whereas a sentiment is more of a feeling. For example, the sentence "I am concerned about the current political situation" expresses a sentiment, whereas the sentence "I think politics is not doing well" expresses an opinion. If someone says the first sentence in a conversation, we can respond by saying "I share your sentiment," but for the second sentence we would normally say "I agree/disagree with you." However, the underlying meanings of the two sentences are strictly related because the sentiment depicted in the first sentence is likely to be a feeling caused by the opinion in the second sentence. Conversely, the first sentiment sentence implies a negative opinion about politics, which is what the second sentence is saying. Although in most cases opinions imply positive or negative sentiments, some opinions do not, such as "I think he will win at the next presidential election."

    More formally, as defined in [1], an opinion is a quintuple,

       $(e_i, a_{ij}, s_{ijkl}, h_k, t_l)$   (1.1)

    where $e_i$ is the name of an entity, $a_{ij}$ is an aspect of $e_i$, $s_{ijkl}$ is the sentiment on aspect $a_{ij}$ of entity $e_i$, $h_k$ denotes the opinion holder, and $t_l$ is the time when the opinion is expressed by $h_k$.

    The sentiment $s_{ijkl}$ is positive, negative, or neutral, or expressed with different strength/intensity levels, such as the 1–5 stars system used by most review websites (eg, Amazon¹).

    For example, consider that yesterday John bought an iPhone. He tested it during the whole day and when he went home from work (at 19:00 on 2-15-2014) he wrote on his favorite social network the message "The iPhone is very good, but they still need to work on battery life and security issues." Let us index "iPhone," "battery life," and "security" as 1, 2, and 3 respectively. John is indexed as 4 and the time when he wrote the sentence is indexed as 5. Then John is the opinion holder $h_4$ and $t_5$ (19:00 2-15-2014) is the time when the opinion is expressed by $h_4$ (John). The term "iPhone" is the entity $e_1$, "battery life" and "security issues" are aspects $a_{12}$ and $a_{13}$ of entity $e_1$ (iPhone), $s_{1245}$ = neg is the sentiment on aspect $a_{12}$ (battery life) of entity $e_1$ (iPhone), and $s_{1345}$ = neg is the sentiment on aspect $a_{13}$ (security issues) of entity $e_1$ (iPhone). When an opinion is on the entity itself as a whole, the special aspect "GENERAL" is used to denote it.
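
    As a rough illustration, the quintuple can be written down as a small record type. The sketch below is not taken from the chapter: the class name Opinion, its field names, and the labels "pos" and "neg" are assumptions chosen for readability, and the date 2-15-2014 is read as February 15, 2014. The positive opinion on the iPhone as a whole is stored under the special aspect "GENERAL".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple: entity, aspect, sentiment, holder, time."""
    entity: str       # e_i, the entity name
    aspect: str       # a_ij, an aspect of the entity ("GENERAL" for the whole entity)
    sentiment: str    # s_ijkl, here one of "pos", "neg", "neu" (assumed labels)
    holder: str       # h_k, the opinion holder
    time: datetime    # t_l, when the opinion was expressed


# John's message, posted at 19:00 on 2-15-2014 (read as February 15, 2014).
posted_at = datetime(2014, 2, 15, 19, 0)

john_opinions = [
    Opinion("iPhone", "GENERAL", "pos", "John", posted_at),
    Opinion("iPhone", "battery life", "neg", "John", posted_at),
    Opinion("iPhone", "security issues", "neg", "John", posted_at),
]

# Which aspects of the iPhone did John criticize?
negative_aspects = [o.aspect for o in john_opinions if o.sentiment == "neg"]
print(negative_aspects)
```

    A single message therefore yields several quintuples, one per evaluated aspect, all sharing the same holder and time.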

    From the definition of sentiment analysis reported above, "the aim of sentiment analysis is therefore to define automatic tools able to extract subjective information in order to create structured and actionable knowledge." In line with this, the quintuple-based definition provides a framework to transform unstructured text to structured data (eg, a database table). Then a rich set of qualitative, quantitative, and trend analyses can be performed with traditional database management systems and online analytical processing tools.
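
    To make the step from quintuples to a database table concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name opinion, the column names, and the sentiment labels are illustrative assumptions; the three rows encode John's iPhone message from the example above, with the timestamp written in ISO form.

```python
import sqlite3

# Quintuples (entity, aspect, sentiment, holder, time) from John's message.
rows = [
    ("iPhone", "GENERAL", "pos", "John", "2014-02-15T19:00"),
    ("iPhone", "battery life", "neg", "John", "2014-02-15T19:00"),
    ("iPhone", "security issues", "neg", "John", "2014-02-15T19:00"),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE opinion ("
    "entity TEXT, aspect TEXT, sentiment TEXT, holder TEXT, time TEXT)"
)
conn.executemany("INSERT INTO opinion VALUES (?, ?, ?, ?, ?)", rows)

# A simple quantitative analysis: opinions per sentiment for one entity.
counts = dict(
    conn.execute(
        "SELECT sentiment, COUNT(*) FROM opinion "
        "WHERE entity = ? GROUP BY sentiment",
        ("iPhone",),
    ).fetchall()
)
print(counts)
conn.close()
```

    Once opinions are stored this way, trend analyses amount to ordinary grouping and filtering on the time, entity, and aspect columns.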

    Because of the importance of sentiment analysis to business and society, it has spread from computer science to management science and the social sciences. In recent years industrial activities surrounding sentiment analysis have also thrived: numerous start-ups have emerged, and many large corporations have built their own in-house capabilities (eg, Microsoft, Google, Hewlett-Packard, IBM, SAP, and SAS Global Communications).

    Thanks to its strong applicability and interest in both the academic field and the industrial field, sentiment analysis is nowadays a trending topic. Fig. 1.1 represents the Google Trends data related to the keywords "sentiment analysis," clearly demonstrating the continuous and increasing interest in this field.

    Fig. 1.1 Google Trends data related to the keywords "sentiment analysis."

    Nowadays, sentiment analysis has gained even more value with the advent of social networks. Their great diffusion and their role in modern society represent one of the most interesting novelties in recent years, capturing the interest of researchers, journalists, companies, and governments. The dense interconnection that often arises among active users generates a discussion space that is able to motivate and involve individuals of a larger agora, linking people with common objectives and facilitating diverse forms of collective action. Social networks are therefore creating a digital revolution, enabling the expression and spread of emotions and opinions through the network, opening a window on others’ respective worlds, and snooping into their lives. Opinionated data on the net, if properly collected and analyzed, allow one not only to understand and explain many complex social phenomena but also to predict them.

    Because current technology enables the efficient storage and retrieval of huge amounts of data, the focus is now on methods for extracting information and creating knowledge from raw sources. Social networks represent an emerging and challenging sector in the context of big data: the natural language expressions of people can be easily reported through short text messages, rapidly creating unique content of huge dimensions that must be efficiently and effectively analyzed to create actionable knowledge for decision making processes.

    The massive quantity of continuously contributing texts in social networks, which should be processed in real time so as to make informed decisions, calls for two main types of radical progress: (1) a change of direction in the research through the transition from a data-constrained to data-enabled paradigm and (2) the convergence to a multidisciplinary area that mainly takes advantage of psychology, sociology, natural language processing, and machine learning. The knowledge embedded in social network content has been shown to be of paramount importance from both user and company/organization points of view: while people express opinions on any kind of topic in an unconstrained and unbiased environment, corporations and institutions can gauge valuable information from raw sources. To make qualitative textual data effectively functional for decision processes, the quantification of what people think becomes a mandatory step.

    However, the term "sentiment analysis" is often used improperly to refer to polarity classification, which is instead a subtask aimed at extracting positive, negative, or neutral sentiments (also called polarities) from texts. Although an opinion could also have a neutral polarity (eg, "I don’t know if I liked the movie or not. I should watch it quietly."), most work in sentiment analysis usually assumes only positive and negative sentiments for simplicity. Depending on the field of application, several names are used for sentiment analysis (eg, opinion mining, opinion extraction, sentiment mining, subjectivity analysis, affect analysis, emotion analysis, and review mining). A taxonomy of the most popular sentiment analysis tasks is reported in Fig. 1.2. Sentiment Analysis in Social Networks tries to overcome this limitation by (1) collecting and proposing new relevant research work from experts in the field, (2) debating the advantages and disadvantages when one is applying sentiment analysis in social networks, and (3) discussing the progress of sentiment analysis in social networks and future directions.

    Fig. 1.2 Sentiment analysis tasks.

    This book will accurately investigate the above-mentioned needs by providing advanced and specific solutions to address sentiment analysis in social networks. In particular, it presents the latest work by some of the most relevant experts in the field. At the end, a detailed conclusive discussion is provided and the personal and valuable thoughts and opinions of these researchers on future directions are presented. Although polarity classification is usually considered a core task because of its direct utility and applicability in working systems, all the chapters aspire to be relevant with respect to the needs outlined above. This is not a mere protocol declaration: the book has been thought about and designed as a whole, as an indivisible gold mine, which intends to provide contributions highly connected to each other.

    2 Sentiment Analysis in Social Networks: A New Research Approach

    The general trend in research regarding sentiment analysis in social networks is to apply the techniques inherited from traditional sentiment analysis studied since early 2000. However, considering the evolution of the sources where opinions are voiced, the strategies available in the current state of the art are no longer effective for mining opinions in this new and challenging environment. In fact, social network sentiment analysis, in addition to inheriting a multitude of issues from traditional sentiment analysis and natural language processing, introduces further complexities (short messages, noisy content, metadata such as gender, location, and age) and new sources of information not leveraged in traditional approaches.

    In particular, given that social networks are clearly having an impact on language, the daily challenges regarding sentiment analysis mainly focus on the constant evolution of the language used online in user-generated content: the words that surround us every day influence the words we use. Since much of the written language we see is now on the screens of our computers, tablets, and smartphones, language now evolves partly through our interaction with technology. And because the language we use to communicate with each other in social networks tends to be more malleable than formal writing, the combination of informal, personal communication and the mass audience afforded by social networks is a recipe for rapid change. Taking into serious consideration the continuous language revolution, we believe sentiment analysis systems should be able to natively adapt to it, or alternatively be adapted by researchers. Being able to juggle these problems requires strong natural language processing and linguistics skills. As a side effect, this language evolution strongly influences the way in which irony and sarcasm are expressed.

    A further daily challenge relates to the nature of social networks, which by definition are dynamic and heterogeneous, with the entities involved connected to each other. Conversely, a representation of real-world data where instances are considered as homogeneous, independent, and identically distributed leads to a substantial loss of information and to the introduction of a statistical bias. Dealing with relational environments by taking advantage of social network analysis becomes a mandatory step to go beyond the current state of the art, where only textual content is tackled. For this reason, the combination of content and relationships is a core task of the recent literature on sentiment analysis.

    A final crucial issue, which is usually overlooked, is concerned with visualization and summarization of opinions. This issue becomes more important when opinions need to be concisely presented over large networked environments. Traditional visual analytic tools need to be redesigned according to this novel necessity.

    3 Sentiment Analysis Characteristics

    Sentiment analysis is a broad and complex field of research. In the following, the main characteristics that constitute sentiment analysis are described and discussed in detail.

    3.1 Sentiment Categorization: Objective Versus Subjective Sentences

    The first aim when one is dealing with sentiment analysis usually consists in distinguishing between subjective and objective sentences. If a given sentence is classified as objective, no other fundamental tasks are required, while if the sentence is classified as subjective, its polarity (positive, negative, or neutral) needs to be estimated (see Fig. 1.3). Subjectivity classification [2] is the task that distinguishes sentences that express objective (or factual) information (objective sentences) from sentences that express subjective views and opinions (subjective sentences).

    Fig. 1.3 Sentiment analysis workflow.

    An example of an objective sentence is "The iPhone is a smartphone," while an example of a subjective sentence is "The iPhone is awesome." Polarity classification is the task that distinguishes sentences that express positive, negative, or neutral polarities. Note that a subjective sentence may not express any positive or negative sentiment (eg, "I guess he has arrived"). For this reason, it should be classified as neutral.
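
    The two-stage workflow described in this section (subjectivity classification first, polarity classification only for subjective sentences) can be sketched as follows. This is a toy illustration, not a method from the chapter: the word lists SUBJECTIVE_CUES, POSITIVE, and NEGATIVE are hypothetical and far too small for real use, and a working system would replace both steps with trained classifiers.

```python
import re

# Hypothetical toy lexicons, for illustration only.
SUBJECTIVE_CUES = {"awesome", "good", "love", "bad", "terrible", "hate",
                   "guess", "think"}
POSITIVE = {"awesome", "good", "love"}
NEGATIVE = {"bad", "terrible", "hate"}


def tokens(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def is_subjective(sentence):
    """Step 1: subjectivity classification via cue words (toy heuristic)."""
    return any(t in SUBJECTIVE_CUES for t in tokens(sentence))


def polarity(sentence):
    """Step 2: polarity classification by counting lexicon hits."""
    words = tokens(sentence)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def analyze(sentence):
    """Objective sentences stop at step 1; subjective ones get a polarity."""
    if not is_subjective(sentence):
        return "objective"
    return polarity(sentence)


for s in ["The iPhone is a smartphone",
          "The iPhone is awesome",
          "I guess he has arrived"]:
    print(s, "->", analyze(s))
```

    Under these toy word lists, the third sentence is caught as subjective by the cue "guess" but matches no polarity word, so it falls through to neutral, mirroring the case noted above.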

    3.2 Levels of Analysis

    As mentioned earlier, the aim of sentiment analysis is to "define automatic tools able to extract subjective information from texts in natural language." The first choice when one is applying sentiment analysis is to define what "text" (ie, the analyzed object) means in the case study considered.

    In general, sentiment analysis in social networks can be investigated mainly at three levels (represented graphically in Fig. 1.4):

    Fig. 1.4 Different levels of analysis.

    • Message level: The aim is to classify the polarity of a whole opinionated message. For example, given a product review, the system determines whether the text message expresses an overall positive, negative, or neutral opinion about the product. The assumption is that the entire message expresses only one opinion on a single entity (eg, a single product).

    • Sentence level: The aim is to determine the polarity of each sentence contained in a text message. The assumption is that each sentence, in a given message, denotes a single opinion on a single entity.

    • Entity and aspect level: Performs a finer-grained analysis than message and sentence level. It is based on the idea that an opinion consists of a sentiment and a target (of opinion). For example, the sentence "The iPhone is very good, but they still need to work on battery life and security issues" evaluates three aspects: iPhone (positive), battery life (negative), and security (negative).
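
    The difference between the three levels can be seen by running the same toy scorer at each granularity. Everything in this sketch is an assumption made for illustration: the tiny lexicons, the punctuation-based sentence splitter, the fixed ASPECTS list, and the heuristic that an aspect inherits the polarity of the sentence mentioning it. The sample message is also invented.

```python
import re

# Hypothetical toy resources, for illustration only.
POSITIVE = {"good", "awesome", "great"}
NEGATIVE = {"bad", "poor", "terrible"}
ASPECTS = ("screen", "battery")


def polarity(text):
    """Score a span of text by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def message_level(message):
    """One label for the whole message."""
    return polarity(message)


def sentence_level(message):
    """One label per sentence (naive split on ., ! and ?)."""
    parts = re.split(r"(?<=[.!?])\s+", message)
    return [(s.strip(), polarity(s)) for s in parts if s.strip()]


def aspect_level(message):
    """One label per aspect: the polarity of the sentence that mentions it."""
    result = {}
    for sentence, label in sentence_level(message):
        for aspect in ASPECTS:
            if aspect in sentence.lower():
                result[aspect] = label
    return result


message = "The screen is awesome. The battery is terrible. It arrived on Monday."
print(message_level(message))
print(sentence_level(message))
print(aspect_level(message))
```

    In this example the message-level label comes out neutral because the positive and negative words cancel, while the sentence and aspect levels keep the two opposing opinions apart. That loss of detail is the reason finer-grained analysis is useful when a message covers more than one target.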

    3.3 Regular Versus Comparative Opinion

    An opinion can assume different shades and can be assigned to one of the following groups:

    • Regular opinion: A regular opinion is often referred to in the literature as a standard opinion and it has two main subtypes:

    – Direct opinion: A direct opinion refers to an opinion expressed directly on an entity (eg, "The screen brightness of the iPhone is awesome").

    – Indirect opinion: An indirect opinion is an opinion that is expressed indirectly on an entity on the basis of its effects on some other entities. For example, the sentence "After I switched to the iPhone, I lost all my data!" describes an undesirable effect of the switch on "my data," which indirectly gives a negative sentiment to the iPhone.

    • Comparative opinion: A comparative opinion expresses a relation of similarities or differences between two or more entities and/or a preference of the opinion holder based on some shared aspects of the entities [3]. For example, the sentences "iOS is better performing than Android" and "iOS is the best performing operating system" express two comparative opinions. A comparative opinion is usually expressed with use of the comparative or superlative form of an adjective or
