New Social Networking Technologies and Businesses - Giuseppe Manco
Giuseppe Manco
• Researcher at ICAR-CNR
• Areas of interest
  – Data analysis, social networks, recommender systems
CONNECTING, COMMUNICATING
Six degrees of separation
Myspace: 110 million users
What is a social network?
Traditional Media
Broadcast Media: One-to-Many
Communication Media: One-to-One
What is a social network?
• User-generated content
• User-enriched content
• User interaction
• Ubiquitous communication
• Collaborative environment
• …
Characteristics of Social Media
• Everyone can be a media outlet
• Disappearing communication barriers
• Rich user interaction
• User-generated content
• User-enriched content
• User-developed widgets
• Collaborative environment
• Collective wisdom
• Long tail
Broadcast Media: Filter, then Publish
Social Media: Publish, then Filter
User generated content
User interaction
Ubiquitous communication
Shared activities
The most visited websites
• Highest internet traffic (Alexa data, October 2012)
 1 Facebook         11 Blogspot
 2 Google           12 LinkedIn
 3 YouTube          13 Taobao
 4 Yahoo            14 Google India
 5 Baidu            15 Yahoo Japan
 6 Wikipedia        16 Sina.com.cn
 7 Windows Live     17 MSN
 8 Twitter          18 Google HK
 9 QQ.com           19 Google DE
10 Amazon           20 Bing
The fundamental role of social media
TYPES OF NETWORK
The Social Media Universe
Social Media
Social Networking
Blogs
Wiki
Forum
Content sharing
WHAT OPPORTUNITIES?
The company and social networks
• Public relations
• Customer support
• Market research
• Brand marketing
• Promotions
• Consumer education
• Sales
• New product development
• Customer relationship management
SOCIAL MEDIA ANALYTICS
Data, data, data
• Large volumes, great variety
  – Millions of users, millions of content items
  – Textual and multimedia (images, video, etc.)
  – Millions of connections
  – Trends, preferences, behaviors, …
• The data is open and easy to access
  – Easy to find
  – In the public domain
  – Developer APIs
  – Spidering the Web
The opportunities
• Any user can share and contribute content, express opinions, link to others
• This means we can data-mine the opinions and behaviors of millions of users to gain insights into:
  – Human behavior
  – Marketing analytics
  – Product sentiment
Jure Leskovec: Social Media Analytics (KDD '11 tutorial), 8/21/2011
Actionable Intelligence
Consumer Generated, Not Edited, Not Authenticated
Applications: Reputation management
• Consumer Brand Analytics
  – What are people saying about my brand?
• Marketing Communications
  – Determine whether the campaigns I am planning will be effective
• Product reviews
  – Automatic extraction of reviews and information about products and services
    • Easy to use, comfortable, fairly priced, …
Applications: Responsiveness
• Citizen response
• Feedback on political issues
• Political campaigns
  – Why do people support a candidate?
• Law enforcement
  – Dissident movements on Twitter
  – Minority Report: http://www.nytimes.com/2011/08/16/us/16police.html?_r=1
Applications: Viral Marketing
• Viral marketing:
  – Personalized recommendations
• The role of online forums:
  – 79.2% of forum participants help connected users make decisions about a product
  – 65% of forum participants share advice (offline or personalized) based on information they have read online
http://www.socialmediaexaminer.com/new-studies-show-value-of-social-me
Applications: Human behavior analysis
• Process social media content and provide tools for analysts to:
  – Identify social networks: groups, members
  – Identify topics and sentiment
Social Media Content
Link Diagrams
Predictive Modeling
Relevance, Authority, Sentiment
• The three dimensions of social interaction
Finding the Relevant Blogs
OUR FIRST OBJECTIVE is to filter the vast blogosphere from millions to the thousands of blogs most relevant to the topic of interest. In the simplest case, the “topic” can be a specific product, and the objective is therefore to identify all blogs discussing this product and perhaps competing products as well. More generally, one would like to cast a wider net and include blogs that are discussing higher-level concepts related to the market addressed by the product(s) of interest. For example, IBM offers a social networking product called Lotus Connections, but marketing experts may wish to follow all discussions touching on the concept of collaborative software as a means to understand emerging trends in this space.
The distinction between tracking a specific product and tracking a broader concept impacts the methodology used to find the relevant set of blogs. If the interest is only in a specific product, it is straightforward to identify blogs (e.g. by using a blog search engine) containing references to the product. Such an approach is less effective for broad topics because discussions that touch on such a topic (e.g. “collaborative software”) may not specifically contain these keywords. In practice, it is reasonable to ask marketing experts to identify a small set of “seed” blogs that are highly relevant to the topic at hand. One approach is to use these labeled blogs to build a straightforward text classification model to identify other relevant blogs.
Relevant blogs are likely to link to other relevant blogs, and an alternative approach to text classification is to exploit the structure of the blog cross-reference graph. One simple approach is to start with the small set of expert-identified seed blogs, add all the blogs they link to and then repeat this process for several iterations (degrees of separation). This snowball sampling procedure was used to identify the blogs shown in Figure 1; note that the second (and third) iteration of this process identified a number of relevant blogs not included in the seed population.
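The snowball sampling procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the link graph is held in memory as a dictionary, and the names `snowball_sample`, `toy_links` and the blog identifiers are hypothetical, not taken from the IBM tool.

```python
from typing import Dict, Iterable, List, Set


def snowball_sample(links: Dict[str, List[str]],
                    seeds: Iterable[str],
                    iterations: int) -> Set[str]:
    """Expand a seed set by following outgoing links for a fixed number of iterations."""
    selected = set(seeds)
    frontier = set(seeds)
    for _ in range(iterations):
        # Collect every blog linked from the current frontier that is not selected yet.
        next_frontier = set()
        for blog in frontier:
            for target in links.get(blog, []):
                if target not in selected:
                    next_frontier.add(target)
        if not next_frontier:
            break  # nothing new to add, stop early
        selected |= next_frontier
        frontier = next_frontier
    return selected


# Hypothetical toy link graph: blog -> blogs it links to.
toy_links = {
    "seed-a": ["b", "c"],
    "b": ["d"],
    "c": ["seed-a"],
    "d": ["e"],
}
print(sorted(snowball_sample(toy_links, ["seed-a"], iterations=2)))
```

Each iteration corresponds to one additional degree of separation from the expert-chosen seeds, so the `iterations` parameter directly controls how wide the net is cast.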
Discussions about broad concepts like “collaboration software” tend to be tightly connected, and hence this simple approach is likely to be more efficient than keyword search in finding these blog sub-communities. Using the graph structure also alleviates the problem when product search terms have multiple meanings, e.g. “Lotus” is a car, a flower and a software brand – it is unlikely that blogs talking about Lotus the car will reference blogs discussing Lotus Software.
An important consideration is to avoid crawling [computer program that browses the Internet in a methodical, automated manner] the parts of the relevant blog sub-universe that are irrelevant from a marketing perspective. A practical solution is focused snowball sampling [3], which explicitly focuses Web
that utilize these analytics capabilities to provide marketing insights from blogs.
Social Media Analytics for Marketing
FROM A MARKETING and market intelligence perspective, blogs are a very important form of social media because they provide access to previously inaccessible information such as specific customer insights and opinions. Social media analytics can address several interesting questions by providing algorithms and approaches for the automated analysis of blogs and related social media:
1. Given the massive size of the blogosphere, how can we identify the subset of blogs and forums that are discussing not only a specific product, but also higher-level concepts that are in some way relevant to this product?
2. What sentiment is expressed about a product or concept in a blog or forum?
3. Who are the most authoritative or influential bloggers in this relevant subspace?
4. What are the novel emerging topics of discussion hidden in the constant chatter in the blogosphere?
A typical blog or micro-blog has one author (the blogger) and consists of multiple entries or posts. It is useful to think of a blog in a three-dimensional space defined by the first three metrics above: relevance, sentiment and authority. While the first two dimensions, relevance and sentiment, are specific to a given post or even smaller section of text (“snippet”), the notion of authority is most naturally assigned at the blog level. A blogger’s authority can also depend on the specific topic. Emerging topics are a property of the blogosphere at large and require analysis across many blogs.
All three dimensions are important and they need to be considered in a unified view in order to provide marketing insight. One way to provide such a view is to determine the relevance and sentiment of each post, and characterize the overall relevance and sentiment of the blog as a simple statistic over individual posts.
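As a rough illustration of "a simple statistic over individual posts", the sketch below averages post-level scores per blog. The choice of the mean is an assumption (the article does not say which statistic is used), and the names `aggregate_by_blog`, `Post` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Each post: (blog name, relevance in [0, 1], probability that the post is positive).
Post = Tuple[str, float, float]


def aggregate_by_blog(posts: List[Post]) -> Dict[str, Tuple[float, float]]:
    """Return, for each blog, the mean relevance and mean sentiment of its posts."""
    grouped = defaultdict(list)
    for blog, relevance, sentiment in posts:
        grouped[blog].append((relevance, sentiment))
    return {
        blog: (mean(r for r, _ in scores), mean(s for _, s in scores))
        for blog, scores in grouped.items()
    }


# Hypothetical post-level scores, as a relevance model and a sentiment model might produce.
sample_posts = [
    ("blog-x", 0.5, 0.25),
    ("blog-x", 1.0, 0.75),
    ("blog-y", 0.25, 0.5),
]
for blog, (relevance, sentiment) in aggregate_by_blog(sample_posts).items():
    print(blog, relevance, sentiment)
```

A median or a relevance-weighted mean would be equally plausible choices; the mean is used here only because it is the simplest.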
Figure 1 captures such a blog-centric view along these dimensions from a prototype tool at IBM Research. Here, we are interested in a high-level view of blogs relevant to the broad topic of “social collaboration.” Relevance is shown on the y-axis, and sentiment is on the x-axis – both metrics are computed at the post level and aggregated to the blog level. Each circle represents a blog, and the size of the circle reflects the blogger’s authority. The output of the model can be interpreted as the probability that the post is positive in tone. We are most interested in extremes of sentiment, so we naturally look for authoritative blogs in the upper-left and upper-right quadrants to find the most relevant blogs with non-neutral sentiment. Such a view allows marketing people to quickly identify blogs of interest, and to drill down to obtain more specific understanding of the potential marketing impact.
www.orms-today.com
Figure 1: Relevance, authority and sentiment at the blog level.
IBM’s topic-based blog evaluator
Sentiment Detection
• Is it possible to characterize (automatically) the tone of a discussion?
crawling using only links deemed to be relevant by a text classifier. There are many other opportunities to apply large-scale versions of classification models [4] that exploit both graph structure and text content.
Sentiment Detection
WHILE THE RELEVANCE MODEL can help to limit the universe to thousands rather than a half million blogs, this is still a volume far beyond close scrutiny by a small marketing staff. But which ones of all the relevant blogs should you read? And are the most relevant indeed the most crucial? An immediate concern from a marketing perspective is to detect strong sentiment, particularly strong negative sentiment. Since it is impossible to read all blog posts relevant to a particular topic, there is strong motivation to develop automated capabilities to characterize the tone and sentiment in these discussions.
Sentiment detection models are crucial in order to identify blog posts that may require swift marketing action. Figure 2 illustrates such a situation identified by an IBM social media analytics tool. Sentiment models are able to detect a negative post (1), resulting in a rapid response (2) from a product executive. The quick response leads to a very positive statement (3) from the original blogger. This exchange illustrates the role social media analytics can play in allowing marketing to identify and address negative sentiment before it can cause more brand damage.
The main challenge in sentiment classification is that the expression of sentiment tends to be domain specific, and the set of domains to monitor changes often. Thus we require sentiment classifiers that can rapidly adapt to new domains without requiring a large number of manually labeled training examples of positive and negative sentiment. Treating sentiment detection as a text classification task has made it possible to adapt to new domains, provided there are enough training examples in the target domain. However, supervision for a sentiment classifier can be provided not only by labeling documents (e.g. blog posts), but also by labeling words. For instance, labeling a word such as “atrocious” as negative is one way to express our prior belief of the sentiment associated with it. It is possible to learn from such labeled words in conjunction with labeled documents. Furthermore, the selection of words and documents to be labeled can be made algorithmically.
Such an approach is known as active dual supervision [5], and it can greatly reduce the effort required to label examples in a new domain. Even though there are expressions of sentiment that are domain-specific, there is still a large amount of overlap in how positive and negative emotion is conveyed across domains. This enables the use of transfer learning to adapt a classifier trained in one domain to a new domain with little to no labeled data in the target domain [6].
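To make the idea of learning from labeled words together with labeled documents more concrete, here is a deliberately simplified sketch. It is not the active dual supervision method of [5]: it is a plain Naive Bayes classifier in which labeled words are injected as pseudo-counts, with no active selection of what to label. The function names, the `word_weight` parameter and the training data are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def train(docs: List[Tuple[str, str]],
          labeled_words: Dict[str, str],
          word_weight: float = 5.0) -> Dict[str, Counter]:
    """Build per-class word counts from labeled documents and labeled words."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in docs:
        counts[label].update(text.lower().split())
    # Labeled words act as prior knowledge: add pseudo-counts to their class.
    for word, label in labeled_words.items():
        counts[label][word] += word_weight
    return counts


def positive_probability(counts: Dict[str, Counter], text: str) -> float:
    """Probability that `text` is positive (Naive Bayes, add-one smoothing, equal class priors)."""
    vocab = set(counts["pos"]) | set(counts["neg"])
    log_score = {}
    for label in ("pos", "neg"):
        total = sum(counts[label].values()) + len(vocab)
        log_score[label] = sum(
            math.log((counts[label][w] + 1) / total)
            for w in text.lower().split()
            if w in vocab  # words never seen in training are ignored
        )
    # Turn the two log-scores into a probability for the positive class.
    return 1.0 / (1.0 + math.exp(log_score["neg"] - log_score["pos"]))


# Hypothetical supervision: two labeled posts plus one labeled word.
model = train(
    docs=[("great product love it", "pos"), ("slow and buggy", "neg")],
    labeled_words={"atrocious": "neg"},
)
print(positive_probability(model, "atrocious"))
print(positive_probability(model, "love it"))
```

Note that "atrocious" never appears in a labeled document, yet the classifier scores it as negative because of the word-level label, which is the effect the paragraph above describes.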
Measuring Influence and Authority
WHILE RELEVANCE AND SENTIMENT provide two essential filters, it is unlikely that each and every relevant blog with negative sentiment warrants an action. An important consideration is how much does the opinion of one blogger actually matter? A well-known riddle asks, “If a tree falls in a forest and no one is around to hear it, does it make a sound?” This ultimately translates into the question of how influential is the blog in question – is anybody actually listening (or reading) and is it likely that these opinions will influence other individuals?
Influential bloggers may or may not be factual experts but nevertheless influence the opinions of others via discussions on the topic. From a marketing perspective, it is important to identify this set of bloggers, since any negative sentiment they express could spread far and wide. In addition to authorities, there are bloggers who are very well connected, who are most responsible for the spread of information in the blogosphere. When presented with a large number of posts relevant to a topic, ordering them by the blogger’s influence assists in information triage, given that it is not feasible to read all posts. Figure 3 shows such a view, where we have found the most authoritative blogs relevant to the topic “social collaboration.”
Since reliable blog readership information is difficult to obtain, the links between blogs are commonly used instead to determine a blog’s authority. For instance, Technorati (www.technorati.com/) assigns an authority score to a blog based on the number of blogs linking to the Web site in the last six months. Similarly, Blogpulse (www.blogpulse.com/) ranks blogs based on the number of times it is cited by other bloggers over the last 30 days. Given that we have a network of directed edges indicating the links between posts/blogs, we can apply more complex measures of prestige from social network analysis. For instance, the author-
OR/MS TODAY, February 2010
Figure 2: Identifying and addressing negative sentiment.
MARKETING & SOCIAL MEDIA
IBM Social Media Analytics
Measuring Influence and Authority
• Who are the susceptible users?
• How does information propagate?
• When is an opinion reliable?
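A link-count authority score of the kind attributed to Technorati and Blogpulse in the article excerpt above could be approximated as follows. This is a minimal sketch under assumptions: it counts distinct linking blogs over the whole graph and ignores the time windows (six months, 30 days) those services apply; `inlink_authority` and the toy graph are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def inlink_authority(links: Dict[str, List[str]]) -> Counter:
    """Count, for each blog, how many distinct other blogs link to it."""
    authority = Counter()
    for source, targets in links.items():
        for target in set(targets):   # count each linking blog only once
            if target != source:      # ignore self-links
                authority[target] += 1
    return authority


# Hypothetical toy link graph: blog -> blogs it links to.
blog_links = {
    "a": ["c", "c", "b"],
    "b": ["c"],
    "c": ["c", "a"],
}
for blog, score in inlink_authority(blog_links).most_common():
    print(blog, score)
```

Sorting posts by the score of their blog gives a simple ordering for triage; more refined prestige measures from social network analysis would also weigh who the linking blogs are, not just how many there are.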
Emerging Topics
• Higher-level concepts from the information being spread
• How do these concepts change?
Most mentioned phrases in the US presidential campaign
http://memetracker.org
Social media and businesses…
• Two perspectives
  – New scenarios and models of interaction
  – Analytics
• Close cooperation with research and innovation
  – New challenges
  – Enormous opportunities