Artificial Intelligence for Data Processing in the Industry 4.0 Scenario
Roberto Bellotti, Dipartimento Interateneo di Fisica "M. Merlin", Istituto Nazionale di Fisica Nucleare
Outline
• Big Data as fuel for Artificial Intelligence (AI) systems
• AI & Complex Networks
• Political propaganda
• Study of neurodegenerative diseases with AI techniques
• Evaluation of investments
• Conclusions
Artificial Intelligence
"Whoever develops the best artificial intelligence will become the master of the world" (Putin, 2017)
Artificial Intelligence (AI)
A discipline that studies whether and how the most complex mental processes can be reproduced using a computer. This research develops along two complementary paths: on the one hand, AI seeks to bring the functioning of computers closer to the capabilities of human intelligence; on the other, it uses computer simulations to formulate hypotheses about the mechanisms used by the human mind (Treccani).
1 bit = 0/1 → One letter = 1 byte (= 8 bits). One book = one good-quality photo = about 1 Megabyte. 1 Gigabyte = 1,000 books. 1 Terabyte = 1,000,000 books.
Facebook: 500 Terabytes of data per day, including about 3 billion "likes" and 300 million photos. Estimated data held by FB: 100,000 Terabytes.
Google and Amazon → over one million Terabytes.
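The orders of magnitude above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal units (1 Terabyte = 1,000,000 Megabytes) and the slide's rule of thumb of one book per Megabyte; the function and variable names are illustrative only.

```python
# Rule of thumb from the slide: one book (or one good photo) is about 1 Megabyte.
MB_PER_BOOK = 1
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch


def books_equivalent(terabytes):
    """Return how many 1 MB 'books' fit in the given number of Terabytes."""
    return int(terabytes * MB_PER_TB / MB_PER_BOOK)


facebook_daily_tb = 500      # data produced per day (from the slide)
facebook_total_tb = 100_000  # estimated data held (from the slide)

print("Books per day:", books_equivalent(facebook_daily_tb))
print("Days of traffic held:", facebook_total_tb / facebook_daily_tb)
```

Under these assumptions, one day of Facebook traffic corresponds to 500 million "books", and the estimated archive to 200 days of traffic.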
Numbers and Examples
Walmart records more than 1 million "transactions" per hour!
A Boeing 737 generates about 240 Terabytes of data during a flight across the United States.
What are Big Data? The three Vs:
• Volume
• Variety
• Velocity
Big Data as a term:
• Added to the Oxford English Dictionary in 2013
• Added to Merriam-Webster's Collegiate in 2014
• Gartner Glossary: "Big data is high-Volume, high-Velocity and/or high-Variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation".
• "Historically, most decisions (political, military, business, and personal) have been made by brains [that] have unpredictable logic and operate on subjective experiential evidence. "Big data" represents a cultural shift in which more and more decisions are made by algorithms with transparent logic, operating on documented immutable evidence. I think "big" refers more to the pervasive nature of this change than to any particular amount of data."
[datascience.berkeley.edu/what-is-big-data/]
Population vs Devices
The McKinsey Global Institute estimates that the volume of data produced grows by 40% per year, with a multiplicative factor of 44 over the period 2009-2020.
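The two figures quoted above (40% per year and a factor of 44) can be compared with a compound-growth calculation. The sketch below assumes 11 yearly steps between 2009 and 2020 and constant annual growth; both are simplifying assumptions, not statements from the McKinsey estimate.

```python
def growth_factor(annual_rate, years):
    """Total multiplicative factor after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


def implied_annual_rate(factor, years):
    """Constant annual growth rate that yields `factor` after `years` years."""
    return factor ** (1.0 / years) - 1.0


years = 2020 - 2009  # 11 yearly steps (assumption)

print("Factor from 40% per year:", round(growth_factor(0.40, years), 1))
print("Annual rate implied by x44:", round(implied_annual_rate(44, years), 3))
```

With these assumptions, 40% per year compounds to a factor of roughly 40, and a factor of 44 corresponds to about 41% per year, so the two numbers are consistent.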
How many "connected" devices does each of you own?
N.B. 6.58/0.08 ≈ 82, i.e. an increase of more than 8,000% (in 17 years!)
• Transport
• Power grids
• Financial markets
• Biological systems
Big data connected to one another → complex networks
E.g. nation A is linked to nation B if A buys a product from, or sells a product to, B.
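The linking rule above can be sketched as a small graph-building routine. The trade records below are invented for illustration, and the undirected adjacency-set representation is one possible choice among many.

```python
from collections import defaultdict

# Hypothetical trade records: (seller, buyer, product). Invented for illustration.
trades = [
    ("Italy", "Germany", "machinery"),
    ("Germany", "France", "cars"),
    ("Italy", "France", "wine"),
    ("Japan", "Germany", "electronics"),
]


def build_trade_network(records):
    """Undirected graph: two nations are linked if one sells a product to the other."""
    neighbours = defaultdict(set)
    for seller, buyer, _product in records:
        neighbours[seller].add(buyer)
        neighbours[buyer].add(seller)
    return dict(neighbours)


network = build_trade_network(trades)
for nation, partners in sorted(network.items()):
    print(nation, "->", sorted(partners))
```

A directed or weighted version (for example, weighting each link by trade volume) would follow the same pattern with a different container.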
What does it take to win an election?
LinkedIn message from data scientist Rayid Ghani:
"We are looking for expert data scientists who want to make a difference.
The Obama campaign will expand its "analytics" team to solve large-scale, high-impact data mining problems.
There are several job openings at all levels of experience.
We are looking for experts in statistics, machine learning, text analytics and predictive analytics to work on large volumes of data and help guide the election strategy."
Joe McGinniss, Come si vende un Presidente, Mondadori, Milano, 1970 (on the election of Nixon)
"A 61-million-person experiment in social influence and political mobilization", Nature, 2012
Is it possible to increase voter turnout through online social networks?
Can online social networks generate a "social contagion"?
If so, how much is this effect "worth"?
Can even a small effect be decisive in single-member constituencies? In the 2000 US election George Bush beat Al Gore in Florida by 537 votes (less than 0.01% of registered voters in Florida).
The experiment took place on 2 November 2010, the day of the US congressional (midterm) elections, on the Facebook users who used Facebook that day.
The experiment was conducted without the users' knowledge → In Italy it was conducted on 4 March 2018!
Facebook's "voter megaphone" experiment
Three disjoint groups:
• Control group (= 613,096)
• "Informed" group (= 611,044)
• Social message group (= 60,055,176)
The "social" data were cross-referenced with real data for a subsample of about 6 million voters.
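As a quick sanity check on the design, the three group sizes can be summed and expressed as shares of the whole sample. The sketch below uses only the numbers on the slide; the dictionary keys are illustrative labels.

```python
# Group sizes as reported on the slide.
groups = {
    "control": 613_096,
    "informed": 611_044,
    "social message": 60_055_176,
}

total = sum(groups.values())
shares = {name: size / total for name, size in groups.items()}

print("Total users:", total)
for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```

The total is 61,279,316 users, which is where the "61 million" in the study title comes from; almost all of them received the social message, while the two small groups serve as baselines.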
Moreover: not all friends are equal
Polarization, Partisanship and Junk News Consumption over Social Media in the US
Study carried out by:
University of Oxford, Oxford Internet Institute (oii.ox.ac.uk)
as part of the project:
COMPUTATIONAL PROPAGANDA
Full Illustration of US Audience Groups on Facebook
Full Illustration of US Audience Groups on Twitter
Analysis of about 22 million tweets collected in the period 1-11 November 2016 (the election was held on 8 November 2016) from "propaganda" sites
https://www.eticaeconomia.it/propaganda-e-manipolazione-nelle-elezioni-politiche-il-ruolo-dei-social-network-e-degli-algoritmi-basati-sulla-intelligenza-artificiale/
Study of neurodegenerative diseases with AI techniques
Mild Cognitive Impairment (MCI): a potential precursor to Alzheimer’s Disease
MCI is a condition in which an individual has mild but measurable changes in thinking abilities that are noticeable to the person affected and to family members and friends, but do not affect the individual’s ability to carry out everyday activities.
People with MCI, especially MCI involving memory problems, are more likely to develop Alzheimer’s or other dementias than people without MCI.
An average of 32 percent of individuals with MCI developed Alzheimer’s dementia in 5 years.
Identifying which individuals with MCI are more likely to develop Alzheimer’s or other dementias is a major goal of current research.
MCI can develop for reasons other than Alzheimer’s, and MCI does not always lead to dementia.
Hippocampal volumetry as a Biomarker for AD
Hippocampal size over time. Each thin line represents one of the 149 participants. Participants who developed AD are marked with red lines and the other participants are marked with green lines.
Automated volumetry measuring hippocampal size at age 69 years and subsequent rate of change predicts Alzheimer’s dementia development
Structural and Functional Networks
Bullmore and Sporns, Nature, 2009
A novel connectivity model
For each image a weighted graph was built upon a similarity measurement given by pairwise Pearson’s correlation among the nodes represented by the patches of each subject.
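The construction described above can be sketched as follows: each patch becomes a node, and each pair of nodes is linked with a weight equal to the Pearson correlation of the two patches. The toy patches are invented, and details such as patch size, thresholding or taking absolute values are not given on the slide, so none are assumed here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_graph(patches):
    """Weighted graph as {(i, j): r} for every pair of patches with i < j."""
    return {
        (i, j): pearson(patches[i], patches[j])
        for i, j in combinations(range(len(patches)), 2)
    }


# Toy "image": three patches, each flattened to a vector of intensities (invented).
patches = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with patch 0
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with patch 0
]
graph = correlation_graph(patches)
for edge, weight in graph.items():
    print(edge, round(weight, 3))
```

One such graph per subject is what a downstream classifier would then summarise through network features.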
A multilayer network G = (G1, G2, …, Gα, …, GM) is a set of M graphs Gα = (Nα, Eα), with α = 1, …, M, each one representing a layer. If the set of nodes Nα is the same in every layer, the network is by definition a multiplex.
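The definition translates directly into a small data structure: a list of layers, each holding a node set and an edge set, plus a check that all layers share the same nodes. This is a minimal sketch; the helper names `make_layer` and `is_multiplex` are hypothetical.

```python
def make_layer(nodes, edges):
    """One layer G_alpha = (N_alpha, E_alpha); edges are unordered node pairs."""
    return {"nodes": set(nodes), "edges": {frozenset(edge) for edge in edges}}


def is_multiplex(layers):
    """True when every layer is defined on the same node set."""
    return all(layer["nodes"] == layers[0]["nodes"] for layer in layers)


nodes = ["a", "b", "c"]
G1 = make_layer(nodes, [("a", "b")])
G2 = make_layer(nodes, [("b", "c"), ("a", "c")])
G3 = make_layer(["a", "b"], [("a", "b")])  # different node set

print(is_multiplex([G1, G2]))      # True: same nodes in both layers
print(is_multiplex([G1, G2, G3]))  # False: G3 lacks node "c"
```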
Independent test accuracy
Classification accuracy:
Control vs. Alzheimer: 0.86 ± 0.01
Control vs. MCI (converter): 0.84 ± 0.01
The proposed methodology is intrinsically data-driven and, as a consequence, it could suffer from typical over-training issues, which in turn could undermine the reliability of the findings. As a further assessment we performed a binary classification (NC-AD and NC-cMCI) on the independent test set.
• Italian Program for the Convergence objective regions (less developed regions in Southern Italy)
• Program: National Operational Program (PON) for research & development projects
• Goal: evaluate the impact of public funding at regional level
Total cost of the PON projects → 2,500 million euros
About 300 different R&D projects
769 distinct partners
• Available information: Calls and funding measures, projects, proponents and participants, funding, geographical information, etc.
• Data format: open data (XLS, XML, CSV)
• Source: http://www.dati.puglia.it, http://opencoesione.gov.it
The Italian Public Funding Program (2007-2013)
[Figure: two pie charts. (a) Categories: Smart Cities; Cultural Heritage & Activities; Transportation & Logistics; Environment; Energy; Nutrition; Healthcare; N.C. (b) Categories: Large Enterprise; non-Public Research Institute; N.C.; Small Enterprise; Public Research Institute; University; Micro Enterprise; Medium Enterprise.]
2007-2013 Italian Public Funding Program: from dataset to data models.
769 nodes → enterprises, universities, research institutions.
4868 links → participation in the same project.
Projects → 10104 entries with 52 attributes describing project information: program references, activities, textual description of project scope and objectives, details about partners and so on.
Locations → 11390 entries with 8 attributes describing the geographical location of project partners.
Budgets → 5670 entries with 13 attributes describing the amount and state of project funding.
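The step from dataset to network model can be sketched as a co-participation graph: partners are nodes, and two partners are linked when they take part in the same project. The records below are invented placeholders; in practice they would be read from the open data files, whose exact column names are not given here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (project, partner) records, invented for illustration.
participations = [
    ("P1", "University A"), ("P1", "Enterprise X"), ("P1", "Institute R"),
    ("P2", "University A"), ("P2", "Enterprise Y"),
    ("P3", "Enterprise X"), ("P3", "Institute R"),
]


def co_participation_network(records):
    """Return (nodes, links): a link joins two partners that share a project."""
    partners_by_project = defaultdict(set)
    for project, partner in records:
        partners_by_project[project].add(partner)

    nodes, links = set(), set()
    for partners in partners_by_project.values():
        nodes |= partners
        # Sorting makes each unordered pair appear once, so repeats collapse.
        for u, v in combinations(sorted(partners), 2):
            links.add((u, v))
    return nodes, links


nodes, links = co_participation_network(participations)
print(len(nodes), "nodes,", len(links), "links")
```

Storing links in a set means that partners who collaborate on several projects are still joined by a single link; a weighted variant would count shared projects instead.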
Result #1: community detection
We found 15 main communities →
• Community detection provides a deep understanding of how the fund allocation criteria are able to influence the economic development of a region;
• it discovers the existence of groups within a given network of relationships;
• highlighting such groups can be very important for the analysis of a productive system.
• The PON R&D network shows strongly heterogeneous communities, with hugely populated groups and very small ones.
• When communities grow in size, they tend to include important nodes. For example, the largest community includes the National Research Council (CNR, next slide).
The community structure of the (giant component of the) PON R&D network. 15 communities are highlighted, found with the Newman-Girvan algorithm.
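The slides name the Newman-Girvan algorithm but give no implementation details. As a hedged illustration of the idea behind community detection, the sketch below computes the Newman-Girvan modularity Q, a standard score of how well a partition separates a network into densely connected groups. It does not reproduce the authors' pipeline, and the toy graph is invented.

```python
def modularity(edges, communities):
    """Modularity Q = sum over communities of (L_c / m) - (d_c / 2m)^2,
    where L_c counts edges inside community c, d_c is the sum of its node
    degrees, and m is the total number of edges."""
    m = len(edges)
    community_of = {
        node: index for index, group in enumerate(communities) for node in group
    }
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in range(len(communities))
    )


# Toy graph: two triangles joined by a single bridge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = [{1, 2, 3}, {4, 5, 6}]
one_group = [{1, 2, 3, 4, 5, 6}]

print(round(modularity(edges, two_groups), 3))
print(round(modularity(edges, one_group), 3))
```

Splitting the toy graph into its two triangles scores higher than keeping it as one block, which is the kind of signal a community detection procedure looks for.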
Result #2: it is a network with hubs
Scale-free network →
• Inhomogeneous degree distribution, with some nodes having many more connections than the average (hubs)
• Resistance to "random failures": the removal of a random node would not systematically affect the main hubs
• Policy makers are interested in generating a solid network of relationships between productive actors in the territory
Strong indication that the network of funded projects gravitates around large poles involving research centers
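The notion of a hub can be made concrete by comparing each node's degree with the mean degree. In the sketch below, the threshold of twice the mean is an arbitrary assumption chosen for illustration, and the toy network is invented.

```python
from collections import Counter


def degrees(edges):
    """Number of links attached to each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def hubs(edges, factor=2.0):
    """Nodes whose degree exceeds `factor` times the mean degree.
    The factor is an illustrative assumption, not a standard definition."""
    deg = degrees(edges)
    mean_degree = sum(deg.values()) / len(deg)
    return {node for node, k in deg.items() if k > factor * mean_degree}


# Toy network: one central node "H" linked to eight small partners.
edges = [("H", f"S{i}") for i in range(8)] + [("S0", "S1")]

deg = degrees(edges)
print("Degree distribution:", sorted(Counter(deg.values()).items()))
print("Hubs:", hubs(edges))
```

Removing a randomly chosen node from such a network most likely hits one of the many low-degree nodes, which is why this structure tolerates random failures.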
Result #3: who are the hubs?
Centrality of nodes → identifies the most important nodes within a network
• Dominant role of public research
• Universities and research centers play the role of the "glue", i.e. they are responsible for the connectedness of the network
• Ex-post indicator
The fifteen largest values of each vertex centrality for the (giant component of the) PON R&D network. The highest positions are occupied by public research institutions.
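The slide does not say which centrality measures were used. As one possible example, the sketch below ranks nodes by closeness centrality, computed with a breadth-first search on a connected toy graph; the graph and the node names are invented.

```python
from collections import defaultdict, deque


def adjacency(edges):
    """Undirected adjacency sets built from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def closeness_centrality(adj, source):
    """(n - 1) / sum of shortest-path lengths from source; assumes a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return (len(adj) - 1) / sum(dist.values())


def top_k(adj, k):
    """The k nodes with the largest closeness, ties broken by name."""
    scores = {node: closeness_centrality(adj, node) for node in adj}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


edges = [("Univ", "Inst"), ("Univ", "E1"), ("Univ", "E2"), ("Inst", "E3")]
adj = adjacency(edges)
ranking = top_k(adj, 2)
print(ranking)
```

Other measures (degree, betweenness, eigenvector centrality) would be ranked in the same way, each with its own scoring function.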
Result #4: the network is anti-assortative
• Low tendency to form "groups of interest" or "lobbies" among important actors.
• Hubs are strongly connected to smaller and less connected enterprises/institutions.
• It is an interesting result, since most social networks show assortative behavior.
• Anti-assortative networks are more sensitive to the removal of high-degree nodes, which is an indication for the policymaker of the importance that public research has in the productive system.
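Assortativity is commonly quantified as the Pearson correlation between the degrees found at the two ends of each link: negative values mean that hubs attach preferentially to low-degree nodes. The sketch below is a minimal implementation of that coefficient on invented toy graphs; it assumes the graph is not regular (otherwise the variance is zero).

```python
from collections import Counter


def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of every edge.
    Each undirected edge is counted in both directions."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1

    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean and variance by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var


star = [("hub", f"leaf{i}") for i in range(5)]
path = [("a", "b"), ("b", "c"), ("c", "d")]

print(degree_assortativity(star))
print(degree_assortativity(path))
```

A star, where one hub connects only to leaves, is the extreme anti-assortative case with a coefficient of -1.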
[Figure legend: Public Research Institute; Large Enterprise; Small-Medium Enterprise]
Company information systems already produce a lot of data.
Do business intelligence systems make the best use of the data?
What added value can sophisticated data management and analysis systems provide?
"Between the strong and the weak, it is freedom that oppresses and the law that sets free"
J.H. Lacordaire