Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari...

29
• Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti Andrea Pompei Fabio

Transcript of Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari...

Page 1: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Peer-to-Peer Systems

Content-Based Routing of Path

Queries in Peer-to-Peer Systems

Georgia Koloniari and Evaggelia Pitoura

Ingargiola Salvatore

Montauti Andrea

Pompei Fabio

Page 2: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Sommario

• Introduzione

• Sistemi Peer-to-Peer

• Filtri di Bloom

• Organizzazione reti P2P: Content-Based

• Risultati Sperimentali

• Conclusioni

Page 3: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Introduzione

“Migliorare le performance di information retreaving in sistemi

P2P”Ipotesi di lavoro:

• Contenuto nodi: documenti XML schema-less sintetizzati mediante filtri di Bloom multilivello

• Organizzazione rete: P2P Content-Based

• Tipologia query: path queries (XPath-like)

Page 4: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Sistemi P2P: Stato dell’Arte

• Sistema Peer to Peer

ibrido:

• impiega un server centrale per ottenere

meta-informazioni sull'identità dei Peer che

posseggono le informazioni richieste.

Napster e OpenNap• Sistema Peer to Peer

puro: • I Peer sono connessi direttamente tra di loro.

• Non è necessario un server principale.

• Query routing: flooding, indici decentralizzati.

Gnutella e Freenet

Page 5: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Filtro di Bloom

Un Bloom Filter è un vettore di bit definito da

due parametri :

1.Array di M bit

2.K funzioni di Hash indipendenti con CDom

in [1,M]

I bit del filtro sono usati per codificare in

modo efficiente un insieme di N oggetti.

Per Costruire un Bloom filter si scorrono uno

dopo l’altro gli oggetti dell’insieme.

Applicando K funzioni hash ad ogni oggetto si

ottengono K valori hash, ogni valore

rappresenta la posizione di un bit che deve

essere settato ad uno.

N.B. : più di una chiave può settare lo stesso

bit.

11

1

1

M=1

0 bi

ts

h1(a) = 4

h2(a) = 2

h3(a) = 5

h4(a) = 8

K = 4 funzioni di Hash

Page 6: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Filtro di Bloom: esempio

Bloom Filter

canegatto

Empty filter

pane

vino

pasta

QUERY FALSE POSITIVE

NO MATCH

Page 7: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Filtri di Bloom multilivello

Filtri di Bloom semplici inadeguati per sintetizzare la struttura multilivello dei documenti XML.

Soluzioni proposte:• Breadth Bloom Filters (BBF)

• Depth Bloom Filters (DBF)

Page 8: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Breath Bloom Filter

sport

moto auto

f3000 f1 0 100 0 0 1 1BBF3 1 1

0 001 0 1 0 1BBF2 1 0

XML<sport> <moto> </moto> <auto> <f3000></f3000> <f1></f1> </auto></sport>

1 101 0 0 0 0BBF1 0 0

BBF0 1 101 0 1 1 11 1

Page 9: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• BBF filter-match operationsport

moto auto

f3000 f1

Root query:

/sport/auto/f1

0 100 0 0 1 1BBF3 1 1

0 001 0 1 0 1BBF2 1 0

1 101 0 0 0 0BBF1 0 0

BBF0 1 101 0 1 1 11 1

1 101 0 0 0 0sport 0 0

0 001 0 1 0 1auto 0 0

sport U auto U f1 1 101 0 1 0 11 1

0 000 0 0 0 1f1 1 1

Page 10: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Depth Bloom Filter

sport

moto auto

f3000 f1

XML<sport> <moto> </moto> <auto> <f3000></f3000> <f1></f1> </auto></sport>

1 100 1 1 0 1DBF2 0 0

0 101 0 0 1 1DBF1 01

DBF0 1 101 0 1 1 11 1

Page 11: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• DBF filter-match operationsport

moto auto

f3000 f1

1 100 1 1 0 1DBF2 0 0

0 101 0 0 1 1DBF1 1 0

DBF0 1 101 0 1 1 11 1

Root query:

/sport / auto / f1

sport U auto U f1 1 101 0 1 0 11 0

sport/ auto U auto/ f1 0 101 0 0 1 10 0

sport/ auto/ f1 1 100 0 1 0 00 0

Page 12: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Risultati Sperimentali

BBF ≈ DBF con filtri maggiori di

78.000 bit

Migliori performance di BBF con elevato numero

di elementi

Page 13: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Struttura gerarchica reti P2P

n5

Local filter

1 1 11000 1 0 1

n2

Local filter

0 1 11010 0 0 1

Merged filter

1 1 11010 1 0 1

Local filter n4

1 0 00010 0 0 1

Local filter n5

1 1 11000 1 0 1

Local filter n6

1 0 11000 1 0 1

n1

Local filter

1 1 00011 0 0 1

Merged filter0 1 01010 1 0 1

Merged filter

1 1 11010 1 0 1Merged filter N61 1 11010 1 0 1

Merged filter N71 1 00010 0 0 0

Merged filter N61 1 11010 1 0 1Merged filter N7

1 1 00010 0 0 0Merged filter Nn1 0 01000 1 0 0

Merged filter Nn1 0 01000 1 0 0

Page 14: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based

Struttura gerarchica organizzata non più sulla prossimità dei nodi bensì sulla similarità dei loro contenuti

Operazioni analizzate:

• Join

• Query routing

• Update: CountSum, BitSum

Page 15: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

Merged filter1 0 00000 1 0 0

Filter nX1 0 01000 1 0 0

• Content-Based: Join

nX

Filter nX1 0 01000 1 0 0

Funzione Hash

XML<foto> <calendario> <modelle> </modelle> </calendario></foto>

Merged filter1 1 01010 1 1 1

Merged filter0 1 01010 1 0 1

Filter nX1 0 01000 1 0 0

Filter nX1 0 01000 1 0 0

Filter nX1 0 01000 1 0 0

n1

Threshold = 6

Similarity = m – d(Nx, Ny)

N1 N6 N7 N2

Nx 9 6 5 7

Page 16: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based: Query routingPath query//calendario/modelle

Local filter

1 1 00001 1 0 0

Query filter1 0 01010 1 0 1

==

Query filter1 0 01010 1 0 1

A

Page 17: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based: Query routingPath query//calendario/modelle

Query filter1 0 01010 1 0 1

Query filter1 0 01010 1 0 1

Local filter N7

1 0 00011 1 0 0

Merged filter N7

1 1 00011 1 0 0

Merged filter Nnr

Merged filter N1r

1 1 01011 1 0 1

Page 18: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based: Query routingPath query//calendario/modelle

Query filter1 0 01010 1 0 1

Query filter1 0 01010 1 0 1

Local filter N1

1 0 00001 1 0 0

Merged filter N1

1 1 01011 1 0 1

Page 19: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based: Query routing

Query filter1 0 01010 1 0 1

Query filter1 0 01010 1 0 1

Query filter1 0 01010 1 0 1

Local filter N3

1 1 01010 1 0 1

==

XML<foto> <calendario> <modelle> </modelle> </calendario></foto>

Path query//calendario/modelle

Page 20: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Content-Based: Update

CountSum

BitSum

1 0 1 1

1 0 1 2

1 0 1 1

2 0 1 3

0 0 1 0

0 0 2 0

1 0 1 1

3 0 2 5

1 0 1 1

1 0 1 2

1 0 1 1

2 0 1 3

0 0 1 0

0 0 2 0

1 0 1 1

2 0 2 2

Page 21: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

CountSum

BitSum

1 0 1 1

1 0 1 2

1 0 1 1

1 0 1 2

1 0 1 1

3 0 3 5

1 0 1 1

1 0 1 2

1 0 1 1

1 0 1 1

1 0 1 1

2 0 2 2

tempo0

1

0

1

2

Page 22: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

CountSum

BitSum

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

3 0 3 4

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

2 0 2 2

tempo

1 0 1 1

1 0 1 2

1 0 1 1

3 0 3 5

1 0 1 1

1 0 1 2

1 0 1 1

1 0 1 2

0

1

0

1

2

Page 23: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

CountSum

BitSum

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 1

3 0 3 3

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 1

2 0 2 1

tempo

1 0 1 1

3 0 3 4

1 0 1 1

1 0 1 1

1 0 1 1

2 0 2 2

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

0

1

0

1

2

Page 24: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

CountSum

BitSum

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

3 0 3 4

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

2 0 2 2

tempo

1 0 1 1

3 0 3 3

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 1

2 0 2 1

1 0 1 0

1 0 1 0

1 0 1 0

1 0 1 0

0

1

0

1

2

Page 25: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

CountSum

BitSum

1 0 1 1

1 0 1 2

1 0 1 1

1 0 1 2

1 0 1 1

3 0 3 5

1 0 1 1

1 0 1 2

1 0 1 1

1 0 1 1

1 0 1 1

2 0 2 2

tempo

Con il metodo Bitsum meno

scambi di messaggi

1 0 1 1

1 0 1 1

1 0 1 1

3 0 3 4

1 0 1 1

1 0 1 1

1 0 1 1

1 0 1 1

0

1

0

1

2

Page 26: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Risultati Sperimentali

In sistemi content-based

convergenza più veloce

Con filtri multilivello minor numero di

salti per completare la query

Page 27: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Conclusioni

Dal processo dicotomico dell’analisi sperimentale si evince che il miglioramento delle performance nell’information retreaving può essere ottenuto mediante:1. Organizzazione gerarchica Content-Based

2. Filtri di Bloom multilivello BBF

3. BitSum come implementazione dei contatori

Page 28: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Bibliografia

1. G. Koloniari, E. Pitoura. (2004) Content-Based Routing of Path Queries in Peer-to-Peer Systems. In Proceedings of the 9° Intenational Conference on Extending Database Technology, pages 29-47.

2. G. Koloniari, E. Pitoura. Filters for XML-based Service Discovery in Pervasive Computing. Computer Science Dept, University of Ioannina, Greece.

3. X. Gong, W. Qian, Y. Yan, A. Zhou. Bloom Filter-based XML Packets Filtering for Millions of Path Queries. Department on Computer Science and Enngineering Fundan University, Shanghai, China.

4. N. Gioia. (2004) Un sistema Peer-to-Peer per l’interrogazione distribuita di dati XML. Tesi di Laurea, Università degli studi di Pisa.

Page 29: Peer-to-Peer Systems Content-Based Routing of Path Queries in Peer-to-Peer Systems Georgia Koloniari and Evaggelia Pitoura Ingargiola Salvatore Montauti.

• Peer-to-Peer Systems

Ingargiola Salvatore

Montauti Andrea

Pompei Fabio

Content-Based Routing of Path

Queries in Peer-to-Peer Systems

Georgia Koloniari and Evaggelia Pitoura