Università degli Studi di Pisa Dipartimento di Informatica

Service and Resource Discovery Supports over P2P Overlays
Emanuele Carlini, Massimo Coppola, Patrizio Dazzi, Domenico Laforenza, Laura Ricci
International Conference on Ultra Modern Telecommunications, ICUMT, Saint Petersburg, October 12-14th, 2009
Università degli Studi di Pisa, Dipartimento di Informatica


Page 1

Service and Resource Discovery Supports over P2P Overlays

Emanuele Carlini, Massimo Coppola, Patrizio Dazzi, Domenico Laforenza, Laura Ricci

SERVICE AND RESOURCE DISCOVERY SUPPORTS OVER P2P OVERLAYS

EMANUELE CARLINI, MASSIMO COPPOLA, DOMENICO LAFORENZA, PATRIZIO DAZZI, LAURA RICCI

International Conference on Ultra Modern Telecommunications, ICUMT

Saint Petersburg, October 12-14th, 2009

Università degli Studi di Pisa Dipartimento di Informatica

Page 2

INTRODUCTION

• Grid environments exploit a huge amount of geographically scattered computing resources

• Main features of large computational grids
– Dynamic environment
– Huge amount of heterogeneous resources
– Complex middleware for accessing the resources

• XtreemOS: a research project funded by the European Commission
– main goal: definition of an Open Source, Grid-enabled Operating System
– scalable and transparent management of large computational platforms
– federation of several virtual organizations
– users exploit the distributed system through a standard operating system interface

Page 3

SRDS: SERVICE AND RESOURCE DISCOVERY

• SRDS: a basic service of XtreemOS providing a highly distributed directory service

• SRDS main features
– enables resource look-up and exploitation in a multi-VO environment
– hides the effect of scale when exploiting individual systems
– may be exploited by different clients
  • other modules of XtreemOS
  • applications
– supports different kinds of queries
  • key-based
  • multi-attribute
  • range queries over dynamic attributes

Page 4

SRDS ARCHITECTURE

• SRDS exploits a set of P2P overlays where each overlay includes nodes from different virtual organizations

• The choice of the P2P model enables
– scalability
– low overhead
– fault tolerance
– management of information in a dynamic environment

• SRDS services are exploited by different clients, each one with different requirements.

– to cope with the diversity of these requirements, several P2P overlays characterized by different features have been defined (Distributed Hash Tables, structured overlays,...)

Page 5

SRDS: THE ARCHITECTURE

Facade: easy-to-extend support for multiple interface protocols

Query Provider (QP): set of modules for client query translation

Information Management Layer (IML): common interface to DHT-like overlays

ADS (Application Directory Service) = Facade + QP + IML

RSS (Resource Selection Service): a P2P overlay allowing scalable resource location in large overlays

Scalaris, Overlay Weaver: DHTs with different characteristics

Page 6

SRDS MAIN MODULES: ADS AND RSS

RSS (Resource Selection Service) supports resource discovery through queries on constant-valued attributes, for example:

CPU = IA32, MEM ∈ [4GB, ∞), BANDWIDTH ∈ [512Kb/s, ∞), DISK ∈ [128GB, ∞), OS ∈ {Linux 2.6.19-1.2895, ..., Linux 2.6.20-1.2944}

ADS (Application Directory Service) supports complex queries over dynamic attributes

Example:
• the RSS selects a set of resources whose static attributes match the query constraints
• the descriptors of these resources are stored in the ADS
• the dynamic state of the resources (for instance, current free memory) is monitored through the ADS

RSS acts like a machete, while ADS acts like a scalpel
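The two-step selection described on this slide can be sketched roughly as follows. This is only an illustration: the descriptor layout and the function names `rss_select` and `ads_refine` are assumptions made for the example, not part of SRDS.

```python
# Hypothetical resource descriptors: static attributes never change,
# dynamic ones (e.g. free memory) are refreshed by monitoring.
resources = [
    {"id": "n1", "static": {"cpu": "IA32", "mem_gb": 8}, "dynamic": {"free_mem_gb": 1.5}},
    {"id": "n2", "static": {"cpu": "IA32", "mem_gb": 4}, "dynamic": {"free_mem_gb": 3.0}},
    {"id": "n3", "static": {"cpu": "IA64", "mem_gb": 16}, "dynamic": {"free_mem_gb": 9.0}},
]


def rss_select(pool, cpu, min_mem_gb):
    """Coarse step: keep resources whose static attributes match."""
    return [r for r in pool
            if r["static"]["cpu"] == cpu and r["static"]["mem_gb"] >= min_mem_gb]


def ads_refine(candidates, min_free_mem_gb):
    """Fine step: filter the candidates on their current dynamic state."""
    return [r["id"] for r in candidates
            if r["dynamic"]["free_mem_gb"] >= min_free_mem_gb]


candidates = rss_select(resources, cpu="IA32", min_mem_gb=4)
selected = ads_refine(candidates, min_free_mem_gb=2.0)
print(selected)
```

The coarse filter runs once on attributes that never change, so only the smaller candidate set needs to be tracked for dynamic state.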

Page 7

RSS: RESOURCE SELECTION SERVICE

• Supports resource discovery through multi-attribute range queries over a set of static attributes, i.e. constant-valued attributes known at initialization time.

• RSS main features
– each node represents its own attributes in the overlay
– no delegation of the resource information to other nodes, unlike DHT-based approaches
– speeds up resource location

Page 8

ADS: THE QUERY PROVIDER (QP)

• Query Provider Layer: provides a set of modules devoted to query translation

• Implements a set of algorithms for the interpretation of the queries of different SRDS clients

• For instance, a job directory service is required to monitor the state of the jobs of an application/VO

– when a new job is created, the client submits an AddJob to the SRDS
– the AddJob operation is interpreted by a QP module, which translates it into a sequence of operations on the underlying DHT
  • check the existence of a proper job directory service; if it does not exist, request its creation
  • insert the job ID into the DHT
  • insert further information about the job under proper keys to support inverse queries
– the QP makes all these steps transparent to the user
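A minimal sketch of how an AddJob request might expand into DHT operations, assuming an in-memory stand-in for the DHT. The class `ToyDHT`, the key layout (`jobdirs`, `jobdir:<app>`, `owner:<user>`) and the function names are hypothetical and chosen only for illustration.

```python
class ToyDHT:
    """In-memory stand-in for a DHT exposing only put/get."""

    def __init__(self):
        self.store = {}

    def put(self, key, value):
        self.store.setdefault(key, set()).add(value)

    def get(self, key):
        return self.store.get(key, set())


def add_job(dht, app_id, job_id, owner):
    """Translate one AddJob request into a sequence of DHT operations."""
    directory_key = f"jobdir:{app_id}"
    # 1. Check that the job directory exists; create it if missing.
    if directory_key not in dht.get("jobdirs"):
        dht.put("jobdirs", directory_key)
    # 2. Insert the job ID into the directory.
    dht.put(directory_key, job_id)
    # 3. Insert an extra key so that inverse queries (owner -> jobs) work.
    dht.put(f"owner:{owner}", job_id)


def request_job(dht, app_id):
    """A lookup maps to a single DHT get."""
    return dht.get(f"jobdir:{app_id}")
```

The client only calls `add_job`; the individual put/get steps stay hidden behind it, which is the transparency the slide attributes to the QP.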

Page 9

ADS: THE INFORMATION MANAGEMENT LAYER (IML)

A namespace defines the context in which a key is used; for instance, different namespaces for different job directories

ADS (Application Directory Service)
– provides an implementation of namespaces over the DHT
– receives from a QP module an abstract operation:
OP_QP = { op, key_M, value_M, NSpace, ClientType, ClientID }
– generates an operation for the underlying DHT in the proper namespace:
OP_DHT = { op, key_D, value_D, auxinfo }

where
value_D: generally equals value_M
key_D: may differ from key_M because of the namespace implementation
auxinfo: data expiration timeouts, user-defined secrets, ...
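The translation from the abstract QP operation to the DHT operation can be sketched as below. The field names mirror the two tuples on this slide; the key-prefixing scheme and the `ttl` entry in `auxinfo` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QPOperation:
    op: str
    key: str
    value: str
    namespace: str
    client_type: str
    client_id: str


@dataclass
class DHTOperation:
    op: str
    key: str
    value: str
    auxinfo: dict = field(default_factory=dict)


def to_dht_operation(qp_op, ttl_seconds=60):
    """Map an abstract QP operation onto a DHT operation."""
    # Assumption: the namespace is implemented by prefixing the key.
    dht_key = f"{qp_op.namespace}/{qp_op.key}"
    # The value passes through unchanged; auxinfo carries DHT-level options.
    return DHTOperation(qp_op.op, dht_key, qp_op.value, {"ttl": ttl_seconds})
```

With a different namespace implementation (for example one ring per namespace) only `to_dht_operation` would change, while QP modules keep producing the same abstract operations.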

Page 10

EXPLOITING NAMESPACES: AN EXAMPLE

• A network coordinates (NC) embedding system embeds latencies, such as round-trip times among nodes, into a geometric space

• Each node is assigned network coordinates in the geometric space
• Unmeasured round-trip times are estimated by computing the distance between two nodes in the geometric space

To support
• direct queries, i.e. given the IP of a node, return its network coordinates
• inverse queries, i.e. given the X/Y coordinates of a node, find the IPs of its 'nearest' neighbours

the ADS
• exploits three different namespaces: IP, X, Y
• each namespace may be mapped onto a different DHT or onto the same DHT, and may have different characteristics
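A toy version of the three-namespace layout, assuming two-dimensional coordinates and plain dictionaries in place of DHTs. The function names and the radius-based form of the inverse query are illustrative assumptions; the slides do not say how the neighbour search is actually carried out.

```python
import math

# One dictionary per namespace stands in for one DHT (or one key prefix).
namespaces = {"IP": {}, "X": {}, "Y": {}}


def publish(ip, x, y):
    """Store a node under all three namespaces."""
    namespaces["IP"][ip] = (x, y)
    namespaces["X"].setdefault(x, set()).add(ip)
    namespaces["Y"].setdefault(y, set()).add(ip)


def direct_query(ip):
    """Given an IP, return its network coordinates."""
    return namespaces["IP"][ip]


def inverse_query(x, y, radius):
    """Given coordinates, return the IPs within `radius`, nearest first."""
    # Candidates: nodes whose X and Y keys both fall within the radius.
    near_x = {ip for k, ips in namespaces["X"].items() if abs(k - x) <= radius for ip in ips}
    near_y = {ip for k, ips in namespaces["Y"].items() if abs(k - y) <= radius for ip in ips}
    hits = [ip for ip in near_x & near_y
            if math.dist(direct_query(ip), (x, y)) <= radius]
    return sorted(hits, key=lambda ip: math.dist(direct_query(ip), (x, y)))


def estimated_rtt(ip_a, ip_b):
    """Estimate an unmeasured RTT as the distance between coordinates."""
    return math.dist(direct_query(ip_a), direct_query(ip_b))
```

The IP namespace answers direct queries with a single lookup, while the X and Y namespaces are what make the inverse direction possible.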

Page 11

NAMESPACE IMPLEMENTATION

Different choices for the implementation of the namespaces:
• a different DHT for each namespace
• a set of namespaces on the same DHT

Page 12

NAMESPACE IMPLEMENTATION

Single Ring Approach
• the DHT key is prefixed by an identifier of the namespace
• main drawback: DHT features, like replication strategy, fault repair strategy, ..., cannot be tuned according to the namespace

Multiple Ring Approach
• on-demand ring creation
• parameters and policies of the DHT ring are customized at ring set-up time
• some rings may always remain active
– include essential key spaces, for instance resource directories
• smaller rings may have a shorter lifespan
– application rings, for instance the job directory for a given application, ...
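The two approaches can be contrasted with a short sketch. Everything here is hypothetical: `single_ring_key`, `RingManager` and the `replicas` / `permanent` parameters only illustrate per-ring policies fixed at set-up time.

```python
def single_ring_key(namespace, key):
    """Single ring: every namespace shares one DHT, keys are prefixed."""
    return f"{namespace}:{key}"


class RingManager:
    """Multiple rings: one DHT per namespace, created on demand."""

    def __init__(self):
        self.rings = {}

    def get_ring(self, namespace, replicas=1, permanent=False):
        # Policies are fixed when the ring is first set up.
        if namespace not in self.rings:
            self.rings[namespace] = {
                "replicas": replicas, "permanent": permanent, "data": {},
            }
        return self.rings[namespace]

    def drop_idle(self):
        """Tear down short-lived rings; permanent ones stay active."""
        self.rings = {n: r for n, r in self.rings.items() if r["permanent"]}
```

In the single-ring case there is nowhere to attach a per-namespace replication factor, which is the drawback noted above; in the multiple-ring case each entry of `rings` carries its own settings.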

Page 13

NAMESPACE IMPLEMENTATION

• The current version of the ADS exploits two different rings, based on two different DHTs: Scalaris and Overlay Weaver

• Scalaris
– a transaction-based DHT
– provides consistent replication of data

• Overlay Weaver
– implements different DHTs: Chord, Pastry, CAN, ...
– defines a routing layer common to all the DHTs

The Overlay Weaver Architecture

Page 14

COMPLEX QUERIES ON DHT

• A DHT supports only basic key-value queries
• More complex queries may be submitted by the SRDS clients
• Multidimensional range queries on dynamic attributes

Examples
• exact match query: Arch.='x86' and CPU-Speed='3 GHz' and RAM='256MB'
• partial match query: CPU-Speed='3 GHz' and RAM='256MB' (and Arch.=*)
• range query: 1 GHz < CPU-Speed < 3 GHz and 512MB < RAM < 1GB
• similarity queries (or nearest neighbour queries)
– require the definition of a metric in the attribute space
– the user submits an exact match query, which defines a point P in the attribute space; P may not correspond to any resource
– output: the k resources nearest to P, according to the defined metric
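The four query types can be illustrated on a small in-memory set of descriptors. The attribute names, the sample data and the distance metric are assumptions made for this sketch; a real deployment would evaluate the queries over the DHT rather than over a local dictionary.

```python
resources = {
    "r1": {"arch": "x86", "cpu_ghz": 3.0, "ram_mb": 256},
    "r2": {"arch": "x86", "cpu_ghz": 2.0, "ram_mb": 768},
    "r3": {"arch": "arm", "cpu_ghz": 3.0, "ram_mb": 256},
}


def match(constraints):
    """Exact or partial match: omitted attributes act as wildcards."""
    return sorted(rid for rid, d in resources.items()
                  if all(d[a] == v for a, v in constraints.items()))


def range_query(ranges):
    """Range query with exclusive bounds: low < value < high."""
    return sorted(rid for rid, d in resources.items()
                  if all(lo < d[a] < hi for a, (lo, hi) in ranges.items()))


def nearest(point, k):
    """Similarity query: the k resources closest to `point`."""
    # Assumed metric: Manhattan distance on CPU speed (GHz) and RAM (GB).
    def distance(d):
        return (abs(d["cpu_ghz"] - point["cpu_ghz"])
                + abs(d["ram_mb"] - point["ram_mb"]) / 1024)
    return sorted(resources, key=lambda rid: (distance(resources[rid]), rid))[:k]
```

Note that `nearest` accepts a point that matches no stored resource, which is the case the slide singles out for similarity queries.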

Page 15

RANGE QUERY SUPPORT

SRDS supports multi-attribute range queries
– an approach based on the MAAN proposal
– exploits the Chord DHT

Resource publication
– each resource is described by k pairs (a_i, v_i)
– a locality-preserving hashing function maps the value of each attribute onto the DHT:
H(v_i) = (v_i - v_i^min) × (2^m - 1) / (v_i^max - v_i^min)
where 2^m is the dimension of the key space
– the descriptor of each resource is published onto k DHT nodes
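The locality-preserving hash on this slide is a linear rescaling of the attribute domain onto the key space, so larger values never map to smaller keys. A minimal sketch, assuming integer keys (the result is truncated) and hypothetical attribute domains:

```python
def locality_hash(value, v_min, v_max, m):
    """Map a value in [v_min, v_max] onto the key space [0, 2**m - 1]."""
    if not v_min <= value <= v_max:
        raise ValueError("value outside the attribute domain")
    # Integer keys are assumed, so the result is truncated.
    return int((value - v_min) * (2**m - 1) / (v_max - v_min))


def publication_keys(descriptor, domains, m):
    """One DHT key per (attribute, value) pair: k pairs give k keys."""
    return {attr: locality_hash(value, *domains[attr], m)
            for attr, value in descriptor.items()}
```

For example, with m = 8 and a domain of [0, 100], the values 0, 50 and 100 map to keys 0, 127 and 255, preserving their order.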

Page 16

RANGE QUERY SUPPORT

• Consider a multi-attribute range query a_1 ∈ [v_1^l, v_1^u], ..., a_k ∈ [v_k^l, v_k^u]
• The hashing function maps the range of each attribute onto a DHT range
• Selectivity of an attribute:
S_i = 2^m / (H(v_i^u) - H(v_i^l))
• The dominant attribute a_i, with range [v_i^l, v_i^u] and the highest selectivity, is chosen

• The query is sent to H(v_i^l) and is propagated along a DHT arc A until it reaches H(v_i^u)

• Each node on A checks whether the resources it stores satisfy all the query constraints
• The results are collected along A and sent by H(v_i^u) to the querying node
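The range query procedure can be sketched as follows. This is a simplified model: the ring is replaced by a dictionary `nodes` keyed by (attribute, key), the walk over integer keys stands in for successor hops on the arc, and all names are hypothetical.

```python
def locality_hash(value, v_min, v_max, m):
    """Linear locality-preserving hash onto [0, 2**m - 1]."""
    return int((value - v_min) * (2**m - 1) / (v_max - v_min))


def dominant_attribute(query, domains, m):
    """Return (attribute, low key, high key) with the highest selectivity."""
    best = None
    for attr, (low, high) in query.items():
        k_low = locality_hash(low, *domains[attr], m)
        k_high = locality_hash(high, *domains[attr], m)
        # max(..., 1) guards against a zero-width arc.
        selectivity = 2**m / max(k_high - k_low, 1)
        if best is None or selectivity > best[0]:
            best = (selectivity, attr, k_low, k_high)
    return best[1:]


def range_query(query, domains, m, nodes):
    """Walk the arc of the dominant attribute and collect matches."""
    attr, k_low, k_high = dominant_attribute(query, domains, m)
    results = []
    for key in range(k_low, k_high + 1):  # stand-in for successor hops
        for descriptor in nodes.get((attr, key), []):
            # Every constraint is checked, not only the dominant one.
            if all(lo <= descriptor[a] <= hi for a, (lo, hi) in query.items()):
                results.append(descriptor["id"])
    return results
```

Choosing the attribute with the highest selectivity means walking the shortest arc, so fewer nodes are visited per query.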

Page 17

PUBLICATION OPTIMIZATION

• SRDS optimizes the publication process of the resources defined by MAAN

• Publication optimization: exploits a soft-state cache to store the routing results obtained during the publication process

• Routing on the DHT is avoided if the routing path to a node is stored in the cache
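One possible shape for such a soft-state cache is sketched below. The class `RoutingCache`, the time-to-live policy and the `route` / `send` callbacks are assumptions for illustration; the slides do not describe how the SRDS cache is structured or when entries expire.

```python
import time


class RoutingCache:
    """Soft-state cache: entries expire unless refreshed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (node address, expiry time)

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry and entry[1] > self.clock():
            return entry[0]
        self.entries.pop(key, None)  # drop stale state
        return None

    def remember(self, key, node):
        self.entries[key] = (node, self.clock() + self.ttl)


def publish(key, value, cache, route, send):
    """Publish `value`, routing on the DHT only on a cache miss."""
    node = cache.lookup(key)
    if node is None:
        node = route(key)       # multi-hop DHT routing
        cache.remember(key, node)
    send(node, key, value)      # direct delivery to the known node
    return node
```

Because entries simply time out, a node that leaves the ring is forgotten without any explicit invalidation message.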

Page 18

PUBLICATION OPTIMIZATION

• A second optimization is defined to avoid the publication of 'unpopular' attributes

• Popularity of an attribute A = number of times A is chosen as dominant in a query

– depends on the query distribution

• Descriptors associated with low-popularity attributes are updated with lower frequency

• Popularity is
– dynamically refined in a distributed fashion by the nodes receiving the queries
– estimated at target nodes receiving the query and sent back to publishing nodes by put-reply messages
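A rough sketch of how popularity could drive the update frequency. The normalisation of the counts into a fraction and the `update_period` policy with its `max_slowdown` cap are assumptions made for this example; the slides only state that less popular attributes are updated less often.

```python
from collections import Counter


class PopularityTracker:
    """Counts how often each attribute is chosen as dominant."""

    def __init__(self):
        self.dominant_counts = Counter()

    def record_query(self, dominant_attr):
        self.dominant_counts[dominant_attr] += 1

    def popularity(self, attr):
        """Fraction of observed queries in which `attr` was dominant."""
        total = sum(self.dominant_counts.values())
        return self.dominant_counts[attr] / total if total else 0.0


def update_period(base_period_s, popularity, max_slowdown=8):
    """Less popular attributes are republished less often."""
    # Assumed policy: the period grows as popularity falls, up to a cap.
    if popularity <= 0:
        return base_period_s * max_slowdown
    return base_period_s * min(max_slowdown, 1 / popularity)
```

In a distributed setting the tracker would live on the nodes receiving queries, with the estimate carried back to publishers in the put-reply messages mentioned above.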

Page 19

SRDS EVALUATION

• testing environment: Grid 5000 platform, nodes belong to different Grid 5000 clusters
• all nodes publish information every 30 s
• a large fraction of nodes run queries every 100 ms

Page 20

JOB DIRECTORY SERVICE EVALUATION

• 20-120 nodes belonging to two clusters of the Grid 5K platform
• each node performs publications over the DHT at a fixed rate of one every 30 seconds
• time interval between different requests: 200 milliseconds
• The latency of different operations is measured
– AddJob: requires a set of put/get operations
– RequestJob: a single DHT get

Page 21

CONCLUSIONS

• SRDS: a service and resource discovery support developed for the XtreemOS distributed operating system

• Provides scalable and customisable information query support over large platforms

• Future work:
– testing SRDS on a large computing platform
– dynamic definition of namespaces on different DHTs
– definition of hierarchical namespaces
– investigation of further strategies for range queries (multi-attribute range and neighbours queries)