Transcript of: ATLAS Italia Calcolo - Istituto Nazionale di Fisica Nucleare
24-9-2003 L.Perini-CNS1@Lecce 1
ATLAS Italia Calcolo
Overview of ATLAS software and computing; computing done and planned; INFN share
Milestones report
24-9-2003 L.Perini-CNS1@Lecce 2
Talk outline
• ATLAS Computing organization: areas being reworked or newly created
• Status and development of software and tools
  – Simulation, Reconstruction, Production Environment
• Data Challenges
  – Computing done (DC1 etc.) in Italy and planned (DC2 etc.)
  – How it fits into global ATLAS
• Milestones 2003: report
• Milestones 2004: proposal
Dario Barberis: ATLAS Organization 3
LHCC Review of Computing Manpower - 2 Sep. 2003
Computing Organization
The ATLAS Computing Organization was revised at the beginning of 2003 in order to adapt it to current needs. Basic principles:
Management Team consisting of:
Computing Coordinator (Dario Barberis)
Software Project Leader (David Quarrie)
Small(er) executive bodies
Shorter, but more frequent, meetings
Good information flow, both horizontal and vertical
Interactions at all levels with the LCG project
The new structure is now in place and working well. A couple of areas still need some thought (this month).
Dario Barberis: ATLAS Organization 4
LHCC Review of Computing Manpower - 2 Sep. 2003
New computing organization
[Organization chart] Internal organization being defined this month
Dario Barberis: ATLAS Organization 5
LHCC Review of Computing Manpower - 2 Sep. 2003
Main positions in computing organization
• Computing Coordinator
  – Leads and coordinates the development of ATLAS computing in all its aspects: software, infrastructure, planning, resources.
  – Coordinates development activities with the TDAQ Project Leader(s), the Physics Coordinator and the Technical Coordinator through the Executive Board and its subcommittees (COB and TTCC).
  – Represents ATLAS computing in the LCG management structure (SC2 and other committees) and at LHC level (LHCC and LHC-4).
  – Chairs the Computing Management Board.
• Software Project Leader
  – Leads the development of ATLAS software, as the Chief Architect of the Software Project.
  – Is a member of the ATLAS Executive Board, COB and TTCC.
  – Participates in the LCG Architects Forum and other LCG activities.
  – Chairs the Software Project Management Board and the Architecture Team.
Dario Barberis: ATLAS Organization 6
LHCC Review of Computing Manpower - 2 Sep. 2003
Main boards in computing organization
• Computing Management Board (CMB):
  – Computing Coordinator (chair)
  – Software Project Leader
  – TDAQ Liaison
  – Physics Coordinator
  – International Computing Board Chair
  – GRID, Data Challenge & Operations Coordinator
  – Planning & Resources Coordinator
  – Data Management Coordinator
  – Responsibilities: coordinate and manage computing activities; set priorities and take executive decisions.
  – Meetings: bi-weekly.
Dario Barberis: ATLAS Organization 7
LHCC Review of Computing Manpower - 2 Sep. 2003
Main boards in computing organization
• Software Project Management Board (SPMB):
  – Software Project Leader (chair)
  – Computing Coordinator (ex officio)
  – Simulation Coordinator
  – Event Selection, Reconstruction & Analysis Tools Coordinator
  – Core Services Coordinator
  – Software Infrastructure Team Coordinator
  – LCG Applications Liaison
  – Calibration/Alignment Coordinator
  – Sub-detector Software Coordinators
  – Physics Liaison
  – TDAQ Software Liaison
  – Responsibilities: coordinate the coherent development of software (both infrastructure and applications).
  – Meetings: bi-weekly.
Dario Barberis: ATLAS Organization 8
LHCC Review of Computing Manpower - 2 Sep. 2003
Main boards in computing organization
• ATLAS-LCG Team:
– Includes all ATLAS representatives in the many LCG committees. Presently 9 people:
• SC2: Dario Barberis (Computing Coordinator), Daniel Froidevaux (from Physics Coordination)
• PEB: Gilbert Poulard (DC Coordinator)
• GDB: Dario Barberis (Computing Coordinator), Gilbert Poulard (DC Coordinator), Laura Perini (Grid Coordinator)
• GAG: Laura Perini (Grid Coordinator), Craig Tull (Framework-Grid integr.)
• AF: David Quarrie (Chief Architect & SPL)
• POB: Peter Jenni (Spokesperson), Torsten Åkesson (Deputy Spokesperson)
• LHC4: Peter Jenni (Spokesperson), Torsten Åkesson (Deputy Spokesperson), Dario Barberis (Computing Coordinator), Roger Jones (ICB Chair)
– Responsibilities: coordinate the ATLAS-LCG interactions, improve information flow between “software development”, “computing organization” and “management”. Meetings: weekly.
Dario Barberis: ATLAS Organization 9
LHCC Review of Computing Manpower - 2 Sep. 2003
Main boards in computing organization
• Architecture Team (A-Team):
  – Composition: experts appointed by the Software Project Leader.
  – Responsibilities: design and set guidelines for the implementation of the software architecture.
  – Meetings: weekly.
• Software Infrastructure Team (SIT):
  – Composition: experts appointed by the Software Project Leader.
  – Responsibilities: provide the infrastructure for software development and distribution.
  – Meetings: bi-weekly.
• International Computing Board (ICB):
  – Composition: representatives of the ATLAS funding agencies.
  – Responsibilities: discuss the allocation of resources to the Software & Computing project.
  – Meetings: 4 times/year (in software weeks).
Dario Barberis: ATLAS Organization 10
LHCC Review of Computing Manpower - 2 Sep. 2003
Organization: work in progress (1)
Data Challenge, Grid and Operations
• terms of office of key people are coming to an end ~now
• DC1 operation is finished; we need to put in place an effective organization for DC2
• Grid projects are moving from the R&D phase to implementation and eventually production systems
• we are discussing how to coordinate at high level all activities:
  – Data Challenge organization and execution
  – "Continuous" productions for physics and detector performance studies
  – Contacts with Grid middleware providers
  – Grid Application Interfaces
  – Grid Distributed Analysis
• we plan to put a new organization in place by September 2003, before the start of DC2 operations
Dario Barberis: ATLAS Organization 11
LHCC Review of Computing Manpower - 2 Sep. 2003
Organization: work in progress (2)
Event Selection, Reconstruction and Analysis Tools
• here we aim to achieve a closer integration of people working on:
  – high-level trigger algorithms
  – detector reconstruction
  – combined reconstruction
  – event data model
  – software tools for analysis
• "effective" integration in this area was already achieved with the HLT TDR work; now we have to set up a structure to maintain constant contacts and information flow
• the organization of this area will have to be agreed with the TDAQ and Physics Coordinators (discussions on-going)
• most of the people involved will have dual reporting lines (same as for detector software people)
• we plan to put the new organization in place by the September 2003 ATLAS Week
Dario Barberis: ATLAS Organization 12
LHCC Review of Computing Manpower - 2 Sep. 2003
Computing Model Working Group (1)
• Work on the Computing Model was done in several different contexts:
  – online to offline data flow
  – world-wide distributed reconstruction and analysis
  – computing resource estimations
• The time has come to bring all these inputs together coherently
• A small group of people has been put together to start collecting all existing information and defining further work in view of the Computing TDR, with the following backgrounds:
  – Resources
  – Networks
  – Data Management
  – Grid applications
  – Computing farms
  – Distributed physics analysis
  – Distributed productions
  – Alignment and Calibration procedures
  – Data Challenges and tests of the computing model
Dario Barberis: ATLAS Organization 13
LHCC Review of Computing Manpower - 2 Sep. 2003
Computing Model Working Group (2)
• This group will:
  – first assemble existing information and digest it
  – act as contact point for input into the Computing Model from all ATLAS members
  – prepare a "running" Computing Model document with up-to-date information, to be used for resource bids etc.
  – prepare the Computing Model Report for the LHCC/LCG by end 2004
  – contribute the Computing Model section of the Computing TDR (mid-2005)
• The goal is to come up with a coherent model for:
  – physical hardware configuration (e.g. how much disk should be located at the experiment hall, between the Event Filter and the Prompt Reconstruction Farm)
  – data flows
  – processing stages
  – latencies
  – resources needed at CERN and in Tier-1 and Tier-2 facilities
Dario Barberis: ATLAS Organization 14
LHCC Review of Computing Manpower - 2 Sep. 2003
Computing Model Working Group (3)
• Group composition:
  – Roger Jones (ICB chair, Resources), chairman
  – Bob Dobinson (Networks)
  – David Malon (Data Management)
  – Torre Wenaus (Grid applications)
  – Sverre Jarp (Computing farms)
  – Paula Eerola (Distributed physics analysis)
  – XXX (Distributed productions)
  – Richard Hawkings (Alignment and Calibration procedures)
  – Gilbert Poulard (Data Challenges and Computing Model tests)
  – Dario Barberis & David Quarrie (Computing management, ex officio)
• First report expected in October 2003
• Tests of the Computing Model will be the main part of DC2 operation (2Q 2004)
Dario Barberis: ATLAS Organization 15
LHCC Review of Computing Manpower - 2 Sep. 2003
Data Management Issues
The ATLAS Database Coordination Group was recently set up to coordinate:
• Production & Installation DBs (TCn)
• Configuration DB (online)
• Conditions DB (online and offline)
with respect to:
• data transfer
• synchronization
• data transformation algorithms, e.g. from survey measurements of reference marks on muon chambers to wire positions in space (usable online and offline)
members:
Richard Hawkings (Alignment & Calibration Coordinator), chair
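The transformation example above, survey marks to wire positions, is at heart a coordinate transformation. A toy sketch follows (Python/NumPy; function and variable names are invented, and this is not the ATLAS alignment code) of applying a survey-fitted rigid-body transform to wires known in chamber-local coordinates:

```python
import numpy as np

def wires_in_global_frame(rotation, translation, local_wires):
    """Apply a survey-fitted rigid-body transform (rotation matrix plus
    translation vector) to wire positions given in chamber-local coordinates."""
    return local_wires @ rotation.T + translation

# Toy example: a chamber rotated by 1 mrad around z and shifted 2 mm in x.
phi = 1e-3
rot = np.array([[np.cos(phi), -np.sin(phi), 0.0],
                [np.sin(phi),  np.cos(phi), 0.0],
                [0.0,          0.0,         1.0]])
wires = np.array([[0.0, 0.0, 0.0], [0.0, 30.0, 0.0], [0.0, 60.0, 0.0]])  # mm
print(wires_in_global_frame(rot, np.array([2.0, 0.0, 0.0]), wires))
```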
Dario Barberis: ATLAS Organization 16
LHCC Review of Computing Manpower - 2 Sep. 2003
Conditions Data Working Group
• The Database Workshop in early February brought the (so far) separate communities together
• Convergence of interest was obvious from the online and offline sides
• The terminology used was rather different, as were the data-flow assumptions
• A small group of people started addressing these items:
  – definition of configuration and conditions data
  – data flow and rates for configuration and conditions data
  – relation between online and offline calibrations
  – input from the Detector Control Systems
  – input from the "static" production database, pre-run calibration and survey data
• Members of the group so far:
  – Richard Hawkings (chair)
  – David Adams
  – Steve Armstrong
  – Mihai Caprini
  – David Malon
  – David Quarrie
  – RD Schaffer
  – Igor Soloviev
24-9-2003 L.Perini-CNS1@Lecce 17
Simulation in ATLAS (A. Rimoldi)
Demanding environment
• People vs things
• The biggest collaboration ever gathered in HEP
• The most complete and challenging physics ever handled
The present simulation in a nutshell:
• Fast simulation: Atlfast
• Detailed simulation in Geant3
  – In production for 10 years, but frozen since 1995; used for DC productions until now
• Detailed simulation in Geant4
  – Growing up (and evolving fast) from the subdetector side
  – Detailed testbeam studies (the testbeam treated as an 'old times' experiment), for all the technologies represented
  – Physics studies extensively addressed since 2001, for validation purposes
• Under development:
  – Fast/semi-fast simulation, shower parameterizations
  – Staged detector environment for early studies
  – Optimizations, FLUKA integration…
24-9-2003 L.Perini-CNS1@Lecce 18
DC2
Different domains attach different meanings to DC2… For the Geant4 simulation people, the DC2 target is a way to state that:
• Geant4 is the main simulation engine for Atlas from now on
• We have concluded a first physics-validation cycle and found that Geant4 is now better than, or at least comparable with, Geant3
• We have written enough C++ code to say that the geometry description of Atlas is at the same level of detail as the one in Geant3
The application must still be optimized from the point of view of:
• Memory usage at run time
• Performance (CPU)
• Application robustness
24-9-2003 L.Perini-CNS1@Lecce 19
DC2 is close
We have a functional simulation program based on Geant4, available now for the complete detector:
• detector components already collected
• shifting emphasis from subdetector physics simulations to ATLAS physics simulations, after three years of physics validation
Studies under way:
• memory usage minimization
• performance optimization
• initialization time monitoring/minimization
• calorimeter parameterization
• a new approach to the detector description through the GeoModel
We are fully integrated within the Athena framework
24-9-2003 L.Perini-CNS1@Lecce 20
Complete Simulation Chain
• Events can be generated online or read in
• The geometry layout can be chosen
• Hits are defined for all detectors
• Hits can now be written out (and read back in) together with the HepMC information
• Digitization is being worked out right now
• The pile-up strategy is to be developed in the near future
24-9-2003 L.Perini-CNS1@Lecce 21
The plan (for Geant4), short term (WBS item, effort, time window):
1.2.7.1.1.1.2.1 geometry of all subdetectors
1.2.7.1.1.1.2.1.1 shieldings in place: 2 weeks, Oct-Nov 03
1.2.7.1.1.1.2.1.2 cables & services: 4 weeks, Oct-Dec 03
1.2.7.1.1.1.2.2 performance tests at different conditions: 1 week, Jul-Feb 04
1.2.7.1.1.1.2.3 robustness tests for selected event samples: 2 weeks, Aug-Feb 04
1.2.7.1.1.1.2.4 robustness tests for selected regions
1.2.7.1.1.1.2.4.1 barrel: 2 weeks, Sep-Dec 03
1.2.7.1.1.1.2.4.2 endcap: 2 weeks, Sep-Dec 03
1.2.7.1.1.1.2.4.3 transition region: 2 weeks, Sep-Dec 03
1.2.7.1.1.1.2.5 hits for all subdetectors (check and test): 2 weeks, Sep-Dec 03
1.2.7.1.1.1.2.6 persistency: 2 weeks, Sep-Nov 03
1.2.7.1.1.1.2.6.1 performance tests with all the detector components in place: 1 week, Sep-Nov 03
1.2.7.1.1.1.2.6.2 performance tests vs. different conditions: 2 weeks, Sep-Nov 03
1.2.7.1.1.1.2.6.3 robustness tests for all the detector components: 2 weeks, Sep-Dec 03
1.2.7.1.1.1.2.7 package restructuring for inconsistency with old structures: 3 weeks, Oct-Dec 03
1.2.7.1.1.1.2.8 cleaning of the packages area (to attic): 1 week, Nov 03
1.2.7.1.1.1.2.9 revising writing rights (obsolete, new): 1 week, Nov 03
1.2.7.1.1.1.2.10 documentation: 4 weeks, Sep-Dec 03

Emphasis on:
• refinement of geometry: missing pieces, combined testbeam setup
• performance and robustness tests
• hits & digits
• persistency
• pile-up

In view of DC2: early tests starting from September with single-particle beams, in order to evaluate the global performance well before DC2 startup
Atlas week, Sep 2003, Prague Alexander Solodkov 22
Reconstruction: algorithms in Athena
• Two pattern recognition algorithms are available for the Inner Detector: iPatRec and xKalman
• Two different packages are used to reconstruct tracks in the Muon Spectrometer: MuonBox and MOORE
• The initial reconstruction of cell energy is done separately in LAr and TileCal. After that, all reconstruction algorithms see no difference between LArCell and TileCell and use generic CaloCells as input
• Jet reconstruction, missing ET
• Several algorithms combine information from the tracking detectors and calorimeters in order to achieve good rejection factors or identification efficiency:
  – e/γ identification, e/π rejection, τ identification, µ back-tracking to the Inner Detector through the calorimeters, …
Atlas week, Sep 2003, Prague Alexander Solodkov 23
High Level Trigger algorithm strategy
• Offline model: the Event Loop Manager directs an Algorithm: "Here is an event, see what you can do with it."
• High Level Trigger model: the Steering directs an Algorithm: "Here is a seed. Access only the relevant event data. Only validate a given hypothesis. You may be called multiple times for this one event! Do it all within the LVL2 [EF] latency of O(10 ms) [O(1 s)]."
ISSUES, by trigger level:
• Data Access: LEVEL 2 is restricted to Regions-of-Interest; the EVENT FILTER has full access to the event if necessary
• Performance: LEVEL 2 applies a fast and rough treatment; the EVENT FILTER applies slow and refined approaches
• Calibration & Alignment Database Access: LEVEL 2 has no event-to-event access; for the EVENT FILTER, event-to-event access is possible
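To make the contrast concrete, here is a toy rendering of the two invocation models (Python pseudocode; the real interfaces are C++ in Athena/HLT, and every name below is invented):

```python
class OfflineAlgorithm:
    """Toy offline algorithm: the Event Loop Manager hands over the whole
    event, once per event; full access, no tight latency budget."""
    def execute(self, event):
        return [t for t in event.all_tracks() if t.pt > 1.0]

class SeededTriggerAlgorithm:
    """Toy HLT algorithm: the Steering hands over a seed (e.g. a LVL1
    Region-of-Interest) and may do so several times per event."""
    def validate_hypothesis(self, seed, hits):
        # Toy decision: confirm the seed if enough hits lie in the RoI.
        return len(hits) >= 3

    def execute(self, seed, event_store):
        # Access only the data inside the RoI, validate one hypothesis,
        # and return within the LVL2 [EF] budget of O(10 ms) [O(1 s)].
        hits = event_store.get_hits(region=seed.roi)
        return self.validate_hypothesis(seed, hits)
```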
Atlas week, Sep 2003, Prague Alexander Solodkov 24
New test beam reconstruction in Athena
• The Inner Detector (Pixel, SCT), calorimeter (TileCal) and the whole Muon System are using the latest TDAQ software at the test beam
  – ByteStream files are produced by the DataFlow libraries
  – The format of the ROD fragment in the output ByteStream file is very close to the one used for HLT performance studies
• ByteStream with test beam data is available in Athena now
  – ByteStreamCnvSvc has been able to read test beam ByteStream since July 2003
  – ROD data decoding is implemented in the same way as in the HLT converters, for MDT and RPC (July 2003) and TileCal (September 2003)
  – The converters fill the new Muon/TileCal EDM
  – The RDO => RIO conversion already available in Athena is reused at no cost
• Reconstruction of Muon TB data is possible in Athena
  – Muon reconstruction is done by the MOORE package
  – Ntuples are produced for the analysis
• Combined test beam (8-13 Sep 2003)
  – Both MDT and TileCal data are reconstructed in Athena
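For illustration, a toy ByteStream decoder in the spirit of the converters described above (the word layout below is invented for the example; the real ROD fragment format and ByteStreamCnvSvc are considerably richer):

```python
import struct

def decode_rod_fragment(raw: bytes):
    """Toy ROD fragment: a 2-word header (source ID, number of data words)
    followed by 32-bit data words, little-endian. The layout is purely
    illustrative, not the ATLAS raw event format."""
    source_id, nwords = struct.unpack_from("<II", raw, 0)
    words = struct.unpack_from(f"<{nwords}I", raw, 8)
    # Split each word into a channel number and an ADC count (toy RDO).
    return source_id, [(w >> 16, w & 0xFFFF) for w in words]

frag = struct.pack("<IIII", 0x610000, 2, (3 << 16) | 412, (7 << 16) | 395)
print(decode_rod_fragment(frag))  # (6356992, [(3, 412), (7, 395)])
```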
Atlas week, Sep 2003, Prague Alexander Solodkov 25
MOORE MDT segments reconstruction (test beam data)
[Plots: chamber misalignments and barrel sagitta, 180 GeV beam]
• For the full 3-D reconstruction the standard MOORE ntuple can be used
• Comparisons with Muonbox are possible
Atlas week, Sep 2003, Prague Alexander Solodkov 26
Reconstruction Task Force
• Who
  – Véronique Boisvert, Paolo Calafiura, Simon George (chair), Giacomo Polesello, Srini Rajagopalan, David Rousseau
• Mandate
  – Formed in Feb 03 to perform a high-level re-design and decomposition of the reconstruction and event data model
  – Cover everything between raw data and analysis
  – Look for common solutions for HLT and offline
• Deliverables
  – Interim reports published in April and May, with significant constructive feedback
  – Final report due any day now
• Interaction
  – Several well-attended open meetings to kick off and present reports
  – Meetings focused on specific design issues to get input and feedback
  – Feedback incorporated into the second interim report
Atlas week, Sep 2003, Prague Alexander Solodkov 27
RTF recommendations
Very brief overview… please read the report
• Modularity, granularity, baseline reconstruction
• Reconstruction top-down design (dataflow)
  – Domains: sub-systems, combined reconstruction and analysis preparation
  – Analysis of algorithmic components, identification of common tools
  – Integration of fast simulation
  – Steering
• EDM
  – Common interfaces between algorithms, e.g. common classes for the tracking subsystems
  – Design patterns to give uniformity to data classes in the combined reconstruction domain
  – Approach to units and transformations
  – Separation of event and non-event data
  – Navigation
Atlas week, Sep 2003, Prague Alexander Solodkov 28
Implementation of RTF recommendations
• The RTF ends with the final report
• Goals
  – incorporate first feedback into Release 7.0.0
  – substantial implementation by 8.0.0
  – Ambitious!
• How
  – Planned in the subsystems, coordinated in the SPMB
  – Requires cross-subsystem cooperation, because of the nature of the recommendations: common EDM classes, shared tools and patterns
• This is already happening
  – Joint meetings, e.g. the recent muon + InDet tracking and jet reconstruction meetings
  – The subdetectors' WBS already include implementations of the RTF recommendations
  – Some are already implemented, e.g. Calo cluster event/non-event data separation and common InDet RIOs are in 7.0.0
Atlas week, Sep 2003, Prague Alexander Solodkov 29
Reconstruction Summary
• A complete spectrum of reconstruction algorithms is available in the Athena framework
  – They are used both for HLT and offline reconstruction
  – The same algorithms are being tried for test beam analysis
• Ongoing developments:
  – Cleaner modularization (toolbox)
  – Robustness (noisy/dead channels, misalignments)
  – Extending the algorithms' reach (e.g. low pT, very high pT)
  – New algorithms
• Implementation of the RTF recommendations in the next releases will greatly improve the quality of the reconstruction software
• Next challenge: summer 2004, a complete ATLAS barrel in the test beam. Reconstruction and analysis using (almost)…
24-9-2003 L.Perini-CNS1@Lecce 30
Development of the new ATLAS Production Environment
• Several tools developed so far
  – Especially in the US Grid context
• Productions carried out with different tools in different places
  – Much manpower used; little automation; checks and corrections after the fact
• Decision to develop a new, coherent system; slides by Alessandro De Salvo follow
  – Meetings in July-August; final restricted meeting on 12 August with De Salvo for INFN: system architecture (with reuse), sharing between CERN (+ Nordic countries), INFN and the US
  – For INFN, participation from Milano-CNAF (2 people from EDT, Guido), Napoli (2 people), Roma1 (Alessandro)
24-9-2003 L.Perini-CNS1@Lecce 31
Atlas Production System
Design of an automatic production system to be deployed ATLAS-wide on the time scale of DC2 (spring 2004):
• Automatic
• Robust
• Support for several flavours of GRID and legacy resources: LCG, US-GRID, NG, local batch queues
Components:
• Production DB
• Supervisor/Executors (master-slave system)
• Data Management System (to be finalized)
• Production Tools
To be defined:
• Security & Authorization
• Continuous Parallel QA System
• Monitoring Tools
• Exact schemata of the Production DB
24-9-2003 L.Perini-CNS1@Lecce 32
Atlas Production System details (I)
Components:
• Production DB
  – Single (logical) DB
  – Dataset → Task → Task Transformation → Dataset
  – Logical File → Job Definition → Job Transformation → Job Execution → Logical File
• Supervisor
  – All the initiative comes from it
  – Communicates with the DB
  – Uses several Executors to perform GRID- or resource-specific tasks
  – Supervisors and executors are logically and physically separated, thus allowing maximum flexibility and crash-safe operation
• Data Management
  – Single (logical) DMS for all Atlas data
  – Registration of all files in all facilities
  – Ability to move data between any of the facilities
  – Replica Management
• Tools
  – Production request
  – Production definition
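The two chains above suggest a simple relational shape for the Production DB. A minimal sketch in Python dataclasses, with invented field names (the exact schemata were still to be defined, as noted on the previous slide):

```python
from dataclasses import dataclass, field

@dataclass
class Transformation:
    executable: str       # executable name
    release: str          # software release version
    signature: str        # physics/parameter signature

@dataclass
class Task:
    """A task applies a transformation to an input dataset and
    produces an output dataset; Task = [job]*."""
    transformation: Transformation
    input_dataset: str
    output_dataset: str
    jobs: list = field(default_factory=list)

@dataclass
class Job:
    """A job turns logical input files into logical output files;
    each job produces one partition of the dataset (Dataset = [partition]*)."""
    transformation: Transformation
    input_files: list
    output_files: list
    executions: list = field(default_factory=list)  # retries recorded here
```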
24-9-2003 L.Perini-CNS1@Lecce 33
Atlas Production System details (II)
Job description data model: Task = [job]*, Dataset = [partition]*
• A Task (Dataset) is defined by a Task Transformation definition plus a physics signature
• A Job (Partition) is defined by a Partition Transformation definition: executable name, release version, signature
[Architecture diagram: the Production DB (fed by human intervention), the Data Management System and the Job Run Info store serve several Supervisors (Luc Goossens, Kaushik De). Using location hints at task and at job level, the supervisors hand jobs to flavour-specific executors: a US Grid executor based on Chimera (Rob Gardner), an LCG executor going through the Resource Broker (Alessandro De Salvo), an NG executor (Oxana Smirnova) and an LSF executor (Luc Goossens), which submit to US Grid, LCG, NG and local batch resources respectively.]
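The supervisor/executor split in the diagram can be sketched as follows (Python, with invented class and method names; the real system was still being designed at this point): the supervisor owns all initiative, pulls runnable job definitions from the production DB, and delegates Grid-flavour-specific submission to interchangeable executors, so that an executor failure cannot corrupt the supervisor's state.

```python
class Executor:
    """Base for flavour-specific executors (US Grid, LCG, NG, local batch)."""
    def submit(self, job):
        raise NotImplementedError
    def status(self, handle):
        raise NotImplementedError

class LocalBatchExecutor(Executor):
    """Example executor: would wrap a local batch system such as LSF/PBS."""
    def submit(self, job):
        # A real executor would build and submit a batch job here.
        return f"batch-{job['id']}"
    def status(self, handle):
        return "done"

def supervise(production_db, executor):
    # All initiative lies with the supervisor: it fetches runnable job
    # definitions, submits them through an executor, and records the
    # outcome; the two sides share only this narrow interface.
    for job in production_db.fetch_runnable_jobs():
        handle = executor.submit(job)
        production_db.record(job["id"], executor.status(handle))
```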
Dario Barberis: ATLAS Organization 34
LHCC Review of Computing Manpower - 2 Sep. 2003
ATLAS Computing Timeline
• POOL/SEAL release
• ATLAS release 7 (with POOL persistency)
• LCG-1 deployment
• ATLAS complete Geant4 validation
• ATLAS release 8
• DC2 Phase 1: simulation production
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for the PRR and test LCG-n
• Computing Memorandum of Understanding
• Physics Readiness Report
• Start commissioning run
• GO!
[Vertical timeline spanning 2003-2007 on which the items above are placed, with a NOW marker at September 2003]
Dario Barberis: ATLAS Organization 35
LHCC Review of Computing Manpower - 2 Sep. 2003
High-Level Milestones
10 Sept. 2003: Software Release 7 (POOL integration)
31 Dec. 2003: Geant4 validation for DC2 complete
27 Feb. 2004: Software Release 8 (ready for DC2/1)
1 April 2004: DC2 Phase 1 starts
1 May 2004: Ready for combined test beam
1 June 2004: DC2 Phase 2 starts
31 Jul. 2004: DC2 ends
30 Nov. 2004: Computing Model paper
30 June 2005: Computing TDR
30 Nov. 2005: Computing MOU
30 June 2006: Physics Readiness Report
2 October 2006: Ready for Cosmic Ray Run
24-9-2003 L.Perini-CNS1@Lecce 36
DC1 and the INFN share
• DC1-1 done in 1.5 months: finished September 2002
  – 10^7 events + 3x10^7 single particles
  – 39 sites
  – 30 TB, 500 kSI2k·months
  – About 3000 CPUs used (at most)
  – INFN CPUs: 132 = Roma1 46, CNAF 40, Milano 20, Napoli 16, LNF 10 (SI95 = 2*2000+800+600 = 5400)
  – INFN about 5% of the resources and a 5% share (but INFN = 10% of ATLAS)
• DC1-2 pile-up done in 1 month: finished at the end of 2002
  – 1.2 M events from DC1-1
  – 10 TB and 40 kSI2k·months, same sites "proportionally"
  – INFN resources and share as in DC1-1 (by construction)
24-9-2003 L.Perini-CNS1@Lecce 37
Reconstruction for the HLT TDR
• Done on 1.3 M events in 15 days; finished in May 2003
  – 10 sites (Tier1s or similar)
  – 30 kSI2k·months
  – Perhaps more CPU went into the various tests than into the final production…
• The CNAF fraction was close to 10%
• Repeated in July and early August on 20 CNAF CPUs
  – reconstruction for physics (A0) then continued (see the August CNAF-ATLAS monitoring)…
24-9-2003 L.Perini-CNS1@Lecce 38
DC2 in Italy
• Start in April 2004, end in November
  – The new ATLAS "production environment" will be used
  – INFN researchers are involved in its development
• The global ATLAS effort for simulation + reconstruction, in SI2k·months, is about twice that of DC1, assuming Geant4 CPU = Geant3
  – INFN CPU required: from DC1*4 to DC1*6 (Geant4 uncertainty)
• Beyond DC2, computing for physics and detector studies (as in DC1)
  – See August at Mi, Na, Rm
• In DC2, for the first time, massive distributed analysis (Tier3)
• The 2004 needs foresee (table of requests to the Referees):
  – 18 kSI95 (5k existing + 13k new) in the Tier2s (disk: 10.5 TB now + 11 TB new); the new resources to be brought forward to 2003
    • At Mi-LCG 120 CPUs (70 new = 6k), at Rm 100 (45 new = 4k), at Na (45 new = 4k)
    • LNF starts with 0.2k + 0.6k new and 0.9 TB of disk
  – From 7k to 15k in the Tier1 (buffer against Geant4 performance)
  – Addition of 1.5k SI95 and disk to the Tier3 system (now only 700 SI95! and about 1 TB across 8 sections)
24-9-2003 L.Perini-CNS1@Lecce 39
DC2 in Italy
• It is important that the INFN share does not fall below 10% again
• It is important to participate with all the local expertise
  – Setting up and decisions happen now for the computing and analysis model
• For 2005 the plan is a modest increase over the 2004 Tier2 requests, and a doubling of the Tier3 CPU
  – For the Tier2s: 3 kSI95 and 2 TB of disk (nothing at Mi and Rm)
  – For the Tier3s: 2 kSI95 and 3 TB of disk
• Slides follow (G. Poulard) on the DC situation and the global ATLAS planning, to illustrate the various points
24-9-2003 L.Perini-CNS1@Lecce 40
DC1 in numbers

Process                     No. of events   CPU time         CPU-days      Volume of data
                                            (kSI2k.months)   (400 SI2k)    (TB)
Simulation, physics evt.    10^7            415              30000         23
Simulation, single part.    3x10^7          125              9600          2
Lumi02 pile-up              4x10^6          22               1650          14
Lumi10 pile-up              2.8x10^6        78               6000          21
Reconstruction              4x10^6          50               3750
Reconstruction + Lvl1/2     2.5x10^6        (84)             (6300)
Total                                       690 (+84)        51000 (+6300) 60

(The parenthesized Reconstruction + Lvl1/2 figures are quoted as additions to the totals.)
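As a consistency check of the two CPU columns: at the slide's reference of 400 SI2k per CPU and 30 days per month, kSI2k.months convert into CPU-days as sketched below (the slide's figures are rounded):

```python
KSI2K_PER_CPU = 0.4     # the slide's 400 SI2k reference CPU
DAYS_PER_MONTH = 30

def cpu_days(ksi2k_months):
    """Convert kSI2k.months of work into days on one 400 SI2k CPU."""
    return ksi2k_months / KSI2K_PER_CPU * DAYS_PER_MONTH

print(cpu_days(415))    # 31125.0, quoted as ~30000 CPU-days
print(cpu_days(690))    # 51750.0, quoted as ~51000 CPU-days (total)
```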
24-9-2003 L.Perini-CNS1@Lecce 41
Contribution to the overall CPU-time (%) per country
[Pie chart: per-country shares of the DC1 Phase 1 CPU time, ranging from 0.01% to 28.66%]
ATLAS DC1 Phase 1: July-August 2002
• 3200 CPUs
• 110 kSI95
• 71000 CPU-days
• 5x10^7 events generated
• 1x10^7 events simulated
• 3x10^7 single particles
• 30 TB
• 35 000 files
39 institutes in 18 countries: Australia, Austria, Canada, CERN, Czech Republic, France, Germany, Israel, Italy, Japan, Nordic, Russia, Spain, Taiwan, UK, USA
Grid tools used at 11 sites
24-9-2003 L.Perini-CNS1@Lecce 42
Primary data (in 8 sites)
Total amount of primary data: 59.1 TB
• Lyon 17.9 TB (31%)
• CERN 14.7 TB (25%)
• BNL 12.1 TB (20%)
• Alberta 3.6 TB (6%)
• CNAF 3.6 TB (6%)
• Oslo 2.6 TB (4%)
• RAL 2.3 TB (4%)
• FZK 2.2 TB (4%)
Data (TB): simulation 23.7 (40%), pile-up 35.4 (60%), of which Lumi02 14.5 and Lumi10 20.9
Pile-up: low luminosity ~4x10^6 events (~4x10^3 NCU-days); high luminosity ~3x10^6 events (~12x10^3 NCU-days)
Data replication using Grid tools (Magda)
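The last line, replication with Magda, is at heart a bookkeeping problem: track which sites hold a physical copy of each logical file. A toy sketch follows (Python; this is not the Magda API, and the file name is invented for the example):

```python
from collections import defaultdict

class ReplicaCatalog:
    """Toy logical-file -> sites catalogue, in the spirit of Magda."""
    def __init__(self):
        self.replicas = defaultdict(set)

    def register(self, lfn, site):
        self.replicas[lfn].add(site)

    def replicate(self, lfn, src, dst, transfer):
        if src not in self.replicas[lfn]:
            raise LookupError(f"no copy of {lfn} at {src}")
        transfer(lfn, src, dst)          # the actual Grid copy happens here
        self.register(lfn, dst)

cat = ReplicaCatalog()
cat.register("dc1.002000.simul.0001.zebra", "Lyon")
cat.replicate("dc1.002000.simul.0001.zebra", "Lyon", "CERN",
              transfer=lambda lfn, s, d: None)   # stub transfer
print(cat.replicas)
```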
24-9-2003 L.Perini-CNS1@Lecce 43
DC2 resources (based on Geant3 numbers)

Process                   No. of events  Time span  CPU power  CPU time        Volume   At CERN  Off-site
                                         (months)   (kSI2k)    (kSI2k.months)  (TB)     (TB)     (TB)
Simulation                10^7           2          260        520             24       8        16
Pile-up (*) Digitization  10^7           2          175        350             18 (75)  18 (25)  12 (50)
Byte-stream               10^7
Reconstruction            10^7           0.5        600        300             5        5        5
Total                     10^7           2          435        870             42 (+57) 26 (+57) 28 (+38)

(*) The parenthesized pile-up volumes are to be kept if there is no "0" suppression; the "(+n)" entries in the Total row are the additional volume if the byte-stream data are kept. The CPU-time column is CPU power times time span (e.g. 260 kSI2k x 2 months = 520 kSI2k.months); the Total CPU columns cover the 2-month simulation and pile-up phases, while reconstruction runs in a separate 0.5-month phase.
24-9-2003 L.Perini-CNS1@Lecce 44
DC2: July 2003 – July 2004
At this stage the goal includes:
• Full use of Geant4, POOL and the LCG applications
• Pile-up and digitization in Athena
• Deployment of the complete Event Data Model and the Detector Description
• Simulation of full ATLAS and of the 2004 combined test beam
• Testing of the calibration and alignment procedures
• Wide use of the GRID middleware and tools
• Large-scale physics analysis
• Computing model studies (document end 2004)
• Running as much as possible of the production on LCG-1
24-9-2003 L.Perini-CNS1@Lecce 45
Task Flow for DC2 data
[Diagram: the DC2 processing chain, with Athena-POOL persistency between stages]
• Event generation (Pythia6, e.g. H → 4 µ; Athena-ROOT / Athena-POOL) produces HepMC events
• Detector simulation (Athena + Geant4) reads the HepMC events and produces Hits + MCTruth (Athena-POOL)
• Digitization with pile-up (Athena) reads the hits and produces Digits, which can also be converted to Byte-stream
• Reconstruction (Athena) reads the digits/byte-stream and produces ESD and AOD
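The chain is naturally expressed as composable stages with a persistency boundary between each pair. The sketch below (Python; everything is invented, the stage names only mirror the diagram) shows the shape:

```python
def generate(n_events):                 # Pythia6, e.g. H -> 4 mu
    return [{"id": i, "hepmc": ...} for i in range(n_events)]

def simulate(events):                   # Athena + Geant4
    return [dict(ev, hits=..., mctruth=...) for ev in events]

def digitize(events, pileup=None):      # Athena pile-up + digitization
    return [dict(ev, digits=...) for ev in events]

def to_bytestream(events):              # raw-data-like format
    return [dict(ev, bytestream=...) for ev in events]

def reconstruct(events):                # Athena reconstruction -> ESD/AOD
    return [dict(ev, esd=..., aod=...) for ev in events]

# Every arrow between stages is an Athena-POOL write/read boundary, so
# each stage can run as a separate, independently scheduled production step.
output = reconstruct(to_bytestream(digitize(simulate(generate(10)))))
```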
24-9-2003 L.Perini-CNS1@Lecce 46
DC2: Scenario & Time scale
Put in place, understand & validate:
• Geant4; POOL; LCG applications
• Event Data Model
• Digitization; pile-up; byte-stream
• Conversion of DC1 data to POOL; large-scale persistency tests and reconstruction
Then: testing and validation; run test-production; start final validation
Then: start simulation, pile-up & digitization; event mixing; transfer of data to CERN
Finally: intensive reconstruction on the "Tier0"; distribution of ESD & AOD; calibration and alignment; start of physics analysis; reprocessing
Milestone dates:
• End-July 03: Release 7
• Mid-November 03: pre-production release
• February 1st 04: Release 8 (production)
• April 1st 04: start of simulation (DC2 Phase 1)
• June 1st 04: "DC2" reconstruction phase (Phase 2)
• July 15th 04
24-9-2003 L.Perini-CNS1@Lecce 47
ATLAS Data Challenges: DC2
• We are building an ATLAS Grid production & analysis system
• We intend to put in place a "continuous" production system
  – If we continue to produce simulated data during summer 2004, we want to keep open the possibility of running another "DC" later (November 2004?) with more statistics
• We plan to use LCG-1, but we will have to live with other Grid flavours and with "conventional" batch systems
• Combined test-beam operation is foreseen as part of DC2
24-9-2003 L.Perini-CNS1@Lecce 48
Milestones 2003
• 1 - Completion of the INFN 10% share of the Geant3 simulation for the HLT TDR within DC1 (April 2003)
  – Completed as per the slides presented, already by February, but at 5% (consistent with the available CPU)
• 2 - Completion of the reconstruction and analysis of the simulated data of the previous point (June 2003)
  – The data were reconstructed by May without trigger code; they were transferred to CERN and used for a fast analysis before the publication of the HLT TDR. In July and up to early August they were re-reconstructed with the trigger code added
  – Completed at 90%, because the first realistic test of distributed analysis, which we had planned to carry out for the HLT TDR, has been postponed to a date still to be decided
24-9-2003 L.Perini-CNS1@Lecce 49
Milestones 2003 (2)
• 3 - Simulation of 10^6 muon events with GEANT4 and the same layout used for the HLT TDR (June 2003)
  – The Pavia simulation group processed 4.5M single-muon events at 20 GeV (with an extra subsample at 200 GeV) in testbeam-2002 mode, with an estimated time per event of 0.1 s/ev on a Pentium III 1.26 GHz machine. The simulated data were then processed by the muon-system reconstruction programs (Calib and Moore) and compared with the real testbeam 2002 data. An analysis was carried out on this event production and an ATLAS internal note has been submitted for publication (ATLAS-COM-MUON-2003-014) (four authors: 2 from Pavia, 1 from Cosenza and one from CERN). In addition, again at Pavia, 1M single-muon events and about 2x10^4 Z → µµ events, plus as many W → µν events, were produced with the updated muon system (version P03 of the muon database Amdb_SimRec) in the central region of the muon spectrometer, for robustness tests
  – Completed at 100% (or more, if that were possible)
24-9-2003 L.Perini-CNS1@Lecce 50
Milestones 2003 (3)
• 4 - Repetition of one of the HLT TDR analyses on the muon data generated with GEANT4 (December 2003)
  – As reported in the previous point, the muon data generated with GEANT4 have already been validated by analysis and comparison with real data. The comparison with the GEANT3 results is still foreseen
• 5 - Integration of the ATLAS TierX sites into the LCG production system, and test of this integration with the first productions of ATLAS DC2 (December 2003)
  – ATLAS DC2 has been shifted forward by 7 months with respect to the date foreseen in July 2002, and the experiments' access to LCG-1 is about to happen only now (early September 2003), against a forecast of April-May (about 4 months late)
  – The Tier2s already active in ATLAS Italia (Milano, Roma1, Napoli) nevertheless intend to install LCG-1 and experiment with its use by the end of 2003. Milano, as a Tier2 already committed to LCG, will install LCG-1 by September and will take part in the activities agreed between ATLAS and LCG; Roma1 and Napoli will take part in the LCG-1 tests in a purely ATLAS framework. After this first phase of tests has been completed successfully, we will propose the official entry of Roma1 and Napoli into LCG (spring 2004?)
24-9-2003 L.Perini-CNS1@Lecce 51
Milestones 2004
• 1 - By May 2004: production-quality software ready for the start of DC2 (Geant4, Athena release 8, LCG production environment)
  – GEANT4:
    • performance optimization with respect to the current factor of 2 relative to GEANT3 (but no target has been fixed)
    • geometry refinement (cables, services, etc.)
    • finalization of digitization and persistency
  – Data Management (negligible INFN contribution):
    • integration with POOL and with the SEAL dictionary (representation of the ATLAS event model)
    • persistency for ESD, AOD and Tag Data in POOL
    • common geometry model for reconstruction and simulation
    • support for event collections and filtering (but this may slip to July)
  – Production Environment:
    • New production system for ATLAS, automated and accessing in a coherent way the production metadata DB (now AMI), the file catalogue and virtual data. Uniform user interface for all of ATLAS, interfaced to LCG (INFN responsibility), US-GRID (Chimera), NorduGrid and plain batch
24-9-2003 L.Perini-CNS1@Lecce 52
Milestones 2004 (2)
• 2 - By October 2004: DC2 simulation, reconstruction and possible reprocessing completed
  – Participation of the Tier1 and Tier2s (CNAF, Milano, Napoli, Roma1) in the simulation, pile-up and reconstruction phases, executing 10% of global ATLAS in Italy
  – From April all the Tier2 sites are LCG-capable, i.e. the software is installed at all of them and tested in an Italian mini-production (Milano in LCG since before 2004)
  – Tier3 analysis in collaboration with the Tier1 and Tier2s: report by end 2004
  – INFN contribution to the Computing TDR
24-9-2003 L.Perini-CNS1@Lecce 53
Table of requests 2003+4 (brought forward!)

Requested HW resources (years 2003+2004), with the funding requested per item and the total per INFN section (kEuro):

Milano (revised): a) compute nodes for 6 kSI95 of CPU (73.5); b) controller + disks for a total of 5 TB (24); c) switch (10). Total: 107.5
Roma1: a) compute nodes for 4 kSI95 of CPU (60); b) controller + disks for a total of 3 TB (18). Total: 78
Napoli: a) 1 rack, 42U (1.2); b) 15 compute nodes with dual-processor PIV 2.5 GHz, 3 kSI95 in total (45); c) 24-port 10/100/1000 Ethernet switch (1.5); d) server + RAID 5 controller + disks for a total of 2 TB (12). Total: 59.7
LNF: a) 600 SI95 of CPU (9); b) disks for a total of 0.8 TB (4.8). Total: 13.8
24-9-2003 L.Perini-CNS1@Lecce 54
Table of requests 2003+4 (brought forward?)

Cosenza: a) 5 PCs of 100 SI95 each + monitors (10); b) disks for a total of 0.3 TB (2). Total: 12
Genova: NAS file servers + disks for a total of 1.5 TB (9). Total: 9
Lecce: a) 1 compute node (1 dual-processor PIV 2.5 GHz) (3); b) disks for a total of 1 TB (6). Total: 9
Pavia: a) 300 SI95 of CPU (4.5); b) disks for a total of 0.6 TB in 2003 (4.2); c) disks for a total of 0.6 TB in 2004 (3.6). Total: 12.3
Pisa: a) 500 SI95 of CPU (7.5); b) disks for a total of 1 TB (12). Total: 19.5
Roma2: disks for a total of 1 TB (6). Total: 6
Roma3: a) 100 SI95 of CPU (1.5); b) disks for a total of 1 TB (6). Total: 7.5
Udine: no request listed
GRAND TOTAL: 334.3 kEuro
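The grand total checks out against the per-site totals on the two request slides; a quick verification (figures copied from the tables above):

```python
requests_keuro = {
    "Milano": 107.5, "Roma1": 78.0, "Napoli": 59.7, "LNF": 13.8,
    "Cosenza": 12.0, "Genova": 9.0, "Lecce": 9.0, "Pavia": 12.3,
    "Pisa": 19.5, "Roma2": 6.0, "Roma3": 7.5,
}
print(round(sum(requests_keuro.values()), 1))  # 334.3, as on the slide
```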
24-9-2003 L.Perini-CNS1@Lecce 55
Main Atlas Roma1 Farm Activities (Jun-Aug 2003)
• DC1 "conventional" reconstruction
  – 2.7x10^5 pile-up events (QCD di-jets, ET > 17 GeV)
• DC1 EDG reconstruction (CNAF, Cambridge, Lyon, MI, RM)
  – Reconstruction of 4x10^5 low-luminosity piled-up events (QCD di-jets, ET < 560 GeV)
• Muon Trigger studies
• H8 testbeam data analysis
• EDG/DC1 Atlas software package production (RPMs)
24-9-2003 L.Perini-CNS1@Lecce 56
CPU Load ATLAS@CNAF
20 available CPUs
[Plot: CPU load (1, 5 and 15-minute averages) in the CNAF Atlas Tier1, 27/07/2003 to 24/08/2003; y-axis: number of CPUs, 0-35]
24-9-2003 L.Perini-CNS1@Lecce 57
CPU Load ATLAS@Milano
44 available CPUs
[Plot: CPU load (1, 5 and 15-minute averages) in Milan, 27/07/2003 to 24/08/2003; y-axis: number of CPUs, 0-60]
24-9-2003 L.Perini-CNS1@Lecce 58
CPU load ATLAS@Napoli
24-9-2003 L.Perini-CNS1@Lecce 59
Roma1 Atlas Farm Usage Statistics
Farm info/description: https://classis01.roma1.infn.it/atlas-farm
24-9-2003 L.Perini-CNS1@Lecce 60
PBS Server Status, ATLAS Farm in Milan (updated every 120 minutes)
[Plot: total, queued, running, exiting and waiting jobs]
24-9-2003 L.Perini-CNS1@Lecce 61
CPUs status on atlcluster-mi (updated every 120 minutes)
[Plot: % user, % free, % system, % nice]
24-9-2003 L.Perini-CNS1@Lecce 62
PBS Server Status @ CNAF (updated every 120 minutes)
[Plot: total, queued, running, exiting and waiting jobs]