Experiences with numerical meteorological systems in service configuration based on high-performance computing at APAT

Franco Valentinotti, Quadrics Ltd.
Attilio Colagrossi, APAT

CAPI'04, Milan, 24-25 November 2004
Outline

APAT and meteorological services: history and current status
The service-research dualism: criteria and architectures
The computing systems
- The operational chain and the complexity of the models
- The BOLAM meteorological model
- QBolam and the APE100-based system
- PBolam and the ALTIX350-based system
- Experiences on other computing systems
Conclusions
APAT and meteorological services: history and current status

APAT: Agenzia per la Protezione dell'Ambiente e per i Servizi Tecnici (Agency for Environmental Protection and Technical Services)

Established in 2002.

Carries out technical-scientific activities of national interest for the protection of the environment, water, and soil.

Incorporates the competences previously assigned to ANPA and to the Department of National Technical Services: the National Hydrographic and Mareographic Service, the National Geological Service, and the Library.
APAT and meteorological services: history and current status

1998: the Department of National Technical Services launches the Idro-Meteo-Mare Project, in collaboration with ISAC-CNR and ENEA.

OBJECTIVES
Analysis and forecasting of the meteorological situation over the national territory and of the state of the Mediterranean Sea.
Real-time monitoring, production of analyses and forecasts of the fields of interest, assessment of hydro-meteorological phenomena and of the associated risks.
APAT and meteorological services: history and current status

MODELS USED
BOLAM (initialized on the ECMWF analyses)
WAM
POM
FEM
APAT and meteorological services: history and current status

Fundamental requirement:
execution of the models in SERVICE CONFIGURATION

ECMWF → BOLAM → WAM / POM / FEM ... since 2001
APAT and meteorological services: history and current status

Computing environment based on high-performance computers:
initially ..... APE100
now ........... ALTIX 350
The service-research dualism: criteria and architectures

[Architecture diagram: router, AlphaServer 4100, APE100, ECMWF link, Sun SPARC station, storage unit, Alpha workstation, Internet, building LAN, ADSL link to Venice for the Venice Lagoon service]
The service-research dualism: criteria and architectures

Now: low-level scripts, file systems, "ad hoc" processing.
Soon: open technologies: Linux, Apache, MySQL, PHP, Java.
The operational chain: the models

[Flow diagram: ECMWF → PRE-PROC. → H.R. BOLAM → NESTING → V.H.R. BOLAM → POST-PROC. → WAM / POM / VL-FEM; the arrows carry initial data, boundary data, m.s.l. pressure, sea-surface wind, wind stress, and sea elevation between the components]

The 3D meteorological model BOLAM runs at two different resolutions:
• High Resolution: 30 km grid spacing
• Very High Resolution: 10 km grid spacing

3 ocean models:
• WAM: a 2D model for the prediction of amplitude, frequency and direction of sea waves;
• POM: a shallow-water circulation model for the prediction of surface elevation and horizontal velocities;
• VL-FEM: a 2D high-resolution circulation model using finite elements to better describe the Venice Lagoon morphology.
The computational domain

POM: grid covering the whole Adriatic Sea with about 4000 points and a variable resolution, with grid size decreasing when approaching Venice (from 10 km to 1 km).
VL-FEM: mesh covering the whole Venice Lagoon with more than 7500 elements and a spatial resolution varying from 1 km to 40 m.
H.R. BOLAM: coarse grid with 160×98×40 points and 30 km resolution.
V.H.R. BOLAM: fine grid with 386×210×40 points and 10 km resolution.
WAM: grid covering the whole Mediterranean Sea with about 3000 points and 30 km resolution.
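As a quick consistency check with the cost figures on the next slide (simple arithmetic, not taken from the slides themselves):

```latex
\text{V.H.R.: } 386 \times 210 \times 40 = 3\,242\,400 \approx 3\cdot10^{6}\ \text{points};
\qquad
\text{H.R.: } 160 \times 98 \times 40 = 627\,200\ \text{points}
```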
The BOLAM computational cost

The operational requirement: 2 days of forecast in ~1 hour.

V.H.R. BOLAM:
~10^3 flop / grid point / time step
3·10^6 grid points
time step of 80 s
→ ~7 TFlop per 2-day forecast
→ ~2 GFlops sustained
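The arithmetic behind these figures can be reconstructed directly from the numbers above (a back-of-the-envelope check, not taken from the slides):

```latex
\underbrace{10^{3}\,\tfrac{\text{flop}}{\text{pt}\cdot\text{step}} \times 3\cdot10^{6}\,\text{pts}}_{\approx 3\,\text{GFlop/step}}
\times
\underbrace{\tfrac{2\times 86400\,\text{s}}{80\,\text{s/step}}}_{2160\,\text{steps}}
\approx 6.5\,\text{TFlop}\ (\sim 7\,\text{TFlop});
\qquad
\frac{6.5\cdot10^{12}\,\text{flop}}{3600\,\text{s}} \approx 1.8\,\text{GFlops}\ (\sim 2\,\text{GFlops sustained})
```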
The Meteorological Model BOLAM

GENERAL FEATURES
• A 3D primitive-equations model (momentum, mass continuity, energy conservation) in the hydrostatic limit
• Prognostic variables: U, V, T, Q, Ps

NUMERICAL SCHEME
• Finite-difference technique in time and space
• Advection: Forward-Backward Advection Scheme (FBAS), explicit, 2 time levels, centred in space (see the sketch after this list)
• Diffusion:
  - horizontal: 4th-order hyperdiffusion on U, V, T, Q; 2nd-order divergence damping on U, V
  - vertical: implicit scheme on U, V, T, Q

PHYSICS ROUTINES
• They only involve computations along the vertical direction.
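The slides name the scheme but not its equations. As a loose illustration of the explicit, two-time-level, forward-backward idea only, here is a minimal sketch on the 1D linearised shallow-water equations: a stand-in system, not BOLAM's primitive equations or its actual FBAS, and all grid and time-step values are hypothetical.

```fortran
! Minimal sketch of a two-time-level forward-backward update, centred
! in space, on the 1D linearised shallow-water equations. Illustrative
! only: BOLAM's actual FBAS operates on the 3D primitive equations.
program fb_sketch
  implicit none
  integer, parameter :: n = 200            ! hypothetical grid size
  real, parameter :: g = 9.81, hdep = 10.0 ! gravity (m/s^2), mean depth (m)
  real, parameter :: dx = 1.0e3, dt = 30.0 ! grid spacing (m), time step (s)
  real :: u(n), eta(n)
  integer :: i, step

  u = 0.0; eta = 0.0
  eta(n/2) = 1.0                            ! initial free-surface bump

  do step = 1, 100
     ! forward half: update u from the OLD eta
     do i = 2, n-1
        u(i) = u(i) - g*dt*(eta(i+1) - eta(i-1))/(2.0*dx)
     end do
     ! backward half: update eta from the NEW u
     do i = 2, n-1
        eta(i) = eta(i) - hdep*dt*(u(i+1) - u(i-1))/(2.0*dx)
     end do
  end do
  print *, 'max |eta| after 100 steps:', maxval(abs(eta))
end program fb_sketch
```

With these values the wave speed is sqrt(g·hdep) ≈ 9.9 m/s, so dt·c/dx ≈ 0.3 and the explicit scheme is stable; the forward-backward ordering is what allows a two-time-level explicit step here.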
Year 1997: the QBolam and APE100 choice

Quadrics QH1: 128 processors, 6.4 GFlops of peak, 512 MByte of memory
Server DEC 4100

[Operational chain diagram as above: PRE-PROC. → H.R. BOLAM → NESTING → V.H.R. BOLAM → POST-PROC. → WAM / POM / VL-FEM, hosted on the APE100 system]
General features
SIMD: Single Instruction Multiple Data
Topology: 3D cubic mesh
Module: 2 × 2 × 2 processors
Scalability: from 8 to 2048 processors
Connections: 3D first neighbours, periodic at the boundaries

Processor
MAD: Multiplier & Adder Device
Pipeline: 50 MFlops of peak
Memory: 4 MByte per processor (distributed)

Master Controller
Z-CPU: integer operations, memory addressing
The parallel code QBolam

[Diagram: the grid of a central PE with its frame; labels mark the boundary between PEs, a grid box, the physical subdomain of the central PE, the physical subdomains of the first-neighbouring and corner PEs, and the frames containing data from first-neighbouring and corner PEs]

Data Distribution Strategy: Static Domain Decomposition
• Number of subdomains = number of PEs
• Subdomains of the same shape and dimensions

Connection between subdomains using the Frame Method (see the sketch after this list)
• Boundary data of the neighbouring subdomains are copied into the frame of the local domain

Column Data Type Structure
• "Ad hoc" libraries for communications and arithmetical operations between columns

The BOLAM code was redesigned for the SIMD architecture and rewritten in the TAO language.
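Since the slides only name the method, here is a minimal serial sketch of the frame idea under assumed names and sizes (frame_sketch, nx, ny and the "remote" values are all hypothetical): the outermost rows and columns of the local array form the frame that receives the neighbours' boundary data, so the interior stencil never reads outside the subdomain.

```fortran
! Minimal sketch of the frame (halo) idea on one PE, serial and
! hypothetical: f(0,:), f(nx+1,:), f(:,0), f(:,ny+1) form the frame
! that receives the neighbours' boundary data before each stencil pass.
program frame_sketch
  implicit none
  integer, parameter :: nx = 8, ny = 8     ! hypothetical local subdomain
  real :: f(0:nx+1, 0:ny+1), lap(nx, ny)
  integer :: i, j

  f = 1.0                                   ! local interior data
  ! On APE100 these frame fills would be first-neighbour communications;
  ! here they are faked with constant "remote" boundary values.
  f(0, :)    = 2.0   ! from the west neighbour's easternmost column
  f(nx+1, :) = 2.0   ! from the east neighbour's westernmost column
  f(:, 0)    = 2.0   ! from the south neighbour's northernmost row
  f(:, ny+1) = 2.0   ! from the north neighbour's southernmost row

  ! With the frame filled, a 5-point stencil is safe over the interior.
  do j = 1, ny
     do i = 1, nx
        lap(i, j) = f(i+1,j) + f(i-1,j) + f(i,j+1) + f(i,j-1) - 4.0*f(i,j)
     end do
  end do
  print *, 'stencil sum:', sum(lap)
end program frame_sketch
```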
QBolam performance on Quadrics/APE100

machine type               QH1           QH1           QH4*
N. of processors           128           128           512
QBolam model               HR            VHR           VHR
resolution                 30 km         10 km         10 km
N. of ops. / time step     0.57 GFlop    2.90 GFlop    2.90 GFlop
time step                  240 s         80 s          80 s
execution time / time step 0.297 s       1.333 s       0.392 s
performance                1.92 GFlops   2.12 GFlops   7.21 GFlops
% of peak performance      30 %          33 %          28 %
days of simulation         2.5 days      2 days        2 days
elapsed time               8' 16''       1h 53' 35''   48' 25''

* Measurements performed on the Quadrics/APE100 QH4 of the ENEA Casaccia computing centre, Rome
The goal of the project is to replace the existing operational forecasting system with a new one in the near future:
- Simplify the operational chain: all models and interfaces will be executed on one machine only
- Parallel architecture upgrade: Linux cluster, Open Source
- Simulation model upgrade: the first result is the development of PBolam, a parallel meteorological code
Year 2004: PBolam and the Linux Cluster

SGI Altix 350:
4 dual-CPU nodes, 1.4 GHz Itanium II
44.8 GFlops of peak
8 GByte of memory (physically distributed)
SMP thanks to the NUMAflex technology (6.4 GByte/s): "single system image"
OpenMP, MPI

[Operational chain diagram as above, now hosted on the SGI Altix 350]
The parallel code PBolam

PBolam is a parallel version of the meteorological model BOLAM for distributed-memory architectures.

Portable: Fortran90, MPI, standard POSIX
Versatile: any number of processors, any number of grid points
Easy to maintain: same data-type structure and same variable/subroutine names as BOLAM

Parallelization strategy

Static Domain Decomposition
• Number of subdomains equal to the number of processes, but not fixed as in QBolam
• Parallelepiped subdomains, which may have different shapes and dimensions

Data Distribution Strategy
• All vertical levels on the same process
• Horizontal subdivision: (NLon/PLon) × (NLat/PLat) × NLev, where P = PLon × PLat is the number of processes; PLon and PLat are chosen to minimize communication time

Frame Method (see the MPI sketch below)
• Boundary data of the neighbouring subdomains are copied into the frame of the local domain: exchanges in the North-South and East-West directions
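The source only names this strategy, so here is a minimal, hypothetical MPI sketch of the frame exchange it describes (the program name and all sizes are assumptions, not PBolam code): each process holds a one-point frame around its subdomain and swaps boundary columns/rows with its east-west and north-south neighbours.

```fortran
! Hypothetical sketch of PBolam-style frame exchange with MPI: a
! PLon x PLat Cartesian grid of processes, each holding a local field
! with a one-point frame; east-west and north-south neighbours swap
! boundary columns/rows into the frames.
program frame_exchange
  use mpi
  implicit none
  integer, parameter :: nx = 64, ny = 48      ! hypothetical local sizes
  integer :: comm2d, ierr, rank, nprocs
  integer :: dims(2), west, east, south, north
  logical :: periods(2)
  integer :: coltype, rowtype
  real(8) :: f(0:nx+1, 0:ny+1)

  call MPI_Init(ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  ! Let MPI pick a PLon x PLat factorization; PBolam instead chooses
  ! the pair that minimizes communication time.
  dims = 0
  call MPI_Dims_create(nprocs, 2, dims, ierr)
  periods = .false.                            ! no periodic boundaries
  call MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, .true., comm2d, ierr)
  call MPI_Comm_rank(comm2d, rank, ierr)
  call MPI_Cart_shift(comm2d, 0, 1, west, east, ierr)
  call MPI_Cart_shift(comm2d, 1, 1, south, north, ierr)

  ! Derived types: a row (first index varies) is contiguous in Fortran;
  ! a column (second index varies) is strided by the padded extent nx+2.
  call MPI_Type_contiguous(nx, MPI_DOUBLE_PRECISION, rowtype, ierr)
  call MPI_Type_commit(rowtype, ierr)
  call MPI_Type_vector(ny, 1, nx+2, MPI_DOUBLE_PRECISION, coltype, ierr)
  call MPI_Type_commit(coltype, ierr)

  f = real(rank, 8)                            ! dummy local data

  ! East-West exchange: send own boundary column, receive into frame.
  call MPI_Sendrecv(f(nx,1), 1, coltype, east, 0, &
                    f(0,1),  1, coltype, west, 0, comm2d, MPI_STATUS_IGNORE, ierr)
  call MPI_Sendrecv(f(1,1),    1, coltype, west, 1, &
                    f(nx+1,1), 1, coltype, east, 1, comm2d, MPI_STATUS_IGNORE, ierr)
  ! North-South exchange: same with boundary rows.
  call MPI_Sendrecv(f(1,ny), 1, rowtype, north, 2, &
                    f(1,0),  1, rowtype, south, 2, comm2d, MPI_STATUS_IGNORE, ierr)
  call MPI_Sendrecv(f(1,1),    1, rowtype, south, 3, &
                    f(1,ny+1), 1, rowtype, north, 3, comm2d, MPI_STATUS_IGNORE, ierr)

  call MPI_Finalize(ierr)
end program frame_exchange
```

MPI_Dims_create is only a stand-in here: minimizing communication means minimizing the exchanged frame data, roughly proportional to NLon/PLon + NLat/PLat per process per step, which is what the measurements on the next slides explore.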
VHR PBolam performance on Altix

[Bar chart: execution time (s) of one time step for every process decomposition P_lon x P_lat with P from 2 to 8 (8x1, 4x2, 2x4, 1x8, 7x1, 1x7, 6x1, 3x2, 2x3, 1x6, 5x1, 1x5, 4x1, 2x2, 1x4, 3x1, 1x3, 2x1, 1x2), split into Step, Physics, and Comm. contributions]
Execution time vs. number of processes / data distribution:
• The execution time of every possible PLon × PLat = P combination, with P in [2,8], was measured.
• The execution time of one step decreases as P increases.
• The communication time is roughly constant as P increases.
• The execution time of the physics phase is roughly constant for a fixed P.
• For a fixed P, the execution time of one step is minimum when the communication time is also minimum.
VHR PBolam performance on Altix

[Left plot: speedup vs. number of processes (up to 10), measured SpeedUp against the Ideal line. Right plot, log scale: execution time (s) vs. number of processes for the best data distribution, showing the Step time, the Comm. time, and the percentage of communication]
• Communication time increases slowly as P increases, because the total amount of data involved in the exchange also grows slowly.
• Since the total execution time scales as 1/P, the share of communication increases from 1% to 10%.

This behaviour is also evident in the speedup curve, S = time(P = 1) / time(P), measured at each P for the best data distribution.
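A simple hedged model ties the two observations together: if the compute part of a step scales as T_1/P while the communication time T_c stays roughly constant, then

```latex
T(P) \approx \frac{T_1}{P} + T_c
\quad\Longrightarrow\quad
S(P) = \frac{T_1}{T(P)} \approx \frac{P}{1 + P\,T_c/T_1}
```

so a communication share growing from 1% to 10% is exactly what bends the measured curve away from the ideal line. This is a back-of-the-envelope model, not a fit presented in the talk.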
[Plot: MPI bandwidth (MBytes/s, 0 to 2000) vs. message size (4 Bytes to 1 MByte) for the QsNet, 2 x QsNet, QsNetII, 2 x QsNetII, and 4 x QsNet configurations]

MPI latency: 1.8 µs
MPI bandwidth: 900 MB/s
Elite4 switch, Elan4 NIC
AMD Cluster with QsNetII

8 dual-CPU nodes, 2.2 GHz Opteron
70.4 GFlops of peak performance
8 GByte of distributed memory
QsNetII interconnect
Altix vs. AMD Cluster

[Left plot, log scale: execution time (s) vs. number of processes (up to 16), showing the Altix Step, Altix Comm., Altix % of Comm., AMD Step, AMD Comm., and AMD % of Comm. curves. Right plot: speedup vs. number of processes for AMD and Altix against the Ideal line]
• Itanium is faster than Opteron (preliminary results show a factor of 1.5).
• QsNetII shows better communication performance.
• A smaller growth in the share of communication time means a better speedup curve, especially as the number of processes grows.
Conclusions

• The VHR execution time has been reduced with the Altix 350: PBolam performance (6.3 GFlops, 14% of peak) is 3 times the QBolam performance, and the elapsed time went from 100 min. to 20 min., including I/O.
• APAT now has a parallel meteorological code, PBolam, portable to several Linux clusters.
• In the near future, the whole forecasting chain will be simplified, because all models and interfaces will be executed on one machine only.
• The software architecture is now more suitable for performing both research and service activities.