Introduction to Grid computing Bohol Philippines Kilian Schwarz, Florian Uhlig GSI.

1 Introduction to Grid computing Bohol Philippines Kilian Schwarz, Florian Uhlig GSI

2 table of contents ● Physics motivation to Grid computing ● The Grid: what is it all about ? ● History of Grid computing ● from research to deployment: EU projects ● Grid components: example LCG ● Grid Security

3 Dr. Rüdiger Berlich FZK, Karlsruhe Dr. Kilian Schwarz GSI, Darmstadt Physics Motivation behind Grid Computing Duration: ca. 1.5 h

4 Physics Motivation: Explaining the Universe

5 13.7 billion years ago (or so they say)

6 The Big Bang [Figure labels: ≈ 1 km; quark-hadron transition]

7 13,699,999,638 years later: The universe is back on track! Sir Isaac Newton (1642 - 1727) is born and devises the physical laws of mechanics. It appears as if the world is explicable. The universe is regarded as a static entity with an infinite lifetime. Deduction from observation. Manual data collection and analysis. Data = O(KByte)

8 1874: The electron is introduced by Stoney and Helmholtz as the “atom of electricity”. 1895: Wilhelm Röntgen discovers X-rays (“Röntgenstrahlen”), one of the first signs of physics beyond Newton's world view. 1896: Henri Becquerel discovers radioactivity. 1898: Marie and Pierre Curie extract pure radioactive elements for the first time. [Portraits: Henri Becquerel, Wilhelm Röntgen, Marie Curie]

9 Foundation of modern particle physics: quantum mechanics and theory of relativity. 1900: Max Planck introduces the “quantisation” of electromagnetic radiation. Light has properties of both wave and particle. Only very few scientists took Max Planck seriously. One of them was Albert Einstein. 1905: Albert Einstein (1902 – 1907 assessor at the Bern patent office): special theory of relativity (a special case of the general theory of relativity, 1916). Equivalence of mass and energy, E = mc². Mass can be transformed into other forms of energy! The speed of light is the same in all reference frames. [Portraits: Max Planck, Albert Einstein]

10 13,699,999,921 years after the Big Bang (1925): Edwin Hubble discovers that most galaxies in the universe move away from each other. The universe is expanding! Or, in other words: it must have started from a single origin. The age of the universe can be calculated from the expansion speed. Modern particle physics can provide a good estimate of the age of the universe and of what has happened after the Big Bang. [Portrait: Edwin Hubble]

11 The particle “zoo”. Now known: electrons, protons and neutrons. Many more particles have since been discovered, some being “real” elementary particles, some being assembled from others. The discovery of the positron makes clear: every particle has an antiparticle with opposite characteristics. (Elementary) particles can be ordered according to their characteristics. Goal: development of a “theory of everything”. Discovery of new particles with higher masses requires higher energies. Systematic searches (“scans”) cannot rely on bubble chambers. -> Era of storage rings, linear accelerators and modern particle physics experiments. The quark idea

12 Particle Physics: Accelerators and Detectors Going beyond 1 PByte

13 Particle Collisions: energy becomes matter (E = mc²); new stable and unstable particles. Which particles? Where? What happens?
● Electron – Positron: CERN / LEP and SLAC / PEP-2: e+ e- annihilation
● Proton – Antiproton: Fermilab / Tevatron: quark-antiquark annihilation
● Proton – Proton: CERN / LHC: gluon-gluon fusion

14 Working Principles of Particle Detectors. Magnetic field. In modern particle detectors, sub-detectors are arranged in the form of “onion skins” (goal: 4π solid angle detection; example: ALICE experiment).

15 LEP and LHC [Figure labels: Geneva, Jura, Lake Geneva]

16 LEP (end 2000) -> LHC (start 2009)

17 ATLAS in Comparison to the CERN LHC Building

18 LHC / ALICE. Automatic data collection. Data = O(1 PByte = 10^15 Byte). ALICE: 10^9 files / year, 1 file <= 2 GByte. Distributed storage. But: trivial to run in parallel! [Figure labels: BaBar, ALICE]

19 ALICE Setup [Figure labels: Time Projection Chamber, Inner Tracking System, L3 Magnet]. Expected lifetime of the experiment: 25+ years...

20 Europe: 267 institutes, 4603 users. Other: 208 institutes, 1632 users. Over 6000 LHC scientists worldwide. [Figure: world map with institute and user counts per region]

21 Gesellschaft für Schwerionenforschung (GSI). Home of the future project “FAIR” - the largest German accelerator facility for the foreseeable future

22 So we have at LHC (and GSI/FAIR): 10 times 10^15 bytes per year that need to be stored, accessed and analysed (and that's just a fraction of the data that is actually being produced by the detector...). Experiment lifetimes of 25+ years. 6000 physicists who want transparent access to all the data, regardless of physical location. Help... *) *)... but don't worry. We are physicists, we can do this!
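To get a feeling for these numbers, the yearly volume can be turned into a sustained data rate. The following is a minimal back-of-the-envelope sketch in Python; the function names are made up for this illustration, and the calculation assumes, unrealistically, that data arrives evenly over a full 365-day year.

```python
# Back-of-the-envelope figures for the data volumes quoted on the slides.
# Assumption: data arrives evenly over a 365-day year (real accelerators
# do not run continuously, so peak rates are higher).

SECONDS_PER_YEAR = 365 * 24 * 3600


def sustained_rate_mb_per_s(bytes_per_year: float) -> float:
    """Average rate in MByte/s needed to move bytes_per_year within one year."""
    return bytes_per_year / SECONDS_PER_YEAR / 1e6


def mean_file_size_mb(bytes_per_year: float, files_per_year: float) -> float:
    """Average file size in MByte for a given yearly volume and file count."""
    return bytes_per_year / files_per_year / 1e6


if __name__ == "__main__":
    lhc_bytes_per_year = 10 * 10**15      # "10 times 10^15 bytes per year"
    alice_bytes_per_year = 10**15         # slide 18: O(1 PByte)
    alice_files_per_year = 10**9          # slide 18: 10^9 files / year

    print(f"LHC sustained rate: {sustained_rate_mb_per_s(lhc_bytes_per_year):.0f} MByte/s")
    print(f"ALICE mean file size: "
          f"{mean_file_size_mb(alice_bytes_per_year, alice_files_per_year):.0f} MByte")
```

Under these assumptions the LHC figure corresponds to a little over 300 MByte/s, every second, for 25+ years, which is why storage and access have to be distributed.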

23 New Opportunities Demand New Technology. “The Grid - Blueprint for a new Computing Infrastructure”. Ian Foster and Carl Kesselman were instrumental in the design of the Globus middleware. Jointly they can be considered to be the fathers of “the Grid”. [Portraits: Kesselman, Foster]

24 The Grid ● what is it all about ?

25 New Opportunities Demand New Technology ● Foster and Kesselman – the vision: – Computing power from a “plug in the wall” – Origin of the term “Grid Computing” – Analogy: “electrical power grid” – Transparent exchange of computing power – “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (Gilder Technology Report, June 2001) ● Ian Foster: “Grid Computing is resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”






31 Virtual Organisations [Figure labels: collaborating computer centres; ALICE VO; CMS or ATLAS or LHCb VOs; LCG]

32 Grid Computing - Clustering Clusters

33 LHC Computing Model. The computing requirements cannot be satisfied at a single site: – need to exploit geographically distributed resources – utilise established computing expertise and infrastructure – tap into funding sources not usually available to particle physics. LHC Computing Model: – hierarchical structure with CERN at the top tier – large national or intra-national centres – national or institutional centres – desktop. Close integration of computing resources at external institutions with those at the accelerator laboratory has not been attempted -> very loose coupling! The challenge now is – to build a network infrastructure capable of handling petabytes of data – to build a software infrastructure (“middleware”) that makes the use of these resources transparent for the user – to educate users so that they actually use the Grid

34 LHC Computing Model. Tier0: raw data storage (large capacity); first calibration; reconstruction. Tier1 (“national” or “supranational”): further calibrations; reconstruction; analysis; Monte-Carlo data generation; data distribution; large storage capacity. Tier2 and “lower” (“national” or “intranational”): balance of simulation and analysis; as important as Tier1s. Tier3: “institutional”. Tier4: end-user workstations.
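The tier structure described above is essentially a tree in which every site sits below a site of a lower tier number. A minimal Python sketch of that data structure follows; the `Site` class and the site names below Tier1 are hypothetical and only illustrate the hierarchy, they do not describe any real deployment.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Site:
    """One node of the tier hierarchy (illustrative model, not an LCG API)."""
    name: str
    tier: int
    roles: Tuple[str, ...]
    children: List["Site"] = field(default_factory=list)

    def add(self, child: "Site") -> "Site":
        # A child must belong to a "lower" level, i.e. a larger tier number.
        if child.tier <= self.tier:
            raise ValueError(f"{child.name} (tier {child.tier}) cannot sit "
                             f"below {self.name} (tier {self.tier})")
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Site"]]:
        """Yield (depth, site) pairs, parents before their children."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def build_example() -> Site:
    """Build a small tree; every name except CERN and GridKa@FZK is made up."""
    cern = Site("CERN", 0, ("raw data storage", "first calibration", "reconstruction"))
    tier1 = cern.add(Site("GridKa@FZK", 1, ("reconstruction", "analysis", "data distribution")))
    tier2 = tier1.add(Site("example-tier2", 2, ("simulation", "analysis")))
    tier3 = tier2.add(Site("example-institute", 3, ("institutional cluster",)))
    tier3.add(Site("example-workstation", 4, ("end-user workstation",)))
    return cern


if __name__ == "__main__":
    for depth, site in build_example().walk():
        print("  " * depth + f"Tier{site.tier} {site.name}: {', '.join(site.roles)}")
```

Running the module prints the tree top-down, one indented line per site, which mirrors the Tier0 to Tier4 listing on the slide.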

35 A Tier-1 example: GridKa@FZK

36 Grid History. Kilian Schwarz, Rüdiger Berlich

37 In historical times... ● there were already “grids”. Back then they were called “fishing nets”. And even then, sensible things were done with them...

38 ARPANET ● 1969 – the ARPANET (Advanced Research Projects Agency Network, US Department of Defense) was built with 50 kbit/s links. ● “Resource Sharing Computer Network” using telephone lines and NCP (Network Control Program, already with telnet and FTP) ● 1973 – first international connections to the ARPANET

39 Basic technologies ● 1973 – first Ethernet demo by Xerox ● 1975 – TCP (Transmission Control Protocol) over satellite ● 1976 – Cray-1 at LANL (8 MB RAM, 133 Mflops) ● 1981 – RPC (Remote Procedure Call) ● 1982 – “Internet” defined as an interconnected network based on the TCP/IP protocol ● 1984 – DNS (Domain Name System) defined

40 Supercomputers and the Internet age ● 1988 – 60,000 computers on the Internet ● 1988 – NSFNET backbone at 1.5 Mbit/s ● 1988 – IRC (Internet Relay Chat) ● 1988 – Cray Y-MP (2 GB RAM, 2.67 Gflops) ● 1988 – start of the Condor project

41 Condor ● A specialised workload management system. Jobs are placed in a queue; Condor chooses where and when the jobs should run, monitors their progress and informs the user. ● Condor ClassAds: machines advertise their capabilities via “classified adverts”, which Condor compares with the users' requirements. In addition to requirements, preferences can also be specified.

42 ClassAds & JDL. An example: JDL file to run an ALICE simulation job:
    Requirements = ( other.Type == "machine" ) && ( member(other.Packages, "AliRoot") );
    Packages = "AliRoot";
    Arguments = "--round 2002-02 --run 00071 --event 269 --version v3.08.02 --grun G+F";
    Executable = "/Alice/bin/AliRoot.sh";
    InputFile = { "LF:/alice/simulation/2002-02/v3.08.02/00071/Config.C", "LF:/alice/simulation/2002-02/v3.08.02/00071/grun.C" };
    Type = "Job";
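The matchmaking idea behind ClassAds and the JDL example can be sketched in a few lines of Python. This is a deliberately simplified, one-sided illustration and not Condor's implementation: real ClassAds are evaluated symmetrically (machines can state requirements too), and the machine names, the `FreeCpus` attribute and the `Rank` preference are assumptions made up for this sketch.

```python
# Simplified one-sided matchmaking in the spirit of ClassAds.
# Ads are plain dictionaries; "Requirements" and "Rank" are Python callables
# that receive the other party's ad (called "other" in the JDL example).


def member(collection, item):
    """Rough equivalent of the member() function used in the JDL example."""
    return item in collection


job_ad = {
    "Type": "Job",
    "Executable": "/Alice/bin/AliRoot.sh",
    "Packages": "AliRoot",
    # Requirements = (other.Type == "machine") && member(other.Packages, "AliRoot")
    "Requirements": lambda other: (
        other.get("Type") == "machine"
        and member(other.get("Packages", []), "AliRoot")
    ),
    # Hypothetical preference: favour machines with more free CPUs.
    "Rank": lambda other: other.get("FreeCpus", 0),
}

# Hypothetical machine adverts.
machine_ads = [
    {"Name": "node-a", "Type": "machine", "Packages": ["AliRoot", "ROOT"], "FreeCpus": 2},
    {"Name": "node-b", "Type": "machine", "Packages": ["ROOT"], "FreeCpus": 8},
    {"Name": "node-c", "Type": "machine", "Packages": ["AliRoot"], "FreeCpus": 16},
    {"Name": "store-1", "Type": "storage", "Packages": ["AliRoot"], "FreeCpus": 0},
]


def match(job, machines):
    """Return the machines satisfying the job's requirements, best rank first."""
    candidates = [m for m in machines if job["Requirements"](m)]
    return sorted(candidates, key=job["Rank"], reverse=True)


if __name__ == "__main__":
    for machine in match(job_ad, machine_ads):
        print(machine["Name"], machine["FreeCpus"])
```

Requirements act as a hard filter, while the preference only orders the machines that passed the filter, which is the distinction the Condor slide draws between requirements and preferences.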

43 1994 - 1997 ● 1995 – Netscape IPO, the third-largest in NASDAQ history ● 1995 – Java launched by SUN ● 1995 – I-Way (Information Wide Area Year), a predecessor project of Globus, already with I. Foster and C. Kesselman ● 1996 – SETI@home ● 1997 – Globus in development phase ● 1997 – start of the UNICORE project (BMBF)

44 Uniform Interface to Computing Resources (UNICORE) ● A BMBF project ● Goal: to connect the supercomputer centres and to provide uniform access to them using existing technologies ● Written in Java (portability) ● Offers middleware functionality and a portal (GUI) ● Offers job preparation, monitoring and control, complex job dependencies (workflows), file management, certificate support; resource broker... soon

45 1997 - 1998 ● 1997 – Condor deployed at NCSA ● 1997 – “Building a computational Grid” workshop at Argonne National Lab (ANL) ● 1997 – Storage Resource Broker (SRB) (San Diego Supercomputer Center, SDSC) ● 1998 – Foster/Kesselman: “The Grid Book” ● 1998 – XML v1.0

46 www.globus.org Oct. 1998: Globus v1.0.0 ● Not a complete Grid, but the leading Grid toolkit (most Grid projects are based on Globus v2.x) ● ANL, University of Chicago, University of Edinburgh, ..., IBM, Microsoft, ... ● Defines and implements standards for – Grid security (GSI) – data access and transfer (GASS and GridFTP) – resource management and usage (GIS and GRAM) – job execution on remote clusters (globusrun)



49 www.globus.org Globus Security Infrastructure ● GSI is very widely used and has effectively become the standard. ● Based on the freely available SSLeay package; uses X.509 certificates, which rely on PKI (Public Key Infrastructure) ● Enables “single sign-on” on the Grid: the user's identity is established by a single certificate (so no repeated logins to different resources with different passwords) ● (more details in a separate section of the lecture)

50 1999 – 2001: big projects in the starting blocks ● 1999 Grid Forum 1 ● 2000 Eurogrid starts ● 2000 SUN Grid Engine ● 2000 NASA IPG (Information Power Grid), NASA's high performance computing and data grid ● 2001 start of the EDG project ● 2001 start of AliEn ● 2001 Global Grid Forum 1 (American Grid Forum + Asia Pacific + European Grid Forum eGRID)

51 Geography: HPC centres ● FZ Jülich (D, Admin. Coord.) ● CSCS Manno (CH) ● ICM Warsaw (PL) ● IDRIS Paris (F) ● Univ Bergen (N) ● Univ Manchester (UK)

52 The EGEE Project: From Grid Research to Grid Deployment. EGEE is funded by the European Union under contract IST-2003-508833. Dr. Rüdiger Berlich, Forschungszentrum Karlsruhe, ruediger@berlich.de, with special thanks to Marcus Hardt, Dr. Ulrich Schwickerath, Dr. Andreas Heiss / FZK; National e-Science Institute / Edinburgh; Dr. Kilian Schwarz / GSI, Darmstadt

53 Grid components – Globus2. Basic functionality: submit job -> like a simple batch submission system. Does not contain a Resource Broker (see EDG). The Globus Toolkit includes, among other components: – GSI (Grid Security Infrastructure): authentication – GASS (Global Access to Secondary Storage). Uses RSL (Resource Specification Language) to specify resources (e.g. minimum memory size, OS, etc.). When Grid computing becomes more mature, more services will migrate from the middleware into the operating system. Needed for seamless integration!
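To illustrate what a resource specification of this kind looks like, the sketch below renders a Python dictionary as an RSL-style conjunction of `(attribute=value)` relations. It is an assumption-laden illustration, not part of the Globus Toolkit: the helper names are made up, the attribute names are only examples, and the quoting rule is simplified.

```python
# Render a dictionary as an RSL-style string such as
#   &(executable=/bin/hostname)(count=1)(minMemory=512)
# Illustrative only: attribute names and quoting are simplified assumptions.

SPECIAL_CHARS = set("()\"'&|=<>!")


def quote(value) -> str:
    """Wrap a value in double quotes if it contains whitespace or special characters."""
    text = str(value)
    if not text or any(ch.isspace() or ch in SPECIAL_CHARS for ch in text):
        # Assumed escaping rule: a double quote inside a quoted value is doubled.
        return '"' + text.replace('"', '""') + '"'
    return text


def to_rsl(attributes: dict) -> str:
    """Join all attributes into one conjunction ('&' means all must hold)."""
    relations = []
    for key, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(item) for item in value)
        else:
            rendered = quote(value)
        relations.append(f"({key}={rendered})")
    return "&" + "".join(relations)


if __name__ == "__main__":
    # Hypothetical job: one process, at least 512 MB of memory.
    print(to_rsl({
        "executable": "/bin/hostname",
        "count": 1,
        "minMemory": 512,
    }))
```

A string like this only describes the job and its minimum resource needs; choosing a suitable site is left to the user, which is why the slide points out that Globus2 itself has no Resource Broker.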



56 Condor ● A specialised workload management system. Jobs are placed in a queue; Condor chooses where and when the jobs should run, monitors their progress and informs the user. ● Condor ClassAds: machines advertise their capabilities via “classified adverts”, which Condor compares with the users' requirements. In addition to requirements, preferences can also be specified.

57 ClassAds & JDL. An example: JDL file to run an ALICE simulation job:
    Requirements = ( other.Type == "machine" ) && ( member(other.Packages, "AliRoot") );
    Packages = "AliRoot";
    Arguments = "--round 2002-02 --run 00071 --event 269 --version v3.08.02 --grun G+F";
    Executable = "/Alice/bin/AliRoot.sh";
    InputFile = { "LF:/alice/simulation/2002-02/v3.08.02/00071/Config.C", "LF:/alice/simulation/2002-02/v3.08.02/00071/grun.C" };
    Type = "Job";

58 Grid components – European DataGrid. Joint three-year project of the European Union. Built on the Globus middleware. Goal: development of methods for the transparent distribution of data and programs. Needed in particle physics, biology (genome project), earth observation... 21 members, 15 compute centres (2-32 CPUs, up to 1 Terabyte of mass storage). LCG is based on EDG2 and accesses resources with thousands of CPUs, e.g. GridKa at Forschungszentrum Karlsruhe. Major new component: resource broker. The project was finished in March 2004. Successor is EGEE (“Enabling Grids for eScience in Europe”, see below)

59 Grid components – European DataGrid

60 The AliEn middleware – Grid and Open Source. Pure Open Source project, started as part of the ALICE collaboration (CERN). Small development team (very different from EDG). Pragmatic approach (what do we have, how can we make it work). 3 million lines of code (cf. Linux kernel: ca. 5.5 million LOC). 99 % of the code taken from publicly available packages, mostly Perl. Only about 1 % of the code had to be developed in addition. Similar functionality to the EDG framework. Based on Web Services (SOAP, XML). Used in other projects, e.g. MammoGrid (UK), a breast cancer database. See http://alien.cern.ch

61 Grid projects http://www.cordis.lu/ist/grids/projects.htm Many brilliant people with many brilliant (but incompatible) ideas

62 EGEE manifesto: Enabling Grids for E-science in Europe. From Grid Research to Grid Deployment. Goal: create a Europe-wide, production-quality Grid infrastructure on top of present and future EU RN infrastructure. Build on: EU and EU member states' major investments in Grid technology; international connections; several pioneering prototype results; large Grid development teams in the EU. Requires a major EU funding effort. Approach: leverage current and planned national and regional Grid programmes; work closely with relevant industrial Grid developers, NRENs and US-AP projects. [Figure labels: EGEE applications; Grid infrastructure; Geant research network]

63 EGEE Implementation ● From day 1 (1st April 2004): production grid service based on the LCG infrastructure running LCG-2 grid middleware. LCG-2 = EDG2 + additional components. LCG-2 will be maintained until the new generation has proven itself (fallback solution) ● In parallel, develop a “next generation” grid facility, “gLite”: produce a new set of grid services according to evolving standards (Web Services); run a development service providing early access for evaluation purposes; will replace LCG-2 on the production facility in 2005. [Figure labels: Globus 2 based; Web services based; EDG, VDT, ...; LCG-1, LCG-2; AliEn, ...; EGEE-1, EGEE-2]

64 EGEE activities: relative sizes. Networking (NA1-5): 28%. Middleware/security/QA (JRA1-4): 24%. Grid operations (SA1,2): 48%. Emphasis in EGEE is on operating a production grid and supporting the end-users. EGEE: duration 2 years; partners 70; deliverables 80; EU funding 32M Euro

65 Summary / Conclusion











76 Grid Security ● how do we make Grids secure?





















