Hadoop-as-a-Service (HDaaS)

Presentation transcript:

Hadoop-as-a-Service (HDaaS): Flexible and scalable reference architecture
Lena Frank, Systems Engineer @ EMC
Marius Lohr, Systems Engineer @ EMC

Case study: CIO of a DAX-listed company
Classic IT services:
New IT services:

The opportunities
New business areas; risk minimization; improvement of the operational business; revenue growth.

The challenges
Cost pressure relative to cloud providers; lack of knowledge about Hadoop infrastructures; fast deployment; requirements and workloads of multiple tenants; high availability and data security.

Classic Hadoop architecture
[Diagram: the Hadoop ecosystem tools Sqoop, Pig, Mahout, Hive, and HBase on top of the core services NameNode, Secondary NameNode, JobTracker, TaskTracker, and DataNode. A NameNode and several combined data node + compute node servers are connected via Ethernet.]

Classic Hadoop architecture
Dedicated server environment with local storage: hardware and capacity are intended for Hadoop data only.
Efficiency: poor CPU utilization, because the environment is sized for peak loads; 3-way replication (300% raw capacity) imposed by the Hadoop architecture.
Scaling options: fixed ratio of compute nodes to data nodes.
Enterprise-class services: no data protection concepts such as snapshots, replication, or backup; no logical separation of tenants.
Speaker notes: One challenge associated with traditional deployments of Hadoop is that they have largely been done on dedicated infrastructure that is neither integrated with nor connected to any other applications. In effect, this is a siloed environment, often outside the realm of the IT team. This poses a number of inefficiencies and risks.
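The "300% raw capacity" figure follows directly from storing every HDFS block three times. A minimal sketch of that arithmetic, assuming nothing beyond the replication factor; the function names and the 100 TB example are illustrative and not part of the presentation:

```python
def raw_capacity_tb(usable_tb: float, replication_factor: int = 3) -> float:
    """Raw disk capacity needed when every block is stored replication_factor times."""
    if usable_tb < 0 or replication_factor < 1:
        raise ValueError("usable_tb must be >= 0 and replication_factor >= 1")
    return usable_tb * replication_factor


def storage_efficiency(replication_factor: int = 3) -> float:
    """Fraction of raw capacity that holds unique data."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return 1 / replication_factor


if __name__ == "__main__":
    # Illustrative input: 100 TB of data with 3-way replication.
    print(raw_capacity_tb(100))            # raw terabytes required
    print(f"{storage_efficiency():.0%}")   # share of raw capacity holding unique data
```

With a replication factor of 3, only about one third of the purchased disk capacity holds unique data, which is the efficiency concern the slide raises.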

Hadoop architecture with consolidated HDFS storage
[Diagram: the Hadoop ecosystem tools Sqoop, Pig, Mahout, Hive, and HBase on top of the core services NameNode, JobTracker, TaskTracker, and DataNode. Pure compute nodes are connected via Ethernet to a shared HDFS storage system that provides the name node and data node functions.]

Fast deployment of Hadoop clusters in virtual environments
Project Serengeti: an open-source project for fast deployment of Hadoop clusters in virtual environments.
[Diagram: a management server and vCenter deploy Hadoop node VMs from Hadoop node templates onto several hosts running vSphere + Serengeti.]

Hadoop-as-a-Service reference architecture
[Diagram: a self-service portal sits on top of Serengeti, orchestration & chargeback, and user management. Virtual layer: Hadoop compute nodes managed by vCenter. Physical layer: HDFS storage providing name node and data node functions, with infrastructure management.]

HDaaS workflow
[Diagram: a data scientist, the self-service portal, the orchestrator, Serengeti, user/tenant management (AD), and shared HDFS storage (HDFS/REST API) interact to deliver a Hadoop cluster (Pivotal HD master and workers). Steps: 1: Request; 2: Validate; 3: Invoke; 4a: Provision Storage; 4b: Provision Compute; 5: Instantiate; 6: Notify; 7: Access and Analyze.]
Speaker notes: How the environment works. A data scientist has a new workflow task:
1. They request a new cluster resource through the portal web page.
2. The vCAC service broker asks the authentication source whether the user making the request is allowed to proceed.
3. vCAC sends the request on to vCOPS to instantiate all the calls to storage, BDE, and resource profiles, making sure the cluster can be configured in the existing environment.
4. If so, the resources are provisioned.
5. BDE configures the Hadoop environment(s).
6. Notification is passed back through vCOPS and surfaced to the user.
7. The user can now access the new environment and run their jobs.
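The seven workflow steps can be read as a simple ordered pipeline with one gate: validation. The sketch below only models that ordering. It does not call vCAC, BDE/Serengeti, or any storage API; the `ClusterRequest` class, the `run_workflow` function, and the user names are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ClusterRequest:
    """Hypothetical request record; not an actual portal or vCAC data structure."""
    user: str
    tenant: str
    compute_nodes: int
    log: List[str] = field(default_factory=list)


def run_workflow(request: ClusterRequest, authorized_users: Set[str]) -> bool:
    """Walk through the slide's steps in order; stop if validation fails."""
    request.log.append("1: request")
    if request.user not in authorized_users:
        request.log.append("2: validate (rejected)")
        return False
    request.log.append("2: validate")
    # The remaining steps are placeholders for calls into real systems.
    for step in (
        "3: invoke orchestrator",
        "4a: provision storage",
        "4b: provision compute",
        "5: instantiate cluster",
        "6: notify",
        "7: access and analyze",
    ):
        request.log.append(step)
    return True


if __name__ == "__main__":
    req = ClusterRequest(user="alice", tenant="analytics", compute_nodes=4)
    ok = run_workflow(req, authorized_users={"alice"})
    print(ok)
    print("\n".join(req.log))
```

The point of the sketch is that nothing is provisioned before the authentication source has approved the request, and that the user is notified only after the cluster has been instantiated.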

Benefits of a decoupled and virtualized Hadoop infrastructure
Independent scaling of the infrastructure: compute nodes and data nodes can be expanded independently of each other.
Better utilization of the IT infrastructure: >80% storage utilization and improved CPU utilization; non-Hadoop applications can run parallel workloads on the same hardware.
Automated provisioning and simple management: consolidated HDFS storage; compute templates as the basis for fast deployment.
Tenant separation: logical separation of data access; logical separation of the compute nodes.
Additional data protection: snapshots, replication, backup.
[Diagram: Hadoop-as-a-Service reference architecture, in which a data scientist accesses virtualized Hadoop clusters that use shared HDFS storage.]
Speaker notes: EMC Isilon has recently introduced a new scale-out NAS solution for Hadoop that is designed to readily support business analytics as well as other enterprise applications and workflows. (This eliminates the siloed infrastructure approach used in many initial Hadoop deployments.) The new EMC solution also eliminates the "single point of failure" issue. We do this by enabling all nodes in an EMC Isilon storage cluster to become, in effect, name nodes. This greatly improves the resiliency of your Hadoop environment. The EMC solution for Hadoop also provides reliable, end-to-end data protection for Hadoop data, including snapshotting for backup and recovery and data replication (with SyncIQ) for disaster recovery. Our new Hadoop solution also takes advantage of the outstanding efficiency of EMC Isilon storage systems. With our solutions, customers can achieve 80% or more storage utilization. EMC Hadoop solutions can also scale easily and independently. This means that if you need to add more storage capacity, you don't need to add another server (and vice versa). With EMC Isilon, you also get the added benefit of linear increases in performance as the scale increases.
EMC also recently announced that we are the first vendor to integrate HDFS (Hadoop Distributed File System) into our storage solutions. This means that with EMC Isilon storage, you can readily use your Hadoop data with other enterprise applications and workloads while eliminating the need to manually move data around, as you would with direct-attached storage.
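To make the independent-scaling argument concrete, the following sketch compares how many machines a sizing exercise yields when compute and storage are coupled versus decoupled. It is a rough model under stated assumptions: the coupled case uses 3-way replication, the decoupled case treats the ">80% storage utilization" claim as a usable-to-raw ratio of 0.8, and all node sizes and workload figures are made-up inputs rather than values from the presentation.

```python
import math
from typing import Tuple


def coupled_nodes(cores_needed: int, usable_tb: float,
                  cores_per_node: int, raw_tb_per_node: float,
                  replication: int = 3) -> int:
    """Servers needed when each server supplies both CPU and local disks."""
    by_compute = math.ceil(cores_needed / cores_per_node)
    by_storage = math.ceil(usable_tb * replication / raw_tb_per_node)
    # The larger requirement dictates the cluster size; the other resource idles.
    return max(by_compute, by_storage)


def decoupled_nodes(cores_needed: int, usable_tb: float,
                    cores_per_node: int, raw_tb_per_storage_node: float,
                    utilization: float = 0.8) -> Tuple[int, int]:
    """(compute nodes, storage nodes) when both tiers are sized separately."""
    compute = math.ceil(cores_needed / cores_per_node)
    storage = math.ceil(usable_tb / utilization / raw_tb_per_storage_node)
    return compute, storage


if __name__ == "__main__":
    # Hypothetical workload: 64 cores and 400 TB of data.
    print(coupled_nodes(64, 400, cores_per_node=16, raw_tb_per_node=48))
    print(decoupled_nodes(64, 400, cores_per_node=16, raw_tb_per_storage_node=120))
```

In the coupled case a storage-heavy workload forces the purchase of servers whose CPUs are not needed, which is the rigid compute-to-data ratio that the decoupled design avoids.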

EMC Scale-Out Data Lake Foundation
[Diagram: traditional workloads (file shares, HPC, backup/archive) and next-gen workloads (analytics, mobile, cloud apps), shown alongside the storage types NAS, DAS, SAN, cloud, tape, and object.]
© Copyright 2014 EMC Corporation. All rights reserved.

EMC Scale-Out Data Lake Foundation
[Diagram: the same traditional and next-gen workloads and the storage types DAS, NAS, cloud, SAN, tape, and object, now shown together with the Data Lake Foundation.]

Next-Gen Access Methods
[Diagram: the workloads file shares, analytics, HPC, mobile, backup/archive, and cloud apps reach the data lake through the access methods SMB, NFS, FTP, HDFS, REST, SWIFT, NDMP, and HTTP.]
Speaker notes: Key message: this is not a placid data lake where data goes in and just sits. It is an active and vibrant data lake that supports multiple protocols and access methods. When a file moves from a file share into this data lake, it can be actively used and leveraged by all the applications. For example, the files in this unstructured data lake can be cross-correlated from multiple sources to run Hadoop big data analytics and gain valuable business insights. The same set of files can be accessed with Syncplicity on your iPhone or iPad for access on the go. In addition to the business benefits of consolidating unstructured data, the Isilon scale-out data lake also eliminates silos of storage, simplifies management, and scales massively to meet the demands of unstructured data growth.
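As an illustration of REST-style access to HDFS data, the sketch below builds request URLs in the form used by Apache Hadoop's WebHDFS REST API (`/webhdfs/v1/<path>?op=...`). Whether and on which port a given storage system exposes WebHDFS is an assumption here; the host name, port, path, and user name are hypothetical, and the code only constructs URLs without contacting any server.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str,
                params: Optional[Dict[str, str]] = None) -> str:
    """Build a WebHDFS-style URL for an absolute HDFS path and an operation."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    query = {"op": op}
    if params:
        query.update(params)
    # quote() keeps '/' intact and escapes characters such as spaces.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical endpoint and paths, for illustration only.
    print(webhdfs_url("datalake.example.com", 8082, "/data/logs", "LISTSTATUS"))
    print(webhdfs_url("datalake.example.com", 8082, "/data/logs/app.log", "OPEN",
                      {"user.name": "alice"}))
```

The same files could equally be read over SMB or NFS; the REST form is shown only because it is easy to express in a few lines.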

Expanded Enterprise-Grade Features
[Diagram: the Isilon Data Lake Foundation surrounded by four feature areas: data protection, data security, data management, and performance management.]
Speaker notes: Further, this is a fully enterprise-ready scale-out data lake. We have a vast array of enterprise-grade features spanning data protection, data security, data management, and performance management that ensure the data lake stores, protects, secures, and manages your data with ease. New features include:
A new protection policy for higher-capacity nodes
A protection optimizer that constantly monitors protection and alerts when it drops below the suggested level
NFS audit
NFS multi-tenancy (to complement SMB and Hadoop capabilities)
New InsightIQ for better reporting and tracking
SmartFlash on all archive platforms for increased system performance
© Copyright 2015 EMC Corporation. All rights reserved.

Do you have any questions?