Towards a Web-scale Data Management Ecosystem Demonstrated by SAP HANA


Stefan Bäuerle, Jonathan Dees, Franz Faerber, Wolfgang Lehner

Agenda
  Motivation & Requirements
  Different Processing Engines and Integration
  Scale-out edition engine

Application requirements for a modern DBMS
  Different:
    data types
    consumption models
    data models
    notions of consistency
    application and query languages
    levels of scaling
    hardware capabilities

HANA Platform

HANA System

Beyond relational data processing (1/3)
  Integrate as deep as possible into the engine
  Bringing OLAP and OLTP together
    Proven: works in thousands of customer systems
    Simplicity: get rid of extracts, loads and redundancy; one system
    OLAP dominates OLTP in real-world systems: optimize accordingly
  Data mining and prediction
    Examples: basket analysis, different forecasting algorithms, …
    Easy interaction with R and SAS
  Unstructured data
    Text search support for more than 30 languages, including stemming, part-of-speech tagging, noun extraction, …
    Classification, clustering, named entity recognition, sentiment analysis
  Planning extensions
    Planning: define and align business figures for the foreseeable future
    Data-heavy operators like disaggregation or logical snapshots

Beyond relational data processing (2/3)
  Graph processing
    Real-world business data often resembles graphs
    Model as graph: more explicit and more efficient operators
    Distance, siblings, shortest path, reachability, transitive closure, …
  Hierarchy processing
    Special type of general graphs
    Used by almost every business application
    Support for time-dependent and versioned hierarchies
    Extended graph operators: level, neighbor, is_ancestor, …
  Geospatial processing & time series
    Native relational data types
    Existing compression techniques plus powerful specializations for sensor data
    Spatial: WithinDistance, Contains, Area, …
    Time series: group by time interval, interpolate missing values, …
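To make the graph and hierarchy operators named on this slide more concrete, the following is a minimal Python sketch of distance, reachability, is_ancestor, level and siblings on a small parent-to-children hierarchy. It is not HANA's implementation or API: the function names, the ORG sample data and the dictionary representation are all assumptions chosen for illustration. Distances are hop counts from a breadth-first search, which equals the shortest path length only when edges are unweighted.

```python
from collections import deque

# Hypothetical sample hierarchy (parent -> children); not taken from the slides.
ORG = {
    "CEO": ["CFO", "CTO"],
    "CFO": ["Accounting"],
    "CTO": ["Dev1", "Dev2"],
}


def distances_from(graph, start):
    """Breadth-first search: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:          # first visit is the shortest in hops
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def is_reachable(graph, src, dst):
    """True if dst can be reached from src by following edges."""
    return dst in distances_from(graph, src)


def is_ancestor(graph, ancestor, node):
    """True if node lies strictly below ancestor in the hierarchy."""
    return ancestor != node and is_reachable(graph, ancestor, node)


def level(graph, root, node):
    """Depth of node below root, or None if node is not under root."""
    return distances_from(graph, root).get(node)


def siblings(graph, node):
    """Nodes that share a parent with node."""
    result = set()
    for children in graph.values():
        if node in children:
            result.update(c for c in children if c != node)
    return result
```

For example, is_ancestor(ORG, "CEO", "Dev1") holds while is_ancestor(ORG, "CFO", "Dev1") does not. An engine-level operator would presumably work on compressed columnar structures rather than Python dictionaries; the sketch only shows what the operators compute.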

Beyond relational data processing (3/3)
  Scientific processing
    Bring prominent operators into the engine
    Simplifies and speeds up operations in the scientific and financial areas
    Matrix operators: Eigenvalue, Multiply, …
    Financial operators: Interest Rates, GarmanKohlagenProcess, …
  NoSQL processing
    Document-based models, XML, JSON, …
    Key-value stores
    Flexible schema, in HANA via a specific flexible table type
  Massive scale out
    Conventional business applications fit on a single box, but a new kind of application requires massive scale out
    Deep and seamless integration with the Hadoop system
    Scale-out and single-box applications act as one system

Application integration (examples)
  Currency conversion
  Hierarchy handling
  Aging / dynamic tiering
  Dictionary maintenance
  Graph optimizations

HANA Data Platform: Dynamic Tiering
  HANA Dynamic Tiering
    Declare a table to use disk storage
    Cost-efficient for big data
    Optimized disk-based processing powered by IQ
    New warm option beside hot (in-memory) and cold (near-line storage)
  Example:
    CREATE TABLE "demo"."SalesOrders_WARM" (
      ID Integer NOT NULL,
      CustomerID Integer NOT NULL,
      OrderDate date NOT NULL,
      …,
      PRIMARY KEY (ID)
    ) USING EXTENDED STORAGE;
    INSERT INTO "demo"."SalesOrders_WARM" VALUES ( … );
  Positioning
    Native big data solution: real-time insights on all enterprise data
    Preferred for structured/transactional cases
    Manage data cost-effectively, yet with the desired performance based on SLAs
    Terabytes to petabytes
    Application-defined temperature
  Single database experience
    Update and query all data seamlessly via HANA tables
    Centralized operational control

HANA Data Platform: Big Data Vision
  [Diagram: HANA Data Management Platform. Recoverable labels:]
    HANA native Big Data: Dynamic Tiering, Smart Data Streaming, NoSQL | Graph | Geo | TimeSeries
    HANA & Hadoop: SDA, Hive | Spark, MapReduce | HDFS, Admin & Monitoring, User Mgmt / Security
    Hadoop Extension: Velocity Engine, integrated with HANA and Hadoop
    Information Management | Text | Search | Graph | Geospatial | Predictive
    Tiers: SAP HANA In-Memory, HANA Dynamic Tiering, Hadoop, HANA Scale Out, Smart Data Streaming
    Annotations: ∞, 0.1 sec, Infinite Storage, Raw Data, Instant Results, Warm Data
    Administration | Monitoring | Operations | User Management | Security

SAP HANA Massive Scale Out Edition (Velocity)
  Motivation: engine for massive scale out and big data
  Key features:
    Scale to thousands of nodes
    Different data freshness and consistency levels
    Efficient fail-safety design
    First-class citizen within Hadoop (Spark)
    Support for a variety of hardware and operating systems
    Extreme query performance by compiling SQL to native code
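The last key feature, compiling SQL to native code, can be illustrated by analogy. The sketch below is not Velocity's compiler and produces no native code: it generates Python source specialised to one query of the shape SELECT SUM(column) WHERE predicate AND predicate, then compiles that source once with Python's built-in compile and exec. The function name compile_filter_sum, the (field, operator, value) predicate triples and the row-as-dictionary layout are assumptions made for this example.

```python
def compile_filter_sum(column, predicates):
    """Generate and compile a function specialised to one query:
    SELECT SUM(column) FROM rows WHERE <predicates joined by AND>.

    predicates is a list of (field, operator, value) triples.
    Returns the compiled function and the generated source text.
    """
    allowed_ops = {"<", "<=", ">", ">=", "==", "!="}
    conditions = []
    for field, op, value in predicates:
        if op not in allowed_ops:
            raise ValueError(f"unsupported operator: {op}")
        # repr() embeds field names and constants as Python literals.
        conditions.append(f"row[{field!r}] {op} {value!r}")
    where = " and ".join(conditions) or "True"

    # The query plan becomes straight-line code: no per-row dispatch on
    # operator type remains once the function has been compiled.
    source = (
        "def query(rows):\n"
        "    total = 0\n"
        "    for row in rows:\n"
        f"        if {where}:\n"
        f"            total += row[{column!r}]\n"
        "    return total\n"
    )
    namespace = {}
    exec(compile(source, "<generated query>", "exec"), namespace)
    return namespace["query"], source


# Hypothetical sample rows, invented for the example.
SALES = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
]
```

Calling compile_filter_sum("amount", [("region", "==", "EU")]) returns a function that can be reused across many row batches. A real engine would emit C or machine code and handle types, NULLs and vectorisation; the point here is only the separation of one-time code generation from repeated execution.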

SAP HANA SOE (Velocity) and Hadoop (1/2)
  [Diagram: Hadoop ecosystem. Components and roles:]
    Ambari: cluster management
    Zookeeper: coordination
    Pig: scripting
    MLlib: machine learning
    Hive: SQL
    SparkSQL
    YARN: processing
    Spark: processing
    HBase: database
    HDFS: distributed file system

SAP HANA SOE (Velocity) and Hadoop (2/2)
  Steps
    Stage 1: integration with Spark (2015)
    Stage 2: independent execution cluster
  Benefits
    Integration of SAP data with data lakes
    HANA features add value to Hadoop (e.g. SQL extensions like time series, hierarchies, …)
    Performance
    Holistic data platform

Architecture to Support Different Data Freshness Levels
  Options
    Read your own writes
    Up-to-date data vs. data of a certain age
  Separate component for transactions
  [Architecture diagram. Recoverable labels:]
    DTX, Transaction Broker, Version Table
    DQP, Query Engine 1, Query Engine 2, Query Engine 3
    Storage 1, Storage 2, …, Storage n
    Distributed Log
    Storage (checkpoints)
    Connection 1, …, Connection n (session data)
    Annotations: R; A, B, C; A, D; A, C, D
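The freshness options on this slide (read your own writes, up-to-date data versus data of a certain age) can be read as a rule for deciding which storage node may answer a query. The Python sketch below is one possible interpretation, not the architecture's actual protocol. The Replica class, its fields, the can_serve function and the idea that a version table supplies latest_version are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical storage node state, invented for this sketch."""
    name: str
    applied_version: int   # highest log position this node has applied
    age_seconds: float     # how far its data lags behind the latest commit


def can_serve(replica, *, latest_version, max_age_seconds=0.0, own_write_version=0):
    """Decide whether a replica satisfies the requested freshness level.

    latest_version    : newest committed version (e.g. from a version table)
    max_age_seconds   : 0.0 demands up-to-date data; larger values accept
                        data of a certain age
    own_write_version : last version written by this session, used for
                        read-your-own-writes
    """
    # Read your own writes: the replica must contain the session's last write.
    if replica.applied_version < own_write_version:
        return False
    # A fully caught-up replica is acceptable at any freshness level.
    if replica.applied_version >= latest_version:
        return True
    # Otherwise stale data is acceptable only within the tolerated age.
    return replica.age_seconds <= max_age_seconds


def eligible(replicas, **freshness):
    """Names of all replicas allowed to answer under the given options."""
    return [r.name for r in replicas if can_serve(r, **freshness)]


# Hypothetical cluster state.
REPLICAS = [
    Replica("fresh", applied_version=10, age_seconds=0.0),
    Replica("warm", applied_version=8, age_seconds=2.0),
    Replica("cold", applied_version=5, age_seconds=30.0),
]
```

With latest_version=10, a query demanding up-to-date data can only use the replica named fresh, while a query that tolerates five seconds of staleness can also use warm. Relaxing freshness widens the set of nodes that can answer, which is one plausible reason to expose these levels to applications in a scale-out setting.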

SAP HANA scale out integration

Conclusion
  Today's applications have a multidimensional set of specialized requirements
  Gains from moving these requirements into a (single) DBMS:
    Simplified and more explicit data modeling and processing for applications
    Increased performance
    No complicated data transfer between specialized engines
  Powerful orchestration required
  Web-scale processing is key to supporting new applications
  SAP HANA strives to answer all these requirements in a single data management platform.

SAP HANA Massive Scale Out Edition (Project Velocity)
  Scales to thousands of nodes
    Support for massive distribution and failure tolerance
    ACID properties on large landscapes
  Can run on small devices
    Low footprint allows it to run on small commodity hardware and small devices
  Integration into Hadoop infrastructure (Spark)
    Access via standard Hadoop mechanisms (i.e. map & reduce)
    Deep integration into the Spark execution framework
  Extreme performance with SQL compilation
    Compile SQL into C code, with real-time compilation into an executable
  Support for IoT and semi-structured data
    Special data types for IoT (time series data)
    Support for document-style data in a massive-scale environment
  Notes:
    Big modification of slide in Stanford EE Computer Systems Colloquium v1.0 (Chris Hallenbeck, Richard Pledereder)
    The topics under High Performance (compression, parallelization and scanning) receive major attention in this section. Column store is emphasized, although row store is mentioned.
    ACID (Atomicity, Consistency, Isolation, Durability)

SAP HANA SOE (Velocity) and Hadoop (2/2)
  General: embrace Hadoop as a technology
  Goal: get our own engine on Hadoop
    Velocity: HANA Scale-Out Extension
  Steps
    First step: integrated with Spark (Q3 2015)
    Mid term: independent execution cluster
  Benefits
    Holistic data platform
    Integration of SAP data with data lakes
    HANA features on Hadoop (e.g. time series)
    Value-added abilities on Hadoop data
    Performance

Architecture to Support Different Data Freshness Levels
  [Architecture diagram. Recoverable labels:]
    Distributed query processor
    Workers
    Distributed transaction manager
    Velocity (OLTP), shown twice
    Velocity (OLAP), shown twice
    Distributed log
    Distributed filesystem (for checkpoints …)
    Storage: Text, Document, Graph, Time series

Thank you