Oracle 12.2: New Features for the Data Warehouse

Half-day seminar, Oracle Data Warehouse Community, October 2016
Oracle Confidential – Restricted

Still to be added:
    select object_id, name, total_access_count, total_rows_returned, last_used
    from dba_index_usage
    where owner = 'HR';
    select JSON_OBJECT('empid' is empno, 'empname' is ename, 'Managerid' is mgr, 'Deptid' is deptno) emp
    from scott.emp
    where deptno = 10;
    select JSON_OBJECTAGG(key ename value job) employees_per_manager
    from emp
    group by mgr;

Topics
- DWH architectures: "Where we stand"
- It doesn't always have to be exact: sometimes "approximately" is enough
- Materialized Views
- Big Data SQL
- Partitioning
- In-Memory
- Analytic Views
- Summary scenarios

DWH Architectures: "Where We Stand"

Flexibility and Fast Provisioning
[Diagram: layered DWH architecture. Sources (log files, web clicks, mails, call center, contracts, reports, and multi-structured data in HDFS / NoSQL) feed the Integration Layer, the Enterprise Layer (Core DWH / Info Pool) and the User View Layer. Example user views: Service (service customer), Logistics (logistics effort), Controlling (profitability), Purchasing (products and trends), Sales (customer history), Marketing (marketing view).]
The point is the overall view: the "breadth" of the data models, covering business objects, key figures and one coherent query area.

Multi-Structured Data
[Diagram: the same layered architecture (Integration Layer, Enterprise Layer / Core DWH / Info Pool, User View Layer) with the same sources and user views, labeled "strategic data" and "tactical data", plus multi-structured data in HDFS / NoSQL.]
Flexibility and fast provisioning: the role of the central layer
- Draw on the information pool
- React faster
- Synchronize the "user views"
- Avoid redundancy
- Provide the historical view

It Doesn't Always Have to Be Exact: Sometimes "Approximately" Is Enough

Approximate Query Processing

Aggregated Reports Do Not Always Have to Be 100% Exact
Core problem
- Many BI queries work on aggregated information
- These queries are often particularly resource-intensive ("expensive queries")
Observations
- Many BI applications do not need exact results; recognizing a trend is enough
- At an aggregated level, differences in the detail are barely noticeable
- Exploratory analyses are often satisfied with a "first rough look"; a detailed analysis can follow later
- Example: given the sales figures of a large number of stores, quickly find the sales value at the 90th percentile, then find all stores whose sales exceed it

What Is New in Oracle 12.2
- "Approximate SQL query results"
- High accuracy despite massive resource savings
- Accuracy is usually > 97% (with 95% confidence)
- Answers arrive much faster than with 100% exact queries

Approximate Query Features
- The "approximate" approach is available for some frequently used (and expensive) aggregates:
  - approx_count_distinct (since 12.1)
  - approx_percentile, approx_median (new in 12.2)
- Materialized views (MVs) support the approximate functions (available in 12.2): approximate results are stored separately in materialized views and are additionally available to queries
- NOTE: cannot be used as analytic functions

Approx_count_distinct
- Used the same way as count(distinct …)
- More intensive memory use per GROUP BY
- Much less disk I/O, e.g. for temp space
- Can be parallelized
- NOTE: cannot be used as analytic functions
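As a minimal sketch of the point above, the two queries below show the exact and the approximate distinct count side by side. The table `sales` and its columns `state` and `cust_id` are hypothetical placeholders, not objects from the seminar.

```sql
-- Exact distinct count: large groups may spill to temp space
SELECT state, COUNT(DISTINCT cust_id) AS customers
FROM   sales                      -- hypothetical table
GROUP  BY state;

-- Approximate distinct count (available since 12.1): same query shape
SELECT state, APPROX_COUNT_DISTINCT(cust_id) AS customers_approx
FROM   sales
GROUP  BY state;
```

Only the aggregate function changes; grouping and filtering stay as they are.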

Approx_Count_Distinct Experiment
[Chart: the exact query uses lots of temp space; the approximate query uses no temp space and is 44x faster.]

Approx_percentile / Approx_median
- For numeric and date-related column types
- More intensive memory use, less I/O
- The error rate and confidence level of the estimate can be reported
- NOTE: cannot be used as analytic functions
    SELECT APPROX_MEDIAN(sal)               AS median_sal,
           APPROX_MEDIAN(sal, 'ERROR_RATE') AS error_rate,
           APPROX_MEDIAN(sal, 'CONFIDENCE') AS confidence
    FROM emp;
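The median is the special case of the 50th percentile. A sketch of the general form with APPROX_PERCENTILE follows; it assumes the standard SCOTT-style `emp` table with a `deptno` column, which the slide itself does not show.

```sql
-- 90th percentile of salary per department, with the reported error
-- rate and confidence of the estimate
SELECT deptno,
       APPROX_PERCENTILE(0.9) WITHIN GROUP (ORDER BY sal)               AS p90_sal,
       APPROX_PERCENTILE(0.9, 'ERROR_RATE') WITHIN GROUP (ORDER BY sal) AS p90_error_rate,
       APPROX_PERCENTILE(0.9, 'CONFIDENCE') WITHIN GROUP (ORDER BY sal) AS p90_confidence
FROM   emp                        -- assumed SCOTT-style table
GROUP  BY deptno;
```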

Approx_Percentile Experiment
[Chart: the exact query uses lots of temp space; the approximate query uses no temp space and is 13x faster.]

Applying It to Existing Queries
- No code changes required; can be applied to existing environments
- Also works for queries generated by BI tools that cannot be influenced from outside
- Basic setting: approx_for_aggregation = TRUE
- Sub-parameters:
  - approx_for_count_distinct = TRUE
  - approx_for_percentile = 'all' | 'percentile_disc' | 'percentile_cont' | ...
- Can be set at session and system level
approx_for_aggregation takes a boolean value:
- TRUE: convert exact aggregations to their approximate counterparts
- FALSE: do not convert (default)
This is an umbrella parameter. Once it is turned on, all aggregation functions that have approximate counterparts are converted to the approximate version. For now it controls two child parameters (more might be added later): approx_for_count_distinct and approx_for_percentile.
approx_for_percentile takes a string value and is controlled by the umbrella parameter approx_for_aggregation:
- none: do not convert (default)
- percentile_cont: convert percentile_cont to approx_percentile
- percentile_disc: convert percentile_disc to approx_percentile
- all: convert both percentile_cont and percentile_disc to approx_percentile
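A sketch of how these parameters might be used at session level. The `emp` query is only an illustration and assumes the SCOTT-style demo table; the parameter names are the ones listed on the slide.

```sql
-- Umbrella switch: convert all supported exact aggregates in this session
ALTER SESSION SET approx_for_aggregation = TRUE;

-- This unchanged "exact" query is now answered with the approximate function
SELECT deptno, COUNT(DISTINCT job) AS jobs
FROM   emp
GROUP  BY deptno;

-- Switch the umbrella parameter off again
ALTER SESSION SET approx_for_aggregation = FALSE;

-- Alternatively, convert only selected functions
ALTER SESSION SET approx_for_count_distinct = TRUE;
ALTER SESSION SET approx_for_percentile = 'PERCENTILE_CONT';
```

The same settings can be made with ALTER SYSTEM when the whole instance should run in approximate mode.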

Business Case for Approximate Aggregations
Continuous data exploration (slicing and dicing)
- BI queries tend to start at the top level and drill down
- Exploration moves down and across dimensional layers
- Business teams need different slices
- Requires multiple levels of aggregate approximations
- Requires a cube-like approach to calculating results
- Needs to return results as fast as possible
- Needs to be space efficient
[Diagram labels: Time, Customer, Movies, Session Log; SM Manager View, Buying Manager View, Financial Manager View, Ad Hoc View.]

Building Approximate Aggregates
Three functions support the creation of aggregate approximations:
- APPROX_xxxxxx_DETAIL(expr): builds a summary table containing results for all dimensions in the GROUP BY clause
- APPROX_xxxxxx_AGG(expr): rolls up a lower-level summary to a higher level
- TO_APPROX_xxxxxx(detail, …): returns results from the specified aggregated results
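The three steps can be sketched with the count-distinct variant of these functions. The table `sales` with columns `state`, `county` and `cust_id`, and the summary table name, are hypothetical.

```sql
-- Step 1 (DETAIL): store the approximation details at the lowest level
CREATE TABLE cust_per_county AS
SELECT state, county,
       APPROX_COUNT_DISTINCT_DETAIL(cust_id) AS detail
FROM   sales                      -- hypothetical table
GROUP  BY state, county;

-- Steps 2 and 3 (AGG, TO_APPROX): roll the county details up to state
-- level and turn the rolled-up detail into a readable number
SELECT state,
       TO_APPROX_COUNT_DISTINCT(APPROX_COUNT_DISTINCT_AGG(detail)) AS customers_approx
FROM   cust_per_county
GROUP  BY state;
```

The base table is read only once; every higher level is then derived from the stored details.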

Materialized View Example (Creation)
    CREATE MATERIALIZED VIEW percentile_per_state_mv
    ENABLE QUERY REWRITE AS
    SELECT state, county, APPROX_PERCENTILE_DETAIL(sales_volume) AS detail
    FROM sales
    GROUP BY county, state;

Materialized View Example (Query Rewrite)
The query
    SELECT state, APPROX_MEDIAN(sales_volume) AS detail
    FROM sales
    GROUP BY state;
is answered via MV rewrite as
    SELECT state, TO_APPROX_PERCENTILE(APPROX_PERCENTILE_AGG(detail), 0.5, 'NUMBER')
    FROM percentile_per_state_mv
    GROUP BY state;

Approximate Query Takeaways
Compared to an exact query, an approximate query offers
- much lower memory use
- much faster response time
- comparable accuracy
The approx_for_* parameters enable running existing scripts in approximate mode.
Create materialized views to achieve even faster response times.

Optimizations for Set-Based Checks with SQL in the Database

Set-Based Checks with SQL
Attribute-level rules (example A)
- Not null / mandatory fields
- Format specifications: numeric, alphanumeric, date, masks
- Various check constraints
- Value ranges: upper/lower bounds, value lists
Record-level rules (example B)
- Dependencies on values in other attributes of the same record
Cross-record rules (example C)
- Primary key / uniqueness
- Aggregate conditions: upper/lower bounds of sums, number of records per interval, etc.
- Recursive relationships: references to other records of the same table (relation)
Example A:
    select bestellnr,
           case when REGEXP_LIKE(BESTELLNR, '[^[:digit:]]')  -- field BESTELLNR is not numeric
                then 1 else 0
           end Num_Check_bestellnr
    from bestellung;
Example B:
    select CASE WHEN (F1 = 3 and F2 = F3 + F4) then 1 ELSE 0 end
    from fx;
Example C:
    insert /*+ APPEND */ into err_non_unique_bestellung
    select bestellnr
    from (select count(bestellnr) n, bestellnr from bestellung group by bestellnr)
    where n > 1;

validate_conversion for Faster Type Checks
- Ideal when taking over unchecked text-typed fields (e.g. varchar) from upstream systems
- Field-level checks in particular are very expensive
- Type checks for:
  - binary_double, binary_float
  - date, number
  - interval day to second / year to month
  - timestamp / with time zone / with local time zone
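A minimal sketch of the function on literal values: VALIDATE_CONVERSION returns 1 when the value can be converted to the requested type and 0 when it cannot. The literals and format masks are made up for illustration.

```sql
SELECT VALIDATE_CONVERSION('1234'       AS NUMBER)             AS ok_number,   -- 1
       VALIDATE_CONVERSION('12x4'       AS NUMBER)             AS bad_number,  -- 0
       VALIDATE_CONVERSION('2016-10-31' AS DATE, 'YYYY-MM-DD') AS ok_date,     -- 1
       VALIDATE_CONVERSION('2016-02-30' AS DATE, 'YYYY-MM-DD') AS bad_date     -- 0
FROM   dual;
```

Because the result is a plain 0/1 column, it can be used directly as a routing criterion in a set-based INSERT.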

Concept for Single-Field Checks
[Diagram: a stage table (Feld1, Feld2, Feld3, all Varchar2) is copied into a temporary table that adds check columns (Feld1_is_null, Feld1_is_numeric, Feld2_is_numeric) via INSERT INTO temp_table SELECT validate_conversion .... FROM Stage_Table. A multi-table insert then routes the rows: INSERT ALL WHEN Feld_1_is_null = 1 INTO Gepruefte_Daten WHEN Feld_1_is_null = 0 INTO Error_Daten. The targets Gepruefte_Daten (checked data) and Error_Daten (error data) have typed columns (Date, Number, Varchar2).]
- All field-level checks in one place
- Set-based with SQL
- Extremely fast
- The temporary table is optional
- Much easier to follow
- Allows different check criteria to be combined
- Simple error statistics
- Error records are sorted out automatically

Implementation Example
Source table Transaktionen: Bestelldatum varchar2(20), Menge varchar2(20), Artikel_ID varchar2(20), Kunden_ID varchar2(20). Flow: Transaktionen, then Tmp_Transaktionen, then Transaktionen_ok or Transaktionen_error.
Step 1: build the temporary table with the check columns
    create table tmp_transaktionen as
    select bestelldatum, validate_conversion(Bestelldatum as date) Bestelldatum_Datum_check,
           Menge,        validate_conversion(Menge as number)      Menge_number_check,
           KUNDEN_ID,    validate_conversion(KUNDEN_ID as number)  KUNDEN_ID_number_check,
           ARTIKEL_ID,   validate_conversion(ARTIKEL_ID as number) ARTIKEL_ID_number_check
    from Transaktionen;
Step 2: route the rows to the OK table or the error table
    insert all
      when Bestelldatum_Datum_check = 1
       and MENGE_NUMBER_CHECK = 1
       and KUNDEN_ID_NUMBER_CHECK = 1
       and ARTIKEL_ID_NUMBER_CHECK = 1
      then into transaktionen_ok (BESTELLDATUM, Menge, Kunden_id, Artikel_ID)
           values (BESTELLDATUM, MENGE, KUNDEN_ID, ARTIKEL_ID)
      else into transaktionen_error
    select BESTELLDATUM, BESTELLDATUM_DATUM_CHECK, MENGE, MENGE_NUMBER_CHECK,
           KUNDEN_ID, KUNDEN_ID_NUMBER_CHECK, ARTIKEL_ID, ARTIKEL_ID_NUMBER_CHECK
    from tmp_transaktionen;
Measurement
    select count(*) from transaktionen_ok;
    select count(*) from transaktionen_error;
    select count(*) from tmp_transaktionen where Bestelldatum_Datum_check = 0;
Total runtime for 1 million records: under 3 seconds (laptop database, no parallelization).

Materialized Views and Key Figure Systems

Materialized Views Optimize Access: Concept
[Diagram: star schema with the fact table F_UMSATZ (KANAL_ID, KUNDEN_ID, ZEIT_ID, REGION_ID, ARTIKEL_ID, UMSATZ, MENGE, UMSATZ_GESAMT) and, linked by foreign keys, the dimensions D_REGION (REGION_ID, ORTNR, ORT, KREISNR, KREIS, LANDNR, LAND, REGIONNR, REGION) and D_ARTIKEL (ARTIKEL_ID, ARTIKEL_NAME, GRUPPE_NR, GRUPPE_NAME, SPARTE_NR, SPARTE_NAME). Materialized aggregate views (MAVs) precompute revenue (Umsatz) per region level (Region, Land, Kreis, Ort), per article level (Artikel, Gruppe, Sparte), and for each combination of a region level with an article level.]

Anticipating User Queries: Concept of Materialized Views and Query Rewrite
[Diagram: conceptual view of revenue (Umsatz) along the hierarchy levels Artikelsparten, Artikelgruppen, Artikel; Liefereinheit, Verladeeinheit, Gebinde; Vermittlungsart, Produktart, Alle Produkte; Segment.]
Users always ask differently than you expect. That does not matter: their requests are served anyway.
Materialized view:
    Create Materialized View …. AS
    SELECT a.artikel_name Artikel, sum(u.umsatz) umsatz_pro_Artikel
    FROM f_Umsatz_2014 U, D_artikel a
    WHERE U.artikel_id = a.artikel_id
    group by a.artikel_name
User query:
    Select sum(UMSATZ), Produktart from F_UMSATZ, D_Artikel Group by Produktart;
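Whether a user query is actually rewritten can be checked in the execution plan. The sketch below assumes a fact table `f_umsatz` and a dimension `d_artikel` with the columns named on the slides; the materialized view name is hypothetical.

```sql
-- Materialized view that is allowed to answer queries on its base tables
CREATE MATERIALIZED VIEW mv_umsatz_artikel   -- hypothetical name
  ENABLE QUERY REWRITE
AS
SELECT a.artikel_name, SUM(u.umsatz) AS umsatz_pro_artikel
FROM   f_umsatz u, d_artikel a
WHERE  u.artikel_id = a.artikel_id
GROUP  BY a.artikel_name;

-- A user query that never mentions the materialized view
EXPLAIN PLAN FOR
SELECT a.artikel_name, SUM(u.umsatz)
FROM   f_umsatz u, d_artikel a
WHERE  u.artikel_id = a.artikel_id
GROUP  BY a.artikel_name;

-- If the rewrite took place, the plan contains a MAT_VIEW REWRITE ACCESS
-- step on MV_UMSATZ_ARTIKEL instead of the join of the base tables
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```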

DBMS_MVIEW (Refresh Functions)
Refresh functions
- DBMS_MVIEW.REFRESH()
- DBMS_MVIEW.REFRESH_DEPENDENT()
- DBMS_MVIEW.REFRESH_ALL_MVIEWS()
Refresh methods (optional)
- COMPLETE (C): always reads the base table completely
- FAST (F): incremental read, where possible (materialized view log or PCT)
- FORCE (?) (default): fast refresh if possible, otherwise complete refresh
- PARTITIONED (P)
Transaction behavior (optional)
- ATOMIC_REFRESH
- REFRESH_AFTER_ERRORS
- NESTED
[Diagram: materialized views built on one another. Base tables FAKT, D_Zeit and D_PROD are joined; on top of the join sit aggregates such as SUM/month, SUM/year, revenue for product A, revenue for product B, and revenue per product group and year.]
Example: EXECUTE DBMS_MVIEW.REFRESH('MV_STANDARD','C');
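A sketch of how the method and transaction options combine in one call, using named parameters. MV_STANDARD is the name from the example above; the parameter choices are illustrative.

```sql
BEGIN
  -- Complete refresh, not as one atomic transaction
  DBMS_MVIEW.REFRESH(
    list           => 'MV_STANDARD',
    method         => 'C',
    atomic_refresh => FALSE);

  -- FORCE: fast refresh if possible, otherwise complete;
  -- NESTED also refreshes the materialized views this one is built on
  DBMS_MVIEW.REFRESH(
    list   => 'MV_STANDARD',
    method => '?',
    nested => TRUE);
END;
/
```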

Example: Materialized Views in Four Levels
[Diagram: materialized views built on one another in Levels 1 to 4 on top of the fact tables F_EINKAEUFE, F_POSITION and F_KAUF. Views shown: Mav_Produkt_Monat_Verkaeufe, Mav_Produkt_Monat_einkaeufe, Mv_EA_Menge_Kum_Monat, Mv_EA_Finanz_Kum_Monat, Mav_Einkauf_Verkauf_Diff_Jahr (annual view), Mv_EA_Finanz_Kum_Gruppe_Monat (product group view). Abbreviations: EA = Einkauf/Verkauf (purchases/sales), Kum = kumuliert (cumulative).]
Financial view / calculations
- Running inventory value / product / month
- Running balance / product / month
- Cumulative purchase price (EK) / product
- Cumulative sales price (VK) / product
- Cumulative balance
Inventory / stock view / calculations
- Running inventory quantity / product / month
- Sales quantity / product / month
- Purchase quantity / product / month
- Cumulative purchase quantity / product
- Cumulative sales quantity / product

Fast Refresh during Online Redefinition
- Requirement: change the structure of a base table
- Challenge: until now, dependent materialized views had to be decoupled (offline) and then completely rebuilt
- As of 12.2: online redefinition of the base table
  - Dependent materialized views are automatically "re-attached" and keep their existing values
  - Applies to refresh methods that use log tables
  - REDEF_TABLE or START_REDEF_TABLE procedure
  - The progress of the operation can be monitored

Big Data SQL: Reading and Combining Hadoop and Database Data

Oracle Big Data Management System
Comprehensive SQL access to all enterprise data. Use the right tool for the job and benefit from the power of "AND". [Diagram label: NoSQL]

Big Data SQL: A New Hadoop Processing Engine
- Processing layer: MapReduce and Hive, Spark, Impala, Search, Big Data SQL
- Resource management (YARN, cgroups)
- Storage layer: filesystem (HDFS), NoSQL databases (Oracle NoSQL DB, HBase)

Oracle Big Data SQL
- SQL queries in the Oracle database for Hadoop, Oracle NoSQL and HBase
- Access all data with Oracle SQL
- Smart Scan functionality on Hadoop to optimize access
[Diagram: an Oracle Database 12c RAC cluster with Oracle Database Storage Servers and the Oracle catalog on one side; a Hadoop cluster on any hardware (Cloudera Hadoop, Hortonworks) with HDFS name node, HDFS data nodes and Hive metadata on the other. External tables in the Oracle catalog point to the Hive metadata.]
Oracle Big Data SQL also works with external tables, making use of the Smart Scan technology we have added to the BDA. Today: the ORACLE_HIVE and ORACLE_HDFS access drivers; in future, other drivers, e.g. for MongoDB, HBase, NoSQL.
The ORACLE_HIVE driver can also be used to access Oracle NoSQL, by creating an external table in Hive on top of Oracle NoSQL:
    CREATE EXTERNAL TABLE tablename colname coltype[, colname coltype,...]
    STORED BY 'oracle.kv.hadoop.hive.table.TableStorageHandler'
    TBLPROPERTIES (
      "oracle.kv.kvstore" = "database",
      "oracle.kv.hosts" = "nosql_node1:port[, nosql_node2:port...]",
      "oracle.kv.hadoop.hosts" = "hadoop_node1[,hadoop_node2...]",
      "oracle.kv.tableName" = "table_name");
Outside the Big Data SQL context, an external table on Oracle NoSQL can also be created directly from the Oracle database (same procedure for HBase):
    CREATE TABLE GENRE (ID NUMBER(5), NAME VARCHAR2(30))
    ORGANIZATION EXTERNAL
      (type oracle_loader
       default directory ext_tab
       access parameters (records delimited by newline
                          preprocessor nosql_bin_dir:'nosql_stream'
                          fields terminated by '|')
       LOCATION ('genre.dat'))
    PARALLEL;
Same query, but there are intelligent optimizations that push the queries down to the source. This is not query federation: this is Oracle Database understanding the underlying data store. It has processes that are local to the data and applies query optimizations to the data:
- Storage indexes
- Local filtering
- Caching
Example of an ORACLE_HIVE external table:
    create table customer_address
      ( ca_customer_id number(10,0)
      , ca_street_number char(10)
      , ca_state char(2)
      , ca_zip char(10))
    organization external (
      TYPE ORACLE_HIVE
      DEFAULT DIRECTORY DEFAULT_DIR
      ACCESS PARAMETERS (com.oracle.bigdata.cluster hadoop_cl_1)
      LOCATION ('hive://customer_address')
    )

Big Data SQL Dataflow
1. Read the data from the HDFS data node
   - Direct-path reads
   - C-based readers where possible
   - Otherwise use native Hadoop classes
2. Convert the bytes into Oracle format
3. Apply Smart Scan to the byte stream that Oracle understands
   - Apply filters / Bloom filters
   - Storage indexes
   - Project columns
   - Parse JSON/XML
   - Score models
   - Column cache
[Diagram: data node disks, then the Big Data SQL agent with External Table Services (RecordReader, SerDe), then Smart Scan.]

Extending the Oracle External Table Mechanism
- New external table types
  - ORACLE_HIVE (gets its metadata from Hive)
  - ORACLE_HDFS (metadata is supplied in the definition)
- Access parameters for the "locations" on the Big Data machine
  - Hadoop cluster (name)
  - Remote Hive database/table (name)
- The DBMS_HADOOP package can be used to generate the DDL and to import the information from Hadoop automatically
    CREATE TABLE movielog (click VARCHAR2(4000))
    ORGANIZATION EXTERNAL (
      TYPE ORACLE_HIVE
      DEFAULT DIRECTORY DEFAULT_DIR
      ACCESS PARAMETERS (
        com.oracle.bigdata.tablename logs
        com.oracle.bigdata.cluster mycluster
      ))
    REJECT LIMIT UNLIMITED;
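A sketch of how DBMS_HADOOP might generate such a definition. It assumes the CREATE_EXTDDL_FOR_HIVE procedure with the parameters shown, a Hive database named `default`, and SQL*Plus or a compatible client for SET SERVEROUTPUT; the cluster and table names are the ones from the example above.

```sql
SET SERVEROUTPUT ON
DECLARE
  ddl_text VARCHAR2(32767);
BEGIN
  DBMS_HADOOP.CREATE_EXTDDL_FOR_HIVE(
    cluster_id      => 'mycluster',  -- Hadoop cluster name
    db_name         => 'default',    -- assumed Hive database
    hive_table_name => 'logs',       -- Hive table to expose
    hive_partition  => FALSE,
    table_name      => 'MOVIELOG',   -- name of the Oracle external table
    perform_ddl     => FALSE,        -- only generate the DDL, do not execute it
    text_of_ddl     => ddl_text);
  DBMS_OUTPUT.PUT_LINE(ddl_text);
END;
/
```

With perform_ddl set to FALSE the generated statement can be reviewed and adjusted before it is run.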

Hadoop in Data Warehouse Architectures
[Diagram: the layered DWH (Integration Layer, Enterprise Layer / Core DWH / Info Pool, User View Layer) covers all processes and all parts of the company, with ETL, current views and strategic views. Next to it sits a data lake (archive / ODS / scaling) that takes offloaded existing data and new data such as Industry 4.0, log and production data.]
Typical data
- Special kinds of data: CDRs, receipt data, log data, click data, measurement data
- Archive for old data (ILM)
- Large fact tables
For certain data, the cost/benefit ratio of storing them often does not work out. Often the only option left is to do without the data because keeping them is too expensive.

Partitioning: One of the Most Important Data Warehouse Features

Partitioning Supports Many Tasks
- Query performance on large tables (partition pruning)
- Faster load process
- Support for ILM (Information Lifecycle Management)
- Easier handling of indexing
- Support for the backup process
- Control of compression
- Support for refreshing materialized views (Partition Change Tracking)
- High availability, also during loading and maintenance
The column used as the partitioning criterion is chosen according to business or organizational considerations (range, list, hash).
[Diagram: a large table whose partitions are spread over several tablespaces.]

Where and How Partitioning Is Used in the DWH
- Load activities: partitioning and load control often run in sync; parallelization
- Read activities: partition pruning; parallelization
- ILM and backup
- Preparation: temporary tables (checks etc.)
- Transactional data are partitioned (80/20 principle)
- Fact tables and cubes can be partitioned
[Diagram: Data Integration Layer, Enterprise Information Layer, User View Layer, with PEL between the layers and a 20% / 80% split. Legend: R = reference tables, T = transfer tables, S = master data, B = transactional data, D = dimensions, F = facts.]

Loading with ALTER TABLE EXCHANGE PARTITION
[Diagram: in the Data Integration Layer the new data are checked and written to a temporary table Tmp_table via CTAS (direct path). In the Enterprise Information Layer a table is partitioned by month (May to November): ALTER TABLE ADD PARTITION creates the new November partition, ALTER TABLE EXCHANGE PARTITION swaps Tmp_table in, and the oldest partition is archived (DROP PARTITION).]

Building Fact Tables
[Diagram: the same procedure carried through to the User View Layer. The monthly partitions (May to November) of the Enterprise Information Layer table and of the fact table are each filled from a temporary table Tmp_table built via CTAS (direct path) and swapped in with ALTER TABLE EXCHANGE PARTITION; the oldest partition is archived (DROP PARTITION).]
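The load cycle from the diagrams can be sketched as four statements. All object names are hypothetical: `f_umsatz` stands for a fact table range-partitioned by month on a DATE column `datum`, and `stage_umsatz` for a checked staging table with the same column structure.

```sql
-- 1. Build the new month in a temporary table (CTAS loads via direct path)
CREATE TABLE tmp_umsatz_nov AS
SELECT *
FROM   stage_umsatz
WHERE  datum >= DATE '2016-11-01'
AND    datum <  DATE '2016-12-01';

-- 2. Add an empty partition for the new month
ALTER TABLE f_umsatz ADD PARTITION p_2016_11
  VALUES LESS THAN (DATE '2016-12-01');

-- 3. Swap the temporary table in; only the dictionary changes, no rows move
ALTER TABLE f_umsatz EXCHANGE PARTITION p_2016_11
  WITH TABLE tmp_umsatz_nov WITHOUT VALIDATION;

-- 4. Archive the oldest month by dropping its partition
ALTER TABLE f_umsatz DROP PARTITION p_2016_05;
```

WITHOUT VALIDATION skips the check that every row fits the partition bounds, so it is only appropriate when the checks have already run on the temporary table.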

Overview
- Automatic list partitioning
- Multi-column list partitioning
- Deferred segment creation with automatic list partitioning
- Read-only partitions
- Online conversion of a non-partitioned table into a partitioned table
- Automated creation of an exchange table
- Filters for MOVE/MERGE/SPLIT operations
- Online split operations
- Partitions on external tables
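The automated creation of an exchange table is not shown on a slide of its own; a one-statement sketch follows. The names `f_umsatz` and `tmp_umsatz_exch` are hypothetical.

```sql
-- Create an empty table whose column structure matches the partitioned
-- table exactly, so that it can take part in EXCHANGE PARTITION
CREATE TABLE tmp_umsatz_exch
  FOR EXCHANGE WITH TABLE f_umsatz;
```

Unlike a plain CREATE TABLE AS SELECT, this form is intended to clone internal column properties of the partitioned table as well, which avoids mismatch errors during the exchange.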

Automatic List Partitioning
A new partition is created automatically when a new value is inserted into a list-partitioned column.
New keywords: AUTOMATIC and MANUAL (MANUAL is the default)
    CREATE TABLE sales_auto_list (
      salesman_id   NUMBER(5),
      salesman_name VARCHAR2(30),
      sales_state   VARCHAR2(20),
      sales_amount  NUMBER(10),
      sales_date    DATE
    )
    PARTITION BY LIST (sales_state) AUTOMATIC
    (PARTITION P_CAL VALUES ('CALIFORNIA'));
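A short sketch of the behavior, based on the sales_auto_list table above; the inserted values are made up.

```sql
-- 'OREGON' has no partition yet: the database creates one on the fly
INSERT INTO sales_auto_list
VALUES (1, 'Smith', 'OREGON', 100, DATE '2016-10-01');
COMMIT;

-- The new, system-named partition appears next to P_CAL
SELECT partition_name, high_value
FROM   user_tab_partitions
WHERE  table_name = 'SALES_AUTO_LIST';

-- Switch the table back to manually managed list partitions
ALTER TABLE sales_auto_list SET PARTITIONING MANUAL;
```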

Multi Column List Partitioning
Several columns can be combined into one LIST partitioning criterion.
New clause: PARTITION BY LIST (column1, column2)
    CREATE TABLE sales_by_region_and_channel (
      deptno          NUMBER,
      deptname        VARCHAR2(20),
      quarterly_sales NUMBER(10,2),
      state           VARCHAR2(2),
      channel         VARCHAR2(1)
    )
    PARTITION BY LIST (state, channel)
    ( PARTITION q1_northwest_direct   VALUES (('OR','D'), ('WA','D')),
      PARTITION q1_northwest_indirect VALUES (('OR','I'), ('WA','I')),
      PARTITION q1_southwest_direct   VALUES (('AZ','D'), ('UT','D'), ('NM','D')),
      PARTITION q1_ca_direct          VALUES ('CA','D'),
      PARTITION rest                  VALUES (DEFAULT)
    );

Read-Only-Partitions
- Partitions are marked with READ ONLY or READ WRITE
- The setting can be made at CREATE TABLE or changed with ALTER TABLE
    CREATE TABLE orders_read_write_only (
      order_id   NUMBER (12),
      order_date DATE CONSTRAINT order_date_nn NOT NULL,
      state      VARCHAR2(2)
    ) READ WRITE
    PARTITION BY RANGE (order_date)
    SUBPARTITION BY LIST (state)
    ( PARTITION order_p1 VALUES LESS THAN (TO_DATE ('01-DEC-2015','DD-MON-YYYY')) READ ONLY
      ( SUBPARTITION order_p1_northwest VALUES ('OR', 'WA'),
        SUBPARTITION order_p1_southwest VALUES ('AZ', 'UT', 'NM') ),
      PARTITION order_p2 VALUES LESS THAN (TO_DATE ('01-MAR-2016','DD-MON-YYYY'))
      ( SUBPARTITION order_p2_northwest VALUES ('OR', 'WA'),
        SUBPARTITION order_p2_southwest VALUES ('AZ', 'UT', 'NM') READ ONLY ),
      PARTITION order_p3 VALUES LESS THAN (TO_DATE ('01-JUL-2016','DD-MON-YYYY'))
      ( SUBPARTITION order_p3_northwest VALUES ('OR', 'WA') READ ONLY,
        SUBPARTITION order_p3_southwest VALUES ('AZ', 'UT', 'NM') )
    );
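Changing the state later with ALTER TABLE might look like the following sketch, which reuses the table and partition names from the example above.

```sql
-- Freeze a partition once it is fully loaded
ALTER TABLE orders_read_write_only
  MODIFY PARTITION order_p2 READ ONLY;

-- Reopen a single subpartition for corrections
ALTER TABLE orders_read_write_only
  MODIFY SUBPARTITION order_p3_northwest READ WRITE;

-- Check the current state of every subpartition
SELECT partition_name, subpartition_name, read_only
FROM   user_tab_subpartitions
WHERE  table_name = 'ORDERS_READ_WRITE_ONLY';
```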

Converting a Non-Partitioned Table
- Non-partitioned tables can now be converted into a partitioned table in an online operation
- DML operations can run concurrently with the conversion
- Until now, this required copying the table
- New keywords: MODIFY, ONLINE
    ALTER TABLE employees_convert MODIFY
      PARTITION BY RANGE (employee_id) INTERVAL (100)
      ( PARTITION P1 VALUES LESS THAN (100),
        PARTITION P2 VALUES LESS THAN (500) )
      ONLINE
      UPDATE INDEXES
      ( IDX1_SALARY LOCAL,
        IDX2_EMP_ID GLOBAL PARTITION BY RANGE (employee_id)
          ( PARTITION IP1 VALUES LESS THAN (MAXVALUE) )
      );

Filters for MOVE/MERGE/SPLIT Operations
Restricts the rows that are affected by a MOVE, MERGE or SPLIT.
    ALTER TABLE orders_move_part
      MOVE PARTITION q1_2016 TABLESPACE open_orders COMPRESS ONLINE
      INCLUDING ROWS WHERE order_state = 'open';

Online Split Operations
Partitions can be split without interrupting online operation, using the ONLINE keyword of the ALTER TABLE statement.
    CREATE TABLE orders (
      prod_id       NUMBER(6),
      cust_id       NUMBER,
      time_id       DATE,
      channel_id    CHAR(1),
      promo_id      NUMBER(6),
      quantity_sold NUMBER(3),
      amount_sold   NUMBER(10,2)
    )
    PARTITION BY RANGE (time_id)
    ( PARTITION sales_q1_2016 VALUES LESS THAN (TO_DATE('01-APR-2016','dd-MON-yyyy')),
      PARTITION sales_q2_2016 VALUES LESS THAN (TO_DATE('01-JUL-2016','dd-MON-yyyy')),
      PARTITION sales_q3_2016 VALUES LESS THAN (TO_DATE('01-OCT-2016','dd-MON-yyyy')),
      PARTITION sales_q4_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017','dd-MON-yyyy'))
    );
    ALTER TABLE orders SPLIT PARTITION sales_q4_2016 INTO
    ( PARTITION sales_oct_2016 VALUES LESS THAN (TO_DATE('01-NOV-2016','dd-MON-yyyy')),
      PARTITION sales_nov_2016 VALUES LESS THAN (TO_DATE('01-DEC-2016','dd-MON-yyyy')),
      PARTITION sales_dec_2016 )
    ONLINE;

Partitioning on External Tables
- Partition pruning and partition-wise joins
  - Both joined tables must be partitioned on the same column
  - Also reference partitioning (join between parent and child tables)
- Range, list and composite partitioning
    CREATE TABLE sales (loc_id number, prod_id number, cust_id number,
                        amount_sold number, quantity_sold number)
    ORGANIZATION EXTERNAL
      (TYPE oracle_loader
       DEFAULT DIRECTORY load_d1
       ACCESS PARAMETERS
         ( RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
           NOBADFILE
           LOGFILE log_dir:'sales.log'
           FIELDS TERMINATED BY "," ))
    REJECT LIMIT UNLIMITED
    PARTITION BY RANGE (loc_id)
    ( PARTITION p1 VALUES LESS THAN (1000) LOCATION ('california.txt'),
      PARTITION p2 VALUES LESS THAN (2000) DEFAULT DIRECTORY load_d2 LOCATION ('washington.txt'),
      PARTITION p3 VALUES LESS THAN (3000) );

Settings and Their Use

In-Memory

Continuous Performance Optimization
- Continuous optimization
- 3x faster for mixed workloads
- Some optimizations are already in the bundle patch
- More optimization for analytic queries
[Diagram: the SALES table in row format in the buffer cache (transactions) and in column format in the In-Memory column store (analytics).]
Before getting into what is new in 12.2, I want to take a moment to remind you that the development of Database In-Memory did not stop after its initial release. We have continued to improve the product's performance via the Database Proactive Bundle Patches, previously known as the Exadata and DBIM bundle patches. Since GA we have improved the performance of the In-Memory column store in mixed-workload environments by reducing the overhead of keeping the column store in sync; we can now maintain the column store 3x faster. If your customer is interested in trying out Oracle Database In-Memory, make sure they do not test on the base release: they must download the latest bundle patch to get the best performance.

First, an Important Principle: Distributing Table Data across the Storage Hierarchy
[Diagram: the monthly partitions of the revenue data (February 13 to June 14) are distributed across the storage hierarchy: disk, SSD, flash and In-Memory.]

Virtualizing the User View Layer with In-Memory
- More flexibility
- Faster further development / reaction
- Less data redundancy
- Full multidimensional view
[Diagram: Integration Layer, Enterprise Layer (Core DWH / Info Pool) and User View Layer; the User View Layer consists of virtual structures, with a BI platform and its cache on top.]

Possible Approach: Virtualizing the User View Layer
In the Enterprise Layer
- Dimension keys must already exist
- Historization
In the User View Layer
- Hardly any physical persistence on disk
Gains
- More flexible and faster provisioning of multidimensional structures
- Less disk space because there is less redundancy in the layer model
- More performance
[Diagram: dimensions are defined as views on in-memory master data. VW_ORT combines the master data tables for REGION (REGIONNR, REGION), LAND (LANDNR, LAND), KREIS (KREISNR, KREIS) and ORT (ORTNR, ORT, REGION_ID); VW_ARTIKEL combines ARTIKEL (ARTIKEL_ID, ARTIKEL_NAME), GRUPPE (GRUPPE_NR, GRUPPE_NAME) and SPARTE (SPARTE_NR, SPARTE_NAME). The time dimension (ZEIT_ID, DATUM_ID, TAG_DES_MONATS, TAG_DES_JAHRES, WOCHE_DES_JAHRES, MONATS_NUMMER, MONAT_DESC, QUARTALS_NUMMER, JAHR_NUMMER) is shown as well. Key figures are provided as MAVs. Memory load: only current partitions and individual columns. Small dimensions are persisted.]

In-Memory Joins: Join Groups
- In-memory joins were already up to 10 times faster
- 12.2 adds join groups
- Join groups / columns are declared at dictionary level
- Enables a 2-3x speedup over already fast in-memory joins
Example: find the sales price of each vehicle; NAME is the join column between VEHICLES and SALES.
    CREATE INMEMORY JOIN GROUP V_name_jg (VEHICLES(NAME), SALES(NAME));
Previously we sped up joins in the In-Memory column store by converting them into scan and filter operations using Bloom filters. But what about joins where there is no filter predicate on the dimension table to facilitate a Bloom filter? Before 12.2 we scanned both tables in the join via the In-Memory column store and then decompressed all of the data before sending it to a standard hash join. In 12.2 we have been able to speed up joins by ensuring the join columns are compressed in the same way within the In-Memory column store. The columns share a global compression dictionary, which allows us to execute the join on the compressed values (encoded symbols) rather than having to uncompress the data and then hash it to complete the join.
Steps required to benefit from join groups:
1. Create the join group:
    CREATE INMEMORY JOIN GROUP jgroup_cust (sales(cust_id), customers2(cust_id));
2. Check that the join group exists:
    select joingroup_name, table_name, column_name from user_joingroups;
3. Populate the tables into the In-Memory column store. Global dictionaries are only created on population / repopulation of tables.

In-Memory Joins: Join Groups
[Chart: 12.1 performance number (serial) versus 12.2 performance number with join group: 2x performance improvement. Join group on lineorder.lo_orderdate and date_dim.d_datekey.]
We tested the join performance on 12.1 with In-Memory and then on 12.2 with a join group defined on the tables populated In-Memory. A global dictionary was created on the list of distinct values in the join columns of the lineorder and date_dim tables (lo_orderdate and d_datekey). The addition of the join group resulted in an almost 2x performance improvement. In this example the join group is created as follows, and requires a repopulation:
    create inmemory join group jgrp_lo_date (lineorder(lo_orderdate), date_dim(d_datekey));
In 12.2 the join is carried out on the compressed values / symbols rather than on the actual data values, which removes the need to uncompress the data and then hash it.

Join Group : Managing & Monitoring
Typically compression is thought of as a capacity-saving mechanism only. However, data populated into the IM column store is compressed using a new set of compression algorithms that not only help save space but also improve query performance. The new in-memory compression format allows queries to execute directly against the compressed columnar format. This means all scanning and filtering operations execute on a much smaller amount of data. Data is only decompressed when it is found to be a match for a query. In-memory compression is specified using the keyword MEMCOMPRESS, a sub-clause of the INMEMORY attribute. There are six levels, each of which provides a different level of compression and performance.
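The six levels can be sketched as follows, ordered from no compression to the highest compression. The table name `sales` is a placeholder; in practice only one of the ALTER statements would be chosen.

```sql
ALTER TABLE sales INMEMORY NO MEMCOMPRESS;                -- no compression
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR DML;           -- optimized for DML
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY LOW;     -- best query performance
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY HIGH;
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR CAPACITY LOW;
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR CAPACITY HIGH; -- highest compression

-- Show the level currently in effect for the table
SELECT table_name, inmemory, inmemory_compression
FROM   user_tables
WHERE  table_name = 'SALES';
```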

Join Group : Under the Hood - Managing & Monitoring

What Are In-Memory Expressions?
SELECT Price*(1-Discount) FROM SALES WHERE Price * Tax > 50;
SELECT UPPER(Name) FROM PRODUCTS WHERE ROUND(Price,1) > 6.1;
In-Memory Expressions (IMEs) are the results of SQL expressions that are materialized in an additional column.
1-to-1 mapping to the values in a row of a table.
Many different types: arithmetic expressions; logical expressions (e.g. DECODE); type conversions (e.g. UPPER, TO_CHAR); PL/SQL expressions; constants.

In-Memory Expressions: Manual Declaration
Manually create virtual columns for the desired In-Memory expression.
INMEMORY_VIRTUAL_COLUMNS parameter:
ENABLE: all manually defined virtual columns of a table or partition are computed and loaded into memory when the table is used In-Memory.
MANUAL (default): user-defined virtual columns must be explicitly marked INMEMORY.
DISABLE: no user-defined virtual column is loaded.
CREATE TABLE SALES ( PRICE NUMBER, TAX NUMBER, …, NET AS (PRICE + PRICE * TAX) ) INMEMORY;
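The manual route can be sketched as follows. This is an illustration under assumptions: the SALES table is assumed to have PRICE, TAX and DISCOUNT columns as in the example queries, and net_price is a hypothetical column name.

```sql
-- Let every user-defined virtual column of INMEMORY tables be populated.
ALTER SYSTEM SET inmemory_virtual_columns = ENABLE;

-- Materialize one of the example expressions as a virtual column
-- (net_price is a hypothetical name chosen for this sketch).
ALTER TABLE sales ADD (net_price AS (price * (1 - discount)));

-- Queries keep using the plain expression; after the table has been
-- (re)populated, the precomputed values can be used instead of
-- evaluating the expression row by row.
SELECT price * (1 - discount)
FROM   sales
WHERE  price * tax > 50;
```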

In-Memory Expressions: Automatically Derived Expressions
The Expression Statistics Store (ESS) continuously monitors the workload and records frequently used and expensive expressions.
With the DBMS_INMEMORY_ADMIN package, frequently used expressions that have been detected can be turned into virtual columns:
IME_CAPTURE_EXPRESSIONS: identifies the "top N" hot expressions
IME_POPULATE_EXPRESSIONS: creates the In-Memory virtual columns
Diagram (column store + expressions): hot expressions such as a+5, Upper(x) and X+Y are constantly recorded in the ESS (current top N expressions); IME_CAPTURE captures new candidate expressions; IME_POPULATE declares hidden in-memory virtual columns for the candidates.
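The capture and populate steps can be sketched with the DBMS_INMEMORY_ADMIN procedures. This is a minimal example, assuming In-Memory expressions are not disabled via INMEMORY_EXPRESSIONS_USAGE; the 'CURRENT' snapshot covers roughly the past 24 hours, while 'CUMULATIVE' would cover all statistics collected so far.

```sql
BEGIN
  -- Capture the hottest expressions recorded in the ESS ...
  DBMS_INMEMORY_ADMIN.IME_CAPTURE_EXPRESSIONS('CURRENT');
  -- ... and populate them without waiting for the next repopulation.
  DBMS_INMEMORY_ADMIN.IME_POPULATE_EXPRESSIONS;
END;
/

-- Show the hidden virtual columns that were created for the expressions.
SELECT owner, table_name, column_name, sql_expression
FROM   dba_im_expressions;
```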

In-Memory Expressions: Statistics
The statistics can be read from the corresponding dictionary views all|dba|user_expression_statistics.
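A query along the following lines shows which expressions the Expression Statistics Store has seen most often. It is a sketch against the user-level view; the choice of the 'LATEST' snapshot and the ordering are assumptions made for the example.

```sql
-- Most frequently evaluated expressions per table in the current schema.
SELECT table_name,
       expression_text,
       evaluation_count,
       fixed_cost
FROM   user_expression_statistics
WHERE  snapshot = 'LATEST'
ORDER  BY evaluation_count DESC;
```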

In-Memory Expressions: Performance
Chart: 12.1 performance values compared with 12.2 performance values with two In-Memory virtual columns, created for the highlighted expressions. Result: 3.5x performance improvement.
Each In-Memory expression adds a hidden user-defined virtual column to the table. The expression is precomputed and stored in the hidden virtual column. The 12.1 test executed with a fully populated lineorder table, but the expressions had to be calculated for each occurrence since they could not be stored in the IM column store. The 12.2 test ran the query with In-Memory expressions enabled and two hidden virtual columns. Note that with the IMEs defined, the In-Memory size of the lineorder table was larger because of the space consumed by the hidden virtual columns (i.e. 287 GB vs. 338 GB). However, with the IMEs enabled the query was much faster, since the expressions had already been calculated and stored in the IM column store.

Multi-Model Analytics: In-Memory JSON
Full JSON document support in the In-Memory column store, in a special, optimized binary format.
Additional expressions can be defined as JSON columns.
Queries on JSON content and expressions are automatically redirected to the In-Memory format, e.g. find movies where movie.name contains "Jurassic".
Performance improvements.
Diagram: relational columns, In-Memory virtual columns and the In-Memory JSON format side by side (relational, virtual, JSON), with a sample document:
{ "Theater": "AMC 15", "Movie": "Sully", "Time": "T18:45:00", "Tickets": { "Adults": 2 } }
JSON documents have always been supported with Database In-Memory, but in 12.2 we have improved the format used to store the JSON document in memory. The new format allows us to scan and filter the data within each document much more efficiently than before. It is also possible to create additional In-Memory expressions on JSON documents using the JSON_VALUE function; their results are automatically materialized and maintained in the In-Memory column store. Oracle automatically redirects queries to use the In-Memory expressions instead of the base JSON column.
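A JSON column with an additional JSON_VALUE expression could look like the sketch below. The table ticket_sales and its column names are hypothetical; the path '$.Movie' follows the sample document on the slide. Whether the document is actually held in the In-Memory JSON format depends on further settings (for example the IS JSON check constraint shown here, and the In-Memory virtual column and expression parameters), which are assumed to be in place.

```sql
-- Hypothetical table: one JSON document per row plus a virtual column
-- that extracts the movie title with JSON_VALUE.
CREATE TABLE ticket_sales (
  id    NUMBER PRIMARY KEY,
  doc   VARCHAR2(4000)
        CONSTRAINT ticket_doc_is_json CHECK (doc IS JSON),
  movie AS (JSON_VALUE(doc, '$.Movie' RETURNING VARCHAR2(100)))
) INMEMORY;

-- The query is written against the JSON document; the same expression
-- is available precomputed in the virtual column.
SELECT COUNT(*)
FROM   ticket_sales
WHERE  JSON_VALUE(doc, '$.Movie' RETURNING VARCHAR2(100)) LIKE '%Jurassic%';
```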

Multi-Model Analytics: In-Memory JSON
Chart: 12.1 performance values compared with 12.2 performance values with an In-Memory JSON column. The In-Memory JSON column allows fast JSON_TABLE evaluation. Result: 23x performance improvement.
For the JSON tests a new table is created that has a JSON column defined for the entire document and a second column for just one JSON value. In 12.1, JSON is supported in the database but not in the IM column store. The search is relatively slow because it has to access the JSON document in the buffer cache. In 12.2 the first query searches the JSON document using the JSON_VALUE call, but since the document has been populated into the IM column store, the search runs on the binary in-memory format and is much faster than searching JSON from the buffer cache. The second query just picks up a JSON_VALUE result, so it is very fast.

Automation: In-Memory Column Store Policies
Automatic Data Optimization has been extended to In-Memory.
Policies can be applied to the In-Memory column store based on a heat map.
Policy automation, for example: bring new objects into the column store; apply stronger compression when data is used less frequently; remove objects from the column store.
Diagram: partitions Sales_Q1, Sales_Q2, Sales_Q3 and Sales_Q4.
It is also easier to ensure that you always have the right data in memory at the right time, because data is automatically moved in and out of the In-Memory column store based on an in-memory heat map and user-defined policies. The heat map tracks data accesses, which makes it possible to move data into memory and to evict it from memory based on user-defined policies.
Policies can be defined to: populate data into memory; evict data from memory; increase compression levels in memory.
Policy criteria are based on the time since the object was created, the data was accessed or the data was modified, or on a user-defined boolean function (TRUE: the policy is applied, FALSE: the policy is not applied).
NOTE: Requires the heat map feature, heat_map=on (init.ora parameter), but an ACO license is not required. Heat map/ADO is included with In-Memory in 12.2, but only for objects in the In-Memory column store.

Background: Automatic Data Optimization in 12c
An in-memory heat map tracks block and segment accesses.
Heat maps are periodically written to disk.
The data is accessible via views or stored procedures.
Rules can be defined on tables, e.g. for compression or for moving data to other storage locations.
Applies to tables, partitions or sub-partitions.
Can be performed online; the data remains available.
Allows automatic data tiering.
Diagram: heat map with Policy 1, Policy 2 and Policy 3.
Let's talk about Automatic Data Optimization. This is a new 12c feature which, together with Heat Map, makes managing storage growth feasible. It is part of the Advanced Compression Option. The database can now do most of the work that DBAs and others in the organization have a hard time getting under control: automated, policy-based management of storage using compression and storage tiering based on usage, which you can customize if you need to. This is a big deal. Make no mistake though, it is not an archive and purge solution. We have found that it is just too hard for most customers to actually remove data from their database into some kind of archive. Identifying how to remove the data and then managing it so that it can be retrieved in a timely manner over a long period of time is very hard. We believe you can use ADO to manage all of the data in your database.
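Switching on the heat map and reading the tracked accesses through one of the views can be sketched as follows. The object name SALES is taken from the policy examples in this deck and is an assumption about the schema.

```sql
-- Enable heat map tracking for the instance.
ALTER SYSTEM SET heat_map = ON;

-- Segment-level access information for the SALES table and its partitions.
SELECT object_name,
       subobject_name,
       segment_read_time,
       segment_write_time,
       full_scan
FROM   user_heat_map_segment
WHERE  object_name = 'SALES';
```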

Automatic Data Optimization with Database In-Memory
ADO IM policy examples:
alter table sales ilm add policy ..
no inmemory after 10 days of no access;
no inmemory after 45 days of creation;
memcompress for query after 3 days of creation;
memcompress for capacity after 21 days of creation;
no inmemory on MyCustomPolicyFunction;
Policy actions: set INMEMORY or NO INMEMORY; alter the MEMCOMPRESS level.
Policy criteria: after <time> of no access; after <time> of creation; after <time> of no modification; on <user-defined boolean function>.
Actions run in the maintenance window. The rules can also be executed manually with the dbms_ilm.execute_ilm procedure.
3 modes: on no access; on creation; on your custom function that returns a boolean. A successful policy acts as NO INMEMORY on the object (eviction only).
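Written out in full, one of the abbreviated policies and a manual policy evaluation might look like this. It is a sketch under assumptions: the compression level CAPACITY HIGH is chosen for illustration (the slide only says "capacity"), and the manual run uses the schema-wide scope of DBMS_ILM.EXECUTE_ILM.

```sql
-- Raise the in-memory compression level 21 days after creation.
ALTER TABLE sales ILM ADD POLICY
  MODIFY INMEMORY MEMCOMPRESS FOR CAPACITY HIGH
  AFTER 21 DAYS OF CREATION;

-- Check which policies exist in the current schema.
SELECT policy_name, policy_type, enabled
FROM   user_ilmpolicies;

-- Evaluate and execute the policies now instead of waiting for the
-- maintenance window.
DECLARE
  v_task_id NUMBER;
BEGIN
  DBMS_ILM.EXECUTE_ILM(
    task_id        => v_task_id,
    ilm_scope      => DBMS_ILM.SCOPE_SCHEMA,
    execution_mode => DBMS_ILM.ILM_EXECUTION_OFFLINE);
END;
/
```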

Automatic Data Optimization with Database In-Memory
Example:
ALTER TABLE sales ILM ADD POLICY NO INMEMORY AFTER 9 months OF CREATION;
Enables a "sliding window" for the columns in the column store.
Memory is released when the time has expired.
New data can be loaded.
Diagram: In-Memory column store with the partitions Sales_Q1, Sales_Q2, Sales_Q3 and Sales_Q4.

Improved Availability: In-Memory Fast-Start
The data is additionally stored persistently in the IM column format.
The contents of the In-Memory column store are checkpointed to a SecureFile.
After a database restart, the column store is populated faster: a factor of 2-5x compared with the normal loading of the column store.
Diagram: buffer cache and In-Memory column store in memory; on disk, the SALES tablespace (DBFILE1 and DBFILE2 with table and index) and the FastStart tablespace holding the FastStart data.
Previously, the In-Memory columnar format was not persisted on disk in any way, so when the database instance was shut down the In-Memory columnar format was gone. Each time the instance restarted, the In-Memory columnar format had to be rebuilt by reading the data in its row format, pivoting it 90 degrees to create columns, and then compressing the data before populating it into memory. As you can imagine, the population process is CPU-intensive and time-consuming. To speed up the population process, 12.2 introduces In-Memory FastStart. With IM FastStart, data from the In-Memory column store is periodically checkpointed to the FastStart tablespace on disk in its compressed columnar format. When the database restarts, the data is read directly from disk in its compressed columnar format, speeding up the population process by a factor of 2-5x.
PLEASE NOTE: Systems that are already I/O bound during the population process may not see as much benefit, as the I/O requirements during population have not changed.
Steps required to enable FastStart:
1. Allocate a FastStart tablespace using DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE. The tablespace should be sized at 2x the INMEMORY_SIZE parameter.
2. Data is stored in a LOB segment called DBIMFS_LOBSEG$ in the FastStart tablespace, and the metadata is held in the SYSAUX tablespace. More details can be found in the V$INMEMORY_FASTSTART_AREA view.
3. Data is automatically checkpointed to the FastStart area on population or repopulation. There is no manual way to force a checkpoint.
4. On subsequent database restarts, data is populated directly from the FastStart area. No formatting or compression is required. Standard population rules apply. Once an object is marked NO INMEMORY, it is removed from the FastStart area.

In-Memory FastStart – Under the Hood
BEGIN dbms_inmemory_admin.Faststart_enable('TS_FASTSTART'); END;
Allocate a FastStart tablespace with DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE.
The tablespace should be 2x the size of the INMEMORY_SIZE parameter.
The data is stored in the LOB segment DBIMFS_LOBSEG$.
The corresponding metadata is held in the SYSAUX tablespace.
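End to end, enabling FastStart might look like the following sketch. The tablespace size is a placeholder (it should follow the 2x INMEMORY_SIZE rule), and the CREATE TABLESPACE statement assumes that Oracle Managed Files are configured so that no file name has to be given.

```sql
-- Assumption: Oracle Managed Files are configured; 10G is a placeholder
-- that should be replaced by roughly 2x INMEMORY_SIZE.
CREATE TABLESPACE ts_faststart DATAFILE SIZE 10G AUTOEXTEND ON;

-- Designate the tablespace as the FastStart area.
BEGIN
  DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE('TS_FASTSTART');
END;
/

-- Check the status and the space used by the FastStart area.
SELECT tablespace_name, status, allocated_size, used_size
FROM   v$inmemory_faststart_area;
```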

In-Memory FastStart – Under the Hood
On restart, the data is loaded directly from the FastStart area.
No reformatting or In-Memory compression is necessary.
Diagram: buffer cache and In-Memory column store; the data is checkpointed as FastStart data to the FastStart tablespace, next to the SALES tablespace (DBFILE1 and DBFILE2 with table and index).

Analytic Views

How Do Today's Analytics Infrastructures Work?
Metadata and calculations are defined in the application layer:
Little reuse
Duplicated work
Possible inconsistencies
Possibly unnecessary data transfers
Too much and too expensive infrastructure on the client side
Too much IT know-how / staff in the business departments
….
Query generators: complex code generators, inflexible, heavyweight tools.
Layers shown: BI tool and BI applications (business model and calculations), query generator, database (data and query execution).

Analytic Views
Complete description of a multidimensional view by means of metadata in the database:
Dimension attributes
Hierarchies with levels
Measures (direct and virtual, calculated columns)
Implementation of business logic
Accessible via MDX

Simplifying Multidimensional Queries
Physical layer: tables, views, external tables, Big Data SQL
Business model: attributes, levels + hierarchies, aggregation rules, measure calculations
Presentation layer: simple SQL, MDX / OLE DB for OLAP, presentation metadata

3 New Database Objects
Attribute dimensions: identify data objects; define the function of columns.
Hierarchies: levels and drill paths; queryable structures; structures resolved at value level.
Analytic views: couple dimensions and facts; a new type of view that pulls data objects together via metadata.
Diagram: source tables, keys, attributes, descriptions and attribute definitions feed the attribute dimension; levels (All, Land, Kreis, Ort; Jahr, Monat, Woche, Tag) form the hierarchies; fact tables, measures, calculations and aggregation rules form the analytic view.

What Is Needed to Produce This Output?
Required output: revenue per article, grouped by year; share of the article revenue in the total revenue of the article group; comparison with the previous year.
Data model shown on the slide (star schema):
Fact table: keys Zeit_ID, Artikel_ID, Region_ID, Channel_ID; facts Menge (quantity), Umsatz_Pos (revenue per line item)
Zeit (time): Tag, Woche, Monat, Quartal, Jahr, Dekade, with the hierarchies Tag/Woche/Jahr and Tag/Monat/Quartal/Jahr
Artikel (article): Artikel, Gruppe, Sparte; example articles: Bohrhammer 4711, Farbtopf_Lack_rot, CU_Muffe_18mm; example values: Sanitär, Garten, Elektro, Food, Non-Food, Services
Region: Ort, VertGebiet, Kreis, Bundesland, Land, Region
Channel: Channel, Medium, Kampagne

Building blocks shown on the slide: source table, attribute mapping, order and levels form an attribute dimension; the level hierarchies form a hierarchy on top of the attribute dimension; the source fact table, the attribute dimensions, the hierarchy objects and the measures (Berechnung_1, Berechnung_2, Berechnung_3, Berechnung_4, …….) form the analytic view.
The resulting query pattern:
SELECT dim_Zeit.att1, dim_Artikel.att2, Measure1, Berechnung_2 ….
FROM AV_Umsatz HIERARCHIES (H1,H2)
WHERE H1.level_name in ('xxxxx') and H2.level_name in ('yyyyy')
ORDER BY H1.level_order

Diagram: the analytic view AV_UMSATZ on top of the star schema.
Measures: Menge, Umsatz, Vorjahres_Umsatz, Year_To_Date_Umsatz, ….
Hierarchy levels used by the analytic view:
Artikel: All, Sparte, Gruppe, Artikel
Zeit: Jahr, Quartal, Monat, Tag
Region: All, Region, Land, Kreis, Ort
Underlying star schema: fact table with the keys Zeit_ID, Artikel_ID, Region_ID, Channel_ID and the facts Menge, Umsatz_Pos; dimensions Zeit (hierarchies Tag/Woche/Jahr and Tag/Monat/Quartal/Jahr), Artikel, Region and Channel.

select Artikel_hier.member_name as Artikel,
       ZEIT_HIER.jahr,
       umsatz,
       anteil_artikel_parent,
       umsatz_vorjahr
from av_umsatz hierarchies (Artikel_hier, ZEIT_HIER)
where Artikel_hier.level_name = 'ARTIKEL'
  and ZEIT_HIER.level_name = 'JAHR'
order by ZEIT_HIER.jahr;
Diagram: the query runs against AV_UMSATZ (measures Menge, Umsatz, Vorjahres_Umsatz, Year_To_Date_Umsatz, ….) and its hierarchies: Artikel (All, Sparte, Gruppe, Artikel), Zeit (Jahr, Quartal, Monat, Tag) and Region (All, Region, Land, Kreis, Ort).

The same query with the annotations from the slide:
select Artikel_hier.member_name as Artikel,   -- level member values from the respective dimensions
       ZEIT_HIER.jahr,
       umsatz,                                 -- data column in the fact table
       anteil_artikel_parent,                  -- additional calculations
       umsatz_vorjahr
from av_umsatz                                 -- analytic view definition
     hierarchies (Artikel_hier, ZEIT_HIER)     -- hierarchy definitions
where Artikel_hier.level_name = 'ARTIKEL'      -- sets the hierarchy level at which the measure is evaluated
  and ZEIT_HIER.level_name = 'JAHR'
order by ZEIT_HIER.jahr;
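Because the aggregation level is selected purely through level_name, moving up one level needs no GROUP BY. The variation below is a sketch that reports the same measures per article group instead of per article; it assumes the level names ARTIKEL_GRUPPE and JAHR as defined in the attribute dimensions of this deck.

```sql
-- Same analytic view, one level higher in the article hierarchy.
SELECT artikel_hier.member_name AS gruppe,
       zeit_hier.member_name    AS jahr,
       umsatz,
       umsatz_vorjahr
FROM   av_umsatz HIERARCHIES (artikel_hier, zeit_hier)
WHERE  artikel_hier.level_name = 'ARTIKEL_GRUPPE'
AND    zeit_hier.level_name    = 'JAHR'
ORDER  BY artikel_hier.hier_order, zeit_hier.hier_order;
```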

Structured Access to Dimensional Values
Standardized definitions and keywords:
SELECT CASE is_leaf
         WHEN 0 then lpad(' ',depth * 2,'.') || '+ ' || member_name
         ELSE lpad(' ',depth * 3,'.') || member_name
       END AS DRILL,
       depth
FROM Artikel_hier
ORDER BY hier_order;
SELECT CASE is_leaf
         WHEN 0 then lpad(' ',depth * 2,'.') || '+ ' || member_name
         ELSE lpad(' ',depth * 3,'.') || member_name
       END AS DRILL,
       depth
FROM Ort_hier
ORDER BY hier_order;
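Besides is_leaf, depth and hier_order, a hierarchy exposes further standardized columns for navigating between levels. The following sketch lists each article group together with its parent; the level name ARTIKEL_GRUPPE is taken from the attribute dimension definition in this deck.

```sql
-- Each article group with the level and unique name of its parent member.
SELECT member_name,
       level_name,
       parent_level_name,
       parent_unique_name
FROM   artikel_hier
WHERE  level_name = 'ARTIKEL_GRUPPE'
ORDER  BY hier_order;
```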

Create Dimension / Hierarchy
Create or replace force attribute dimension Att_dim_Artikel
dimension type STANDARD
USING D_ARTIKEL
Attributes (
  "ARTIKEL_NAME" as ARTIKEL_NAME,
  "ARTIKEL_ID" as ARTIKEL_ID,
  "GRUPPE_NR" as GRUPPE_NR,
  "GRUPPE_NAME" as GRUPPE_NAME,
  "SPARTE_NAME" as SPARTE_NAME,
  "SPARTE_NR" as SPARTE_NR,
  "QUALITAET" as QUALITAET,
  "WERTIGKEIT" as WERTIGKEIT,
  "EK_PREIS" as EK_PREIS,
  "VK_PREIS" as VK_PREIS)
LEVEL ARTIKEL
  KEY ARTIKEL_ID
  MEMBER NAME ARTIKEL_NAME
  MEMBER CAPTION ARTIKEL_NAME
  ORDER BY ARTIKEL_ID
  DETERMINES (GRUPPE_NR,EK_PREIS,VK_PREIS,QUALITAET,WERTIGKEIT)
LEVEL ARTIKEL_GRUPPE
  KEY GRUPPE_NR
  MEMBER NAME GRUPPE_NAME
  MEMBER CAPTION GRUPPE_NAME
  DETERMINES (SPARTE_NR)
LEVEL ARTIKEL_SPARTE
  KEY SPARTE_NR
  MEMBER NAME SPARTE_NAME
  MEMBER CAPTION SPARTE_NAME
ALL MEMBER NAME 'Alle Artikel'
Annotations on the slide: level definition; key at level granularity; standardized terminology, e.g. MEMBER NAME; parent/child definition; drill paths.
CREATE OR REPLACE HIERARCHY Artikel_hier
USING Att_dim_artikel
(ARTIKEL CHILD OF ARTIKEL_GRUPPE CHILD OF ARTIKEL_SPARTE)

Definition Analytic View
CREATE OR REPLACE ANALYTIC VIEW av_umsatz
USING F_UMSATZ
DIMENSION BY (
  Att_dim_Artikel KEY ARTIKEL_ID REFERENCES ARTIKEL_ID HIERARCHIES (Artikel_hier),
  Att_dim_REGION KEY REGION_ID REFERENCES ORTNR HIERARCHIES (Ort_hier),
  Att_dim_zeit KEY ZEIT_ID REFERENCES ZEIT_ID HIERARCHIES (Zeit_hier))
MEASURES (
  umsatz FACT umsatz,
  anteil_artikel_parent as (SHARE_OF (UMSATZ HIERARCHY Artikel_hier PARENT)),
  anteil_Region_parent as (SHARE_OF (UMSATZ HIERARCHY ort_hier PARENT)),
  umsatz_vorjahr as (LAG(umsatz) OVER (HIERARCHY Zeit_hier OFFSET 1 ACROSS ANCESTOR AT LEVEL jahr)),
  menge FACT menge)
DEFAULT MEASURE UMSATZ;
Annotation on the HIERARCHIES clause: list of hierarchies that use the attribute dimension.
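The calculated measures defined in the analytic view can be selected like ordinary columns. As a sketch, the query below reads the region share measure; it assumes the measure is named anteil_Region_parent and that the levels LAND and REGION exist in Ort_hier, as the definitions in this deck suggest.

```sql
-- Revenue per Land and Region plus each member's share of its parent.
SELECT ort_hier.member_name AS region_member,
       ort_hier.level_name,
       umsatz,
       anteil_region_parent
FROM   av_umsatz HIERARCHIES (ort_hier)
WHERE  ort_hier.level_name IN ('LAND', 'REGION')
ORDER  BY ort_hier.hier_order;
```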

Summary Scenarios

From In-Memory to Online Archive: One Object!
Diagram: an analytic view (meta view) in Oracle 12c, accessed via SQL / MDX, describes dimensions and facts, hierarchy objects, drill paths, level attributes and measures (Kennzahl_1, Kennzahl_2, Kennzahl_3, Kennzahl_4, ……., Kennzahl_n). Dimensions: Filiale, Produkt, Kunden, Zeit, Region, Kampagne. The fact data Bons (sales receipts) is stored in a completely hybrid way, partly in the database and partly in HDFS.

Scenarios: Combined Use of
In-Memory, Big Data SQL and Analytic Views.
Access via business-oriented metadata, with transparent, distributed physical storage of the data.

Backup Slides

Create Dimension
create force attribute dimension Att_dim_REGION
dimension type STANDARD
USING D_REGION
Attributes (
  ORTNR as ORTNR,
  ORT AS ORT,
  KREISNR AS KREISNR,
  KREIS AS KREIS,
  LAND AS LAND,
  LANDNR AS LANDNR,
  REGION AS REGION,
  REGIONNR AS REGIONNR,
  REGION_ID AS REGION_ID)
LEVEL ORT KEY ORTNR DETERMINES (ORT)
LEVEL Kreis KEY KREISNR DETERMINES (KREIS)
LEVEL Land KEY LANDNR DETERMINES (LAND)
LEVEL REGION KEY REGION_ID DETERMINES (REGION, REGIONNR)
/
create force attribute dimension Att_dim_artikel
dimension type STANDARD
USING D_ARTIKEL
Attributes (
  ARTIKEL_NAME AS ARTIKEL_NAME,
  ARTIKEL_ID AS ARTIKEL_ID,
  GRUPPE_NR AS GRUPPE_NR,
  GRUPPE_NAME AS GRUPPE_NAME,
  SPARTE_NAME AS SPARTE_NAME,
  SPARTE_NR AS SPARTE_NR)
LEVEL Artikel KEY ARTIKEL_ID DETERMINES (ARTIKEL_NAME)
LEVEL Artikel_Gruppe KEY GRUPPE_NR DETERMINES (GRUPPE_NAME)
LEVEL Artikel_Sparte KEY SPARTE_NR DETERMINES (SPARTE_NAME)
/

Hierarchy Objects
Assignment of the attributes to hierarchy objects and levels; ordering of the levels into hierarchies.
CREATE OR REPLACE HIERARCHY REGION_hier
USING Att_dim_REGION
(ORT CHILD OF KREIS CHILD OF LAND CHILD OF REGION)
/
CREATE OR REPLACE HIERARCHY Artikel_hier
USING Att_dim_artikel
(ARTIKEL CHILD OF ARTIKEL_GRUPPE CHILD OF ARTIKEL_SPARTE)
/
CREATE OR REPLACE HIERARCHY Zeit_Woche_Jahr
USING Att_dim_Zeit
(TAG CHILD OF WOCHE CHILD OF JAHR)
/
CREATE OR REPLACE HIERARCHY Zeit_Monat_Quartal_Jahr
USING Att_dim_Zeit
(TAG CHILD OF MONAT CHILD OF QUARTAL CHILD OF Jahr)
/
