
Prof. Dr. Stefan Edlich: NoSQL in the Cloud


1 Prof. Dr. Stefan Edlich: NoSQL in the Cloud

2 nosqlberlin.de nosqlfrankfurt.de nosql powerdays

3

4

5

6 NoSQL is specialization! Big Data; Massive Write Performance; Fast KV Access; Write Availability; Flexible Schema (Migration) + Flexible Datatypes; Easier maintainability, administration and operations; No single point of failure; Programmer ease of use

7 Theory?! Map/Reduce: Map/Reduce successors! ACID / BASE & CAP: as a rule, P (a partition) never occurs! Consistent Hashing: the basis of scalable K/V stores. MVCC: non-blocking advantages. Vector Clocks: [122:1] [147:2|122:1] [97:3|147:2|122:1]

8

9 Google Protocol Buffers =>

10 Definitely evaluate! Apache Avro!

11 Data models

12 Voldemort, Chordless, Scalaris, Dynamo / Dynomite; db4o, Versant, Objectivity, Gemstone, Progress, Mark Logic, EMC Momentum, Tamino, GigaSpaces, Hazelcast, Terracotta, …; Column Family, Document DBs, Key/Value DBs, Graph DBs, others

13 HBase Cassandra SimpleDB

14 + scaling = new node + community + API - replication - setup, tuning, maintenance | + scaling = new node + replication + configuration (r, w) - documentation - queries | + stress-free SaaS solution + transparent scaling - UTF-8 strings - data resides at Amazon +- no tuning / config
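The "(r, w)" configuration on this slide presumably refers to tunable read and write quorums, as found in Dynamo-style stores. A minimal sketch of the usual rule of thumb, assuming N replicas, a read quorum of R and a write quorum of W: every read overlaps the latest write exactly when R + W > N. The function names are hypothetical and not taken from any product's API.

```python
from itertools import combinations

def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Rule of thumb: read and write quorums must overlap when r + w > n."""
    return r + w > n

def always_intersect(n: int, r: int, w: int) -> bool:
    """Brute-force check: does every possible read set share a replica
    with every possible write set?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )

# With 3 replicas, r=2/w=2 always overlaps; r=1/w=1 may read stale data.
strong = quorums_overlap(3, 2, 2)
weak = quorums_overlap(3, 1, 1)
```

Lower values of r and w trade this overlap guarantee for latency and availability, which is one reading of "eventually consistent" in this context.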

15 Document Databases

16 any JS-Client no Middleware! DB+WebServer +evolving App

17 2nd round += $6.5 million

18 - not normalized (duplicates, delete orphans, ...) - (crash-prone for a configurable time window) (journaling) - eventually consistent - real scaling only via sharding - (not yet kill -9 safe)

19 67 GB Index Data 11 hours + 1 day off

20 + not normalized + schema agility + excellent documentation + speed (memory-mapped files) + installation + save = 28 sec! + arbitrary indexes + MapReduce + rich query language + GridFS (instead of HDFS) + simple replication (master-slave / replica sets)

21 db.system.indexes.find();
db.friends.getIndexes();
db.friends.ensureIndex({friend: 1});
db.friends.ensureIndex({friend: 1, zip: 1}); // compound
db.friends.find({friend: "Mario", zip: 13755}).explain();
Queries: age: {$gt: 10} food: {$all: ["pizza", "noodles"]}
$gt, $lt, $lte, $ne, $in, $nin, $mod, $all, $size, $exists, $type, $or, $elem, $elemMatch, regexp, ...
NoSQL Query Lock-In?!
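To make the query operators above concrete, here is a minimal sketch of how `$gt`, `$all` and `$in` conditions could be evaluated against plain documents. It is a simplified illustration in Python, not MongoDB's implementation (real MongoDB has additional rules, for example for array fields), and the second sample document ("Luigi") is invented for the example.

```python
from typing import Any, Dict

def matches(doc: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Return True if doc satisfies every condition in query (simplified)."""
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):
            # Operator form, e.g. {"$gt": 10}
            for op, arg in cond.items():
                if op == "$gt":
                    if value is None or not value > arg:
                        return False
                elif op == "$all":
                    if not isinstance(value, list) or not all(x in value for x in arg):
                        return False
                elif op == "$in":
                    if value not in arg:
                        return False
                else:
                    raise ValueError(f"operator {op} is not covered by this sketch")
        elif value != cond:
            # Plain form, e.g. {"friend": "Mario"}: exact equality only
            return False
    return True

friends = [
    {"friend": "Mario", "zip": 13755, "age": 12, "food": ["pizza", "noodles", "salad"]},
    {"friend": "Luigi", "zip": 10115, "age": 9, "food": ["pizza"]},  # hypothetical
]
query = {"age": {"$gt": 10}, "food": {"$all": ["pizza", "noodles"]}}
hits = [d["friend"] for d in friends if matches(d, query)]
```

Because each store defines its own operator vocabulary like this, queries do not port between products, which is the lock-in concern raised on the slide.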

22 Changing schema: rename. try {... } catch (FirstException | SecondException ex) { // newName = BlackList.checkName(OldName) }
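One common way to handle a renamed field in a schemaless store is to upgrade documents lazily when they are read, instead of migrating everything at once. The sketch below assumes an in-memory dict as the store and hypothetical field names (`name` renamed to `full_name`); it is an illustration of the idea, not the approach shown on the slide.

```python
from typing import Any, Dict

def upgrade(doc: Dict[str, Any], old: str, new: str) -> Dict[str, Any]:
    """Return a copy of doc in which the field `old` is renamed to `new`."""
    if old not in doc or new in doc:
        return dict(doc)  # already migrated, or nothing to rename
    upgraded = {k: v for k, v in doc.items() if k != old}
    upgraded[new] = doc[old]
    return upgraded

# Hypothetical store: document 1 still uses the old field name.
store = {1: {"name": "Mario"}, 2: {"full_name": "Anna"}}

def read(doc_id: int) -> Dict[str, Any]:
    """Read a document and migrate it on the fly."""
    doc = upgrade(store[doc_id], "name", "full_name")
    store[doc_id] = doc  # write back so the rename happens only once
    return doc

first = read(1)
second = read(2)
```

The application code then only ever sees the new name, while old documents are converted as they are touched.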

23 B) Rails Migration: new name, old name, new name (not if replicated too often) [diagram labels: old name / new name, repeated]

24 Duplicates = space; data freshness; pre-joined data! pre-computed; move growing data out, or pre-space it

25 Into the cloud…

26 Clients; Config Servers; mongos ROUTER; Replica Set; Shard A, Shard B, Shard C; RAM+ DISK+; POSSIBLE ARBITER; micro 64 bit; [extra | double | quadruple] Large

27 Experiences… RAID configurations (00, 01, 10, 03, 05, …); journaling file systems (ext4, xfs, …); (security) ports, file descriptors, snapshots, … + EC2

28 K/V stores: mapping data structures ->

29 Sorted Set

30 memcached API

31 simply dynamic scaling (up & down); scales linearly; bulletproof by Zynga.com; limited membase protocol; Membase Tap (Protocol Interception); Code-Node:

32 Membase in the cloud: ready-made RightScale & AMI templates; open various ports; DNS entry and no changing IPs; specify the master node (it sets the quota for its heirs); backups for EBS

33 GraphDBs Property Graph

34 player

35 Graph DBs in the cloud: > N billion nodes? Sharding! But usually no predictable lookup is possible, only with domain-specific knowledge; balanced DBs without sweet spots are hardly possible. Access patterns + heuristics (insert sharding / runtime sharding) => partitioning algorithms. (HA) Neo4j cache sharding! Multi-master cluster for consistent routing

36 Quite frustrating consulting…

37 Data Transactions Performance Queries Architecture other Non-Functional Requirements

38 Analyse your data: Domain Data, Log Data, Event Data, Message Data, critical Data, Business Data, Meta Data, temp Data, Session Data, Geo Data, etc. Data / Storage Model: relational, column-oriented, doc-alike, graphs, objects, etc. What Types / Type System? Data Navigation, Data Amount, Data Complexity (Deep XML?). ACID vs. BASE vs. Mixture? CAP decisions. Performance Dimension Analysis: Latency, Request behaviour, Throughput, Scale-Up vs Scale-Out. Query Requirements: Typical queries, Tools, Ad-Hoc Queries, SQL / LINQ needed, Map/Reduce? … Non-Functional Requirements: Replication, Refactoring Frequency, DB Support, Qualification / simplicity, Company restrictions, DB diversity (allowed?), Security, Safety / Backup & Restore, Crash Resistance, Licence…

39 NoSQL CONCLUSION

40 Definitely assume RAM & SSD! Gustavo Alonso; Lots of >1 PT RAM DBs in California! Service, RAM, Cloud, Mobile; RethinkDB; SAP strategy?

41 The DaaS era: for MongoDB alone, well over 100 Database-as-a-Service providers! Amazon: SimpleDB, Hadoop, etc.

42 Many clever hybrid solutions! CouchBase, Hadoop+MySQL

43 OLAP Availability Ad Hoc Query

44 Critical data | uncritical data (view, domain, master, meta, log, …) by Couch, MongoDB, Redis, Membase, … | Management, Analytics | payment data, personal data, … by classic RDBMS, Vertica, VoltDB, Database.com, GenieDB, … | Hadoop*, BI, OLAP, BI | Dwight Merriman (10gen)

45 Links nosql-database.org nosqltapes.com mynosql.com.com

46

47 Protective patent: functional (graph) decomposition? Or… Group By. Use case: Aggregate; pi -> > 1000 cluster
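The "Group By" and "Aggregate" use case can be sketched as a tiny single-process map/reduce: the map step emits key/value pairs, the pairs are grouped by key, and the reduce step aggregates each group. The order records and the `map_reduce` helper are hypothetical illustrations; a real framework distributes the same three steps across a cluster.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, List, Tuple

def map_reduce(
    records: Iterable[Any],
    mapper: Callable[[Any], Iterable[Tuple[Any, Any]]],
    reducer: Callable[[Any, List[Any]], Any],
) -> Dict[Any, Any]:
    """Run map, group-by-key (shuffle) and reduce in one process."""
    groups: Dict[Any, List[Any]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical input: order amounts per city.
orders = [
    {"city": "Berlin", "amount": 10},
    {"city": "Frankfurt", "amount": 5},
    {"city": "Berlin", "amount": 7},
]
totals = map_reduce(
    orders,
    mapper=lambda o: [(o["city"], o["amount"])],
    reducer=lambda city, amounts: sum(amounts),
)
```

This corresponds to `SELECT city, SUM(amount) ... GROUP BY city`; since each key's group is reduced independently, the reduce step parallelizes naturally.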

48 Programming is great! Programming is annoying! Only `large data indexing`. Strong competition: Stratosphere (TUB), ePic, SwissBox, etc. A giant step back! Incompatible, missing features, not new, … wonderfully parallelizable

49 Map, Reduce, Cross, Match, CoGroup: Parallelization Contracts and many more … => Graph ops: compile, analyze, optimize on a breathing cloud!

50 Amazon Dynamo; MySQL replication; Eventually Consistent

51 Consistency Models © Wilfried Springer NoSQL Rollercoaster

52 CAP Theorem: Consistency (ACID / isolation; clients see equal data), Availability (system is always on), Partition Tolerance (clients find replicas). Pick 2! A NoSQL classic

53 Don't throw C away so easily! It's complex.

54 6 = Network Partition is rare; 3,4,5,6 is mostly a Single Node; Algorithms can help!

55 Consistent Hashing
Ring ranges: M:[0,5) N:[5,10) O:[10,15) P:[15,20) Q:[20,25) R:[25,30)
HASH  NODE  REPLICAS
2     M     N, O
8     N     O, P
10    O     P, Q
17    P     Q, R
22    Q     R, M
26    R     M, N
W = 2*W R = 1*R
Fault-tolerant, easily extensible, well distributed / vnodes
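The lookup behind this slide can be sketched as follows: a hash value is assigned to the node whose ring range contains it, and the replicas go to the next two nodes clockwise. The ring layout is taken from the slide; the function name `lookup` and the choice of two replicas as a default parameter are assumptions for illustration. A real store would first hash the key onto the ring; here the hash values are given directly.

```python
from bisect import bisect_right
from typing import List, Tuple

RING_SIZE = 30
# (range start, node): M owns [0,5), N owns [5,10), ... R owns [25,30)
RING = [(0, "M"), (5, "N"), (10, "O"), (15, "P"), (20, "Q"), (25, "R")]
STARTS = [start for start, _ in RING]

def lookup(hash_value: int, replicas: int = 2) -> Tuple[str, List[str]]:
    """Return (owner node, replica nodes) for a hash value on the ring."""
    # Find the last range whose start is <= the hash position.
    idx = bisect_right(STARTS, hash_value % RING_SIZE) - 1
    owner = RING[idx][1]
    # Replicas are the next nodes clockwise, wrapping around the ring.
    successors = [RING[(idx + i) % len(RING)][1] for i in range(1, replicas + 1)]
    return owner, successors

placement = {h: lookup(h) for h in (2, 8, 10, 17, 22, 26)}
```

Adding or removing a node only changes the ranges adjacent to it, which is why the scheme is easy to extend; virtual nodes (vnodes) give each physical node several such ranges to even out the distribution.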

56 Pessimistic locking? MVCC: Multi-Version Concurrency Control
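As a contrast to pessimistic locking, here is a minimal MVCC-style sketch: every write creates a new version, readers never block and can still read older versions, and a writer must state which version it based its change on. The class and method names are hypothetical; production systems add garbage collection of old versions and transaction-wide snapshots.

```python
from typing import Any, Dict, List, Optional, Tuple

class ConflictError(Exception):
    """Raised when a write is based on an outdated version."""

class VersionedStore:
    def __init__(self) -> None:
        # key -> list of values; version n is stored at index n - 1
        self._versions: Dict[str, List[Any]] = {}

    def read(self, key: str) -> Tuple[int, Optional[Any]]:
        """Return (latest version number, value); version 0 means 'absent'."""
        versions = self._versions.get(key, [])
        return (len(versions), versions[-1]) if versions else (0, None)

    def read_at(self, key: str, version: int) -> Any:
        """Read an older version without blocking any writer."""
        return self._versions[key][version - 1]

    def write(self, key: str, value: Any, expected_version: int) -> int:
        """Append a new version if the caller saw the latest one."""
        versions = self._versions.setdefault(key, [])
        if len(versions) != expected_version:
            raise ConflictError(f"{key}: expected v{expected_version}, found v{len(versions)}")
        versions.append(value)
        return len(versions)

store = VersionedStore()
store.write("doc", "a", expected_version=0)   # creates version 1
version, _ = store.read("doc")
store.write("doc", "b", expected_version=version)  # creates version 2
try:
    store.write("doc", "c", expected_version=1)     # stale writer
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

The stale writer is rejected rather than blocked, so it can re-read and retry; no lock is ever held.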

57

58 Vector Clocks (Anna, Paul, Laura): running L:1; surfing P:1 L:1; surfing L:2 P:1; running A:1 L:1; running A:1 L:1 P:0; surfing L:2 P:1 A:0 => surfing P:2 A:1 L:2
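The last step on this slide can be reproduced with three small vector-clock operations: compare, merge and increment. The sketch assumes that A, L and P stand for Anna, Laura and Paul, and that Paul is the one who resolves the conflict between the clocks L:2 P:1 A:0 and A:1 L:1 P:0; the function names are hypothetical.

```python
from typing import Dict

Clock = Dict[str, int]

def increment(clock: Clock, node: str) -> Clock:
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def descends(a: Clock, b: Clock) -> bool:
    """True if a has seen everything b has (a >= b for every node)."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def concurrent(a: Clock, b: Clock) -> bool:
    """True if neither clock descends from the other, i.e. a conflict."""
    return not descends(a, b) and not descends(b, a)

def merge(a: Clock, b: Clock) -> Clock:
    """Component-wise maximum of two clocks."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}

clock_a = {"L": 2, "P": 1, "A": 0}  # first of the two conflicting versions
clock_b = {"A": 1, "L": 1, "P": 0}  # second of the two conflicting versions
in_conflict = concurrent(clock_a, clock_b)

# Paul (P) picks a value, merges both histories and records his own write.
resolved = increment(merge(clock_a, clock_b), "P")
```

The resolved clock is P:2 A:1 L:2, matching the final entry on the slide, and it descends from both conflicting versions, so replicas can tell that it supersedes them.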

