Coding Serbia 2015

Post on 12-Feb-2017

Breaking free from relational databases

Matija Gobec @mad_max0204

SmartCat @SmartCat_io

Agenda

What’s wrong with RDBMS

What we need today

Introduction to Cassandra

Use-cases and benefits

Learning process and common mistakes

Migrating to Cassandra

What’s wrong with RDBMS

Reliable and we understand it

Been around for ages

Development and administration tools

Worked great for a few decades

What’s wrong with RDBMS

Requires expensive server hardware

Doesn’t scale

Limited fault tolerance

Can’t handle large amounts of data

Architectural limitations

What’s wrong with RDBMS

Master

Slave Slave

● Network split

● Hardware failure

● Latency

What’s wrong with RDBMS

Master

Master Slave

? ?

What we need today

90% of the world's data was generated over the last two years

Data impacts business

Systems that can learn and adapt

Personalized experience

We store everything

What we need today

Availability

Scalability

Fault tolerance

Performance

ACID vs BASE

ACID: Atomic, Consistent, Isolated, Durable

BASE: Basically Available, Soft state, Eventually consistent

Cassandra - introduction

Row partitioned storage

Fast, scalable and fault tolerant

Shared-nothing, masterless architecture

Active everywhere design

Native multi-datacenter support
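The bullets above describe row-partitioned, masterless storage. A minimal sketch of the idea, assuming a hash ring with one token per node and a replication factor of 3: every node can work out which replicas own a partition key, so no master is needed for routing. The `Ring` class, the node names and the use of MD5 are illustrative assumptions, not Cassandra's actual partitioner or API.

```python
import hashlib
from bisect import bisect_right


def token(key: str) -> int:
    # MD5 is only a stand-in hash for this sketch (assumption),
    # it is not claimed to be what Cassandra uses.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical token ring: each node owns one token."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by their token to form the ring.
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, partition_key):
        # First node clockwise from the key's token, then the next ones.
        start = bisect_right(self.tokens, token(partition_key)) % len(self.ring)
        return [
            self.ring[(start + i) % len(self.ring)][1]
            for i in range(self.replication_factor)
        ]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
ring = Ring(nodes)
owners = ring.replicas("playlist-42")
```

Any node can run the same calculation and get the same `owners`, which is why a client may contact whichever node it likes.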

Cassandra - architecture

Client contact

Cassandra - architecture

Client request

Cassandra - architecture

Client response

Cassandra - architecture

DC1 DC2

Cluster

Cassandra - data layout

[Diagram: a partition, identified by its partition key, holds cells stored as K:V pairs. In the second variant a clustering key is added inside the partition.]
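The layout above can be sketched in a few lines, assuming the playlists table shown later in the deck (partition key `id`, clustering key `song_order`). The dictionaries and the `upsert` / `read_partition` helpers are hypothetical names for illustration and are not how Cassandra stores data on disk.

```python
# partition key -> {clustering key -> cells}
partitions = {}


def upsert(playlist_id, song_order, cells):
    # Writing the same (partition key, clustering key) again overwrites the row.
    partitions.setdefault(playlist_id, {})[song_order] = cells


def read_partition(playlist_id, first=None, last=None):
    # A read targets one partition and returns rows ordered by clustering key,
    # optionally restricted to a clustering-key range.
    rows = partitions.get(playlist_id, {})
    return [
        (ck, rows[ck])
        for ck in sorted(rows)
        if (first is None or ck >= first) and (last is None or ck <= last)
    ]


upsert("p1", 2, {"title": "Song B"})
upsert("p1", 1, {"title": "Song A"})
upsert("p1", 3, {"title": "Song C"})
upsert("p2", 1, {"title": "Song Z"})

whole_playlist = read_partition("p1")
slice_of_playlist = read_partition("p1", first=2, last=3)
```

Reads that name the partition key and a clustering-key range are cheap in this picture. Anything else means touching many partitions.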

Cassandra - use cases

IoT applications

Product catalogs and retail apps

Activity tracking

Fraud detection

Messaging

Analytics and recommendation engines

...

Cassandra - benefits

Reliable storage

High performance

Easy scaling on commodity hardware

Solves problems by design

But at what cost?

An arm and a leg?

Learning process

CQL looks like SQL

CREATE TABLE songs (
  id uuid PRIMARY KEY,
  title text,
  album text,
  artist text,
  data blob
);

CREATE TABLE playlists (
  id uuid,
  song_order int,
  song_id uuid,
  title text,
  album text,
  artist text,
  PRIMARY KEY (id, song_order)
);

BUT IT’S NOT!!!

Learning process

I’ll create a data model based on my relational data model

(especially while migrating)

WRONG!!!

Learning process

I can read from the database what I have just written

(write-read antipattern)

WRONG!!!
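Why an immediate read can miss a fresh write is easiest to see by counting replicas. The sketch below assumes a replication factor of 3 and that a write is only guaranteed to be on the replicas that acknowledged it. It checks, for every possible combination, whether the replicas answering a read must include at least one replica that took the write. The function name is hypothetical.

```python
from itertools import combinations


def always_overlaps(n, w, r):
    """True if every set of w written replicas shares a node with
    every set of r replicas that could answer a read (n replicas total)."""
    replicas = range(n)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, w)
        for read in combinations(replicas, r)
    )


# Write and read each acknowledged by a single replica:
# the read may land on a replica that has not seen the write yet.
one_one = always_overlaps(3, 1, 1)

# Write and read each acknowledged by a majority (2 of 3):
# the two sets always share at least one replica.
quorum_quorum = always_overlaps(3, 2, 2)
```

In general the overlap is guaranteed only when the write and read replica counts add up to more than the replication factor. That is a tuning choice with a latency cost, not a default to rely on.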

Learning process

I’ll read from the database to calculate what I write

(read-write antipattern)

WRONG AGAIN!!!
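A small illustration of what goes wrong with read-then-write, assuming two clients that both read before either writes and no locking between them. The append-only variant in the second half is one possible alternative, added here as an assumption for contrast and not taken from the talk.

```python
# Read-modify-write: both clients read the same value, then each writes back.
store = {"plays": 10}
seen_by_a = store["plays"]
seen_by_b = store["plays"]
store["plays"] = seen_by_a + 1
store["plays"] = seen_by_b + 1   # overwrites A's result
lost_update_total = store["plays"]

# Append-only alternative (assumption): each client records its own event
# without reading first, and the total is derived when it is needed.
events = [10]        # starting count recorded as the first event
events.append(1)     # client A
events.append(1)     # client B
append_total = sum(events)
```

The first variant silently drops one increment. The second keeps both because neither write depends on a value that was read earlier.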

Learning process

Secondary indexes are your friend

(at least Mongo didn’t mind)

WRONG!!!
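A common alternative to a secondary index is to keep a second table keyed by the column you want to query and to write to both tables. The sketch below models that with plain dictionaries, reusing the columns of the songs table from the earlier slide. The `songs_by_artist` table and the helper names are hypothetical.

```python
songs = {}            # song id -> row (the original table)
songs_by_artist = {}  # artist -> {song id -> row} (hypothetical lookup table)


def add_song(song_id, title, album, artist):
    row = {"id": song_id, "title": title, "album": album, "artist": artist}
    # Write the same row to every table that serves a query.
    songs[song_id] = row
    songs_by_artist.setdefault(artist, {})[song_id] = row


def titles_by_artist(artist):
    # The query hits exactly one partition of the lookup table.
    rows = songs_by_artist.get(artist, {})
    return sorted(row["title"] for row in rows.values())


add_song(1, "Song A", "Album X", "Artist 1")
add_song(2, "Song B", "Album X", "Artist 1")
add_song(3, "Song C", "Album Y", "Artist 2")

artist_1_titles = titles_by_artist("Artist 1")
```

The cost is duplicated data and an extra write per song. The benefit is that the query by artist no longer has to ask every node.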

Migrating from RDBMS

Understand how data is queried

Conceptual model is reusable

Run in parallel (leverage MQ)

Start developing with 3 nodes

Leverage parallel execution

You cannot beat the laws of physics

MongoDB to Cassandra

MongoDB uses a “relational” model

MongoDB is more flexible for R&D

Don’t measure performance of single nodes

Don’t use secondary indexes

Don’t use MongoDB

When not to use Cassandra

When you don’t need scalability

When you have a lot of updates

When you need query flexibility

When you don’t know what you need

Action points

Data modeling is query based

Understand physical data layout

Respect eventual consistency

Have fun

Thank you

Matija Gobec @mad_max0204

matija.gobec@smartcat.io

SmartCat www.smartcat.io @SmartCat_io

https://github.com/smartcat-labs