Coding serbia 2015

Transcript of Coding serbia 2015

Page 1: Coding serbia 2015

Breaking free from relational databases

Matija Gobec @mad_max0204

SmartCat @SmartCat_io

Page 2: Coding serbia 2015

Agenda

What’s wrong with RDBMS

What we need today

Introduction to Cassandra

Use-cases and benefits

Learning process and common mistakes

Migrating to Cassandra

Page 3: Coding serbia 2015

What’s wrong with RDBMS

Reliable and we understand it

Been around for ages

Development and administration tools

Worked great for a few decades

Page 4: Coding serbia 2015

What’s wrong with RDBMS

Requires expensive server hardware

Doesn’t scale

Limited fault tolerance

Can’t handle large amounts of data

Architectural limitations

Page 5: Coding serbia 2015

What’s wrong with RDBMS

[Diagram: one master node and two slave nodes]

Page 6: Coding serbia 2015

What’s wrong with RDBMS

● Network split

● Hardware failure

● Latency

Page 7: Coding serbia 2015

What’s wrong with RDBMS

[Diagram: two master nodes and one slave node]

Page 8: Coding serbia 2015

What’s wrong with RDBMS

[Diagram: the same setup of two masters and one slave, now marked with question marks]

Page 9: Coding serbia 2015

What we need today

90% of the world's data was generated over the last two years

Data impacts business

Systems that can learn and adapt

Personalized experience

We store everything

Page 10: Coding serbia 2015

What we need today

Availability

Scalability

Fault tolerance

Performance

Page 11: Coding serbia 2015

ACID vs BASE

ACID: Atomic, Consistent, Isolated, Durable

BASE: Basically Available, Soft state, Eventually consistent

Page 12: Coding serbia 2015

Cassandra - introduction

Row partitioned storage

Fast, scalable and fault tolerant

Shared-nothing, masterless architecture

Active-everywhere design

Native multi-datacenter support

Page 13: Coding serbia 2015

Cassandra - architecture

Page 14: Coding serbia 2015

Cassandra - architecture

Client contact

Page 15: Coding serbia 2015

Cassandra - architecture

Client request

Page 16: Coding serbia 2015

Cassandra - architecture

Client response
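
The architecture slides show a client contacting a node, the request being routed, and the response coming back. A minimal Python sketch of how a masterless ring might decide which nodes own a key follows. It is a simplification under stated assumptions: one token per node, MD5 as the hash, and invented node names. Real Cassandra clusters use Murmur3 by default and many virtual nodes per machine.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 here, for simplicity)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.rf = min(replication_factor, len(nodes))
        # One token per node, derived from the node name (hypothetical scheme).
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key: str):
        """Walk clockwise from the key's token and collect rf distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(self.rf)]


# Any node can act as coordinator: it computes the same replica list.
ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
owners = ring.replicas("playlist-42")
```

Because every node can compute the same replica list from the key alone, no master is needed to route a request.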

Page 17: Coding serbia 2015

Cassandra - architecture

[Diagram: one cluster spanning two datacenters, DC1 and DC2]

Page 19: Coding serbia 2015

Cassandra - data layout

[Diagram: a partition, identified by its partition key, holding cells that are key:value (K:V) pairs]

[Diagram: the same partition layout with the clustering key labelled]
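
The data layout can be pictured as a map from partition key to rows that stay sorted by clustering key. The Python sketch below is only a toy model of that idea. The class and method names are hypothetical, and real storage involves memtables and SSTables rather than in-memory lists.

```python
import bisect
from collections import defaultdict


class Table:
    """Toy table: partition key -> rows kept sorted by clustering key."""

    def __init__(self):
        # partition key -> list of (clustering key, cells) tuples
        self.partitions = defaultdict(list)

    def upsert(self, partition_key, clustering_key, cells):
        rows = self.partitions[partition_key]
        keys = [ck for ck, _ in rows]
        i = bisect.bisect_left(keys, clustering_key)
        if i < len(rows) and rows[i][0] == clustering_key:
            rows[i][1].update(cells)  # same primary key: cells are overwritten
        else:
            rows.insert(i, (clustering_key, dict(cells)))

    def read_partition(self, partition_key):
        """Return all rows of one partition, in clustering order."""
        return list(self.partitions.get(partition_key, []))


t = Table()
t.upsert("playlist-1", 2, {"title": "B"})
t.upsert("playlist-1", 1, {"title": "A"})
t.upsert("playlist-1", 2, {"artist": "X"})  # merges into the existing row
rows = t.read_partition("playlist-1")
```

Reading one partition returns its rows already ordered, which is why the choice of partition key and clustering key shapes which queries are cheap.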

Page 20: Coding serbia 2015

Cassandra - use cases

IoT applications

Product catalogs and retail apps

Activity tracking

Fraud detection

Messaging

Analytics and recommendation engines

...

Page 21: Coding serbia 2015

Cassandra - benefits

Reliable storage

High performance

Easy scaling on commodity hardware

Solves problems by design

Page 22: Coding serbia 2015

But at what cost?

An arm and a leg?

Page 23: Coding serbia 2015

Learning process

CQL looks like SQL

CREATE TABLE songs (
    id uuid PRIMARY KEY,
    title text,
    album text,
    artist text,
    data blob
);

CREATE TABLE playlists (
    id uuid,
    song_order int,
    song_id uuid,
    title text,
    album text,
    artist text,
    PRIMARY KEY (id, song_order)
);

Page 24: Coding serbia 2015

Learning process

CQL looks like SQL

BUT IT’S NOT!!!

Page 25: Coding serbia 2015

Learning process

I’ll create a data model based on my relational data model

(especially while migrating)

Page 26: Coding serbia 2015

Learning process

WRONG!!!
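
A common alternative to copying the relational schema is to design one table per query and accept duplicated data. The Python sketch below illustrates that idea with plain dictionaries. The table names `songs_by_artist` and `songs_by_album` and the sample songs are hypothetical and do not come from the talk.

```python
from collections import defaultdict

# One "table" per query, each keyed by the column that query filters on.
songs_by_artist = defaultdict(list)  # query: all songs by an artist
songs_by_album = defaultdict(list)   # query: all songs on an album


def add_song(title, album, artist):
    """Write the same song into every table that serves a query (denormalization)."""
    song = {"title": title, "album": album, "artist": artist}
    songs_by_artist[artist].append(song)
    songs_by_album[album].append(song)


add_song("Song One", "Album X", "Artist A")
add_song("Song Two", "Album Y", "Artist A")

# Each read is a single lookup by key, with no join.
by_artist = [s["title"] for s in songs_by_artist["Artist A"]]
by_album = [s["title"] for s in songs_by_album["Album X"]]
```

The cost moves to the write path: every new query pattern means another table to keep up to date.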

Page 27: Coding serbia 2015

Learning process

I can read from the database what I have just written

(write-read antipattern)

Page 28: Coding serbia 2015

Learning process

WRONG!!!
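
Why a read can miss a fresh write is easier to see with replica counts. The Python sketch below is a brute-force check under assumed parameters: a write is acknowledged by `w` of `rf` replicas, and a read asks `r` of them. It enumerates every combination to see whether the read can land only on replicas that have not yet received the write. It ignores timing, repair, and hinted handoff.

```python
from itertools import combinations


def read_can_miss_write(rf: int, w: int, r: int) -> bool:
    """True if some r replicas contain none of the w replicas that took the write."""
    replicas = range(rf)
    for written in combinations(replicas, w):
        for read in combinations(replicas, r):
            if not set(written) & set(read):
                return True
    return False


def quorum(rf: int) -> int:
    """Majority of replicas, as used by the QUORUM consistency level."""
    return rf // 2 + 1


# Write and read at ONE with three replicas: the read may hit a stale replica.
one_one = read_can_miss_write(rf=3, w=1, r=1)
# Write and read at QUORUM: the two sets always overlap because w + r > rf.
quorum_quorum = read_can_miss_write(rf=3, w=quorum(3), r=quorum(3))
```

Reading your own writes therefore has a price: either higher consistency levels on both sides, or an application that does not depend on the read at all.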

Page 29: Coding serbia 2015

Learning process

I’ll read from the database to calculate what I write

(read-write antipattern)

Page 30: Coding serbia 2015

Learning process

WRONG AGAIN!!!
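
The problem with read-then-write is that nothing stops two clients from reading the same value before either writes. The Python sketch below replays that interleaving against a toy last-write-wins store, then shows a write-only alternative that records events and derives the total on read. The `Store` class and the `plays` key are hypothetical. Cassandra also offers counter columns and lightweight transactions for cases like this, which the sketch does not model.

```python
class Store:
    """Toy last-write-wins store: no locks, no transactions."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key, 0)

    def write(self, key, value):
        self.data[key] = value


# Read-then-write: two clients interleave and one increment is lost.
store = Store()
a = store.read("plays")      # client A sees 0
b = store.read("plays")      # client B also sees 0
store.write("plays", a + 1)  # A writes 1
store.write("plays", b + 1)  # B overwrites with 1
lost_update_total = store.read("plays")

# Write-only alternative: each client appends its own event.
events = []
events.append(("plays", 1))  # client A
events.append(("plays", 1))  # client B
event_total = sum(n for key, n in events if key == "plays")
```

Here `lost_update_total` ends up as 1 while `event_total` is 2, even though both runs describe the same two plays.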

Page 31: Coding serbia 2015

Learning process

Secondary indexes are your friend

(at least Mongo didn’t mind)

Page 32: Coding serbia 2015

Learning process

WRONG!!!
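
Secondary indexes in Cassandra are local to each node, so a query that uses only the indexed column cannot be routed to a single owner. The Python sketch below counts how many nodes each kind of read has to contact. The placement function, node count, and song data are hypothetical, and replication is left out to keep the comparison visible.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}         # partition key -> row
        self.local_index = {}  # artist -> partition keys stored on THIS node only

    def put(self, key, row):
        self.rows[key] = row
        self.local_index.setdefault(row["artist"], set()).add(key)


nodes = [Node(f"node-{i}") for i in range(4)]


def owner(key):
    # Hypothetical placement: character-code sum modulo the node count.
    return nodes[sum(map(ord, key)) % len(nodes)]


def get_by_key(key):
    """Partition key lookup: exactly one node is contacted."""
    return owner(key).rows.get(key), 1


def get_by_artist(artist):
    """Index lookup: every node must check its own local index."""
    found, contacted = [], 0
    for node in nodes:
        contacted += 1
        for key in node.local_index.get(artist, ()):
            found.append(node.rows[key])
    return found, contacted


for i, title in enumerate(["One", "Two", "Three"], start=1):
    key = f"song-{i}"
    owner(key).put(key, {"title": title, "artist": "Artist A"})

row, key_nodes = get_by_key("song-2")
songs, index_nodes = get_by_artist("Artist A")
```

The key lookup touches one node regardless of cluster size, while the index query grows with the number of nodes.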

Page 33: Coding serbia 2015

Migrating from RDBMS

Understand how data is queried

Conceptual model is reusable

Run in parallel (leverage MQ)

Start developing with 3 nodes

Leverage parallel execution

You cannot beat the laws of physics
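
One reading of "Run in parallel (leverage MQ)" is that the existing database keeps serving traffic while every write is also published to a queue and replayed into the new store. The Python sketch below illustrates that interpretation only. `DictStore`, `handle_write`, and `drain` are hypothetical names, and an in-process `queue.Queue` stands in for a real message broker.

```python
import queue


class DictStore:
    """Stand-in for a database: just a dictionary of rows."""

    def __init__(self):
        self.rows = {}

    def save(self, key, value):
        self.rows[key] = value


legacy = DictStore()          # existing relational system
cassandra_like = DictStore()  # new store being migrated to
writes = queue.Queue()        # stand-in for the message queue


def handle_write(key, value):
    legacy.save(key, value)   # the old system stays the source of truth
    writes.put((key, value))  # the same write is published for the new store


def drain():
    """Replay queued writes into the new store; return how many were applied."""
    applied = 0
    while True:
        try:
            key, value = writes.get_nowait()
        except queue.Empty:
            return applied
        cassandra_like.save(key, value)
        applied += 1


handle_write("user-1", {"name": "Ana"})
handle_write("user-2", {"name": "Marko"})
applied = drain()
in_sync = legacy.rows == cassandra_like.rows
```

Running both stores side by side lets the results be compared before reads are switched over.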

Page 34: Coding serbia 2015

MongoDB to Cassandra

MongoDB uses a “relational” model

MongoDB is more flexible for R&D

Don’t measure performance of single nodes

Don’t use secondary indexes

Don’t use MongoDB

Page 35: Coding serbia 2015

When not to use Cassandra

When you don’t need scalability

When you have a lot of updates

When you need query flexibility

When you don’t know what you need

Page 36: Coding serbia 2015

Action points

Data modeling is query based

Understand physical data layout

Respect eventual consistency

Have fun

Page 38: Coding serbia 2015

Thank you

Matija Gobec @mad_max0204

SmartCat www.smartcat.io @SmartCat_io

https://github.com/smartcat-labs