Lecture 5 Replication & Consistency

Replication
- Consistency in Replication
Consistency

Replication

Making copies of services on multiple machines

Motivations:
- Reliability (redundancy)
- Performance
  - Increasing processing capacity, reducing communication
- Scalability
  - Prevent overloading (size), avoid communication latency (geography)
Data Replication
- Mirror data across multiple file servers
- Caching part of data on multiple file servers
  - Has concept of ‘stale’ data
  - Is a dynamic subset of the main data store
Control Replication
- Replicate the actual service (e.g. web server). Replicated services access centralised database store
- Issues: all distributed system challenges!

Replication Challenges

Updates to the data
- Consistency (how to deal with updates)
- Update propagation (what do updates look like?)
  - Do you send the entire data again, run an operation on existing data etc.?
Replica Placement
- How many replicas?
- Location of replicas (depends on who is accessing the data)
Redirection & Routing
- Which replica should clients use?

Consistency in Replication

Do replicas have the same data? What differences are allowed?
Time:
- How old is the data? (staleness)
- How old is the data allowed to be? (time, versions)
Operation Order (correct order, allowed order?)

Ordering

Conflicting operations
- Read/write conflict
- Write/write conflict
- Ordering affects consistency
Partial / Total Ordering
- Partial: Ordering of operations of a single client
- Total: interleaving of all conflicting operations

Consistency Model defines which interleaving operations are valid

Data Coherence: ordering of operations on a single data item
Data Consistency: ordering of operations for a whole data store
- Implies data coherence

Consistency

Data-centric Consistency Models

A contract between a distributed data store and clients, in which the data store specifies precisely what the results of read/write operations are

Strong Ordering:
- All writes must be performed in the order they are invoked
- Strict, Sequential, Causal, FIFO
- Expensive communication requirements
Weak Ordering:
- Ordering of groups of writes rather than individual writes
- Weak, Release, Entry

Strong-ordering Consistency Models

Strict Consistency

Any read on data item returns a value corresponding to the result of the most recent write on x
- Assumes an absolute global time (what does most recent mean?)
- Assumes instant communication
- Normal on a uniprocessor
- Impossible in a distributed system!

Linearisable Consistency

All operations are performed in a single, sequential order
- Operations ordered according to global timestamp (but this doesn’t necessarily mean the time it was actually done)
- Program order of each client is maintained
- Clock synchronisation is difficult

Sequential Consistency

All operations are performed in some sequential order
- Most common model used
- More than one correct sequential order is possible.
- Program order of each client is maintained.
- Not ordered according to some notion of time
- Performance limitation: read time + write time >= minimal packet transfer time
  - At least one communication needed for a read or write

Causal Consistency

Potentially causally- related writes are executed in the same order everywhere
- Causally related operations:
  - read followed by a write (in same client)
  - write followed by a read (on any client)

FIFO (PRAM) Consistency

Only partial orderings of writes are maintained

Weak-ordering Consistency Models

Weak Consistency

Synchronisation variable/operation and define critical section with synchronise operations
- The synchronisation operation can’t be completed until all clients have reached the synchronisation point
Order of synchronisation operations is sequentially consistent
Can’t perform read/write until all previous synchronisation operations have completed
Order of synchronisations are sequentially consistent

Release Consistency

Explicit separation of synchronisation tasks:
- acquire(S): Brings local state up to date
- release(S): Propagate local updates to all replicas
- An acquire/release pair defines the critical region
Order of synchronisation operations are FIFO consistent
Release can’t be performed until all previous read/writes are completed
Read/write operations can’t be performed until all previous acquires have completed

Lazy Release & Entry Consistency

Lazy Release: Don’t send updates on release (prepare update, but not send out)
- Acquire causes client to get newest state
Entry Consistency: each shared data item has it’s own synchronisation variable
- Exclusive and non-exclusive access modes

Eventual Consistency

If no updates take place for a long time, all replicas will gradually become consistent
Must have:
- Few read/write or write/write conflicts
- Clients need to accept and time inconsistency (they need to know they will see stale data)

CAP Theory

Consistency (linearisability)

Availability (timely response to operations)

Partition-tolerance (functions if partitioned - fault tolerance on partition such as network)

Can only choose two

CAP Impossibility Consequences for wide-area systems…

Must choose consistency or availability
Choosing Availability:
- Give up on consistency? Or eventual consistency?
Choosing Consistency
- No availability, delayed operations
You have to have P in a distributed system
- If your system isn’t partition tolerance (e.g. network failure), it can’t be distributed

Client-centric Consistency Models

Provides guarantees about ordering of operations for a single client

Single client accessing different data store replicas (data isn’t shared by clients)
The effect of an operation depends on the client performing it, and the history of the operations the client has performed
Data items have an owner
No write-write conflicts

Monotonic Reads

If a client of seen a value x at some time t, it will never see an older version of x at a later time
- Reads are always newer

Monotonic Writes

A write operation on a data item x is completed before any successive write on x by the same client
E.g.: appending to log file

Read Your Writes

The effect of a write on x will always be seen by a successive read by the same client
E.g. shopping cart

Write follows Reads

A write operation on x will be performed on a copy of x that is up to date with the value most recently read by the same client

Consistency Protocols

Implementation of consistency models

Primary-based Protocols:
- One main store, with backups
- Remote-write and local-write protocols
Replicated-Write Protocols:
- Active replication
- Quorum-based protocols

Remote-Write Protocols

Single Server:
- All writes and reads executed at single server
- No Replication!
Primary-Backup
- All writes executed at single server, all reads local
- Updates block until executed on all backups
- Poor performance
- Still centralised (breaks if primary goes down)

Local-Write Protocol

Migration:
- Data migrated to local server when accessed
- Good Performance (when not sharing data)
Migration-Primary:
- Multiple readers, single writer
- Performance for concurrent reads
- Poor Performance for concurrent writes

Active Replication

Updates (write operation) sent to all replicas
- Usually sends update operations to the replicas, and replicas apply the operation to the data
Need totally-ordered multicast for sequential consistency
If operation needs to return result: you need to work out if all the results returned are consistent

Quorum-Based Protocols

Choose a subset of replicas to send read/write operations to
- Set up groups to achieve desired consistency
Data is versioned (need to make sure you always work on the latest version)
Read (Nr) and Write (Nw) Quorum is the size of the groups - you need to contact all of them when reading/writing.
- Read: contact Nr replicas and read the one with highest version number
- Write: contact Nw replicas, and update the data on the latest version
Constraints:
- Nr + Nw > N: should always overlap so that whenever you do a read, you always get one server with the newest write on it
- Nw > N/2: Write quorums will always overlap, so successive writes will always have at least one replica that overlaps

Note: see slides for examples!

Push vs. Pull

How do you send the updates to replicas?

Pull (or client-based):
- Updates are propagated on request
- Contact all replicas and get the newest state
- Good when you have more writes than reads
  - You only have to figure out versioning when a read is performed
- Polling delay (whenever a read is performed)
Push (or server-based):
- Push updates to all replicas whenever a write is performed
- When low staleness is required
- Good when you have more reads than writes
- Have to keep track of all replicas
- What to propagate?
  - Data (good when reads >> writes)
  - Update operation (lower bandwidth cost)
  - Notification/Invalidation (good when writes >> reads)
Leases:
- Server ‘promises’ to push updates until the lease expires
- Lease length depends on:
  - age (last modified time) - if it was recently updated, it’s likely to be updated soon in the future
  - renewal frequency
  - state-space overhead