Exploring CRDTs — Conflict-Free Replicated Data Types

tech

CRTSs are core behind understanding how to build truly distributed systems that sync data without conflicts - the theory behind collaborative apps like Figma, Notion, and Linear.

Distributed collaborative apps such as Google Docs, Apple Notes, Figma, etc. needs efficient conflict-resolution. Someone types "Hello" at position 4 and another types "Hi" at the same position at the same time. Which one gets picked? We have a few options here:

  • pick the one that came later (former update gets discarded - bad UX)
  • timestamp based ordering
  • server decides order
  • show both users a conflict dialog

But the users could be offline, there could be network lag, and if you're lucky, the might have a thousand or million users. The core problem is, distributed systems have no single source of truth about time or order. When multiple different nodes (users, services, servers) update a shared piece of data independently, you need a way to merge their changes such that there: no loss, always produces same data irrespective of the order, doesn't require external coordination.

We did have solutions before CRDTs, such as:

  • Locking: Only allow one user to make changes at a time (bad UX)
  • Last write wins: Update by the first user is lost
  • Conflict dialogs: Interrupt user with conflict dialogs, leading to manual resolution
  • Central coordination: Requires constant connectivity, single point of failure.

CRDTs can solve this using data structure that change or grow in ways that naturally merge without conflicts. If all operations are cumulative (order doesn't matter) and idempotent (repeating doesn't change the result), then all replicas will eventually converge to the same state. We're not dealing with order anymore.

We're effectively changing how we represent data in such applications. By representing differently, we eradicate some of the misgiving of distributed and/or collaborative nature of such applications. But how exactly do we represent data now? Let's see through an example.

Imagine you were building a collaborative shopping cart. You'd need to store each item that a user adds. You'd store this as: ['milk', 'bread', ...]. Now you both click Add Banana and Add Apple at once. Whose product gets added at position 2? Let's change this representation.

We'd now store this as:

[
 {
  id: 4565648941,
  name: 'milk',
  createdAt: 2-9-25,
  deleted: false
 },
 {
  id: 75448648989,
  name: 'bread',
  createdAt: 4-9-25,
  deleted: false
 }
]

Each item is unique identified by an id. Now, when both users click add banana and apple, both items get added to the list, with unique ids - no locking-mechanism, no central server, no conflict-resolution.

CRDT syncs are also faster. In traditional systems, you add -> send to server -> server decides order -> server informs everyone -> ui updates. In CRDT, you add -> immediately update on screen -> eventually sync with others -> everything merges automatically.

Real-World Application

Here's a simple example of a shopping list to demonstrate how CRDTs work. This is a React + Express application that has a CRDTManager and a simple syncEngine set up in the frontend. The syncManager syncs local changes with the server at /api/sync, which then logs the changes in a database, and sends back the changes of the other user, which are then sent to the CRDTManager for merging.

// App.tsx
import { useEffect, useRef, useState } from 'react'
import { CRDTManager, type Item } from './lib/crdt-manager';

import './App.css'
import { syncWithServer } from './lib/sync';

function App() {
  const [items, setItems] = useState<Item[]>([]);
  const [newItem, setNewItem] = useState<string>("");
  const [syncState, setSyncState] = useState<CRDTManager>();
  const crdtRef = useRef(new CRDTManager());

  useEffect(()=> {
    const activeItems = crdtRef.current.getActiveItems();
    setItems(activeItems);
  }, []);

  // polling for sync with server every 5 seconds
  useEffect(()=> {
    const syncInterval = setInterval(async ()=> {
        const syncedState = await syncWithServer(crdtRef.current, []);
        setSyncState(syncedState);
    }, 5000);

    return ()=> clearInterval(syncInterval);
  }, [syncState]);

  return (
    <>
      <h1>Shopping List</h1>
      <input type="text" placeholder="Add item" value={newItem} onChange={(e)=> setNewItem(e.target.value)} />
      <ul>
        {items.map(item => (
          <li key={item.id}>{item.name}</li>
        ))}
      </ul>

      <button onClick={()=> {
        crdtRef.current.addItem(newItem);
        setNewItem("");
        setItems(crdtRef.current.getActiveItems());
      }}>Add</button>
    </>
  )
}

export default App
// lib/CRDTManager.ts

export type Item = {
    id: string;
    name: string;
    createdAt: number;
    deleted: boolean;
}

export class CRDTManager {
    private items: Item[];

    constructor() {
        this.items = [];
    }

    addItem(name: string) {
        const item: Item = {
            id: `item-${Date.now()}`,
            name,
            createdAt: Date.now(),
            deleted: false,
        }

        this.items.push(item);
    }

    deleteItem(id: string) {
        const item = this.items.find(item => item.id === id);
        if (item) {
            item.deleted = true;
        }

        return item;
    }

    mergeWith(other: CRDTManager) {
        const newItems = other.items.filter(item => !this.items.some(localItem => localItem.id === item.id));
        const deletedItems = this.items.filter(item => !other.items.some(otherItem => otherItem.id === item.id));

        this.items = [...this.items, ...newItems];
        this.items = this.items.filter(item => !deletedItems.some(deletedItem => deletedItem.id === item.id));

        return this;
    }

    getActiveItems() {
        return this.items.filter(item => !item.deleted);
    }
}
// lib/sync.ts
import type { CRDTManager, Item } from "./crdt-manager";

export async function syncWithServer(localState: CRDTManager, pendingItems: Item[]): Promise<CRDTManager> {
    const response = await fetch("http://localhost:3000/api/sync", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
        },
        body: JSON.stringify({
            data: {localState, pendingItems},
        }),
    });

    const serverState = await response.json();

    return localState.mergeWith(serverState);
}