Real-time collaboration feels like magic — multiple people typing in the same document, seeing each other's cursors, with no conflicts or lockouts. Behind this experience is a family of data structures called CRDTs: Conflict-free Replicated Data Types. This post explains how they work and how to implement one for collaborative text editing.
Why not Operational Transformation?
For years, the dominant approach was Operational Transformation, famously used by Google Docs. OT works by transforming operations against each other — if Alice inserts 'a' at position 3 and Bob deletes position 2, the server transforms Alice's operation to insert at position 2. This requires a central server to maintain a total order of operations, and the transformation functions grow quadratically with the number of operation types.
CRDTs take a different approach. Instead of transforming operations, they use data structures that are mathematically guaranteed to converge — any two replicas that have seen the same set of updates will have the same state, regardless of the order they applied those updates. This means they work peer-to-peer with no central server.
The core idea: monotonic joins
A CRDT is defined by three properties. The state space forms a join-semilattice — a partially ordered set where any two elements have a least upper bound. All operations on the state are monotonic with respect to this order (state only moves "up"). And the merge of two states is their join — take the least upper bound of the two.
Text editing with RGA
For text editing, a common CRDT approach is the Replicated Growable Array (RGA). Each character in the document gets a globally unique identifier — typically a combination of a Lamport timestamp and the node's unique ID. Characters are ordered by their identifiers. When two users insert at the same position concurrently, the identifiers break the tie deterministically.
Deletion is handled with tombstones — characters are marked as deleted rather than removed, so their identifiers can still be used for ordering subsequent insertions. This means the data structure grows monotonically, satisfying the CRDT lattice property. In practice, you periodically run garbage collection to prune tombstones below the oldest known version.
Trade-offs and practical considerations
CRDTs have costs. The metadata overhead is significant — for text, each character carries an identifier and neighbor pointers, often 10-20x the size of the raw text. Tombstones grow unboundedly without GC. And the merge semantics, while mathematically clean, can produce surprising results. If two users edit the same sentence concurrently, the merged document interleaves their changes at character-level granularity, which is "correct" but may not be what either user intended.
- CRDTs enable true peer-to-peer collaboration with no central coordination
- The metadata overhead can be 10-20x the document size
- Tombstone GC is essential for long-lived documents
- Character-level merging is mathematically correct but may surprise users
- Modern editors often combine CRDTs with an authoritative server for UX polish
The field is moving fast. Projects like Yjs and Automerge have made CRDTs practical for production applications, handling edge cases that took years to discover. If you're building a collaborative feature, start with one of these libraries rather than implementing RGA from scratch — the spec is 30 pages and the edge cases are subtle.