# Building a Collaborative Markdown Editor with CRDTs A high-level guide to understanding what it would take to build real-time collaborative editing using pure vanilla JavaScript and WebDAV sync. --- ## 1. What Problem CRDTs Solve ### The Concurrent Editing Problem Imagine two users editing the same markdown document: - **User A** types "Hello" at the start of line 1 - **User B** deletes line 1 entirely - Both happen at the same instant, offline from each other When they reconnect, what should the document look like? There's no objectively "correct" answer, but whatever we choose, both users must end up with the **same result**. This is the core challenge. ### Why "Last Write Wins" Fails for Text A naive approach: whoever syncs last overwrites everything. This works for simple values (a settings toggle, a timestamp), but destroys text editing: - User A writes a paragraph - User B fixes a typo elsewhere - User B syncs last, User A's paragraph vanishes For collaborative text, we need **granular merging** at the character or operation level, not whole-document replacement. ### Eventual Consistency CRDTs (Conflict-free Replicated Data Types) guarantee that if all users eventually receive all operations, they will converge to the **same state**, regardless of the order operations arrived. No central coordinator needed. No conflict resolution UI. The math just works. --- ## 2. Core CRDT Concepts for Text ### Operations vs State Traditional apps store **state**: "the document currently says XYZ." CRDTs store **operations**: "User A inserted 'H' at position 0, then 'e' at position 1..." The current state is derived by replaying all operations. This seems wasteful but enables merging. ### Unique IDs for Every Operation Each insert gets a globally unique ID, typically: ``` { userId: "alice", counter: 42 } ``` - `userId`: identifies who made the change (must be unique per device/session) - `counter`: a logical clock that increments with each operation This pair is unique across all users and all time. When User B deletes a character, they reference it by this ID, not by position (positions shift as text changes). ### Why Order Matters: Vector Clocks Operations have **causal dependencies**. If I type "Hello" then delete the "o", the delete depends on the insert. Vector clocks track this: ``` { alice: 5, bob: 3 } ``` This means "I've seen Alice's first 5 operations and Bob's first 3." When merging, we can tell which operations happened "before" others and which were concurrent. ### The Operation Log Every client maintains an append-only log of all operations it knows about: ```json [ { "id": ["alice", 1], "type": "insert", "char": "H", "after": null }, { "id": ["alice", 2], "type": "insert", "char": "e", "after": ["alice", 1] }, { "id": ["bob", 1], "type": "insert", "char": "!", "after": ["alice", 2] }, { "id": ["alice", 3], "type": "delete", "target": ["bob", 1] } ] ``` This log is the source of truth. The visible document is just a view computed from it. --- ## 3. Operation Log Deep Dive ### What Gets Logged For a text CRDT, two operation types suffice: **Insert** ```json { "id": ["alice", 7], "type": "insert", "char": "x", "after": ["alice", 6] } ``` - `id`: unique identifier for this character - `after`: the ID of the character this comes after (null = start of document) **Delete** ```json { "id": ["alice", 8], "type": "delete", "target": ["bob", 3] } ``` - `target`: the ID of the character being deleted Notice: deletes don't remove from the log. They add a new entry marking something as deleted. ### How Logs Grow Every keystroke adds an entry. A 10,000 character document with heavy editing might have 50,000+ log entries (inserts, deletes, re-inserts). This is the fundamental tradeoff: rich history enables merging but consumes memory. ### Compaction and Garbage Collection When can you prune the log? Only when you're **certain** no peer will ever send an operation referencing the pruned data. In practice: - **Never prune** in a simple implementation (safest) - **Prune after checkpoint**: all peers confirm they've seen operation N, prune everything before N - **Tombstone expiry**: delete markers older than X days get removed (risky if a peer was offline that long) For a proof-of-concept, skip compaction entirely. Accept that the log grows forever. ### Log File Format A simple JSON-lines format (one JSON object per line) works well: ``` {"id":["alice",1],"type":"insert","char":"H","after":null,"clock":{"alice":1}} {"id":["alice",2],"type":"insert","char":"i","after":["alice",1],"clock":{"alice":2}} {"id":["bob",1],"type":"insert","char":"!","after":["alice",2],"clock":{"alice":2,"bob":1}} ``` Each line is self-contained. Appending is safe. Parsing is line-by-line. --- ## 4. Merging Changes from Multiple Users ### Replaying in Causal Order When you receive operations from another user, you can't just append them. You must insert them into your log respecting causality: 1. Check the operation's vector clock 2. If you've already seen all its dependencies, apply it 3. If not, buffer it until dependencies arrive In practice, operations usually arrive in order. Buffering is rare but necessary for correctness. ### Concurrent Inserts at the Same Position The tricky case: Alice and Bob both insert after the same character simultaneously. Their operations are concurrent (neither caused the other). Which comes first? CRDTs use a **deterministic tiebreaker**: compare user IDs lexicographically. If `"alice" < "bob"`, Alice's character comes first. Both users apply the same rule, so both get the same result. ### Tombstones: Why Deleted Characters Leave Traces When Bob deletes a character, he can't remove it from the log. Why? Because: 1. Alice might send an insert "after" that character 2. If the character is gone, we don't know where Alice's insert belongs So deleted characters become **tombstones**: still in the log, marked as deleted, invisible in the rendered document, but available for positioning. ```json { "id": ["bob", 3], "char": "x", "deleted": true } ``` --- ## 5. WebDAV Sync Outline ### Polling Model Without WebSockets, sync happens via polling: 1. Client fetches remote log file(s) 2. Client merges remote operations into local log 3. Client uploads local operations the server hasn't seen 4. Repeat every N seconds ### File Structure on Server Option A: **Single shared log file** ``` /docs/my-document.crdt.jsonl ``` - Simple but requires read-modify-write (race conditions) Option B: **Per-user log files** ``` /docs/my-document/ alice.jsonl bob.jsonl ``` - Each user appends only to their own file - Merge happens client-side - Eliminates write conflicts on the server Option B is safer for WebDAV, which doesn't support atomic append. ### Conflict Strategy With per-user logs, there are no server-side conflicts. Each user owns their file. The CRDT merge is deterministic: given the same set of operations, all clients compute the same document. ### Sync Flow (Sequence Diagram) ``` Alice WebDAV Server Bob | | | |-- PUT alice.jsonl (ops 1-5) -->| | | | | | |<-- GET alice.jsonl, bob.jsonl -- | | |--- return both files ----------->| | | | | | [Bob merges] | | | | |<-- PUT bob.jsonl (ops 1-3) ------| | | | |-- GET alice.jsonl, bob.jsonl ->| | |<-- return both files ----------| | | | | [Alice merges] | | ``` ### Detecting New Operations Include a vector clock summary at the end of each file, or use file modification timestamps. On fetch, compare to last-known state to identify new ops. --- ## 6. Vanilla JS Sketch ### Data Structures Needed ```js // The operation log (source of truth) const opLog = []; // Vector clock: how many ops we've seen from each user const vectorClock = {}; // { alice: 5, bob: 3 } // Our user ID (generate once, persist in localStorage) const userId = localStorage.getItem('odUserId') || generateUserId(); // Local counter for generating unique IDs let localCounter = 0; ``` ### Key Functions ```js // Generate a unique ID for a new operation function nextId() { localCounter++; vectorClock[userId] = localCounter; return [userId, localCounter]; } // Create an insert operation function insertOp(char, afterId) { return { id: nextId(), type: 'insert', char: char, after: afterId, clock: { ...vectorClock } }; } // Create a delete operation function deleteOp(targetId) { return { id: nextId(), type: 'delete', target: targetId, clock: { ...vectorClock } }; } // Apply an operation to the log function applyOp(op) { // Check if we've already seen this op const [user, counter] = op.id; if (vectorClock[user] >= counter) return; // duplicate // Check causality (simplified: assume ops arrive in order) opLog.push(op); vectorClock[user] = Math.max(vectorClock[user] || 0, counter); } // Merge a remote log into ours function merge(remoteOps) { for (const op of remoteOps) { applyOp(op); } } // Rebuild document text from opLog function render() { // Build ordered list of non-deleted characters // (This is the complex part: sorting by after-references and tiebreakers) // Returns plain string } // Serialize log to JSON lines function serialize() { return opLog.map(op => JSON.stringify(op)).join('\n'); } ``` ### The Render Function (Simplified) The hardest part is `render()`. A naive approach: 1. Build a map of `id -> { char, afterId, deleted }` 2. Find the "root" (afterId = null) 3. Recursively find all characters whose `after` points to current 4. Sort concurrent siblings by userId 5. Skip deleted characters 6. Concatenate This is O(n^2) in the worst case. Real implementations use tree structures. For a proof-of-concept with small docs, naive works. --- ## 7. Tradeoffs and Limitations ### Memory The operation log grows without bound. A heavily-edited 1000-word document might have 100KB+ of log data. Options: - Accept it (storage is cheap) - Implement checkpointing (complex) - Periodically "reset" and lose history (crude but effective) ### Latency Polling WebDAV every 5 seconds means 5-second sync latency. For real-time feel, you'd want: - WebSockets (sub-second sync) - WebRTC (peer-to-peer, lowest latency) - Server-sent events WebDAV polling is fine for "near-time" collaboration (think shared notes, not Google Docs cursor tracking). ### Complexity This doc covers maybe 20% of what a production CRDT needs: - Undo/redo - Cursor positions and selections - Rich text (bold, links, headings) - Large document performance - Offline queue management - User presence indicators This is why libraries like **Yjs**, **Automerge**, and **Diamond Types** exist. They've solved these problems. For learning, building from scratch is valuable. For production, use a library. --- ## Further Reading - [A Gentle Introduction to CRDTs](https://vlcn.io/blog/gentle-intro-to-crdts) - [CRDT.tech](https://crdt.tech/) (academic papers and resources) - [Yjs documentation](https://docs.yjs.dev/) (practical library) - [Martin Kleppmann's CRDT lectures](https://www.youtube.com/watch?v=x7drE24geUw)