166 lines
4.9 KiB
Markdown
166 lines
4.9 KiB
Markdown
::: header
|
|
# how matrix works
|
|
:::
|
|
|
|
::: section
|
|
[Matrix](https://matrix.org/) is a merkle-dag based operational transform
|
|
crdt, primarily used for instant messaging. This article will try to
|
|
break it down into easier to understand parts.
|
|
|
|
## a crdt
|
|
|
|
Let's say you want to edit a document with your friends without
|
|
a central server. How would you implement it?
|
|
|
|
```rust
|
|
struct Event {
|
|
content: Any,
|
|
}
|
|
```
|
|
|
|
Let's say you send *event*s containing the changes you're making to
|
|
the document. Then, you send this event to your friends so they can apply
|
|
it. But what happens if events arrive out of order, or your friends edit
|
|
the same document you're editing? We need to know which event came first
|
|
using *ordering*!
|
|
|
|
Could we use a timestamp? It would work, but clocks can get out of sync
|
|
and can be set to different times. Luckily, having an exact real-world
|
|
time isn't necessary. All you need to know is which events came first,
|
|
or a *causual order*. What if there was a simple incrementing counter?
|
|
|
|
While we're at it, I might as well add an event_id field to know which
|
|
event is which.
|
|
|
|
```rust
|
|
struct Event {
|
|
event_id: EventId,
|
|
content: Any,
|
|
count: u64,
|
|
}
|
|
```
|
|
|
|
This is a bit better, but is still not great. Counters can be spoofed,
|
|
which will be an issue in public documents, like chatrooms or forums. We
|
|
can take a page out of git and implement a *directed acyclic graph*.
|
|
|
|
A graph is a set of nodes, like `Event`s, that are linked together. A
|
|
directed graph is a graph that where the links have directions, like
|
|
pointers or a parent and child. A directed acyclic graph is an acyclic
|
|
graph that has no loops (aka cycles).
|
|
|
|
```rust
|
|
// note: i'm using rust syntax very loosely to make this more approachable
|
|
struct Event {
|
|
event_id: EventId,
|
|
content: Any,
|
|
references: List<EventId>,
|
|
}
|
|
```
|
|
|
|
[todo: insert image of nodes linked together here]
|
|
|
|
Similarity to git, we can prevent loops by making event ids cryptographic
|
|
hashes. People can still create events pointing to very old parents,
|
|
but there's no way to make those parents point to new events.
|
|
|
|
```rust
|
|
type EventId = Sha256Hash;
|
|
|
|
fn get_event_id(event_without_id: EventWithoutId) -> Sha256Hash {
|
|
return sha256(serialize(event_without_id));
|
|
}
|
|
```
|
|
|
|
We now have a way to order events! Well, it's not a *total order*, but a
|
|
*partial order*. If two events have the same parents, they effectively
|
|
happen at the same time and forked the timeline. If two users set the
|
|
document name to the same thing, how to we resolve conflicts?
|
|
|
|
```rust
|
|
// the first event doesn't have any references
|
|
let createEvent = Event { content: { }, references: [], .. };
|
|
let name1Event = Event { content: { name: "foo" }, references: [createEvent.id], .. };
|
|
let name2Event = Event { content: { name: "bar" }, references: [createEvent.id], .. };
|
|
```
|
|
|
|
For now we can use a simple and blunt method of resolving state:
|
|
whichever event who's id bigger. Technically, we've made a way to
|
|
replicate data between peers without any conflicts. This is called a
|
|
conflict-free replicated data type, or [crdt](https://crdt.tech/). In
|
|
matrix terminology, a merkle dag of events is called a *room*.
|
|
|
|
## surely there's a better way
|
|
|
|
Ok, so maybe deciding who wins based on the event id isn't the best
|
|
idea. To improve this, we should first at least know who's sending
|
|
the events. Let's use public/private key pairs and verify events with
|
|
signatures.
|
|
|
|
```rust
|
|
struct Event {
|
|
event_id: EventId,
|
|
content: Any,
|
|
references: List<EventId>,
|
|
|
|
// new! you can use any pub/privkey system, like rsa or ed25519
|
|
sender: UserId,
|
|
signature: UserSignature,
|
|
}
|
|
```
|
|
|
|
Also, event contents have only been defined as "operations that
|
|
change state". Let's improve that a little by defining a few specific
|
|
operations. We'll have a create operation to define the root of a room,
|
|
an authorize operation to give specific users specific powers, and a
|
|
change room name operation.
|
|
|
|
```rust
|
|
struct Event {
|
|
// --- existing stuff hidden --
|
|
// new! the type of operation this event is
|
|
type: EventType,
|
|
|
|
// new! Lets give each room its own id as a quality of life change
|
|
room_id: RoomId,
|
|
}
|
|
|
|
enum EventType {
|
|
Create,
|
|
Authorize,
|
|
SetName,
|
|
}
|
|
|
|
// while we're here, lets define a basic authorization system
|
|
type AuthorizeContent = Map<UserId, Capability>;
|
|
|
|
enum Capability {
|
|
/// the user can't send any events
|
|
ReadOnly,
|
|
|
|
/// the user can only send SetName events
|
|
SetName,
|
|
|
|
/// the user can authorize more ReadOnly and SetName users
|
|
AddUsers,
|
|
|
|
/// the user can send any Authorize event
|
|
Admin,
|
|
}
|
|
```
|
|
|
|
When a room is created, it authorizes the creator to send SetName and
|
|
Authorize ops operations. Don't worry about how the authorize op is
|
|
sent for now. And with that setup out of the way, it's time to get into
|
|
state resolution.
|
|
|
|
todo: state resolution, full mesh federation
|
|
|
|
## making things go fast
|
|
|
|
todo: auth_events, partial state/fast joins, state events and message events, future matrix changes
|
|
|
|
## things matrix does
|
|
|
|
todo: the final matrix specific things that are useful to know, like redactions or soft failing
|
|
:::
|