public/matrix.md

::: header
# thoughts on matrix
:::

::: section
I've used matrix for a while now. While the core idea is excellent and
there are so many good ideas, there are also so many bad parts and
annoyances with it. Parts of the protocol feel like they were patched
together by someone until they went "eh looks about right", while other
parts are incoherent or give off strong design-by-committee vibes. I'm going to
keep using matrix, but because of these problems I won't suggest it to
any other people.

This critique looks at matrix from an end user's perspective as well as
from a more technical perspective and is mostly a thought dump, so it
may be hard to follow.

## threads

Threads are implemented as fancy replies. This makes it easy for people
using clients without threads to accidentally reply to a threaded message
from outside of a thread, causing things to break. It also causes
per-thread typing notifications and read receipts to be chronically
broken. Once you send a message in a thread it's permanently added to
"my threads" with no way of removing it later. There's also pretty much
no qol features, like thread titles/names or being able to archive old
threads. There's no way to lock threads (though this is fairly niche and
to my knowledge would be extremely difficult to implement). On top of all
that, there's no way to disable threads in a room. Thread only rooms would
be nice and technically only need special ui, but don't currently exist.
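To make the "fancy replies" point concrete, here's a rough sketch of the two event contents involved. The `m.thread` relation shape follows the spec as I understand it; the helper names and the `$...` event ids are made up for illustration.

```python
def threaded_message(thread_root_id: str, latest_event_id: str, body: str) -> dict:
    """Content of an m.room.message sent inside a thread."""
    return {
        "msgtype": "m.text",
        "body": body,
        "m.relates_to": {
            "rel_type": "m.thread",
            "event_id": thread_root_id,
            # Fallback so clients without thread support render it as a reply.
            "is_falling_back": True,
            "m.in_reply_to": {"event_id": latest_event_id},
        },
    }


def plain_reply(target_event_id: str, body: str) -> dict:
    """Content a client without thread support sends when replying."""
    return {
        "msgtype": "m.text",
        "body": body,
        # No rel_type: the reply lands in the main timeline, even if the
        # target event lives inside a thread.
        "m.relates_to": {"m.in_reply_to": {"event_id": target_event_id}},
    }


in_thread = threaded_message("$root", "$latest", "hello from the thread")
stray = plain_reply("$latest", "hello from a client without threads")
```

The only thing separating the two is the `rel_type` and `event_id` pair, which is why a reply from an older client ends up outside the thread.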

Making threads their own rooms (threads as rooms) would solve some of
these problems, but would cause problems elsewhere. Searching messages
across the main room and threads would be harder to implement. Access
control would need to be synced, and "rogue threads" which don't set
power levels for the parent room's moderators could be created. There's
no way to add other people to threads (ie. in the case of "reply in
thread"). Letting people peek in threads without joining would rely on
features that don't exist currently, like federated peeking as well as
a version of history_visibility for peeking. This could also lead to
amplification: currently, creating a thread is as expensive as sending
a message, whereas each thread as room would need its own set of state
(m.room.create, m.room.member, m.room.join_rules, etc).

(Aside: it looks like having a "main thread/chat" and "threads" is
fundamentally incompatible. Most people would like messages to be
properly threaded, but there's an easier to use main thread and ui
encouraging its use. There's a fundamental tension and neither option
is quite correct.)

## push rules

Push rules are how a matrix client and server decide whether a message
sends a push notification, makes a sound, and so forth.

All push rules go in one giant blob in global account data. I think
some of them should have been split out into per-room account
data. Push rules have seemingly arbitrary groupings: muting
a room goes in the `override` group, but receiving notifications for
all messages goes in the `room` group.
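Here's a minimal sketch of what those two settings look like as push rule requests, assuming the `PUT /pushrules/global/{kind}/{ruleId}` endpoint from the client-server api. The function names are mine, and the rule bodies are what I believe clients send rather than anything authoritative.

```python
from urllib.parse import quote

PUSHRULES = "/_matrix/client/v3/pushrules/global"


def mute_room(room_id: str) -> tuple[str, dict]:
    """Muting a room: an override rule matching on room_id."""
    path = f"{PUSHRULES}/override/{quote(room_id, safe='')}"
    body = {
        "conditions": [
            {"kind": "event_match", "key": "room_id", "pattern": room_id}
        ],
        # Empty actions means "do nothing"; older clients send "dont_notify".
        "actions": [],
    }
    return path, body


def notify_all_messages(room_id: str) -> tuple[str, dict]:
    """Notifying for every message: a room rule keyed by the room id."""
    path = f"{PUSHRULES}/room/{quote(room_id, safe='')}"
    body = {"actions": ["notify"]}
    return path, body


mute_path, mute_body = mute_room("!abc:example.org")
loud_path, loud_body = notify_all_messages("!abc:example.org")
```

Two settings for the same room end up as two rules of different kinds with different shapes, so a client has to check both places to figure out a room's notification state.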

When changing a lot of room-specific push rules, the rules can overwrite
each other in flight, causing some room-specific push rules to reset/be
dropped. This is technically an issue with all push rules, but most of the
time it isn't noticeable to end users. Push rules get around this by having
an entire separate set of apis for changing fine-grained account data.

The rules are written in a json <abbr title="domain specific
language">dsl</abbr>-like thing that is needlessly flexible. This causes
performance issues. Simultaneously, they're somehow extremely rigid. It
is only possible to use one of the 6 predefined conditions and the only
counters that exist are `notify_count` and `highlight_count`. Notably,
there's no `unread_count`. My workaround for now is to patch clients
to not notify unless the highlight tweak is set, and use `notify_count`
as an unread count.

Mentions are currently implemented as searching the message body for a
display name/username, which can cause accidental notifications. Saying
people's names without notifying them needs workarounds like
1337speek. This is slated to be fixed with intentional mentions,
where mentions are extracted to their own json field. Unfortunately,
it looks like encrypted messages will put the mentions in encrypted
content, breaking rules (todo: heard this from somewhere, but need a
source). Ironically, the one time metadata is made less leaky, it also
happens to be the one time where it makes sense and is needed for good ux.
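A rough sketch of the difference between the two approaches. The word-boundary regex is only an approximation of how body matching behaves, and the function names are hypothetical; `m.mentions` with `user_ids` is the field intentional mentions use as far as I know.

```python
import re


def legacy_mention(body: str, display_name: str, localpart: str) -> bool:
    """Old behaviour: ping if the body contains the name as a word."""
    for needle in (display_name, localpart):
        if re.search(rf"\b{re.escape(needle)}\b", body, re.IGNORECASE):
            return True
    return False


def intentional_mention(content: dict, user_id: str) -> bool:
    """New behaviour: ping only if the sender listed the user explicitly."""
    return user_id in content.get("m.mentions", {}).get("user_ids", [])


gossip = {"body": "I talked to alice yesterday", "m.mentions": {}}
ping = {
    "body": "alice: can you take a look?",
    "m.mentions": {"user_ids": ["@alice:example.org"]},
}

# Body matching pings alice for both messages...
assert legacy_mention(gossip["body"], "Alice", "alice")
assert legacy_mention(ping["body"], "Alice", "alice")
# ...while intentional mentions only ping for the second one.
assert not intentional_mention(gossip, "@alice:example.org")
assert intentional_mention(ping, "@alice:example.org")
```

Note that the second check needs to read `content`, which is exactly the part a server can't see in an encrypted room.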

The `/notifications` endpoint paginates through a list of pings/mentions,
though due to the above paragraph won't show any mentions in encrypted
rooms. There's no way to filter mentions per room/space or if it's @room.
Although "list of mentions" makes sense naively, it would be much better
to have an inbox where messages can be added and removed.

No client supports marking rooms, threads, or events as unread.

There's no way to temporarily mute rooms (or all notifications). Since a
room can be in multiple spaces, there's no easy way to mute all rooms in
a space. There's no way to only mute @room notifications while allowing
user mentions.

## e2ee

Matrix's implementation of end to end encryption leaks a lot of metadata,
including but not limited to room name/topic/avatar, member per-room
displaynames/avatars, and reactions.

Newly joined members can't view old e2ee messages. This is fixable,
but not yet fixed.

It's also remarkably easy to get undecryptable messages by accident,
and is the **single biggest reason** why I'm not recommending matrix to
anyone else for now. It needs to be more robust and have a better ux.

## apis

Syncing is slow and sends tons of extra data, though sliding sync is
aiming to fix it. No comment will be made on sliding sync, as its apis
are currently in flux. EDIT: Sliding sync is shaping up quite nicely,
but I think there are some mistakes, like `room_name_like` and sorting.

Similarly, the authentication mess is slated to be replaced with OIDC,
so no comment there until it's fleshed out more properly. The current
system isn't terrible, but isn't great either. Getting a seamless ux,
both for users and server admins, looks like it will be pretty difficult
to achieve.

A lot of apis are unergonomic or difficult to use. For example, every
event/message needs to be sent with a random transaction id, which is
annoying when trying to use webhooks. The api doesn't use one event model:
there are events, events without room_id, stripped state events, and
stripped state events with `origin_server_ts`. The application service
api doesn't support e2ee without workarounds *yet*.
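As a sketch of the transaction id annoyance: sending an event is a `PUT` with a client-generated id in the path, so anything webhook-like has to mint one per message. The helper name is made up, and using a uuid for the id is just one reasonable choice.

```python
import uuid
from typing import Optional
from urllib.parse import quote


def send_event_request(
    room_id: str, event_type: str, content: dict, txn_id: Optional[str] = None
) -> tuple[str, str, dict]:
    """Build (method, path, body) for sending one event to a room."""
    # The txn id must be unique per request for this access token; the
    # server uses it to deduplicate retries.
    txn_id = txn_id or uuid.uuid4().hex
    path = (
        "/_matrix/client/v3/rooms/"
        f"{quote(room_id, safe='')}/send/{quote(event_type, safe='')}/{txn_id}"
    )
    return "PUT", path, content


method, path, body = send_event_request(
    "!abc:example.org", "m.room.message", {"msgtype": "m.text", "body": "hi"}
)
```

A plain `POST` with a static url, which is what most webhook senders expect, doesn't fit this shape without a small shim in between.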

The media repository (media repo) lets users store blobs, but it feels
tacked on instead of part of the protocol. Each piece of media is stored
on a single canonical and trusted server (aka point of failure), unlike
rooms. There is no way to delete uploaded blobs from the media repo.
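The single-server problem is visible in the media uri itself. A minimal sketch of pulling apart an `mxc://` uri (the function name is mine):

```python
from urllib.parse import urlparse


def parse_mxc(uri: str) -> tuple[str, str]:
    """Split an mxc:// uri into (origin server, media id)."""
    parsed = urlparse(uri)
    media_id = parsed.path.lstrip("/")
    if parsed.scheme != "mxc" or not parsed.netloc or not media_id:
        raise ValueError(f"not a valid mxc uri: {uri!r}")
    return parsed.netloc, media_id


server, media_id = parse_mxc("mxc://example.org/abc123")
# server is "example.org": the one homeserver every other server has to
# ask for this blob, no matter how many servers are in the room.
```

Room events are replicated across every participating server, but the blob they point at is pinned to whatever server name is baked into the uri.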

The apis are trying to be generic while geared towards instant messaging:
`/search` only indexes blessed keys and event types, which are
different per server. This could be fixed by having a blessed `body`
property that is always indexed, regardless of event type. Events are
able to form graphs but can't be recursively queried. Rooms didn't
have types until relatively recently, and most clients will display
all rooms regardless of type anyway. There also isn't any subtyping
(mutable types, ie. to convert a room between a normal room and dm).

I know instant messaging is the main purpose of matrix,
but for medium-large communities I *really* wish there
was some form of forum system. Something similar to
tildes.net would be ideal, especially the [hierarchical
tagging](https://docs.tildes.net/instructions/hierarchical-tags)
and [comment
labelling](https://docs.tildes.net/instructions/commenting-on-tildes#labelling-comments),
though the extra sorting/filtering is probably too much to ask.

## bots

Bots are hard to get right, as there's no way to use structured commands
(ie. discord and telegram slash commands). Bots need to parse the
message's body. This has problems, since reply fallbacks exist. Parsing
mentions and other things from the body is also difficult, requiring
parsing html.

There's no way to specify interactions, like buttons, menus, canned
responses, preset reactions, etc. Reaction-based uis are the best to
currently exist, but it takes n + 1 events where n is the number of
"buttons".
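To show where the n + 1 comes from, here's a sketch of the events a bot would have to send for a reaction-based menu. The `m.reaction` / `m.annotation` shapes follow the spec as far as I know; the function name and the `$prompt` id are placeholders (a real bot only learns the prompt's event id after sending it).

```python
def reaction_menu(prompt: str, buttons: list[str]) -> list[dict]:
    """Events needed for a prompt with one reaction per 'button'."""
    prompt_id = "$prompt"  # placeholder for the id the server would return
    events = [
        {
            "type": "m.room.message",
            "content": {"msgtype": "m.text", "body": prompt},
        }
    ]
    for key in buttons:
        events.append(
            {
                "type": "m.reaction",
                "content": {
                    "m.relates_to": {
                        "rel_type": "m.annotation",
                        "event_id": prompt_id,
                        "key": key,
                    }
                },
            }
        )
    return events


events = reaction_menu("Deploy to production?", ["yes", "no", "later"])
# Three buttons, four events.
```

A hypothetical preset-reactions key on the prompt itself would collapse this into a single event.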

Matrix seems to be heading towards a widget-based ui system, where
widgets are embedded webpages/webapps. Clients that want to support
widgets would need to embed an entire web browser. Even though it could
be more flexible, it would be much more annoying for bot developers to
learn an entirely separate widget api as well as use html/css/js
instead of matrix events and json. Since widgets seem to be able to do
interactions on behalf of the user(?), clients need to take extra care
in implementing sandboxing and access control. Widget ui also
wouldn't fit in as well with the rest of the client. Widgets might
have issues with privacy and tracking because they can use arbitrary
html/js/css.

Bots can use the full set of html, including tables and
details/summary. However, different clients may only support certain
subsets of html.

## potpourri

Matrix can be generalized as logs of arbitrary json events combined
with a crdt of map<(type, state), event>. Although this could lead
to *so many* use cases, the official client-server api isn't flexible
enough as mentioned above. However, it's entirely possible to use the
core matrix protocol with a custom client-server api + custom client,
activitypub style.
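A toy version of that generalization, assuming a single linear log where the latest state event simply wins. Real rooms are a dag and need state resolution and auth checks, none of which is modeled here; the `Room` class is purely illustrative.

```python
class Room:
    """A log of json events plus a map of (type, state_key) -> event."""

    def __init__(self) -> None:
        self.timeline: list[dict] = []
        self.state: dict[tuple[str, str], dict] = {}

    def append(self, event: dict) -> None:
        self.timeline.append(event)
        # Only events carrying a state_key update the state map.
        if "state_key" in event:
            self.state[(event["type"], event["state_key"])] = event


room = Room()
room.append({"type": "m.room.name", "state_key": "", "content": {"name": "a"}})
room.append({"type": "m.room.message", "content": {"body": "hi"}})
room.append({"type": "m.room.name", "state_key": "", "content": {"name": "b"}})

# Everything is kept in the log, but only the latest name is current state.
assert len(room.timeline) == 3
assert room.state[("m.room.name", "")]["content"]["name"] == "b"
```

Nothing in this model cares what the event types mean, which is why it could back far more than chat.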

Messages can either be text or a single file, but not both (like a file
with a comment) or multiple attachments.

Clients are missing features or are buggy. A lot of spec is dictated by
synapse and element, even when spec.matrix.org says otherwise. Old and
buggy room versions exist in the wild, with no incentive to upgrade.
Federation is excellent in theory, but in practice servers tend to drop
events.

Power levels are pretty rudimentary. There's no way to allow/deny specific
rel_types, only event types. There is no role based access control.
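A simplified sketch of the current check, to show that the only input is the event type. It uses the `m.room.power_levels` keys from the spec with their usual defaults and skips the rest of the auth rules; the function names are mine.

```python
def required_level(power_levels: dict, event_type: str, is_state: bool) -> int:
    """Level needed to send an event of this type."""
    events = power_levels.get("events", {})
    if event_type in events:
        return events[event_type]
    if is_state:
        return power_levels.get("state_default", 50)
    return power_levels.get("events_default", 0)


def can_send(
    power_levels: dict, user_id: str, event_type: str, is_state: bool = False
) -> bool:
    """Whether the user's level meets the requirement for this event type."""
    user_level = power_levels.get("users", {}).get(
        user_id, power_levels.get("users_default", 0)
    )
    return user_level >= required_level(power_levels, event_type, is_state)


levels = {
    "users": {"@mod:example.org": 50},
    "users_default": 0,
    "events": {"m.room.name": 50},
    "events_default": 0,
    "state_default": 50,
}

assert can_send(levels, "@user:example.org", "m.room.message")
assert not can_send(levels, "@user:example.org", "m.room.name", is_state=True)
assert can_send(levels, "@mod:example.org", "m.room.name", is_state=True)
```

Since the content never gets looked at, an `m.room.message` that starts a thread and one that doesn't are indistinguishable to this check.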

There's no way to peek into dms. (Or federated rooms, for that matter.)

There are many msc (matrix spec change) proposals that seem to have
gotten stuck, like custom emoji/stickers.

Room-specific user displaynames/avatars can be overwritten when the
global displayname/avatar is changed.

Anything added to spec can't be removed later, which is to be expected
for a protocol. But when mscs are opened, events are sent with unstable
identifiers. Clients later will need to support both the official type
and unstable identifier to support past events.

Moderation is lacking, and it's pretty much required to use a bot
([mjolnir](https://github.com/matrix-org/mjolnir)) (not that bad,
but a bit annoying for smaller rooms). When cleaning up spam, a
redaction event must be sent for each message/event - bulk redactions
([msc2244](https://github.com/matrix-org/matrix-spec-proposals/pull/2244))
have been merged but aren't in spec and don't seem to be implemented
by any server.
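What that cleanup looks like in practice, as a sketch: one `PUT .../redact/{eventId}/{txnId}` per event. The helper name and the event ids are made up.

```python
import uuid
from urllib.parse import quote


def redaction_requests(
    room_id: str, event_ids: list[str], reason: str
) -> list[tuple[str, str, dict]]:
    """Build one (method, path, body) redaction request per event."""
    room = quote(room_id, safe="")
    requests = []
    for event_id in event_ids:
        # Every redaction needs its own transaction id as well.
        txn_id = uuid.uuid4().hex
        path = (
            f"/_matrix/client/v3/rooms/{room}/redact/"
            f"{quote(event_id, safe='')}/{txn_id}"
        )
        requests.append(("PUT", path, {"reason": reason}))
    return requests


spam = [f"$spam{i}" for i in range(200)]
requests = redaction_requests("!abc:example.org", spam, "spam")
# 200 spam messages means 200 requests and 200 new events in the room.
```

Each of those requests also creates a redaction event that gets federated, so cleaning up a spam wave roughly doubles its event count.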

`GET /rooms/{roomId}/state/{eventType}/{stateKey}` returns only the
content of a state event, not the full event itself.

## conclusion

7.8/10 too much water

## wishlist

In no particular order. These requests shouldn't be too difficult
to implement, but would improve usability by *so* much. These aren't
nearly fleshed out enough to be full mscs.

- Clean up threads and improve usability
  - ~~`DELETE /rooms/{roomId}/threads/{eventId}` to stop participating in
    a thread~~
  - `PUT /rooms/{roomId}/threads/{eventId}/participation` to change
    whether you're participating in or aren't participating in a thread.
  - Thread-only rooms.
- Redo power levels
  - Base them on `(from_event_type, rel_type, to_event_type)` tuples rather
    than `(event_type)` alone.
  - Implement rbac (roles). Ideally, it would use explicit deny as the
    permission model. May be blocked on "give every member a role" use cases.
- Add a way to make spaces sync subsets of state with the child
  rooms. msc3216 and msc2962 exist for power levels, which may be enough
  for most use cases
- Add a way to mute rooms temporarily. Maybe add a push rule condition
  for this, or better yet redo push rules entirely. (I know the whole
  "don't rewrite what works", but seriously...)
- Make aliases and visibility less confusing. There's room visibility,
  history visibility, local/published/main alias(es), and the
  room directory. In ui, room directory could be merged into room
  visibility. I'm not sure how to simplify aliases without reducing
  functionality.
- Improve /notifications to be more like a proper inbox
  - Use intentional mentions (even in encrypted rooms) to populate the list
  - `DELETE /notifications` + `DELETE /notifications/{eventId}` along
    with automatically removing old notifications to enhance usability
  - `PUT /notifications/{eventId}` to bookmark an event for later (though,
    this may be better served with "starred/bookmarked messages")
  - Some way to reorder notifications? Maybe the `PUT` endpoint could
    take an "order" key.
- Add a `m.preset_reactions` or `m.interactions` key for bots.
- Bot supplied edits or annotations?
- `unread_count` to view the number of unread messages in a room, see msc2654
- This might only be something I want, but another presence/status type
  different from online for when you're explicitly available to talk to
  would be nice. I really want to be able to have a clear split between
  "being on the internet" and "willing to have my attention taken"

<!--
## the far future?

Some random ideas on how matrix could look in the far future...

Memberships could be made ephemeral. A user's membership could be
`allow` (invite, join) or `deny` (kick, leave, ban). In private rooms,
the default membership would be `deny` and in public rooms it would
be `allow`. When a member's membership is explicitly set to `allow`,
they are invited to the room. A member can send a special ephemeral
(or timeline) event to tell other servers they have acknowledged the
membership change and are now joined. Of course in practice, more
memberships/states would need to be added, like `null`/`default` or
`knock`. The main problem would be making restricted rooms work.

No matter how many apis you try to add, some use case will need something
special. I wonder how feasible it would be to use wasm instead of
room versions for permissions, custom sorting/filtering/indexing,
and deriving custom unsigned/state. It could even do state resolution,
although this is extremely difficult, may have security issues, and is
probably impractical. The main problem is wasm would need to be embedded
in the event or be placed in the media repo.
-->
:::