public/matrix.md
2024-05-22 04:03:59 -07:00

::: header

thoughts on matrix

:::

::: section I've used matrix for a while now. The core idea is excellent and there are plenty of good ideas in it, but there are also a lot of bad parts and annoyances. Parts of the protocol feel like they were patched together by someone until they went "eh looks about right", while other parts are incoherent or give off strong design-by-committee vibes. I'm going to keep using matrix, but because of these problems I won't suggest it to other people.

This critique looks at matrix from an end user's perspective as well as from a more technical perspective. It's mostly a thought dump, so it may be hard to follow.

threads

Threads are implemented as fancy replies. This makes it easy for people using clients without threads to accidentally reply to a threaded message from outside of a thread, causing things to break. It also causes per-thread typing notifications and read receipts to be chronically broken. Once you send a message in a thread it's permanently added to "my threads" with no way of removing it later. There are also pretty much no qol features, like thread titles/names or being able to archive old threads. There's no way to lock threads (though this is fairly niche and to my knowledge would be extremely difficult to implement). On top of all that, there's no way to disable threads in a room. Thread-only rooms would be nice and technically only need special ui, but don't currently exist.
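To make the "fancy replies" point concrete, here is a minimal sketch of the event content involved. The field names (`m.relates_to`, `rel_type`, `m.thread`, `is_falling_back`, `m.in_reply_to`) follow the threading section of the spec as I understand it; the helper names and the `$root`/`$latest` event ids are made up for illustration.

```python
def threaded_message(body, thread_root_id, latest_event_id):
    """Content of an m.room.message sent inside a thread.

    The thread relation points at the thread root; the reply fallback
    points at the latest event in the thread so older clients render
    it as a plain reply.
    """
    return {
        "msgtype": "m.text",
        "body": body,
        "m.relates_to": {
            "rel_type": "m.thread",
            "event_id": thread_root_id,
            "is_falling_back": True,
            "m.in_reply_to": {"event_id": latest_event_id},
        },
    }


def plain_reply(body, target_event_id):
    """What a client without thread support sends when replying."""
    return {
        "msgtype": "m.text",
        "body": body,
        "m.relates_to": {"m.in_reply_to": {"event_id": target_event_id}},
    }


def is_in_thread(content):
    """A message only belongs to a thread if it carries the m.thread relation."""
    return content.get("m.relates_to", {}).get("rel_type") == "m.thread"


# Hypothetical event ids, for illustration only.
inside = threaded_message("hi", "$root", "$latest")
outside = plain_reply("oops", "$latest")
print(is_in_thread(inside), is_in_thread(outside))
```

The second message replies to an event that lives in a thread, but since it has no `m.thread` relation it lands in the main timeline, which is the accidental breakage described above.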

Making threads their own rooms (threads as rooms) would solve some of these problems, but would cause problems elsewhere. Searching messages across the main room and threads would be harder to implement. Access control would need to be synced, and "rogue threads" which don't set power levels for the parent room's moderators could be created. There's no way to add other people to threads (ie. in the case of "reply in thread"). Letting people peek in threads without joining would rely on features that don't exist currently, like federated peeking as well as a version of history_visibility for peeking. This could also lead to amplification: currently, creating a thread is as expensive as sending a message, whereas each thread as room would need its own set of state (m.room.create, m.room.member, m.room.join_rules, etc).

(Aside: it looks like having a "main thread/chat" and "threads" is fundamentally incompatible. Most people would like messages to be properly threaded, but there's an easier to use main thread and ui encouraging its use. There's a fundamental tension and neither option is quite correct.)

push rules

Push rules are how a matrix client and server decides whether a message sends a push notification, makes a sound, and so forth.

All push rules go in one giant blob in global account data, some of which I think should have been split out into per-room account data. Push rules have seemingly arbitrary groupings: muting a room goes in the overrides group, but receiving notifications for all messages goes in the rooms group.
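As a rough illustration of the grouping oddity, this sketch shows the two rules as I understand clients like element to write them. The rule shapes and the `/pushrules/global/{kind}/{ruleId}` path come from the client-server spec as far as I know; the room id and the `rule_path` helper are made up.

```python
from urllib.parse import quote

ROOM_ID = "!example:example.org"  # hypothetical room id

# Muting a room: an *override* rule that matches the room and does nothing.
mute_rule = {
    "kind": "override",
    "rule_id": ROOM_ID,
    "conditions": [
        {"kind": "event_match", "key": "room_id", "pattern": ROOM_ID},
    ],
    "actions": [],
}

# Notifying on every message: a *room* rule, keyed by the room id itself,
# with no explicit conditions.
all_messages_rule = {
    "kind": "room",
    "rule_id": ROOM_ID,
    "actions": ["notify"],
}


def rule_path(kind, rule_id):
    """Path a client would PUT the rule to (illustrative helper)."""
    return f"/_matrix/client/v3/pushrules/global/{kind}/{quote(rule_id, safe='')}"


for rule in (mute_rule, all_messages_rule):
    print(rule_path(rule["kind"], rule["rule_id"]))
```

Two settings that sit next to each other in a client's ui end up in different rule kinds with different shapes, so changing one into the other means deleting a rule in one group and creating a rule in another.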

When changing a lot of room-specific push rules, the rules can overwrite each other in flight, causing some room-specific push rules to reset/be dropped. This is technically an issue with all push rules, but most of the time it isn't noticeable to end users. Push rules get around this by having an entire separate set of apis for changing fine-grained account data.

The rules are written in a json dsl-like thing that is needlessly flexible. This causes performance issues. Simultaneously, they're somehow extremely rigid. It is only possible to use one of the 6 predefined conditions and the only counters that exist are notify_count and highlight_count. Notably, there's no unread_count. My workaround for now is to patch clients to not notify unless the highlight tweak is set, and use notify_count as an unread count.

Mentions are currently implemented as searching the message body for a display name/username, which can cause accidental notifications. Saying people's names without notifying them needs workarounds like 1337speek. This is slated to be fixed with intentional mentions, where mentions are extracted to their own json field. Unfortunately, it looks like encrypted messages will put the mentions in encrypted content, breaking rules (todo: heard this from somewhere, but need a source). Ironically, the one time metadata is made less leaky, it also happens to be the one time where it makes sense and is needed for good ux.
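A small sketch of the difference, assuming the `m.mentions` field with a `user_ids` list from the intentional mentions proposal. The body search here is a simplified word match, not the exact matching the spec's default rules use, and the user ids and function names are hypothetical.

```python
import re


def legacy_mention(body, display_name, localpart):
    """Body search: notify if the display name or username appears as a word.

    Simplified approximation of the old behaviour, not the spec's exact rule.
    """
    for needle in (display_name, localpart):
        pattern = rf"(?<!\w){re.escape(needle)}(?!\w)"
        if re.search(pattern, body, re.IGNORECASE):
            return True
    return False


def intentional_mention(content, user_id):
    """Intentional mentions: only notify if the sender listed the user."""
    mentions = content.get("m.mentions", {})
    return user_id in mentions.get("user_ids", [])


# Talking *about* alice without meaning to ping her.
gossip = {"msgtype": "m.text", "body": "alice said she'd be late"}
# Explicitly pinging alice (hypothetical user id).
ping = {
    "msgtype": "m.text",
    "body": "alice: are you around?",
    "m.mentions": {"user_ids": ["@alice:example.org"]},
}

print(legacy_mention(gossip["body"], "Alice", "alice"))       # body search fires
print(intentional_mention(gossip, "@alice:example.org"))      # no ping intended
print(intentional_mention(ping, "@alice:example.org"))        # explicit ping
```

Note that `intentional_mention` needs to read the event content, which is exactly what a server can't do if that field sits inside the encrypted payload.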

The /notifications endpoint paginates through a list of pings/mentions, though due to the above paragraph it won't show any mentions in encrypted rooms. There's no way to filter mentions per room/space or by whether it's an @room mention. Although "list of mentions" makes sense naively, it would be much better to have an inbox where messages can be added and removed.

No client supports marking rooms, threads, or events as unread.

There's no way to temporarily mute rooms (or all notifications). Since a room can be in multiple spaces, there's no easy way to mute all rooms in a space. There's no way to only mute @room notifications while allowing user mentions.

e2ee

Matrix's implementation of end to end encryption leaks a lot of metadata, including but not limited to room name/topic/avatar, member per-room displaynames/avatars, and reactions.

Newly joined members can't view old e2ee messages. This is fixable, but not yet fixed.

It's also remarkably easy to get undecryptable messages by accident, and is the single biggest reason why I'm not recommending matrix to anyone else for now. It needs to be more robust and have a better ux.

apis

Syncing is slow and sends tons of extra data, though sliding sync is aiming to fix it. No comment will be made on sliding sync, as its apis are currently in flux. EDIT: Sliding sync is shaping up quite nicely, but I think there are some mistakes, like room_name_like and sorting.

Similarly, the authentication mess is slated to be replaced with OIDC, so no comment there until it's fleshed out more. The current system isn't terrible, but isn't great either. Getting a seamless ux, both for users and server admins, looks like it will be pretty difficult to achieve.

A lot of apis are unergonomic or difficult to use. For example, every event/message needs to be sent with a random transaction id, which is annoying when trying to use webhooks. The api doesn't use one event model: there are events, events without room_id, stripped state events, and stripped state events with origin_server_ts. The application service api doesn't support e2ee without workarounds yet.
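For the transaction id complaint, this is roughly what every sender has to do. The `PUT .../send/{eventType}/{txnId}` path is from the client-server spec as far as I know; the helper name and room id are made up, and the sketch only builds the request rather than sending it.

```python
import uuid
from urllib.parse import quote


def send_event_request(room_id, event_type, content, txn_id=None):
    """Build (method, path, body) for sending an event.

    The caller has to invent a unique txn_id per request; a simple
    fire-and-forget webhook has no natural value to put here.
    """
    txn_id = txn_id or uuid.uuid4().hex
    path = (
        "/_matrix/client/v3/rooms/"
        f"{quote(room_id, safe='')}/send/{quote(event_type, safe='')}/{txn_id}"
    )
    return "PUT", path, content


method, path, body = send_event_request(
    "!example:example.org",  # hypothetical room id
    "m.room.message",
    {"msgtype": "m.text", "body": "build passed"},
)
print(method, path)
```

A webhook that can only POST a fixed url can't fill in the last path segment, so it needs a small proxy in front of the homeserver just to generate ids.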

The media repository (media repo) lets users store blobs, but it feels tacked on instead of part of the protocol. Each piece of media is stored on a single canonical and trusted server (aka point of failure), unlike rooms. There is no way to delete uploaded blobs from the media repo.

The apis are trying to be generic while geared towards instant messaging: /search only indexes blessed keys and event types, which are different per server. This could be fixed by having a blessed body property that is always indexed, regardless of event type. Events are able to form graphs but can't be recursively queried. Rooms didn't have types until relatively recently, and most clients will display all rooms regardless of type anyway. There also isn't any subtyping (mutable types, ie. to convert a room between a normal room and dm).

I know instant messaging is the main purpose of matrix, but for medium-large communities I really wish there was some form of forum system. Something similar to tildes.net would be ideal, especially the hierarchical tagging and comment labelling, though the extra sorting/filtering is probably too much to ask.

bots

Bots are hard to get right, as there's no way to use structured commands (ie. discord and telegram slash commands). Bots need to parse the message's body. This has problems, since reply fallbacks exist. Parsing mentions and other things from the body is also difficult, requiring parsing html.
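As an example of the extra work, here is a minimal sketch of stripping the plain-text reply fallback before parsing a command. It assumes the fallback format I'm familiar with (quoted lines prefixed with `> `, then one blank line, then the actual message); the function name and sample event are hypothetical.

```python
def command_text(content):
    """Return the message body with any plain-text reply fallback removed.

    Only strips when the event is actually a reply, so a user who
    starts a normal message with a quote isn't mangled.
    """
    body = content.get("body", "")
    relates_to = content.get("m.relates_to", {})
    if "m.in_reply_to" not in relates_to:
        return body

    lines = body.split("\n")
    i = 0
    # Skip the quoted copy of the message being replied to.
    while i < len(lines) and (lines[i].startswith("> ") or lines[i] == ">"):
        i += 1
    # Skip the single blank line separating fallback from the reply.
    if 0 < i < len(lines) and lines[i] == "":
        i += 1
    return "\n".join(lines[i:])


reply = {
    "msgtype": "m.text",
    "body": "> <@alice:example.org> !ban bob\n\n!help",
    "m.relates_to": {"m.in_reply_to": {"event_id": "$original"}},
}
print(command_text(reply))
```

Without the stripping step, a naive bot that looks for commands anywhere in the body could act on the quoted `!ban bob` instead of the `!help` the user actually typed.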

There's no way to specify interactions, like buttons, menus, canned responses, preset reactions, etc. Reaction-based uis are the best to currently exist, but it takes n + 1 events where n is the number of "buttons".
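The n + 1 cost can be sketched like this. The `m.reaction` content shape with an `m.annotation` relation is from the spec as far as I know; the helper name is made up, and the `$prompt` event id is a placeholder for the id the server would return after the prompt is sent.

```python
def menu_events(prompt, options, prompt_event_id="$prompt"):
    """List the (type, content) pairs a bot sends for a reaction 'menu'.

    One message for the prompt, then one reaction per button so users
    can click an existing reaction instead of searching for the emoji.
    """
    events = [("m.room.message", {"msgtype": "m.text", "body": prompt})]
    for key in options:
        events.append((
            "m.reaction",
            {
                "m.relates_to": {
                    "rel_type": "m.annotation",
                    "event_id": prompt_event_id,
                    "key": key,
                }
            },
        ))
    return events


events = menu_events("deploy to prod?", ["✅", "❌", "⏸️"])
print(len(events))  # prompt plus one event per button
```

A single key on the prompt listing the preset reactions would bring this down to one event.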

Matrix seems to be heading towards a widget-based ui system, where widgets are embedded webpages/webapps. Clients that want to support widgets would need to embed an entire web browser. Even though it could be more flexible, it would be much more annoying for bot developers to learn an entirely separate widget api as well as use html/css/js instead of matrix events and json. Since widgets seem to be able to do interactions on behalf of the user(?), clients need to take extra care in implementing sandboxing and access control. Widget ui also wouldn't fit in as well with the rest of the client. Widgets might have issues with privacy and tracking because they can use arbitrary html/js/css.

Bots can use the full set of html, including tables and details/summary. However, different clients may only support certain subsets of html.

potpourri

Matrix can be generalized as logs of arbitrary json events combined with a crdt of map<(type, state_key), event>. Although this could lead to so many use cases, the official client-server api isn't flexible enough as mentioned above. However, it's entirely possible to use the core matrix protocol with a custom client-server api + custom client, activitypub style.
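A toy version of that generalization, to show how little is needed at the core. This is a deliberately simplified single-server sketch: it ignores the event dag, auth rules, and state resolution entirely, and the class and method names are made up.

```python
class Room:
    """Toy room: an append-only event log plus a state map.

    Real matrix rooms resolve conflicting state across servers; here
    the last writer simply wins.
    """

    def __init__(self):
        self.timeline = []  # every event, in arrival order
        self.state = {}     # (type, state_key) -> latest state event

    def append(self, event):
        self.timeline.append(event)
        # Only events carrying a state_key update the state map.
        if "state_key" in event:
            self.state[(event["type"], event["state_key"])] = event

    def get_state(self, event_type, state_key=""):
        return self.state.get((event_type, state_key))


room = Room()
room.append({"type": "m.room.name", "state_key": "", "content": {"name": "old"}})
room.append({"type": "m.room.message", "content": {"body": "hello"}})
room.append({"type": "m.room.name", "state_key": "", "content": {"name": "new"}})

print(len(room.timeline), room.get_state("m.room.name")["content"]["name"])
```

Nothing in this model is specific to chat, which is why a custom client-server api on top of the same core is plausible.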

Messages can either be text or a single file, but not both (like a file with a comment) or multiple attachments.

Clients are missing features or are buggy. A lot of spec is dictated by synapse and element, even when spec.matrix.org says otherwise. Old and buggy room versions exist in the wild, with no incentive to upgrade. Federation is excellent in theory, but in practice servers tend to drop events.

Power levels are pretty rudimentary. There's no way to allow/deny specific rel_types, only event types. There is no role based access control.

There's no way to peek into dms. (Or federated rooms, for that matter.)

There are many msc (matrix spec change) proposals that seem to have gotten stuck, like custom emoji/stickers.

Room-specific user displaynames/avatars can be overwritten when the global displayname/avatar is changed.

Anything added to spec can't be removed later, which is to be expected for a protocol. But when mscs are opened, events are sent with unstable identifiers. Clients later will need to support both the official type and unstable identifier to support past events.

Moderation is lacking, and it's pretty much required to use a bot (mjolnir) (not that bad, but a bit annoying for smaller rooms). When cleaning up spam, a redaction event must be sent for each message/event - bulk redactions (msc2244) has been merged but isn't in spec and doesn't seem to be implemented by any server.

GET /state returns the content of a state event, not the full event itself.

conclusion

7.8/10 too much water

wishlist

In no particular order. These requests shouldn't be too difficult to implement, but would improve usability by so much. These aren't nearly fleshed out enough to be full mscs.

  • Clean up threads and improve usability
    • DELETE /rooms/{roomId}/threads/{eventId} to stop participating in a thread
    • PUT /rooms/{roomId}/threads/{eventId}/participation to change whether you're participating in a thread.
    • Thread-only rooms.
  • Redo power levels
    • Base them on (from_event_type, rel_type, to_event_type) tuples rather than (event_type) alone.
    • Implement rbac (roles). Ideally, it would use explicit deny as the permission model. May be blocked on "give every member a role" use cases.
  • Add a way to make spaces sync subsets of state with the child rooms. msc3216 and msc2962 exist for power levels, which may be enough for most use cases
  • Add a way to mute rooms temporarily. Maybe add a push rule condition for this, or better yet redo push rules entirely. (I know the whole "don't rewrite what works", but seriously...)
  • Make aliases and visibility less confusing. There's room visibility, history visibility, local/published/main alias(es), and the room directory. In ui, room directory could be merged into room visibility. I'm not sure how to simplify aliases without reducing functionality.
  • Improve /notifications to be more like a proper inbox
    • Use intentional mentions (even in encrypted rooms) to populate the list
    • DELETE /notifications + DELETE /notifications/{eventId} along with automatically removing old notifications to enhance usability
    • PUT /notifications/{eventId} to bookmark an event for later (though, this may be better served with "starred/bookmarked messages")
    • Some way to reorder notifications? Maybe the PUT endpoint could take an "order" key.
  • Add a m.preset_reactions or m.interactions key for bots.
  • Bot supplied edits or annotations?
  • unread_count to view the number of unread messages in a room, see msc2654
  • This might only be something I want, but another presence/status type different from online for when you're explicitly available to talk to would be nice. I really want to be able to have a clear split between "being on the internet" and "willing to have my attention taken"
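To give the "redo power levels" item above a bit more shape, here is one way the (from_event_type, rel_type, to_event_type) lookup could work. Everything here is hypothetical: the rule table, the use of `None` as a wildcard, and the most-specific-match-wins order are my own assumptions for the sketch, not part of any msc.

```python
def required_level(rules, from_type, rel_type, to_type, default=0):
    """Power level needed to send `from_type` relating via `rel_type` to `to_type`.

    Hypothetical semantics: the most specific matching tuple wins and
    None acts as a wildcard.
    """
    candidates = [
        (from_type, rel_type, to_type),
        (from_type, rel_type, None),
        (from_type, None, None),
    ]
    for key in candidates:
        if key in rules:
            return rules[key]
    return default


# Hypothetical room config: anyone can chat, threads need level 10,
# and reacting to state events is moderator-only.
rules = {
    ("m.room.message", None, None): 0,
    ("m.room.message", "m.thread", None): 10,
    ("m.reaction", "m.annotation", "m.room.topic"): 50,
}

print(required_level(rules, "m.room.message", None, None))
print(required_level(rules, "m.room.message", "m.thread", "m.room.message"))
print(required_level(rules, "m.reaction", "m.annotation", "m.room.topic"))
```

With a table like this, "disable threads in a room" becomes a single entry with a level nobody has, rather than a new protocol feature.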

:::