forked from mirror/grapevine

Compare commits


29 commits

Author SHA1 Message Date
Benjamin Lee
2e6a5f30cb
implement per-event filtering for search 2024-06-05 00:07:07 -07:00
Benjamin Lee
d2fab35868
implement more robust (not_)rooms filter for search
The previous code only handled the rooms field, and ignored not_rooms.
2024-06-05 00:07:07 -07:00
Benjamin Lee
fe5626e93a
implement per-event filtering for /context
'end_token' and 'start_token' have been refactored a bit because we need
to take the bounds of the examined events *before* filtering, otherwise
we'll send a pagination token to the client that is inside the set of
events we examined on this call. In extreme situations, this may leave a
client unable to make progress at all, because the first event that
matches the filter is more than 'load_limit' away from the base event.

This bug was present before the filtering implementation, but was less
significant because we only dropped events when they were not visible to
the user.
2024-06-05 00:07:07 -07:00
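The bounds-before-filtering pattern this commit describes can be sketched with plain iterators (`paginate` and the even-number filter are hypothetical stand-ins for `pdus_until`/`pdus_after` and the event filter): the pagination token must cover every event *examined*, not just those that survived the filter, or a client whose next matching event is far away can get stuck re-requesting the same window.

```rust
// Hypothetical sketch: record the pagination bound with `inspect` *before*
// the filter drops events, mirroring the start_token/end_token refactor.
fn paginate(events: &[i32], load_limit: usize) -> (Vec<i32>, Option<i32>) {
    let mut last_examined = None;
    let kept = events
        .iter()
        .take(load_limit) // cap on events examined, not events returned
        .inspect(|&&n| last_examined = Some(n)) // bounds taken pre-filter
        .filter(|&&n| n % 2 == 0) // stand-in for the event filter
        .copied()
        .collect();
    (kept, last_examined)
}

fn main() {
    let (kept, token) = paginate(&[1, 2, 3, 4, 5, 6], 4);
    assert_eq!(kept, vec![2, 4]);
    // The token advances past all four examined events, so the next call
    // starts fresh even though two of them were filtered out.
    assert_eq!(token, Some(4));
}
```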
Benjamin Lee
fa86a8701d
implement filter.limit in /context
This seems to be completely redundant with the 'limit' body parameter,
with the only difference being:

> The filter may be applied before or/and after the limit parameter -
> whichever the homeserver prefers.

This sentence seems to apply to the 'limit' body parameter, but not to
the 'limit' field on the filter. This was probably unintentional on the
part of the spec authors, but I've switched to using the same 'load_limit'
pattern we're using elsewhere anyway.
2024-06-05 00:07:07 -07:00
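The resulting limit computation (visible in the `/context` diff further down) is a chain of `min` calls; a minimal sketch, with `effective_limit` as a hypothetical name and `u64` standing in for ruma's `UInt`:

```rust
// Hypothetical sketch: the request's own limit, the filter's limit (if
// any), and the server-side cap of 100 are combined with min.
fn effective_limit(body_limit: u64, filter_limit: Option<u64>) -> u64 {
    body_limit.min(filter_limit.unwrap_or(u64::MAX)).min(100)
}

fn main() {
    assert_eq!(effective_limit(50, None), 50);
    assert_eq!(effective_limit(50, Some(20)), 20); // filter.limit wins
    assert_eq!(effective_limit(500, None), 100); // server cap applies
}
```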
Benjamin Lee
1c0ead0339
implement (not_)rooms filter on /context 2024-06-05 00:07:07 -07:00
Benjamin Lee
832b41c930
implement filter limit for ephemeral events on /sync 2024-06-05 00:07:07 -07:00
Benjamin Lee
1114b66670
implement filter limit for timeline events on /sync 2024-06-05 00:07:07 -07:00
Benjamin Lee
738acd2b35
fixup: rename filter to compiled_filter in load_joined_room
Need to split this into a bunch of different fixup commits...
2024-06-05 00:07:07 -07:00
Benjamin Lee
1410b6f409
implement per-event filtering for per-room account_data on /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
7a0b8c986f
implement global account_data filtering in /sync
TODO docs on raw_event_allowed, and figure out how we want to organize
it with CompiledRoomEventFilter::raw_event_allowed
2024-06-05 00:07:06 -07:00
Benjamin Lee
c3cf97df7a
implement filter.room.include_leave for /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
d69b88566a
implement per-event filtering for ephemeral events in /sync
I've asked a few times for clarification on whether the `senders` field
in the filter applies to userids mentioned in the typing/receipt ephemeral
events, and never got a response. Synapse does not filter these userids by
sender, so we're gonna go with that.
2024-06-05 00:07:06 -07:00
Benjamin Lee
98d93da3a8
implement per-event state filtering for joined rooms in /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
f4f3be8c30
implement per-event state filtering for left rooms in /sync 2024-06-05 00:07:06 -07:00
Benjamin Lee
b85110a292
implement per-event state filtering for invited rooms on /sync
This one is a little weird, because the stripped invite state events are
not deserialized.
2024-06-05 00:07:06 -07:00
Benjamin Lee
4c9728cbad
implement per-event timeline filtering on /sync
This is the filter.room.timeline.{senders,types,contains_url} fields, and
their associated not_* pairs.

I decided not to change the `prev_batch` calculation for sliding-sync to
use the new `oldest_event_count` value, because I'm not confident in the
correct behavior. The current sliding-sync behavior gives `prev_batch
= oldest_event_count` except when there are no new events. In this
case, `oldest_event_count` is `None`, but the current sliding-sync
implementation uses `prev_batch = since`. This is definitely wrong,
because both `since` and `prev_batch` are exclusive bounds. If the
correct thing to do is to return the lower exclusive bound of the range
of events that may have been included in the timeline, then we would
want `since - 1`. The other option would be to return `prev_batch =
None`, like we have in sync v3. I don't know which of these is correct,
so I'm just gonna keep the current (definitely incorrect) behavior to
avoid making things worse.
2024-06-05 00:07:06 -07:00
Benjamin Lee
745eaa9b48
implement room.account_data.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
84f356e67b
implement room.ephemeral.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
4e1d091bbc
skip left/invited rooms with no updates in /sync
Before this change we were just returning an empty object for left or
invited rooms that don't have any updates. This is valid according to
the spec, but it's nicer to debug if they are omitted, and it results in
a little less network traffic. For joined rooms, we are already skipping
empty updates.

With filtering support, it's much more common to have sync responses where
many rooms are empty, because all of the state/timeline events may be
filtered out.
2024-06-04 20:02:43 -07:00
Benjamin Lee
e6f2b6c9ad
implement room.state.(not_)rooms filter on /sync 2024-06-04 20:02:43 -07:00
Benjamin Lee
c48abf9f13
implement room.timeline.(not_)rooms filter on /sync
I asked in #matrix-spec:matrix.org and got clarification that we should be
omitting the timeline field completely for rooms that are filtered out
by the timeline.(not_)rooms filter. Ruma's skip_serializing_if attribute
on the timeline field will currently cause it to be omitted when events is
empty. If [this fix][1] is merged, it will be omitted only when events is
empty, prev_batch is None, and limited is false.

[1]: https://github.com/ruma/ruma/pull/1796

TODO: maybe do something about clippy::too_many_arguments
2024-06-04 20:02:43 -07:00
Benjamin Lee
458d6842fb
implement top-level (not_)rooms filter on /sync
These are the fields at filter.room.{rooms,not_rooms}, that apply to all
categories. The category-specific room filters are in
filter.room.{state,timeline,ephemeral}.{rooms,not_rooms}.
2024-06-04 20:02:42 -07:00
Benjamin Lee
5d4aa35463
implement types and not_types filters on /message
One thing I'm a little worried about with this implementation is that
it's possible for some wildcard expressions to result in a
regex::Error::CompiledTooBig error. It seems like rejecting patterns that
would result in a ReDOS is a good idea, but the matrix spec doesn't say
anything about it.
2024-06-04 20:02:42 -07:00
Benjamin Lee
75523fa3e0
implement senders and not_senders filters on /messages 2024-06-04 20:02:42 -07:00
Benjamin Lee
0e2694a6c4
implement contains_url filter for /message
The plan is to move all of the per-event checks into the
pdu_event_allowed function.

TODO: split the `visibility_filter` function into its own commit. I
think this one was inherited from an earlier conduwuit commit.
2024-06-04 20:02:42 -07:00
Benjamin Lee
2bcd357db2
limit total events examined in /messages 2024-06-04 20:02:42 -07:00
Benjamin Lee
93ad93a36b
respect filter.limit in the /messages endpoint
As far as I can tell, 'filter.limit' and the 'limit' query parameter are
completely redundant. I've moved the 'take(limit)' call to after
filtering, to ensure that we can return up to 'limit' events even when
some are rejected by the filter. In a future commit, I will add a global
limit on loaded events to avoid DoS.
2024-06-04 20:02:42 -07:00
Benjamin Lee
404d5fae6c
implement rooms and not_rooms filters on /message
I really doubt anybody is sending /message requests with a filter that
rejects the entire request, but it's the first step in the filter
implementation.
2024-06-04 20:02:42 -07:00
Charles Hall
a5e7ce6c33
improve "Leave event has no state" event
Now it includes the user, room, and event ID. As a bonus, the sync
function is now slightly less gigantic.

TODO: put this in a separate MR, and include a similar change for
invited rooms
2024-06-04 20:02:42 -07:00
6 changed files with 1087 additions and 398 deletions


@ -5,11 +5,15 @@ use ruma::{
context::get_context, error::ErrorKind, filter::LazyLoadOptions,
},
events::StateEventType,
uint,
uint, UInt,
};
use tracing::error;
use crate::{services, Ar, Error, Ra, Result};
use crate::{
services,
utils::filter::{load_limit, CompiledRoomEventFilter},
Ar, Error, Ra, Result,
};
/// # `GET /_matrix/client/r0/rooms/{roomId}/context`
///
@ -26,6 +30,13 @@ pub(crate) async fn get_context_route(
let sender_device =
body.sender_device.as_ref().expect("user is authenticated");
let Ok(filter) = CompiledRoomEventFilter::try_from(&body.filter) else {
return Err(Error::BadRequest(
ErrorKind::InvalidParam,
"invalid 'filter' parameter",
));
};
let (lazy_load_enabled, lazy_load_send_redundant) =
match &body.filter.lazy_load_options {
LazyLoadOptions::Enabled {
@ -68,18 +79,43 @@ pub(crate) async fn get_context_route(
lazy_loaded.insert(base_event.sender.as_str().to_owned());
}
// Use limit with maximum 100
let half_limit = usize::try_from(body.limit.min(uint!(100)) / uint!(2))
.expect("0-50 should fit in usize");
let limit: usize = body
.limit
.min(body.filter.limit.unwrap_or(UInt::MAX))
.min(uint!(100))
.try_into()
.expect("0-100 should fit in usize");
let half_limit = limit / 2;
let base_event = base_event.to_room_event();
if !filter.room_allowed(&body.room_id) {
// The spec states that
//
// > The filter is only applied to events_before, events_after, and
// > state. It is not applied to the event itself.
//
// so we need to fetch the event before we can early-return after
// testing the room filter.
return Ok(Ra(get_context::v3::Response {
start: None,
end: None,
events_before: vec![],
event: Some(base_event),
events_after: vec![],
state: vec![],
}));
}
let mut start_token = None;
let events_before: Vec<_> = services()
.rooms
.timeline
.pdus_until(sender_user, &room_id, base_token)?
.take(half_limit)
.take(load_limit(half_limit))
.filter_map(Result::ok)
.inspect(|&(count, _)| start_token = Some(count))
.filter(|(_, pdu)| filter.pdu_event_allowed(pdu))
.filter(|(_, pdu)| {
services()
.rooms
@ -87,8 +123,11 @@ pub(crate) async fn get_context_route(
.user_can_see_event(sender_user, &room_id, &pdu.event_id)
.unwrap_or(false)
})
.take(half_limit)
.collect();
let start_token = start_token.map(|token| token.stringify());
for (_, event) in &events_before {
if !services().rooms.lazy_loading.lazy_load_was_sent_before(
sender_user,
@ -101,19 +140,18 @@ pub(crate) async fn get_context_route(
}
}
let start_token = events_before
.last()
.map_or_else(|| base_token.stringify(), |(count, _)| count.stringify());
let events_before: Vec<_> =
events_before.into_iter().map(|(_, pdu)| pdu.to_room_event()).collect();
let mut end_token = None;
let events_after: Vec<_> = services()
.rooms
.timeline
.pdus_after(sender_user, &room_id, base_token)?
.take(half_limit)
.take(load_limit(half_limit))
.filter_map(Result::ok)
.inspect(|&(count, _)| end_token = Some(count))
.filter(|(_, pdu)| filter.pdu_event_allowed(pdu))
.filter(|(_, pdu)| {
services()
.rooms
@ -121,8 +159,11 @@ pub(crate) async fn get_context_route(
.user_can_see_event(sender_user, &room_id, &pdu.event_id)
.unwrap_or(false)
})
.take(half_limit)
.collect();
let end_token = end_token.map(|token| token.stringify());
for (_, event) in &events_after {
if !services().rooms.lazy_loading.lazy_load_was_sent_before(
sender_user,
@ -150,10 +191,6 @@ pub(crate) async fn get_context_route(
let state_ids =
services().rooms.state_accessor.state_full_ids(shortstatehash).await?;
let end_token = events_after
.last()
.map_or_else(|| base_token.stringify(), |(count, _)| count.stringify());
let events_after: Vec<_> =
events_after.into_iter().map(|(_, pdu)| pdu.to_room_event()).collect();
@ -179,8 +216,8 @@ pub(crate) async fn get_context_route(
}
let resp = get_context::v3::Response {
start: Some(start_token),
end: Some(end_token),
start: start_token,
end: end_token,
events_before,
event: Some(base_event),
events_after,


@ -9,12 +9,14 @@ use ruma::{
message::{get_message_events, send_message_event},
},
events::{StateEventType, TimelineEventType},
uint,
uint, RoomId, UInt, UserId,
};
use crate::{
service::{pdu::PduBuilder, rooms::timeline::PduCount},
services, utils, Ar, Error, Ra, Result,
services, utils,
utils::filter::{load_limit, CompiledRoomEventFilter},
Ar, Error, PduEvent, Ra, Result,
};
/// # `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}`
@ -136,6 +138,13 @@ pub(crate) async fn get_message_events_route(
let sender_device =
body.sender_device.as_ref().expect("user is authenticated");
let Ok(filter) = CompiledRoomEventFilter::try_from(&body.filter) else {
return Err(Error::BadRequest(
ErrorKind::InvalidParam,
"invalid 'filter' parameter",
));
};
let from = match body.from.clone() {
Some(from) => PduCount::try_from_string(&from)?,
None => match body.dir {
@ -144,6 +153,15 @@ pub(crate) async fn get_message_events_route(
},
};
if !filter.room_allowed(&body.room_id) {
return Ok(Ra(get_message_events::v3::Response {
start: from.stringify(),
end: None,
chunk: vec![],
state: vec![],
}));
}
let to = body.to.as_ref().and_then(|t| PduCount::try_from_string(t).ok());
services()
@ -159,6 +177,7 @@ pub(crate) async fn get_message_events_route(
let limit = body
.limit
.min(body.filter.limit.unwrap_or(UInt::MAX))
.min(uint!(100))
.try_into()
.expect("0-100 should fit in usize");
@ -175,20 +194,14 @@ pub(crate) async fn get_message_events_route(
.rooms
.timeline
.pdus_after(sender_user, &body.room_id, from)?
.take(limit)
.take(load_limit(limit))
.filter_map(Result::ok)
.filter(|(_, pdu)| {
services()
.rooms
.state_accessor
.user_can_see_event(
sender_user,
&body.room_id,
&pdu.event_id,
)
.unwrap_or(false)
filter.pdu_event_allowed(pdu)
&& visibility_filter(pdu, sender_user, &body.room_id)
})
.take_while(|&(k, _)| Some(k) != to)
.take(limit)
.collect();
for (_, event) in &events_after {
@ -231,20 +244,14 @@ pub(crate) async fn get_message_events_route(
.rooms
.timeline
.pdus_until(sender_user, &body.room_id, from)?
.take(limit)
.take(load_limit(limit))
.filter_map(Result::ok)
.filter(|(_, pdu)| {
services()
.rooms
.state_accessor
.user_can_see_event(
sender_user,
&body.room_id,
&pdu.event_id,
)
.unwrap_or(false)
filter.pdu_event_allowed(pdu)
&& visibility_filter(pdu, sender_user, &body.room_id)
})
.take_while(|&(k, _)| Some(k) != to)
.take(limit)
.collect();
for (_, event) in &events_before {
@ -310,3 +317,15 @@ pub(crate) async fn get_message_events_route(
Ok(Ra(resp))
}
fn visibility_filter(
pdu: &PduEvent,
user_id: &UserId,
room_id: &RoomId,
) -> bool {
services()
.rooms
.state_accessor
.user_can_see_event(user_id, room_id, &pdu.event_id)
.unwrap_or(false)
}


@ -1,4 +1,4 @@
use std::collections::BTreeMap;
use std::{borrow::Cow, collections::BTreeMap};
use ruma::{
api::client::{
@ -14,7 +14,11 @@ use ruma::{
uint, UInt,
};
use crate::{services, Ar, Error, Ra, Result};
use crate::{
services,
utils::filter::{AllowDenyList, CompiledRoomEventFilter},
Ar, Error, Ra, Result,
};
/// # `POST /_matrix/client/r0/search`
///
@ -30,15 +34,28 @@ pub(crate) async fn search_events_route(
let search_criteria = body.search_categories.room_events.as_ref().unwrap();
let filter = &search_criteria.filter;
let Ok(compiled_filter) = CompiledRoomEventFilter::try_from(filter) else {
return Err(Error::BadRequest(
ErrorKind::InvalidParam,
"invalid 'filter' parameter",
));
};
let room_ids = filter.rooms.clone().unwrap_or_else(|| {
services()
.rooms
.state_cache
.rooms_joined(sender_user)
.filter_map(Result::ok)
.collect()
});
let mut room_ids = vec![];
if let AllowDenyList::Allow(allow_set) = &compiled_filter.rooms {
for &room_id in allow_set {
if services().rooms.state_cache.is_joined(sender_user, room_id)? {
room_ids.push(Cow::Borrowed(room_id));
}
}
} else {
for result in services().rooms.state_cache.rooms_joined(sender_user) {
let room_id = result?;
if compiled_filter.rooms.allowed(&room_id) {
room_ids.push(Cow::Owned(room_id));
}
}
}
// Use limit or else 10, with maximum 100
let limit = filter
@ -51,13 +68,6 @@ pub(crate) async fn search_events_route(
let mut searches = Vec::new();
for room_id in room_ids {
if !services().rooms.state_cache.is_joined(sender_user, &room_id)? {
return Err(Error::BadRequest(
ErrorKind::forbidden(),
"You don't have permission to view this room.",
));
}
if let Some(search) = services()
.rooms
.search
@ -100,6 +110,7 @@ pub(crate) async fn search_events_route(
.timeline
.get_pdu_from_id(result)
.ok()?
.filter(|pdu| compiled_filter.pdu_event_allowed(pdu))
.filter(|pdu| {
services()
.rooms

File diff suppressed because it is too large


@ -1,4 +1,5 @@
pub(crate) mod error;
pub(crate) mod filter;
use std::{
borrow::Cow,

src/utils/filter.rs Normal file

@ -0,0 +1,393 @@
//! Helper tools for implementing filtering in the `/client/v3/sync` and
//! `/client/v3/rooms/:roomId/messages` endpoints.
//!
//! The default strategy for filtering is to generate all events, check them
//! against the filter, and drop events that were rejected. When significant
//! fraction of events are rejected, this results in a large amount of wasted
//! work computing events that will be dropped. In most cases, the structure of
//! our database doesn't allow for anything fancier, with only a few exceptions.
//!
//! The first exception is room filters (`rooms`/`not_rooms` pairs in
//! `filter.room` and `filter.room.{account_data,timeline,ephemeral,state}`).
//! In `/messages`, if the room is rejected by the filter, we can skip the
//! entire request. The outer loop of our `/sync` implementation is over rooms,
//! and so we are able to skip work for an entire room if it is rejected by the
//! top-level `filter.room.rooms`. Similarly, when a room is rejected for all
//! events in a particular category, we can skip work generating events in that
//! category for the rejected room.
//!
//! The second exception is ephemeral event types (`types`/`not_types` in
//! `filter.room.ephemeral`). For these, we can skip work generating events of a
//! particular type in `/sync` if it is rejected.
use std::{borrow::Cow, collections::HashSet, hash::Hash};
use regex::RegexSet;
use ruma::{
api::client::filter::{
Filter, FilterDefinition, RoomEventFilter, RoomFilter, UrlFilter,
},
serde::Raw,
OwnedUserId, RoomId, UserId,
};
use serde::Deserialize;
use tracing::error;
use crate::{Error, PduEvent};
// 'DoS' is not a type
#[allow(clippy::doc_markdown)]
/// Returns the total limit of events to examine when evaluating a filter.
///
/// When a filter matches only a very small fraction of available events, we may
/// need to examine a very large number of events before we find enough allowed
/// events to fill the supplied limit. This is a possible DoS vector, and a
/// performance issue for legitimate requests. To avoid this, we put a "load
/// limit" on the total number of events that will be examined. This value is
/// always higher than the original event limit.
pub(crate) fn load_limit(limit: usize) -> usize {
// the 2xlimit value was pulled from synapse, and no real performance
// measurement has been done on our side yet to determine whether it's
// appropriate.
limit.saturating_mul(2)
}
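The saturating multiply matters at the extremes; a quick sketch of the behavior (the 2x factor itself is, as the comment above notes, inherited from Synapse rather than measured):

```rust
// Double the caller's limit without risking an overflow panic when a
// client sends an adversarially large limit value.
fn load_limit(limit: usize) -> usize {
    limit.saturating_mul(2)
}

fn main() {
    assert_eq!(load_limit(10), 20);
    // saturating_mul clamps instead of panicking in debug builds
    assert_eq!(load_limit(usize::MAX), usize::MAX);
}
```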
/// Structure for testing against an allowlist and a denylist with a single
/// `HashSet` lookup.
///
/// The denylist takes precedence (an item included in both the allowlist and
/// the denylist is denied).
pub(crate) enum AllowDenyList<'a, T: ?Sized> {
/// TODO: fast-paths for allow-all and deny-all?
Allow(HashSet<&'a T>),
Deny(HashSet<&'a T>),
}
impl<'a, T: ?Sized + Hash + PartialEq + Eq> AllowDenyList<'a, T> {
fn new<A, D>(allow: Option<A>, deny: D) -> AllowDenyList<'a, T>
where
A: Iterator<Item = &'a T>,
D: Iterator<Item = &'a T>,
{
let deny_set = deny.collect::<HashSet<_>>();
if let Some(allow) = allow {
AllowDenyList::Allow(
allow.filter(|x| !deny_set.contains(x)).collect(),
)
} else {
AllowDenyList::Deny(deny_set)
}
}
fn from_slices<O: AsRef<T>>(
allow: Option<&'a [O]>,
deny: &'a [O],
) -> AllowDenyList<'a, T> {
AllowDenyList::new(
allow.map(|allow| allow.iter().map(AsRef::as_ref)),
deny.iter().map(AsRef::as_ref),
)
}
pub(crate) fn allowed(&self, value: &T) -> bool {
match self {
AllowDenyList::Allow(allow) => allow.contains(value),
AllowDenyList::Deny(deny) => !deny.contains(value),
}
}
}
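The deny-takes-precedence semantics above can be exercised with a simplified, self-contained version of the type (plain `&str` instead of `RoomId`/`UserId`, and only the parts of the API shown above):

```rust
use std::collections::HashSet;

// Simplified sketch of AllowDenyList: deny entries are subtracted from the
// allowlist up front, and an absent allowlist means "deny-list only".
enum AllowDenyList<'a> {
    Allow(HashSet<&'a str>),
    Deny(HashSet<&'a str>),
}

impl<'a> AllowDenyList<'a> {
    fn new(allow: Option<&[&'a str]>, deny: &[&'a str]) -> Self {
        let deny_set: HashSet<&str> = deny.iter().copied().collect();
        match allow {
            Some(allow) => AllowDenyList::Allow(
                allow.iter().copied().filter(|x| !deny_set.contains(x)).collect(),
            ),
            None => AllowDenyList::Deny(deny_set),
        }
    }

    fn allowed(&self, value: &str) -> bool {
        match self {
            AllowDenyList::Allow(allow) => allow.contains(value),
            AllowDenyList::Deny(deny) => !deny.contains(value),
        }
    }
}

fn main() {
    // "!a" appears in both lists: the denylist takes precedence.
    let adl = AllowDenyList::new(Some(&["!a", "!b"][..]), &["!a"]);
    assert!(!adl.allowed("!a"));
    assert!(adl.allowed("!b"));
    assert!(!adl.allowed("!c")); // not on the allowlist

    // No allowlist: everything not denied passes.
    let adl = AllowDenyList::new(None, &["!a"]);
    assert!(!adl.allowed("!a"));
    assert!(adl.allowed("!c"));
}
```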
pub(crate) struct WildcardAllowDenyList {
allow: Option<RegexSet>,
deny: Option<RegexSet>,
}
/// Converts a wildcard pattern (like in filter.room.timeline.types) to a regex.
///
/// Wildcard patterns are all literal strings except for the `'*'` character,
/// which matches any sequence of characters.
fn wildcard_to_regex(pattern: &str) -> String {
let mut regex_pattern = String::new();
regex_pattern.push('^');
let mut parts = pattern.split('*').peekable();
while let Some(part) = parts.next() {
regex_pattern.push_str(&regex::escape(part));
if parts.peek().is_some() {
regex_pattern.push_str(".*");
}
}
regex_pattern.push('$');
regex_pattern
}
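The wildcard grammar above is small enough to check by hand; this sketch reproduces the split-and-join logic, with a minimal `escape` that only handles `'.'` standing in for `regex::escape` (sufficient for Matrix event types, and an assumption of this example):

```rust
// Simplified stand-in for regex::escape: Matrix event types only need '.'
// escaped for this demonstration.
fn escape(part: &str) -> String {
    part.replace('.', r"\.")
}

// '*' maps to ".*", everything else is matched literally, and the whole
// pattern is anchored at both ends, as in wildcard_to_regex above.
fn wildcard_to_regex(pattern: &str) -> String {
    let mut out = String::from("^");
    let mut parts = pattern.split('*').peekable();
    while let Some(part) = parts.next() {
        out.push_str(&escape(part));
        if parts.peek().is_some() {
            out.push_str(".*");
        }
    }
    out.push('$');
    out
}

fn main() {
    assert_eq!(wildcard_to_regex("m.room.*"), r"^m\.room\..*$");
    assert_eq!(wildcard_to_regex("*.message"), r"^.*\.message$");
    assert_eq!(wildcard_to_regex("m.room.message"), r"^m\.room\.message$");
}
```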
impl WildcardAllowDenyList {
fn new<S: AsRef<str>>(
allow: Option<&[S]>,
deny: &[S],
) -> Result<WildcardAllowDenyList, regex::Error> {
Ok(WildcardAllowDenyList {
allow: allow
.map(|allow| {
RegexSet::new(
allow
.iter()
.map(|pattern| wildcard_to_regex(pattern.as_ref())),
)
})
.transpose()?,
deny: if deny.is_empty() {
None
} else {
Some(RegexSet::new(
deny.iter()
.map(|pattern| wildcard_to_regex(pattern.as_ref())),
)?)
},
})
}
pub(crate) fn allowed(&self, value: &str) -> bool {
self.allow.as_ref().map_or(true, |allow| allow.is_match(value))
&& self.deny.as_ref().map_or(true, |deny| !deny.is_match(value))
}
}
/// Wrapper for a [`ruma::api::client::filter::FilterDefinition`], preprocessed
/// to allow checking against the filter efficiently.
///
/// The preprocessing consists of merging the `X` and `not_X` pairs into
/// combined structures. For most fields, this is a [`AllowDenyList`]. For
/// `types`/`not_types`, this is a [`WildcardAllowDenyList`], because the type
/// filter fields support `'*'` wildcards.
pub(crate) struct CompiledFilterDefinition<'a> {
pub(crate) account_data: CompiledFilter<'a>,
pub(crate) room: CompiledRoomFilter<'a>,
}
pub(crate) struct CompiledFilter<'a> {
pub(crate) types: WildcardAllowDenyList,
pub(crate) senders: AllowDenyList<'a, UserId>,
}
pub(crate) struct CompiledRoomFilter<'a> {
rooms: AllowDenyList<'a, RoomId>,
pub(crate) account_data: CompiledRoomEventFilter<'a>,
pub(crate) timeline: CompiledRoomEventFilter<'a>,
pub(crate) ephemeral: CompiledRoomEventFilter<'a>,
pub(crate) state: CompiledRoomEventFilter<'a>,
}
pub(crate) struct CompiledRoomEventFilter<'a> {
// TODO: consider falling back a more-efficient
// AllowDenyList<TimelineEventType> when none of the type patterns
// include a wildcard.
types: WildcardAllowDenyList,
pub(crate) rooms: AllowDenyList<'a, RoomId>,
senders: AllowDenyList<'a, UserId>,
url_filter: Option<UrlFilter>,
}
impl<'a> TryFrom<&'a FilterDefinition> for CompiledFilterDefinition<'a> {
type Error = Error;
fn try_from(
source: &'a FilterDefinition,
) -> Result<CompiledFilterDefinition<'a>, Error> {
Ok(CompiledFilterDefinition {
account_data: (&source.account_data).try_into()?,
room: (&source.room).try_into()?,
})
}
}
impl<'a> TryFrom<&'a Filter> for CompiledFilter<'a> {
type Error = Error;
fn try_from(source: &'a Filter) -> Result<CompiledFilter<'a>, Error> {
Ok(CompiledFilter {
types: WildcardAllowDenyList::new(
source.types.as_deref(),
&source.not_types,
)?,
senders: AllowDenyList::from_slices(
source.senders.as_deref(),
&source.not_senders,
),
})
}
}
impl<'a> TryFrom<&'a RoomFilter> for CompiledRoomFilter<'a> {
type Error = Error;
fn try_from(
source: &'a RoomFilter,
) -> Result<CompiledRoomFilter<'a>, Error> {
Ok(CompiledRoomFilter {
// TODO: consider calculating the intersection of room filters in
// all of the sub-filters
rooms: AllowDenyList::from_slices(
source.rooms.as_deref(),
&source.not_rooms,
),
account_data: (&source.account_data).try_into()?,
timeline: (&source.timeline).try_into()?,
ephemeral: (&source.ephemeral).try_into()?,
state: (&source.state).try_into()?,
})
}
}
impl<'a> TryFrom<&'a RoomEventFilter> for CompiledRoomEventFilter<'a> {
type Error = Error;
fn try_from(
source: &'a RoomEventFilter,
) -> Result<CompiledRoomEventFilter<'a>, Error> {
Ok(CompiledRoomEventFilter {
types: WildcardAllowDenyList::new(
source.types.as_deref(),
&source.not_types,
)?,
rooms: AllowDenyList::from_slices(
source.rooms.as_deref(),
&source.not_rooms,
),
senders: AllowDenyList::from_slices(
source.senders.as_deref(),
&source.not_senders,
),
url_filter: source.url_filter,
})
}
}
impl CompiledFilter<'_> {
// TODO: docs
pub(crate) fn raw_event_allowed<Ev>(&self, event: &Raw<Ev>) -> bool {
// We need to deserialize some of the fields from the raw json, but
// don't need all of them. Fully deserializing to a ruma event type
// would involve a lot of extra copying and validation.
#[derive(Deserialize)]
struct LimitedEvent<'a> {
sender: Option<OwnedUserId>,
#[serde(rename = "type")]
kind: Cow<'a, str>,
}
let event = match event.deserialize_as::<LimitedEvent<'_>>() {
Ok(event) => event,
Err(e) => {
// TODO: maybe rephrase this error, or propagate it to the
// caller
error!("invalid event in database: {e}");
return false;
}
};
let sender_allowed = match &event.sender {
Some(sender) => self.senders.allowed(sender),
// sender allowlist means we reject events without a sender
None => matches!(self.senders, AllowDenyList::Deny(_)),
};
sender_allowed && self.types.allowed(&event.kind)
}
}
impl CompiledRoomFilter<'_> {
/// Returns the top-level [`AllowDenyList`] for rooms (`rooms`/`not_rooms`
/// in `filter.room`).
///
/// This is useful because, with an allowlist, iterating over allowed rooms
/// and checking whether they are visible to a user can be faster than
/// iterating over visible rooms and checking whether they are allowed.
pub(crate) fn rooms(&self) -> &AllowDenyList<'_, RoomId> {
&self.rooms
}
}
impl CompiledRoomEventFilter<'_> {
/// Returns `true` if a room is allowed by the `rooms` and `not_rooms`
/// fields.
///
/// This does *not* test the room against the top-level `rooms` filter.
/// It is expected that callers have already filtered rooms that are
/// rejected by the top-level filter using [`CompiledRoomFilter::rooms`], if
/// applicable.
pub(crate) fn room_allowed(&self, room_id: &RoomId) -> bool {
self.rooms.allowed(room_id)
}
/// Returns `true` if an event type is allowed by the `types` and
/// `not_types` fields.
///
/// This is mainly useful to skip work generating events for a particular
/// type, if that event type is always rejected by the filter.
pub(crate) fn type_allowed(&self, kind: &str) -> bool {
self.types.allowed(kind)
}
/// Returns `true` if a PDU event is allowed by the filter.
///
/// This tests against the `senders`, `not_senders`, `types`, `not_types`,
/// and `url_filter` fields.
///
/// This does *not* check whether the event's room is allowed. It is
/// expected that callers have already filtered out rejected rooms using
/// [`CompiledRoomEventFilter::room_allowed`] and
/// [`CompiledRoomFilter::rooms`].
pub(crate) fn pdu_event_allowed(&self, pdu: &PduEvent) -> bool {
self.senders.allowed(&pdu.sender)
&& self.type_allowed(&pdu.kind.to_string())
&& self.allowed_by_url_filter(pdu)
}
/// Similar to [`CompiledRoomEventFilter::pdu_event_allowed`] but takes raw
/// JSON.
pub(crate) fn raw_event_allowed<Ev>(&self, event: &Raw<Ev>) -> bool {
// We need to deserialize some of the fields from the raw json, but
// don't need all of them. Fully deserializing to a ruma event type
// would involve a lot of extra copying and validation.
#[derive(Deserialize)]
struct LimitedEvent<'a> {
sender: OwnedUserId,
#[serde(rename = "type")]
kind: Cow<'a, str>,
url: Option<Cow<'a, str>>,
}
let event = match event.deserialize_as::<LimitedEvent<'_>>() {
Ok(event) => event,
Err(e) => {
// TODO: maybe rephrase this error, or propagate it to the
// caller
error!("invalid event in database: {e}");
return false;
}
};
let allowed_by_url_filter = match self.url_filter {
None => true,
Some(UrlFilter::EventsWithoutUrl) => event.url.is_none(),
Some(UrlFilter::EventsWithUrl) => event.url.is_some(),
};
allowed_by_url_filter
&& self.senders.allowed(&event.sender)
&& self.type_allowed(&event.kind)
}
// TODO: refactor this as well?
fn allowed_by_url_filter(&self, pdu: &PduEvent) -> bool {
let Some(filter) = self.url_filter else {
return true;
};
// TODO: is this unwrap okay?
let content: serde_json::Value =
serde_json::from_str(pdu.content.get()).unwrap();
match filter {
UrlFilter::EventsWithoutUrl => !content["url"].is_string(),
UrlFilter::EventsWithUrl => content["url"].is_string(),
}
}
}