Search With Permissions

Once you implement fine-grained authorization to protect your resources, search becomes a more complex problem, because the user's access to each resource now has to be validated before the resource can be shown.

The search problem can then be summarized as:

"Given a particular search filter and a sort order, what objects can the user access?"

The OpenFGA service does not store object metadata (file names, creation dates, time of last update, etc.), so any search request that filters and sorts by such criteria requires data from your database.

The services responsible for performing these actions are:

Filter: Your database
Sort: Your database
Authorize: OpenFGA

To return the set of results that match the user's search query, you will need to get the intersection of the results from the services above.

Possible options

There are three possible ways to do this:

Option 1: Search, then check

Pre-filter in your database, then call the OpenFGA Batch Check endpoint.

Filter and sort on your database.
Call /batch-check to check access for multiple objects in a single request.
Filter out objects the user does not have access to.
Return the filtered result to the user.
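
The steps above can be sketched as follows. This is a minimal illustration, not the OpenFGA SDK: DOCUMENTS, ALLOWED, search_database and batch_check are hypothetical in-memory stand-ins for your database and for a Batch Check request, and in a real service batch_check would be replaced by a call to the endpoint.

```python
# Hypothetical in-memory stand-in for your database.
DOCUMENTS = [
    {"id": "doc:planning", "name": "Planning", "updated": 3},
    {"id": "doc:roadmap", "name": "Roadmap planning", "updated": 5},
    {"id": "doc:budget", "name": "Budget", "updated": 4},
    {"id": "doc:plan-b", "name": "Plan B", "updated": 1},
]

# Hypothetical stand-in for the authorization data held by OpenFGA.
ALLOWED = {
    ("user:anne", "reader", "doc:planning"),
    ("user:anne", "reader", "doc:plan-b"),
}


def search_database(term: str) -> list[dict]:
    """Step 1: filter and sort in the database (most recently updated first)."""
    hits = [d for d in DOCUMENTS if term.lower() in d["name"].lower()]
    return sorted(hits, key=lambda d: d["updated"], reverse=True)


def batch_check(user: str, relation: str, object_ids: list[str]) -> dict[str, bool]:
    """Step 2: placeholder for a single Batch Check request covering many objects."""
    return {oid: (user, relation, oid) in ALLOWED for oid in object_ids}


def search_then_check(user: str, relation: str, term: str) -> list[dict]:
    candidates = search_database(term)
    allowed = batch_check(user, relation, [d["id"] for d in candidates])
    # Step 3: drop objects the user cannot access, keeping the database sort order.
    return [d for d in candidates if allowed[d["id"]]]


results = search_then_check("user:anne", "reader", "plan")
```

Because the authorization filter runs last, a page of results can end up shorter than the page size requested from the database, which is why this option fits best when most database hits are likely to be accessible.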

Option 2: Build a local index from changes endpoint, search, then check

Consume the GET /changes endpoint to create a local index you can use to do an intersection on the two sets of results.

Call the OpenFGA changes API.
For the particular authorization model version(s) you are using in production, flatten/expand the changes. For example, if your model grants reader to every writer, user:anne, writer, doc:planning becomes two tuples: user:anne, writer, doc:planning and user:anne, reader, doc:planning.
Build the intersection between the objects in your database and the flattened/expanded state you created.
Call /check on each resource in the resulting set before returning the response, to filter out any resource whose permissions have been revoked but whose change has not yet reached your index.
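
A minimal sketch of these steps, under several stated assumptions: the change records use a simplified, hypothetical shape rather than the exact response of the changes endpoint, the IMPLIED table hard-codes a model in which every writer is also a reader, and check is an in-memory placeholder for a live Check call. The index counts how many stored tuples justify each flattened tuple, so deleting one tuple does not remove access that another tuple still grants.

```python
from collections import Counter

# Assumption: in this hypothetical model, every writer is also a reader.
IMPLIED = {"writer": ("writer", "reader"), "reader": ("reader",)}

# Hypothetical, simplified change records consumed from the changes feed.
CHANGES = [
    {"op": "write", "user": "user:anne", "relation": "writer", "object": "doc:planning"},
    {"op": "write", "user": "user:anne", "relation": "reader", "object": "doc:budget"},
    {"op": "delete", "user": "user:anne", "relation": "reader", "object": "doc:budget"},
    {"op": "write", "user": "user:anne", "relation": "reader", "object": "doc:notes"},
]

# Hypothetical live authorization state: access to doc:notes was just revoked,
# but that change has not reached the local index yet.
LIVE = {
    ("user:anne", "writer", "doc:planning"),
    ("user:anne", "reader", "doc:planning"),
}


def apply_changes(index: Counter, changes: list[dict]) -> None:
    """Flatten each change into every relation it implies and update the counts."""
    for change in changes:
        delta = 1 if change["op"] == "write" else -1
        for relation in IMPLIED[change["relation"]]:
            index[(change["user"], relation, change["object"])] += delta


def indexed_objects(index: Counter, user: str, relation: str) -> set[str]:
    """Objects the local index currently believes the user can access."""
    return {
        obj
        for (u, r, obj), count in index.items()
        if u == user and r == relation and count > 0
    }


def check(user: str, relation: str, object_id: str) -> bool:
    """Placeholder for a live Check call against OpenFGA."""
    return (user, relation, object_id) in LIVE


def search(index: Counter, user: str, relation: str, db_results: list[str]) -> list[str]:
    visible = indexed_objects(index, user, relation)
    # Intersect the sorted database results with the index, then re-check each hit.
    candidates = [oid for oid in db_results if oid in visible]
    return [oid for oid in candidates if check(user, relation, oid)]


index = Counter()
apply_changes(index, CHANGES)
found = search(index, "user:anne", "reader", ["doc:notes", "doc:planning", "doc:budget"])
```

In this run the index still lists doc:notes, and the final check is what removes it from the response.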

Option 3: Build a list of IDs, then search

Call the POST /list-objects API to get a list of object IDs the user has access to, then run the filter restricted to the object IDs returned.

Call the OpenFGA List Objects API to get the list of all resources a user can access.
Pass in the set of object IDs to the database query to limit the search.
Return the filtered result to the user.
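
The steps above can be sketched with an in-memory SQLite database. Everything here is illustrative: list_objects is a hypothetical placeholder for the List Objects call, and the documents table, its columns and the sample rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for the authorization data held by OpenFGA.
ALLOWED = {
    ("user:anne", "reader", "doc:planning"),
    ("user:anne", "reader", "doc:plan-b"),
    ("user:anne", "reader", "folder:root"),
}


def list_objects(user: str, relation: str, object_type: str) -> list[str]:
    """Step 1: placeholder for a List Objects call for one object type."""
    prefix = f"{object_type}:"
    return sorted(
        obj for (u, r, obj) in ALLOWED
        if u == user and r == relation and obj.startswith(prefix)
    )


def make_database() -> sqlite3.Connection:
    """Create a small sample table standing in for your own database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id TEXT PRIMARY KEY, name TEXT, updated INTEGER)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?)",
        [
            ("doc:planning", "Planning", 3),
            ("doc:roadmap", "Roadmap planning", 5),
            ("doc:budget", "Budget", 4),
            ("doc:plan-b", "Plan B", 1),
        ],
    )
    return conn


def search_accessible(conn: sqlite3.Connection, user: str, term: str) -> list[str]:
    ids = list_objects(user, "reader", "doc")
    if not ids:
        return []
    # Step 2: restrict the filter and sort to the IDs the user can access.
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id FROM documents WHERE id IN ({placeholders}) "
        "AND name LIKE ? ORDER BY updated DESC",
        [*ids, f"%{term}%"],
    )
    return [row[0] for row in rows]


conn = make_database()
matches = search_accessible(conn, "user:anne", "plan")
```

Every accessible ID becomes a bound parameter in the query, which is one practical reason this option suits users with a small number of accessible objects.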

Be aware that the performance characteristics of the ListObjects endpoint vary drastically depending on the model complexity, the number of tuples, and the relations it needs to evaluate. Relations that use "and" or "but not" are more expensive to evaluate than relations that use only "or".

Choosing the best option

Which option to choose among the three listed above depends on the following criteria:

Number of objects that your database can return from a search query
Number of objects of a certain type the user could have access to
Percentage of objects in a type the user could have access to

Consider the following scenarios:

A. The number of objects a search query could return from the database is low.

Search then Check is the recommended solution.

Use case: Situations where the search query can be optimized to return a small number of results.

B. The number of objects of a certain type the user could have access to is low, and the percentage of objects in a type the user could have access to is high.

Search then Check is recommended to get the final list of results.

Note that in this scenario the user has access to a low number of objects that nevertheless make up a high percentage of all objects in the system, which means the total number of objects in the system is low.

C. The number of objects of a certain type the user could have access to is low (~ 1000), and the percentage of the total objects that the user can have access to is also low.

In this case, using the POST /list-objects API would make sense. You can query this API to get a list of object IDs and then pass these IDs to your filter function to limit the search to them.

As this number increases, this solution becomes impractical, because you would need to paginate over multiple pages to get the entire list before being able to search and sort. A partial list from the API is not enough, because you won't be able to sort using it.

So while List of IDs then Search would be useful for this in some situations, we would recommend Local Index from Changes Endpoint, Search then Check for the cases when the number of objects is high enough.

D. The number of objects of a certain type the user could have access to is high, and the percentage of the total objects that the user can have access to is low.

The recommended option for this case is to use Local Index from Changes Endpoint, Search then Check.

List of IDs then Search would not work because you would have to retrieve and paginate across thousands or tens of thousands (or in some cases more) of results from OpenFGA. Only after retrieving the entire set could you start searching your database for matching results, so your user could wait a long time before seeing any results.
Search then Check would also not be ideal, as you will be retrieving and checking against a lot of items and discarding most of them.

Use case: Searching in Google Drive, where the list of documents and folders that a user has access to can be very high, but it still is a small percentage of the entire set of documents in Google Drive.

You can consider the following strategies to transform this scenario into scenario B:

Duplicate logic from the authorization model when querying your database. For example, in a multi-tenant scenario, you can filter all resources based on the tenant the user is logged-in to. Duplicating logic from the authorization model is not ideal, but it can be a reasonable trade-off.
Retrieve a higher-level resource ID list with lower cardinality for efficient filtering. For example, in a document management application, first obtain the list of accessible folders for the user. You can then filter documents by these folders in your database query. This approach increases the likelihood that the user can access the documents in those folders, optimizing the query’s effectiveness.
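
The second strategy can be sketched as follows, assuming a hypothetical document store where each document records its folder. DOCUMENTS, ALLOWED, list_objects and check are in-memory stand-ins, not OpenFGA SDK calls, and the viewer relation is an assumption. The example also assumes doc:2 has been restricted even though its folder is accessible, which is why a final check still runs.

```python
# Hypothetical in-memory stand-in for your database; each document records its folder.
DOCUMENTS = [
    {"id": "doc:1", "folder": "folder:eng", "name": "Design notes"},
    {"id": "doc:2", "folder": "folder:eng", "name": "Design review"},
    {"id": "doc:3", "folder": "folder:hr", "name": "Design of pay bands"},
]

# Hypothetical authorization state: anne can view the eng folder and doc:1,
# while doc:2 is assumed to be restricted despite living in that folder.
ALLOWED = {
    ("user:anne", "viewer", "folder:eng"),
    ("user:anne", "viewer", "doc:1"),
}


def list_objects(user: str, relation: str, object_type: str) -> list[str]:
    """Placeholder for a List Objects call on the low-cardinality folder type."""
    prefix = f"{object_type}:"
    return sorted(
        obj for (u, r, obj) in ALLOWED
        if u == user and r == relation and obj.startswith(prefix)
    )


def check(user: str, relation: str, object_id: str) -> bool:
    """Placeholder for a Check (or Batch Check) call on individual documents."""
    return (user, relation, object_id) in ALLOWED


def search_via_folders(user: str, term: str) -> list[str]:
    folders = set(list_objects(user, "viewer", "folder"))
    # Database filter narrowed to the accessible folders.
    candidates = [
        d for d in DOCUMENTS
        if d["folder"] in folders and term.lower() in d["name"].lower()
    ]
    # Folder access makes document access likely, not certain, so still check.
    return [d["id"] for d in candidates if check(user, "viewer", d["id"])]


visible_docs = search_via_folders("user:anne", "design")
```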

E. The number of objects of a certain type the user could have access to is high, and the percentage of the total objects that the user can have access to is also high.

In this case a Local Index from Changes Endpoint, Search then Check would be useful. If you do not want to maintain a local index, and if the user can access a high percentage of the total, meaning that the user is more likely than not to have access to the results returned by the search query, then Search then Check would work just as well.

Use case: Searching on Twitter. Most Twitter users have their profiles set to public, so the user is more likely than not to have access to the tweets returned by a search. Searching first and then running checks against the returned results would therefore be appropriate.

Summary

Scenario	Use Case	# of objects returned from database query	# of objects user can access in a type	% of objects user can access in a type	Preferred Option
A	Search criteria enough to narrow down results	Low	-	-	1
B	Few objects the user has access to, but still a high % of total objects	Low	Low	High	1
C	Cannot narrow down search results, very high probability search returns objects user cannot access, total number of objects user can access is low enough to fit in a response	High	Low	Low	3 or 2
D	Google Drive: User has access to a lot of documents, but low percentage from total	High	High	Low	2
E	Twitter Search: Most profiles are public, and the user can access them	High	High	High	1 or 2
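
The table can also be read as a small decision function. The sketch below is only an encoding of the rows above: the Level enum and the preferred_options name are hypothetical, and what counts as low or high is left to you, since the right thresholds depend on your data.

```python
from enum import Enum
from typing import Optional


class Level(Enum):
    LOW = "low"
    HIGH = "high"


def preferred_options(
    db_results: Level,
    accessible_count: Optional[Level] = None,
    accessible_pct: Optional[Level] = None,
) -> list[int]:
    """Return the preferred option numbers, in the order the table lists them."""
    if db_results is Level.LOW:
        return [1]  # Scenarios A and B: few database hits, so check them directly.
    if accessible_count is Level.LOW and accessible_pct is Level.LOW:
        return [3, 2]  # Scenario C
    if accessible_count is Level.HIGH and accessible_pct is Level.LOW:
        return [2]  # Scenario D
    if accessible_count is Level.HIGH and accessible_pct is Level.HIGH:
        return [1, 2]  # Scenario E
    return []  # Combination not covered by the table.
```

For example, preferred_options(Level.HIGH, Level.HIGH, Level.LOW) corresponds to the Google Drive row and returns [2].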
