Search With Permissions
Once you implement fine-grained authorization to protect your resources, search becomes a more complex problem, because the user's access to each resource now has to be validated before the resource can be shown.
The search problem can then be summarized as:
"Given a particular search filter and a sort order, what objects can the user access?"
The OpenFGA service does not store object metadata (names of files, creation dates, time of last update, etc.), so any search request that filters and sorts by such criteria requires data from your database.
The services responsible for performing these actions are:
- Filter: Your database
- Sort: Your database
- Authorize: OpenFGA
To return the set of results that match the user's search query, you will need to get the intersection of the results from the services above.
Possible Options
There are three possible ways to do this:
Option 1: Search, Then Check
Pre-filter, then call the OpenFGA Check endpoint.
- Filter and sort on your database.
- Call /check in parallel on each object returned from your database.
- Filter out objects the user does not have access to.
- Return the filtered result to the user.
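The steps above can be sketched as follows. This is a minimal illustration, not an official client: the `check` callable is a hypothetical wrapper that you would implement around the OpenFGA /check endpoint, and `candidates` stands for the object IDs your database has already filtered and sorted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# Hypothetical signature: check(user, relation, object_id) -> bool.
# In a real service this would wrap a call to the OpenFGA /check endpoint.
CheckFn = Callable[[str, str, str], bool]


def search_then_check(
    candidates: Iterable[str],
    user: str,
    relation: str,
    check: CheckFn,
    max_workers: int = 8,
) -> List[str]:
    """Keep only the candidates the user is authorized for, preserving order."""
    ids = list(candidates)  # already filtered and sorted by the database
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the checks concurrently and yields results in input order,
        # so the database sort order survives the authorization step.
        allowed = list(pool.map(lambda obj: check(user, relation, obj), ids))
    return [obj for obj, ok in zip(ids, allowed) if ok]
```

Because one check is issued per candidate, the cost grows with the number of rows the database returns, which is why this option suits queries that return a small result set.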
Option 2: Build A Local Index From Changes Endpoint, Search, Then Check
Consume the GET /changes endpoint to create a local index you can use to do an intersection on the two sets of results.
- Call the OpenFGA changes API.
- For the particular authorization model version(s) you are using in production, flatten/expand the changes. For example, the tuple (user:anne, writer, doc:planning) becomes two tuples: (user:anne, writer, doc:planning) and (user:anne, reader, doc:planning).
- Build the intersection between the objects in your database and the flattened/expanded state you created.
- You can then call /check on each resource in the resulting set before returning the response, to filter out any resource whose permissions have been revoked but whose revocation has not made it into your index yet.
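A minimal sketch of this flow is shown below, under several assumptions. The `Change` record is a simplified stand-in for an entry from the changes API, the `IMPLIED` map hard-codes the "writer implies reader" rule from the example above, and `check` is a hypothetical wrapper around /check. A real model with usersets or parent relations would need a fuller expansion step than this.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class Change:
    """Simplified change record (assumed shape, not the wire format)."""
    operation: str  # "write" or "delete"
    user: str
    relation: str
    object: str


# Hypothetical expansion rules for one model version: writer implies reader.
IMPLIED: Dict[str, Set[str]] = {"writer": {"writer", "reader"}, "reader": {"reader"}}


class LocalIndex:
    def __init__(self, implied: Dict[str, Set[str]]) -> None:
        self._implied = implied
        # Reference counts, so deleting one tuple does not remove a relation
        # that is still granted by another tuple.
        self._counts = Counter()

    def apply(self, changes: Iterable[Change]) -> None:
        for ch in changes:
            delta = 1 if ch.operation == "write" else -1
            for rel in self._implied.get(ch.relation, {ch.relation}):
                self._counts[(ch.user, rel, ch.object)] += delta

    def allows(self, user: str, relation: str, obj: str) -> bool:
        return self._counts[(user, relation, obj)] > 0


def search_with_index(
    candidates: Iterable[str],
    user: str,
    relation: str,
    index: LocalIndex,
    check: Callable[[str, str, str], bool],
) -> List[str]:
    # Intersect database results with the local index, then confirm each hit
    # with a live check in case the index is stale.
    likely = [obj for obj in candidates if index.allows(user, relation, obj)]
    return [obj for obj in likely if check(user, relation, obj)]
```

The final live check only runs on objects that passed the index, so far fewer checks are issued than in Option 1 when most candidates are inaccessible.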
Option 3: Build A List Of IDs, Then Search
Call the /list-objects API to get a list of object IDs the user has access to, then run the filter restricting by the object IDs returned.
- Call the OpenFGA List Objects API to get the list of all resources a user can access.
- Pass in the set of object IDs to the database query to limit the search.
- Return the filtered result to the user.
Be aware that the performance characteristics of the ListObjects endpoint vary drastically depending on the model complexity, number of tuples, and the relations it needs to evaluate. Relations defined with "and" or "but not" are more expensive to evaluate than relations defined with "or".
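As an illustration of Option 3, the sketch below passes the IDs returned by a list call into a parameterized database query. Everything named here is an assumption for the example: `list_objects` is a hypothetical wrapper you would write around the List Objects API, and the `documents` table with `id`, `name`, and `updated_at` columns is an invented schema, queried through SQLite only to keep the example self-contained.

```python
import sqlite3
from typing import Callable, List

# Hypothetical signature: list_objects(user, relation, object_type) -> object IDs.
ListObjectsFn = Callable[[str, str, str], List[str]]


def list_then_search(
    conn: sqlite3.Connection,
    user: str,
    name_like: str,
    list_objects: ListObjectsFn,
    relation: str = "reader",
    object_type: str = "doc",
) -> List[str]:
    """Restrict the database search to the IDs the user is allowed to access."""
    ids = list_objects(user, relation, object_type)
    if not ids:
        return []  # nothing accessible, so skip the database query
    # One "?" placeholder per ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in ids)
    sql = (
        f"SELECT id FROM documents WHERE id IN ({placeholders}) "
        "AND name LIKE ? ORDER BY updated_at DESC"
    )
    rows = conn.execute(sql, [*ids, name_like]).fetchall()
    return [row[0] for row in rows]
```

Databases limit how many bound parameters a single statement may carry, so this approach only stays practical while the list of accessible IDs is small.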
Choosing The Best Option
Which option to choose among the three listed above depends on the following criteria:
- Number of objects that your database can return from a search query
- Number of objects of a certain type the user could have access to
- Percentage of objects in a type the user could have access to
Consider the following scenarios:
A. The number of objects a search query could return from the database is low.
Search then Check is the recommended solution.
Use-case: Situations where the search query can be optimized to return a small number of results.
B. The number of objects of a certain type the user could have access to is low, and the percentage of objects of that type the user could have access to is high.
Search then Check is recommended to get the final list of results.
Note that in this use case the user has access to a low number of objects that are still a high percentage of the total objects in the system, which means the total number of objects in the system is low.
C. The number of objects of a certain type the user could have access to is low (~ 1000), and the percentage of the total objects that the user can have access to is also low.
In this case, using the /list-objects API would make sense. You can query this API to get a list of object IDs and then pass these IDs to your filter function to limit the search to them.
As this number increases, this solution becomes impractical, because you would need to paginate over multiple pages to get the entire list before being able to search and sort. A partial list from the API is not enough, because you won't be able to sort using it.
So while List of IDs then Search would be useful for this in some situations, we would recommend Local Index from Changes Endpoint, Search then Check for the cases when the number of objects is high enough.
D. The number of objects of a certain type the user could have access to is high, and the percentage of the total objects that the user can have access to is low.
The recommended option for this case is to use Local Index from Changes Endpoint, Search then Check.
- List of IDs then Search would not work because you would have to get and paginate across thousands or tens of thousands (or in some cases more) of results from OpenFGA. Only after you have retrieved the entire set can you start searching within your database for matching results, so your user could be waiting a long time before seeing any results.
- Search then Check would also not be ideal, as you would be retrieving and checking a lot of items and discarding most of them.
Use case: Searching in Google Drive, where the list of documents and folders that a user has access to can be very high, but it still is a small percentage of the entire set of documents in Google Drive.
E. The number of objects of a certain type the user could have access to is high, and the percentage of the total objects that the user can have access to is also high.
In this case a Local Index from Changes Endpoint, Search then Check would be useful. If you do not want to maintain a local index, and if the user can access a high percentage of the total, meaning that the user is more likely than not to have access to the results returned by the search query, then Search then Check would work just as well.
Use-case: Searching on Twitter. Most Twitter users have their profiles set to public, so the user is more likely than not to have access to the tweets a search returns. Searching first and then running checks against the returned results is therefore appropriate.
Summary
Scenario | Use Case | # of objects returned from database query | # of objects user can access in a type | % of objects user can access in a type | Preferred Option |
---|---|---|---|---|---|
A | Search criteria enough to narrow down results | Low | - | - | 1 |
B | Few objects the user has access to, but still a high % of total objects | Low | Low | High | 1 |
C | Cannot narrow down search results, very high probability search returns objects user cannot access, total number of objects user can access is low enough to fit in a response | High | Low | Low | 3 or 2 |
D | Google Drive: User has access to a lot of documents, but low percentage from total | High | High | Low | 2 |
E | Twitter Search: Most profiles are public, and the user can access them | High | High | High | 1 or 2 |
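
The table above can also be read as a small lookup. The sketch below encodes it as a function for illustration only: the three boolean inputs are assumptions standing in for the Low/High columns, and deciding what counts as "high" for your system is left to you. The one combination that has no row in the table (many database results, few accessible objects, high percentage) is treated like scenario B, following the note that such a system must contain few objects overall.

```python
from typing import List


def preferred_options(
    db_results_high: bool,
    accessible_count_high: bool,
    accessible_pct_high: bool,
) -> List[int]:
    """Return the preferred option numbers (1, 2, 3) in order of preference."""
    if not db_results_high:
        return [1]  # scenarios A and B: Search, Then Check
    if accessible_count_high:
        # scenario E (high percentage) or scenario D (low percentage)
        return [1, 2] if accessible_pct_high else [2]
    if accessible_pct_high:
        # Not a row in the table; assumed to behave like scenario B.
        return [1]
    return [3, 2]  # scenario C: List of IDs, falling back to a local index
```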