Running OpenFGA in Production

The following list outlines best practices for running OpenFGA in a production environment:

Configure Authentication
Enable HTTP TLS or gRPC TLS or both
Set the log format to "json" and log level to "info"
Disable the Playground
Apply the Cluster recommendations
Set Database Options
Set Maximum Results
Set Concurrency Limits

Cluster recommendations

We recommend the following:

Turn on in-memory caching for the Check API via flags. This reduces request latency but increases the staleness of OpenFGA's responses. See Cache Expiration for details on the flags.
Prefer a small pool of high-capacity servers (memory and CPU cores) over a large pool of smaller ones. This increases cache hit ratios and simplifies pool management.
Turn on metrics collection via the flags --metrics-enabled and --datastore-metrics-enabled. This will allow you to debug issues.
Turn on tracing via the flag --trace-enabled, but set sampling ratio to a low value, for example --trace-sample-ratio=0.3. This will allow you to debug issues without overwhelming the tracing server. However, keep in mind that enabling tracing comes with a slight performance cost.
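
As a rough illustration, the observability flags above can be assembled into a server command line. This is a minimal sketch: the helper name observability_args is hypothetical, the sample ratio is the example value from the text, and it assumes the server is started with openfga run.

```python
# Sketch only: builds an argument list using the flags named in this section.
# `observability_args` is a hypothetical helper, not part of OpenFGA.

def observability_args(trace_sample_ratio: float = 0.3) -> list[str]:
    """Return the metrics and tracing flags for the given sampling ratio."""
    if not 0.0 <= trace_sample_ratio <= 1.0:
        raise ValueError("trace sample ratio must be between 0 and 1")
    return [
        "--metrics-enabled",
        "--datastore-metrics-enabled",
        "--trace-enabled",
        f"--trace-sample-ratio={trace_sample_ratio}",
    ]

# Assumption: the server is launched with `openfga run`.
command = ["openfga", "run", *observability_args()]
print(" ".join(command))
```

Lowering the ratio further reduces the load on the tracing server at the cost of fewer sampled requests.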

Database recommendations

To ensure good performance for OpenFGA, it is recommended that the database be:

Co-located in the same physical datacenter and network as your OpenFGA servers. This will minimize latency of database calls.
Used exclusively for OpenFGA and not shared with other applications. This lets you scale the database independently and avoids contention from other workloads.
Bootstrapped and managed with the openfga migrate command. This will ensure the appropriate database indexes are created.

We strongly recommend tuning your server's database connection settings so that connections do not have to be re-established frequently. Establishing a database connection is slow and hurts performance. Here are some guidelines for managing these settings:

The server setting OPENFGA_DATASTORE_MAX_OPEN_CONNS should equal your database's maximum number of connections. For example, in Postgres you can see this value by running the SQL query SHOW max_connections;. If you run multiple instances of the OpenFGA server, divide this value equally among the instances. For example, if your database's max_connections is 100 and you have 2 OpenFGA instances, set OPENFGA_DATASTORE_MAX_OPEN_CONNS to 50 on each instance.
The server setting OPENFGA_DATASTORE_MAX_IDLE_CONNS should be no greater than the maximum open connections (see the bullet point above), but high enough to avoid recreating connections on each request.
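
The arithmetic in the two guidelines above can be sketched as a small helper. This is an illustration under stated assumptions: the function name per_instance_pool is hypothetical, integer division is used so the total never exceeds the database limit, and the idle setting is placed at the upper bound the guideline allows.

```python
# Sketch only: `per_instance_pool` is a hypothetical helper, not an OpenFGA API.

def per_instance_pool(db_max_connections: int, instances: int) -> dict[str, str]:
    """Split the database connection budget equally across OpenFGA instances."""
    if db_max_connections < 1 or instances < 1:
        raise ValueError("both arguments must be positive")
    max_open = db_max_connections // instances  # round down to stay within the budget
    if max_open < 1:
        raise ValueError("more instances than available database connections")
    return {
        "OPENFGA_DATASTORE_MAX_OPEN_CONNS": str(max_open),
        # Assumption: idle connections are set to the ceiling (equal to max open).
        "OPENFGA_DATASTORE_MAX_IDLE_CONNS": str(max_open),
    }

# The example from the text: max_connections = 100, 2 instances.
for name, value in per_instance_pool(100, 2).items():
    print(f"{name}={value}")
```

With 100 connections and 2 instances, each instance gets 50 open connections, matching the example in the text.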

If, when monitoring your database stats, you see a lot of database connections being closed and subsequently reopened, then you should consider setting the OPENFGA_DATASTORE_MAX_IDLE_CONNS to the same number as OPENFGA_DATASTORE_MAX_OPEN_CONNS.

If idle connections are getting reaped frequently, then consider increasing the OPENFGA_DATASTORE_CONN_MAX_IDLE_TIME to a large value. When in doubt, prioritize keeping connections around for longer rather than shorter, because doing so will drastically improve performance.

Concurrency limits

Note: Before modifying concurrency limits, make sure you've followed the guidance in Database recommendations.

OpenFGA queries such as Check, ListObjects and ListUsers can be quite database and CPU intensive in some cases. If you notice that a single request is consuming a lot of CPU or creating a high degree of database contention, then you may consider setting some concurrency limits to protect other requests from being negatively impacted by overly aggressive queries.

The following table enumerates the server's concurrency-specific settings:

flag	env	config
--max-concurrent-reads-for-list-objects	OPENFGA_MAX_CONCURRENT_READS_FOR_LIST_OBJECTS	maxConcurrentReadsForListObjects
--max-concurrent-reads-for-list-users	OPENFGA_MAX_CONCURRENT_READS_FOR_LIST_USERS	maxConcurrentReadsForListUsers
--max-concurrent-reads-for-check	OPENFGA_MAX_CONCURRENT_READS_FOR_CHECK	maxConcurrentReadsForCheck
--resolve-node-limit	OPENFGA_RESOLVE_NODE_LIMIT	resolveNodeLimit
--resolve-node-breadth-limit	OPENFGA_RESOLVE_NODE_BREADTH_LIMIT	resolveNodeBreadthLimit
--max-concurrent-checks-per-batch-check	OPENFGA_MAX_CONCURRENT_CHECKS_PER_BATCH_CHECK	maxConcurrentChecksPerBatchCheck

The right values for these settings depend on a variety of factors including, but not limited to, the database-specific deployment topology, the FGA model(s) involved, and the relationship tuples in the system. However, here are some high-level guidelines:

If a single ListObjects or ListUsers query is negatively impacting other query endpoints by increasing their latency or their error rate, then consider setting a lower value for OPENFGA_MAX_CONCURRENT_READS_FOR_LIST_OBJECTS or OPENFGA_MAX_CONCURRENT_READS_FOR_LIST_USERS.
If a single Check query is negatively impacting other query endpoints by increasing their latency or their error rate, then consider setting a lower value for OPENFGA_MAX_CONCURRENT_READS_FOR_CHECK.
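
To make the effect of a concurrent-reads limit concrete, here is a toy model that caps in-flight reads with a semaphore. It is not OpenFGA's implementation: the names run_reads and max_concurrent_reads are hypothetical, and the sleep stands in for a datastore round trip.

```python
import asyncio

# Toy model only: illustrates how a cap on concurrent reads bounds the load
# a single query can place on the datastore.

async def run_reads(num_reads: int, max_concurrent_reads: int) -> int:
    """Run simulated reads under a cap and return the peak number in flight."""
    limiter = asyncio.Semaphore(max_concurrent_reads)
    in_flight = 0
    peak = 0

    async def read_one() -> None:
        nonlocal in_flight, peak
        async with limiter:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a datastore read
            in_flight -= 1

    await asyncio.gather(*(read_one() for _ in range(num_reads)))
    return peak

peak_in_flight = asyncio.run(run_reads(num_reads=20, max_concurrent_reads=4))
print(f"peak concurrent reads: {peak_in_flight}")
```

A lower cap makes the query take longer overall but leaves more datastore capacity for other requests, which is the trade-off the guidelines above describe.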

If you still see high request latencies despite the guidance above, then you may additionally consider setting stricter limits on the query resolution behavior by limiting the resolution depth and resolution breadth. These can be controlled with the OPENFGA_RESOLVE_NODE_LIMIT and OPENFGA_RESOLVE_NODE_BREADTH_LIMIT settings, respectively. Consider these guidelines:

OPENFGA_RESOLVE_NODE_LIMIT limits the resolution depth of a single query, and thus sets an upper bound on how deep a relationship hierarchy may be. A high value lets a single query resolve more levels of the hierarchy and therefore issue more database queries; a low value reduces the number of levels that may be resolved and thus the number of database queries.
OPENFGA_RESOLVE_NODE_BREADTH_LIMIT limits the resolution breadth. It sets an upper bound on the number of in-flight resolutions that can take place on one or more usersets. A high value lets a single query run more evaluations concurrently and therefore use more database queries and server processes; a low value reduces the number of concurrent resolutions and thus the number of database queries and server processes.
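
The depth limit can be pictured with a toy resolver that walks a parent hierarchy and gives up once it has gone too deep. This sketch is not OpenFGA's algorithm: the data, the check function, and the exception name are hypothetical, and it models only depth, not breadth.

```python
# Toy model only: a depth-limited walk up a parent hierarchy.

class ResolutionDepthExceeded(Exception):
    """Raised when a query would resolve deeper than the configured limit."""

# Hypothetical data: each object points at its parent; viewers are set on the root.
PARENT = {"doc:readme": "folder:c", "folder:c": "folder:b", "folder:b": "folder:a"}
VIEWERS = {"folder:a": {"user:anne"}}

def check(user: str, obj: str, depth_limit: int, depth: int = 0) -> bool:
    """Return True if user can view obj directly or through an ancestor."""
    if depth > depth_limit:
        raise ResolutionDepthExceeded(f"exceeded depth limit of {depth_limit}")
    if user in VIEWERS.get(obj, set()):
        return True
    parent = PARENT.get(obj)
    if parent is None:
        return False
    # Each hop up the hierarchy stands in for one more round of datastore reads.
    return check(user, parent, depth_limit, depth + 1)

print(check("user:anne", "doc:readme", depth_limit=5))  # three hops fit in the limit
try:
    check("user:anne", "doc:readme", depth_limit=2)
except ResolutionDepthExceeded as err:
    print(f"query rejected: {err}")
```

In this model a lower limit rejects deep queries early instead of letting them issue more reads, at the cost of failing requests on legitimately deep hierarchies.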

Maximum results

Both the ListObjects and ListUsers endpoints will continue retrieving results until one of the following conditions is met:

The maximum number of results is found
The entire pool of possible results has been searched
The API times out

By default, both ListObjects and ListUsers have a maximum results limit of 1,000. The more potential results there are in the system, the more time and resources it takes to collect a large number of them. This increased load can impact performance and may lead to time-outs in some cases. If your use case allows, consider setting a lower max results value via the OPENFGA_LIST_OBJECTS_MAX_RESULTS or OPENFGA_LIST_USERS_MAX_RESULTS configuration properties. This adjustment can lead to immediate improvements in time and resource efficiency.
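
The three stopping conditions listed above can be sketched as a simple loop. This is an illustrative model, not the server's code: list_with_limits and its parameters are hypothetical, and returning the partial results on time-out is an assumption of the sketch.

```python
import time
from typing import Iterable

# Toy model only: shows which condition ends a listing query first.

def list_with_limits(
    candidates: Iterable[str], max_results: int, timeout_seconds: float
) -> tuple[list[str], str]:
    """Collect results until the limit, exhaustion, or the deadline is reached."""
    deadline = time.monotonic() + timeout_seconds
    results: list[str] = []
    for item in candidates:
        if time.monotonic() >= deadline:
            return results, "timeout"  # the API timed out
        results.append(item)
        if len(results) >= max_results:
            return results, "max_results"  # the maximum number of results was found
    return results, "exhausted"  # the entire pool was searched

pool = (f"document:{i}" for i in range(5000))
found, reason = list_with_limits(pool, max_results=1000, timeout_seconds=5.0)
print(len(found), reason)
```

With a lower max_results the loop returns sooner, which is why reducing the limit helps when the pool of potential results is large.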

Related Sections

Data and API Best Practices

Learn the best practices for managing data and invoking APIs in a production environment

Migrating Relations

Learn how to migrate relations in a production environment
