I’m late to the party, but wanted to sneak in a PGSQL Phriday entry for this community blog post series.
Check out my last post PGSQL Phriday #001 — Query Stats, Log Tags, and N+1s.
In this post I’ll share my experience and recommendations for
pg_stat_statements, PgHero, and the Query ID.
- Setup and Restart
- Computing a Query ID
- Query ID and 14
- Wrap Up
pg_stat_statements is a PostgreSQL extension that provides query statistics. You can use these statistics to better understand your query workload, including identifying slow queries.
Statistics are collected for normalized forms of queries within a group. With specific query parameters removed, a normalized form with a unique Query ID is stored, and statistics for queries that fall into this group include the total number of calls, the duration, and more.
What does this mean for query performance analysis?
This means your analysis can move to the Query ID or normalized query level, instead of at the level of specific instances of a query within that group.
Improving the performance of an entire “group” of queries increases the impact of your optimization efforts!
Some of the statistics collected are total calls and average duration. Queries that are larger contributors to poor performance will stand out.
Check out this Postgres.fm episode on
pg_stat_statements for more information.
Now that you know a little about the PGSS extension, and the Query ID, how can you install and configure this extension to get started?
Setup and Restart
You’ll want to check out the official docs1 if you’re self-hosting PostgreSQL or check out docs from your cloud provider.
AWS has a nice video called How do I implement Postgres extensions in Amazon Relational Database Service for PostgreSQL? that shows how to set up
pg_stat_statements on AWS RDS.
You’ll add the extension to
shared_preload_libraries, confirm it’s enabled with the
\dx meta command, and enable it for each database where you want to collect statistics.
To enable it, run
create extension pg_stat_statements;.
Query the captured stats using a client like
psql. Take a look at your Top 10 Worst Queries.
While that process works well for individuals that know where to look, and how to query the information, on a team I recommend adding a tool that makes this process easier.
For that tooling I recommend PgHero. PgHero has a nice presentation of Query Stats that can help your team have a shared view of all queries, including slow queries. This can make collaboration faster and easier.
With PGSS enabled and PgHero connected to your database, Query Statistics are now visible in the Queries tab.
PgHero is a Rails Engine and is available as a Ruby gem or in a Docker container.
For Rails developers, the code structure of PgHero is familiar since it’s a Rails Engine.
Creator Andrew Kane is a great maintainer of the project. Open an Issue or PR to discuss your proposed changes to PgHero, or run your changes on a fork.
Where I work, we’re running a fork with a couple of small changes that I felt were useful but didn’t make it back to the main project.
Next, let’s dive in to the Query ID attribute.
Computing a Query ID
Since PostgreSQL 14, enable
compute_query_id to compute a Query ID.
While a Query ID was available in earlier versions with PGSS, it’s now available in more places like
pg_stat_activity and the PostgreSQL log.
Configure these tools so that you can connect a PGSS Query ID with a sample from your log file or in your activity view.
How would you set that up?
Query ID and 14
In the post Using Query ID in Postgres 14, the author shows how to connect the Query ID from PGSS to query text logged in the
The query text is the full text of the query including the specific parameters. Any variations of the query with the same structure but different parameters will have the same Query ID.
compute_query_id set to
on, the author shows how to use
log_line_prefix to print the query ID.
Use the fragment
%Q is the Query ID.
You probably need to enable
log_duration as well. I wasn’t able to log the Query ID without
With that in place, you can now follow this workflow.
- Get the Query ID from PGSS for a slow query group you wish to optimize
- Log the
- Search the log file to find matches by Query ID. The query text will probably be on the next line and not on the same line.
- Once you’ve collected a query text sample with parameters, get the execution plan manually with
EXPLAIN (ANALYZE, VERBOSE)
Michael from pgMustard pointed out improvements coming to PostgreSQL 16, where
auto_explain.log_verbose will include the Query ID. See: PostgreSQL: Record queryid when auto_explain.log_verbose is on.
This will make it even easier to connect the Query ID from PGSS to samples from the log, and even their execution plans logged automatically with
There you have it. Let’s recap.
pg_stat_statementsis a useful extension. I recommend enabling it for every database you work with.
- Query the normalized queries and statistics captured using a client, or consider using PgHero to present this data to a team.
- From a Query ID in PGSS, find matching Query ID values for samples of queries with text and parameters from logs starting in Version 14.
Thanks for reading!
Update: Linked from the pgMustard Roundup!