pgreplay - PostgreSQL log file replayer for performance tests |
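pgreplay [parse options] [replay options] [-d level] [infile] |
pgreplay -f [parse options] [-o outfile] [-d level] [infile] |
pgreplay -r [replay options] [-d level] [infile] |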
pgreplay reads a PostgreSQL log file (not a WAL file), extracts the SQL statements and executes them in the same order and with the same relative timing against a PostgreSQL database cluster. |
In the first form, the log file infile is replayed as it is read. With the -f option, pgreplay does not execute the statements, but writes them to a 'replay file' outfile that can be replayed with the third form. With the -r option, pgreplay executes the statements in the replay file infile that was created by the second form. |
If the execution of statements falls behind schedule, pgreplay issues warning messages indicating that the server cannot handle the load in a timely fashion. The idea is to replay a real-world database workload as exactly as possible. |
To create a log file that can be parsed by pgreplay, you need to set the following parameters in postgresql.conf: |
log_min_messages=error (or more) |
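The three invocation forms described above can be sketched as small helpers that build argument lists. This is only an illustration: the function names and file names are hypothetical, and only switches mentioned in this page (-f, -r, -o, -c, -h, -p, -s) are used. The lists could be handed to `subprocess.run`, but nothing is executed here.

```python
from typing import List, Optional


def direct_replay(infile: str, host: Optional[str] = None,
                  port: Optional[int] = None, csv: bool = False) -> List[str]:
    """First form: parse the log file and replay it as it is read."""
    args = ["pgreplay"]
    if csv:
        args.append("-c")              # log file is in csvlog format
    if host is not None:
        args += ["-h", host]           # target host or socket directory
    if port is not None:
        args += ["-p", str(port)]      # target TCP port
    return args + [infile]


def parse_to_replay_file(infile: str, outfile: str,
                         csv: bool = False) -> List[str]:
    """Second form: -f parses only and writes a replay file (-o outfile)."""
    args = ["pgreplay", "-f"]
    if csv:
        args.append("-c")
    return args + ["-o", outfile, infile]


def replay_from_file(replay_file: str,
                     speed: Optional[float] = None) -> List[str]:
    """Third form: -r replays a file written by the second form."""
    args = ["pgreplay", "-r"]
    if speed is not None:
        args += ["-s", str(speed)]     # speed factor for replay
    return args + [replay_file]
```

Splitting parsing (second form) from replay (third form) lets the same replay file be run several times, for example against different target clusters.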
The database cluster against which you replay the SQL statements must be a clone of the database cluster that generated the logs, taken immediately before the logs were generated. |
pgreplay is useful for performance tests, particularly in the following situations: |
* You want to compare the performance of your PostgreSQL application on different hardware or different operating systems. |
* You want to upgrade your database and want to make sure that the new database version does not suffer from performance regressions that affect you. |
Moreover, pgreplay can give you some feeling as to how your application might scale by allowing you to replay the workload at a different speed. Be warned, though, that 500 users working at double speed is not really the same as 1000 users working at normal speed. |
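To make the effect of replaying at a different speed concrete, here is a minimal timing sketch. It assumes that the interval between statements is divided by the speed factor, which is what the 'slow motion' and 'fast forward' wording of the -s option suggests; it is not taken from the pgreplay source, and the function names are hypothetical.

```python
from typing import List


def replay_schedule(offsets: List[float], factor: float) -> List[float]:
    """Scale original offsets (seconds since the first statement) by the factor."""
    if factor <= 0:
        raise ValueError("speed factor must be positive")
    # A factor of 2 halves every interval; a factor of 0.5 doubles it.
    return [t / factor for t in offsets]


def lag(scheduled: List[float], actual: List[float]) -> List[float]:
    """Seconds by which each statement started late (0.0 if on time or early)."""
    return [max(0.0, a - s) for s, a in zip(scheduled, actual)]
```

A statement with a positive lag is one that has fallen behind schedule, which is the situation in which pgreplay is described as issuing warnings.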
Parse options: |
-c |
Specifies that the log file is in 'csvlog' format and not in 'stderr' format. |
-b timestamp |
Only log entries with a timestamp greater than or equal to this one will be parsed. The format is YYYY-MM-DD HH:MM:SS.FFF, as in the log file. An optional time zone part will be ignored. |
-e timestamp |
Only log entries with a timestamp less than or equal to this one will be parsed. The format is YYYY-MM-DD HH:MM:SS.FFF, as in the log file. An optional time zone part will be ignored. |
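The -b and -e bounds can be illustrated with a short sketch of inclusive timestamp filtering. The helper names are hypothetical and the code only mirrors the behaviour described above (fixed YYYY-MM-DD HH:MM:SS.FFF format, time zone part ignored); it is not pgreplay's own parser.

```python
from datetime import datetime
from typing import Optional


def parse_log_timestamp(text: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS.FFF', ignoring any trailing time zone part."""
    # The first 23 characters hold the date, the time and the milliseconds.
    return datetime.strptime(text.strip()[:23], "%Y-%m-%d %H:%M:%S.%f")


def in_window(ts: datetime, begin: Optional[datetime] = None,
              end: Optional[datetime] = None) -> bool:
    """True if ts lies within [begin, end]; a missing bound is unbounded."""
    if begin is not None and ts < begin:
        return False
    if end is not None and ts > end:
        return False
    return True
```

Both bounds are inclusive, so an entry stamped exactly at the -b or -e value is kept.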
Replay options: |
-h hostname |
Host name where the target database cluster is running (or directory where the UNIX socket can be found). Defaults to local connections. |
-p port |
TCP port where the target database cluster can be reached. |
-W password |
By default, pgreplay assumes that the target database cluster is configured for trust authentication. With the -W option you can specify a password that will be used for all users in the cluster. |
-s factor |
Speed factor for replay, by default 1. This can be any valid positive floating-point number. A factor less than 1 will replay the workload in 'slow motion', while a factor greater than 1 means 'fast forward'. |
-E encoding |
Specifies the encoding of the log file, which will be used as client encoding during replay. If it is omitted, your default client encoding will be used. |
Output options: |
-o outfile |
Specifies the replay file where the statements will be written for later replay. |
Debug options: |
-d level |
Specifies the trace level (between 1 and 3). Increasing levels will produce more detailed information about what pgreplay is doing. |
-v |
Prints the program version and exits. |
PGHOST |
Specifies the default value for the -h option. |
PGPORT |
Specifies the default value for the -p option. |
PGCLIENTENCODING |
Specifies the default value for the -E option. |
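The relationship between the environment variables and the -h, -p and -E options can be sketched as a simple precedence rule: an explicit option wins, otherwise the environment variable supplies the default. The helper name is hypothetical, and returning None when neither is set is an assumption standing in for whatever built-in default applies.

```python
import os
from typing import Mapping, Optional


def resolve_option(cli_value: Optional[str], env_name: str,
                   env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the command-line value if given, else the environment variable."""
    if cli_value is not None:
        return cli_value
    # None means neither source provided a value (assumed built-in default).
    return env.get(env_name)
```

For example, `resolve_option(None, "PGHOST")` yields the value of PGHOST from the current environment, if it is set.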
pgreplay can only replay what is logged by PostgreSQL. This leads to some limitations: |
* COPY statements will not be replayed, because the copy data are not logged. |
* Fast-path API function calls are not logged and will not be replayed. Unfortunately, this includes the Large Object API. |
* Since the log file is always in the server encoding (which you can specify with the -E switch of pgreplay), all SET client_encoding statements will be ignored. |
* Since the preparation time of prepared statements is not logged (unless log_min_messages is debug2 or more), these statements will be prepared immediately before they are first executed during replay. |
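As a rough way to estimate how much of a logged workload is affected by the first and third limitations, the following sketch flags statements that would not be replayed. The function name and regular expressions are illustrative assumptions and far simpler than a real SQL parser; this is not how pgreplay itself detects these statements.

```python
import re
from typing import Optional

# Statements that begin with the keyword COPY.
_COPY = re.compile(r"^\s*COPY\b", re.IGNORECASE)
# SET [SESSION | LOCAL] client_encoding ...
_SET_CLIENT_ENCODING = re.compile(
    r"^\s*SET\s+(?:(?:SESSION|LOCAL)\s+)?client_encoding\b", re.IGNORECASE)


def skipped_reason(statement: str) -> Optional[str]:
    """Return why a statement would be skipped, or None if it is replayable."""
    if _COPY.match(statement):
        return "COPY data are not logged"
    if _SET_CLIENT_ENCODING.match(statement):
        return "client encoding is determined by -E"
    return None
```

Running this over the statements extracted from a log gives a quick count of how many would be skipped before any replay is attempted.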
Written by Laurenz Albe <laurenz.albe@wien.gv.at>. |