2010-06-30

More Drizzle Syslog. Logging big and slow queries.

One extremely useful feature in MySQL is the "slow query log". In the stock distribution, it records every query that takes more than 1 second to execute, and every query that requires a full table scan. A useful part of the Percona patches extend this feature to measure sub-second slow queries, which became important as machines got faster.

When someone asks "Why is the database slow?", the first counter question to ask is "What's in the slow query log?"

One of the confusing parts of administering MySQL is that the "query log" and the "slow query log" are different things. They have different controlling options, and different log formats.

When I ripped out and replaced the logging system during the Drizzle fork, I took these two logging systems, which were implemented as completely separate code, and merged them together. There is now only the "query log", but it can be filtered in-process to only record queries that are "too slow", "too large", or "too expensive".

This filtering is controlled with three configuration options or server variables:

syslog-logging-threshold-slow
syslog-logging-threshold-big-resultset
syslog-logging-threshold-big-examined

These all have a default value of is zero.

The "-slow" variable is in units of milliseconds. If a query takes longer than that much time, it gets logged.

The "-big-resultset" variable means that any query that has a resultset with more rows than this value logged.

The "-big-examined" is a bit more complicated. It counts the number of rows that were retrieved from the underlying storage engines. Any query that read more than the value gets logged. This is useful for detecting table scans.

For example, if a table PARTS has 10,000 rows, and has a column SID that is not indexed, then the query
SELECT * FROM PARTS WHERE SID='0042';
than the number of rows examined will be 10,000, no matter how many rows match the query. But if that column is indexed, and that index has a high cardinality, than the number of rows examined will be much less. In the best case, it will be the same as the number of rows returned.


Another major annoyance with stock MySQL is that to enable the query log and the slow query log, you have to restart the server to enable them. This is a problem for large systems with uptime requirements.

In Drizzle, on the other hand, the variables syslog-errmsg-enable and syslog-logging-enable can be set while the server is running.

Also, the threshold variables can be changed while the server is running. So you can leave them at the default values of zero when first setting up an application, and get a full log. Then turn them up to high values, and only get logged events when something unusual happens, and then turn the thresholds back down to lower more sensitive values to investigate the problem, all without restarting the server.

1 comment:

  1. Do you always log queries when they complete, not as they enter?

    That is one big difference between the general log and the slow log in MySQL. It is why you can use the general log for information such as when a query is causing a crash since it is already logged by the time it starts executing.

    Also, in MySQL 5.1 you can enable or disable slow and/or general logging on the fly.

    Finally, why no log-queries-not-using-indexes type option? big-examined can kinda do that, but not as precisely and wouldn't be as useful in certain environments (but more useful in others).

    ReplyDelete