Configuration & Usage of ExtSQL Introduction The Extended usage statistics for SQL (ExtSQL) were designed to give the DBA complete control over the type and amount of data to be recorded by the server. The syntax to display information is fairly simple and is similar to SQL. This document describes how to configure and use Extended Statistics. For information on how to install them on your server, please consult documentation at http://www.extsql.com/documentation.php NOTE: Version/platform specific information is marked by bracketed [MySQL Only] text. Terminology In ExtSQL the different types of things that can interact with the server are called Classes. In its current release these are: db - a database. user - a user. server - the server itself. [Next three not yet available for PostgreSQL] host - the host machine of a connection. conuser - a connection combination of user@host. condb - a connection combination of db@host. The conuser & condb Classes track the host origin of activity for a specific user or database. The conuser reports on activity by a specific user and host combination. The condb tracks activity from a user at a specific host based on the current db selected. e.g. a remote connected MySQL user can change databases during a session by issuing a 'use' command. Within a Class, specific instantiations are called Instances. An instance of a user could be domenic or christina. For each instance we are allowed to track individual data items called Vars. The server Class is unique, the only instance is the server itself. [MySQL Versions] - Vars include almost all of the values displayed in the present SHOW STATUS command or available on an admin tool such as phpMyAdmin as "MySQL runtime information." You do need to use the name of the Var as displayed in the SHOW STATUS command, e.g. Com_select tracks select statements. [PostgreSQL] - Currently limited to: Com_begin, Com_commit, Com_create_table, Com_delete, Com_drop_table, Com_insert, Com_rollback, Com_select, Com_update, Connections, Questions Configuration/Startup Options By default, when the ExtSQL server is started, no data collection is done. All version can report on Vars that record basic SQL commands such as SELECT, INSERT, UPDATE, DELETE, etc... Additional data items (Vars) are available depending on the implementation. The DBA configures the ExtSQL server by describing which Vars they are interested in tracking and setting different characteristics on how data will be collected. Currently support tracking times for all version of ExtSQL can be found here: http://www.ExtSQL.com/tracking.php In patch versions < 2.0, the Classes and Vars being tracked are established at server startup and cannot be changed without a restart. ExtSQL remains disabled on a server unless you have defined a 'extsql_class_list' in your [MySQL] /etc/my.cnf file or [PostgreSQL] data/postgresql.conf file. It may also be defined as a command line option to the server, e.g. [MySQL versions] mysqld_safe -user=mysql --extsql_class_list="user,max-50,time-60,units-m,(Com_select,Com_insert)" [PostgreSQL versions] bin/pg_ctl -D data -l logfile -o '-c extsql_class_list="db,max-24,time-15,units-m,(Questions, Com_select)"' start For all version, if you have something already defined in the config file, a value specified on the server command line will take precedence. The syntax is defined as: extsql_class_list=" Class,max-numInstances,time-timeLimit,units-(m|h|d),(Var,...),Class..." Here is a more complete example: extsql_class_list=" user, max-50, time-60, units-m,(Com_select, Com_update), db, max-50, time-10, units-h,(Com_insert, Com_select, Com_update, Connections), host, max-5, time-3, units-d,(Com_select, Com_update, Questions), server, max-5, time-3, units-d,(Questions, Connections) NOTE: the string must be on one physical line in the config file. In the above we want the server to track information on three different Classes: user, db, and host. We will also keep cumulative statistics for the server as a whole. Similar to the MySQL SHOW STATUS, but with historical data. For each Class, the 'max-' precedes the number of Instances of that Class to be supported. The 'time-' precedes the number of historical time units to store and 'units-' designates the period of interest: 'm' minutes, 'h' hours, and 'd' days. After limits for each Class, a comma separated list of Vars is included. We can now read the line starting with 'db' above as a directive to track up to a maximum of 50 dbs (db, max-50) and record historical activity for the last 10 hours (time-10, units-h). The Vars we are interested in are: inserts (Com_insert), selects (Com_select), updates (Com_update) and connections made. Time Recording and Reporting A more indepth explanation is useful on how ExtSQL records and reports time. For example if you choose to store data using the following time constraint 'user, max-50, time-10, units-h' on the user Class. ExtSQL will actually create a buffer that holds 11 time periods. The '0' buffer always holds cumulative numbers of usage since server start. This is the number displayed in a simple: 'SHOW STATISTICS * FROM user'. Historical activity is captured for 10 'hour' periods in a circular manner and ONLY WHEN activity occurs, e.g. You start the server and 20 hours laters give the command: SHOW STATISTICS * FROM user HISTORY a) Assuming there was user activity during each of the hours since server start. You would see, reported by hour, the amount of user activity for the last 10 hours ONLY. Each row would be timestamped to the hour. b) Assuming there was user activity recorded in each of the first 10 hours of server operation and then only in the last 5 hours, i.e. there is a 5 hour gap with no activity. You would see, reported by hour, the activity that occured between hours 5-10 of server operation, and then the last 5 hours. Each row would be timestamped to the hour. None of the storage space would be used when no activity occurs. In either case above if you have then gave the command: 'SHOW STATISTICS * FROM user' -- you would see just one line of cumulative totals from all 20 hours of operation. NOTE: There was some discussion when this feature was implemented. For some, it might be more natural to expect ExtSQL to report just zero data for the time intervals with no activity. It is under consideration (awaiting user feedback) to also implement a configuration option to allow both behaviours. Security Statistical information is something you may not want to share with every user of the system. By default only the root users may display usage information. The config file also supports another option: extsql_users="john,domenic,phil,mary" This allows a DBA to give access to named users of the system. The root user is always allowed access. The SHOW STATISTICS command, without any options, will show authorized users. The command will not display any information to unauthorized users. Using ExtSQL -- the SHOW STATISTICS Syntax A goal of the implementation was to keep the syntax used to query the statistics as close to standard SQL as possible. The syntax described below should look very familiar, especially if you replace SHOW STATISTICS with SELECT. The significant differences are the use of LIKE to match specific Instances and the new keyword HISTORY to produce historical output. The LIMIT option operates differently depending on whether HISTORY has been specified. Without HISTORY it operates as in normal SQL, limiting the number of result rows displayed. If HISTORY is chosen it functions as a time limit, e.g. show me just the last 3 hours for each user. SHOW STATISTICS ( * | Var list) FROM Class [WHERE var ( '<' | '>' | '=' ) num] [LIKE 'Instance pattern'] [ORDER BY Var] [HISTORY] [LIMIT rows_or_time] Because this is a new SQL addition, if you type just SHOW STATISTICS, you will get some usage and ExtSQL subsystem information -- not just an error message. In normal SQL you could have expected: SHOW STATISTICS * FROM user WHERE user like '%joe%' and Com_select > 500 Yes, we had to cheat by making LIKE a separate clause in the syntax that matched just Class instances. That is our development goal, but at present syntax is limited in the WHERE clause and ORDER BY is also being worked on. This is available if INFORMATION SCHEMA is supported in your base version of the ExtSQL server. Usage Examples The examples below show a cross section of commands and output. When using the "*" column headings are in the same order as specified in the extsql_class_list. ## Show number of select and insert statements given by all users ## since server start: sql> SHOW STATISTICS Com_select, Com_insert, Questions FROM user; +----------+-------------+------------+------------+ | user | Com_select | Com_insert | Questions | +----------+-------------+------------+------------+ | bandala | 8302675 | 95973 | 23153940 | | sandymao | 1702812 | 6205 | 3829023 | | ponnetli | 24909 | 4784 | 95646 | ## Show number of select, total queries issued, and updates made by ## all client hosts which have connected to this server: sql> SHOW STATISTICS Com_select, Questions, Com_update FROM host; +-------------------+--------------+----------+--------------+ | host | Com_select | Questions | Com_update | +-------------------+--------------+----------+--------------+ | db2.adomain.com | 17715223 | 44224076 | 4143634981 | | lathe.adomain.com | 2738061 | 9743215 | 3913397495 | | telkomadsl.co.za | 195 | 5390 | 539604 | ## Show number of select, total queries, and updates made by all ## users@client host machines where more than 10,000 queries were ## issued: sql> SHOW STATISTICS Com_select, Questions, Com_update FROM conuser WHERE Questions > 10000; +----------------------------+-------------+-----------+------------+ | conuser (user@host) | Com_select | Questions | Com_update | +--------- ------------------+-------------+-----------+------------+ | bandala@db2.adomain.com | 8306726 | 23163320 | 3439850933 | | sandymao@db2.adomain.com | 1704040 | 3831803 | 3365501841 | | ponnetli@lathe.adomain.com | 24920 | 95662 | 156529077 | ## Show number of select, total queries, and updates to DB ## bandala for the past three minutes: sql> SHOW STATISTICS Com_select, Questions, Com_update FROM db LIKE 'bandala' HISTORY LIMIT 3; +---------+-------------+------------+-----------+------------+ | db | minutes | Com_select | Questions | Com_update | +---------+-------------+------------+-----------+------------+ | bandala | 11/20 13:56 | 216 | 382 | 318343 | | bandala | 11/20 13:55 | 642 | 1618 | 1386347 | | bandala | 11/20 13:54 | 280 | 699 | 646855 | Sample Configurations and allowed tracking Vars Good defaults - Here are in an example simple configuration. It shows activity by user and database along with server totals. You should set the max, time, and units values as desired. extsql_users="youruser,afriend" # NOTE - following should be on ONE physical line! extsql_class_list="db, max-50, time-100, units-h, (Com_create_table, Com_delete, Com_insert, Com_select, Com_update, Connections, Questions), user, max-50, time-100, units-h, (Com_create_table, Com_delete, Com_insert, Com_select, Com_update, Connections, Questions), server, max-1, time-10, units-d, (Com_create_table, Com_delete, Com_insert, Com_select, Com_update, Connections, Questions)" [MySQL Versions] There are over 200 different variables available. For SAFETY variables not shown above will not be accepted by ExtSQL at startup; however, experienced DBAs can try to use any variable listed on SHOW STATUS. To do this it is necessary to set an OVERRIDE debug flag during server start, e.g. include the following in your my.cnf file before the extsql_class_list: extsql_debug=8192 Current tracking items and example config files for all version of ExtSQL can be found here: http://www.ExtSQL.com/tracking.php INFORMATION SCHEMA Syntax As many of you are aware INFORMATION SCHEMA is already part of the SQL standard and its purpose was to make SQL databases and object more "self describing". [MySQL/PostgreSQL] We have an example working implementation for MySQL 5.0.x as briefly demonstrated below. PostgreSQL does support INFORMATION SCHEMA, but a patch set with ExtSQL changes is not yet available. mysql> use INFORMATION_SCHEMA; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> SHOW TABLES; +---------------------------------------+ | Tables_in_information_schema | +---------------------------------------+ | CHARACTER_SETS | | COLLATIONS | | COLLATION_CHARACTER_SET_APPLICABILITY | | COLUMNS | | COLUMN_PRIVILEGES | | KEY_COLUMN_USAGE | | PROFILING | | ROUTINES | | SCHEMATA | | SCHEMA_PRIVILEGES | | STATISTICS | ## Note: the additional tables configured by the DBA for ExtSQL are ## prefaced with 'EXTSTATS_' | EXTSTATS_condb | | EXTSTATS_conuser | | EXTSTATS_db | | EXTSTATS_host | | EXTSTATS_server | | EXTSTATS_user | ## End of added tables | TABLES | | TABLE_CONSTRAINTS | | TABLE_PRIVILEGES | | TRIGGERS | | USER_PRIVILEGES | | VIEWS | mysql> SELECT user, minutes, Com_select, Questions FROM EXTSTATS_user WHERE TIMEDIFF(now(), minutes) < '10:00'; +---------+---------------------+------------+-----------+ | user | minutes | Com_select | Questions | +---------+---------------------+------------+-----------+ | thebook | 2008-07-19 09:28:00 | 2 | 4 | | thebook | 2008-07-19 09:27:00 | 3 | 3 | | thebook | 2008-07-19 09:26:00 | 1 | 1 | | thebook | 2008-07-19 09:25:00 | 2 | 73 | | root | 2008-07-19 09:53:00 | 4 | 4 | | root | 2008-07-19 09:52:00 | 0 | 1 | | root | 2008-07-19 09:50:00 | 1 | 1 | ## NOTE: unlike the special time handling as part of SHOW STATISTICS, ## no changes were made to internal handling of INFORMATION SCHEMA ## queries. This makes the full syntax of the SQL parser available, ## BUT.... it exposes that data is recorded in a circular buffer by ## time interval (minutes in our example), with the time '0000-00-00' ## containing totals since server start. mysql> SELECT user, minutes, Com_select, Questions FROM EXTSTATS_user WHERE user='thebook' LIMIT 5; +---------+---------------------+------------+-----------+ | user | minutes | Com_select | Questions | +---------+---------------------+------------+-----------+ | thebook | 2008-07-19 09:28:00 | 2 | 4 | | thebook | 2008-07-19 09:27:00 | 3 | 3 | | thebook | 2008-07-19 09:26:00 | 1 | 1 | | thebook | 2008-07-19 09:25:00 | 2 | 73 | | thebook | 2008-07-18 16:44:00 | 0 | 14 | [MySQL Only] Monitoring and controlling ExtSQL with SHOW and SET Other commands allow you to know what is going on within the subsystem and even exercise control over its operation. You can see all the available items with: SHOW [GLOBAL] VARIABLES LIKE 'extsql%'; extsql_active: 1 - active, 0 - disabled Allows you to suspend statistical tracking and reactivate it later without a server restart. Data recording will pick up where it left off, no old data will be lost. extsql_debug: Bit mask value. The source contains various test points that will output data to the log. To be used with caution under the direction of our service staff! It is okay to use the 8192 value at server start to allow additional variable tracking. Each of the values above can be modified by the root user using the SET command, e.g. SET GLOBAL extsql_active = 0; Controlling ExtSQL and managing data [ExtSQL patch version > 2.0] Additional commands allow the DBA to temporarily suspend data collection, store data, and even change what data is being collected without a server restart. STATISTICS ( ON | OFF ) -- set data collection on or off. Historical data is not lost. NOTE: Due to inadequate testing, the following commands are disabled by default in current versions. They may be used if the following is set in the startup config file: extsql_debug = 16 STATISTICS RESET [conf_file] -- clear any old data. Start collecting using the existing Class definitions, or use those in specified conf_file. STATISTICS WRITE reload_file -- save a snapshot of current data to a file. Useful in storing data from an anamolous event for later analysis. STATISTICS READ reload_file -- load statistics data from the specified file. A configuration option called "extsql_reload_file" allows you to store collected data which is automatically saved on a server shutdown. Future Directions In addition to items mentioned above work is also planned in the following areas: * Extending the SHOW STATISTICS command to support a more complete syntax in the WHERE and ORDER BY clauses. * Addition of triggers to support administrator notification when predefine usage thresholds are exceeded. * Development of an external tool similar to MRTG to query the server over time and create detailed usage graphs for historical tracking. Like any new feature we expect a strong amount of development to occur and the Software Workshop plans on providing a focal point for community discussion, bug reports and fixes, along with enhancements. As the ExtSQL subsystem matures we expect it will also be picked up into the main MySQL distribution.