2008-06-17

How to Truncate All or Some of the Tables in a MySQL Database

Truncating all tables in a database is a common problem which arises a lot during testing or debugging.
One of the most common answers to this question is to drop & recreate the database, most likely utilizing the shell. For example, something like this:

mysqldump --add-drop-table --no-data [dbname] | mysql [dbname]

This dumps the entire schema structure to disk, without dumping any data, and with commands for dropping existing tables. When loading it back into mysql, it essentially truncates all the tables in the database. Basically, this is a decent solution for many uses.

We had a requirement for a solution that needed these additional features:

  1. Does not require shell (We work with both Linux and Windows)
  2. Resides inside the MySQL Server (To minimize outside dependencies - for example - the mysql command line client)
  3. Can truncate only specified tables using a some sort of filter

In other words, the main requirement here is encapsulation and simplicity of use for the developers.
Basically, this means I've got no other choice except writing a stored procedure for this purpose, ugly as it may be. Specifically, this will require using the pseudo-dynamic SQL in MySQL (more about this later) and a server-side cursor.

The code of the Stored Procedure:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN
DECLARE done BOOL DEFAULT FALSE;
DECLARE truncate_command VARCHAR(512);
DECLARE truncate_cur
CURSOR FOR /*This is the query which selects the tables we want to truncate*/
SELECT CONCAT('TRUNCATE TABLE ',table_name)
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';
DECLARE
CONTINUE HANDLER FOR
SQLSTATE '02000'
SET done = TRUE;

OPEN truncate_cur;

truncate_loop: LOOP
FETCH truncate_cur INTO truncate_command;
SET @truncate_command = truncate_command;

IF done THEN
CLOSE truncate_cur;
LEAVE truncate_loop;
END IF;

/*Main part - preparing and executing the statement*/
PREPARE truncate_command_stmt FROM @truncate_command;
EXECUTE truncate_command_stmt;

END LOOP;
END$$

DELIMITER ;

This is a generalized version, you can (and should) modify it to suit your needs.

Regarding dynamic SQL. Initially I've tried to make this query a lot simpler, and use something along the lines of the previous blog post. This would look something like:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN /*Build a long user variable containing the varchars*/
SELECT GROUP_CONCAT('TRUNCATE TABLE ',table_name SEPARATOR ';')
INTO @truncate_command
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';

SET @truncate_command = CONCAT(@truncate_command,';');

PREPARE truncate_command_stmt FROM @truncate_command;
EXECUTE truncate_command_stmt;

END$$

DELIMITER ;

The main (and unsolvable) problem with this code is that the PREPARE cannot prepare more than one statement. In other words, while in other databases you could have sent an entire batch (real dynamic SQL), in MySQL the only way to do this is to run statement by statement. On the other hand, there is no GROUP_CONCAT() in other databases (even though it can apparently be created with ease in PostgreSQL).

The second and less worrying problem is the fact that by default, GROUP_CONCAT() result tends to be limited to a certain length. This is easily solved by setting group_concat_max_len to something higher than default.

End Transmission.

10 comments:

  1. If the mysql-dump command isn't buffered this could cause huge problems (imagine the drop command executed by MySQL while the dump is still underway). This probably isn't a problem because the dump program loads all data before rendering the output, but it still seems needlessly risky to me. I'd split those into two separate commands.

    ReplyDelete
  2. Using FOREIGN KEYs also helps since the TRUNCATE on the parent table also nukes all the children.

    ReplyDelete
  3. Tomer: According to the manual, the mysqldump does not buffer contents in memory by default, when dumping table data. There is a config option "--skip-quick" that enables buffering. Then again, that's only for table data, not for DDL.
    That's an interesting point tho. It doesn't happen in reality (at least it didn't to me) but it could happen. There is probably some linux command that enables buffering the result in between the commands.
    Using a temporary file is also an option though.

    Tim: This is also a good point. I've wrote this script to truncate tables in a database that has no foreign keys at all, but a lot of tables.
    Truncating a lot of interconnected tables could be a problem, since AFAIK you can't truncate a table when it has records pointing to it (you can only DELETE from it). I'll need to think of a solution for that case then. Maybe somehow find out who are child tables, and truncate them before others...

    ReplyDelete
  4. Another way to do it atomically for an entire database is:

    create database dummy_db;

    for each table in db:
    create table x_new like x;
    ..

    rename table x to dummy_db.x_old, x_new to x,
    y to dummy_db.y_old, y_new to y,
    ...;

    drop database dummy_db;

    ReplyDelete
  5. Hi Shlomo! Feel free to add your code as a code snippet on forge.mysql.com!

    good stuff!

    ReplyDelete
  6. I probably should.. It needs a bit of clean-up & error handling etc before that...

    And thanks!
    I've got your book here on my bookshelf at work, by the way :)

    ReplyDelete
  7. Didn't get in-depth into what you've written here, but regarding this:
    There is probably some linux command that enables buffering the result in between the commands.

    The pipe buffer in Linux, defined by the PIPE_SIZE constant, is 4K. It's hardcoded into the kernel. I guess you could solve this with a mediator, such as this (found through Google, didn't test):
    http://www.glines.org/wiki/bfr

    As a side note, turns out enlarging your pipe buffer does not improve performance:
    http://ussg.iu.edu/hypermail/linux/kernel/0001.1/0401.html

    ReplyDelete
  8. Hey. This was exactly what I needed. I've started to do integration tests, and I needed a fresh database at each run, but I needed to keep the data in some of the tables.

    I put together a guide for cloning and truncating a mysql database using this method. (have remembered to give you credit ;) )

    If you are interested, it is here:
    http://cimicdk.blogspot.com/2012/03/clone-mysqldatabase-for-integration.html

    ReplyDelete