Showing posts with label MySQL. Show all posts
Showing posts with label MySQL. Show all posts

2008-06-17

How to Truncate All or Some of the Tables in a MySQL Database

Truncating all tables in a database is a common problem which arises a lot during testing or debugging.
One of the most common answers to this question is to drop & recreate the database, most likely utilizing the shell. For example, something like this:

mysqldump --add-drop-table --no-data [dbname] | mysql [dbname]

This dumps the entire schema structure to disk, without dumping any data, and with commands for dropping existing tables. When loading it back into mysql, it essentially truncates all the tables in the database. Basically, this is a decent solution for many uses.

We had a requirement for a solution that needed these additional features:

  1. Does not require shell (We work with both Linux and Windows)
  2. Resides inside the MySQL Server (To minimize outside dependencies - for example - the mysql command line client)
  3. Can truncate only specified tables using a some sort of filter

In other words, the main requirement here is encapsulation and simplicity of use for the developers.
Basically, this means I've got no other choice except writing a stored procedure for this purpose, ugly as it may be. Specifically, this will require using the pseudo-dynamic SQL in MySQL (more about this later) and a server-side cursor.

The code of the Stored Procedure:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN
DECLARE done BOOL DEFAULT FALSE;
DECLARE truncate_command VARCHAR(512);
DECLARE truncate_cur
CURSOR FOR /*This is the query which selects the tables we want to truncate*/
SELECT CONCAT('TRUNCATE TABLE ',table_name)
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';
DECLARE
CONTINUE HANDLER FOR
SQLSTATE '02000'
SET done = TRUE;

OPEN truncate_cur;

truncate_loop: LOOP
FETCH truncate_cur INTO truncate_command;
SET @truncate_command = truncate_command;

IF done THEN
CLOSE truncate_cur;
LEAVE truncate_loop;
END IF;

/*Main part - preparing and executing the statement*/
PREPARE truncate_command_stmt FROM @truncate_command;
EXECUTE truncate_command_stmt;

END LOOP;
END$$

DELIMITER ;

This is a generalized version, you can (and should) modify it to suit your needs.

Regarding dynamic SQL. Initially I've tried to make this query a lot simpler, and use something along the lines of the previous blog post. This would look something like:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN /*Build a long user variable containing the varchars*/
SELECT GROUP_CONCAT('TRUNCATE TABLE ',table_name SEPARATOR ';')
INTO @truncate_command
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';

SET @truncate_command = CONCAT(@truncate_command,';');

PREPARE truncate_command_stmt FROM @truncate_command;
EXECUTE truncate_command_stmt;

END$$

DELIMITER ;

The main (and unsolvable) problem with this code is that the PREPARE cannot prepare more than one statement. In other words, while in other databases you could have sent an entire batch (real dynamic SQL), in MySQL the only way to do this is to run statement by statement. On the other hand, there is no GROUP_CONCAT() in other databases (even though it can apparently be created with ease in PostgreSQL).

The second and less worrying problem is the fact that by default, GROUP_CONCAT() result tends to be limited to a certain length. This is easily solved by setting group_concat_max_len to something higher than default.

End Transmission.

2008-06-12

How To Perform a Connection Massacre

Occasionally, you find yourself in the need to kill problematic connections to the database.
Usually if it's only one or two connections, you can use the combination of "SHOW PROCESSLIST" command to identify the problematic connection ID, and run a "KILL ID" command.

What do you do if you need to kill 10 connections? Or 56? I wouldn't want to type in all those kill commands, it's just dirty work. What we need is a more neat manner to perform those kills. Mass kill, if you wish.

Alternative way: use the INFORMATION_SCHEMA's PROCESSLIST table, to construct the kill statements semi-automatically.

SELECT CONCAT('kill ',id,';') AS kill_list
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE command='Sleep';

This select will return something like this (when using the command line client):

+-----------+
| kill_list |
+-----------+
| kill 28; |
| kill 1; |
+-----------+
2 rows in set (0.04 sec)

You can also use GROUP_CONCAT() to get the kill commands in the same line, which may be more useful when you can't copy&paste easily:

SELECT GROUP_CONCAT('kill ',id SEPARATOR '; ') AS kill_list
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE command='Sleep';

Returns:

+------------------+
| kill_list |
+------------------+
| kill 28;,kill 1; |
+------------------+
1 row in set (0.03 sec)

Note, that you can use any filter in the WHERE clause, such as "WHERE db IN('dbname1','dbname2')", "WHERE user = 'appuser'" or "WHERE time>600". If you got any more clever uses, feel free to post them in the comments.

Now, all needed is to copy&paste those kill commands into the mysql command console (whichever you're using: GUI, command line, etc) and you're done.

Since most programs are kind enough to format the output in some way that prevents convenient copying/pasting, here's another tip. The mysql command line client can be asked to strip some of the output by using the "-s" optional parameter, which stands for "silent". You can use it once, and remove the ASCII art, timing and row count, and if you use it again (-s -s), you only get the actual rows.

An example doing just this (copy&paste friendly!):

mysql -s -s -e "SELECT CONCAT('kill ',id,';') AS kill_list FROM INFORMATION_SCHEMA.PROCESSLIST WHERE command='Sleep';"

Gets you this (with different numbers of course):

kill 28;
kill 1;

Another way to do this, is to use SQLyog or any other GUI for MySQL, which can output the results in text. Most of them have this as a configurable option (Ctrl-L keyboard shortcut in SQLyog).

Hope this is will save someone some precious time.

Update:
I didn't include a suggestion on doing this automatically, since I personally prefer to run kill commands manually most of the time. Plus, some of the comments gave me an idea. Why not just pipe the stripped kill_list into mysql?
mysql -s -s -e "SELECT GROUP_CONCAT('kill ',id SEPARATOR '; ') AS kill_list FROM INFORMATION_SCHEMA.PROCESSLIST WHERE command='Sleep';" | mysql

2008-05-31

Taking Cache Warm-up Time Into Consideration

Last week there there was a short scheduled downtime for the system, and I've used the chance to upgrade the MySQL servers to a newer version. We've tested it for more than a month before deploying to production, so I wasn't worried about any potential problems.

The upgrade itself went smoothly and was complete in about less than 10 minutes. Speaking of which, this is one of the things I like in MySQL upgrades as opposed to SQL Server.

  1. Copy new version
  2. Shut down the server
  3. Rename directories
  4. Start server
  5. Run mysql_upgrade
  6. Restart server

All of the above usually takes about 2 minutes, and can be reverted quite easily, if you have backup for the data files. This is compared to a SQL Server upgrade, or even a Service Pack install which includes running an installer for what can be 20 minutes or more. Uninstalling is equality lengthy, though both are accompanied by nice and useless progress bars.

After the upgrade was complete, we started the application and it wouldn't work - it was reporting timeouts. Why? Was there a problem with the upgrade? Some bug we haven't found out about? The circumstantial evidence pointed directly to the upgrade as the problem. After some digging by Pasha, the problematic query was found - it was a group by query updating some counter table. It ran in under 5 seconds before, but now it just wouldn't finish in the 30sec timeframe before timing out.

This immediately pointer to the real cause of the problem: the server restart. The buffer cache was flushed, and the index scanned by this query was not in memory. Fixing was easy, I ran the problematic query manually (it took almost 2 minutes to run). After this, the application worked again. Problem solved? Not really.

Queries of this sort should almost never exist in a system. This specific query was long since been noticed due to proactive slow-log monitoring. This problem was (ironically) fixed on the day of this incident, but the newer version was not yet deployed.

Conclusion: Think of how your application works with a cold database cache. If you need to warm it up, do so. A better path is to design the database more efficiently and avoid certain types of queries. In this case, the counter table was now updated on batches of inserts and deletes - instead of periodically.

2008-04-26

Using the MySQL Event Scheduler to Emulate Threads

The MySQL Event Schedule is one of the new features in MySQL 5.1. It is explained well in this article from the MySQL site, and in the manual itself. I'd like to demonstrate an interesting way to use it. What gave me this idea is actually a feature in SQL Server 2005, called Service Broker. It's a sort of queue manager / messaging platform, which can also be exploited in the same way.

Lets say you are writing a stored procedure, that loads data into the database. Benchmarks show that loading data in parallel from several threads increases overall throughput, as MySQL is unable to utilize several CPUs to process a single SELECT query, moreover a bulk load operation. To work in parallel, you'd probably need to write a script that uses threads, open several connections to the database, and perform those queries in parallel.

I propose an alternative.

The MySQL Event Scheduler is capable of performing several tasks at once, since it's highly probable that several events might need to run at the same time. Why not use it?
What we will create, is a "self-destructible" event, that performs a task once and disappears. The syntax that allows this is self explanatory:

CREATE EVENT IF NOT EXISTS event_name ON SCHEDULE AT NOW() ON COMPLETION NOT PRESERVE
DO statement_or_batch;

I don't think this is the place to descend into the implementation detail of using it for data loads, so I won't. Note, however, how flexible it is.

  1. You could use the event scheduler to perform several SELECT statements in parallel into some summary or union table, and collect the returned result afterwards.
  2. To see if an event is still running, you can poll the mysql.event system table for the event name, or names (when waiting for several events to complete).
  3. Combine it with the FEDERATED engine. That is one feature which is seriously in need of more development efforts. It is not even included in 5.1.24 due to severe problems. When it goes back to a stable status, you could use events to execute remote queries in your cluster in parallel - with a single command, from a single server.
  4. Use it for an unordered set of commands that your application shouldn't wait for at all, or it doesn't care when it gets done.
  5. Use it to delay execution, just replace NOW() with NOW() + INTERVAL 5 MINUTE, as demonstrated earlier.

The possibilities are endless. Well, not endless, but I'm sure that the event schedule can be tricked into doing a lot more cool things than these... If you can think of anything of that sort, I'd love to hear it.

Some points:

  1. As the mysql.events table is using the MyISAM engine, I'm not sure how transactional adding an event is. Most likely, it isn't.
  2. If you want to test the scheduler, make sure it's turned on (SET GLOBAL EVENT_SCHEDULER = ON).
  3. Each event run adds two lines to the error log (hostname.err) file reporting start and end of the event. This might be a problem similar to leaving the general log on, when using events extensively.

2008-03-20

Two MySQL Monitoring Tools and a Hidden Knowledge Base

This post is incomplete without mentioning the previous one about development tools, so make sure you read it before/after this.
In this post, I'd like to mention two monitoring tools, and one knowledge base.

Two of the tools are made by Quest Software/ToadSoft that make Toad for MySQL. Both of them are Freeware.
A side note: TOAD was born as "Tool for Oracle Application Developers". Because "Tool for Oracle Application Developers for MySQL" sounds weird, TOAD is known officially as "Tool For Application Developers".

Spotlight for MySQL
Spotlight is a tool perhaps more known among SQL Server and Oracle users. It shows you the entire flow of data throughout your database in a nicely laid out executive-summary kind of way. It has lots of colors and it's impressive to look at. When it works. Seriously, I've had a lot of trouble with it, to make it work the way it should too many times. It's also slow as hell, and tends to hang or crash, like other tools by Quest. But, when it does work - it's very nice. Quest released this tool for MySQL as well, and since it's free it doesn't hurt to try it out.

Knowledge Xpert for MySQL
This is basically a well laid-out knowledge base, of what Quest Software thinks you should know about MySQL. Full of useful examples, and clear explanations for things that the MySQL manual doesn't mention. I recommend installing it and taking a look, you might find something interesting. It was helpful to me when I started learning MySQL.
Warning: practically this is just one big help file, but Quest decided to bundle an annoying interface, a slow installer and one of those load-on-startup-take-memory-do-nothing-system-tray-icon which you need to get rid off after installation.

The next one is made by Webyog, (the same guys that make SQLyog which I mentioned before).

MONyog
MONyog is a small program with a web front-end which can monitor several MySQL servers and supports some customizable metrics. I won't go into detail on the good and bad things in MONyog here (I plan to write a review of it later on), but I do suggest you try the trial version. MONyog is not free, and doesn't have a free or community "lite" version.
At the time of writing the latest stable version is 1.52, but 2.0 beta 2 is already available and brings additional features like historical reports and better graphs.

2008-03-06

Misleading Error Message When Running 32bit Applications on 64bit Ubuntu/Linux

Short version (aka, Executive Summary)

If you try to run a 32bit program under 64bit Ubuntu, there is a chance you might see this misleading error message:

-bash: ./executable_filename: No such file or directory

What the shell is actually trying to say is "You can't run 32 bit applications here without installing support for 32 applications!".

Longer version

We purchased the Webyog's MONyog MySQL monitoring tool, after some trial version testing in a Windows environment. When trying to run it on one of our Ubuntu servers, I got this:

myuser@myserver:~/MONyog/bin$ ./MONyog-bin
-bash: ./MONyog-bin: No such file or directory

I checked again: the file was there, permissions were okay. Hmm. Why was bash saying the file wasn't there? Here it is! It's not like the error came from the application, it was bash itself saying that it's not found.
I googled for a while till I opened a support ticket at Webyog, and this is how I found out the real reason. The solution is to install support for 32 bit applications which is package called ia32-libs-sdl (which also installs a number of prerequisites). Once it's installed, the program works.

I'll use this opportunity to praise Webyog's support, which was fast and to the point. I got my initial reply with an explanation after few hours, and the solution the next day - without having to chase them for it.

2008-02-28

MySQL Date and Time Arithmetic

One thing that sometimes bugs me in MySQL is the huge amount of errata. The database is packed with features and abilities that are side-effects from "historical decisions". Decisions that were made when it wasn't planned to be an enterprise product, or changes made to ease porting applications to MySQL from other databases. Legacy, so to speak.

The timestamp data type is a good example - the timestamp manual section is a long list of exceptions and special cases. I usually re-read the manual page before using it. Another good example is the amount of date and time functions that exist in MySQL. There are way too many of them, some overlapping in function, and some just exist as aliases to other functions for compatibility. Just go over the list.

On the other hand, this artistic freedom has its merits.
For example, take the date arithmetic. Chances are that if you came from SQL Server background like me, you were used to all the variations on the DATEDIFF and DATEADD commands. While it works as well (the DATEADD function is there and works almost the same way), MySQL supports a much readable form of date and time arithmetic.

For example:

SELECT NOW() - INTERVAL 5 MINUTE

Will return, naturally, current date+time minus 5 minutes. Or:

SELECT NOW() + INTERVAL 10 DAY

You guessed it, this will return the current date+time plus 10 days.

Clean and simple, isn't it? The syntax goes like this:

date + INTERVAL expression unit

The unit can be MINUTE, HOUR, DAY, MONTH, YEAR etc. The expression can be any legal T-SQL expression, and the date is just the date variable, which can be a result from another expression, a variable or a table column. Check out the manual for more examples.

This is an elegant solution to a common problem, which is also easy to remember and use. Why so few people use it is a mystery to me - I've almost never seen it in code samples.

2008-02-21

Hello Planet MySQL!

My blog been added several days ago to the Planet MySQL feed, and am now one step closer to world domination.

I'll assume that most people who see this entry at their feed readers would be unfamiliar with this blog, so I should be including an introduction here. Instead, I invite you to take a look at the home page and look at some olds posts of mine that might interest you. Not everything is about MySQL, and the Planet MySQL feed takes only the ones that I've labeled as such.

The following posts are the top search engine keywords that this blog gets, so you might want to start there:

I wonder how many comments of this sort I'll get?

What do you want me to do?  LEAVE?  Then they'll keep being wrong!

2008-02-17

MySQL 5.1.23 Regression Bugs

I've started testing the new Release Candidate (5.1.23) today. This was a version with a huge amount of fixes relative to latest MySQL releases - but I didn't expect to run into two bugs this fast.

  • Bug #34626 which I just filed in today and was already verified. This is a basic scenario which is used in the deployment scripts we use at Delver. Just try and pipe an empty file into the mysql command line client - and it crashes.
  • Bug #32633 which was apparently fixed in 5.1.24, and should have also presented itself in 5.1.22, but didn't for some reason. The upgrade brought it up.
    This one basically means that you just can't create stored procedures at all, unless you clean the sql_mode variable from certain tags. Which was really frustrating to find. I almost filed another bug but managed to find someone who already filed a duplicate 6 days ago.
    The SQL Mode is a session/global variable that determines which SQL compatibility mode the current session works as. More details here if it interests you.
I could say that I'm doing the community a favor by testing products or whatnot, but I'd just prefer the QA procedures at MySQL AB to be a bit stricter - I consider these as really trivial scenarios.
Conclusion: new versions fix old bugs and bring new ones. Trust no one. Test for yourself.

2008-01-27

Another Nasty Bug Squashed

I was so damn proud of myself to stumble upon a cool critical bug in MySQL, only to find it was a duplicate of another bug discovered 3 days earlier by someone else, and not just someone, but Paul DuBois, the author of no less than 3 books about MySQL and the likes.
My test case was simpler though - judge for yourself (#32376 vs #32260).

It goes like this - Find a MySQL 5.1.22 (or less). Connect. Run the next lines of code. Server is down. Simple, eh?

create table test.t(c int) engine = innodb;

select @var := c as c
from t
order by c asc;

This is documented as a casual "Some uses of user variables in a query could result in a server crash." in the 5.1.23 change log. Speaking of which, there is a new change log for 5.1.24 already, so this means 5.1.23 will be released really soon.

2008-01-22

Comments in Stored Procedures - Fixed in MySQL 5.1.23

I've written before in the big Why not learn MySQL blog post, that comments in Stored Procedures in MySQL are useless.
The reason is simple - the backup/export/dump procedure does dump them into the file correctly, but when loading the file by piping it into the mysql command line client - the comments get stripped. This was a sort of an "optimization" in the client, which is now configurable.

The 5.1.23 change log says so:

mysql stripped comments from statements sent to the server. Now the --comments or --skip-comments option can be used to control whether to retain or strip comments. The default is --skip-comments. (Bug#11230, Bug#26215)

Too bad it's still not released yet, it's in development since September 2007 and the change log is already huge in comparison to the other 5.1.x releases.

2007-12-19

MySQL T-SQL CASE Statement: Bad Language Design

Imagine you want to do something like this in a stored procedure:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
END CASE;

And it works fine for a while until you get and error:

ERROR 1339 (20000): Case not found for CASE statement

This happens due to the fact there is no WHEN clause to handle this specific value. It's unlike the regular switch clause logic in C, but it seems reasonable. So I've tried adding an empty ELSE clause that would do nothing:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
ELSE --or ELSE; or ELSE NULL; or any combination you might think of
END CASE;

Nope, we get the usual "You have an error in your SQL syntax; check the manual....". Naturally, the manual didn't say anything about it.
First I did some dummy SET @tmp = NULL; assignment to fill the ELSE clause, but then I did some research, and got to Bug #14042.

Short version:

The "correct" way to do this, is like so:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
ELSE BEGIN END;
END CASE;

Longer version:

When found this bug, it was a low severity feature request that asked for the reasonable thing - just remove this error check and exit the CASE. Unfortunately, the MySQL decision was to mark it as "Won't fix" and close the bug.

It seemed to me a really twisted logic, but I wanted others to know about it, so I've added a comment on the MySQL Manual page with a short explanation and example.
After about 30 minutes (which is a really fast response time from the moderator) I got an email stating that my comment was removed, and that now the bug is reopened as a "Documentation Bug". The manual is still not updated with this issue, but I guess it will be added sometime soon.

Apparently if something is a really well documented bug with a well known workaround, it becomes a feature.

Update: The bug was closed and the manual page now includes an explanation. Also, it seems that this is not the fault of MySQL at all- this is actually a standards-compliant feature (take a look at the comments).

2007-12-10

MySQL Development Tools - The Good, The Bad and The Ugly

One of the first things I did when I started learning MySQL is to find a decent GUI tool to administer the server, since I didn't think that the bundled MySQL Query Browser/MySQL Administrator is what most people use for complex work. Luckily, I was right.
There is a huge variety of MySQL development tools - some better and some worse. I'd prefer if I only had one tool that does it all - but if that's impossible, the important thing now is to pick the right one for the job.

Here it is, the (nearly) complete list of 3rd party tools. It took me a while to find out about all of them - so I hope this saves people quite some time there.

Here are the best ones I use on a regular basis, each one is good for different set of tasks:

  • SQLyog by Webyog (commercial, with free community version)
    Best combination - simple, fast, does just what it should and doesn't get in your way. Plus, the community version is free (except a small nag-screen). I use it most of the time and highly recommend it for most uses. Something that really annoys me though is that it looks like it has hanged when running long queries - though it doesn't really.
  • Toad for MySQL by ToadSoft/Quest Software (freeware)
    Ultra feature rich, and free. This one has it all. Very, very, very rich with features, from debugger to custom script generator ala SQL Server DTS.
    The problem: it has it all. Takes time to understand how it all works, and it's overkill for most users. It's completely riddled with bugs. Version 3.1 helped the stability yes still the mean-time-between-crashes is about 30 minutes. I really hope that some future version 3.x will make this one stable. When it doesn't crash, it makes good use of it's multi-threaded architecture, and it's very responsive when executing long queries.
  • MyDeveloper Studio by Core Lab (commercial)
    High quality, "Visual Studio"-like UI. Fast and efficient. Great solid and fast user interface, pleasure to use.
    We've actually got a license to this one, since it's the best program that includes a built-in debugger, and it was before ToadSoft released a version with a debugger as well. I use it all the time when I'm writing complex Stored Procedures.
  • MySQL Administrator
    It may look simple at first, but this little puppy is very powerful.
    I recommend you download this XML definition file for the health monitor - it really helps a lot to understanding what's going on inside your server.

I've tried these ones out, but didn't like them:

  • HeidiSQL (freeware)
    Lacks some advanced features, but stable and friendly.
  • MySQL DreamCoder (commercial, with freeware version)
    Doesn't do anything special, could be used instead of SQLyog I guess.
  • Navicat for MySQL (commercial, with a free "lite" version)
    I didn't like the interface, but overall the program is quite solid.
  • SQL-Front (commercial)
    Seems too rough round the edges. On the plus side, it is one of the few tools I've seen that support partitions. Problem is, if you need partitions your schema's data is too big to be edited using this tool (or almost any other) and you just need a SQL command prompt which any tool can give you.
  • DBArtisan by Embarcadero Technologies (commercial)
    Bit of a bloat-ware. A 165MB install just doesn't make sense to me.
    You can control any database you want, including Sybase, SQL Server, Oracle and whatever. Problem is with tools of this kind is that they tend to be too general and not really tailored to MySQL.
  • Aqua Data Studio (commercial)
    Another one of those control-all-your-different-databases-from-one-tool approaches.
  • Database Workbench (commercial, with lite version coming soon?)
    Functional but not exceptional. This is another tool that can administer all your databases from a single point. Has some nice touches like auto-generating Stored Procedures statements for DML, and the GUI is fast and clean. The free version could prove to be a SQLYog replacement.

Web based tools (I don't use them myself):

I didn't even try these ones out, since the website and the screenshots/descriptions didn't convince me I should:

That's really a lot of different tools. I hope I got all the important ones here.

Now, you can be one of them Open Source hippies and say that "It's amazing how the open source world nurtured an entire thriving ecosystem of wonderful free and commercial software around MySQL to enhance productivity and promote cultural diversity for the good of the world and preservation of endangered species" and all that stuff.
But if you're like me, you'd prefer there to be just one tool, that does everything just right. I know I do. Perhaps it's because I was born in communist Russia and I like the unified-centralistic approach to things? Perhaps it's because none of them is really feature-complete, stable and usable the way I'd like it to be? I guess we'll never know.
SQL Server Management Studio - I miss you.

2007-12-08

Why not to learn MySQL?

There Ain't No Such Thing As A Free Lunch.
The price of a product, is only part of it's value. While not paying for the server itself, you end up paying more in development costs, as expected. It all makes perfect sense, and I'm certain everyone reading these words understands this without me explaining the obvious.

What I do intend to mention, are the specific shortcomings and limitations of MySQL, mostly compared to SQL Server - as this is the background I'm coming from.
Let's divide this to development tools, the database itself, and other stuff.

Development Tools
MySQL AB provides: MySQL Administrator, Query Browser and now the new MySQL Workbench.
There is also a myriad of 3rd party development tools for MySQL, but after getting used to the comforts of SQL Server's Management Studio, I can't say that any of them is truly on par with the Microsoft standard of usability. This is a lot better than Oracle's way, which is providing no tools at all except the dreaded SQL*Plus and relying on 3rd party products.
I intend to go in depth about development tools for MySQL in a separate post, so lets leave it at that.

Database Engine
MySQL in many ways is still behind the big commercial databases.
Most notable drawbacks/limitations, which no one would know when first researching about MySQL:

  • Partitioning is nice in theory, in practice it's very basic.
    • Can't move partitions between tables.
    • Can't disconnect partitions from a table.
    • No parallelism in queries over partitioned tables.
  • Stored Procedures are nice in theory, while badly implemented.
    • Can't store comments in Stored Procedures (to be more precise, you can, but when you dump & restore a database - you lose them, so it's a bit pointless).
    • T-SQL "compiler":
      • Requires changing delimiter to be able to load a stored procedure code. To me, this looks like a nasty hack. If SQL Server can do it, so should MySQL.
      • Most of the compilation errors are the "general error - RTFM" type, which point you to the manual - and most of the time doesn't even tell you which row of the stored procedure is the problematic one.
    • No built-in debug ability, unless using development tools that do clever tricks.
    • All the stored procedure code is stored at the "mysql" schema, and not in the correct database.
  • No computed columns support, not even talking about persistent computed columns.
  • Subqueries are nice in theory, while still badly implemented.
    • Most subqueries do just what they're told, and not translated by the optimizer into join-like queries, resulting in poor performance.
  • Explain plans are hard to read and understand - compared the the beautiful explains that the SQL Server tools generate.
  • No parallelism in the query level - each query is one thread.
    • Bad for data warehouses - won't help that big GROUP BY query to have 1 CPU at 100% while the others are idling.
    • Higher latency for heavy queries.
  • Simplified security model
    • No schema vs. database differentiation
    • No groups.
  • Any schema change requires rebuilding entire table - even if it's a simple metadata change. Adding a simple nullable column is as heavyweight as rebuilding the entire table.
  • No online table maintenance - table is locked during index rebuilds (online table maintenance is new in SQL Server 2005 Enterprise, so this is forgivable).
  • Can't build index using sort - adding a new index is slow.
  • Does not support hash joins.
  • Can't detach & attach InnoDB databases between instances (although it is possible with MyISAM)
  • The FEDERATED table type is completely broken as I see it: the code executed on the remote server is horrible, the feature is poorly documented, transactions do not work. Don't expect an efficient "Linked Server" implementation like in SQL Server.
  • InnoDB tables can't be Clustered (or Index Organized in Oracle-speak) by any other than the primary key. So you can't cluster a table by a arbitrary DateTime column, for example.
  • Replication is far from being in the same level of flexibility as SQL Server's.
  • Backup is problematic if you're used to SQL Server
    • Backups by mysqldump are slow and are just a list of SQL statements to recreate the database. Nothing like the binary backup files in SQL Server.
    • No hot backup at all for MyISAM (you can have "warm backup" if you stop the writes and flush data in memory to disk)
    • No built-in hot backup for InnoDB, unless you pay for Innobase's backup solution.
    • Restore is not a binary restore of the database files, but a full reconstruction of the entire database. This is faster than you might think, but still very slow.
    • There's a good explanation of the problems here at the Zmanda Team Blog.
Other
  • Books Online/MSDN is amazingly better than the MySQL online manual. One of MySQL Enterprise's big features is access to a high quality Knowledge Base - 'nuff said.
  • No neat bundled features like Analysis Services, or Reporting Services.
Well, I think this is about it.
This post seemed like a rant when I read it all over again - maybe because those are things I was expecting, and was surprised to be missing/broken/bad in a product which sounded so perfect when reading the "New Features in MySQL 5.1" type executive summaries.

So, consider yourself warned - and learn MySQL anyway, you know you should.

2007-12-07

Why learn MySQL?

I've been a DBA for SQL Server databases since 2003, but always wanted to learn MySQL.
There is something cool in the fact that it's Open Source. Well, "cool" is actually a nice definition, and also is the all time favorite "warm fuzzy feeling", but if you think about it, it breaks down into the following (at least for me):

From the DBA/Developer perspective:

  • You know you have nothing to worry about when you're installing another server.
  • All the features are available - you don't need to make sure that you have that Visual Studio Ultra-Duper edition with Super-Amazing addon license.
  • The bug-tracking system is ugly, but it's there out in the open, the way things work is transparent and you really feel that you can contribute to the product (my first critical bug in MySQL!).
  • You grow with the software - once something new is added, it's there, forever. You don't have the "Enterprise Edition vs Standard Edition" comparison tables.
  • You get to learn the Linux ecosystem. For someone like me who came from the Windows world, this is a big plus.
  • If you can do with Open Source what other people can do with proprietary - some of the arbitrage between the business cost of the two solutions goes your way.
From the Management/Business perspective:
  • It's free.
  • Really, it's free! The more servers you have, the more money you save. For example, one CPU (not core) license for SQL Server Enterprise is around $20,000.
  • You get to say you're just like all the other Web 2.0 startups, and use MySQL. Hey, even the big mighty Google uses MySQL.
  • It's free.
I'm ignoring the MySQL Enterprise offering, since the way i see it, it's really a support contract. You give money, and get a more orderly flow of version updates, some monitoring tools, and support. None of those you get when you pay for SQL Server or Oracle basic licenses.

In the next post - why not to learn MySQL.