2007-12-19

MySQL T-SQL CASE Statement: Bad Language Design

Imagine you want to do something like this in a stored procedure:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
END CASE;

And it works fine for a while until you get and error:

ERROR 1339 (20000): Case not found for CASE statement

This happens due to the fact there is no WHEN clause to handle this specific value. It's unlike the regular switch clause logic in C, but it seems reasonable. So I've tried adding an empty ELSE clause that would do nothing:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
ELSE --or ELSE; or ELSE NULL; or any combination you might think of
END CASE;

Nope, we get the usual "You have an error in your SQL syntax; check the manual....". Naturally, the manual didn't say anything about it.
First I did some dummy SET @tmp = NULL; assignment to fill the ELSE clause, but then I did some research, and got to Bug #14042.

Short version:

The "correct" way to do this, is like so:

SET @var = SELECT var FROM t; --some generic value
CASE
WHEN @var = 1 THEN CALL do_something();
WHEN @var = 2 THEN CALL do_something_else();
ELSE BEGIN END;
END CASE;

Longer version:

When found this bug, it was a low severity feature request that asked for the reasonable thing - just remove this error check and exit the CASE. Unfortunately, the MySQL decision was to mark it as "Won't fix" and close the bug.

It seemed to me a really twisted logic, but I wanted others to know about it, so I've added a comment on the MySQL Manual page with a short explanation and example.
After about 30 minutes (which is a really fast response time from the moderator) I got an email stating that my comment was removed, and that now the bug is reopened as a "Documentation Bug". The manual is still not updated with this issue, but I guess it will be added sometime soon.

Apparently if something is a really well documented bug with a well known workaround, it becomes a feature.

Update: The bug was closed and the manual page now includes an explanation. Also, it seems that this is not the fault of MySQL at all- this is actually a standards-compliant feature (take a look at the comments).

2007-12-14

Amazon Announced SimpleDB

Amazon just added a new product to their cloud computing offering besides EC2 and S3 - the SimpleDB. It's quite an interesting idea.  It's a key-value store, which can also understand the internal structure of the value column. They have a  new set of names for old ideas. translated from RDBMS speak - tables are "domains", rows are "items", values are "attributes". You can get an item by it's identifier (classic key-value store example) and you can also query the "domain" for items with specific "attributes".

I really wonder how fast it's going to be. They are kind of vague on the performance, except the fact they promise it to be "quick", "fast", "high performance" and "real time", but only if you promise to be good and put everything on their cool SLA-less computing cloud.

Interesting notes:

  • The data is limited to 10GB for the duration of the beta.
  • They mention this interesting fact: "Amazon S3 and Amazon SimpleDB use different types of physical storage.  Amazon S3 uses dense storage drives that are optimized for storing larger objects inexpensively.  Amazon SimpleDB stores smaller bits of data and uses less dense drives that are optimized for data access speed." In other words, they use fast 10/15K RPM SCSI drives for the DB and cheap 7200 SATA drives for S3.

Can we do this using an RDBMS?

Yes, you can. You could mimic this system right now by using an RDBMS, up to a degree.

"Amazon SimpleDB automatically indexes all of your data, enabling you to easily query for an item based on attributes and their values.  In the above example, you could submit a query for items where (color = blue AND description = dress shirt), and Amazon SimpleDB would quickly return item 456 as the result." 
Create a table with ID,XML columns. Scan the XML column by using an XPath query, for matching items. In MySQL this can be achieved even now - albeit slowly, since each such action is a heavy table scan with an XPath expression processed for each row. In SQL Server, there is a new XML index type since 2005, which might do just that. It's slower than regular indices on columns, but it works.

2007-12-13

Hi-Tech Insulation From the Elements

Technology keeps surprising me. image

I spend nearly 4 hours a day in trains/buses or riding my motorcycle to work and back. It goes like this:

  1. Motorcycle or bus if raining, to the train station in Haifa.
  2. Train or bus to Herzelia.
  3. Walk to work for about 1.8km from the train station.

Read backwards for the return route.

Thinking like a DBA, I'm always looking for ways to optimize everything, commuting included. This time I decided I need a piece of clothing that fits my so called "lifestyle", with the following seemingly impossible combination of requirements:

  • Good wind protection (while riding the motorcycle for 10 km a day to the train and back)
  • Very Warm (not to freeze when riding the bike, or when walking to work and back)
  • Very Light (the weather in Israel can change it's mood between morning and evening, so it can't be too heavy to carry in my laptop backpack)
  • Water repellant (for those long walks by the Herzelian highways while it's raining)
  • Stowable (so that it will fit in the backpack) - and that's a real word by the way.

So, after bit of searching (about 10 minutes), I've bought myself this piece of modern technology - The North Face "Red Point Jacket".
It's extremely light and extremely insulating at the same time. It's also extremely expensive. No wonder it's marketed as "ideal for extreme mountain sports". How rad am I?!

Those North Face dudes really know their stuff.

P.S.
I really hate spending that much time on the road - but on the other hand if I didn't, I wouldn't have the time to watch House and to write stupid blog posts about apparel.

2007-12-10

MySQL Development Tools - The Good, The Bad and The Ugly

One of the first things I did when I started learning MySQL is to find a decent GUI tool to administer the server, since I didn't think that the bundled MySQL Query Browser/MySQL Administrator is what most people use for complex work. Luckily, I was right.
There is a huge variety of MySQL development tools - some better and some worse. I'd prefer if I only had one tool that does it all - but if that's impossible, the important thing now is to pick the right one for the job.
Here it is, the (nearly) complete list of 3rd party tools. It took me a while to find out about all of them - so I hope this saves people quite some time there.
Here are the best ones I use on a regular basis, each one is good for different set of tasks:
  • SQLyog by Webyog (commercial, with free community version)
    Best combination - simple, fast, does just what it should and doesn't get in your way. Plus, the community version is free (except a small nag-screen). I use it most of the time and highly recommend it for most uses. Something that really annoys me though is that it looks like it has hanged when running long queries - though it doesn't really.
  • Toad for MySQL by ToadSoft/Quest Software (freeware)
    Ultra feature rich, and free. This one has it all. Very, very, very rich with features, from debugger to custom script generator ala SQL Server DTS.
    The problem: it has it all. Takes time to understand how it all works, and it's overkill for most users. It's completely riddled with bugs. Version 3.1 helped the stability yes still the mean-time-between-crashes is about 30 minutes. I really hope that some future version 3.x will make this one stable. When it doesn't crash, it makes good use of it's multi-threaded architecture, and it's very responsive when executing long queries.
  • MyDeveloper Studio by Devart (commercial)
    High quality, "Visual Studio"-like UI. Fast and efficient. Great solid and fast user interface, pleasure to use.
    We've actually got a license to this one, since it's the best program that includes a built-in debugger, and it was before ToadSoft released a version with a debugger as well. I use it all the time when I'm writing complex Stored Procedures.
  • MySQL Administrator
    It may look simple at first, but this little puppy is very powerful.
    I recommend you download this XML definition file for the health monitor - it really helps a lot to understand what's going on inside your server.
I've tried these ones out, but didn't like them:
  • HeidiSQL (freeware)
    Lacks some advanced features, but stable and friendly.
  • MySQL DreamCoder (commercial, with freeware version)
    Doesn't do anything special, could be used instead of SQLyog I guess.
  • Navicat for MySQL (commercial, with a free "lite" version)
    I didn't like the interface, but overall the program is quite solid.
  • SQL-Front (commercial)
    Seems too rough round the edges. On the plus side, it is one of the few tools I've seen that support partitions. Problem is, if you need partitions your schema's data is too big to be edited using this tool (or almost any other) and you just need a SQL command prompt which any tool can give you.
  • DBArtisan by Embarcadero Technologies (commercial)
    Bit of a bloat-ware. A 165MB install just doesn't make sense to me.
    You can control any database you want, including Sybase, SQL Server, Oracle and whatever. Problem is with tools of this kind is that they tend to be too general and not really tailored to MySQL.
  • Aqua Data Studio (commercial)
    Another one of those control-all-your-different-databases-from-one-tool approaches.
  • Database Workbench (commercial, with lite version coming soon?)
    Functional but not exceptional. This is another tool that can administer all your databases from a single point. Has some nice touches like auto-generating Stored Procedures statements for DML, and the GUI is fast and clean. The free version could prove to be a SQLYog replacement.
Web based tools (I don't use them myself):
I didn't even try these ones out, since the website and the screenshots/descriptions didn't convince me I should:
That's really a lot of different tools. I hope I got all the important ones here.
Now, you can be one of them Open Source hippies and say that "It's amazing how the open source world nurtured an entire thriving ecosystem of wonderful free and commercial software around MySQL to enhance productivity and promote cultural diversity for the good of the world and preservation of endangered species" and all that stuff.
But if you're like me, you'd prefer there to be just one tool, that does everything just right. I know I do. Perhaps it's because I was born in communist Russia and I like the unified-centralistic approach to things? Perhaps it's because none of them is really feature-complete, stable and usable the way I'd like it to be? I guess we'll never know.
SQL Server Management Studio - I miss you.

2007-12-07

Why not to learn MySQL?

There Ain't No Such Thing As A Free Lunch.
The price of a product, is only part of it's value. While not paying for the server itself, you end up paying more in development costs, as expected. It all makes perfect sense, and I'm certain everyone reading these words understands this without me explaining the obvious.

What I do intend to mention, are the specific shortcomings and limitations of MySQL, mostly compared to SQL Server - as this is the background I'm coming from.
Let's divide this to development tools, the database itself, and other stuff.

Development Tools
MySQL AB provides: MySQL Administrator, Query Browser and now the new MySQL Workbench.
There is also a myriad of 3rd party development tools for MySQL, but after getting used to the comforts of SQL Server's Management Studio, I can't say that any of them is truly on par with the Microsoft standard of usability. This is a lot better than Oracle's way, which is providing no tools at all except the dreaded SQL*Plus and relying on 3rd party products.
I intend to go in depth about development tools for MySQL in a separate post, so lets leave it at that.

Database Engine
MySQL in many ways is still behind the big commercial databases.
Most notable drawbacks/limitations, which no one would know when first researching about MySQL:
  • Partitioning is nice in theory, in practice it's very basic.
    • Can't move partitions between tables.
    • Can't disconnect partitions from a table.
    • No parallelism in queries over partitioned tables.
  • Stored Procedures are nice in theory, while badly implemented.
    • Can't store comments in Stored Procedures (to be more precise, you can, but when you dump & restore a database - you lose them, so it's a bit pointless).
    • T-SQL "compiler":
      • Requires changing delimiter to be able to load a stored procedure code. To me, this looks like a nasty hack. If SQL Server can do it, so should MySQL.
      • Most of the compilation errors are the "general error - RTFM" type, which point you to the manual - and most of the time doesn't even tell you which row of the stored procedure is the problematic one.
    • No built-in debug ability, unless using development tools that do clever tricks.
    • All the stored procedure code is stored at the "mysql" schema, and not in the correct database.
  • No computed columns support, not even talking about persistent computed columns.
  • Subqueries are nice in theory, while still badly implemented.
    • Most subqueries do just what they're told, and not translated by the optimizer into join-like queries, resulting in poor performance.
  • Explain plans are hard to read and understand - compared the the beautiful explains that the SQL Server tools generate.
  • No parallelism in the query level - each query is one thread.
    • Bad for data warehouses - won't help that big GROUP BY query to have 1 CPU at 100% while the others are idling.
    • Higher latency for heavy queries.
  • Simplified security model
    • No schema vs. database differentiation
    • No groups.
  • Any schema change requires rebuilding entire table - even if it's a simple metadata change. Adding a simple nullable column is as heavyweight as rebuilding the entire table.
  • No online table maintenance - table is locked during index rebuilds (online table maintenance is new in SQL Server 2005 Enterprise, so this is forgivable).
  • Can't build index using sort - adding a new index is slow.
  • Does not support hash joins.
  • Can't detach & attach InnoDB databases between instances (although it is possible with MyISAM)
  • The FEDERATED table type is completely broken as I see it: the code executed on the remote server is horrible, the feature is poorly documented, transactions do not work. Don't expect an efficient "Linked Server" implementation like in SQL Server.
  • InnoDB tables can't be Clustered (or Index Organized in Oracle-speak) by any other than the primary key. So you can't cluster a table by a arbitrary DateTime column, for example.
  • Replication is far from being in the same level of flexibility as SQL Server's.
  • Backup is problematic if you're used to SQL Server
    • Backups by mysqldump are slow and are just a list of SQL statements to recreate the database. Nothing like the binary backup files in SQL Server.
    • No hot backup at all for MyISAM (you can have "warm backup" if you stop the writes and flush data in memory to disk)
    • No built-in hot backup for InnoDB, unless you pay for Innobase's backup solution.
    • Restore is not a binary restore of the database files, but a full reconstruction of the entire database. This is faster than you might think, but still very slow.
    • There's a good explanation of the problems here at the Zmanda Team Blog.
Other
  • Books Online/MSDN is amazingly better than the MySQL online manual. One of MySQL Enterprise's big features is access to a high quality Knowledge Base - 'nuff said.
  • No neat bundled features like Analysis Services, or Reporting Services.
Well, I think this is about it.
This post seemed like a rant when I read it all over again - maybe because those are things I was expecting, and was surprised to be missing/broken/bad in a product which sounded so perfect when reading the "New Features in MySQL 5.1" type executive summaries.

So, consider yourself warned - and learn MySQL anyway, you know you should.

Why learn MySQL?

I've been a DBA for SQL Server databases since 2003, but always wanted to learn MySQL.
There is something cool in the fact that it's Open Source. Well, "cool" is actually a nice definition, and also is the all time favorite "warm fuzzy feeling", but if you think about it, it breaks down into the following (at least for me):

From the DBA/Developer perspective:
  • You know you have nothing to worry about when you're installing another server.
  • All the features are available - you don't need to make sure that you have that Visual Studio Ultra-Duper edition with Super-Amazing addon license.
  • The bug-tracking system is ugly, but it's there out in the open, the way things work is transparent and you really feel that you can contribute to the product (my first critical bug in MySQL!).
  • You grow with the software - once something new is added, it's there, forever. You don't have the "Enterprise Edition vs Standard Edition" comparison tables.
  • You get to learn the Linux ecosystem. For someone like me who came from the Windows world, this is a big plus.
  • If you can do with Open Source what other people can do with proprietary - some of the arbitrage between the business cost of the two solutions goes your way.
From the Management/Business perspective:
  • It's free.
  • Really, it's free! The more servers you have, the more money you save. For example, one CPU (not core) license for SQL Server Enterprise is around $20,000.
  • You get to say you're just like all the other Web 2.0 startups, and use MySQL. Hey, even the big mighty Google uses MySQL.
  • It's free.
I'm ignoring the MySQL Enterprise offering, since the way i see it, it's really a support contract. You give money, and get a more orderly flow of version updates, some monitoring tools, and support. None of those you get when you pay for SQL Server or Oracle basic licenses.

In the next post - why not to learn MySQL.

2007-12-05

First Post

Well, I had to start a blog one day.

I'm gonna try out some "backup-posts" at first, and see how it goes. You know, the kind of things I'd be happy to find online, when I was searching for them.
At the moment I'm working at Semingo as a MySQL DBA (hence the title), so I'd expect mainly MySQL related professional stuff to appear here for a while. But then - who knows?