2008-11-03

Emulating MySQL’s GROUP_CONCAT() Function in SQL Server 2005

Sometimes, the small things make all the difference. Last month, a friend of mine who works at Intel asked me how to do something seemingly easy – concatenate the rows eliminated during a GROUP BY and present them in a column of their own. This comes up all the time in reports, and just makes things easier for everyone.
The answer? GROUP_CONCAT of course. This could have been easy. Alas, they were working with SQL Server 2005, which does not have such a function.
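For reference, this is what the MySQL version of the task looks like; my_table, group_key and my_column are placeholder names:

```sql
-- MySQL: one row per group, with the grouped-away values in one column
SELECT group_key,
       GROUP_CONCAT(my_column SEPARATOR ', ') AS my_values
FROM   my_table
GROUP  BY group_key;
```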
Now, none of the above is exactly news – there are tons of articles and blog posts about it on the web. Some give you the code for special UDFs and some give you solutions that use cursors. A three-year-old post by Xaprb suggests using local variables.
I suggest a different approach based on a “hack” which utilizes the new xml functions in SQL Server 2005 to concatenate the values in a column.
It goes like this:
SELECT my_column AS [text()]
FROM   my_table
FOR XML PATH('')
This gives you all of the column's values concatenated into a single string. From here on, it's all just simple SQL. I have made two complete examples, in the order I tried them. I've used the information_schema tables, since they exist everywhere.
First intuitive version:
SELECT table_name,
       LEFT(column_names,LEN(column_names) - 1)   AS column_names
FROM   (SELECT table_name,
               (SELECT column_name + ',' AS [text()]
                FROM   information_schema.columns AS internal
                WHERE  internal.table_name = table_names.table_name
                FOR xml PATH ('')
               ) AS column_names
        FROM   (SELECT   table_name
                FROM     information_schema.columns
                GROUP BY table_name) AS table_names) AS pre_trimmed;
Second version (admittedly inspired by this post, which I stumbled on after writing the first version):
SELECT table_name,
       LEFT(column_names,LEN(column_names) - 1)   AS column_names
FROM   information_schema.columns AS extern
       CROSS APPLY (SELECT column_name + ','
                    FROM   information_schema.columns AS intern
                    WHERE  extern.table_name = intern.table_name
                    FOR XML PATH('')
                   ) pre_trimmed (column_names)
GROUP BY table_name, column_names;
According to the execution plan, the second (CROSS APPLY) version is supposedly slower, but I didn't run any benchmarks.

I hope this saves someone several hours of work :)

2008-10-19

Problems Uninstalling MySQL Connector .NET

This is a short post that might save someone some valuable time, if Google decides to rank it high enough.

I tried to install a newer version of MySQL Connector .NET – 5.2.3 instead of the old 5.1.3 I had installed.
When trying to install 5.2.3, I got this error message:


Apparently the connector does not support upgrades from 5.1.x to 5.2.x; you just have to remove the old one first.

Here lies the problem: when I tried removing the old 5.1.3, I got a weird error of which I took no screenshot. It consisted of a blank error message showing a computer screen with an icon of a moon on it, something resembling a "sleep mode" icon. Huh?


Anyway, to make a long story short, nothing I did to remove the 5.1.3 gracefully worked. I tried using the old .MSI file, tried reinstalling, changing the install configuration, editing registry keys... I also tried using msizap.exe. Nothing worked.

The solution was to use this tool from Microsoft - called "Windows Installer CleanUp Utility". It's supposed to be a wrapper for msizap, but it did the job, while msizap did not.

Regardless of MySQL, this is a good tool to be familiar with. It removed 3 other entries in my "Programs and Features" list that refused to leave by themselves.

Update: I've had to solve this problem a few more times since posting. As it seems, this "Windows Installer CleanUp Utility" is basically a wrapper for msizap, so it probably uses some other parameters that make it work. I've also hit a case where none of the tools worked, but a registry "Seek & Destroy" for "MySQL Connector" did the job. Go figure...

2008-07-25

Recommendation: PuTTY Connection Manager

Linux desktop users: this post is probably not for you. You've got like a gazillion of these tools available.

Windows users: If you're like me, and use PuTTY all the time to manage your Linux MySQL servers, you'll appreciate this gem. It's called PuTTY Connection Manager and it, obviously, manages PuTTY connections. Not a very creative name, but a fine piece of software.

It's actively developed (latest alpha version out about two months ago), and boasts these features (taken from the developer website):
  • Tabs and dockable windows for PuTTY instances.
  • Fully compatible with PuTTY configuration (using registry).
  • Easily customizable to optimize workspace (fullscreen, minimize to tray, add/remove toolbar, etc...).
  • Automatic login feature regardless to protocol restrictions (user keyboard simulation).
  • Post-login commands (execute any shell command when logged).
  • Connection Manager : Manage a large number of connections with specific configuration (auto-login, specific PuTTY Session, post-command, etc...).
  • Quick connect toolbar to quickly launch a PuTTY connection.
  • Import/Export whole connections informations to XML format (generate your configuration automatically from another tool and import it, or export your configuration for backup purpose).
  • Encrypted configuration database option available to store connections informations safely (external library supporting AES algorithm used with key sizes of 128, 192 and 256 bits, please refer for the legal status of encryption software in your country).
  • Standalone executable, no setup required.
  • Completely free for commercial and personal use : PuTTY Connection Manager is freeware.
 Things that I personally liked about it:
  1. It's fast, since it's only a reasonably lightweight wrapper for the already fast PuTTY.
  2. It can save a portable connection database in a file, unlike PuTTY, which uses the registry for its connections.
  3. It looks and feels like the Visual Studio window manager, so you can arrange and dock your PuTTY windows in all sorts of creative rectangular ways.
  4. It is relatively feature rich, but it doesn't get in your way, and just works most of the time.
  5. It's completely free.
Get it here.

Alternatives:

If you don't like it, there is an alternative that I personally like less for SSH-only work, called mRemote. It is much more advanced and has support for VNC and RDP, but I prefer the simpler and more flexible interface of PuTTY Connection Manager.

2008-06-17

How to Truncate All or Some of the Tables in a MySQL Database

Truncating all tables in a database is a common problem which arises a lot during testing or debugging.
One of the most common answers to this question is to drop & recreate the database, most likely utilizing the shell. For example, something like this:

mysqldump --add-drop-table --no-data [dbname] | mysql [dbname]

This dumps the entire schema structure, without any data, with commands for dropping existing tables. Loading it back into mysql essentially truncates all the tables in the database. This is a decent solution for many uses.

We had a requirement for a solution that needed these additional features:

  1. Does not require shell (We work with both Linux and Windows)
  2. Resides inside the MySQL Server (To minimize outside dependencies - for example - the mysql command line client)
  3. Can truncate only specified tables using some sort of filter

In other words, the main requirement here is encapsulation and simplicity of use for the developers.
Basically, this means I've got no choice but to write a stored procedure for this purpose, ugly as it may be. Specifically, this requires using MySQL's pseudo-dynamic SQL (more about this later) and a server-side cursor.

The code of the Stored Procedure:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN
  DECLARE done BOOL DEFAULT FALSE;
  DECLARE truncate_command VARCHAR(512);
  /* This is the query which selects the tables we want to truncate */
  DECLARE truncate_cur CURSOR FOR
    SELECT CONCAT('TRUNCATE TABLE ', table_name)
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';
  DECLARE CONTINUE HANDLER FOR SQLSTATE '02000'
    SET done = TRUE;

  OPEN truncate_cur;

  truncate_loop: LOOP
    FETCH truncate_cur INTO truncate_command;

    IF done THEN
      CLOSE truncate_cur;
      LEAVE truncate_loop;
    END IF;

    /* Main part - preparing and executing the statement */
    SET @truncate_command = truncate_command;
    PREPARE truncate_command_stmt FROM @truncate_command;
    EXECUTE truncate_command_stmt;
    DEALLOCATE PREPARE truncate_command_stmt;
  END LOOP;
END$$

DELIMITER ;

This is a generalized version, you can (and should) modify it to suit your needs.
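Once created, using it is a one-liner; this assumes the 'prefix_%' filter above matches the tables you want gone:

```sql
-- Truncates every prefix_* table in the current schema
CALL TruncateTables();
```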

Regarding dynamic SQL: initially I tried to make this a lot simpler, using something along the lines of the previous blog post. It would look like this:

DELIMITER $$
CREATE PROCEDURE TruncateTables()
BEGIN
  /* Build a long user variable containing the commands */
  SELECT GROUP_CONCAT('TRUNCATE TABLE ', table_name SEPARATOR ';')
  INTO @truncate_command
  FROM INFORMATION_SCHEMA.TABLES
  WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME LIKE 'prefix_%';

  SET @truncate_command = CONCAT(@truncate_command, ';');

  PREPARE truncate_command_stmt FROM @truncate_command;
  EXECUTE truncate_command_stmt;
END$$

DELIMITER ;

The main (and unsolvable) problem with this code is that PREPARE cannot prepare more than one statement. In other words, while in other databases you could send an entire batch (real dynamic SQL), in MySQL the only way is to run the statements one by one. On the other hand, other databases have no GROUP_CONCAT() (even though it can apparently be created with ease in PostgreSQL).

The second and less worrying problem is that by default, the GROUP_CONCAT() result is limited to a certain length (1024 bytes). This is easily solved by setting group_concat_max_len higher than the default.
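A sketch of raising that limit for the current session before building the long command string:

```sql
-- Default group_concat_max_len is 1024 bytes; raise it well above that
SET SESSION group_concat_max_len = 1024 * 1024;
```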

End Transmission.

2008-06-12

How To Perform a Connection Massacre

Occasionally, you find yourself in the need to kill problematic connections to the database.
Usually if it's only one or two connections, you can use the "SHOW PROCESSLIST" command to identify the problematic connection IDs, and run a "KILL <id>" command for each.

What do you do if you need to kill 10 connections? Or 56? I wouldn't want to type in all those kill commands; it's just dirty work. What we need is a neater way to perform those kills. A mass kill, if you wish.

Alternative way: use the INFORMATION_SCHEMA's PROCESSLIST table, to construct the kill statements semi-automatically.

SELECT CONCAT('kill ',id,';') AS kill_list
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE command='Sleep';

This select will return something like this (when using the command line client):

+-----------+
| kill_list |
+-----------+
| kill 28;  |
| kill 1;   |
+-----------+
2 rows in set (0.04 sec)

You can also use GROUP_CONCAT() to get the kill commands in the same line, which may be more useful when you can't copy&paste easily:

SELECT GROUP_CONCAT('kill ',id SEPARATOR '; ') AS kill_list
FROM INFORMATION_SCHEMA.PROCESSLIST
WHERE command='Sleep';

Returns:

+-----------------+
| kill_list       |
+-----------------+
| kill 28; kill 1 |
+-----------------+
1 row in set (0.03 sec)

Note that you can use any filter in the WHERE clause, such as "WHERE db IN ('dbname1','dbname2')", "WHERE user = 'appuser'" or "WHERE time > 600". If you have any more clever uses, feel free to post them in the comments.

Now, all that's needed is to copy & paste those kill commands into the mysql console (whichever you're using: GUI, command line, etc.) and you're done.

Since most programs are kind enough to format the output in a way that prevents convenient copying & pasting, here's another tip: the mysql command line client can be asked to strip some of the output by using the optional "-s" parameter, which stands for "silent". Use it once to remove the ASCII art, timing and row count; use it twice (-s -s) and you get only the actual rows.

An example doing just this (copy&paste friendly!):

mysql -s -s -e "SELECT CONCAT('kill ',id,';') AS kill_list FROM INFORMATION_SCHEMA.PROCESSLIST WHERE command='Sleep';"

Gets you this (with different numbers of course):

kill 28;
kill 1;

Another way to do this, is to use SQLyog or any other GUI for MySQL, which can output the results in text. Most of them have this as a configurable option (Ctrl-L keyboard shortcut in SQLyog).

Hope this will save someone some precious time.

Update:
I didn't include a suggestion on doing this automatically, since I personally prefer to run kill commands manually most of the time. But some of the comments gave me an idea: why not just pipe the stripped kill_list into mysql?
mysql -s -s -e "SELECT GROUP_CONCAT('kill ',id SEPARATOR '; ') AS kill_list FROM INFORMATION_SCHEMA.PROCESSLIST WHERE command='Sleep';" | mysql
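One caveat worth noting: the generated list may include the very connection running the SELECT. A sketch that filters it out using CONNECTION_ID():

```sql
SELECT GROUP_CONCAT('kill ', id SEPARATOR '; ') AS kill_list
FROM   INFORMATION_SCHEMA.PROCESSLIST
WHERE  command = 'Sleep'
  AND  id <> CONNECTION_ID();
```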

2008-05-31

Taking Cache Warm-up Time Into Consideration

Last week there was a short scheduled downtime for the system, and I used the chance to upgrade the MySQL servers to a newer version. We had tested it for more than a month before deploying to production, so I wasn't worried about potential problems.

The upgrade itself went smoothly and was complete in less than 10 minutes. Speaking of which, this is one of the things I like about MySQL upgrades as opposed to SQL Server:

  1. Copy new version
  2. Shut down the server
  3. Rename directories
  4. Start server
  5. Run mysql_upgrade
  6. Restart server

All of the above usually takes about 2 minutes, and can be reverted quite easily if you have a backup of the data files. Compare this to a SQL Server upgrade, or even a Service Pack install, which involves running an installer for 20 minutes or more. Uninstalling is equally lengthy, though both are accompanied by nice and useless progress bars.

After the upgrade was complete, we started the application and it wouldn't work: it was reporting timeouts. Why? Was there a problem with the upgrade? Some bug we hadn't found out about? The circumstantial evidence pointed directly at the upgrade. After some digging by Pasha, the problematic query was found. It was a GROUP BY query updating a counter table; it used to run in under 5 seconds, but now it just wouldn't finish within the 30-second timeout.

This immediately pointed to the real cause of the problem: the server restart. The buffer cache was flushed, and the index scanned by this query was no longer in memory. The fix was easy: I ran the problematic query manually (it took almost 2 minutes), and the application worked again. Problem solved? Not really.

Queries of this sort should almost never exist in a system. This specific query had long since been noticed thanks to proactive slow-log monitoring, and the problem was (ironically) fixed on the day of this incident, but the newer version was not yet deployed.

Conclusion: think about how your application behaves with a cold database cache. If you need to warm it up, do so. A better path is to design the database more efficiently and avoid such queries altogether; in this case, the counter table is now updated on batches of inserts and deletes instead of periodically.
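For the warm-up route, a minimal sketch; the table and index names here are hypothetical, and LOAD INDEX INTO CACHE applies to MyISAM tables only:

```sql
-- Touch the index pages so they are cached before traffic arrives
SELECT COUNT(*)
FROM   counters FORCE INDEX (idx_counted_at)
WHERE  counted_at > NOW() - INTERVAL 1 DAY;

-- MyISAM only: preload a table's indexes into the key cache directly
LOAD INDEX INTO CACHE counters;
```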

2008-04-26

Using the MySQL Event Scheduler to Emulate Threads

The MySQL Event Scheduler is one of the new features in MySQL 5.1. It is explained well in this article from the MySQL site, and in the manual itself. I'd like to demonstrate an interesting way to use it. What gave me this idea is actually a feature of SQL Server 2005 called Service Broker: a sort of queue manager / messaging platform, which can be exploited in the same way.

Let's say you are writing a stored procedure that loads data into the database. Benchmarks show that loading data in parallel from several threads increases overall throughput, as MySQL is unable to utilize several CPUs to process a single SELECT query, let alone a bulk load operation. To work in parallel, you'd normally have to write a script that uses threads, open several connections to the database, and run the queries concurrently.

I propose an alternative.

The MySQL Event Scheduler is capable of performing several tasks at once, since it's highly probable that several events might need to run at the same time. Why not use it?
What we will create is a "self-destructing" event that performs a task once and disappears. The syntax that allows this is self-explanatory:

CREATE EVENT IF NOT EXISTS event_name ON SCHEDULE AT NOW() ON COMPLETION NOT PRESERVE
DO statement_or_batch;

I don't think this is the place to descend into the implementation detail of using it for data loads, so I won't. Note, however, how flexible it is.

  1. You could use the event scheduler to perform several SELECT statements in parallel into some summary or union table, and collect the returned result afterwards.
  2. To see if an event is still running, you can poll the mysql.event system table for the event name, or names (when waiting for several events to complete).
  3. Combine it with the FEDERATED engine. That is one feature which is seriously in need of more development effort; it is not even included in 5.1.24 due to severe problems. When it returns to stable status, you could use events to execute remote queries in your cluster in parallel, with a single command, from a single server.
  4. Use it for an unordered set of commands that your application shouldn't wait for at all, or it doesn't care when it gets done.
  5. Use it to delay execution, just replace NOW() with NOW() + INTERVAL 5 MINUTE, as demonstrated earlier.
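To illustrate point 1 above, here is a sketch of firing two loads in parallel; LoadChunk is a hypothetical stored procedure, and both events run concurrently as soon as the scheduler picks them up:

```sql
DELIMITER $$

CREATE EVENT IF NOT EXISTS load_chunk_1
ON SCHEDULE AT NOW() ON COMPLETION NOT PRESERVE
DO CALL LoadChunk(1)$$

CREATE EVENT IF NOT EXISTS load_chunk_2
ON SCHEDULE AT NOW() ON COMPLETION NOT PRESERVE
DO CALL LoadChunk(2)$$

DELIMITER ;
```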

The possibilities are endless. Well, not endless, but I'm sure that the event scheduler can be tricked into doing a lot more cool things than these... If you can think of anything of that sort, I'd love to hear it.

Some points:

  1. As the mysql.event table uses the MyISAM engine, I'm not sure how transactional adding an event is. Most likely, it isn't.
  2. If you want to test the scheduler, make sure it's turned on (SET GLOBAL EVENT_SCHEDULER = ON).
  3. Each event run adds two lines to the error log (hostname.err) file reporting start and end of the event. This might be a problem similar to leaving the general log on, when using events extensively.

2008-03-20

Two MySQL Monitoring Tools and a Hidden Knowledge Base

This post is incomplete without mentioning the previous one about development tools, so make sure you read it before/after this.
In this post, I'd like to mention two monitoring tools, and one knowledge base.

Two of the tools are made by Quest Software/ToadSoft that make Toad for MySQL. Both of them are Freeware.
A side note: TOAD was born as "Tool for Oracle Application Developers". Because "Tool for Oracle Application Developers for MySQL" sounds weird, TOAD is known officially as "Tool For Application Developers".

Spotlight for MySQL
Spotlight is a tool perhaps better known among SQL Server and Oracle users. It shows you the entire flow of data throughout your database in a nicely laid out, executive-summary kind of way. It has lots of colors and it's impressive to look at. When it works. Seriously, I've had a lot of trouble getting it to work the way it should. It's also slow as hell, and tends to hang or crash, like other tools by Quest. But when it does work, it's very nice. Quest released this tool for MySQL as well, and since it's free, it doesn't hurt to try it out.

Knowledge Xpert for MySQL
This is basically a well laid-out knowledge base, of what Quest Software thinks you should know about MySQL. Full of useful examples, and clear explanations for things that the MySQL manual doesn't mention. I recommend installing it and taking a look, you might find something interesting. It was helpful to me when I started learning MySQL.
Warning: this is practically just one big help file, but Quest decided to bundle an annoying interface, a slow installer and one of those load-on-startup-take-memory-do-nothing system tray icons which you need to get rid of after installation.

The next one is made by Webyog (the same guys that make SQLyog, which I mentioned before).

MONyog
MONyog is a small program with a web front-end which can monitor several MySQL servers and supports some customizable metrics. I won't go into detail on the good and bad things in MONyog here (I plan to write a review of it later on), but I do suggest you try the trial version. MONyog is not free, and doesn't have a free or community "lite" version.
At the time of writing the latest stable version is 1.52, but 2.0 beta 2 is already available and brings additional features like historical reports and better graphs.

2008-03-09

SQL Server Data Services Coming Soon

Microsoft decided to jump into Amazon's territory and provide a data storage web service accessible from anywhere on the internet.
It's not live yet, and you have to register to see the beta when it comes - but it looks interesting.

Take a look at the FAQ, and note the following points (taken from it):
  • Benefit: Scale, with virtually no restriction on storage
  • Target Customers: Running applications that have limited or batched data access.
  • Typical scenarios include business solutions like HR services, healthcare records management, data archiving and internet facing applications like social networking and picture sharing.
  • The SQL Server Data Storage Service can store multiple types of data, from birth to archival. Users can upload and query structured data, semi-structured data, stored in flexible entities, and unstructured data. Customers will be able to associate entities with large unstructured data objects (blobs) which could be accessed as URL addressable resources.
  • Data is stored in large storage clusters in various Microsoft data centers located across North America. We are planning to offer the service from international locations such as Europe and Asia. Users can group their data into authorities, which are affiliated with specific data centers and therefore provide control over the geographic location of the data.
  • Easy access with REST and SOAP protocols
There is not much information available about it yet, so I'll leave my speculations for later.

2008-03-07

Formatting Large Volumes Using FAT32

First, some background (Executive Summary below)

During this weekend I purchased a Western Digital My Book Home Edition 500GB. This is the triple-interface model that includes FireWire 400 and eSATA in addition to USB 2.0.
I wanted the FireWire model since it's faster than USB 2.0, and connecting a FireWire drive in a Mac is the same as connecting a USB one. Just perform these two easy steps - connect cable, wait 3 seconds.
Anyway, this is just a prelude to the story, since this drive came pre-formatted with FAT32 to suit both Mac and Windows computers, so I didn't have to format it. Western Digital also sells some drives pre-formatted with HFS+ and some with NTFS, by the way.

The next thing I wanted to do after setting up the My Book was to format my old 60GB NTFS external drive using FAT32, so that I could use it with both computers.
I copied its contents over to the 500GB drive (OS X has read-only access to mounted NTFS volumes) and plugged it into a Vista machine to format it as one big FAT32 volume.
This is when I found out that Windows Disk Management allows you to format drives larger than 32GB using only NTFS.
So I can use a 500GB FAT32 drive in Windows, but I can't format a 60GB one. This also means, by the way, that if you re-partition a 500GB drive you bought and don't know how to bypass this limitation, you're basically doomed to creating ~15 32GB partitions, unless of course you use NTFS.

I did my research plus some trial & error, and eventually used my Macintosh to format the drive for Windows. Poetic justice, sort of.
The 500GB monster eventually got converted to HFS+, since it's about 2 times faster in writes compared to FAT32. This is no official benchmark; I just monitored the sustained read/write speeds while copying files during setup.

Executive Summary

Windows XP/2000/Vista don't allow creating FAT32 partitions larger than 32GB. This is to promote NTFS, and because FAT32 becomes slower as volume size increases. Reading and writing volumes of any size works fine.

If you want to format a drive larger than 32GB with FAT32, you have the following choices:
  1. Run the Windows XP installer from the CD - just make sure you quit it in time just after "preparing" the drive for installation.
  2. Use any Linux bootable CD - fdisk can do it. If you can boot into Linux, you can also Google for instructions about using fdisk so I'll stop here.
  3. Use the free FAT32 Formatter, a small Windows utility that does what its name implies. I didn't use it myself, but it should do the trick.
  4. Use your Mac. The Mac "Disk Utility" can format FAT32 of any size.
    Open Disk Utility, select the volume, click erase, and select "MS-DOS (FAT)" as the file system.
    Leopard also includes the fdisk utility so you can use that one if you are some kind of a masochist.
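For option 2, a sketch of the Linux route, assuming the drive shows up as /dev/sdb; double-check with fdisk -l first, since these commands are destructive:

```shell
# Create a single primary partition of type 'b' (W95 FAT32) interactively
fdisk /dev/sdb

# Then build the FAT32 file system on the new partition
mkfs.vfat -F 32 /dev/sdb1
```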
I didn't want to include detailed instructions, since I assume people should know how to use Google and common sense, but if someone has questions about anything, the comments are open as always.

P.S. I published the previous post by a mis-click, deleted it right away, checked that it was gone - only to find out that Blogger had created 3 identical posts, marked 2 as drafts, and published one anyway.

2008-03-06

Misleading Error Message When Running 32bit Applications on 64bit Ubuntu/Linux

Short version (aka, Executive Summary)

If you try to run a 32bit program under 64bit Ubuntu, there is a chance you might see this misleading error message:

-bash: ./executable_filename: No such file or directory

What the shell is actually trying to say is "You can't run 32-bit applications here without installing support for 32-bit applications!".

Longer version

We purchased Webyog's MONyog MySQL monitoring tool, after some trial version testing in a Windows environment. When trying to run it on one of our Ubuntu servers, I got this:

myuser@myserver:~/MONyog/bin$ ./MONyog-bin
-bash: ./MONyog-bin: No such file or directory

I checked again: the file was there, permissions were okay. Hmm. Why was bash saying the file wasn't there? Here it is! The error didn't come from the application; it was bash itself saying the file was not found.
I googled for a while, then opened a support ticket at Webyog, and that's how I found out the real reason. The solution is to install support for 32-bit applications, which is a package called ia32-libs-sdl (which also installs a number of prerequisites). Once it's installed, the program works.
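For completeness, the fix boils down to a single command; package names varied between Ubuntu releases (ia32-libs was the common catch-all), and ia32-libs-sdl is the one that worked here:

```shell
sudo apt-get install ia32-libs-sdl
```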

I'll use this opportunity to praise Webyog's support, which was fast and to the point. I got an initial reply with an explanation after a few hours, and the solution the next day - without having to chase them for it.

2008-02-28

MySQL Date and Time Arithmetic

One thing that sometimes bugs me in MySQL is the huge amount of legacy baggage. The database is packed with features and behaviors that are side effects of "historical decisions": decisions made before it was planned to be an enterprise product, or changes made to ease porting applications to MySQL from other databases. Legacy, so to speak.

The timestamp data type is a good example: the timestamp manual section is a long list of exceptions and special cases, and I usually re-read the manual page before using it. Another good example is the number of date and time functions in MySQL. There are way too many of them, some overlapping in function, and some existing only as aliases for compatibility. Just go over the list.

On the other hand, this artistic freedom has its merits.
For example, take date arithmetic. Chances are that if you come from a SQL Server background like me, you are used to all the variations of the DATEDIFF and DATEADD functions. While that works here too (MySQL's DATE_ADD() behaves almost the same way), MySQL also supports a much more readable form of date and time arithmetic.

For example:

SELECT NOW() - INTERVAL 5 MINUTE

Will return, naturally, current date+time minus 5 minutes. Or:

SELECT NOW() + INTERVAL 10 DAY

You guessed it, this will return the current date+time plus 10 days.

Clean and simple, isn't it? The syntax goes like this:

date + INTERVAL expression unit

The unit can be MINUTE, HOUR, DAY, MONTH, YEAR, etc. The expression can be any legal SQL expression, and the date is just a date value, which can be the result of another expression, a variable or a table column. Check out the manual for more examples.
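The same arithmetic reads just as naturally inside a WHERE clause; a sketch against a hypothetical log table:

```sql
-- Rows created in the last 24 hours
SELECT id, message
FROM   log_entries
WHERE  created_at > NOW() - INTERVAL 1 DAY;
```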

This is an elegant solution to a common problem, which is also easy to remember and use. Why so few people use it is a mystery to me - I've almost never seen it in code samples.

2008-02-21

Hello Planet MySQL!

My blog was added several days ago to the Planet MySQL feed, so I am now one step closer to world domination.

I'll assume that most people who see this entry in their feed readers are unfamiliar with this blog, so I should be including an introduction here. Instead, I invite you to take a look at the home page and at some old posts of mine that might interest you. Not everything here is about MySQL; the Planet MySQL feed takes only the posts I've labeled as such.

The following posts are the top search engine keywords that this blog gets, so you might want to start there:

I wonder how many comments of this sort I'll get?

What do you want me to do?  LEAVE?  Then they'll keep being wrong!

2008-02-18

Mac OS X: Neverending Windows Bashing

This is a screen capture I made just now of a standard network browsing feature - this is how Mac users see Windows computers on the network.


Gotta love the "preview" of what the user experience on a Windows computer feels like: old CRT screens and BSODs.

All this "Macs are better" elitism is amusing at times, but very immature. This Penny Arcade comic really got it right.

2008-02-17

MySQL 5.1.23 Regression Bugs

I started testing the new Release Candidate (5.1.23) today. This version carries a huge amount of fixes relative to recent MySQL releases, but I didn't expect to run into two bugs this fast.
  • Bug #34626 which I just filed in today and was already verified. This is a basic scenario which is used in the deployment scripts we use at Delver. Just try and pipe an empty file into the mysql command line client - and it crashes.
  • Bug #32633 which was apparently fixed in 5.1.24, and should have also presented itself in 5.1.22, but didn't for some reason. The upgrade brought it up.
    This one basically means that you can't create stored procedures at all, unless you remove certain flags from the sql_mode variable. Which was really frustrating to track down; I almost filed another bug, but managed to find someone who had already filed a duplicate 6 days ago.
    The SQL Mode is a session/global variable that determines which SQL compatibility mode the current session works in. More details here if it interests you.
I could say that I'm doing the community a favor by testing products or whatnot, but I'd just prefer the QA procedures at MySQL AB to be a bit stricter - I consider these as really trivial scenarios.
Conclusion: new versions fix old bugs and bring new ones. Trust no one. Test for yourself.

2008-02-15

Recommendation: ImgBurn

I'd like to recommend a fine piece of software which I found to be the best program for casual CD/DVD burning. I've used it exclusively for more than a year. It's fast, simple and functional. Moreover, it's completely free.
So now that you know, please stop using that cracked version of Nero. It's not as good as ImgBurn anyway.

2008-02-09

Assumption is The Mother of All Fuckups - Firefox Password Manager

Consider the following:
  1. Are you using Firefox?
  2. Do you use the "Save Password" feature for web forms?
If you answered yes to both questions, you are at risk. Not the kind of "cool buffer overflow" risk that I personally ignore most of the time, but the kind of "anyone sitting near my computer who knows this can read all my passwords in 2 seconds" risk.
You are probably assuming that some geek security freak at the Mozilla Foundation made sure your passwords are safe, encrypted one-way with some unbreakable AES/MD5/SomeHashBuzzTLA.
Well, they are not.

Follow these steps:

Open the preferences screen in Firefox (this is the OS X version, but the Windows version is just the same)

Click on "Show Passwords"
We're not there yet. Click "Show Passwords" one more time (it can't be that easy), and get the following:


I blurred the images for obvious reasons, but you can guess how it looks - the entire site/username/password list is there in clear text.

I know this "feature" is well documented if you bother to look it up, and it can be somewhat mitigated by setting a master password over the configuration. Still, I find it unacceptable for a browser to behave this way by default.

My recommendation - assume passwords saved in Firefox are compromised to begin with, and only save passwords for sites where you don't care if someone knows the password.

2008-01-29

Opaqueness and the Illusion of Greatness

When you don't know how things work, you imagine them the way you would want them to be, or according to an impression based on a random fact or advertising.

There is this old joke that goes like this:

When I was a child, I slept well since I knew someone was guarding me.
When I was in the army, I didn't sleep since I was the one guarding.
When I left the army, I didn't sleep since I knew who was guarding me.

If you served in the Army, I presume that you already know the joke and/or can relate to how sad it really is.

"What has this got to do with anything?" you wonder? When using open-source vs. closed source software, this is sometimes the case.

I got it when I was reading Jay Pipes's book "Pro MySQL". He mentions that although the manual describes the different parts of MySQL as distinct, encapsulated components working beautifully together, in reality it's all just a big pile of code, tightly coupled and not always as modular as you'd like to imagine. He also tells of parts of the engine that were apparently written by different teams with different standards, where the code just looks and feels very different from the rest of the system. I haven't dug into the source code to that depth myself, but I believe him.
If you have worked on the source code of an open source project before, this is probably no surprise to you - but for me it was a sort of moment of re-enlightenment. The people writing the database are just like you and me, and they even write bad code sometimes. Only in MySQL, it's bad code anyone curious enough can see.

I wasn't used to this type of thinking for years, having grown up in the Microsoft closed-source ecosystem. No one from SQL Server ever came to me to say they had made some stupid mistakes here and there, and that the code is a mess. Yes, there were Service Pack change logs with lists of fixed bugs, but I doubt anyone at Microsoft would release a change log entry telling how sorting sometimes just didn't work (as happened in a famous InnoDB bug a while back).
The solid UI, the great documentation and the overall behavior of the database does its best to hide the implementation details from you. Since it was solid on the outside, I assumed it was solid on the inside. It was sort of anti-FUD. I wanted to believe it's as elegant inside as it seems outside.

I said "re-enlightenment" before, since this kind of realization actually did happen to me before, but I didn't let it sink in completely.
I've been working a lot with SQL Server in the past, and if you are familiar with it as well, you know that you can access the source code of all the internal system stored procedures. There are a lot of them, and usually you don't need to dig into their code, unless you're using some cool undocumented trick or trying to find out if something is a bug. I can tell you one thing - much of the code in those stored procedures is extremely ugly. No proper casing/indentation, hard-coded magic numbers everywhere... Maybe later I'll dig up one of these gems to show to the world. I mean yeah, it works, but still... Yuck.

Thing is, this anti-FUD was so strong, I didn't want to believe that it's that ugly on the inside - so I left my "beliefs" then as they were.

Conclusions:

  1. Don't stay in the army.
  2. Bad programmers can write both open and closed source programs.

2008-01-27

Another Nasty Bug Squashed

I was so damn proud of myself for stumbling upon a cool critical bug in MySQL, only to find it was a duplicate of another bug discovered 3 days earlier by someone else - and not just anyone, but Paul DuBois, the author of no fewer than 3 books about MySQL and the like.
My test case was simpler though - judge for yourself (#32376 vs #32260).

It goes like this: find a MySQL 5.1.22 (or earlier). Connect. Run the next lines of code. The server is down. Simple, eh?

create table test.t(c int) engine = innodb;

select @var := c as c
from test.t
order by c asc;

This is documented as a casual "Some uses of user variables in a query could result in a server crash." in the 5.1.23 change log. Speaking of which, there is a new change log for 5.1.24 already, so this means 5.1.23 will be released really soon.

2008-01-22

Comments in Stored Procedures - Fixed in MySQL 5.1.23

I've written before in the big Why not learn MySQL blog post, that comments in Stored Procedures in MySQL are useless.
The reason is simple - the backup/export/dump procedure dumps them into the file correctly, but when the file is loaded by piping it into the mysql command line client, the comments get stripped. This was a sort of "optimization" in the client, which is now configurable.

The 5.1.23 change log says:

mysql stripped comments from statements sent to the server. Now the --comments or --skip-comments option can be used to control whether to retain or strip comments. The default is --skip-comments. (Bug#11230, Bug#26215)

Too bad it's still not released - it's been in development since September 2007, and the change log is already huge in comparison to the other 5.1.x releases.

2008-01-21

Open heart surgery

Or at least it felt like that. I've just finished upgrading my Mac mini with an extra 1GB of RAM.
The little thing is so damn crammed inside, it's amazing. I mostly just followed this great tutorial - take a look if you're curious how it's done.

Well worth the trouble though, as it cost me half the price of the upgrade option at the Apple Store, and I'm even left with two 512MB SO-DIMM sticks. Anyone need those?

iTunes on the Mac and Hebrew

Hebrew. It works, bitches. Apparently all the rumors were just another case of FUD.

I'm recently the owner of a brand new 2.0 GHz Mac mini, but more on that later. The thing is, I've decided to give iLife, iTunes and the other iBuzzword products a chance.
The moment I imported my library into iTunes, I noticed it couldn't read my Hebrew song names from the ID3 tags. Given what I'd heard before about Hebrew support on the Mac not being so good, I left it as it was. Later on, when browsing through the library, I saw some of the names in Hebrew. Hmm. *Ding*.

Apparently, iTunes works great with Unicode, but not with the Windows Hebrew codepage, also known as CP1255. It even provides a built-in feature to convert tags from the ISO-Latin1 codepage (and only that one) to Unicode. After wandering around the Interweb, I found a nice little utility (aptly named "Unicode Rewriter") that does just that - almost automagically. Choose a source codepage, and you're done. In 15 minutes, I converted all of the songs to Unicode with the click of many buttons and a bit of dragging & dropping. It might even work on Windows; I didn't try it.
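For the curious, the underlying fix is just a codepage re-decode. Here's a minimal Python sketch of what such a utility does (I haven't seen Unicode Rewriter's source; "cp1255" is Python's name for the Windows Hebrew codepage, and the tag text is made up for illustration):

```python
# The Hebrew word "שלום" as a CP1255-unaware tagger would have stored it:
# raw bytes in the Windows Hebrew codepage.
raw_tag = "שלום".encode("cp1255")

# Naively decoding those bytes as Latin-1 produces mojibake -
# accented Latin characters instead of Hebrew letters.
mojibake = raw_tag.decode("latin-1")

# Decoding with the correct source codepage recovers the original text,
# which can then be re-encoded as UTF-8/UTF-16 for a proper Unicode ID3 tag.
fixed = raw_tag.decode("cp1255")

print(mojibake)
print(fixed)  # שלום
```

The whole trick is knowing the source codepage; once you tell the tool it's CP1255, the conversion itself is mechanical.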

Now all I'm left with is the endless task of importing and organizing all of my music.

2008-01-18

D'bah

When people who are not somehow related to computers ask me what I do, I tell them I'm a programmer. This profession has become enough of a commodity in recent years that everyone knows what a programmer looks like. The knowledgeable ones ask which language I program in. Then it gets tricky.

-"Well you see, I'm not really a programmer, I'm actually a DBA!"
-"What's a D'bah?"

Now it's down to two options:

  1. Explain what databases are, and what part they play in the products they know and love. Explain why it is different from programming, and why there's a specific person for the job. Dive into the implicit and explicit duties and responsibilities of a DBA, and describe how a person in this position interacts with the development team and management.
    Usually this is a great choice if you want to be left alone or be met with bored looks.
  2. Say "It's like playing with Lego for money." or "It's like herding data."

How can you explain that being a DBA is different from being a programmer? Maybe it shouldn't be? What really defines a DBA's role?
The answer is the typical and classic reply you'll usually get from a DBA: "Well, that depends".

I'll try and go into more detail about this in another post.