Jayson Minard: Yes, I Crashed the Site!

Jayson Minard wrote a very good article on upgrading a production site and what can go wrong and what we can learn from it.

Yesterday, I performed an upgrade to a third-party package used with Zend Developer Zone. It has an automated schema update system which silently performs actions on the database that had a large impact on ZDZ and related sites causing an outage. So, there are good lessons from my post-mortem that I would like to share with the community.

The Start of the Problem

First, let us look at the actual list of actions that started the issue:

1. The upgrade does a schema check on first load

2. The upgrade then corrects the schema to be valid for the new release (performing table changes via DDL)

3. The upgrade then may modify large amounts of existing data, or delete large amounts of old data

These schema and data updates can cause huge potential issues when the database and tables are used concurrently by the online site. First, the DDL changes will lock the affected tables. And for some storage engines (i.e. MyISAM in MySQL) the modifications or deletes will also cause table locks, and in other engines they could cause contention on locked rows, and in other engines cause things like rollback segments to overflow. The online site then waits behind the locks or contention causing threads in the web server to be held until no more threads are left to serve actual user requests. No more threads, no more site.

Learn from the problem; via Yes, I Crashed the Site!.

Getting Started with Zend_Test

Matthew Turland has written a very nice article on Unit Testing using Zend_Test, Zend_Test_PHPUnit_DatabaseTestCase, Zend_Test_PHPUnit_ControllerTestCase where he uses a few interesting solutions. Definitely worth a read.

“I worked on a project recently where we used Zend Framework. As part of that project, I was tasked with writing unit tests. So, I went to the “tests” directory generated for me by the zf CLI utility to get started…”

via Getting Started with Zend_Test | Blue Parabola, LLC.

InnoDB Performance Monitoring with innotop

Manually extracting relevant information from repeated incantations of SHOW ENGINE INNODB STATUS while trying to figure out what InnoDB is doing is not only error prone, it’s just plain hard to do. And since MySQL doesn’t expose the data you really want in an INFORMATION_SCHEMA table (yet?), the option is use an external program to help: innotop.

As luck would have it, in 2006 Baron Schwartz announced his innotop: the most advanced MySQL and InnoDB monitor. And in the time since then it has definitely evolved into a suitable replacement for the 10-year-old mytop.

Install and Use

Edit: Updated version 1.7.x located at http://code.google.com/p/innotop/updates/list

You can download the latest version of innotop from Sourceforge. As of this writing, the latest version if 1.6. Once you’ve grabbed it, installation is like any CPAN module (since innotop is written in Perl).

$ tar -zxf innotop-1.6.0.tar.gz
$ cd innotop-1.6.0
$ perl Makefile.PL
Checking if your kit is complete...
Looks good
Writing Makefile for innotop
$ sudo make install
cp InnoDBParser.pm blib/lib/InnoDBParser.pm
cp innotop blib/script/innotop
/usr/bin/perl "-MExtUtils::MY" -e "MY->fixin(shift)" blib/script/innotop
Manifying blib/man1/innotop.1p
Manifying blib/man3/InnoDBParser.3pm
Installing /usr/local/share/perl/5.10.0/InnoDBParser.pm
Installing /usr/local/man/man1/innotop.1p
Installing /usr/local/man/man3/InnoDBParser.3pm
Installing /usr/local/bin/innotop
Writing /usr/local/lib/perl/5.10.0/auto/innotop/.packlist
Appending installation info to /usr/local/lib/perl/5.10.0/perllocal.pod

There is also a Debian package available from Sourceforge that should work on Debian and Ubuntu.

The Basics

Once innotop is installed, you can simply run innotop. The first time you run it, you’ll be prompted to create a new database connection and name it. With innotop you can have any number of saved named database connections that you can then refer to by name to quickly connect to a server without re-specifying the hostname, username, password, and so on.

If you have a MySQL server running on the same host where you’re testing innotop, you might call the first connection “localhost”.

Enter a name: localhost

Next you’re prompted to enter a “DSN string” which is what Perl DBI uses to represent the database connection parameters.

Typical DSN strings look like
   DBI:mysql:;host=hostname;port=port
The db and port are optional and can usually be omitted.
If you specify 'mysql_read_default_group=mysql' many options can be read
from your mysql options files (~/.my.cnf, /etc/my.cnf).

Enter a DSN string: DBI:mysql:;host=localhost

Then you’re prompted for an optional table name that innotop can use for deadlock detection information, along with a username and password:

Optional: enter a table (must not exist) to use when resetting InnoDB deadlock information: blahtable
Do you want to specify a username for localhost?: y
Do you want to specify a password for localhost?: y
Enter username for localhost: root
Enter password for 'root' on localhost: *****
Save password in plain text in the config file?: y

With all that information, innotop will save a configuration file as ~/.innotop/innotop.ini connect to the server and start showing some high-level statistics that may look something like this:

________________________________ InnoDB Row Operations ________________________________
CXN   Ins         Upd        Read        Del        Ins/Sec  Upd/Sec  Read/Sec  Del/Sec
root  1717192746  141503339  1935029792  146254835   166.42    97.45    257.37     0.00

______________________ Row Operation Misc _______________________
CXN   Queries Queued  Queries Inside  Rd Views  Main Thread State
root               0               0         1  flushing log

_____________________________________ InnoDB Semaphores ______________________________________
CXN   Waits    Spins      Rounds     RW Waits  RW Spins  Sh Waits  Sh Spins  Signals   ResCnt
root  2475263  104367748  344363574   1572553   3573154   3082885   5833069  11083664  7957967

_______________________________ InnoDB Wait Array _______________________________
CXN  Thread  Time  File  Line  Type  Readers  Lck Var  Waiters  Waiting?  Ending?

If so, your innotop installation is working correctly.

Monitoring Modes

Like mytop, innotop can be switched into a variety of monitoring modes based on what you want to focus on. By default, you’ll see mode “R” or InnoDB Row Operations. As its name implies, this mode summarizes row-level statistics such as rows inserted or deleted per second. You can press “?” to see a list of all the monitoring modes and the single letter keystroke used to switch to each of them.

Here are some of the other useful modes:

B: InnoDB Buffers

This view presents buffer pool information, page statistics, insert buffer information, and metrics for adaptive hash indexes. I often check this mode to see what the buffer pool hit rate is as well as how many page reads and writes per second the server is doing.

C: Command Summary

The command summary shows the breakdown of which commands MySQL has been handling. They’re listed from most frequent to least frequent and the display shows total counts for each as well as a percentage. Two sets of each numbers are presented. The first looks at numbers since the server was started while the second only shows numbers from the last polling cycle.

D: InnoDB Deadlocks

If you’re seeing deadlocks with some regularity, this display pulls together the necessary information to diagnose what’s happening, including username, hostname, query, time, victim, and so on.

F: InnoDB Foreign Key Errors

While I’ve never needed it, the foreign key mode will present foreign key problems in a useful format.

I: InnoDB File I/O

The File I/O mode is very useful when looking at some performance problems. You can see at a glance how many I/O threads have pending I/O requests as well as the total number of reads and writes (overall and per second) as well as the number of bytes read and written. Finally, the I/O mode summarizes the InnoDB log file checkpoint statistics and overall I/O as well.

L: Locks

When you suspect that you’re running into lock contention, the Locks view will help you see which clients/queries are involved and how often it is happening.

M: Master/Slave Replication Status

This view presents a summary of the slaving information for both the SQL and IO threads on a slave. You can see the binary log file being used, replication lag, log file size, and so on.

Q: Query List

The Query List view shows the queries that MySQL is processing. And like in mytop, you can select a specific query to view the full query and run it through the EXPLAIN command.

R: InnoDB Row Operations and Semaphores

As noted earlier, this view provides statistics about rows created, updated, deleted, read as well as InnoDB semaphores (internal locking) and the Wait Array.

T: InnoDB Transactions

Finally, the Transactions mode shows the transaction state of all the threads (at least those connected to a client) running inside of MySQL. You can see who owns the transaction, which host they are connected from, the transaction status, running time, and the query that’s currently being run inside the transaction.

That’s Not All

Believe it or not, innotop can do more than that. Spend a bit of time reading the biult-in help (hit the “?” key in any of the modes) and read the innotop manual page. You’ll find that innotop can talk to multiple servers and maintain lists of server groups, since we often deploy MySQL in multi-machine clusters. In other words, innotop doesn’t assume that you’re only going to watch a single server like mytop did.

The more I use innotop, the less inclined I am to continue maintaining and updating mytop, even though I use it almost daily at work. If you’ve been an mytop user and depend on InnoDB, I highly recommend giving innotop a try.

Jeremy Zawodny is a software engineer at Craigslist where he works on MySQL, Search, and various back-end infrastructure. He’s also the co-author of “High Performance MySQL” and blogs at http://jeremy.zawodny.com/blog/