Migrating MySQL latin1 to utf8 – In House Version

This entry is part [part not set] of 4 in the series Migrating MySQL latin1 to utf8

Use this method if at all possible as it will attempt to recover non-English latin1 characters (accents, umlauts) in your existing data.

  • Confirm your database is currently encoded in latin1.
  • Make a fresh backup (ideally using mysqlhotcopy se notes below)
  • Temporarily disable your cronjob so you don’t have anything trying to access the database.
  • The steps below require the ‘iconv’ utility to ensure proper UTF8 encoding. Type ‘iconv –version’ at the shell to check if it’s installed. If not, install it from your platform’s package manager (e.g. apt-get, yum, rpm).

Dump the database schema (use your own user + password + database):

mysqldump -Q -d -u root -p
--default-character-set=latin1 --skip-set-charset
old_database | sed 's/latin1/utf8/gi' > dump.schema.sql

Dump the database data (use your own user + password + database):

mysqldump -Q --insert-ignore -t -u root -p
--default-character-set=latin1 --skip-set-charset
old_database | iconv -c -f utf8 -t utf8 > dump.data.sql

Create a new UTF-8 database:

CREATE DATABASE newdb_utf8 CHARACTER SET utf8 COLLATE utf8_general_ci;

Import into the new database:

mysql -u root -p --default-character-set=utf8
newdb_utf8 < dump.schema.sql; mysql -u root -p --default-character-set=utf8 newdb_utf8 < dump.data.sql;

Notes:

Some helpfull hints

mysqlhotcopy:
Reference Here
Usage: (to local filesystem)

mysqlhotcopy -u backups -p`cat ~backups/.db.shadow`
--addtodest --noindices old_database ~backups/dbs/

Usage: (to SCP)

mysqlhotcopy -u backups -p`cat ~backups/.db.shadow`
--addtodest --noindices
--method='scp -c arcfour -C -2' old_database backups@remotehost:~backups/dbs/

Series Navigation

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.