2014-04-24 - Originally posted at https://tech.taskrabbit.com/blog/2014/04/24/active-record-mysql-and-emoji/
There are some problems supporting the emoji character set with our stack, which includes Rails 4.0 and MySQL. The main problem is that MySQL’s utf8 encoding stores at most 3 bytes per character, while most emoji need 4. MySQL 5.5 introduced the utf8mb4 encoding, which allows up to 4 bytes per character (the "mb4")… and therefore emoji would work! The MySQL gem introduced support for utf8mb4 about a year ago, but only recently did active_record (and Rails) add support for this in Rails 4.1.
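To make the byte-width issue concrete, here is a small plain-Ruby sketch. The helper name `needs_utf8mb4?` is made up for illustration; it simply checks for characters outside the Basic Multilingual Plane, which are the ones that take 4 bytes in UTF-8.

```ruby
# Characters above U+FFFF need 4 bytes in UTF-8, which MySQL's
# 3-byte utf8 charset cannot store. (Hypothetical helper name.)
def needs_utf8mb4?(string)
  string.each_char.any? { |char| char.ord > 0xFFFF }
end

heart = "\u2764"     # HEAVY BLACK HEART, inside the Basic Multilingual Plane
grin  = "\u{1F600}"  # GRINNING FACE, outside it

puts heart.bytesize          # 3
puts grin.bytesize           # 4
puts needs_utf8mb4?(heart)   # false
puts needs_utf8mb4?(grin)    # true
```

Note that not every emoji-like character is affected: the heart above fits in 3 bytes, while the grinning face does not.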
Initially, we decided to ignore all emoji characters, literally stripping them out of strings with our demoji gem (Thanks Pablo!). However, with our new product launch in the UK, we thought it was time to actually address the problem. Here is what we learned:
The good news is that the upgrade path from utf8 to utf8mb4 is easy. As we are only adding bytes, the migration is really just a definition change at the table level. Nothing has to change with your existing data. This is a non-blocking, zero-downtime migration. If you are using normal Rails migrations, all of your VARCHAR column types will be based on the table’s encoding, so changing the table will change the column type. The bad news is that any text-type (or blob-type) columns will need to be changed explicitly.
Check out the migration steps:
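As a rough sketch only, the statements such a migration might run can be built like this. The helper name, database name, table names, and the `utf8mb4_unicode_ci` collation are all assumptions for illustration; in a Rails migration each string would be passed to `execute`.

```ruby
CHARSET   = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci" # assumed collation, pick the one you need

# Build the statements in order: database default, table defaults,
# then each text column. `text_columns` maps a table name to the
# TEXT columns that must be converted explicitly.
def utf8mb4_statements(database, tables, text_columns)
  statements = ["ALTER DATABASE `#{database}` CHARACTER SET #{CHARSET} COLLATE #{COLLATION};"]
  tables.each do |table|
    statements << "ALTER TABLE `#{table}` DEFAULT CHARACTER SET #{CHARSET} COLLATE #{COLLATION};"
  end
  text_columns.each do |table, columns|
    columns.each do |column|
      # MODIFY restates the whole column definition, so re-add NOT NULL
      # or a default here if the real column has one.
      statements << "ALTER TABLE `#{table}` MODIFY `#{column}` TEXT CHARACTER SET #{CHARSET} COLLATE #{COLLATION};"
    end
  end
  statements
end

puts utf8mb4_statements("myapp_production", %w[users tasks], { "tasks" => %w[description] })
```

MySQL also offers `ALTER TABLE ... CONVERT TO CHARACTER SET`, which rewrites existing character columns as well as the table default; it is worth checking the MySQL manual to see which form suits your tables.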
The only change here is to change the encoding:
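Assuming the change in question is the connection settings in `config/database.yml`, the relevant entry might look like the YAML embedded below. The database name and collation are placeholders; `encoding` and `collation` are the standard keys for the MySQL adapter. The snippet parses the YAML with Ruby's standard library just to show the resulting values.

```ruby
require "yaml"

# Hypothetical database.yml entry; only `encoding` (and optionally
# `collation`) differ from a utf8 setup.
DATABASE_YML = <<~YAML
  production:
    adapter: mysql2
    encoding: utf8mb4
    collation: utf8mb4_unicode_ci
    database: myapp_production
YAML

config = YAML.safe_load(DATABASE_YML)
puts config["production"]["encoding"]   # utf8mb4
puts config["production"]["collation"]  # utf8mb4_unicode_ci
```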
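The last step here is to worry about index lengths. By default, InnoDB limits an index key to 767 bytes, so an indexed VARCHAR(255) column that fit under utf8 can exceed the limit once each character may take four bytes. If you are on Rails 4.1, you have nothing to worry about! The rest of us have a few options: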
We chose #2 due to the simplicity of the solution. Check the links above for a detailed discussion of the problem.
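Whichever option you pick, the underlying constraint is simple arithmetic. This sketch assumes InnoDB's default 767-byte index key limit; the constant and helper names are made up for illustration.

```ruby
INNODB_MAX_KEY_BYTES = 767 # assumed default InnoDB index key limit

# Longest string column (in characters) that can be fully indexed
# when every character may take `bytes_per_char` bytes.
def max_indexable_chars(bytes_per_char, limit = INNODB_MAX_KEY_BYTES)
  limit / bytes_per_char
end

puts max_indexable_chars(3) # utf8:    255
puts max_indexable_chars(4) # utf8mb4: 191
```

This is why 191 shows up so often as the length for indexed string columns under utf8mb4.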
And now you can emoji to your ❤’s content!