Emoji and CakePHP

You might think: Why do I care.
Well, you would be surprised – but pretty much any website with at least one form on the page will be affected here.
So anyone with even just a basic contact form should think about applying the following changes.

A while back I got reports that one of my older websites has problems with emoji smileys and alike.
When browsing it with a mobile device and entering a native mobile smiley the emoji and all the following text would be lost when saving.
This was especially annoying with a longer conversation that you then have to retype – many people will not know why the text was truncated and will even run into the same issue again and again.

What’s the issue?

It turns out that native emoji are 4-byte characters, and the normal MySQL database, its tables and fields were all "just" utf8_unicode_ci and only supported less than that.

How to fix it?

First of all: BACK UP YOUR DATA!

With MySQL 5.5+ it is now possible to convert everything to utf8mb4_unicode_ci and then it will safely accept all those characters now, as well.

In your database.php in 2.x or app.php in 3.x you need to change the encoding from utf8 to:

'encoding' => 'utf8mb4',

Then you must convert your database, the tables and all their fields to the new collation.
You can use my Setup plugin to automate this:

bin/cake Setup.DbMaintenance encoding

But with dry-run param -d you can also just output the SQL commands and manually run them via Migrations plugin, for example.

If you now paste emoji characters and store them, they should be visible just fine on your website now.
Also, make sure sending emails, e.g. through a contact form, is working properly.

Troubleshooting

You can run into the following issue:

SQLSTATE[42000]: Syntax error or access violation: 
1071 Specified key was too long; max key length is 1000 bytes

In that case, make sure you modify your fields here. With the 4byte length, some fields cannot be varchar(255) anymore but must be a bit less (250) in order for the conversion to work out. You could also look into "innodb_large_prefix" and options to increase the limit here that way instead.

If an index is involved, the max length for a utf8mb4 field is usually varchar(191).

Can I use the 3.x shell for 2.x?

Sure, with a trick. See use-3-x-migrations-for-your-2-x-cakephp-app on how to provide 3.x shell access and access to my Tools and Setup shells even for 2.x apps.

Tips

If you have a larger database and tables > 1GB you definitely want to put the site into maintenance mode and do this during the night or a very inactive time-frame, as it will need quite a few minutes to complete.

It will also increase your DB size, in my case from 2.1 to 2.5 GB. But that is worth it if it does not silently truncate data anymore.

Or just use Postgres

It turns out that Postgres natively supports emoji and alike, without any need for mb4 modification.
So maybe, if you still use MySQL it could also be time to switch

Further links

5.00 avg. rating (93% score) - 1 vote

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.