萬盛學電腦網

How to store emoji in MySQL

How do you store emoji in MySQL? If the database uses the utf8 character set, inserting an emoji fails with an error. This article explains how to fix that.

Inserting an emoji into a database that uses the utf8 character set produces this error:

Incorrect string value: '\xF0\x9F\x98\x84\xF0\x9F...' for column 'content'
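
The bytes in the error message explain the failure. The following is a minimal Python sketch, not part of the original article, that decodes the first character from the message and shows that it needs four bytes in UTF-8, whereas MySQL's utf8 character set stores at most three bytes per character. The constant name MAX_UTF8MB3_CODE_POINT is made up for this illustration.

```python
# Bytes copied from the error message above (the first character only).
raw = b"\xF0\x9F\x98\x84"

char = raw.decode("utf-8")          # decodes to a single character
code_point = ord(char)              # its Unicode code point

print(f"U+{code_point:X}")          # U+1F604
print(len(raw))                     # 4

# MySQL's legacy utf8 character set stores at most 3 bytes per character,
# which covers only code points up to U+FFFF.
MAX_UTF8MB3_CODE_POINT = 0xFFFF
fits_in_utf8 = code_point <= MAX_UTF8MB3_CODE_POINT
print(fits_in_utf8)                 # False
```

Any character above U+FFFF, which includes most emoji, triggers the same error in a utf8 column.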




How to fix the error:

4-byte Unicode characters aren't yet widely used, so not every application fully supports them. MySQL 5.5.3 and later handle 4-byte characters correctly through the utf8mb4 character set when properly configured; check that your other components can work with them as well.

Here are a few other things to check:

Make sure all your tables' default character sets and text fields are converted to utf8mb4, in addition to setting the client and server character sets. For example:

ALTER TABLE mytable CHARSET=utf8mb4,
  MODIFY COLUMN textfield1 VARCHAR(255) CHARACTER SET utf8mb4,
  MODIFY COLUMN textfield2 VARCHAR(255) CHARACTER SET utf8mb4;

Repeat this for every table and text column.

If your data is already in the utf8 character set, it should convert to utf8mb4 in place without any problems. As always, back up your data before trying!

Also make sure your app layer sets its database connections' character set to utf8mb4. Double-check this is actually happening – if you're running an older version of your chosen framework's mysql client library, it may not have been compiled with utf8mb4 support and it won't set the charset properly. If not, you may have to update it or compile it yourself.

When viewing your data through the mysql client, make sure you're on a machine that can display emoji, and run SET NAMES utf8mb4 before running any queries.

Once every level of your application can support the new characters, you should be able to use them without any corruption.
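
The advice to double-check the connection character set can be turned into a small test. The sketch below is an illustration in Python, not code from the original article. It assumes a DB-API 2.0 cursor that returns rows as (name, value) tuples, and the function name non_utf8mb4_settings is hypothetical. It reads the three session variables that SET NAMES changes and reports any that are not utf8mb4.

```python
# The three session variables that SET NAMES changes.
SESSION_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)

def non_utf8mb4_settings(cursor):
    """Return the session charset variables that are not utf8mb4.

    `cursor` is assumed to be a DB-API 2.0 cursor on an open MySQL
    connection that yields (name, value) tuples. An empty dict means
    the connection is configured as expected.
    """
    cursor.execute("SHOW VARIABLES LIKE 'character_set_%'")
    current = {name: value for name, value in cursor.fetchall()}
    return {
        name: current.get(name)
        for name in SESSION_VARS
        if current.get(name) != "utf8mb4"
    }
```

If the returned dict is not empty, the client library did not apply the requested character set, and the driver may need to be configured or updated as described above.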




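In short: change the table structure to a character set that supports 4-byte Unicode, and use the same character set for the database connection. I have confirmed that this works.
If other parts of your stack cannot handle these characters, consider stripping them out: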

Since 4-byte UTF-8 sequences always start with the bytes 0xF0-0xF7, the following should work:

$str = preg_replace('/[\xF0-\xF7].../s', '', $str);
Alternatively, you could use preg_replace in UTF-8 mode, but this will probably be slower:

$str = preg_replace('/[\x{10000}-\x{10FFFF}]/u', '', $str);
This works because 4-byte UTF-8 sequences are used for code points in the supplementary Unicode planes starting from 0x10000.
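
For readers whose application layer is written in Python rather than PHP, the same code-point-range approach can be sketched as follows. This is an illustrative equivalent of the second snippet, not code from the original article, and the names FOUR_BYTE_CHARS and strip_four_byte_chars are made up for the example.

```python
import re

# Matches any code point in the supplementary planes (U+10000 to U+10FFFF),
# i.e. every character that UTF-8 encodes with four bytes.
FOUR_BYTE_CHARS = re.compile("[\U00010000-\U0010FFFF]")

def strip_four_byte_chars(text: str) -> str:
    """Remove all characters that need four bytes in UTF-8."""
    return FOUR_BYTE_CHARS.sub("", text)

cleaned = strip_four_byte_chars("hi \U0001F604 there")
print(repr(cleaned))  # 'hi  there'
```

Characters that fit in three bytes or fewer, such as Chinese text or the euro sign, pass through unchanged, so the result is safe to store in a utf8 column.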