How to use UTF-8 with MySQL


Article ID: KB0075197


Product: TIBCO Streaming
Version: 7

Description

How do I use UTF-8 with a MySQL JDBC database from StreamBase?

Issue/Introduction

How to use UTF-8 with MySQL

Resolution

StreamBase string fields use the character set named by the Java system property streambase.tuple-charset. For UTF-8, set -Dstreambase.tuple-charset=UTF-8.

To connect to MySQL from StreamBase and use UTF-8, do the following:
  1. Use a current JDBC driver.
  2. Add the following to your sbd.sbconf file:

    <java-vm>
    <param name="jvm-args" value="-Dstreambase.tuple-charset=UTF-8 ....
    ...
    <data-source name="myDB" type="jdbc">
    <uri value="jdbc:mysql://qa-mysql5.office.streambase.com:3306/utf_test?user=nightly&amp;password=nightly"/>
    <driver value="com.mysql.jdbc.Driver"/>
    <param name="useUnicode" value="true"/>
    <param name="use_unicode" value="1"/>
    <param name="characterEncoding" value="UTF-8"/>
    <param name="jdbc-fetch-size" value="10"/>
    <param name="jdbc-timeout" value="15000"/>
    <param name="jdbc-max-column-size" value="2048"/>
    </data-source>

  3. Select, insert, and update using string fields.
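
Because sbd.sbconf is an XML file, an ampersand inside the JDBC uri value has to be written as &amp; for the file to stay well-formed. The following is a minimal sketch of that point in Python, using only the standard-library XML parser. It does not involve StreamBase itself, and the host name and credentials are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical host and credentials, used only for illustration.
RAW_URL = "jdbc:mysql://db.example.com:3306/utf_test?user=demo&password=demo"


def parses(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A bare '&' inside an attribute value is not well-formed XML.
unescaped = '<uri value="{}"/>'.format(RAW_URL)

# Written as '&amp;', the fragment parses, and the parser hands the
# original URL (with a plain '&') back to the application.
escaped = '<uri value="{}"/>'.format(RAW_URL.replace("&", "&amp;"))

unescaped_ok = parses(unescaped)
escaped_ok = parses(escaped)
round_tripped = ET.fromstring(escaped).get("value")

print(unescaped_ok, escaped_ok, round_tripped == RAW_URL)  # False True True
```

The JDBC driver therefore still receives the URL with a plain ampersand; the escaping exists only at the XML level.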

    Notes:

  • Built-in StreamBase expression string functions and operators, such as length and ==, will not produce correct results on UTF-8 encoded text in string fields.
  • The MySQL side must also be set up carefully so that UTF-8 text is encoded and decoded correctly.
  • The database should be created with the command:
    mysql> create database utf_test character set utf8;
    If this is not done, sending a UTF-8 encoded string as a parameter to MySQL will raise a SQLException with the error text:
      SQLException: SQLSTATES:HY000 REASON: Incorrect string value:
    followed by the hex values of the UTF-8 encoded text. SELECT queries on the database will probably succeed without error, but INSERT or UPDATE statements will fail.
  • The tables should be created without specifying a character set for the table or its columns. If the CREATE TABLE command specifies a character set for the table or a column, MySQL applies an additional UTF-8 encoding to the text from the JDBC driver, which is already UTF-8 encoded. MySQL tools also apply an extra UTF-8 decoding, so a SELECT from within the mysql command-line client may display the correct text even when the stored text is doubly encoded. In that case,

    SELECT hex(textcolumn) FROM table;

    will show the results to be approximately twice as long as they should be.
  • The mysql command-line client should be invoked with the command:

    mysql --default-character-set=utf8

    This will ensure that SELECT results use the proper text encoding.
  • Correct sorting of text columns and correct comparison of encoded strings will likely require a collation to be specified, because character ordering varies with locale. Selecting the collation is beyond the scope of this discussion but is covered in chapter 10 of the MySQL Reference Manual.
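
The notes on string length and double encoding can be illustrated with a short sketch in Python, which needs no StreamBase or MySQL installation. It rests on two assumptions that the article does not state: that a length-style function goes wrong because it counts encoded bytes instead of characters, and that the extra encoding step treats each incoming byte as a single Latin-1 character. Real behavior depends on the server and connection character sets.

```python
text = "日本語"  # three characters, all non-ASCII

# What the JDBC driver sends: the UTF-8 encoding of the text.
utf8 = text.encode("utf-8")

# Character count and byte count differ, so a function that counts
# bytes reports a different length than one that counts characters.
char_count = len(text)
byte_count = len(utf8)

# Double encoding (assumed mechanism): each UTF-8 byte is read as one
# Latin-1 character and then encoded to UTF-8 a second time.
double_encoded = utf8.decode("latin-1").encode("utf-8")

# A tool that applies one extra decoding hides the problem: undoing the
# second encoding yields the original bytes, which decode correctly.
recovered = double_encoded.decode("utf-8").encode("latin-1").decode("utf-8")

print(char_count, byte_count, len(double_encoded))  # 3 9 18
print(utf8.hex())
print(double_encoded.hex())
print(recovered == text)  # True
```

For text made up entirely of non-ASCII characters the hex dump doubles exactly; ASCII characters are unaffected by the second encoding, which is why the stored value is only approximately twice as long for mixed text.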