Why are messages with extended characters getting mangled in transit when sent from a SmartSockets Java client to a SmartSockets C/C++ client?

Article ID: KB0089761

Products: TIBCO SmartSockets
Versions: Not Applicable

Description

Resolution:
The SmartSockets C/C++ library does not prevent a user from putting a string that uses an extended (8-bit) character set into a message, because the library treats every string as an array of unsigned bytes. That is why sending such strings between C/C++ clients appears to work. These strings are not compatible with the SmartSockets Java library, however. In Java, strings are represented as arrays of 2-byte characters. Because this differs from C/C++, a common encoding was needed, and UTF-8 was chosen.
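
To make the failure mode concrete, here is a minimal sketch in plain C (no SmartSockets calls) that checks whether a byte string is structurally valid UTF-8. The function name `is_valid_utf8` is hypothetical and the check is simplified: it validates lead and continuation bytes but does not catch every overlong or surrogate form. An 8-bit string such as "señor" stored with the single byte 0xF1 fails the check, which is why a receiver expecting UTF-8 cannot decode it, while the UTF-8 form (bytes 0xC3 0xB1) passes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of SmartSockets.
 * Returns 1 if the NUL-terminated byte string s has valid UTF-8 structure,
 * otherwise 0. Simplified: not every overlong or surrogate form is rejected. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        size_t extra;                                   /* continuation bytes expected */

        if (*p < 0x80)                     extra = 0;   /* 0xxxxxxx: plain ASCII */
        else if (*p >= 0xC2 && *p <= 0xDF) extra = 1;   /* 110xxxxx */
        else if ((*p & 0xF0) == 0xE0)      extra = 2;   /* 1110xxxx */
        else if (*p >= 0xF0 && *p <= 0xF4) extra = 3;   /* 11110xxx */
        else return 0;                                  /* stray continuation or bad lead byte */

        p++;
        while (extra--) {
            if ((*p & 0xC0) != 0x80) return 0;          /* each must be 10xxxxxx */
            p++;
        }
    }
    return 1;
}
```

Calling `is_valid_utf8("se\xF1or")` returns 0 for the 8-bit form, and `is_valid_utf8("se\xC3\xB1or")` returns 1 for the UTF-8 form.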

There are two public, but undocumented, C/C++ functions that can be used to convert a string between UCS-2 (2-byte characters) and UTF-8:
        TutConvertUtf8ToUcs2()
        TutConvertUcs2ToUtf8()
To send a UTF-8 encoded string in a message, the message type grammar should define a “utf8” field and the corresponding append/next UTF-8 message functions should be used.
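
As a rough illustration of what a UTF-8 to UCS-2 conversion involves, the sketch below decodes UTF-8 bytes into 16-bit code units in plain C. It is not the SmartSockets implementation and says nothing about the signatures of the functions listed above. The name `utf8_to_ucs2`, its parameters, and the use of `uint16_t` in place of the library's wide-string type are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a SmartSockets function.
 * Decodes NUL-terminated UTF-8 into 16-bit UCS-2 code units.
 * Returns the number of units written (excluding the terminator), or -1 on
 * malformed input, on characters above U+FFFF, or if out is too small.
 * Simplified: overlong encodings are not rejected. */
long utf8_to_ucs2(const char *in, uint16_t *out, size_t out_cap)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p) {
        uint32_t cp;
        int extra;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else return -1;               /* 4-byte forms do not fit in UCS-2 */

        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80) return -1;   /* missing continuation byte */
            cp = (cp << 6) | (uint32_t)(*p & 0x3F);
            p++;
        }

        if (n + 1 >= out_cap) return -1;          /* keep room for the terminator */
        out[n++] = (uint16_t)cp;
    }

    if (out_cap == 0) return -1;
    out[n] = 0;
    return (long)n;
}
```

With a buffer of eight units, decoding "se\xC3\xB1or" yields five code units, the third being 0x00F1 (ñ).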

Below are examples of sending/receiving UTF-8 encoded message fields.

Java API:
The following is a code snippet that creates a SmartSockets message and appends a string containing a character outside the standard ASCII character set. Appending the string to the message will automatically convert the string to UTF-8.

            // appendUtf8() encodes the Java string as UTF-8 automatically.
            String str = "señor";
            TipcMt mt = TipcSvc.createMt("utf8_message", 9000, "utf8");
            TipcMsg msg = TipcSvc.createMsg(mt);
            msg.appendUtf8(str);

The following is a code snippet that can be used to get a UTF-8 encoded string from a SmartSockets message. The string will automatically be converted from UTF-8 to a native Java string.

            msg.setCurrent(0);
            String str = msg.nextUtf8();
            System.out.println("Received string " + str);

C/C++ API:
The following is a code snippet that can be used to get the UTF-8 encoded string from the SmartSockets message. The user must convert the string from UTF-8 after getting it from the message. No automatic conversion takes place.

            T_STR utf8_str;
            T_WSTR wstr;

            TipcMsgSetCurrent(msg, 0);
            TipcMsgNextUtf8(msg, &utf8_str);
            TutConvertUtf8ToUcs2(utf8_str, wstr);
            TutOut("Received string %ls\n", wstr);

The following is a code snippet that creates a SmartSockets message and appends a string containing a character outside of the standard ASCII character set. The user must convert the string to UTF-8 prior to appending it to the message. No automatic conversion takes place.

            /* wstr holds the 2-byte-character string to send. */
            mt = TipcMtCreate("utf8_message", 9000, "utf8");
            msg = TipcMsgCreate(mt);
            TutConvertUcs2ToUtf8(wstr, utf8_str);
            TipcMsgAppendUtf8(msg, utf8_str);

Please note that these C/C++ examples use a T_WSTR, which is a string composed of 2-byte characters. If using a string composed of 1-byte characters (T_STR, char*, etc.), the user will need to find their own way of converting the string to and from UTF-8.
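
For the 1-byte case, one possible approach is sketched below in plain C. It assumes the 1-byte strings are ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point; other 8-bit code pages (for example Windows-1252 in the range 0x80 to 0x9F) would need a lookup table instead. The function name `latin1_to_utf8` and its parameters are hypothetical and not part of SmartSockets.

```c
#include <stddef.h>

/* Hypothetical helper, not a SmartSockets function.
 * Converts NUL-terminated ISO-8859-1 text to UTF-8.
 * Returns the number of bytes written (excluding the terminator),
 * or -1 if out is too small. Assumes the input really is ISO-8859-1. */
long latin1_to_utf8(const char *in, char *out, size_t out_cap)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    for (; *p; p++) {
        if (*p < 0x80) {
            /* ASCII bytes are identical in UTF-8. */
            if (n + 1 >= out_cap) return -1;
            out[n++] = (char)*p;
        } else {
            /* 0x80..0xFF becomes two bytes: 110000xx 10xxxxxx. */
            if (n + 2 >= out_cap) return -1;
            out[n++] = (char)(0xC0 | (*p >> 6));
            out[n++] = (char)(0x80 | (*p & 0x3F));
        }
    }

    if (out_cap == 0) return -1;
    out[n] = '\0';
    return (long)n;
}
```

The result could then be appended to the message in place of the output of the UCS-2 conversion step shown in the send example. A worst-case output buffer is twice the input length plus one byte for the terminator.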

(Examples are attached to this Solution).

Attachments

Why are messages with extended characters getting mangled in transit when sent from a SmartSockets Java client to a SmartSockets C/C++ client?