Carriage Return characters are being stripped after parsing an XML doc using BusinessWorks

Article ID: KB0075952

Products Versions
TIBCO ActiveMatrix BusinessWorks 5.x, 6.x

Description

When the ActiveMatrix BusinessWorks XML parser parses an XML document that contains CRLF (Carriage Return/Line Feed) control characters, the Carriage Return characters are stripped and the output contains only Line Feed characters.
 

Issue/Introduction

Carriage Return characters are being stripped after parsing an XML doc using BusinessWorks

Environment

Product: TIBCO ActiveMatrix BusinessWorks Version: 5.x, 6.x OS: All Supported Operating Systems

Resolution

This is expected behavior. As the citation below states, a compliant XML parser must, before parsing, translate each CRLF sequence and any CR not followed by an LF into a single LF. In other words, the TIBCO BW XML Parser is behaving as the official XML specification requires.

https://www.w3.org/TR/REC-xml/#sec-line-ends

#########################
XML parsed entities are often stored in computer files which, for editing convenience, are organized into lines. These lines are typically separated by some combination of the characters CARRIAGE RETURN (#xD) and LINE FEED (#xA).

To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.
#########################
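
The normalization quoted above is not specific to BusinessWorks. As an illustration only, the following minimal Python sketch uses the standard library's xml.etree.ElementTree parser (not the TIBCO parser) to show the same behavior. The Payload element name and its sample content are hypothetical.

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_source: str) -> str:
    """Parse a single-element document and return its text content."""
    return ET.fromstring(xml_source).text or ""


# Hypothetical payload containing one CRLF pair and one lone CR.
source = "<Payload>line1\r\nline2\rline3</Payload>"
text = parsed_text(source)

# The source contains CR characters; the parsed text should not.
print("CR in source:      ", "\r" in source)
print("CR after parsing:  ", "\r" in text)
print("Parsed text (repr):", repr(text))
```

Both the CRLF pair and the lone CR come back as a single LF, which matches the rule in the specification text above.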

If the CRLF characters are needed in the output, for example because another system (the "Third Party") requires the CR and the user has no control over that system, one possible solution is to add the CR back using XPath functions. For example:

concat('Read XML From File - The string contains 0D0A7F new line sequences: ',xsd:string(contains(tib:string-to-hex(replace($ParseXml/tns:Payload, "&lf;","&crlf;" )), '0D0A7F')))

Keep in mind, however, that the CR will be stripped again if the data passes through another XML parser (TIBCO or external). It is therefore best to restore the CR as the last step, right before the data is sent to the Third Party.
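
Where the final hand-off is done in custom code rather than in an XPath mapping, the same "restore the CR last" idea might look like the Python sketch below. It is an assumption-laden illustration, not part of BusinessWorks: the function name restore_crlf and the sample text are hypothetical.

```python
import re


def restore_crlf(text: str) -> str:
    """Turn every LF that is not already preceded by a CR into CRLF.

    The negative lookbehind keeps existing CRLF pairs intact, so calling
    the function on text that is already CRLF-terminated changes nothing.
    """
    return re.sub(r"(?<!\r)\n", "\r\n", text)


# Hypothetical text as it might look after XML parsing (LF only).
parsed = "line1\nline2\nline3"

# Apply as the very last step, after all XML parsing is finished.
outgoing = restore_crlf(parsed)
print(repr(outgoing))
```

Because the function is idempotent, it is safe to apply even when part of the data already carries CRLF line endings.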

Additional Information

https://www.w3.org/TR/REC-xml/#sec-line-ends