Rounding Double Values and Precision

Article ID: KB0075737

Products Versions
TIBCO Streaming 7.x

Description

There is a well-known limitation in the IEEE-754 double-precision floating-point encoding: most decimal fractions have no exact binary representation, so multiplying or dividing by ten (10) or a power of ten often loses precision. IEEE-754 is the specification implemented by all common PC floating-point hardware.

Issue/Introduction

Rounding Double Values and Precision

Resolution

For speed, StreamBase uses the system floating-point processor to handle calculations on the double data type, so those calculations are subject to the IEEE-754 limitation described above.

For example, this expression:
---
round(20.11125 * 100, 2)
---
yields the result 2011.12. The expectation is that the product, 2011.125, is rounded up to 2011.13. The value is rounded down instead because the internal representation of the product is not exactly 2011.125 but approximately 2011.1249999999998, which lies just below the midpoint.
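
The effect can be reproduced outside StreamBase. The following Java sketch is an illustration only, not StreamBase code, and the class and method names are made up for this example. It assumes the JVM performs the multiplication with the same IEEE-754 double arithmetic and uses java.math.BigDecimal purely to print the exact value that is stored.

```java
import java.math.BigDecimal;

final class ProductInspection {
    // Returns the exact decimal expansion of the double that results
    // from the multiplication (no rounding is applied when printing).
    static String exactProduct() {
        return new BigDecimal(20.11125 * 100).toPlainString();
    }

    public static void main(String[] args) {
        // Exact stored value: slightly less than 2011.125.
        System.out.println(exactProduct());
        // Confirms the product lies below the rounding midpoint.
        System.out.println(20.11125 * 100 < 2011.125); // prints "true"
    }
}
```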

The StreamBase double data-type is IEEE 754-1985 standard compliant and occupies 64 bits (8 bytes) with a precision of 53 bits (about 16 decimal digits). Reference: http://en.wikipedia.org/wiki/Floating_point

If a double value needs to be rounded, round twice: first to one more decimal place than required, and then again to the desired number of decimal places. The expression above becomes:
---
round(round(20.11125 * 100, 3), 2)
---
resulting in the expected value 2011.13.

Our round() function adds half a unit at the target decimal place and then truncates at that place, which is a standard way to round. Calling it twice costs little additional CPU time.
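
The add-half-and-truncate approach, and the effect of applying it twice, can be sketched in Java. This is an assumption-laden illustration rather than the actual StreamBase implementation: the helper below is hypothetical, is written for non-negative inputs only, and simply follows the description above.

```java
final class RoundTwice {
    // Hypothetical helper: add half a unit at the target decimal place,
    // then truncate at that place. Intended for non-negative values.
    static double round(double value, int places) {
        double scale = 1.0;
        for (int i = 0; i < places; i++) {
            scale *= 10.0; // 10, 100, 1000 are exactly representable
        }
        return Math.floor(value * scale + 0.5) / scale;
    }

    public static void main(String[] args) {
        double product = 20.11125 * 100; // stored just below 2011.125

        // Single rounding sees a value below the midpoint and rounds down.
        System.out.println(round(product, 2));           // prints 2011.12

        // Rounding to 3 places first snaps the value to 2011.125,
        // so the second rounding sees the midpoint and rounds up.
        System.out.println(round(round(product, 3), 2)); // prints 2011.13
    }
}
```

Note that the same technique also rounds up any value that genuinely lies a tiny amount below the midpoint, which is usually acceptable when the difference is only representation error.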

Also, never reduce precision by rounding until the very last step before the value leaves the application for display or delivery to another system. This preserves the roughly 16 digits of precision that IEEE-754 doubles provide throughout the calculation.
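
A small Java sketch shows why early rounding is harmful. The values, class name, and helper are hypothetical and chosen so that the inputs are exactly representable as doubles. The helper uses the add-half-and-truncate approach and is not the StreamBase implementation.

```java
final class RoundLast {
    // Hypothetical add-half-and-truncate helper for non-negative values.
    static double round(double value, int places) {
        double scale = 1.0;
        for (int i = 0; i < places; i++) {
            scale *= 10.0;
        }
        return Math.floor(value * scale + 0.5) / scale;
    }

    // Rounds every input to 2 places before summing (not recommended).
    static double roundEarly(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += round(v, 2);
        }
        return sum;
    }

    // Sums at full precision and rounds once at the end (recommended).
    static double roundLate(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return round(sum, 2);
    }

    public static void main(String[] args) {
        double[] values = {0.125, 0.125, 0.125, 0.125}; // true total is 0.5
        System.out.println(roundEarly(values)); // about 0.52: each input became 0.13
        System.out.println(roundLate(values));  // prints 0.5
    }
}
```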

The java.math.BigDecimal implementation is too slow for most StreamBase CEP applications.
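
For comparison, where throughput is not the constraint (for example, an offline check of expected results), exact decimal arithmetic avoids the representation error altogether. The following is plain Java rather than a StreamBase expression, and the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalRounding {
    // Builds the operands from strings so no binary conversion occurs,
    // then rounds half up to 2 decimal places.
    static BigDecimal roundedProduct() {
        return new BigDecimal("20.11125")
                .multiply(new BigDecimal("100"))
                .setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(roundedProduct()); // prints 2011.13
    }
}
```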

If you intend to use integer expressions throughout, understand that the operators +, -, and * preserve the integer type in the result, but / (division) always uses floating-point arithmetic and produces a double. Cast the result using "int()" (which is equivalent to truncation) to store the result as an integer.
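
The divide-then-truncate behavior can be mimicked in Java as a rough sketch. Java's own / operator on two int operands already performs integer division, so the example forces a floating-point division to mirror the behavior described above. The class and method names are hypothetical.

```java
final class IntegerDivision {
    // Divides as doubles, then truncates toward zero, mirroring
    // a floating-point division followed by an integer cast.
    static int divideAndTruncate(int a, int b) {
        double quotient = (double) a / b; // floating-point division
        return (int) quotient;            // truncation, not rounding
    }

    public static void main(String[] args) {
        System.out.println((double) 7 / 2);            // prints 3.5
        System.out.println(divideAndTruncate(7, 2));   // prints 3
        System.out.println(divideAndTruncate(-7, 2));  // prints -3
    }
}
```

Because the cast truncates, 3.5 becomes 3 rather than 4. Apply round() before casting if rounding to the nearest integer is wanted.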