Incomplete results when importing data from Cloudera Impala into TIBCO Spotfire

Article ID: KB0073745

Products

Spotfire Analyst, Spotfire Drivers, Spotfire Server

Description

As stated in Cloudera Knowledge Base article "TSB-2021 479: Impala can return incomplete results through JDBC and ODBC clients in all CDP offerings":

"In CDP, we introduced a timeout on queries to Impala defaulting to 10 seconds. The timeout setting is called FETCH_ROWS_TIMEOUT_MS. Due to this setting, JDBC, ODBC, and Beeswax clients running Impala queries believe the data returned at 10 seconds is a complete dataset and present it as the final output. However, in cases where there are still results to return after this timeout has passed, when the driver closes the connection, based on the timeout, it results in a scenario where the query results are incomplete."

You may encounter this issue when retrieving data from Cloudera Impala in TIBCO Spotfire, for example through TIBCO Spotfire Information Services (which uses the Cloudera JDBC driver) or through the TIBCO Spotfire Connector for Cloudera Impala (which uses the Cloudera ODBC driver).
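The behavior quoted above can be pictured with a small toy model. This is only an illustrative sketch, not Impala or driver code: the function names, the `has_more` flag, and the sample workload are all assumptions made for the example. It contrasts a client that treats an empty answer after the timeout as the end of the data with a client that keeps fetching until it is told there is nothing more.

```python
FETCH_ROWS_TIMEOUT_S = 10  # default from the quoted bulletin (10 seconds)


def fetch_batches(delays_and_rows, timeout_s=FETCH_ROWS_TIMEOUT_S):
    """Toy server: yield (rows, has_more) for each fetch call.

    delays_and_rows is a list of (seconds_until_ready, rows) pairs.
    If a batch is not ready within timeout_s, the fetch answers with
    no rows but signals that more rows are still to come.
    """
    last_index = len(delays_and_rows) - 1
    for index, (delay_s, rows) in enumerate(delays_and_rows):
        remaining_wait = delay_s
        while remaining_wait > timeout_s:
            yield [], True  # timeout expired, nothing ready yet
            remaining_wait -= timeout_s
        yield rows, index != last_index


def naive_client(batches):
    """Hypothetical affected client: stops at the first empty answer."""
    result = []
    for rows, has_more in batches:
        if not rows:
            break  # wrongly assumes the dataset is complete
        result.extend(rows)
        if not has_more:
            break
    return result


def careful_client(batches):
    """Hypothetical corrected client: fetches until has_more is False."""
    result = []
    for rows, has_more in batches:
        result.extend(rows)
        if not has_more:
            break
    return result


# Assumed workload: the second batch takes 25 seconds to become ready.
workload = [(1, [1, 2]), (25, [3, 4]), (2, [5])]
incomplete = naive_client(fetch_batches(workload))
complete = careful_client(fetch_batches(workload))
```

In this model the first client silently returns fewer rows than the second, without raising an error. That matches the symptom described in this article: an import that finishes normally but contains only part of the data.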

Issue/Introduction

When importing data from Cloudera Impala into TIBCO Spotfire via ODBC or JDBC, you may get incomplete results. This is due to an issue in the Cloudera Impala drivers, which Cloudera has fixed in version 2.6.20 of the JDBC driver and version 2.6.12 of the ODBC driver.

Environment

Data Source: Cloudera Impala

Resolution

The issue has been fixed in version 2.6.20 of the Cloudera Impala JDBC driver and in version 2.6.12 of the Cloudera Impala ODBC driver.
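To check whether an installed driver already contains the fix, its version can be compared with the versions named above. The following is a minimal sketch: the helper names are hypothetical, the version string is assumed to be dot-separated integers, and the code assumes that releases later than the fixed versions also include the fix. How you obtain the installed version (driver manifest, ODBC administrator, file name) is left out.

```python
# First driver versions that contain the fix, as stated in this article.
FIXED_IN = {
    "jdbc": (2, 6, 20),
    "odbc": (2, 6, 12),
}


def parse_version(text):
    """Turn a dotted version string such as '2.6.20' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(driver_kind, installed_version):
    """Return True if the installed driver is at or above the fixed version.

    driver_kind is 'jdbc' or 'odbc' (case-insensitive).
    Assumes later releases keep the fix.
    """
    return parse_version(installed_version) >= FIXED_IN[driver_kind.lower()]


# Example usage with made-up installed versions.
jdbc_ok = has_fix("jdbc", "2.6.20")
odbc_outdated = not has_fix("ODBC", "2.6.4")
```

Comparing integer tuples rather than raw strings avoids the pitfall that `"2.6.4"` sorts after `"2.6.20"` as text.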
 
For more information about solutions and workarounds, please refer to Cloudera Knowledge Base article "TSB-2021 479: Impala can return incomplete results through JDBC and ODBC clients in all CDP offerings".
