Hadoop - Native Connection from Alpine 6.2 to Hive on Kerberized CDH 5.x

Article ID: KB0082672

Products Versions
Spotfire Data Science 6.x

Description

This article describes how to set up a native connection from Alpine 6.2 to Hive on a Kerberized CDH 5.x cluster.

Resolution

To create a native connection from Alpine 6.2 to Hive on a Kerberized CDH cluster, follow these steps:

1. Make sure that the Hive hostname is properly configured in the /etc/hosts file on the Alpine server.

2. Add a new data connection of type Hadoop Hive (see the attached screenshot).

3. Configure the parameters on the first page accordingly (see the attached screenshot).

4. Configure the following additional parameters:

alpine.principal=alpine/chorus.alpinenow.local@ALPINENOW.LOCAL
alpine.keytab=/home/chorus/keytab/alpine.keytab
mapreduce.jobhistory.address=nn2.alpinenow.local:10020
hive.hiveserver2.uris=jdbc:hive2://cm.alpinenow.local:10000/default
hive.metastore.kerberos.principal=hive/_HOST@ALPINENOW.LOCAL
hive.server2.authentication.kerberos.principal=hive/_HOST@ALPINENOW.LOCAL
hive.metastore.client.connect.retry.delay=1
hive.metastore.client.socket.timeout=600
dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.datanode.kerberos.principal=hdfs/_HOST@ALPINENOW.LOCAL
dfs.ha.automatic-failover.enabled.nameservice1=true
dfs.ha.namenodes.nameservice1=namenode64,namenode72
dfs.namenode.http-address.nameservice1.namenode64=nn1.alpinenow.local:50070
dfs.namenode.http-address.nameservice1.namenode72=nn2.alpinenow.local:50070
dfs.namenode.https-address.nameservice1.namenode64=nn1.alpinenow.local:50470
dfs.namenode.https-address.nameservice1.namenode72=nn2.alpinenow.local:50470
dfs.namenode.kerberos.principal=hdfs/_HOST@ALPINENOW.LOCAL
dfs.namenode.rpc-address.nameservice1.namenode64=nn1.alpinenow.local:8020
dfs.namenode.rpc-address.nameservice1.namenode72=nn2.alpinenow.local:8020
dfs.namenode.servicerpc-address.nameservice1.namenode64=nn1.alpinenow.local:8022
dfs.namenode.servicerpc-address.nameservice1.namenode72=nn2.alpinenow.local:8022
dfs.nameservices=nameservice1
ha.zookeeper.quorum=cm.alpinenow.local:2181,nn1.alpinenow.local:2181,nn2.alpinenow.local:2181
hadoop.rpc.protection=authentication
hadoop.security.authentication=kerberos
mapreduce.jobhistory.principal=mapred/_HOST@ALPINENOW.LOCAL
mapreduce.jobhistory.webapp.address=nn2.alpinenow.local:19888
yarn.app.mapreduce.am.staging-dir=/tmp/hadoop-yarn/staging
yarn.resourcemanager.admin.address=nn1.alpinenow.local:8033
yarn.resourcemanager.principal=yarn/_HOST@ALPINENOW.LOCAL
yarn.resourcemanager.resource-tracker.address=nn1.alpinenow.local:8031
yarn.resourcemanager.scheduler.address=nn1.alpinenow.local:8030
hive.server2.enable.doAs=true

5. If high availability (HA) for the Resource Manager is enabled on the cluster, add the following parameter so that Alpine supports Resource Manager HA. Its value is a comma-separated list of Resource Manager hostnames:
failover_resource_manager_hosts=nn1.alpinenow.local,nn2.alpinenow.local
Note: If SSL is enabled for the Resource Manager, this parameter must be combined with a list of other parameters. In that case, find all the Resource Manager HA parameters in the yarn-site.xml file and add them as well. For more information, see the page "Connecting Alpine to a cluster with Resource Manager High Availability enabled".
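
As a rough illustration of where the host list in step 5 might come from, the sketch below reads the standard YARN HA properties yarn.resourcemanager.ha.rm-ids and yarn.resourcemanager.hostname.<rm-id> from the text of a yarn-site.xml file and joins the hostnames with commas. The function name rm_failover_hosts and the sample XML are hypothetical and exist only for this example. Always check the result against the actual cluster configuration.

```python
import xml.etree.ElementTree as ET


def rm_failover_hosts(yarn_site_xml: str) -> str:
    """Build a candidate value for failover_resource_manager_hosts.

    Hypothetical helper: assumes the cluster defines
    yarn.resourcemanager.ha.rm-ids and one
    yarn.resourcemanager.hostname.<rm-id> entry per Resource Manager.
    """
    root = ET.fromstring(yarn_site_xml)
    # Collect every <property> name/value pair into a dictionary.
    props = {
        p.findtext("name", "").strip(): p.findtext("value", "").strip()
        for p in root.iter("property")
    }
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [i.strip() for i in raw_ids.split(",") if i.strip()]
    hosts = []
    for rm_id in rm_ids:
        key = f"yarn.resourcemanager.hostname.{rm_id}"
        if key not in props:
            raise KeyError(f"missing {key} in yarn-site.xml")
        hosts.append(props[key])
    return ",".join(hosts)


# Made-up sample that reuses the hostnames from this article.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>nn1.alpinenow.local</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>nn2.alpinenow.local</value>
  </property>
</configuration>"""

print("failover_resource_manager_hosts=" + rm_failover_hosts(SAMPLE))
```

The helper only produces the hostname list. Any SSL-related Resource Manager HA parameters mentioned in the note still have to be copied from yarn-site.xml by hand.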

6. If encryption of data in transit is enabled on the cluster with the following Hadoop parameters:
dfs.encrypt.data.transfer=true
dfs.data.transfer.protection=Privacy
hadoop.rpc.protection=Privacy
dfs.encrypt.data.transfer.algorithm=AES/CTR/NoPadding
dfs.encrypt.data.transfer.cipher.key.bitlength=256
add these two lines to the additional parameters of the Alpine connection:
dfs.data.transfer.protection=privacy
hadoop.rpc.protection=privacy
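
Because the additional parameters are entered by hand, a quick consistency check before saving the connection may help. The sketch below is not part of Alpine. It is a minimal example that parses key=value lines, checks that every NameNode listed under dfs.ha.namenodes.<nameservice> has a matching dfs.namenode.rpc-address entry, and collects the hostnames that should resolve from the Alpine server (step 1). The function names parse_params, check_namenode_ha, and hostnames are hypothetical, and the sample is a shortened copy of the parameters from step 4.

```python
import re


def parse_params(text: str) -> dict:
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        params[key.strip()] = value.strip()
    return params


def check_namenode_ha(params: dict) -> list:
    """Return a list of problems in the HDFS HA entries (empty if none)."""
    problems = []
    for ns in filter(None, params.get("dfs.nameservices", "").split(",")):
        raw_ids = params.get(f"dfs.ha.namenodes.{ns}", "")
        nn_ids = [n for n in raw_ids.split(",") if n]
        if not nn_ids:
            problems.append(f"dfs.ha.namenodes.{ns} is missing or empty")
        for nn in nn_ids:
            key = f"dfs.namenode.rpc-address.{ns}.{nn}"
            if key not in params:
                problems.append(f"{key} is missing")
    return problems


def hostnames(params: dict) -> set:
    """Collect hostnames that appear as host:port inside any value."""
    pattern = r"([A-Za-z0-9][A-Za-z0-9.-]*\.[A-Za-z]{2,}):\d+"
    found = set()
    for value in params.values():
        found.update(re.findall(pattern, value))
    return found


# Shortened copy of the parameters listed in step 4.
SAMPLE_PARAMS = """
dfs.nameservices=nameservice1
dfs.ha.namenodes.nameservice1=namenode64,namenode72
dfs.namenode.rpc-address.nameservice1.namenode64=nn1.alpinenow.local:8020
dfs.namenode.rpc-address.nameservice1.namenode72=nn2.alpinenow.local:8020
hive.hiveserver2.uris=jdbc:hive2://cm.alpinenow.local:10000/default
"""

params = parse_params(SAMPLE_PARAMS)
print("HA problems:", check_namenode_ha(params))
print("Hosts to resolve:", sorted(hostnames(params)))
```

The hostname pattern is deliberately simple and only matches host:port pairs, so Kerberos principals and class names are ignored. Each reported hostname can then be compared against the /etc/hosts file on the Alpine server.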

Attachments

Hadoop - Native Connection from Alpine 6.2 to Hive on Kerberized CDH 5.x
Hadoop - Native Connection from Alpine 6.2 to Hive on Kerberized CDH 5.x