Hive driver does not recognize table that is created with 3r

Posted by xuelin.lu@oracle.com on 23-Mar-2018 14:27

We are using DataDirect Hive driver version 5.1.4.000136. In Hive, we created a table like following:

CREATE EXTERNAL TABLE `party`(
  `message_name` string COMMENT 'from deserializer',
  `message_triggering_service` string COMMENT 'from deserializer',
  `message_date_time` string COMMENT 'from deserializer',
  `message_id` string COMMENT 'from deserializer',
  `party_id` string COMMENT 'from deserializer',
  `party_type_id` string COMMENT 'from deserializer',
  `salutation` string COMMENT 'from deserializer',
  `first_name` string COMMENT 'from deserializer',
  `middle_name` string COMMENT 'from deserializer',
  `last_name` string COMMENT 'from deserializer',
  `dob` string COMMENT 'from deserializer',
  `gender` string COMMENT 'from deserializer',
  `marital_status_id` string COMMENT 'from deserializer',
  `organization_name` string COMMENT 'from deserializer',
  `organization_type_id` string COMMENT 'from deserializer',
  `organization_contact_person` string COMMENT 'from deserializer',
  `acquired_channel_id` string COMMENT 'from deserializer',
  `internal_customer_flag` string COMMENT 'from deserializer',
  `deceased_flag` string COMMENT 'from deserializer',
  `deceased_date` string COMMENT 'from deserializer',
  `party_status_id` string COMMENT 'from deserializer',
  `party_status_chg_rsn_id` string COMMENT 'from deserializer',
  `migrated_customer_flag` string COMMENT 'from deserializer',
  `partyaddresses` array<struct<addresstype:struct<party_address_id:string,address_type_id:string>>> COMMENT 'from deserializer',
  `addresses` array<struct<addresse:struct<address_id:string,address_line1:string,address_line2:string,address_line3:string,address_line4:string,street:string,city:string,post_code:string,county:string,country:string>>> COMMENT 'from deserializer',
  `partyspecialneeds` array<struct<specialneed:struct<special_need_id:string>>> COMMENT 'from deserializer',
  `party_joining_reasons` array<struct<party_joining_reason:struct<joining_reason_id:string>>> COMMENT 'from deserializer')
ROW FORMAT SERDE
  'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  'column.xpath.ACQUIRED_CHANNEL_ID'='/MStart/PARTY/ACQUIRED_CHANNEL_ID/text()',
  'column.xpath.ADDRESSES'='/MStart/PARTY/ADDRESSES/ADDRESSE',
  'column.xpath.DECEASED_DATE'='/MStart/PARTY/DECEASED_DATE/text()',
  'column.xpath.DECEASED_FLAG'='/MStart/PARTY/DECEASED_FLAG/text()',
  'column.xpath.DOB'='/MStart/PARTY/DOB/text()',
  'column.xpath.FIRST_NAME'='/MStart/PARTY/FIRST_NAME/text()',
  'column.xpath.GENDER'='/MStart/PARTY/GENDER/text()',
  'column.xpath.INTERNAL_CUSTOMER_FLAG'='/MStart/PARTY/INTERNAL_CUSTOMER_FLAG/text()',
  'column.xpath.LAST_NAME'='/MStart/PARTY/LAST_NAME/text()',
  'column.xpath.MARITAL_STATUS_ID'='/MStart/PARTY/MARITAL_STATUS_ID/text()',
  'column.xpath.MESSAGE_DATE_TIME'='/MStart/Message_Header/MESSAGE_DATE_TIME/text()',
  'column.xpath.MESSAGE_ID'='/MStart/Message_Header/MESSAGE_ID/text()',
  'column.xpath.MESSAGE_NAME'='/MStart/Message_Header/MESSAGE_NAME/text()',
  'column.xpath.MESSAGE_TRIGGERING_SERVICE'='/MStart/Message_Header/MESSAGE_TRIGGERING_SERVICE/text()',
  'column.xpath.MIDDLE_NAME'='/MStart/PARTY/MIDDLE_NAME/text()',
  'column.xpath.MIGRATED_CUSTOMER_FLAG'='/MStart/PARTY/MIGRATED_CUSTOMER_FLAG/text()',
  'column.xpath.ORGANIZATION_CONTACT_PERSON'='/MStart/PARTY/ORGANIZATION_CONTACT_PERSON/text()',
  'column.xpath.ORGANIZATION_NAME'='/MStart/PARTY/ORGANIZATION_NAME/text()',
  'column.xpath.ORGANIZATION_TYPE_ID'='/MStart/PARTY/ORGANIZATION_TYPE_ID/text()',
  'column.xpath.PARTYADDRESSES'='/MStart/PARTY/PARTYADDRESSES/ADDRESSTYPE',
  'column.xpath.PARTYSPECIALNEEDS'='/MStart/PARTY/PARTYSPECIALNEEDS/SPECIALNEED',
  'column.xpath.PARTY_ID'='/MStart/PARTY/PARTY_ID/text()',
  'column.xpath.PARTY_JOINING_REASONS'='/MStart/PARTY/PARTY_JOINING_REASONS/PARTY_JOINING_REASON',
  'column.xpath.PARTY_STATUS_CHG_RSN_ID'='/MStart/PARTY/PARTY_STATUS_CHG_RSN_ID/text()',
  'column.xpath.PARTY_STATUS_ID'='/MStart/PARTY/PARTY_STATUS_ID/text()',
  'column.xpath.PARTY_TYPE_ID'='/MStart/PARTY/PARTY_TYPE_ID/text()',
  'column.xpath.SALUTATION'='/MStart/PARTY/SALUTATION/text()')
STORED AS INPUTFORMAT
  'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'adl://rbdvinsightsdls01.azuredatalakestore.net/root/CRM/1000.10.8.b/Party/Process/InProgress'
TBLPROPERTIES (
  'xmlinput.end'='</MStart>',
  'xmlinput.start'='<MStart')

Note we are using the serde downloaded from  https://github.com/dvasilen/Hive-XML-SerDe/wiki/XML-data-sources

We also did the "add jar ....".

Table was created successfully, and we were able to see it using "describe table" command.

However in our java code,

rs = Connection.getMetaData().getTables()

returns all the tables, except the one created above.

If table was created without the ROW FORMATSERDE clause, the table will be returned.

If table was created with serde like "parquet.hive.serde.ParquetHiveSerDe", "org.apache.hive.hcatalog.data.JsonSerDe", "org.apache.hadoop.hive.serde2.avro.AvroSerDe", the table will be returned.

So looks like only the "com.ibm.spss.hive.serde2.xml.XmlSerDe" will have the issue. Is there any special steps to be performed in order for the Hive drive to recognize it?

Thanks,

Xuelin

All Replies

This thread is closed