I have a table in Hive:
CREATE EXTERNAL TABLE sr2015( creation_date STRING, status STRING, first_3_chars_of_postal_code STRING, intersection_street_1 STRING, intersection_street_2 STRING, ward STRING, service_request_type STRING, division STRING, section STRING ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ( 'colelction.delim'='\u0002', 'field.delim'=',', 'mapkey.delim'='\u0003', 'serialization.format'=',', 'skip.header.line.count'='1', 'quoteChar'= "\"")
The table is loaded data this way:
LOAD DATA INPATH "hdfs:///user/rxie/SR2015.csv" INTO TABLE sr2015;
Why the table is only accessible in Hive? when I attempt to access it in HUE/Impala Editor I got the following error:<blockquote>
AnalysisException: Could not resolve table reference: 'sr2015'</blockquote>
which seems saying there is no such a table, but the table does show up in the left panel.
In Impala-shell, error is different as below:<blockquote>
ERROR: AnalysisException: Failed to load metadata for table: 'sr2015' CAUSED BY: TableLoadingException: Failed to load metadata for table: sr2015 CAUSED BY: InvalidStorageDescriptorException: Impala does not support tables of this type. REASON: SerDe library 'org.apache.hadoop.hive.serde2.OpenCSVSerde' is not supported.</blockquote>
I have always been thinking Hive table and Impala table are essentially the same and difference is Impala is a more efficient query engine.
Can anyone help sort it out? Thank you very much.Answer1:
sr2015 is located in DB called
db, in order to make the table visible in Impala, you need to either issue
invalidate metadata db;</blockquote>
invalidate metadata db.sr2015;</blockquote>
in Impala shell
However in your case, the reason is probably the version of Impala you're using, since it doesn't support the table format altogetherAnswer2:
Unfortunately, Impala <a href="https://www.cloudera.com/documentation/enterprise/latest/topics/impala_faq.html#faq_features__faq_unsupported" rel="nofollow">doesn't support</a> custom Hive Serializer/Deserializer classes (SerDes). It can only work with a limited set of file formats supported by its built-in SerDes.
Seems like this leaves you with the only option -- to CREATE another table (in Hive or Impala) using standard file format, and copy the data over from your Hive-only table
CREATE TABLE unquoted (
ROW FORMAT DELIMITED
STORED AS TEXTFILE;
INSERT INTO unquoted SELECT * FROM sr2015;
FWIW regarding your original issue (that Impala couldn't handle quotes in CSV file) there is a stagnant <a href="https://issues.apache.org/jira/browse/IMPALA-2148" rel="nofollow">Impala JIRA</a> which didn't get any traction for the last 3 years :(