
Create external Hive tables from a CSV file with different record formats

I have a CSV file that contains several record formats. The format of each record is determined by the value in its first column. Sample data:

"EL","XXXXXXX", 2017-07-17
"EH","XXXXXXX",1,2017-07-17,"AAA"
"BI","XXXXXXX","AAA","BBBB"

In this case the file contains 3 defined record types. Is there a way to load each record type into a different Hive table?

Answer1:

Demo

create table el (s1 string,d1 date);
create table eh (s1 string,i1 int,dt1 date,s2 string);
create table bi (s1 string,s2 string,s3 string);

create external table myfile
(
    c1 string
   ,c2 string
   ,c3 string
   ,c4 string
   ,c5 string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties
(
    'separatorChar' = ','
   ,'quoteChar'     = '"'
   ,'escapeChar'    = '\\'
)
stored as textfile
;

select * from myfile;

+-----+----------+-------------+-------------+-------+
| c1  | c2       | c3          | c4          | c5    |
+-----+----------+-------------+-------------+-------+
| EL  | XXXXXXX  | 2017-07-17  | NULL        | NULL  |
| EH  | XXXXXXX  | 1           | 2017-07-17  | AAA   |
| BI  | XXXXXXX  | AAA         | BBBB        | NULL  |
+-----+----------+-------------+-------------+-------+

from myfile
insert into el select c2,c3       where c1='EL'
insert into eh select c2,c3,c4,c5 where c1='EH'
insert into bi select c2,c3,c4    where c1='BI'
;

select * from el;

+----------+-------------+
| s1       | d1          |
+----------+-------------+
| XXXXXXX  | 2017-07-17  |
+----------+-------------+

select * from eh;

+----------+-----+-------------+------+
| s1       | i1  | dt1         | s2   |
+----------+-----+-------------+------+
| XXXXXXX  | 1   | 2017-07-17  | AAA  |
+----------+-----+-------------+------+

select * from bi;

+----------+------+-------+
| s1       | s2   | s3    |
+----------+------+-------+
| XXXXXXX  | AAA  | BBBB  |
+----------+------+-------+
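A possible variant of the demo, offered as a sketch rather than a definitive version. OpenCSVSerde exposes every column as a string, so the conversions to int and date can be written out explicitly instead of being left implicit, and trim() guards against stray spaces such as the one before the date in the sample EL record. The location path is a hypothetical placeholder, since the original does not say where the file lives; this definition would stand in for the demo's myfile table, not sit alongside it.

```sql
-- Variant of the staging table with an explicit location.
-- '/path/to/csv/dir' is a hypothetical placeholder, not from the original.
create external table myfile
(
    c1 string
   ,c2 string
   ,c3 string
   ,c4 string
   ,c5 string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties
(
    'separatorChar' = ','
   ,'quoteChar'     = '"'
   ,'escapeChar'    = '\\'
)
stored as textfile
location '/path/to/csv/dir'
;

-- Same routing as the demo: the source is scanned once and each row
-- goes to the table matching its record type. Casts are explicit here.
from myfile
insert into table el
    select c2, cast(trim(c3) as date)
    where c1 = 'EL'
insert into table eh
    select c2, cast(trim(c3) as int), cast(trim(c4) as date), c5
    where c1 = 'EH'
insert into table bi
    select c2, c3, c4
    where c1 = 'BI'
;

-- Optional check: lists every record type present in the file, so a
-- type not covered by the three inserts above would show up here.
select c1, count(*) as cnt
from myfile
group by c1
;
```

A value that cannot be converted becomes NULL in the target column rather than failing the insert, so comparing the counts from the last query against the row counts of el, eh and bi is a cheap way to confirm nothing was dropped by the filters.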
