Move data from Hive tables in Google Dataproc to BigQuery


We run our data transformations on Google Dataproc, and all of our data resides in Hive tables on Dataproc. How do I transfer this data to BigQuery?


Transferring data from Hive to BigQuery seems to follow a standard pattern:

<ul><li>Dump your Hive tables into Avro files</li> <li>Load those files into BigQuery</li> </ul>
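The two steps above could be sketched roughly as follows. This is only an illustration, not a tested migration script: the table name, dataset name and bucket path are hypothetical placeholders, and it assumes the Dataproc cluster can write to `gs://` paths from Hive and that the `bq` command-line tool is installed. The helpers only build the HiveQL statement and the `bq load` command; running them is left to you.

```python
import shlex


def hive_export_statement(table, gcs_dir):
    """Build HiveQL that writes every row of `table` as Avro files under `gcs_dir`."""
    return (
        f"INSERT OVERWRITE DIRECTORY '{gcs_dir}' "
        f"STORED AS AVRO SELECT * FROM {table}"
    )


def bq_load_command(bq_table, gcs_dir):
    """Build the `bq load` argument list that loads the exported Avro files.

    A trailing wildcard is used because Hive output files usually
    have no `.avro` extension.
    """
    uri = gcs_dir.rstrip("/") + "/*"
    return ["bq", "load", "--source_format=AVRO", bq_table, uri]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    export_dir = "gs://my-bucket/hive_export/sales"
    print(hive_export_statement("sales", export_dir))
    print(" ".join(shlex.quote(arg) for arg in bq_load_command("my_dataset.sales", export_dir)))
```

The first printed statement would be run in Hive (for example with `hive -e`), and the second on a machine where `bq` is authenticated against your project.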

See an example here: <a href="https://stackoverflow.com/questions/46958916/migrate-hive-table-to-google-bigquery/47038501#47038501" rel="nofollow">Migrate hive table to Google BigQuery</a>

Take care with type compatibility between Hive, Avro and BigQuery.

For a first migration, it would not hurt to validate the result by checking that the tables in Hive and BigQuery contain the same data: <a href="https://github.com/bolcom/hive_compared_bq" rel="nofollow">https://github.com/bolcom/hive_compared_bq</a>
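If you only want a quick sanity check before reaching for a dedicated tool, comparing row counts per table is a cheap first step. The sketch below is not how hive_compared_bq works; it is a hypothetical helper that assumes you have already collected the counts on each side (for example with `SELECT COUNT(*)` in Hive and in BigQuery) into two dictionaries.

```python
def count_mismatches(hive_counts, bq_counts):
    """Return {table: (hive_count, bq_count)} for tables whose row counts differ.

    A table missing on one side is reported with None for that side.
    """
    mismatches = {}
    for table in sorted(set(hive_counts) | set(bq_counts)):
        hive_n = hive_counts.get(table)
        bq_n = bq_counts.get(table)
        if hive_n != bq_n:
            mismatches[table] = (hive_n, bq_n)
    return mismatches


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    hive = {"sales": 1000, "customers": 250}
    bq = {"sales": 1000, "customers": 249}
    print(count_mismatches(hive, bq))
```

Matching row counts do not prove the data is identical, so a content-level comparison is still worth doing for important tables.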

