
Hive ParseException in Drop Table Statement

Question:

I'm using Python, specifically the pyodbc module, to execute Hive queries on Hadoop. The code that triggers the issue looks like this:

    import pyodbc
    import pandas

    oConnexionString = 'Driver={ClouderaHive};[...]'
    oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
    oConnexion.setencoding(encoding='utf-8')

    oQueryParameter = "select * from my_db.my_table;"
    oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
    oCursor = oConnexion.cursor()

    for oRow in oParameterData.index:
        sTableName = oParameterData.loc[oRow,'TableName']
        oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
        print(oQueryDeleteTable)
        oCursor.execute(oQueryDeleteTable)

The print gives this:

    drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But the call to oCursor.execute raises the following error:

<blockquote>

pyodbc.Error: ('HY000', "[HY000] [Cloudera][HiveODBC] (80) Syntax or semantic analysis error thrown in server while execurint query. Error message from server: Error while compiling statement: FAILED: ParseException line 1:44 character ' (80) (SQLExecDirectW)")

</blockquote>

Note that when I copy the printed statement and execute it manually in Hue, it works fine. I am guessing it has something to do with the encoding of the variable sTableName, but I can't figure out how to fix it.

Thanks

Answer1:

The query was failing because the value in the variable sTableName was incorrectly decoded. Printing the query on its own displayed the text properly, as in the print above:

    >>> print(oQueryDeleteTable)
    drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But inspecting the value taken from the original data frame showed that it contained null characters (\x00) between the letters:

    >>> print(repr(oParameterData.loc[oRow,'TableName']))
    'h\x00e\x00r\x00o\x00_c\x00o\x00n\x00t\x00e\x00x\x00t\x00'
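The effect can be reproduced without a database. The sketch below is only an illustration: the table name with a NUL after every character is a made-up reconstruction of the value shown above, not data taken from the real table. It shows that `print` gives no visible hint of the problem on most terminals, while `repr` escapes the hidden characters. It also locates the first NUL in the statement, which lands at zero-based offset 44. That matches the `line 1:44` in the error message, assuming the parser reports a zero-based column.

```python
# Hypothetical reconstruction: a NUL character after every letter of the name.
table_name = "".join(ch + "\x00" for ch in "hero_context_start_gamemode")
query = "drop table if exists dl_audit_data_quality." + table_name + ";"

# print() writes the NULs unchanged; most terminals render nothing for them,
# so the statement looks correct.
print(query)

# repr() escapes non-printable characters and makes the problem visible.
print(repr(table_name))

# Zero-based offset of the first hidden character in the statement.
first_nul = query.find("\x00")
print(first_nul)  # 44
```

Checking `'\x00' in sTableName` or printing `repr(sTableName)` inside the loop is a quick way to confirm whether a given row is affected.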

The issue was solved by setting the connection's decoding explicitly, as described here: <a href="https://stackoverflow.com/questions/43827811/python-dictionary-contains-encoded-values" rel="nofollow">Python Dictionary Contains Encoded Values</a>

    import pyodbc
    import pandas

    oConnexionString = 'Driver={ClouderaHive};[...]'
    oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
    # Added: decode both narrow and wide character data returned by the driver as UTF-8
    oConnexion.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
    oConnexion.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
    oConnexion.setencoding(encoding='utf-8')

    oQueryParameter = "select * from my_db.my_table;"
    oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
    oCursor = oConnexion.cursor()

    for oRow in oParameterData.index:
        sTableName = oParameterData.loc[oRow,'TableName']
        oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
        print(oQueryDeleteTable)
        oCursor.execute(oQueryDeleteTable)
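As an extra safeguard, the table name can be checked before it is concatenated into the statement. This is a minimal sketch and not part of the original fix: the helpers `clean_identifier` and `build_drop_statement` are hypothetical names, and the rule that a table name contains only letters, digits, and underscores is an assumption that may be too strict for some schemas. Stripping NUL characters works around the symptom only, so setting the decoding on the connection remains the actual fix.

```python
import re

# Assumption: valid table names contain only letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def clean_identifier(raw_name):
    """Hypothetical helper: strip NULs and reject anything unexpected."""
    name = raw_name.replace("\x00", "").strip()
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unexpected characters in table name: {name!r}")
    return name


def build_drop_statement(database, raw_table_name):
    """Hypothetical helper: build the DROP statement from a checked name."""
    return f"drop table if exists {database}.{clean_identifier(raw_table_name)};"


# A name with hidden NULs yields a clean statement.
print(build_drop_statement("my_db", "h\x00e\x00r\x00o\x00"))

# A suspicious value is rejected instead of being sent to the server.
try:
    build_drop_statement("my_db", "hero; drop database my_db")
except ValueError as exc:
    print("rejected:", exc)
```

In the loop above, `oQueryDeleteTable = build_drop_statement('my_db', sTableName)` would replace the string concatenation. The check also guards against malformed or malicious values in the `TableName` column.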
