I have a xlsx file with a column containing Dates in the format: "01.01.1900 09:01:25". The file is password protected so I convert it to a dataframe by means of win32com.client library.
Here is the code:
import pandas as pd import win32com.client xlApp = win32com.client.Dispatch("Excel.Application") xlApp.DisplayAlerts = False xlwb = xlApp.Workbooks.Open(File, False, True, None, " ") #Open Workbook password " " xlws = xlwb.Sheets("Sheet 1") #Open Sheet 1 #Get table dimensions LastRow = xlws.Range("A1").CurrentRegion.Rows.Count LastColumn = xlws.Range("A1").CurrentRegion.Columns.Count header=list((xlws.Range(xlws.Cells(1, 1), xlws.Cells(1, LastColumn)).Value)) content = list(xlws.Range(xlws.Cells(2, 1), xlws.Cells(LastRow, LastColumn)).Value) #Get the dataframe df=pd.DataFrame(data=content, columns=header) print (df)
I checked that once imported dtype as been automatically and correctly assigned to datetime64 for that column. The issue is that any time I try to do whatever with any value of that column (just print it or compare it) I get a meesage saying:
File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info AttributeError: 'NoneType' object has no attribute 'total_seconds' Exception ignored in: 'pandas._libs.tslib._localize_tso' Traceback (most recent call last): File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info AttributeError: 'NoneType' object has no attribute 'total_seconds' Traceback (most recent call last):
Nonetheless the code works perfectly, but the warning message is annoying me.
Is there anything I can do with the datatype to avoid that warning?
Opening the excel in this way, the
content variable is a list of tuples.
Having a look on those tuples there is a TimeZoneInfo that localizes all the dates in a kind of time zone, in my case "GMT Standard Time".
So once converted to a dataframe, when doing
df.dtypes the result is not only "datetime64" but "datetime64 (UTC+0:00) Dublin, Edimburg, ..."
This time zone setting only happens when opening the excel file through
win32com.client. If you removed the password, you can open it with
pandas.read_excel and discover that there is no timezone set for those datetimes and the mentioned warning does not appear.
Don't know exactly the reason it happens, but I have a solution for the original example. The warning dissapears setting a timezone recognized by tz database as
"UTC" or simply
None. Something like: