
Convert an object dtype column to a numeric dtype in a pandas DataFrame

While trying to answer the question "Get List of Unique String per Column", we ran into a different problem with my dataset. When I import this CSV file into a DataFrame, every column has the object dtype. We need to convert the columns that contain only numbers to a numeric dtype, and the columns that do not to a string dtype.

Is there a way to achieve this?

Download the data sample from here

I tried the following code from the article "Pandas: change data type of columns", but it did not work.

df = pd.DataFrame(a, columns=['col1','col2','col3'])

As always, thanks for your help.

Answer1:

<strong>Option 1</strong> use pd.to_numeric in an apply

df.apply(pd.to_numeric, errors='ignore')
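Newer pandas releases deprecate errors='ignore' in pd.to_numeric (since 2.2) and recommend catching the exception explicitly instead. A minimal sketch of the same per-column idea follows; the helper name to_numeric_or_keep and the tiny sample frame are made up for illustration and are not part of the original answer.

```python
import pandas as pd

def to_numeric_or_keep(col):
    """Hypothetical helper: convert a column if every value parses, else keep it."""
    try:
        return pd.to_numeric(col)
    except (ValueError, TypeError):
        # At least one value is not a number: leave the column untouched.
        return col

# Illustrative data only: every column starts out as object dtype.
df = pd.DataFrame({"a": ["1", "2"], "b": ["x", "y"]}, dtype=object)

converted = df.apply(to_numeric_or_keep)
print(converted.dtypes)
```

Column a ends up with a numeric dtype, while column b is returned as it was.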

<strong>Option 2</strong> use pd.to_numeric on df.values.ravel

cvrtd = pd.to_numeric(df.values.ravel(), errors='coerce').reshape(-1, len(df.columns))
pd.DataFrame(np.where(np.isnan(cvrtd), df.values, cvrtd), df.index, df.columns)

<hr>

<strong>Note</strong> These are not exactly the same. For a column that contains mixed values, option 2 converts what it can, while option 1 leaves everything in that column as an object. Looking at your file, I'd choose option 1.
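The difference is easiest to see on a single mixed column. This is a small sketch with made-up values, not data from the linked file. It mimics the element-wise behaviour of option 2 with errors='coerce' and the whole-column behaviour of option 1 with an explicit try/except.

```python
import pandas as pd

# Illustrative mixed column: two numeric strings and one non-numeric string.
mixed = pd.Series(["1", "x", "3"], dtype=object)

# Element-wise (the idea behind option 2): unparsable entries become NaN ...
coerced = pd.to_numeric(mixed, errors="coerce")
# ... and are then filled back in from the original values.
elementwise = coerced.astype(object).where(coerced.notna(), mixed)

# Whole-column (the idea behind option 1): one bad value keeps the column as is.
try:
    whole_column = pd.to_numeric(mixed)
except ValueError:
    whole_column = mixed

print(elementwise.tolist())   # numbers where parsing worked, original text elsewhere
print(whole_column.tolist())  # the untouched strings
```

Either way the mixed column stays object dtype; the two approaches differ only in whether the individual values that can be parsed are turned into numbers.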

<hr>

<strong>Timing</strong>

df = pd.read_csv('HistorianDataSample/HistorianDataSample.csv', skiprows=[1, 2])

<img src="https://i.stack.imgur.com/zfCBE.png" alt="enter image description here">
