Concatenating Multiple DataFrames with Non-Standard Columns

Question:

Is there a good way to concatenate a list of DataFrames when the column sets differ from one DataFrame to the next?

The desired outcome is to line up every column that has a match and to keep the unmatched ones off to the side. The unmatched columns need to be kept because a column with no match between the 1st and 2nd DataFrames in the list may still have a match between the 1st and 3rd. Discarding a column at the first lack of a match would therefore not be ideal.

An example is:

print list(datalist[0].columns)
>>>[u'1', u'2', u'3']
print list(datalist[1].columns)
>>>[u'1', u'2', u'4']
print list(datalist[2].columns)
>>>[u'2', u'3', u'4']

Where the output would be a dataframe like (stylistically represented here):

1 2 3 -
1 2 - 4
- 2 3 4

Answer1:

import pandas as pd
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)

This works. I was originally under the impression that concat with join="outer" would just stack the frames top to bottom without regard to column names. Actually, concat aligns on column names: with join="outer" (which is also the default) it combines the columns that match and keeps every unmatched column off to the side, filling the missing cells with NaN, which is exactly what is desired. Hope this helps someone else.
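For anyone who wants to try it, here is a minimal sketch of the same call. The three small frames and their values are made up for illustration, with column names borrowed from the question. The join="inner" call at the end is only there for contrast and is not part of the answer above.

```python
import pandas as pd

# Hypothetical stand-ins for the three frames in the question;
# only the column names come from the original example.
datalist = [
    pd.DataFrame({"1": [10], "2": [20], "3": [30]}),
    pd.DataFrame({"1": [11], "2": [21], "4": [41]}),
    pd.DataFrame({"2": [22], "3": [32], "4": [42]}),
]

# Outer join: keep the union of all column names, align rows on them,
# and leave NaN wherever a frame did not have that column.
data = pd.concat(datalist, join="outer", axis=0, ignore_index=True)
print(data)

# For contrast, an inner join keeps only the columns shared by every frame.
common = pd.concat(datalist, join="inner", axis=0, ignore_index=True)
print(list(common.columns))
```

In this sketch the outer result has one row per input frame and the four columns '1' through '4', while the inner result keeps only column '2', the one name all three frames share.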
