Multiplying just one column from each of the 2 input DataFrames together


I have two DataFrames with exactly the same dimensions, and I would like to multiply one specific column from each of them together.

My first DataFrame is:

    In [834]: patched_benchmark_df_sim
    Out[834]:
         build_number      name  cycles
    0             390     adpcm   21598
    1             390       aes    5441
    2             390  blowfish     NaN
    3             390     dfadd     463
    ....
    284           413      jpeg  766742
    285           413      mips    4263
    286           413     mpeg2    2021
    287           413       sha  348417

    [288 rows x 3 columns]

My second DataFrame is:

    In [835]: patched_benchmark_df_syn
    Out[835]:
         build_number      name    fmax
    0             390     adpcm  143.45
    1             390       aes  309.60
    2             390  blowfish     NaN
    3             390     dfadd  241.02
    ....
    284           413      jpeg  197.75
    285           413      mips  202.39
    286           413     mpeg2  291.29
    287           413       sha  243.19

    [288 rows x 3 columns]

I would like to take each element of the cycles column of patched_benchmark_df_sim and multiply it by the corresponding element of the fmax column of patched_benchmark_df_syn, then store the result in a new DataFrame with the same structure. The new DataFrame should contain the build_number and name columns, but its last column, holding the numerical data, should be called latency and contain the product of fmax and cycles.

So the output DataFrame has to look something like this:

       build_number   name  latency
    0           390  adpcm  ## each value here has to be product of cycles and fmax and they must correspond to one another ##
    ......

I tried a straightforward patched_benchmark_df_sim * patched_benchmark_df_syn, but that did not work because my DataFrames contain the name column, which is of string type. Is there no built-in pandas method that can do this for me? How should I do the multiplication to get the result I need?

Thank you very much.


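The simplest approach is to add a new column to the first DataFrame, then select the columns you want and, if you like, assign that selection to a new DataFrame. Here df stands for patched_benchmark_df_sim and df1 for patched_benchmark_df_syn: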

    In [356]: df['latency'] = df['cycles'] * df1['fmax']
              df
    Out[356]:
         build_number      name  cycles       latency
    0             390     adpcm   21598  3.098233e+06
    1             390       aes    5441  1.684534e+06
    2             390  blowfish     NaN           NaN
    3             390     dfadd     463  1.115923e+05
    284           413      jpeg  766742  1.516232e+08
    285           413      mips    4263  8.627886e+05
    286           413     mpeg2    2021  5.886971e+05
    287           413       sha  348417  8.473153e+07

    In [357]: new_df = df[['build_number', 'name', 'latency']]
              new_df
    Out[357]:
         build_number      name       latency
    0             390     adpcm  3.098233e+06
    1             390       aes  1.684534e+06
    2             390  blowfish           NaN
    3             390     dfadd  1.115923e+05
    284           413      jpeg  1.516232e+08
    285           413      mips  8.627886e+05
    286           413     mpeg2  5.886971e+05
    287           413       sha  8.473153e+07

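As you've found, you can't multiply whole DataFrames together when they contain non-numeric columns such as name. The above assumes that the build_number and name columns are the same in both DataFrames and appear in the same row order, because the column multiplication pairs values by index label.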
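If the rows of the two DataFrames are not guaranteed to line up, one option is to merge on the key columns first and multiply afterwards. The following is only a sketch: the names sim, syn, merged and latency_df are made up for illustration, the frames are small stand-ins built from a few values shown in the question, and the rows of syn are shuffled on purpose to show that the merge pairs rows by build_number and name rather than by position.

```python
import pandas as pd

# Hypothetical stand-ins for patched_benchmark_df_sim / patched_benchmark_df_syn,
# using a few of the values shown in the question.
sim = pd.DataFrame({
    "build_number": [390, 390, 390, 390],
    "name": ["adpcm", "aes", "blowfish", "dfadd"],
    "cycles": [21598, 5441, float("nan"), 463],
})
# Same keys, but in a different row order than sim.
syn = pd.DataFrame({
    "build_number": [390, 390, 390, 390],
    "name": ["dfadd", "blowfish", "adpcm", "aes"],
    "fmax": [241.02, float("nan"), 143.45, 309.60],
})

# Pair rows on the key columns instead of relying on matching index labels.
merged = sim.merge(syn, on=["build_number", "name"], how="inner")

# Element-wise product of the two numeric columns; NaN inputs give NaN.
merged["latency"] = merged["cycles"] * merged["fmax"]

# Keep only the columns wanted in the result.
latency_df = merged[["build_number", "name", "latency"]]
print(latency_df)
```

With how="inner", a (build_number, name) pair that appears in only one of the frames is dropped from the result; whether that is the right behaviour depends on your data.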


