I have two pandas dataframes, both index with datetime entries. The
df1 has non-unique time indices, whereas
df2 has unique ones. I would like to add a column
df1 in the following way: for every row in
df1 with timestamp
df1.a should contain the most recent value of
df2.a whose timestamp is less then
For example, let's say that
df2 is sampled every minute, and there are rows with timestamps
df1. In this case I would like the value from
df2.a[08:00:00] to be used for the first two rows, and
df2.a[08:02:00] for the third. How can I do this?
apply to rows of df1, reindex on df2 with ffill.
df1['df2.a'] = df1.apply(lambda x: pd.Series(df2.a.reindex([x.name]).ffill().values), axis=1)
pd.merge(df1, df2, left_on='ts', right_on='a')