35281

Plot the result of a groupby operation in pandas

Question:

I have this sample table:

ID Date Days Volume/Day 0 111 2016-01-01 20 50 1 111 2016-02-01 25 40 2 111 2016-03-01 31 35 3 111 2016-04-01 30 30 4 111 2016-05-01 31 25 5 111 2016-06-01 30 20 6 111 2016-07-01 31 20 7 111 2016-08-01 31 15 8 111 2016-09-01 29 15 9 111 2016-10-01 31 10 10 111 2016-11-01 29 5 11 111 2016-12-01 27 0 0 112 2016-01-01 31 55 1 112 2016-02-01 26 45 2 112 2016-03-01 31 40 3 112 2016-04-01 30 35 4 112 2016-04-01 31 30 5 112 2016-05-01 30 25 6 112 2016-06-01 31 25 7 112 2016-07-01 31 20 8 112 2016-08-01 30 20 9 112 2016-09-01 31 15 10 112 2016-11-01 29 10 11 112 2016-12-01 31 0

I'm trying to make my table final table look like this below after grouping by ID and Date.

ID Date CumDays Volume/Day 0 111 2016-01-01 20 50 1 111 2016-02-01 45 40 2 111 2016-03-01 76 35 3 111 2016-04-01 106 30 4 111 2016-05-01 137 25 5 111 2016-06-01 167 20 6 111 2016-07-01 198 20 7 111 2016-08-01 229 15 8 111 2016-09-01 258 15 9 111 2016-10-01 289 10 10 111 2016-11-01 318 5 11 111 2016-12-01 345 0 0 112 2016-01-01 31 55 1 112 2016-02-01 57 45 2 112 2016-03-01 88 40 3 112 2016-04-01 118 35 4 112 2016-05-01 149 30 5 112 2016-06-01 179 25 6 112 2016-07-01 210 25 7 112 2016-08-01 241 20 8 112 2016-09-01 271 20 9 112 2016-10-01 302 15 10 112 2016-11-01 331 10 11 112 2016-12-01 362 0

Next, I want to be able to extract the first value of Volume/Day per ID, all the CumDays values and all the Volume/Day values per ID and Date. So I can use them for further computation and plotting Volume/Day vs CumDays. Example for ID:111, the first value of Volume/Day will be only 50 and ID:112, it will be only 55. All CumDays values for ID:111 will be 20,45... and ID:112, it will be 31,57...For all Volume/Day --- ID:111, will be 50, 40... and ID:112 will be 55,45...

My solution:

def get_time_rate(grp_df): t = grp_df['Days'].cumsum() r = grp_df['Volume/Day'] return t,r vals = df.groupby(['ID','Date']).apply(get_time_rate) vals

Doing this, the cumulative calculation doesn't take effect at all. It returns the original Days value. This didn't allow me move further in extracting the first value of Volume/Day, all the CumDays values and all the Volume/Day values I need. Any advice or help on how to go about it will be appreciated. Thanks

Answer1:

Get a groupby object.

g = df.groupby('ID')

Compute columns with transform:

df['CumDays'] = g.Days.transform('cumsum') df['First Volume/Day'] = g['Volume/Day'].transform('first') df ID Date Days Volume/Day CumDays First Volume/Day 0 111 2016-01-01 20 50 20 50 1 111 2016-02-01 25 40 45 50 2 111 2016-03-01 31 35 76 50 3 111 2016-04-01 30 30 106 50 4 111 2016-05-01 31 25 137 50 5 111 2016-06-01 30 20 167 50 6 111 2016-07-01 31 20 198 50 7 111 2016-08-01 31 15 229 50 8 111 2016-09-01 29 15 258 50 9 111 2016-10-01 31 10 289 50 10 111 2016-11-01 29 5 318 50 11 111 2016-12-01 27 0 345 50 0 112 2016-01-01 31 55 31 55 1 112 2016-01-02 26 45 57 55 2 112 2016-01-03 31 40 88 55 3 112 2016-01-04 30 35 118 55 4 112 2016-01-05 31 30 149 55 5 112 2016-01-06 30 25 179 55 6 112 2016-01-07 31 25 210 55 7 112 2016-01-08 31 20 241 55 8 112 2016-01-09 30 20 271 55 9 112 2016-01-10 31 15 302 55 10 112 2016-01-11 29 10 331 55 11 112 2016-01-12 31 0 362 55 <hr />

If you want grouped plots, you can iterate over each groups after grouping by ID. To plot, first set index and call plot.

fig, ax = plt.subplots(figsize=(8,6)) for i, g in df2.groupby('ID'): g.plot(x='CumDays', y='Volume/Day', ax=ax, label=str(i)) plt.show()

<a href="https://i.stack.imgur.com/RYKlU.png" rel="nofollow"><img alt="enter image description here" class="b-lazy" data-src="https://i.stack.imgur.com/RYKlU.png" data-original="https://i.stack.imgur.com/RYKlU.png" src="https://etrip.eimg.top/images/2019/05/07/timg.gif" /></a>

Recommend

  • char to int conversion
  • why is django using ascii instead of utf-8?
  • vhdl 4 bit vedic multiplier
  • Merge video with ffmpeg
  • MySQL: Get nth highest value for each group in a table
  • libavdevice.so.57: failed to map segment from shared object: Permission denied\"
  • Default argument in c++
  • Django - No such table: main.auth_user__old
  • Using or in multiline sed replacement
  • regular expression to consider periods after numeric character only in php
  • How to constrain rotation from mouse input in unity 5?
  • PDF conversion suddenly fails if reading stylesheet from SSL
  • in R, how to distribution data into different group
  • Rectangular Nesting - Convergence to optimal solution using Simulated Annealing
  • How to copy pandas DataFrame values down to fill 0's?
  • Convert row values into columns in R
  • Easy Way to Get Averages Based on Names in List
  • Recreate Apple Watch fitness tracker ‘progress’ bar - gradient on CAShapeLayer stroke
  • Two column unique constraint ActiveAndroid
  • LibGdx GLES2.0 cube texel stretching
  • mysql and indexes with more than one column
  • Use allowDiskUse in criteria query with Grails and the MongoDB plugin?
  • Python: Split a String Field into 3 Separate Fields using Lambda
  • to implement a spinner in angular2+
  • Scala: Function returning an unknown type
  • jQuery - resize an elements height to match window without refreshing, on window resize
  • Group list of tuples by item
  • Reduction and collapse clauses in OMP have some confusing points
  • Grails calculated field in SQL
  • MongoDb aggregation
  • How to use remove-erase idiom for removing empty vectors in a vector?
  • How to add date and time under each post in guestbook in google app engine
  • Excel - Autoshape get it's name from cell (value)
  • Check if a string to interpolate provides expected placeholders
  • JSON with duplicate key names losing information when parsed
  • Return words with double consecutive letters
  • RestKit - RKRequestDelegate does not exist
  • Traverse Array and Display in markup
  • Python: how to group similar lists together in a list of lists?
  • WPF Applying a trigger on binding failure