Concatenating dictionaries of numpy arrays of different lengths (avoiding manual loops if possible)

I have a question similar to the one discussed in "Concatenating dictionaries of numpy arrays (avoiding manual loops if possible)".

I am looking for a way to concatenate the values in two python dictionaries that contain numpy arrays of arbitrary size whilst avoiding having to manually loop over the dictionary keys. For example:

    import numpy as np

    # Create first dictionary
    n1 = 3
    s = np.random.randint(1, 101, n1)
    n2 = 2
    r = np.random.rand(n2)
    d = {"r": r, "s": s}
    print "d = ", d

    # Create second dictionary
    n3 = 1
    s = np.random.randint(1, 101, n3)
    n4 = 3
    r = np.random.rand(n4)
    d2 = {"r": r, "s": s}
    print "d2 = ", d2

    # Some operation to combine the two dictionaries...
    d = SomeOperation(d, d2)

    # Updated dictionary
    print "d3 = ", d

to give the output

    >> d = {'s': array([75, 25, 88]), 'r': array([ 0.1021227 , 0.99454874])}
    >> d2 = {'s': array([78]), 'r': array([ 0.27610587, 0.57037473, 0.59876391])}
    >> d3 = {'s': array([75, 25, 88, 78]), 'r': array([ 0.1021227 , 0.99454874, 0.27610587, 0.57037473, 0.59876391])}

That is, if a key already exists, the new array is appended to the numpy array already stored under that key.

The solution proposed in the previous discussion, which uses the pandas package, does not work here because it requires the arrays within each dictionary to have the same length (n1 = n2 and n3 = n4).

Does anybody know the best way to do this, whilst minimising the use of slow, manual for loops? (I would like to avoid loops because the dictionaries I would like to combine could have hundreds of keys).

Thanks (also to "Aim" for formulating a very clear question)!

Answer1:

One way to go is to use a dictionary of Series (i.e. the values are Series rather than arrays):

    In [11]: d2
    Out[11]: {'r': array([ 0.3536318 , 0.29363604, 0.91307454]), 's': array([46])}

    In [12]: d2 = {name: pd.Series(arr) for name, arr in d2.iteritems()}

    In [13]: d2
    Out[13]:
    {'r': 0    0.353632
     1    0.293636
     2    0.913075
     dtype: float64,
     's': 0    46
     dtype: int64}

That way you can pass it into the DataFrame constructor:

    In [14]: pd.DataFrame(d2)
    Out[14]:
              r   s
    0  0.353632  46
    1  0.293636 NaN
    2  0.913075 NaN
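The DataFrame route pads the shorter column with NaN. For comparison, here is a minimal sketch of the operation the question asks for, done directly with NumPy. It is not part of the answer above: the helper name concat_dicts is hypothetical, the sample values are made up, and the code assumes Python 3 (where dict key views support the | set union) and that every value is a 1-D array. It still iterates over the keys, but only once per key. The actual joining is done by np.concatenate, so the cost of the Python-level loop should be small even with hundreds of keys.

```python
import numpy as np


def concat_dicts(a, b):
    """Return a new dict whose arrays are a's values followed by b's values.

    Hypothetical helper for illustration. A key present in only one
    input ends up with a copy of that input's array.
    """
    merged = {}
    for key in a.keys() | b.keys():  # union of the keys (Python 3 dict views)
        # Collect the arrays stored under this key, keeping a before b.
        parts = [d[key] for d in (a, b) if key in d]
        merged[key] = np.concatenate(parts)
    return merged


# Made-up sample data with different lengths per key.
d = {"r": np.array([0.1, 0.9]), "s": np.array([75, 25, 88])}
d2 = {"r": np.array([0.2, 0.5, 0.6]), "s": np.array([78])}

d3 = concat_dicts(d, d2)
print(d3["s"])  # [75 25 88 78]
print(d3["r"])  # [0.1 0.9 0.2 0.5 0.6]
```

Unlike the DataFrame, each key keeps its own length and dtype, so the integer array under "s" is not converted to float to make room for NaN.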
