
Generate different random numbers for multiple rows in a column

Question:

I have a column of integer values (n rows). I want to replace the values that meet a certain condition with random numbers drawn from a normal distribution. I tried the code below, but it is too slow.

df_members['bd'] = df_members.bd.apply(lambda x: np.random.normal(bd_mean, bd_sd) if float(x)==-99999 else x )

I also tried the code below, but it assigns the same single random value to all the matching rows.

bd_mean = 29.2223808862
bd_sd = 10.4168850957
df_members[df_members['bd'] == -99999] = np.random.normal(bd_mean, bd_sd)

Example Data:

                                           msno  city      bd  gender  registered_via
0  URiXrfYPzHAlk+7+n7BOMl9G+T7g8JmrSnT/BU8GmEo=     1  -99999     NaN               9
1  U1q0qCqK/lDMTD2kN8G9OXMtfuvLCey20OAIPOvXXGQ=     1      26     NaN               4
2  W6M2H2kAoN9ahfDYKo3J6tmsJRAeuFc9wl1cau5VL1Q=     1  -99999     NaN               4
3  1qE5+cN7CUyC+KFH6gBZzMWmM1QpIVW6A43BEm98I/w=     5      17  female               4
4  SeAnaZPI+tFdAt+r3lZt/B8PgTp7bcG/1os39u4pLxs=     1  -99999     NaN               4

EDIT

I guess that generating 3425689 random numbers (one per row) will take a long time. I will stick with the first approach for now.

Answer1:

You're missing the "size" argument of np.random.normal (https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.normal.html#numpy.random.normal), which sets the shape of the array of random values to generate. Without it, a single scalar is returned and broadcast to every matching row.

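mask = df_members['bd'] == -99999
df_members.loc[mask, 'bd'] = np.random.normal(bd_mean, bd_sd, size=mask.sum())

Selecting with .loc[mask, 'bd'] writes one draw per matching row into the bd column only; indexing the frame with the mask alone would target every column of those rows. If bd is an integer column, cast it to float first, since the draws are floats.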
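To make the size-based approach concrete, here is a minimal end-to-end sketch. It assumes a small made-up frame standing in for df_members (only city and bd are included), and the names SENTINEL and rng are introduced here for illustration. It also uses NumPy's newer Generator API with an arbitrary seed; np.random.normal with a size argument should behave the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_members; only the 'bd' column matters here.
df_members = pd.DataFrame({
    "city": [1, 1, 1, 5, 1],
    "bd": [-99999, 26, -99999, 17, -99999],
})

bd_mean = 29.2223808862
bd_sd = 10.4168850957
SENTINEL = -99999  # illustrative name for the placeholder value from the question

# Cast to float first so the float draws match the column's dtype.
df_members["bd"] = df_members["bd"].astype(float)

# Boolean mask of the rows whose 'bd' should be replaced.
mask = df_members["bd"] == SENTINEL

# Seeded generator; the seed is an arbitrary choice for reproducibility.
rng = np.random.default_rng(0)

# One draw per masked row, written only to the 'bd' column.
df_members.loc[mask, "bd"] = rng.normal(bd_mean, bd_sd, size=int(mask.sum()))

print(df_members)
```

Because all the draws come from a single vectorized call, this avoids the per-row Python function call that apply with a lambda incurs, which is probably where most of the slowness in the first attempt comes from.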
