How to iterate a vectorized if/else statement over additional columns?

Question:

    import numpy as np
    import pandas as pd

    ltlist = [1, 2]
    # org must be a DataFrame; a plain dict of lists has no .isin method
    org = pd.DataFrame({'ID': [1, 3, 4, 5, 6, 7], 'ID2': [3, 4, 5, 6, 7, 2]})
    ltlist_set = set(ltlist)

    org['LT'] = np.where(org['ID'].isin(ltlist_set), org['ID'], 0)

I also need to check the ID2 column and write its ID into LT, unless LT already has an ID from the first check.

Desired output:

    ID  ID2  LT
     1    3   1
     3    4   0
     4    5   0
     5    6   0
     6    7   0
     7    2   2

Thanks!

Answer1:

Option 1

You can nest numpy.where statements:

    org['LT'] = np.where(org['ID'].isin(ltlist_set), 1,
                         np.where(org['ID2'].isin(ltlist_set), 2, 0))
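The constants 1 and 2 above happen to equal the IDs that match in this sample. As a minimal sketch, assuming org is a pandas DataFrame and that the goal is to write the matched ID itself into LT (as the question's own snippet does with org['ID']), the columns can be passed in place of the constants:

```python
import numpy as np
import pandas as pd

ltlist_set = {1, 2}
# Assumption: org is a DataFrame built from the question's sample data.
org = pd.DataFrame({'ID': [1, 3, 4, 5, 6, 7], 'ID2': [3, 4, 5, 6, 7, 2]})

# The ID check is the outer condition, so it takes priority over ID2.
org['LT'] = np.where(
    org['ID'].isin(ltlist_set), org['ID'],
    np.where(org['ID2'].isin(ltlist_set), org['ID2'], 0),
)

print(org['LT'].tolist())  # [1, 0, 0, 0, 0, 2]
```

On this data the result matches the desired output; the two variants only differ when a matched ID is something other than 1 in ID or 2 in ID2.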

Option 2

Alternatively, you can use pd.DataFrame.loc sequentially:

    org['LT'] = 0  # default value
    org.loc[org['ID2'].isin(ltlist_set), 'LT'] = 2
    org.loc[org['ID'].isin(ltlist_set), 'LT'] = 1
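Because later assignments overwrite earlier ones, the highest-priority column has to be assigned last. The same idea could be extended to more columns with a loop. The following is only a sketch: the `columns` list is a hypothetical name introduced here, and it writes the matched ID rather than a constant label.

```python
import pandas as pd

ltlist_set = {1, 2}
# Assumption: org is a DataFrame built from the question's sample data.
org = pd.DataFrame({'ID': [1, 3, 4, 5, 6, 7], 'ID2': [3, 4, 5, 6, 7, 2]})

# Hypothetical list of columns to check, highest priority first.
columns = ['ID', 'ID2']

org['LT'] = 0  # default value
# Walk the list backwards so higher-priority columns overwrite lower ones.
for col in reversed(columns):
    mask = org[col].isin(ltlist_set)
    org.loc[mask, 'LT'] = org.loc[mask, col]

print(org['LT'].tolist())  # [1, 0, 0, 0, 0, 2]
```

Adding another column to check then only means appending its name to `columns`.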

Option 3

A third option is to use numpy.select:

    conditions = [org['ID'].isin(ltlist_set), org['ID2'].isin(ltlist_set)]
    values = [1, 2]
    org['LT'] = np.select(conditions, values, 0)  # 0 is default value
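numpy.select uses the first condition that is true for each row, so the order of the lists sets the priority. To iterate over additional columns, both lists could be built from a list of column names. This is a sketch under assumptions: `columns` is a hypothetical name, and the choices are the column values themselves rather than the constants 1 and 2.

```python
import numpy as np
import pandas as pd

ltlist_set = {1, 2}
# Assumption: org is a DataFrame built from the question's sample data.
org = pd.DataFrame({'ID': [1, 3, 4, 5, 6, 7], 'ID2': [3, 4, 5, 6, 7, 2]})

# Hypothetical list of columns to check, highest priority first.
columns = ['ID', 'ID2']

# One boolean mask and one source of values per column, in the same order.
conditions = [org[col].isin(ltlist_set) for col in columns]
choices = [org[col] for col in columns]

# Rows matching no condition get the default value 0.
org['LT'] = np.select(conditions, choices, default=0)

print(org['LT'].tolist())  # [1, 0, 0, 0, 0, 2]
```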
