Python Pandas check if a value occurs more than once in the same day

Question:

I have a Pandas dataframe as below. What I am trying to do is check if a station has variable yyy and any other variable on the same day (as in the case of station1). If this is true I need to delete the whole row containing yyy.

Currently I am doing this with iterrows(): I loop over the rows to find the days on which this variable appears, change the variable to something like "delete me", build a new dataframe from the result (because <a href="https://stackoverflow.com/questions/15972264/why-doesnt-this-function-take-after-i-iterrows-over-a-pandas-dataframe" rel="nofollow">pandas doesn't support replacing in place</a>), and then filter the new dataframe to get rid of the unwanted rows. This works for now because my dataframes are small, but it is not likely to scale.

<strong>Question:</strong> This seems like a very "non-Pandas" way to do this. Is there some other method of deleting the unwanted variables?

               dateuse   station variable1
0  2012-08-12 00:00:00  station1       xxx
1  2012-08-12 00:00:00  station1       yyy
2  2012-08-23 00:00:00  station2       aaa
3  2012-08-23 00:00:00  station3       bbb
4  2012-08-25 00:00:00  station4       ccc
5  2012-08-25 00:00:00  station4       ccc
6  2012-08-25 00:00:00  station4       ccc

Answer1:

I might index using a boolean array. We want to delete rows (if I understand what you're after, anyway!) which have yyy and belong to a dateuse/station combination containing more than one row.

We can use transform to broadcast the size of each dateuse/station combination up to the length of the dataframe, and then flag the rows in groups which have length > 1. Then we can & this with a mask of where the yyys are, and negate the result to keep every other row.
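To make the broadcast step concrete, here is a minimal sketch on the question's sample data. The dataframe is rebuilt by hand, `group_size` is a name chosen only for illustration, and `transform("size")` is used on the assumption that it counts rows per group the same way `transform(len)` does.

```python
import pandas as pd

# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "dateuse": pd.to_datetime([
        "2012-08-12", "2012-08-12", "2012-08-23", "2012-08-23",
        "2012-08-25", "2012-08-25", "2012-08-25",
    ]),
    "station": ["station1", "station1", "station2", "station3",
                "station4", "station4", "station4"],
    "variable1": ["xxx", "yyy", "aaa", "bbb", "ccc", "ccc", "ccc"],
})

# One value per original row: the number of rows in that row's
# (dateuse, station) group. The result shares df's index, so it can be
# compared and combined with other row-wise boolean masks.
group_size = df.groupby(["dateuse", "station"])["variable1"].transform("size")
print(group_size.tolist())  # [2, 2, 1, 1, 3, 3, 3]
```

Because the result is aligned with the original index, `group_size > 1` is already a row-wise mask and needs no merge back onto the dataframe.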

>>> multiple = df.groupby(["dateuse", "station"])["variable1"].transform(len) > 1
>>> must_be_isolated = df["variable1"] == "yyy"
>>> df[~(multiple & must_be_isolated)]
               dateuse   station variable1
0  2012-08-12 00:00:00  station1       xxx
2  2012-08-23 00:00:00  station2       aaa
3  2012-08-23 00:00:00  station3       bbb
4  2012-08-25 00:00:00  station4       ccc
5  2012-08-25 00:00:00  station4       ccc
6  2012-08-25 00:00:00  station4       ccc
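Note that a size-based mask also drops yyy rows when yyy is merely repeated within a group and no other variable is present. If the intent is strictly "yyy together with some other variable", one possible variant counts distinct values per group instead. This is only a sketch: `drop_shadowed` and its parameter names are hypothetical, not part of pandas or of the answer above.

```python
import pandas as pd


def drop_shadowed(df, value="yyy", keys=("dateuse", "station"), col="variable1"):
    """Drop rows equal to `value` when their group also holds a different value.

    Hypothetical helper for illustration only.
    """
    # Number of distinct values in each row's group, broadcast to row length.
    distinct = df.groupby(list(keys))[col].transform("nunique")
    is_value = df[col] == value
    # Keep every row except `value` rows that share a group with another value.
    return df[~(is_value & (distinct > 1))]


# Sample data rebuilt by hand from the question.
df = pd.DataFrame({
    "dateuse": pd.to_datetime([
        "2012-08-12", "2012-08-12", "2012-08-23", "2012-08-23",
        "2012-08-25", "2012-08-25", "2012-08-25",
    ]),
    "station": ["station1", "station1", "station2", "station3",
                "station4", "station4", "station4"],
    "variable1": ["xxx", "yyy", "aaa", "bbb", "ccc", "ccc", "ccc"],
})

result = drop_shadowed(df)
print(result)
```

On the sample data this removes only row 1, the same result as the size-based mask. The two approaches differ only when yyy appears several times in a group by itself.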
