# Getting date ranges for multiple datetime pairs

Given a datetime array of the shape `(n, 2)`:

```
import numpy as np
import pandas as pd

x = np.array(
    [['2017-10-02T00:00:00.000000000', '2017-10-12T00:00:00.000000000']],
    dtype='datetime64[ns]'
)
```

Here `x` has shape `(1, 2)`, but in general it could be `(n, 2)` with `n >= 1`. In each pair, the first date is always earlier than (or equal to) the second. I want a single array containing every date in the range defined by each pair in `x`. This is basically what I'm doing:

```
# Note: pandas 1.4+ renames `closed=` to `inclusive=` (removed in 2.0).
np.concatenate([pd.date_range(*y, closed='right') for y in x])
```

And it works, giving

```
array(['2017-10-03T00:00:00.000000000', '2017-10-04T00:00:00.000000000',
       '2017-10-05T00:00:00.000000000', '2017-10-06T00:00:00.000000000',
       '2017-10-07T00:00:00.000000000', '2017-10-08T00:00:00.000000000',
       '2017-10-09T00:00:00.000000000', '2017-10-10T00:00:00.000000000',
       '2017-10-11T00:00:00.000000000', '2017-10-12T00:00:00.000000000'],
      dtype='datetime64[ns]')
```

But this is pretty slow because of the list comprehension; it isn't vectorised the way I'd like. Is there a better way to obtain date ranges for multiple pairs of dates?

I'll provide as much clarification as needed. Thanks.

```
d = np.array(1, dtype='timedelta64[D]')
x = x.astype('datetime64[D]')
deltas = np.diff(x, axis=1) / d

np.concatenate([
    i + np.arange(j + 1)
    for i, j in zip(x[:, 0], deltas[:, 0].astype(int))
]).astype('datetime64[ns]')

array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
       '2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
       '2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
       '2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
       '2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
       '2017-10-12T00:00:00.000000000'],
      dtype='datetime64[ns]')
```

<hr>
- `d` represents one day
- `x` is turned into dates with no timestamps
- `diff` gets me the number of days difference... but in `timedelta` space
- I divide by my `d`, which is also in `timedelta` space, and the dimensions disappear... leaving me with `float`, which I cast to `int`
- When I add the first column of the pairs `x[:, 0]` to an array of integers, I get a broadcasting of adding 1 unit of whatever the dimension is of `x`, which is `datetime64[D]`. So I'm adding one day.

<hr>
```
d = np.array(1, dtype='timedelta64[D]')
np.concatenate([np.arange(row[0], row[1] + 1, d) for row in x])

array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
       '2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
       '2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
       '2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
       '2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
       '2017-10-12T00:00:00.000000000'],
      dtype='datetime64[ns]')
```
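Both snippets above still loop over the rows in Python. If that loop turns out to be the bottleneck, one possible way to drop it is the `np.repeat` / `np.cumsum` trick sketched below. This is only a sketch under a few assumptions: the function name `expand_date_ranges` is made up for illustration, `n >= 1`, a daily step is wanted, and both endpoints are included (as in the answers above, unlike `closed='right'` in the question).

```python
import numpy as np

def expand_date_ranges(pairs):
    """Hypothetical helper: expand (n, 2) datetime64 pairs into one flat array of days.

    Assumes n >= 1 and start <= end in every row; both endpoints are included.
    """
    days = pairs.astype('datetime64[D]')            # drop the time-of-day part
    starts = days[:, 0]
    # Number of days in each inclusive range, as plain integers.
    lengths = (days[:, 1] - starts).astype(np.int64) + 1
    ends = np.cumsum(lengths)                       # exclusive end index of each range in the output
    # Position of every output element inside its own range: 0, 1, ..., length - 1.
    offsets = np.arange(ends[-1]) - np.repeat(ends - lengths, lengths)
    # Repeat each start date once per day in its range, then add the per-element offset.
    result = np.repeat(starts, lengths) + offsets.astype('timedelta64[D]')
    return result.astype('datetime64[ns]')

x = np.array(
    [['2017-10-02', '2017-10-04'],
     ['2017-11-01', '2017-11-02']],
    dtype='datetime64[ns]'
)
print(expand_date_ranges(x))
```

To mimic `closed='right'` from the question, slicing off each start date (for example by using `starts + 1` and `lengths - 1`) should work, though I haven't benchmarked either variant against the list comprehension.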