
Question:

I know I can do something like

```
numpy.loadtxt('data.txt', dtype={'names': ('time', 'magnitude'),
                                 'formats': ('S12', 'f8')})
```

but this gives me times as a string. How can I manipulate it into a float?

Answer 1:

You could use the <a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html" rel="nofollow">`converters` parameter</a> to apply a function to each string in the first column. Calling a Python function once per row may slow `np.loadtxt` down considerably, but this can still be a workable solution for moderate-sized files:

```
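import numpy as np

def parse_date(datestr):
    # Some NumPy versions pass bytes (not str) to converters under Python 3.
    if isinstance(datestr, bytes):
        datestr = datestr.decode()
    # 'HH:MM:SS' -> seconds as a float
    return sum(multiplier * val for multiplier, val in
               zip((3600, 60, 1), map(float, datestr.split(':'))))

x = np.loadtxt('data', dtype={'names': ('time', 'magnitude'), 'formats': ('f8', 'f8')},
               converters={0: parse_date})
print(x)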
```
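As a quick sanity check of the converter approach, here is a minimal sketch that reads two made-up rows from an in-memory buffer instead of a file. The sample values and the name `hms_to_seconds` are assumptions for illustration only; the function does the same arithmetic as `parse_date` above.

```python
import io
import numpy as np

def hms_to_seconds(text):
    """Convert 'HH:MM:SS[.fff]' to seconds (hypothetical helper name)."""
    # Some NumPy versions hand bytes to converters; decode if needed.
    if isinstance(text, bytes):
        text = text.decode()
    hours, minutes, seconds = (float(part) for part in text.split(':'))
    return 3600.0 * hours + 60.0 * minutes + seconds

# Made-up sample data: a time column and a magnitude column.
sample = io.StringIO("01:02:03.5 10.0\n12:30:15.25 11.5\n")
table = np.loadtxt(sample,
                   dtype={'names': ('time', 'magnitude'), 'formats': ('f8', 'f8')},
                   converters={0: hms_to_seconds})
print(table['time'])       # seconds since midnight for each row
print(table['magnitude'])
```

The first row works out to 1*3600 + 2*60 + 3.5 = 3723.5 seconds, which is easy to verify by hand.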

<hr />Alternatively, you could parse the strings into floats after using loadtxt like this:

```
x = np.loadtxt('data', dtype={'names': ('time', 'magnitude'), 'formats': ('S12', 'f8')})
# Under Python 3 the 'S12' column holds bytes, so split on b':'.
arr = np.char.split(x['time'], b':')
# http://stackoverflow.com/a/19459439/190597 (Jaime)
# np.float was removed from recent NumPy; the builtin float is equivalent.
newarr = np.fromiter((tuple(row) for row in arr), dtype=[('', float)]*3,
                     count=len(arr)).view('float').reshape(-1, 3)
times = (newarr * [3600, 60, 1]).sum(axis=1)
y = np.empty_like(x, dtype={'names': ('time', 'magnitude'), 'formats': ('f8', 'f8')})
y['time'] = times
y['magnitude'] = x['magnitude']
print(y)
```
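For small inputs, the same post-load conversion can be sketched more simply. This is a simplified variant, not the `np.fromiter` technique above: it assumes the time column is loaded as text (`'U12'` rather than `'S12'`) and uses made-up sample rows in an in-memory buffer.

```python
import io
import numpy as np

# Made-up sample data standing in for the real file.
sample = io.StringIO("01:02:03.5 10.0\n12:30:15.25 11.5\n")
raw = np.loadtxt(sample,
                 dtype={'names': ('time', 'magnitude'), 'formats': ('U12', 'f8')})

# Split each 'HH:MM:SS' string and convert the pieces to floats: shape (N, 3).
parts = np.array([t.split(':') for t in raw['time']], dtype=float)

# Weight hours, minutes, seconds and sum across each row.
seconds = parts @ np.array([3600.0, 60.0, 1.0])

# Build the final structured array with a float time column.
result = np.empty(len(raw), dtype={'names': ('time', 'magnitude'),
                                   'formats': ('f8', 'f8')})
result['time'] = seconds
result['magnitude'] = raw['magnitude']
print(result)
```

The list comprehension still loops in Python, so for large files the timing comparison in this answer remains the better guide.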

<hr />Edit: I created a test file of 10**6 lines to compare the two methods. The second method is a bit faster:

```
In [329]: %timeit using_fromiter()
1 loops, best of 3: 5.59 s per loop
In [328]: %timeit using_converter()
1 loops, best of 3: 6.88 s per loop
```

<hr />

```
import os
import numpy as np

# Write and read the same path (the original wrote to ~/tmp/data but read 'data').
# The ~/tmp directory must already exist.
FILENAME = os.path.expanduser('~/tmp/data')

def create_data(N):
    # Random times of day in seconds, split into hours, minutes, seconds.
    data = np.random.random(size=N)*86400
    hours, remainder = np.divmod(data, 3600)
    minutes, seconds = np.divmod(remainder, 60)
    mag = np.arange(N)
    with open(FILENAME, 'w') as f:
        for h, m, s, a in np.column_stack([hours, minutes, seconds, mag]):
            f.write('{h:d}:{m:d}:{s:.6f} {a}\n'.format(h=int(h), m=int(m), s=s, a=a))

def parse_date(datestr):
    # Some NumPy versions pass bytes (not str) to converters under Python 3.
    if isinstance(datestr, bytes):
        datestr = datestr.decode()
    return sum(multiplier * val for multiplier, val in
               zip((3600, 60, 1), map(float, datestr.split(':'))))

def using_converter():
    x = np.loadtxt(FILENAME, dtype={'names': ('time', 'magnitude'),
                                    'formats': ('f8', 'f8')},
                   converters={0: parse_date})
    return x

def using_fromiter():
    x = np.loadtxt(FILENAME, dtype={'names': ('time', 'magnitude'), 'formats': ('S12', 'f8')})
    # The 'S12' column holds bytes under Python 3, so split on b':'.
    arr = np.char.split(x['time'], b':')
    newarr = np.fromiter((tuple(row) for row in arr), dtype=[('', float)]*3,
                         count=len(arr)).view('float').reshape(-1, 3)
    times = (newarr * [3600, 60, 1]).sum(axis=1)
    y = np.empty_like(x, dtype={'names': ('time', 'magnitude'), 'formats': ('f8', 'f8')})
    y['time'] = times
    y['magnitude'] = x['magnitude']
    return y

create_data(10**6)
```