having cv2.imread reading images from file objects or memory-stream-like data (here non-extracted ta

I have a .tar file containing several hundred pictures (.png). I need to process them with OpenCV.

I am wondering whether, for efficiency reasons, it is possible to process them without going through the disk. In other words, I want to read the pictures from the memory stream of the tar file.

Consider for instance:

import tarfile
import cv2

tar0 = tarfile.open('mytar.tar')
im = cv2.imread( tar0.extractfile('fname.png').read() )

The last line doesn't work as imread expects a file name rather than a stream.

Note that reading directly from the tar stream does work for other content, e.g. for text (see e.g. this SO question).


Any suggestion to open the stream with the correct png encoding?

Untarring to a ramdisk is of course an option, although I was looking for something more cacheable.

Answer1:

Thanks to the suggestion of @abarry and this SO answer I managed to find the answer.

Consider the following:

import tarfile
import cv2
import numpy as np

def get_np_array_from_tar_object(tar_extractfl):
    '''Converts a buffer from a tar file into a 1-D uint8 np.array.'''
    return np.asarray(bytearray(tar_extractfl.read()), dtype=np.uint8)

tar0 = tarfile.open('mytar.tar')
# Flag 0 is cv2.IMREAD_GRAYSCALE; imdecode returns None if decoding fails.
im0 = cv2.imdecode(get_np_array_from_tar_object(tar0.extractfile('fname.png')), 0)
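Since the question is about several hundred pictures, the same idea can be applied to every member of the archive in one pass. The following is only a sketch under some assumptions: the helper names iter_png_bytes and decode_png are made up for illustration, members are selected purely by a .png suffix, and IMREAD_UNCHANGED is picked as a default flag (the answer above uses 0, i.e. grayscale).

```python
import tarfile


def iter_png_bytes(tar_source):
    """Yield (member_name, raw_bytes) for each .png file in a tar archive.

    tar_source is either a path or an open binary file object
    (hypothetical helper, not part of tarfile or OpenCV).
    """
    if isinstance(tar_source, (str, bytes)):
        tar = tarfile.open(tar_source)
    else:
        tar = tarfile.open(fileobj=tar_source)
    with tar:
        for member in tar:
            # Skip directories, links and non-PNG files.
            if member.isfile() and member.name.lower().endswith(".png"):
                yield member.name, tar.extractfile(member).read()


def decode_png(raw_bytes, flags=None):
    """Decode raw PNG bytes into an image array via cv2.imdecode."""
    # Imported here so iter_png_bytes stays usable without OpenCV installed.
    import cv2
    import numpy as np

    if flags is None:
        flags = cv2.IMREAD_UNCHANGED  # assumption: keep channels as stored
    buf = np.frombuffer(raw_bytes, dtype=np.uint8)
    return cv2.imdecode(buf, flags)  # None if the data cannot be decoded
```

It could be used as: for name, raw in iter_png_bytes('mytar.tar'): im = decode_png(raw). Nothing is written to disk; each member is read into memory only when the loop reaches it.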

Answer2:

Perhaps use imdecode with a buffer coming out of the tar file? I haven't tried it, but it seems promising.
