having cv2.imread reading images from file objects or memory-stream-like data (here non-extracted ta

I have a .tar file containing several hundred pictures (.png). I need to process them with OpenCV.

I am wondering whether, for efficiency reasons, it is possible to process them without going through the disk. In other words, I want to read the pictures from the memory stream related to the tar file.

Consider for instance

    import tarfile
    import cv2

    tar0 = tarfile.open('mytar.tar')
    im = cv2.imread(tar0.extractfile('fname.png').read())

The last line doesn't work as imread expects a file name rather than a stream.

Note that reading directly from the tar stream like this can be done e.g. for text (see e.g. this SO question).


Any suggestions on how to open the stream with the correct PNG encoding?

Untarring to a RAM disk is of course an option, although I was looking for something more cacheable.


Thanks to the suggestion of @abarry and this SO answer I managed to find the answer.

Consider the following

    import tarfile
    import cv2
    import numpy as np

    def get_np_array_from_tar_object(tar_extractfl):
        '''Converts a buffer from a tar file into an np.array of bytes.'''
        return np.asarray(bytearray(tar_extractfl.read()), dtype=np.uint8)

    tar0 = tarfile.open('mytar.tar')
    # Flag 0 is cv2.IMREAD_GRAYSCALE, so the image is decoded as grayscale.
    im0 = cv2.imdecode(get_np_array_from_tar_object(tar0.extractfile('fname.png')), 0)
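Since the question mentions several hundred pictures, the same idea can probably be extended to walk the whole archive instead of naming one member. Below is a minimal sketch, not a tested recipe for any particular archive: the helper names (iter_png_members, decode_gray, load_all_pngs) are made up for illustration, and np.frombuffer is swapped in for the bytearray conversion above as an assumption that a read-only view of the bytes is enough for imdecode.

```python
import tarfile


def iter_png_members(tar):
    """Yield (name, raw_bytes) for every regular .png member of an open tar."""
    for member in tar.getmembers():
        # Skip directories, links and anything that is not a PNG by name.
        if not member.isfile() or not member.name.lower().endswith('.png'):
            continue
        fileobj = tar.extractfile(member)
        if fileobj is None:
            continue
        yield member.name, fileobj.read()


def decode_gray(raw_bytes):
    """Decode encoded image bytes into a grayscale array via cv2.imdecode."""
    # Imported here so the archive-walking helper above needs only the stdlib.
    import cv2
    import numpy as np
    buf = np.frombuffer(raw_bytes, dtype=np.uint8)
    return cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE)


def load_all_pngs(tar_path):
    """Return {member name: decoded image} for a tar archive on disk."""
    with tarfile.open(tar_path) as tar:
        return {name: decode_gray(raw) for name, raw in iter_png_members(tar)}
```

Something like images = load_all_pngs('mytar.tar') would then give a dict keyed by member name. Nothing is extracted to disk, but every decoded image is held in memory at once, so for a large archive it may be better to loop over iter_png_members and process one picture at a time.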


Perhaps use imdecode with a buffer coming out of the tar file? I haven't tried it, but it seems promising.

