Reading the same data from stdin multiple times in C


I'm writing a cache simulator in C that's based on trace files, which I want to pipe into the program via stdin. These trace files can be up to 15 billion lines long, so I don't want to store them anywhere in active memory. I want to run the simulation multiple times for different memory configurations from one call using a configuration file which is specified in the input to the program. The program call should look like this:

cat (trace file) | ./MemorySimulator -f (config file)

Right now, the program uses the config file to set the parameters of a simulation, then reads the piped-in formatted data from stdin using scanf() until it reaches the end of the trace file. It then proceeds to the next configuration setting from the config file and tries to read the data from the trace file again. This process continues until the configuration options have been exhausted.

The problem I'm running into is that once I run through the trace file once, I'm unable to capture the data again for the following memory configuration from the config file.

Is there a way to recycle the pipe data within my C program so that I can run the simulation multiple times from a single program execution? So far, I haven't been able to find a way to accomplish this.


No, that doesn't work. That's the very nature of a pipe.

You cannot demand both that the data isn't stored anywhere and that it can be requested again.

In a pipe, once the data has been written and read, it is gone, so you have to store it somewhere if you don't want to lose it.

The only way you can accomplish this is to "imitate" the behaviour of the other program - which should be trivial in the cat case.

To be exact, your command line is a very good example of the famous UUOC (Useless Use of cat).

If you are required to read from stdin, well, that doesn't have to be a pipe. Instead of

cat file | program

you can do

program < file

and this doesn't give you a pipe, but direct access to the file, including the ability to seek.

You could use this if possible, and if not, either cache the data yourself or refuse to run.

This, however, doesn't work if you are required to accept all kinds of standard input.
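To decide at run time whether stdin was redirected from a file or is a pipe, one option is to ask the operating system whether the descriptor can seek. The following is a minimal sketch, assuming a POSIX system; the helper names fd_is_seekable and restart_input are hypothetical and not part of any library.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std= modes */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the descriptor supports seeking
 * (e.g. a regular file redirected with `program < file`), 0 if it
 * does not (lseek fails with ESPIPE on a pipe, FIFO, or socket). */
int fd_is_seekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
}

/* Hypothetical helper: repositions the stream at its beginning for the
 * next pass. Returns 0 on success, -1 if the stream cannot be rewound. */
int restart_input(FILE *fp)
{
    if (!fd_is_seekable(fileno(fp)))
        return -1;
    return fseek(fp, 0L, SEEK_SET) == 0 ? 0 : -1;
}
```

A program could call restart_input(stdin) before each pass after the first, and fall back to caching the data (or refuse to run) when it returns -1.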


You asked:


Is there a way to recycle the pipe data within my C program so that I can run the simulation multiple times from a single program execution?


If you are open to using the trace file as an input argument to the program, you can accomplish what you want.

Instead of

cat <tracefile> | ./MemorySimulator -f (config file)

you can use:

./MemorySimulator <tracefile> -f (config file)

In main, use fopen to open the trace file. Once you are done using it for one configuration, rewind it using rewind and reuse the FILE* for the next configuration.

You can also use fopen/fclose on the trace file for each configuration.
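A minimal sketch of the rewind approach follows. The functions run_pass and run_all_configs are hypothetical stand-ins: run_pass merely counts newline characters where the real simulation of one configuration would go.

```c
#include <stdio.h>

/* Hypothetical stand-in for one simulation pass: reads the stream to
 * EOF and returns the number of newline characters seen. */
long run_pass(FILE *trace)
{
    long lines = 0;
    int c;
    while ((c = getc(trace)) != EOF) {
        if (c == '\n')
            lines++;
    }
    return lines;
}

/* Opens the trace file once and reuses the same FILE* for every
 * configuration. Returns 0 on success, -1 if the file cannot be opened. */
int run_all_configs(const char *path, int num_configs)
{
    FILE *trace = fopen(path, "r");
    if (trace == NULL) {
        perror(path);
        return -1;
    }
    for (int cfg = 0; cfg < num_configs; cfg++) {
        rewind(trace);  /* back to the start; also clears the EOF flag */
        long lines = run_pass(trace);
        printf("config %d: read %ld lines\n", cfg, lines);
    }
    fclose(trace);
    return 0;
}
```

Because the file is read sequentially on every pass, nothing beyond the stdio buffer has to stay in memory, regardless of the size of the trace.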


Given your comments that you are required to read your data from stdin (and, I presume, cannot require stdin to be directly redirected from a file), you have little choice but to cache the data yourself. Since that data is more than 40GB, the cache had better be a disk file.

What I'd do is, on the first pass, open a temporary file for read/write and, as you read from a FILE* variable set equal to stdin, also write the data to your temporary file. At the end of the first pass, assign the temporary file's FILE* to your input FILE* variable.

Now for the remaining passes, you can start by rewinding your input (temporary) file and reading it for input.

You can use your loop counter to determine what you need to do each pass.

Here's an overview of this code:

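infp = stdin;
for (loop = 0; loop < NUM_LOOPS; loop++) {
    if (loop == 0) {
        // "w+" so the file can be read back on later passes
        tmpfp = fopen("tmpfile.tmp", "w+");
        // check for errors here
    }
    for (;;) {
        // fread/fwrite, since infp and tmpfp are FILE* streams
        num_read = fread(buf, 1, sizeof(buf), infp);
        if (num_read == 0)
            break;  // EOF, or a read error (check ferror(infp))
        if (loop == 0) {
            num_written = fwrite(buf, 1, num_read, tmpfp);
            // check for write errors here (num_written != num_read)
        }
        // Main input processing code
    }
    if (loop == 0) {
        infp = tmpfp;  // later passes read from the temporary file
    }
    rewind(infp);
}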

