
how to replace the tabs with empty space in each file of a directory

Question:

I would like to replace the tabs in each file of a directory with the <strong>corresponding</strong> number of spaces. I already found a solution (11094383) that replaces each tab with a <strong>given</strong> number of spaces:

> find ./ -type f -exec sed -i 's/\t/    /g' {} \;

In the solution above tabs are replaced with four spaces. But in my case tabs can occupy more spaces - e.g. 8.

An example of file with tabs, which should be replaced with 8 spaces is:

NSMl1 100 PSHELL 0.00260 400000 400200 400300 400400 400500 400600 400700 400800 400900 401000 401100 400100 430000 430200 430300 430400 430500 430600 430700 430800 430900 431000 431100 430100 401200 431200

Here the lines with tabs are the 3rd to the 5th.

An example of a file with tabs that should be replaced with 4 spaces is:

RBE2 1101001 5000511 123456 1100

Could anybody help?

Answer1:

The classic answer is to use the pr command with options to expand tabs into an appropriate number of spaces, turning off the pagination features:

pr -e8 -l1 -t …files…
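As a quick check of what -e8 does, here is a minimal sketch. The sample input is made up for illustration, and the variable name expanded is only a placeholder. The tab is expanded up to the next 8-column tab stop rather than to a fixed number of spaces:

```shell
# Hypothetical sample line: "a", a tab, then "b".
# -e8 expands tabs to 8-column tab stops; -t and -l1 turn off headers and paging.
expanded=$(printf 'a\tb\n' | pr -e8 -l1 -t)

# Brackets make the resulting spacing visible.
printf '[%s]\n' "$expanded"
```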

The tricky part is overwriting the original file, which seems to be part of the question. Of course, sed in its GNU and BSD (Mac OS X) incarnations supports overwriting with the -i option, though the behaviour differs between the two: BSD sed requires a suffix for the backup files and GNU sed does not. However, sed does not (readily) support converting tabs to an appropriate number of blanks, so it isn't wholly appropriate.

There's a script overwrite (which I abbreviate to ow) in <a href="http://rads.stackoverflow.com/amzn/click/013937681X" rel="nofollow">The UNIX Programming Environment</a> that can do that. I've been using the script since 1987 (first check-in; last updated in 2005).

    #!/bin/sh
    # Overwrite file
    # From: The UNIX Programming Environment by Kernighan and Pike
    # Amended: remove PATH setting; handle file names with blanks.

    case $# in
    0|1)
        echo "Usage: $0 file command [arguments]" 1>&2
        exit 1;;
    esac

    file="$1"
    shift
    new=${TMPDIR:-/tmp}/ovrwr.$$.1
    old=${TMPDIR:-/tmp}/ovrwr.$$.2
    trap "rm -f '$new' '$old' ; exit 1" 0 1 2 15

    if "$@" >"$new"
    then
        cp "$file" "$old"
        trap "" 1 2 15
        cp "$new" "$file"
        rm -f "$new" "$old"
        trap 0
        exit 0
    else
        echo "$0: $1 failed - $file unchanged" 1>&2
        rm -f "$new" "$old"
        trap 0
        exit 1
    fi

It would be possible and arguably better to use the mktemp command on most systems these days; it didn't exist way back then.
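A minimal sketch of what a mktemp-based variant might look like, written as a shell function so it can be tried interactively. The name ow_mktemp is hypothetical, the sketch assumes a mktemp that accepts a template ending in XXXXXX, and it leaves out the signal handling and the backup copy of the original script:

```shell
# Hypothetical mktemp-based variant of the overwrite idea (not the book's script).
# Usage: ow_mktemp file command [arguments]
ow_mktemp() {
    if [ "$#" -lt 2 ]; then
        echo "Usage: ow_mktemp file command [arguments]" 1>&2
        return 1
    fi
    file=$1
    shift
    # mktemp creates a uniquely named empty file and prints its name.
    new=$(mktemp "${TMPDIR:-/tmp}/ovrwr.XXXXXX") || return 1
    if "$@" >"$new"
    then
        # cp (rather than mv) writes into the existing file, keeping its permissions.
        cp "$new" "$file"
        status=$?
    else
        echo "ow_mktemp: $1 failed - $file unchanged" 1>&2
        status=1
    fi
    rm -f "$new"
    return "$status"
}
```

As with the original script, the command has to name its own input, for example: ow_mktemp data.txt pr -e8 -t -l1 data.txt (data.txt being a placeholder file name).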

In the context of the question, you could then use:

find . -type f -exec ow {} pr -e8 -t -l1 {} \;

You do need to process each file separately.

If you are truly determined to use sed for the job, then you have your work cut out. There's a gruesome way to do it. There is a notational problem: how to represent a literal tab. I will use \t to denote it. The script would be stored in a file, which I'll assume is script.sed:

    :again
    /^\(\([^\t]\{8\}\)*\)\t/s//\1        /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{1\}\)\t/s//\1\3       /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{2\}\)\t/s//\1\3      /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{3\}\)\t/s//\1\3     /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{4\}\)\t/s//\1\3    /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{5\}\)\t/s//\1\3   /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{6\}\)\t/s//\1\3  /
    /^\(\([^\t]\{8\}\)*\)\([^\t]\{7\}\)\t/s//\1\3 /
    t again

That's using the classic sed notation.

You can then write:

sed -f script.sed …data-files…
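One way around the literal-tab problem is to generate the script from the shell, so that a real tab character ends up in each rule. The following is a sketch under assumptions, not part of the original answer: the function name make_expand_script is hypothetical, and it emits plain s/regex/replacement/ commands with a uniform three-group pattern, which should have the same effect as the /regex/s//replacement/ form shown above.

```shell
# Hypothetical generator for the classic-notation script, with real tabs embedded.
make_expand_script() {
    tab=$(printf '\t')
    spaces='        '    # eight spaces: padding for a tab that sits on a tab stop
    printf '%s\n' ':again'
    n=0
    while [ "$n" -lt 8 ]; do
        # A tab that follows n non-tabs past a multiple of 8 needs 8-n spaces.
        printf '%s\n' "s/^\(\([^$tab]\{8\}\)*\)\([^$tab]\{$n\}\)$tab/\1\3$spaces/"
        spaces=${spaces# }    # drop one space for the next rule
        n=$((n + 1))
    done
    printf '%s\n' 't again'
}
```

It could be used as make_expand_script >script.sed, followed by the sed -f command above.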

If you have GNU sed or BSD (Mac OS X) sed, you can use the extended regular expressions instead:

    :again
    /^(([^\t]{8})*)\t/s//\1        /
    /^(([^\t]{8})*)([^\t]{1})\t/s//\1\3       /
    /^(([^\t]{8})*)([^\t]{2})\t/s//\1\3      /
    /^(([^\t]{8})*)([^\t]{3})\t/s//\1\3     /
    /^(([^\t]{8})*)([^\t]{4})\t/s//\1\3    /
    /^(([^\t]{8})*)([^\t]{5})\t/s//\1\3   /
    /^(([^\t]{8})*)([^\t]{6})\t/s//\1\3  /
    /^(([^\t]{8})*)([^\t]{7})\t/s//\1\3 /
    t again

and then run:

    sed -r -f script.sed …data-files…     # GNU sed
    sed -E -f script.sed …data-files…     # BSD sed

What do the scripts do?

The first line sets a label; the last line jumps to that label if any of the s/// operations in between made a substitution. So, for each line of the file, the script loops until there are no matches made, and hence no substitutions performed.
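The label-and-branch loop can be seen in isolation with a toy example (not from the original scripts; the input and the variable name looped are made up). Each pass replaces a single "a", and the t command jumps back to the label as long as the last pass changed something:

```shell
# Toy illustration of ":label ... t label": one substitution per pass,
# repeated until a pass makes no substitution.
looped=$(printf 'aaa\n' | sed -e ':again' -e 's/a/b/' -e 't again')
printf '%s\n' "$looped"
```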

The 8 substitutions deal with:

  • A block of zero or more sequences of 8 non-tabs, which is captured, followed by
  • a sequence of 0-7 more non-tabs, which is also captured, followed by
  • a tab.

Each substitution replaces that match with the captured material, followed by an appropriate number of spaces: 8 minus the length of the second captured sequence, so the text after the tab starts at the next multiple of 8 columns.

One curiosity found during the testing is that if a line ends with white space, the pr command removes that trailing white space.

There's also the expand command on some systems (BSD or Mac OS X at least), which preserves the trailing white space. Using that is simpler than pr or sed.
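A sketch of how expand might be applied to every file under a directory. The helper name expand_tree is hypothetical, the 8-column tab stop is taken from the question, and mktemp is assumed to be available. expand itself is specified by POSIX and also ships with GNU coreutils.

```shell
# Hypothetical helper: expand tabs to 8-column tab stops in every regular
# file under the given directory, keeping trailing white space.
expand_tree() {
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            tmp=$(mktemp) || exit 1
            # Only overwrite the original if expand succeeded.
            if expand -t 8 "$f" >"$tmp"; then
                # cat into the file so its permissions are kept.
                cat "$tmp" >"$f"
            fi
            rm -f "$tmp"
        done
    ' sh {} +
}
```

For the current directory this would be called as expand_tree . and, like the other approaches here, it rewrites the files in place, so trying it on a copy first is prudent.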

With these sed scripts, and using the BSD or GNU sed with backup files, you can write:

find . -type f -exec sed -i.bak -r -f script.sed {} +

(GNU sed notation; substitute -E for -r for BSD sed.)
