![Transpose using AWK [duplicate]](https://www.xszz.org/skin/wt/rpic/t3.jpg)
Question:
<blockquote>
<strong>Possible Duplicate:</strong><br /><a href="https://stackoverflow.com/questions/11659588/transpose-using-awk-or-perl" rel="nofollow">Transpose using AWK or Perl</a>
</blockquote>Hi I would like to use AWK to get an output file in the format below. My input file is a space separated text file. Any help with this would be appreciated. Thanks
Input File
id quantity colour shape size colour shape size colour shape size
1 10 blue square 10 red triangle 12 pink circle 20
2 12 yellow pentagon 3 orange rectangle 4 purple oval 6
Desired Output
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
Answer1:This is generalized so you can choose columns for output by name. I've tested it with additional columns in the input data and they don't get output regardless of their position.
#!/usr/bin/awk -f
BEGIN {
col_list = "colour shape size"
# Use a B ("blank") to add spaces in the output before or
# after a format string (e.g. %6dB), but generally use the numeric argument
col_fmt = "%-12s %-14s %5d"
# columns to be repeated on multiple lines may appear anywhere in
# the input, but they will be output together at the beginning of the line
repeat_fields["id"]
# since these are individually set we won't use B
repeat_fmt["id"] = "%4d "
# additional fields to repeat on each line
#repeat_fields["another"]
#repeat_fmt["another"] = "%8s"
ncols = split(col_list, cols)
split(col_fmt, fmts)
for (i = 1; i <= ncols; i++) {
col_names[cols[i]]
forms[cols[i]] = fmts[i]
}
}
# save the positions of the columns using the header line
FNR == 1 {
for (i = 1; i <= NF; i++) {
if ($i in repeat_fields) {
repeat[++nrepeats] = i
repeat_look[i] = i
rformats[i] = repeat_fmt[$i]
}
if ($i in col_names) {
col_nums[++n] = i
col_look[i] = i
formats[i] = forms[$i]
}
}
# print the header line
for (i = 1; i <= nrepeats; i++) {
f = rformats[repeat[i]]
sub("d", "s", f)
gsub("B", " ", f)
printf f, $repeat[i]
}
for (i = 1; i <= ncols; i++) {
f = formats[col_nums[i]]
sub("d", "s", f)
gsub("B", " ", f)
printf f, $col_nums[i]
}
printf "\n"
next
}
{
for (i = 1; i <= NF; i++) {
if (i in repeat_look) {
f = rformats[i]
gsub("B", " ", f)
repeat_out = repeat_out sprintf(f, $i)
}
if (i in col_look) {
f = formats[i]
gsub("B", " ", f)
out = out sprintf(f, $i)
coln++
}
if (coln == ncols) {
print repeat_out out
out = ""
coln = 0
}
}
repeat_out = ""
}
With this modified input data:
no id colour base shape size colour shape size colour shape size material
14 1 blue twenty square 10 red triangle 12 pink circle 20 wool
23 2 yellow ninety pentagon 3 orange rectangle 4 purple oval 6 cotton
the output is:
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
Answer2:see below:
kent$ cat a
id colour shape size colour shape size colour shape size
1 blue square 10 red triangle 12 pink circle 20
2 yellow pentagon 3 orange rectangle 4 purple oval 6
kent$ awk 'NR==1{print "id colour shape size";next;}
{id=$1; printf id;
for(i=2;i<=NF;i++){
printf FS$i; if((i-1)%3==0)printf (NF!=i)?"\n"id:"\n"; }}' a
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
if you have 'column', you can pipe the output to column to make it look better:
kent$ awk 'NR==1{print "id colour shape size";next;}
{id=$1; printf id;
for(i=2;i<=NF;i++){
printf FS$i; if((i-1)%3==0)printf (NF!=i)?"\n"id:"\n"; }}' a|column -t
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
Answer3:One way:
Content of script.awk
:
FNR == 1 {
for ( i = 1; i <= 4; i++ ) {
header = (header ? header "\t" : "") $i
}
printf "%s\n", header
next
}
FNR > 1 {
id = $1
for ( i = 2; i <= NF; i += 3 ) {
j = i + 2
for ( ; j >= i; j-- ) {
line = $j "\t" line
}
printf "%d\t%s\n", id, line
line = ""
}
}
Run it like (thanks to <em>Kent</em> for columnt -t
command):
awk -f script.awk infile | column -t
With following output:
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
Answer4:
sed -e '1! {s/\([0-9][0-9]*\)[ \t][ \t]*/\1\n/g;}' test.txt |awk -vOFS="\t" 'NR==1 {print $1,$2,$3,$4} NF==1 {id=$1} NR>1 && NF>1 {print id,$0}'
Tried to do it all in sed, but I am to green with the hold buffers and what not. Anyhow, insert a line break after each number:
id colour shape size colour shape size colour shape size
1
blue square 10
red triangle 12
pink circle 20
2
yellow pentagon 3
orange rectangle 4
purple oval 6
and use awk to shift down the id to the following rows:
id colour shape size
1 blue square 10
1 red triangle 12
1 pink circle 20
2 yellow pentagon 3
2 orange rectangle 4
2 purple oval 6
Should be doable completely in sed though