awk count selective combinations only:


Would like to read and count the field value == "TRUE" only from 3rd field to 5th field.


Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,FALSE,TRUE,ab1234 ab123,Name2,FALSE,FALSE,TRUE,ab1234 ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,ab1234 ab123,Name3,FALSE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,FALSE,ab1234

While reading the headers from 3rd field to 5th field , i,e A, B, C want to generate unique combinations and permutations like A,B,C,AB,AC,AB,ABC only. Note: AA, BB, CC, BA etc excluded

If the "TRUE" is considered for "AB" combination count then it should not be considered for "A" conut & "B" count again to avoid duplicate ..


Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234


Desc,A,B,C,AB,AC,BC,ABC Name1,,,,,,,1


Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,FALSE,ab1234


Desc,A,B,C,AB,AC,BC,ABC Name1,,,,1,,,


Locationx,Desc,A,B,C,Locationy ab123,Name1,FALSE,TRUE,FALSE,ab1234


Desc,A,B,C,AB,AC,BC,ABC Name1,,1,,,,,

Desired Output:

Desc,A,B,C,AB,AC,BC,ABC Name1,,,,1,,,2 Name2,,,1,,1,,1 Name3,1,,,2,,,

Actual File is like below :



Have tried lot , nothing is materialised , any suggestions please !!!

Edit: Desired Output from Actual Input:


Don't have Perl and Python access !!!


I have written a perl script that does this for you. As you can see from the size and comments, it is really simple to get this done.

#!/usr/bin/perl use strict; use warnings; use autodie; use Algorithm::Combinatorics qw(combinations); ## change the file to the path where your file exists open my $fh, '<', 'file'; my (%data, @new_labels); ## capture the header line in an array my @header = split /,/, <$fh>; ## backup the header my @fields = @header; ## remove first, second and last columns @header = splice @header, 2, -1; ## generate unique combinations for my $iter (1 .. +@header) { my $combination = combinations(\@header, $iter); while (my $pair = $combination->next) { push @new_labels, "@$pair"; } } ## iterate through rest of the file while(my $line = <$fh>) { my @line = split /,/, $line; ## identify combined labels that are true my @is_true = map { $fields[$_] } grep { $line[$_] eq "TRUE" } 0 .. $#line; ## increment counter in hash map keyed at description and then new labels ++$data{$line[1]}{$_} for map { s/ /-/g; $_ } "@is_true"; } ## print the new header print join ( ",", "Desc", map {s/ /-/g; $_} reverse @new_labels ) . "\n"; ## print the description and counter values for my $desc (sort keys %data){ print join ( ",", $desc, ( map { $data{$desc}{$_} //= "" } reverse @new_labels ) ) . "\n"; } <h3>Output:</h3> Desc,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DEBIT-DATA,INCOMING-SMS-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-SMS-DEBIT-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT,SMS-RECHARGE-DEBIT-DATA,OUTGOING-RECHARGE-DEBIT-DATA,OUTGOING-SMS-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DATA,OUTGOING-SMS-RECHARGE-DEBIT,INCOMING-RECHARGE-DEBIT-DATA,INCOMING-SMS-DEBIT-DATA,INCOMING-SMS-RECHARGE-DATA,INCOMING-SMS-RECHARGE-DEBIT,INCOMING-OUTGOING-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT,INCOMING-OUTGOING-SMS-DATA,INCOMING-OUTGOING-SMS-DEBIT,INCOMING-OUTGOING-SMS-RECHARGE,RECHARGE-DEBIT-DATA,SMS-DEBIT-DATA,SMS-RECHARGE-DATA,SMS-RECHARGE-DEBIT,OUTGOING-DEBIT-DATA,OUTGOING-RECHARGE-DATA,OUTGOING-RECHARGE-DEBIT,OUTGOING-SMS-DATA,OUTGOING-SMS-DEBIT,OUTGOING-SMS-RECHARGE,INCOMING-DEBIT-DATA,INCOMING-RECHARGE-DATA,INCOMING-RECHARGE-DEBIT,INCOMING-SMS-DATA,INCOMING-SMS-DEBIT,INCOMING-SMS-RECHARGE,INCOMING-OUTGOING-DATA,INCOMING-OUTGOING-DEBIT,INCOMING-OUTGOING-RECHARGE,INCOMING-OUTGOING-SMS,DEBIT-DATA,RECHARGE-DATA,RECHARGE-DEBIT,SMS-DATA,SMS-DEBIT,SMS-RECHARGE,OUTGOING-DATA,OUTGOING-DEBIT,OUTGOING-RECHARGE,OUTGOING-SMS,INCOMING-DATA,INCOMING-DEBIT,INCOMING-RECHARGE,INCOMING-SMS,INCOMING-OUTGOING,DATA,DEBIT,RECHARGE,SMS,OUTGOING,INCOMING Name1,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,1,,,,,,,,,,,,,,,,,,,,, Name2,,,,1,,1,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Name3,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,1,,,

<strong>Note:</strong> Please revisit your expected output. It has few mistakes in it as you can see from the output generated from the script above.


Here is an attempt at solving this using awk:

<h3>Content of script.awk</h3> BEGIN { FS = OFS = "," } function combinations(flds, itr, i, pre) { for (i=++cnt; i<=numRecs; i++) { ++n sep = "" for (pre=1; pre<=itr; pre++) { newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[pre])); sep = "-" } newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[i])) ; } } NR==1 { for (fld=3; fld<NF; fld++) { recs[++numRecs] = $fld } for (iter=0; iter<numRecs; iter++) { combinations(recs, iter) } next } !seen[$2]++ { desc[++d] = $2 } { y = 0; var = sep = "" for (idx=3; idx<NF; idx++) { if ($idx == "TRUE") { is_true[++y] = recs[idx-2] } } for (z=1; z<=y; z++) { var = var sep sprintf ("%s", is_true[z]) sep = "-" } data[$2,var]++; } END{ printf "%s," , "Desc" for (k=1; k<=n; k++) { printf "%s%s", newRecs[k],(k==n?RS:FS) } for (name=1; name<=d; name++) { printf "%s,", desc[name]; for (nR=1; nR<=n; nR++) { printf "%s%s", (data[desc[name],newRecs[nR]]?data[desc[name],newRecs[nR]]:""), (nR==n?RS:FS) } } } <h3>Sample file</h3> Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,FALSE,TRUE,ab1234 ab123,Name2,FALSE,FALSE,TRUE,ab1234 ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,ab1234 ab123,Name3,FALSE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,FALSE,ab1234 <h3>Execution:</h3> $ awk -f script.awk file Desc,A,B,C,A-B,A-C,A-B-C Name1,,,,1,,2 Name2,,,1,,1,1 Name3,1,,,2,,

Now, there is pretty evident bug in the combination function. It does not recurse to print all combinations. For eg: for A B C D it will print


but not BC


  • how to convert PEM to P7B with python
  • How do I get the number of jobs in a rq queue?
  • Reading from Windows registry in Perl [duplicate]
  • Python equivalent of Scala's exists() function?
  • Python 2.7 on OS X: TypeError: 'frozenset' object is not callable on each command
  • Perl keyword say is not working in version 5.14.4
  • How can I stop my python script when another python script is running?
  • Extract data between rows r
  • Plotting A Hyperboloid
  • Binary Tree Traversal Sum Of Each Depth
  • replacing while loop with list comprehension
  • Best win32 compiled scripting language? [closed]
  • help('modules') crashing? Not sure how to fix
  • OpenCV Python: Draw minAreaRect ( RotatedRect not implemented)
  • Regex for Specific Tag
  • pymongo replication secondary readreference not work
  • How can we prepend rows to a react native list-view?
  • import scipy.sparse failed
  • perl, mysql - fasting way to upload a csv file into mysql?
  • Sending keystrokes/mouse clicks to a Java program with Autohotkey
  • During installation of Django, why do I keep getting ImportError: No module named django?
  • where do I find the xml.dom python package for the python-2.6.0-8.9.28 and I have a suse/x86_64 vers
  • Python pickle not one-to-one: different pickles give same object
  • Get specific string
  • What's the purpose of QString?
  • Insert new calendar with SyncAdapter- Calendar API Android
  • Ensure fsync did its job
  • Is there a perl module to validate passwords stored in “{crypt}hashedpassword” “{ssha}hashedpassword
  • Jackson Parser: ignore deserializing for type mismatch
  • How to match http request and response using Jersey ContainerRequestFilter and ContainerResponseFilt
  • Different response to non-authenticated users and AJAX calls
  • Fetching methods from BroadcastReceiver to update UI
  • Knitr HTML Loop - Some HTML output, some R output
  • Symfony2: How to get request parameter
  • GridView Sorting works once only
  • Windows forms listbox.selecteditem displaying “System.Data.DataRowView” instead of actual value
  • Unit Testing MVC Web Application in Visual Studio and Problem with QTAgent
  • WPF Applying a trigger on binding failure
  • Benchmarking RAM performance - UWP and C#
  • How can i traverse a binary tree from right to left in java?