48170

awk count selective combinations only:

Question:

Would like to read and count the field value == "TRUE" only from 3rd field to 5th field.

Input.txt

Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,FALSE,TRUE,ab1234 ab123,Name2,FALSE,FALSE,TRUE,ab1234 ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,ab1234 ab123,Name3,FALSE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,FALSE,ab1234

While reading the headers from 3rd field to 5th field , i,e A, B, C want to generate unique combinations and permutations like A,B,C,AB,AC,AB,ABC only. Note: AA, BB, CC, BA etc excluded

If the "TRUE" is considered for "AB" combination count then it should not be considered for "A" conut & "B" count again to avoid duplicate ..

Example#1

Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234

Op#1

Desc,A,B,C,AB,AC,BC,ABC Name1,,,,,,,1

Example#2

Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,FALSE,ab1234

Op#2

Desc,A,B,C,AB,AC,BC,ABC Name1,,,,1,,,

Example#3

Locationx,Desc,A,B,C,Locationy ab123,Name1,FALSE,TRUE,FALSE,ab1234

Op#3

Desc,A,B,C,AB,AC,BC,ABC Name1,,1,,,,,

Desired Output:

Desc,A,B,C,AB,AC,BC,ABC Name1,,,,1,,,2 Name2,,,1,,1,,1 Name3,1,,,2,,,

Actual File is like below :

Input.txt

Locationx,Desc,INCOMING,OUTGOING,SMS,RECHARGE,DEBIT,DATA,Locationy ab123,Name1,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,ab1234 ab123,Name2,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,ab1234 ab123,Name3,FALSE,FALSE,FALSE,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,FALSE,FALSE,FALSE,TRUE,ab1234

Have tried lot , nothing is materialised , any suggestions please !!!

Edit: Desired Output from Actual Input:

Desc,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DEBIT-DATA,INCOMING-SMS-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT,SMS-RECHARGE-DEBIT-DATA,OUTGOING-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DATA,OUTGOING-SMS-RECHARGE-DEBIT,INCOMING-RECHARGE-DEBIT-DATA,INCOMING-SMS-DEBIT-DATA,INCOMING-SMS-RECHARGE-DATA,INCOMING-SMS-RECHARGE-DEBIT,INCOMING-OUTGOING-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT,INCOMING-OUTGOING-SMS-DATA,INCOMING-OUTGOING-SMS-DEBIT,INCOMING-OUTGOING-SMS-RECHARGE,RECHARGE-DEBIT-DATA,SMS-DEBIT-DATA,SMS-RECHARGE-DATA,SMS-RECHARGE-DEBIT,OUTGOING-RECHARGE-DATA,OUTGOING-RECHARGE-DEBIT,OUTGOING-SMS-DATA,OUTGOING-SMS-DEBIT,OUTGOING-SMS-RECHARGE,INCOMING-DEBIT-DATA,INCOMING-RECHARGE-DATA,INCOMING-RECHARGE-DEBIT,INCOMING-SMS-DATA,INCOMING-SMS-DEBIT,INCOMING-SMS-RECHARGE,INCOMING-OUTGOING-DATA,INCOMING-OUTGOING-DEBIT,INCOMING-OUTGOING-RECHARGE,INCOMING-OUTGOING-SMS,DEBIT-DATA,RECHARGE-DATA,RECHARGE-DEBIT,SMS-DATA,SMS-DEBIT,SMS-RECHARGE,OUTGOING-DATA,OUTGOING-DEBIT,OUTGOING-RECHARGE,OUTGOING-SMS,INCOMING-DATA,INCOMING-DEBIT,INCOMING-RECHARGE,INCOMING-SMS,INCOMING-OUTGOING,DATA,DEBIT,RECHARGE,SMS,OUTGOING,INCOMING Name1,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,1,,,,,,,,,,,,,,,,,,,,, Name2,,,,1,1,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Name3,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,1,,,

Don't have Perl and Python access !!!

Answer1:

I have written a perl script that does this for you. As you can see from the size and comments, it is really simple to get this done.

#!/usr/bin/perl use strict; use warnings; use autodie; use Algorithm::Combinatorics qw(combinations); ## change the file to the path where your file exists open my $fh, '<', 'file'; my (%data, @new_labels); ## capture the header line in an array my @header = split /,/, <$fh>; ## backup the header my @fields = @header; ## remove first, second and last columns @header = splice @header, 2, -1; ## generate unique combinations for my $iter (1 .. +@header) { my $combination = combinations(\@header, $iter); while (my $pair = $combination->next) { push @new_labels, "@$pair"; } } ## iterate through rest of the file while(my $line = <$fh>) { my @line = split /,/, $line; ## identify combined labels that are true my @is_true = map { $fields[$_] } grep { $line[$_] eq "TRUE" } 0 .. $#line; ## increment counter in hash map keyed at description and then new labels ++$data{$line[1]}{$_} for map { s/ /-/g; $_ } "@is_true"; } ## print the new header print join ( ",", "Desc", map {s/ /-/g; $_} reverse @new_labels ) . "\n"; ## print the description and counter values for my $desc (sort keys %data){ print join ( ",", $desc, ( map { $data{$desc}{$_} //= "" } reverse @new_labels ) ) . "\n"; } <h3>Output:</h3> Desc,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DEBIT-DATA,INCOMING-SMS-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT-DATA,INCOMING-OUTGOING-SMS-DEBIT-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DATA,INCOMING-OUTGOING-SMS-RECHARGE-DEBIT,SMS-RECHARGE-DEBIT-DATA,OUTGOING-RECHARGE-DEBIT-DATA,OUTGOING-SMS-DEBIT-DATA,OUTGOING-SMS-RECHARGE-DATA,OUTGOING-SMS-RECHARGE-DEBIT,INCOMING-RECHARGE-DEBIT-DATA,INCOMING-SMS-DEBIT-DATA,INCOMING-SMS-RECHARGE-DATA,INCOMING-SMS-RECHARGE-DEBIT,INCOMING-OUTGOING-DEBIT-DATA,INCOMING-OUTGOING-RECHARGE-DATA,INCOMING-OUTGOING-RECHARGE-DEBIT,INCOMING-OUTGOING-SMS-DATA,INCOMING-OUTGOING-SMS-DEBIT,INCOMING-OUTGOING-SMS-RECHARGE,RECHARGE-DEBIT-DATA,SMS-DEBIT-DATA,SMS-RECHARGE-DATA,SMS-RECHARGE-DEBIT,OUTGOING-DEBIT-DATA,OUTGOING-RECHARGE-DATA,OUTGOING-RECHARGE-DEBIT,OUTGOING-SMS-DATA,OUTGOING-SMS-DEBIT,OUTGOING-SMS-RECHARGE,INCOMING-DEBIT-DATA,INCOMING-RECHARGE-DATA,INCOMING-RECHARGE-DEBIT,INCOMING-SMS-DATA,INCOMING-SMS-DEBIT,INCOMING-SMS-RECHARGE,INCOMING-OUTGOING-DATA,INCOMING-OUTGOING-DEBIT,INCOMING-OUTGOING-RECHARGE,INCOMING-OUTGOING-SMS,DEBIT-DATA,RECHARGE-DATA,RECHARGE-DEBIT,SMS-DATA,SMS-DEBIT,SMS-RECHARGE,OUTGOING-DATA,OUTGOING-DEBIT,OUTGOING-RECHARGE,OUTGOING-SMS,INCOMING-DATA,INCOMING-DEBIT,INCOMING-RECHARGE,INCOMING-SMS,INCOMING-OUTGOING,DATA,DEBIT,RECHARGE,SMS,OUTGOING,INCOMING Name1,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,,,1,,,,,,,,,,,,,,,,,,,,, Name2,,,,1,,1,,,,,,,,,,,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, Name3,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,2,,,,,,,,,,,,,,,,,,,1,,,

<strong>Note:</strong> Please revisit your expected output. It has few mistakes in it as you can see from the output generated from the script above.

Answer2:

Here is an attempt at solving this using awk:

<h3>Content of script.awk</h3> BEGIN { FS = OFS = "," } function combinations(flds, itr, i, pre) { for (i=++cnt; i<=numRecs; i++) { ++n sep = "" for (pre=1; pre<=itr; pre++) { newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[pre])); sep = "-" } newRecs[n] = newRecs[n] sep (sprintf ("%s", flds[i])) ; } } NR==1 { for (fld=3; fld<NF; fld++) { recs[++numRecs] = $fld } for (iter=0; iter<numRecs; iter++) { combinations(recs, iter) } next } !seen[$2]++ { desc[++d] = $2 } { y = 0; var = sep = "" for (idx=3; idx<NF; idx++) { if ($idx == "TRUE") { is_true[++y] = recs[idx-2] } } for (z=1; z<=y; z++) { var = var sep sprintf ("%s", is_true[z]) sep = "-" } data[$2,var]++; } END{ printf "%s," , "Desc" for (k=1; k<=n; k++) { printf "%s%s", newRecs[k],(k==n?RS:FS) } for (name=1; name<=d; name++) { printf "%s,", desc[name]; for (nR=1; nR<=n; nR++) { printf "%s%s", (data[desc[name],newRecs[nR]]?data[desc[name],newRecs[nR]]:""), (nR==n?RS:FS) } } } <h3>Sample file</h3> Locationx,Desc,A,B,C,Locationy ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,FALSE,TRUE,ab1234 ab123,Name2,FALSE,FALSE,TRUE,ab1234 ab123,Name1,TRUE,TRUE,TRUE,ab1234 ab123,Name2,TRUE,TRUE,TRUE,ab1234 ab123,Name3,FALSE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,FALSE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name3,TRUE,TRUE,FALSE,ab1234 ab123,Name1,TRUE,TRUE,FALSE,ab1234 <h3>Execution:</h3> $ awk -f script.awk file Desc,A,B,C,A-B,A-C,A-B-C Name1,,,,1,,2 Name2,,,1,,1,1 Name3,1,,,2,,

Now, there is pretty evident bug in the combination function. It does not recurse to print all combinations. For eg: for A B C D it will print

A B C AB AC ABC

but not BC

Recommend

  • how to convert PEM to P7B with python
  • How do I get the number of jobs in a rq queue?
  • Reading from Windows registry in Perl [duplicate]
  • Python equivalent of Scala's exists() function?
  • Python 2.7 on OS X: TypeError: 'frozenset' object is not callable on each command
  • Perl keyword say is not working in version 5.14.4
  • How can I stop my python script when another python script is running?
  • Extract data between rows r
  • Plotting A Hyperboloid
  • Binary Tree Traversal Sum Of Each Depth
  • replacing while loop with list comprehension
  • Best win32 compiled scripting language? [closed]
  • help('modules') crashing? Not sure how to fix
  • OpenCV Python: Draw minAreaRect ( RotatedRect not implemented)
  • Regex for Specific Tag
  • pymongo replication secondary readreference not work
  • How can we prepend rows to a react native list-view?
  • import scipy.sparse failed
  • perl, mysql - fasting way to upload a csv file into mysql?
  • Sending keystrokes/mouse clicks to a Java program with Autohotkey
  • During installation of Django, why do I keep getting ImportError: No module named django?
  • where do I find the xml.dom python package for the python-2.6.0-8.9.28 and I have a suse/x86_64 vers
  • Python pickle not one-to-one: different pickles give same object
  • Get specific string
  • What's the purpose of QString?
  • Insert new calendar with SyncAdapter- Calendar API Android
  • Ensure fsync did its job
  • Is there a perl module to validate passwords stored in “{crypt}hashedpassword” “{ssha}hashedpassword
  • Jackson Parser: ignore deserializing for type mismatch
  • How to match http request and response using Jersey ContainerRequestFilter and ContainerResponseFilt
  • Different response to non-authenticated users and AJAX calls
  • Fetching methods from BroadcastReceiver to update UI
  • Knitr HTML Loop - Some HTML output, some R output
  • Symfony2: How to get request parameter
  • GridView Sorting works once only
  • Windows forms listbox.selecteditem displaying “System.Data.DataRowView” instead of actual value
  • Unit Testing MVC Web Application in Visual Studio and Problem with QTAgent
  • WPF Applying a trigger on binding failure
  • Benchmarking RAM performance - UWP and C#
  • How can i traverse a binary tree from right to left in java?