awk Lookup 2 files, print match and Sum of Second Field:


I need to compare two files, f1.txt and f2.txt, and report both matches and non-matches. Specifically, I want to check whether the second field of f1.txt matches the first field of f2.txt. If it does, print the entire line of f1.txt, followed by the first field of f2.txt and the sum of the second field of f2.txt over all lines with that key. If no match is found for a line of f1.txt, print "NotFound" instead.


f1.txt:

aa,10,cc,Jan-13
bb,20,cc,Feb-13
dd,50,cc,Mar-13


f2.txt:

10,1500,ss
20,500,gg
10,2000,kk
10,15000,yy
20,500,zz,
35,250,tt


Desired output:

aa,10,cc,Jan-13,10,18500
bb,20,cc,Feb-13,20,1000
dd,50,cc,Mar-13,NotFound,NotFound


This awk should do:

awk -F, 'FNR==NR {a[$1]+=$2;next} {if ($2 in a) print $0,$2,a[$2]; else print $0,"NotFound","NotFound"}' OFS=, f2.txt f1.txt

aa,10,cc,Jan-13,10,18500
bb,20,cc,Feb-13,20,1000
dd,50,cc,Mar-13,NotFound,NotFound

How does it work:

awk -F, ' # Set field separator to ,
FNR==NR {a[$1]+=$2;next} # Read data from file f2.txt using field #1 as index and sum field #2 into array a
{if ($2 in a) # Test if field #2 in f1.txt is found in a
    print $0,$2,a[$2] # If found, print line of f1.txt with sum and index from array
else
    print $0,"NotFound","NotFound" # If not found, print line of f1.txt with NotFound
}
' OFS=, f2.txt f1.txt # Set output field separator to , and read files
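To see what the first block builds before f1.txt is read at all, the array a can be dumped on its own. This is only a sketch: the scratch directory and the variable names workdir and sums are made up for the illustration, and the sample rows are copied from the question.

```shell
# Hypothetical scratch directory so the sample file does not clobber real data.
workdir=$(mktemp -d)
printf '%s\n' '10,1500,ss' '20,500,gg' '10,2000,kk' '10,15000,yy' '20,500,zz,' '35,250,tt' > "$workdir/f2.txt"

# Sum field 2 per distinct value of field 1, then print every key with its total.
# The order of "for (k in a)" is unspecified in awk, so the result is piped through sort.
sums=$(awk -F, '{a[$1]+=$2} END {for (k in a) print k "," a[k]}' "$workdir/f2.txt" | sort)
printf '%s\n' "$sums"

rm -rf "$workdir"
```

With the sample data this should print 10,18500, then 20,1000, then 35,250. Key 35 gets a total as well, but it never shows up in the final result because no line of f1.txt has 35 in its second field.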

A slightly shorter version:

awk -F, 'FNR==NR {a[$1]+=$2;next} {print $0 ","($2 in a?$2","a[$2]:"NotFound,NotFound")}' f2.txt f1.txt
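A quick way to check that both versions behave the same is to run them against the sample data and compare the results. The following is a sketch under assumptions: the scratch directory and the variable names workdir, long and short are invented for the check, and the two awk programs are copied unchanged from the answer.

```shell
# Hypothetical scratch directory holding copies of the two sample files.
workdir=$(mktemp -d)
printf '%s\n' 'aa,10,cc,Jan-13' 'bb,20,cc,Feb-13' 'dd,50,cc,Mar-13' > "$workdir/f1.txt"
printf '%s\n' '10,1500,ss' '20,500,gg' '10,2000,kk' '10,15000,yy' '20,500,zz,' '35,250,tt' > "$workdir/f2.txt"

# Version with if/else; OFS=, makes print join its arguments with commas.
long=$(awk -F, 'FNR==NR {a[$1]+=$2;next} {if ($2 in a) print $0,$2,a[$2]; else print $0,"NotFound","NotFound"}' OFS=, "$workdir/f2.txt" "$workdir/f1.txt")

# Shorter version; it concatenates the commas itself, so OFS is not needed.
short=$(awk -F, 'FNR==NR {a[$1]+=$2;next} {print $0 ","($2 in a?$2","a[$2]:"NotFound,NotFound")}' "$workdir/f2.txt" "$workdir/f1.txt")

printf '%s\n' "$long"
if [ "$long" = "$short" ]; then echo "both versions agree"; fi

rm -rf "$workdir"
```

In both commands f2.txt has to come first on the command line, because FNR==NR is only true while awk is reading the first file.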

