Compare two unsorted files and print unique elements from each file

Question:

File1:

1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
17 18 19 20

File2:

9 10 11 12
21 22 23 24
1 2 3 4
17 18 19 20

I'm new to Unix and I'm trying to obtain the unique rows from each file and output them to a new file, without printing the duplicates. The files are unsorted.

Answer1:

You want sort -n and uniq -u:

$ sort -n file1 file2 | uniq -u
5 6 7 8
13 14 15 16
21 22 23 24

# Redirect to file3
$ sort -n file1 file2 | uniq -u > file3
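The sort step matters because uniq only compares adjacent lines. A minimal sketch of that, assuming the question's sample rows are recreated in a scratch directory (the directory and the variable names are placeholders for illustration, and mktemp -d is assumed to be available):

```shell
# Recreate the question's sample files in a scratch directory (hypothetical paths).
dir=$(mktemp -d)
printf '%s\n' '1 2 3 4' '5 6 7 8' '9 10 11 12' '13 14 15 16' '17 18 19 20' > "$dir/file1"
printf '%s\n' '9 10 11 12' '21 22 23 24' '1 2 3 4' '17 18 19 20' > "$dir/file2"

# Without sorting, no two identical rows are adjacent, so uniq -u filters nothing.
unsorted=$(cat "$dir/file1" "$dir/file2" | uniq -u)

# After sorting, identical rows sit next to each other and uniq -u drops them.
sorted=$(sort -n "$dir/file1" "$dir/file2" | uniq -u)

printf '%s\n' "$sorted"
```

With this data the unsorted pipeline passes all nine input rows through, while the sorted one leaves only the three rows that occur once.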

<strong>Edit:</strong>

$ awk '{u[$0]++}END{for(k in u)if(u[k]==1)print k}' file1 file2
5 6 7 8
21 22 23 24
13 14 15 16

Here u is the name of an associative array; you could name it anything <em>(I chose u, short for unique)</em>. The keys <em>(k)</em> in the array are the lines of the files, so every time a line is seen again its count is incremented. After the array is built, we loop through it and print a key only if it was seen exactly once. This code should help clear it up:

$ awk '{uniq[$0]++}END{for (key in uniq)print uniq[key]": "key}' file1 file2
2: 9 10 11 12
1: 5 6 7 8
1: 21 22 23 24
1: 13 14 15 16
2: 17 18 19 20
2: 1 2 3 4
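The order in which awk's for (key in array) loop visits keys is unspecified, which is why the rows above come out in neither file order nor numeric order. If a stable order matters, one option is to sort the result afterwards. A minimal sketch, assuming the same sample rows recreated in a scratch directory (the paths and the names count, row and once are illustrative, and mktemp -d is assumed to be available):

```shell
# Recreate the question's sample files in a scratch directory (hypothetical paths).
dir=$(mktemp -d)
printf '%s\n' '1 2 3 4' '5 6 7 8' '9 10 11 12' '13 14 15 16' '17 18 19 20' > "$dir/file1"
printf '%s\n' '9 10 11 12' '21 22 23 24' '1 2 3 4' '17 18 19 20' > "$dir/file2"

# Count every row across both files, print the rows seen exactly once,
# then sort numerically because awk's for-in order is unspecified.
once=$(awk '{ count[$0]++ } END { for (row in count) if (count[row] == 1) print row }' \
    "$dir/file1" "$dir/file2" | sort -n)

# Write the result to a third file, as the question asks.
printf '%s\n' "$once" > "$dir/file3"
cat "$dir/file3"
```

Note that a row repeated inside a single file is also counted more than once here, so it would be dropped as well.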

Answer2:

Assuming you want a set of unique rows from <em>both files as a whole</em>:

sort -u File1 File2 > File3

UPD: -u might be specific to GNU coreutils. If your sort doesn't support it, see the answer from @sudo_O.

UPD2: it turned out that @sudo_O interpreted the question differently: I assumed that duplicate lines should be included once, while he assumed that they should be removed. If I'm right, then sort | uniq is the alternative for a non-GNU sort. Otherwise, sort | uniq -u is the best solution so far.
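The two readings are easiest to compare side by side. A small sketch, assuming the question's sample rows in a scratch directory (the paths and variable names are illustrative, and mktemp -d is assumed to be available):

```shell
# Recreate the question's sample files in a scratch directory (hypothetical paths).
dir=$(mktemp -d)
printf '%s\n' '1 2 3 4' '5 6 7 8' '9 10 11 12' '13 14 15 16' '17 18 19 20' > "$dir/file1"
printf '%s\n' '9 10 11 12' '21 22 23 24' '1 2 3 4' '17 18 19 20' > "$dir/file2"

# Reading 1: every distinct row appears once (duplicates collapsed).
collapsed_u=$(sort -u "$dir/file1" "$dir/file2")
collapsed_pipe=$(sort "$dir/file1" "$dir/file2" | uniq)

# Reading 2: only rows that occur exactly once (duplicated rows removed entirely).
only_once=$(sort "$dir/file1" "$dir/file2" | uniq -u)

printf 'collapsed: %s rows\n' "$(printf '%s\n' "$collapsed_u" | wc -l | tr -d ' ')"
printf 'only once: %s rows\n' "$(printf '%s\n' "$only_once" | wc -l | tr -d ' ')"
```

For this data the first reading keeps six rows and the second keeps three, and sort -u gives the same rows as sort | uniq.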
