Howto - Linux Delete Common Lines From Two Files

October 22, 2012

Question: How can I delete lines containing matching text from two files?

Answer:

Suppose you have two files, file1 and file2, with the following contents.

# cat file1
test@domain.com
test2@domain.com
dom@domain.com
test3@domain.com

# cat file2
test2@domain.com
test3@domain.com

The goal is to produce a file containing only the lines that are unique to file1; any line that also appears in file2 should be removed.

So, the resulting file should be as follows.

# cat file3
test@domain.com
dom@domain.com

This can be done with the Linux command “comm”. The basic syntax of this command is as follows.

comm [-1] [-2] [-3] file1 file2

-1 Suppress the output column of lines unique to file1.
-2 Suppress the output column of lines unique to file2.
-3 Suppress the output column of lines common to file1 and file2.
file1 Name of the first file to compare.
file2 Name of the second file to compare.
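
To see how the three options interact, the following sketch runs comm on the sample data above. It assumes a POSIX-like shell such as bash; the scratch directory from mktemp, the .sorted file names, and the LC_ALL=C setting are illustrative choices, not part of the original steps.

```shell
#!/usr/bin/env bash
# Sketch: show comm's three output columns on the article's sample data.
set -eu
export LC_ALL=C   # assumption: force one collation so sort and comm agree

workdir=$(mktemp -d)   # scratch directory (illustrative, not from the article)

# comm needs sorted input, so the sample data is sorted as it is written.
printf '%s\n' test@domain.com test2@domain.com dom@domain.com test3@domain.com \
  | sort > "$workdir/file1.sorted"
printf '%s\n' test2@domain.com test3@domain.com \
  | sort > "$workdir/file2.sorted"

# No options: column 1 = only in file1, column 2 = only in file2,
# column 3 = in both. Columns 2 and 3 are indented with leading tabs.
comm "$workdir/file1.sorted" "$workdir/file2.sorted"

# -23 hides columns 2 and 3, leaving the lines unique to file1.
only_file1=$(comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")
# -12 hides columns 1 and 2, leaving the lines common to both files.
in_both=$(comm -12 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'only in file1:\n%s\n' "$only_file1"
printf 'in both:\n%s\n' "$in_both"
```

Each digit in the option names a column to hide, so the columns that remain are the ones not listed.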

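“comm” expects both input files to be sorted, so we need to sort them first. To get the lines unique to file1, we can combine the “comm” and “sort” commands as follows. The <(...) syntax is bash process substitution, which passes the output of each “sort” to “comm” as if it were a file.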

# comm -23 <(sort file1) <(sort file2) > file3

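The above command creates file3 containing the two lines that are unique to file1. Because “comm” reads the sorted input, the lines appear in sorted order (dom@domain.com first) rather than in their original order in file1.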
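
The whole procedure can be tried safely with a sketch like the one below. It assumes bash (process substitution is not available in plain POSIX sh); the mktemp scratch directory and the LC_ALL=C setting are additions for illustration so that no existing files are overwritten and sort and comm use the same collation.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the article's example end to end (needs bash for <(...)).
set -eu
export LC_ALL=C   # assumption: same collation for sort and comm

workdir=$(mktemp -d)   # scratch directory so no real files are touched

# Recreate the two sample files from the article.
printf '%s\n' test@domain.com test2@domain.com dom@domain.com test3@domain.com \
  > "$workdir/file1"
printf '%s\n' test2@domain.com test3@domain.com > "$workdir/file2"

# <(sort FILE) runs sort and hands its output to comm as if it were a file.
comm -23 <(sort "$workdir/file1") <(sort "$workdir/file2") > "$workdir/file3"

# Prints dom@domain.com, then test@domain.com.
cat "$workdir/file3"
```

Note that each <(...) must contain a command, such as sort file1, and that there is no space between < and (. A bare file name inside the parentheses is run as a command, and a space after < is a syntax error in bash.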

Comments (3)

  1. Tahir says:

    Suppose you wanted to update file1 instead of creating a new file3.
    Would this work for that case?
    # comm -23 <(sort file1) file1

  2. Ashish Karpe says:

    # comm -23 <(autodeploy-AMZ-PRODUCTION.txt) autodeploy-servers-toadd
    -bash: autodeploy-AMZ-PRODUCTION.txt: command not found

    • Ashish Karpe says:

      The comm command is there, but I can't figure out the exact syntax of the command:
      # comm -23 < (autodeploy-AMZ-PRODUCTION.txt) autodeploy-servers-toadd
      -bash: syntax error near unexpected token `('
