How to Delete Common Lines from Two Files in Linux

How can I remove the lines of one file that also appear in another file in Linux? Let's discuss how to perform this task using the Linux comm and sort commands.

Suppose you have two files, say file1 and file2, with the following contents:

$ cat file1
$ cat file2

The goal is to get a file containing only the lines that are unique to file1 (lines that also appear in file2 should be removed from file1).

So, the resulting file should be as follows.

$ cat file3

This can be done with the Linux command “comm”. The basic syntax of this command is as follows:

comm [-1] [-2] [-3] file1 file2
-1 Suppress the output column of lines unique to file1.
-2 Suppress the output column of lines unique to file2.
-3 Suppress the output column of lines common to both file1 and file2.
file1 Name of the first file to compare.
file2 Name of the second file to compare.
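
To see how the three output columns behave, here is a small sketch. The file names a.txt and b.txt and their contents are made up for illustration (they are not the file1 and file2 from this article), and both files are already sorted.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical, already-sorted sample data.
printf '%s\n' apple banana cherry > a.txt
printf '%s\n' banana cherry date > b.txt

# No options: three tab-indented columns
# (only in a.txt, only in b.txt, in both files).
comm a.txt b.txt > all_columns.txt

# Suppress columns 2 and 3: keep lines found only in a.txt.
comm -23 a.txt b.txt > only_a.txt

# Suppress columns 1 and 3: keep lines found only in b.txt.
comm -13 a.txt b.txt > only_b.txt

# Suppress columns 1 and 2: keep lines found in both files.
comm -12 a.txt b.txt > common.txt

cat only_a.txt
```

With this sample data, only_a.txt holds apple, only_b.txt holds date, and common.txt holds banana and cherry.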

Because “comm” expects its input files to be sorted, we need to sort both files before comparing them. So, in order to get the lines unique to file1, we can use a combination of the “comm” and “sort” commands as follows.

$ comm -23 <(sort file1) <(sort file2) > file3

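The above command creates file3 containing only the lines of file1 that do not appear in file2: -2 drops the lines unique to file2, and -3 drops the lines common to both files. The <(...) syntax is bash process substitution, which passes the sorted output to comm without creating temporary files. You can read the comm man page for more details.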
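
As an illustration of the command above, the following sketch runs it on made-up, unsorted sample data. The contents of file1 and file2 here are hypothetical, not the ones from this article. It assumes bash, since process substitution is not available in a plain POSIX sh. The grep line at the end is an alternative that this article does not cover, shown only for comparison.

```shell
#!/usr/bin/env bash
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical, unsorted sample data.
printf '%s\n' pear apple cherry banana > file1
printf '%s\n' cherry date apple > file2

# Lines unique to file1; the result comes out in sorted order.
comm -23 <(sort file1) <(sort file2) > file3
cat file3

# Alternative (not from this article): keep file1's original line order.
# -v inverts the match, -x matches whole lines, -F treats the patterns
# as fixed strings, and -f reads the patterns from file2.
grep -vxFf file2 file1 > file3_ordered
cat file3_ordered
```

With this sample data, file3 contains banana and pear, while file3_ordered contains pear and banana. The comm approach always returns sorted output, so the grep variant may be the better fit when the original order of file1 matters.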

3 Comments

  1. Suppose you wanted to update file1 instead of creating a new file3.
    Would this work for that case:
    # comm -23 <(sort file1) file1

  2. # comm -23 <(autodeploy-AMZ-PRODUCTION.txt) autodeploy-servers-toadd
    -bash: autodeploy-AMZ-PRODUCTION.txt: command not found

    • The comm command is there, but I can't figure out the exact syntax of the command:
      # comm -23 < (autodeploy-AMZ-PRODUCTION.txt) autodeploy-servers-toadd
      -bash: syntax error near unexpected token `('

