When you need to search for text in files across a directory tree from the command prompt or shell, Linux offers many tools. The oldest and most widely used of them is grep, whose name stands for "global regular expression print". grep has some drawbacks, though: for example, it is not as fast as newer tools when searching source code files. Another text/pattern searching tool, ack, was built specifically for searching text inside source code. A good search tool is a lifeline for developers who rely on the shell prompt, an editor like vi or emacs, or an IDE for writing code. In this article, we will cover the basics of a few search tools that make it easier to search for text inside files.
The search tools that we will explore in this tutorial are:
→ Ack
→ Ag (the silver searcher)
→ ripgrep (rg)
→ Sift
→ Platinum searcher (pt)
→ Git grep
1) Ack
Ack is a code-searching tool, similar to grep but optimized for programmers searching large trees of source code. It is written in pure Perl, is highly portable, and runs on any platform that runs Perl. By default, ack searches directories recursively and ignores common version control directories such as .git and .svn. It also ignores binary files, image/music/video files, and gzip/zip/tar archive files. The output of ack highlights matches and is clearly formatted.
Install ACK in Ubuntu
# sudo apt-get install ack-grep
Install ACK in CentOS
# sudo yum install ack
In Ubuntu, there is already a package named 'ack' that has nothing to do with searching, so the packagers renamed this search tool to ack-grep. Once you have installed it using apt-get, you can shorten its name to ack using the following command.
# sudo dpkg-divert --local --divert /usr/bin/ack --rename --add /usr/bin/ack-grep
To see all the options that you can use with the ack command, consult its man page:
# man ack
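The following is a minimal sketch of everyday ack usage, assuming ack is installed under the name ack (on Ubuntu it may still be called ack-grep). The sample files and the name connect are hypothetical and exist only to give the commands something to match.

```shell
# Build a small throwaway source tree (hypothetical files for illustration).
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf 'def connect():\n    return "connected"\n' > "$demo/src/db.py"
printf 'Call connect() before running queries.\n' > "$demo/README.txt"

if command -v ack >/dev/null 2>&1; then
    # ack recurses into directories by default, so no -r flag is needed.
    ack connect "$demo"
    # -w matches whole words only, -i ignores case.
    ack -w -i CONNECT "$demo"
    # -l prints only the names of matching files; --python restricts the search to Python files.
    ack -l --python connect "$demo"
else
    echo "ack is not installed; skipping the demo"
fi
```

The search path is passed explicitly here so the commands behave the same way in scripts and in an interactive shell.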
2) Ag - the silver searcher
Ag is a code-searching tool like ack, but it is significantly faster. Compared to ack, it can also search through compressed files and has better editor (vim) integration. Ag ignores file patterns listed in .gitignore and .hgignore. Basic usage of Ag is simple: cd to the directory you want to search and run ag blah to find instances of "blah". The silver searcher has been reported to be 34 times faster than ack when searching for the same text in source files.
Install Ag in Ubuntu
# sudo apt-get install silversearcher-ag
Install Ag in CentOS
# sudo yum install -y automake pcre-devel
# sudo yum install xz-devel
# cd /usr/local/src
# sudo git clone https://github.com/ggreer/the_silver_searcher.git
# cd the_silver_searcher
# sudo ./build.sh
# sudo make install
# which ag
To see all the options that you can use with the ag command, consult its man page:
# man ag
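Here is a short sketch of typical ag invocations, assuming ag is on your PATH. The files, the pattern blah, and the ignore rule are made up for illustration. The path is given explicitly instead of relying on the current directory.

```shell
# Build a small throwaway tree with an ignore file (hypothetical contents).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/build"
printf 'int blah(void) { return 0; }\n' > "$demo/src/main.c"
printf 'blah appears in generated output too\n' > "$demo/build/out.log"
printf '*.log\n' > "$demo/.gitignore"

if command -v ag >/dev/null 2>&1; then
    # Plain recursive search; files matching .gitignore patterns are expected to be skipped.
    ag blah "$demo"
    # -l lists only the names of files that contain a match.
    ag -l blah "$demo"
    # -i ignores case; -G limits the search to file names matching a regex.
    ag -i -G '\.c$' BLAH "$demo"
    # --skip-vcs-ignores (-U) searches files even if .gitignore/.hgignore would exclude them.
    ag --skip-vcs-ignores blah "$demo"
else
    echo "ag is not installed; skipping the demo"
fi
```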
3) ripgrep (rg)
Ripgrep is a line-oriented search tool that combines the usability of The Silver Searcher (similar to ack) with the raw speed of GNU grep. ripgrep works by recursively searching your current directory for a regex pattern. It has first-class support on Windows, Mac and Linux, with binary downloads available for every release.
Installation of ripgrep binary in Ubuntu/CentOS
# wget https://github.com/BurntSushi/ripgrep/releases/download/0.4.0/ripgrep-0.4.0-i686-unknown-linux-musl.tar.gz
# tar xf ripgrep-0.4.0-i686-unknown-linux-musl.tar.gz
# cd ripgrep-0.4.0-i686-unknown-linux-musl
# sudo mv rg /usr/local/bin
# which rg
The usage of rg is described on the GitHub page of ripgrep.
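As a quick orientation, the sketch below shows a few common rg invocations, assuming rg is installed as described above. The sample files and the TODO pattern are hypothetical.

```shell
# Build a small throwaway tree (hypothetical files for illustration).
demo=$(mktemp -d)
printf 'fn main() {\n    println!("TODO: parse args");\n}\n' > "$demo/main.rs"
printf '# todo: write docs\n' > "$demo/notes.md"

if command -v rg >/dev/null 2>&1; then
    # -n shows line numbers next to each match.
    rg -n 'TODO' "$demo"
    # -i makes the match case-insensitive.
    rg -i 'todo' "$demo"
    # -t restricts the search to a known file type (here: Rust sources).
    rg -t rust 'TODO' "$demo"
    # -g restricts the search to files matching a glob.
    rg -i -g '*.md' 'todo' "$demo"
    # -l prints only the names of files containing a match.
    rg -l -i 'todo' "$demo"
else
    echo "rg is not installed; skipping the demo"
fi
```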
4) Sift
Sift is another search tool, developed with both speed and flexibility in mind. Sift uses the Perl-compatible regular expression format and offers the basic options known from grep, but with usable defaults. It can select or exclude targets based on file name, directory name, path and type. Like the earlier search tools, sift understands .gitignore files and can be configured to show results only in relevant files. Sift has multiline support and can replace output to reformat it to your needs without relying on awk/sed. Sift can also search through gzip files and can handle big files larger than 50 GB. Another useful feature of sift is that you can specify various conditions while searching text, such as:
→ preceded by A
→ followed by B within X lines
→ if the file also contains a line with C
→ if the file contains D in the first Y lines
→ any combination of the available conditions
sift comes as a single executable with no dependencies and is available for all major platforms, so you can install it on any platform easily.
Download sift from the download section of the official sift site, unpack it and move it to any location listed in the PATH environment variable.
# wget https://sift-tool.org/downloads/sift/sift_0.9.0_linux_386.tar.gz
# tar xf sift_0.9.0_linux_386.tar.gz
# cd sift_0.9.0_linux_386
# sudo mv sift /usr/local/bin
# which sift
For sift usage, check the documentation at sift-tool.org.
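The conditional searches listed earlier can be sketched roughly as follows. The option names --preceded-by, --followed-within and --file-matches are written as I recall them from sift's documentation, so treat them as assumptions and confirm them with sift --help before relying on them. The sample files are hypothetical.

```shell
# Build a small throwaway tree (hypothetical files for illustration).
demo=$(mktemp -d)
printf 'import logging\nlog = logging.getLogger()\nlog.error("disk full")\n' > "$demo/a.py"
printf 'print("error: no logger here")\n' > "$demo/b.py"

if command -v sift >/dev/null 2>&1; then
    # Basic recursive search with line numbers.
    sift -n error "$demo"
    # Assumed option: report "error" only when an earlier line matches getLogger.
    sift --preceded-by 'getLogger' error "$demo"
    # Assumed option: report getLogger only when "error" follows within 2 lines.
    sift --followed-within '2:error' getLogger "$demo"
    # Assumed option: report "error" only in files that also contain "import logging".
    sift --file-matches 'import logging' error "$demo"
else
    echo "sift is not installed; skipping the demo"
fi
```

With these sample files, the conditional searches would be expected to report matches from a.py only, since b.py contains neither getLogger nor the import line.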
5) pt - The platinum searcher
Another source code search utility similar to ack and ag is the Platinum Searcher (pt), which is written in the Go programming language. It is claimed to be 3 to 5 times faster than ack. Pt is written in a memory-safe language and uses Go's standard regexp package, which lets it avoid exponential-time matching. The Platinum Searcher can search not only files encoded in UTF-8, but also EUC-JP and Shift_JIS, making it very useful for Japanese programmers.
Installing and using pt
The Platinum Searcher binaries are available for Windows, Mac OS X and Linux (including ARM) from its GitHub releases page. Download the binary, move it to a location listed in $PATH and start searching.
# wget https://github.com/monochromegane/the_platinum_searcher/releases/download/v2.1.5/pt_linux_386.tar.gz
# tar xf pt_linux_386.tar.gz
# cd pt_linux_386/
# sudo mv pt /usr/local/bin
# which pt
To search for a pattern in the current directory and all of its sub directories, simply type:
# pt PATTERN
The general syntax of pt and its options are:
pt [OPTIONS] PATTERN [PATH]
      --color                   Print color codes in results (Enabled by default)
      --nocolor                 Don't print color codes in results (Disabled by default)
      --nogroup                 Don't print file name at header (Disabled by default)
  -l, --files-with-matches      Only print filenames that contain matches
      --vcs-ignore=             VCS ignore files (.gitignore, .hgignore, .ptignore)
      --noptignore              Don't use default ($Home/.ptignore) file for ignore patterns
      --noglobal-gitignore      Don't use git's global gitignore file for ignore patterns
  -U, --skip-vsc-ignores        Don't use VCS ignore file for ignore patterns. Still obey .ptignore
      --ignore=                 Ignore files/directories matching pattern
  -i, --ignore-case             Match case insensitively
  -S, --smart-case              Match case insensitively unless PATTERN contains uppercase characters
  -g=                           Print filenames matching PATTERN
  -G, --file-search-regexp=     Limit search to filenames matching PATTERN
      --depth=                  Search up to NUM directories deep (Default: 25)
  -f, --follow                  Follow symlinks
  -A, --after=                  Print lines after match
  -B, --before=                 Print lines before match
  -C, --context=                Print lines before and after match
  -o, --output-encode=          Specify output encoding (none, jis, sjis, euc)
  -e                            Parse PATTERN as a regular expression (Disabled by default)
  -w, --word-regexp             Only match whole words
      --stats                   Print stats about files scanned, time taken, etc
      --version                 Show version
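To make the option list more concrete, here is a small sketch that combines a few of the flags above. It assumes pt is installed as described earlier. The sample files and the name Handler are hypothetical.

```shell
# Build a small throwaway tree (hypothetical files for illustration).
demo=$(mktemp -d)
printf 'package main\n\nfunc Handler() {}\n' > "$demo/server.go"
printf 'handler docs live here\n' > "$demo/notes.txt"

if command -v pt >/dev/null 2>&1; then
    # Literal search for the pattern as typed.
    pt Handler "$demo"
    # -i matches case-insensitively, so notes.txt is reported as well.
    pt -i handler "$demo"
    # -l prints only the names of files that contain a match.
    pt -l -i handler "$demo"
    # -e treats the pattern as a regular expression instead of a literal string.
    pt -e 'func +[A-Z]\w*' "$demo"
    # --context prints the given number of lines before and after each match.
    pt --context=1 Handler "$demo"
    # -G limits the search to files whose names match the regex.
    pt -i -G '\.go$' handler "$demo"
else
    echo "pt is not installed; skipping the demo"
fi
```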
6) Git grep
Git grep searches for a regular expression in a Git repository. In a way, it is just a combination of find and grep, but it is very concise and fast. Git grep is a great tool for finding all uses of and references to a symbol in a git repository. There is no separate installation for git grep, as it is installed along with git.
For usage of git grep, check the git-grep manual page.
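As a brief illustration, the sketch below creates a throwaway repository and runs a few common git grep commands against it. The file names and the symbol parse_config are hypothetical. Note that git grep searches tracked files by default, so the untracked scratch file only shows up when --untracked is given.

```shell
# Create a throwaway repository with two tracked files and one untracked file.
demo=$(mktemp -d)
if command -v git >/dev/null 2>&1; then
    git init -q "$demo"
    printf 'def parse_config(path):\n    return {}\n' > "$demo/util.py"
    printf 'from util import parse_config\n' > "$demo/main.py"
    printf 'parse_config scratch notes\n' > "$demo/scratch.txt"
    git -C "$demo" add util.py main.py

    # -n shows line numbers; only tracked files are searched.
    git -C "$demo" grep -n parse_config
    # -w matches whole words, -l lists file names, and the pathspec limits the search to *.py.
    git -C "$demo" grep -l -w parse_config -- '*.py'
    # -c prints the number of matching lines per file.
    git -C "$demo" grep -c parse_config
    # --untracked also searches files that have not been added to the repository.
    git -C "$demo" grep -l --untracked parse_config
else
    echo "git is not installed; skipping the demo"
fi
```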
There are a few other search utilities available, such as zgrep, agrep, xmlgrep and pdfgrep. Among the search tools discussed above, ripgrep is the fastest and is cross-platform, whereas the silver searcher (ag) is better than ack. Grep is written in C but does not ignore any files while searching, whereas ack is written in Perl and is very good at ignoring files.