Grep Tool For Mac

It's fast, it's powerful, and its very name suggests that it does something technical: grep. With this workhorse of the command line, you can quickly find text hidden in your files. Grep is an acronym that stands for Global Regular Expression Print. Grep is a Linux / Unix command-line tool used to search for a string of characters in a specified file. The text search pattern is called a regular expression. When it finds a match, it prints the line with the result. The grep command is handy when searching through large log files.

AckMate

Users of TextMate, the programmer's editor for the Mac, can use the AckMate plugin by Trevor Squires:

TextMate users know just how slow its 'Find in Project' can be with large source trees. That's why you need 'ack-in-project' — a TextMate bundle that uses the super-speedy 'ack' tool to search your code FAST. It gives you beautiful, clickable results just as fast as 'ack' can find them. Check it out at: https://github.com/protocool/ackmate

ack.vim

ack.vim provides an interface between ack and vim. For example, you can call :Ack foo, which will run ack and load ack's results into a vim buffer for manipulation and navigation.

ack.vim is available at the official vim website at https://www.vim.org/scripts/script.php?script_id=2572

ack2vim

ack2vim eases the interface between ack and vim, so that ack's findings can be found with vim as well.

Emacs integration

There are at least four different Emacs modes for supporting ack here at the Emacs Wiki: https://www.emacswiki.org/emacs/Ack

Note: Be careful of any solution that defaults you to using --all, since that option no longer exists in ack 2.

hhighlighter

hhighlighter is a wrapper around ack to make it easy to highlight words in a file. Invoke it like

cat file | h foo bar bat

and each of the three words 'foo', 'bar' and 'bat' will be highlighted as a different color. See https://github.com/paoloantinori/hhighlighter

There are many ways to search source code that are more flexible and tuned to programmers than straight grep. I suggest you take a look at some of these alternatives, for they may suit your needs better than ack. If you have any suggestions to add to this list, please let me know at andy@petdance.com or submit an issue at https://github.com/petdance/beyondgrep/issues.

ag, the Silver Searcher

Geoff Greer says 'ag is like ack, but better. It's fast. It's damn fast. The only thing faster is stuff that builds indices beforehand, like Exuberant Ctags.' Geoff has also created a fork of AckMate that uses Ag instead of ack.

cgrep

Cgrep is a grep tool suitable for searching in large code repositories. It supports 30 programming languages and searches that go beyond simple pattern matching. It enables context-aware filtering and semantic searches through wildcards and combinators.

grab

grab is another fast grep alternative that tries to use multiple cores. It relies on parallel processing, mmap and other speedy tricks behind the scenes.

glark

The biggest departure from ack, glark adds many more features like the ability to AND and OR your patterns. It's written in Ruby.

greple

greple is a search tool that lets you search for multiple keywords at a time.

grin

'A grep program configured the way I like it', written in Python by Robert Kern.

kaki

kaki is inspired by ack, and built on top of nodejs.

nak

An implementation of ack, written in Node.js. It has inspiration from Ag, and is optimized for speed, not features. It's completely asynchronous. Written by Garen J. Torikian.

paragrep

paragrep is a text search tool that operates at the paragraph level.

pcregrep

pcregrep looks like it's just a regular grep, but with a PCRE regex engine.

pss

pss is an ack clone written in Python by Eli Bendersky. It's written in pure Python with no additional modules necessary.

pt, the Platinum Searcher

The Platinum Searcher is another code search tool similar to ack and ag. It supports multiple platforms and multiple encodings.

rak

A straight clone of ack, with some visual tweaks, written in Ruby by Daniel Lucraft.

ripgrep

ripgrep is written in Rust and claims to be 'faster than everything else'.

sift

sift is a search tool written in Go. It claims to be very fast, faster than ag.

spot

spot is a tiny search utility that adapts some of ack's features. It's simple and uses find+grep+awk.

UniversalCodeGrep (ucg)

A fast ack-like search tool that is written in C++ using PCRE and makes use of concurrency.

vack

vack is visual ack for the Mac.

Sometimes when you're looking at a large codebase, it makes sense to see everything as a whole. An indexing tool may help you out.

ctags

ctags is a program almost as old as time itself. When run against a codebase, ctags indexes various elements of the code, such as variables and functions. This lets your editor or other tools use the tags index to jump quickly to that element.

The most common ctags implementation is Exuberant ctags: http://ctags.sourceforge.net/

cscope

Cscope is a developer's tool for browsing source code. Cscope was part of the official AT&T Unix distribution for many years, and has been used to manage projects involving 20 million lines of code. It also can integrate with vim and Emacs.

CodeQuery

CodeQuery indexes and queries C, C++, Java and Python source code. It builds upon the databases of cscope and ctags, mentioned above, and provides a nice GUI tool.

Code Search

Russ Cox, the guy that wrote Google's CodeSearch engine, wrote an article about how it worked and released an implementation in Go.

OpenGrok

OpenGrok is a fast and usable source code search and cross reference engine. It helps you search, cross-reference and navigate your source tree.

GNU GLOBAL

GNU GLOBAL is a source code tagging system that works the same way across diverse environments (emacs, vi, less, bash, web browser, etc). You can locate objects in source files and move there easily. It is similar to ctags or etags, but differs from them in that it is independent of any editor.

beagrep

Beagrep is a combination of a desktop search engine named beagle and grep. Use the search engine first, then use grep on the small subset of possibly matching files, thus it is very fast and useful for code reading in huge source trees.

Hound

Hound provides a centralized web front-end to a regex-searchable text index of multiple Git repositories. It was created by engineers at Etsy to handle searching across codebases.

Mac OS X Command Line 101
by Richard Burton

Understanding The 'grep' Command In Mac OS X
Part XV of this series...
October 4th, 2002

I don't know why Dudley keeps trying to find himself, I found him years ago.
- Peter Cook

This series is designed to help you learn more about the Mac OS X command line. If you have any questions about what you read here, check out the earlier columns, write back in the comments below, or join us in the Hardcore X! forum.

In the previous column, we learned about regular expressions, and how to use them to search for text in vi. Having such a text-searching tool for the command line would be a valuable addition to Unix; naturally, such a tool exists. It is called grep, and it is the subject of today's column.

grep allows you to search through your entire system, for either the name of a file, or for content within those files. This is similar to the way Sherlock used to work before Sherlock 3, and the way 'Find' works today in Jaguar's GUI. When you need to find a string of text on your system from the command line, grep is the way to do it. Now, on to how to use it.

The grep command will take a regular expression, as well as a list of files. It will then search through the files and, for each line that is matched by the regular expression, print the line. (Supposedly, the name grep comes from ed command g/RE/p, or 'global/regular expression/print', which does the same thing within the editor. I can neither confirm nor deny this.) If there are no files indicated, grep will read from standard input. Therefore, you can do things like:

to give a more flexible search. Notice that the regular expression, .es.*, was enclosed in double quotes. Otherwise, we get this:

[Note: I think that this is because the asterisk and/or period will confuse the tcsh command line, which tries to use them as metacharacters, so you need the quotes. On the other hand, if you want to anchor the regular expression to the end of a line with a dollar sign, it interprets this as a variable $' and chokes. tcsh is quirky with regular expressions, and I haven't quite figured out everything with it. I know from experience that the Korn shell, ksh, does not suffer from this. On the other hand, ksh is not the default shell, so there y'are.]

You also need quotes if you have spaces in the regular expression. The difference between grep the file and grep 'the ' file is that the former will match any occurrence of t-h-e, whereas the latter will match only for t-h-e-space. This means that the former will match 'I was there' but the latter will not. Remember that the command line ignores extra spaces, collapsing many into one, unless the spaces are quoted.
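Here is a small sketch of the quoting point, assuming a POSIX-style shell (sh, bash, zsh) rather than tcsh. The sample text is invented, and grep reads it from standard input because no file is named.

```shell
# Two invented sample lines, fed to grep on standard input.
sample='I was there
the cat sat'

# No trailing space in the pattern: matches t-h-e anywhere, so both lines.
any_the=$(printf '%s\n' "$sample" | grep 'the')

# Quoted trailing space: matches only t-h-e-space, so 'I was there' drops out.
the_space=$(printf '%s\n' "$sample" | grep 'the ')

printf 'the:\n%s\n\nthe-space:\n%s\n' "$any_the" "$the_space"
```

Single quotes are used here because they stop the shell from touching anything inside them, including the space.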

As you might expect, grep takes the standard regular expression characters of ., *, ^, $, \, and [ ]. Thus, to count the number of blank lines in a file, do:

Thus, we can see that grep ^$ testfile will print all three blank lines. We can use wc and the pipe, |, to build our own tool to count blank lines. Neat, huh?
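The blank-line counter can be sketched like this. The file is a throwaway built on the spot with invented contents, and the pattern is quoted so the shell leaves the dollar sign alone.

```shell
# Build a temporary test file that contains exactly three blank lines.
testfile=$(mktemp)
printf 'first\n\nsecond\n\n\nlast\n' > "$testfile"

# ^$ matches a line with nothing between its start and its end.
# wc -l counts the lines grep prints; tr strips the padding some wc versions add.
blank_count=$(grep '^$' "$testfile" | wc -l | tr -d ' ')
echo "$blank_count"

rm -f "$testfile"
```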

In some Unixes (Unices?), there were two versions of grep, grep and egrep, whose primary difference was that each had slightly different additions to the basic regular expression syntax. In Darwin, and therefore in OS X, the syntaxes (syntaces?) are combined, and using either command will get you the same as using the other. Thus, you can bounce back and forth between them like so many yo-yos (yo-yi?)[*]

One set of regular expression characters available in grep is the { } pair. This allows you to search for a range of occurrences. Suppose you want to look for 'to', followed by three to nine characters, followed by an 'a'. This can be done by:

Again, the quotes are needed here. If you want to match exactly 3, the regular expression is to.{3}a. Normally, the { } pair is only available in grep, but in Darwin and OS X, it is also available in egrep.
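As a rough sketch of the { } pair, the example below uses grep -E, which is an assumption on my part: with -E the braces need no backslashes, whereas the portable spelling in plain grep is \{3,9\}. The sample words are invented.

```shell
# Invented sample lines, one per argument.
words=$(printf '%s\n' 'today' 'toyota' 'to be a' 'tokyo bay')

# 'to', then three to nine of any character, then 'a'.
range_hits=$(printf '%s\n' "$words" | grep -E 'to.{3,9}a')

# 'to', then exactly three of any character, then 'a'.
exact_hits=$(printf '%s\n' "$words" | grep -E 'to.{3}a')

printf 'range:\n%s\n\nexact:\n%s\n' "$range_hits" "$exact_hits"
```

Here 'today' never matches because its only 'a' comes one character after 'to', while 'toyota' is the one line that fits the exact-three form.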

grep's regular expression syntax is expanded in OS X to include features not seen in the standard definition of grep. In other words, OS X's grep will let you do searches that greps on other Unices won't. For example, you can use the \< \> pair to denote the beginnings and endings of words, just like in vi.
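A quick sketch of the word markers, which are written with backslashes as \< and \>. This assumes a grep that accepts them (the GNU and BSD versions do); the sample lines are invented.

```shell
# Invented sample lines: 'the' appears as a whole word in only two of them.
lines=$(printf '%s\n' 'the cat' 'there it is' 'bathe' 'see the')

# \< anchors at the start of a word, \> at the end of one.
whole_word=$(printf '%s\n' "$lines" | grep '\<the\>')

printf '%s\n' "$whole_word"
```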

We have seen that the asterisk (*) is used to denote 'any number of the thing preceding me.' In OS X's grep, the plus sign, +, can be used to denote 'at least one of the thing preceding me.' So, while the regular expression th*e will match te, the, thhe, ..., the regular expression th+e will match the, thhe, thhhe, .... So you can see that h+ is the same as hh*. The plus sign is often used in other utilities' regular expressions, but is not part of grep on most other systems. Make a note of it; there will be a quiz later.
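To see the star and the plus side by side, here is a sketch that uses grep -E. The -E is my assumption, chosen so the plus sign is treated as an operator on systems where plain grep would take it literally. The sample words are invented.

```shell
# Invented sample words, one per line.
words=$(printf '%s\n' 'te' 'the' 'thhe' 'toe')

star=$(printf '%s\n' "$words" | grep -E 'th*e')    # zero or more h
plus=$(printf '%s\n' "$words" | grep -E 'th+e')    # one or more h
equiv=$(printf '%s\n' "$words" | grep -E 'thh*e')  # h followed by zero or more h

printf 'star:\n%s\n\nplus:\n%s\n' "$star" "$plus"
```

The plus and the hh* spellings pick out the same lines, and only the star version lets 'te' through.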

Another bonus freebie that is thrown our way is the question mark, ?, unless you are British and over 35, in which case it is 'a mark of interrogation.' grep uses this in regular expressions to denote 'zero or one occurrence of the thing before me', or 'an optional [whatever is before me].' Therefore, the expression lie?d will match either lied or lid.
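The optional-character trick can be tried like so. Again this is only a sketch using grep -E (my assumption, for portability), with invented sample words.

```shell
# e? means the 'e' may appear once or not at all.
optional=$(printf '%s\n' 'lied' 'lid' 'lieed' 'lead' | grep -E 'lie?d')

printf '%s\n' "$optional"
```

Neither 'lieed' nor 'lead' matches, since one has an extra 'e' and the other has no 'i' after the 'l'.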

Finally, the vertical bar, |, can be used for either/or matching, just like in, you guessed it, vi.
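A sketch of either/or matching follows, once more with grep -E as an assumption and invented input. The pattern has to be quoted here, or the shell would read the bar as a pipe.

```shell
# Match lines containing either 'cat' or 'dog'.
either=$(printf '%s\n' 'a cat' 'a dog' 'a bird' | grep -E 'cat|dog')

printf '%s\n' "$either"
```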

grep can take several options; you can see them all via man grep, of course, but I've found that the most useful ones are (remember that this works in the grep option format):

-c: 'count the lines'. Instead of printing all the matched lines, -c merely prints a count of matched lines for each file. Thus that '| wc -l' trick isn't needed for one file. (If you pass in a list of files, though, ...)

-e PATTERN: 'expression starts here.' Using -e will tell grep 'What follows is the pattern with which to search.' This is very useful when your pattern starts with a '-'. Otherwise, the command line might think that your expression is an option and get confused.

-f FILE: 'file holds the expression'. -f allows you to store a pattern in a file and tell grep 'Yo, use this.' I've mostly used this when writing scripts that will use the same pattern repeatedly. That way, if I have to change it later, I only have to change it in one place.

-i: 'ignore case'. -i forces grep to ignore the distinction between uppercase and lowercase. Imagine you need to find matches in a file which may have come from Windows (include shudder here). Now imagine a long string of paired letters like [Tt][Hh][Ee] [Cc][Aa][Tt] and on and on. Just use -i instead and save yourself time and pain.

-l: 'list files'. Instead of printing the matched lines, when you use the -l option, grep will just print a list of the files which contain the expression. This is mostly used when you are doing something like grep 'expression' * in a directory with a lot of files or when you just want to know which files need (processing, editing, etc).

-n: 'number'. -n means that before each line of output, grep will print its line number within the file.

-v: 'invert'. -v instructs grep to print only those lines that don't match the expression.
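The options above can be tried out with a sketch like the following. The directory, file names and contents are all invented for the occasion, and everything is thrown away at the end.

```shell
# Throwaway directory with invented files.
dir=$(mktemp -d)
printf 'The Cat\nthe dog\n-v is a flag\n' > "$dir/a.txt"
printf 'no match here\n' > "$dir/b.txt"
printf 'the\n' > "$dir/pattern.txt"

count=$(grep -c 'the' "$dir/a.txt")                  # -c: count matching lines
nocase=$(grep -ci 'the' "$dir/a.txt")                # -i: same count, ignoring case
dash=$(grep -e '-v' "$dir/a.txt")                    # -e: pattern starts with '-'
fromfile=$(grep -f "$dir/pattern.txt" "$dir/a.txt")  # -f: pattern read from a file
files=$(cd "$dir" && grep -l 'dog' a.txt b.txt)      # -l: names of matching files
numbered=$(grep -n 'dog' "$dir/a.txt")               # -n: prefix the line number
inverted=$(grep -v 'the' "$dir/a.txt")               # -v: lines that do NOT match

printf '%s\n' "$count" "$nocase" "$dash" "$fromfile" "$files" "$numbered" "$inverted"
rm -rf "$dir"
```

Note that without -e, grep would take '-v' as the invert option and then complain that no pattern was given.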

As you can see, grep is a very powerful tool. It can be used to quickly search files and to filter output on the command line. It does have a couple limitations, though. First, it is no speed demon. Building those regular expressions and parsing a lot of text in a flexible way takes resources, and that takes time. (Admittedly, these days, that isn't much of an issue, but still, there it is.) Second, consider the following: you are working away, happy as a clam, and the boss says 'Cyprian', if your name is Cyprian, 'I just got a call from marketing, we need to change the search in all those voodoo scripts you wrote, and we need it in ten minutes.'

Now, you know and I know that you can look for the expression the.?.*ca[r]?t and search for it using after after . But my lord, and your duke for that matter, who the heck would want to? Do you realize that you would look for the.?.*ca[r]?t (or something along those lines) and heaven forbid you should make the slightest mistake. If you're like me, and I know I am, you'd think 'Now dash it, there must be an easier way. Surely, in all the history of Unix, someone has had to face just such an emergency and written a grep-like tool to deal with this. Like that Cyprian chap, maybe.' Well, Cyprian has come through. It's called fgrep (for 'fast grep'), and it works a lot like grep except it doesn't take a regular expression.

Where you would normally place a regular expression, just put in a literal string. Originally it was used to be a fast alternative to grep by trading the power and flexibility of regular expressions for speed. As quick as computers are these days, that isn't an issue, but if you want to find something that contains a literal period or a literal asterisk, it's the bee's knees.
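To see the literal-string behavior, here is a sketch that uses grep -F, which is the option-style spelling of fgrep. I use it here as an assumption about what is most portable, since some newer greps warn when called as fgrep. The sample lines are invented.

```shell
# Invented sample lines: only the first contains a literal period and asterisk.
lines=$(printf '%s\n' 'a.*b' 'axxb')

as_regex=$(printf '%s\n' "$lines" | grep 'a.*b')      # .* means 'any run of characters'
as_literal=$(printf '%s\n' "$lines" | grep -F 'a.*b') # .* means a period and an asterisk

printf 'regex:\n%s\n\nliteral:\n%s\n' "$as_regex" "$as_literal"
```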

[*] This joke was borrowed at great embarrassment from Shelley Berman. All young whippersnappers are advised to ask their parents or grandparents.

Richard Burton is a longtime Unix programmer and a handsome brute. He spends his spare time yelling at the television during Colts and Pacers games, writing politically incorrect short stories, and trying to shoot the neighbor's cat (not really) nesting in his garage. He can be seen running roughshod over the TMO forums under the alias tbone1.