grep to wc -l in order to count the number of lines of output. The -c option to grep gives a count of lines that match the specified pattern and is generally faster than a pipe to wc, as in the following example:

Listing 1. Example of good habit #8: Counting lines with and without grep
    ~ $ time grep and tmp/a/longfile.txt | wc -l
    2811

    real    0m0.097s
    user    0m0.006s
    sys     0m0.032s
    ~ $ time grep -c and tmp/a/longfile.txt
    2811

    real    0m0.013s
    user    0m0.006s
    sys     0m0.005s
    ~ $
In addition to the speed factor, the -c option is also a better way to do the counting. With multiple files, grep with the -c option returns a separate count for each file, one on each line and prefixed with the file name, whereas a pipe to wc gives a total count for all files combined.

However, regardless of speed considerations, this example showcases another common error to avoid. These counting methods only count the lines that contain a match, and if that is what you are looking for, that is great. But in cases where lines can have multiple instances of a particular pattern, these methods do not give you a true count of the actual number of instances matched.
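The difference between per-file and combined counts can be seen in a minimal sketch like the one below. The sample files one.txt and two.txt are hypothetical and are created in a temporary directory only for illustration; they are not part of the original listings.

```shell
#!/bin/sh
# Hypothetical sample files, created in a temporary directory for illustration.
dir=$(mktemp -d)
printf 'salt and pepper\nbread\nhand in hand\n' > "$dir/one.txt"
printf 'sand\nstone\n' > "$dir/two.txt"

# grep -c prints one "file:count" line for each file.
per_file=$(cd "$dir" && grep -c and one.txt two.txt)
echo "$per_file"
# one.txt:2
# two.txt:1

# A pipe to wc -l reports a single total for all files combined.
# tr strips the padding that some wc implementations add.
total=$(cd "$dir" && grep and one.txt two.txt | wc -l | tr -d ' ')
echo "$total"
# 3

rm -r "$dir"
```

Note that both results count matching lines: the line "hand in hand" contains the pattern twice but contributes only one to each count.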
To count the number of instances, use wc to count, after all. First, run a grep command with the -o option, if your version supports it. This option outputs only the matched pattern, one on each line, and not the line itself. But you cannot use it in conjunction with the -c option to get an instance count, because -c still counts matching lines, so use wc -l to count the lines, as in the following example:

Listing 2. Example of good habit #8: Counting pattern instances with grep
    ~ $ grep -o and tmp/a/longfile.txt | wc -l
    3402
    ~ $
In this case, a call to wc is slightly faster than a second call to grep with a dummy pattern put in to match and count each line (such as grep -c).
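To make the distinction between matching lines and matched instances concrete, here is a small sketch on a hypothetical three-line sample file (created with mktemp for illustration only). It assumes a grep that supports the -o option, which GNU and BSD grep do.

```shell
#!/bin/sh
# Hypothetical sample: the first line contains the pattern twice.
sample=$(mktemp)
printf 'hand in hand\nsalt and pepper\nbread\n' > "$sample"

# -c counts lines that contain at least one match.
lines=$(grep -c and "$sample")

# -o prints each match on its own line, so wc -l counts instances.
# tr strips the padding that some wc implementations add.
instances=$(grep -o and "$sample" | wc -l | tr -d ' ')

echo "matching lines: $lines"
# matching lines: 2
echo "matched instances: $instances"
# matched instances: 3

rm "$sample"
```

The two numbers differ whenever a line holds more than one match, which mirrors the gap between 2811 and 3402 in the listings above.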