How can I search for a multiline pattern in a file?

I needed to find all the files that contained a specific string pattern. The first solution that comes to mind is using find piped with xargs grep:

find . -iname '*.py' | xargs grep -e 'YOUR_PATTERN'

But if I need to find patterns that spans on more than one line, I'm stuck because vanilla grep can't find multiline patterns.

Answers


So I discovered pcregrep which stands for Perl Compatible Regular Expressions GREP.

For example, you need to find files where the '_name' variable is immediatelly followed by the '_description' variable:

find . -iname '*.py' | xargs pcregrep -M '_name.*\n.*_description'

Tip: you need to include the line break character in your pattern. Depending on your platform, it could be '\n', \r', '\r\n', ...


Why don't you go for awk:

awk '/Start pattern/,/End pattern/' filename

Here is the example using GNU grep:

grep -Pzo '_name.*\n.*_description'

-z/--null-data Treat input and output data as sequences of lines.

See also here


grep -P also uses libpcre, but is much more widely installed. To find a complete title section of an html document, even if it spans multiple lines, you can use this:

grep -P '(?s)<title>.*</title>' example.html

Since the PCRE project implements to the perl standard, use the perl documentation for reference:


Here is a more useful example:

pcregrep -Mi "<title>(.*\n){0,5}</title>" afile.html

It searches the title tag in a html file even if it spans up to 5 lines.

Here is an example of unlimited lines:

pcregrep -Mi "(?s)<title>.*</title>" example.html 

With silver searcher:

ag 'abc.*(\n|.)*efg'

Speed optimizations of silver searcher could possibly shine here.


You can use the grep alternative sift here (disclaimer: I am the author).

It support multiline matching and limiting the search to specific file types out of the box:

sift -m --files '*.py' 'YOUR_PATTERN'

(search all *.py files for the specified multiline regex pattern)

It is available for all major operating systems. Take a look at the samples page to see how it can be used to to extract multiline values from an XML file.


This answer might be useful:

Regex (grep) for multi-line search needed

To find recursively you can use flags -R (recursive) and --include (GLOB pattern). See:

Use grep --exclude/--include syntax to not grep through certain files


perl -ne 'print if (/begin pattern/../end pattern/)' filename

Using ex/vi editor and globstar option (syntax similar to awk and sed):

ex +"/string1/,/string3/p" -R -scq! file.txt

where aaa is your starting point, and bbb is your ending text.

To search recursively, try:

ex +"/aaa/,/bbb/p" -scq! **/*.py

Note: To enable ** syntax, run shopt -s globstar (Bash 4 or zsh).


@Marcin: awk example non-greedy:

awk '{if ($0 ~ /Start pattern/) {triggered=1;}if (triggered) {print; if ($0 ~ /End pattern/) { exit;}}}' filename

Need Your Help

Size property has an invalid size of 0

c# asp.net sql-server stored-procedures sqlparameters

I am working on a social network, one of my procedures returns a VARCHAR output.

How to integrate WinMerge with TortoiseSvn after installation?

windows tortoisesvn winmerge

When you install winmerge after TortoiseSVN it gives you the option of associating winmerge with Tortoise. But if we install TortoiseSVN after winmerge how can we associate Winmerge to be used inst...