PCREGREP(1)                 General Commands Manual                PCREGREP(1)



NAME
       pcregrep - a grep with Perl-compatible regular expressions.

SYNOPSIS
       pcregrep [options] [long options] [pattern] [path1 path2 ...]


DESCRIPTION

       pcregrep  searches  files  for  character  patterns, in the same way as
       other grep commands do, but it uses the PCRE regular expression library
       to support patterns that are compatible with the regular expressions of
       Perl 5. See pcresyntax(3) for a quick-reference summary of pattern syn-
       tax,  or pcrepattern(3) for a full description of the syntax and seman-
       tics of the regular expressions that PCRE supports.

       Patterns, whether supplied on the command line or in a  separate  file,
       are given without delimiters. For example:

         pcregrep Thursday /etc/motd

       If you attempt to use delimiters (for example, by surrounding a pattern
       with slashes, as is common in Perl scripts), they  are  interpreted  as
       part  of  the pattern. Quotes can of course be used to delimit patterns
       on the command line because they are interpreted by the shell, and  in-
       deed  quotes  are  required  if a pattern contains white space or shell
       metacharacters.

       The first argument that follows any option settings is treated  as  the
       single  pattern  to be matched when neither -e nor -f is present.  Con-
       versely, when one or both of these options are  used  to  specify  pat-
       terns, all arguments are treated as path names. At least one of -e, -f,
       or an argument pattern must be provided.

       If no files are specified, pcregrep reads the standard input. The stan-
       dard  input can also be referenced by a name consisting of a single hy-
       phen.  For example:

         pcregrep some-pattern /file1 - /file3

       By default, each line that matches a pattern is copied to the  standard
       output,  and if there is more than one file, the file name is output at
       the start of each line, followed by a colon. However, there are options
       that  can  change  how  pcregrep  behaves. In particular, the -M option
       makes it possible to search for patterns  that  span  line  boundaries.
       What  defines  a  line boundary is controlled by the -N (--newline) op-
       tion.

       The amount of memory used for buffering files that are being scanned is
       controlled  by a parameter that can be set by the --buffer-size option.
       The default value for this parameter  is  specified  when  pcregrep  is
       built; if not changed at build time, it is 20K. A block of memory three
       times this size is used (to allow for buffering  "before"  and  "after"
       lines). An error occurs if a line overflows the buffer.

       Patterns  can  be  no  longer than 8K or BUFSIZ bytes, whichever is the
       greater.  BUFSIZ is defined in <stdio.h>. When there is more  than  one
       pattern (specified by the use of -e and/or -f), each pattern is applied
       to each line in the order in which they are defined,  except  that  all
       the -e patterns are tried before the -f patterns.

       By  default, as soon as one pattern matches a line, no further patterns
       are considered. However, if --colour (or --color) is used to colour the
       matching  substrings, or if --only-matching, --file-offsets, or --line-
       offsets is used to output only the part of the line that  matched  (ei-
       ther  shown  literally,  or as an offset), scanning resumes immediately
       following the match, so that further matches on the same  line  can  be
       found.  If  there  are multiple patterns, they are all tried on the re-
       mainder of the line, but patterns that follow the one that matched  are
       not tried on the earlier part of the line.

       This  behaviour  means  that  the  order in which multiple patterns are
       specified can affect the output when one of the above options is  used.
       This  is no longer the same behaviour as GNU grep, which now manages to
       display earlier matches for later patterns (as  long  as  there  is  no
       overlap).

       Patterns  that can match an empty string are accepted, but empty string
       matches are never recognized. An example is the pattern
       "(super)?(man)?", in which all components are optional. This pattern
       finds all occurrences of both "super" and "man"; the output differs
       from matching with "super|man" when only the matching substrings are
       being shown.

       If the LC_ALL or LC_CTYPE environment variable is  set,  pcregrep  uses
       the  value to set a locale when calling the PCRE library.  The --locale
       option can be used to override this.


SUPPORT FOR COMPRESSED FILES

       It is possible to compile pcregrep so that it uses libz  or  libbz2  to
       read  files  whose names end in .gz or .bz2, respectively. You can find
       out whether your binary has support for one or both of these file types
       by running it with the --help option. If the appropriate support is not
       present, files are treated as plain text. The standard input is  always
       so treated.


BINARY FILES

       By  default,  a  file that contains a binary zero byte within the first
       1024 bytes is identified as a binary file, and is processed  specially.
       (GNU grep also identifies binary files in this manner.) See the
       --binary-files option for a means of changing the way binary files are
       handled.
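
       The default test described above can be sketched in a few lines. This
       is an illustrative Python sketch, not pcregrep's own code; the function
       name looks_binary and the sample data are assumptions made for the
       example.

```python
def looks_binary(data: bytes, window: int = 1024) -> bool:
    """Return True if a zero byte occurs within the first `window` bytes."""
    # Only the leading window is inspected; a zero byte further into the
    # file does not change the classification.
    return b"\x00" in data[:window]


if __name__ == "__main__":
    print(looks_binary(b"plain text\n"))
    print(looks_binary(b"header\x00payload"))
```

       A file classified this way is then handled according to the
       --binary-files setting.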


OPTIONS

       The  order  in  which some of the options appear can affect the output.
       For example, both the -h and -l options affect  the  printing  of  file
       names.  Whichever  comes later in the command line will be the one that
       takes effect. Similarly, except where noted  below,  if  an  option  is
       given  twice,  the  later setting is used. Numerical values for options
       may be followed by K  or  M,  to  signify  multiplication  by  1024  or
       1024*1024 respectively.
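
       The K and M suffixes can be illustrated with a short Python sketch.
       The helper name parse_number is hypothetical, and the sketch accepts
       only the upper-case suffixes named above; it is not taken from
       pcregrep's source.

```python
def parse_number(text: str) -> int:
    """Parse a decimal option value with an optional trailing K or M."""
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        # Strip the suffix and scale the remaining digits.
        return int(text[:-1]) * multipliers[suffix]
    return int(text)


if __name__ == "__main__":
    for value in ("100", "20K", "1M"):
        print(value, "=", parse_number(value))
```

       For instance, --buffer-size=20K corresponds to a parameter of
       parse_number("20K") bytes, and the block of memory described under
       DESCRIPTION would be three times that value.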

       --        This terminates the list of options. It is useful if the next
                 item on the command line starts with a hyphen but is  not  an
                 option.  This allows for the processing of patterns and file-
                 names that start with hyphens.

       -A number, --after-context=number
                 Output number lines of context after each matching  line.  If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator is used instead of a colon for the  context  lines.  A
                 line  containing  "--" is output between each group of lines,
                 unless they are in fact contiguous in  the  input  file.  The
                 value  of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of following text avail-
                 able for context output.

       -a, --text
                 Treat  binary  files as text. This is equivalent to --binary-
                 files=text.

       -B number, --before-context=number
                 Output number lines of context before each matching line.  If
                 filenames and/or line numbers are being output, a hyphen sep-
                 arator is used instead of a colon for the  context  lines.  A
                 line  containing  "--" is output between each group of lines,
                 unless they are in fact contiguous in  the  input  file.  The
                 value  of number is expected to be relatively small. However,
                 pcregrep guarantees to have up to 8K of preceding text avail-
                 able for context output.

       --binary-files=word
                 Specify  how binary files are to be processed. If the word is
                 "binary" (the default), pattern matching is performed on  bi-
                 nary  files,  but  the  only  output  is  "Binary file <name>
                 matches" when a match succeeds. If the word is "text",  which
                 is  equivalent  to  the -a or --text option, binary files are
                 processed in the same way as any other file.  In  this  case,
                 when  a  match  succeeds,  the  output may be binary garbage,
                 which can have nasty effects if sent to a  terminal.  If  the
                 word  is  "without-match",  which is equivalent to the -I op-
                 tion, binary files are not processed at all; they are assumed
                 not to be of interest.

       --buffer-size=number
                 Set  the  parameter that controls how much memory is used for
                 buffering files that are being scanned.

       -C number, --context=number
                 Output number lines of context both  before  and  after  each
                 matching  line.  This is equivalent to setting both -A and -B
                 to the same value.
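
                 The context rules for -A, -B, and -C can be sketched as
                 follows. This is a simplified Python illustration that
                 assumes line numbers are being output (as with -n) and uses
                 Python's re module as a stand-in for PCRE; the function name
                 grep_context is hypothetical.

```python
import re


def grep_context(lines, pattern, before=0, after=0):
    """Return numbered output lines with context around each match."""
    rx = re.compile(pattern)
    hits = [i for i, line in enumerate(lines) if rx.search(line)]
    hit_set = set(hits)

    # Collect every line index that must be shown: matches plus context.
    wanted = set()
    for i in hits:
        wanted.update(range(max(0, i - before), min(len(lines), i + after + 1)))

    out = []
    previous = None
    for i in sorted(wanted):
        # A "--" line separates groups that are not contiguous in the input.
        if (before or after) and previous is not None and i != previous + 1:
            out.append("--")
        # Matching lines use a colon, context lines use a hyphen.
        separator = ":" if i in hit_set else "-"
        out.append(f"{i + 1}{separator}{lines[i]}")
        previous = i
    return out


if __name__ == "__main__":
    sample = ["a", "b", "HIT", "d", "e", "f", "g", "HIT", "i"]
    print("\n".join(grep_context(sample, "HIT", before=1, after=1)))
```

                 Calling grep_context with equal before and after values
                 corresponds to the -C option.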

       -c, --count
                 Do not output individual lines from the files that are  being
                 scanned; instead output the number of lines that would other-
                 wise have been shown. If no lines are  selected,  the  number
                 zero is output. If several files are being scanned, a
                 count is output for each of them. However,  if  the  --files-
                 with-matches  option  is  also  used,  only those files whose
                 counts are greater than zero are listed. When -c is used, the
                 -A, -B, and -C options are ignored.

       --colour, --color
                 If this option is given without any data, it is equivalent to
                 "--colour=auto".  If data is required, it must  be  given  in
                 the same shell item, separated by an equals sign.

       --colour=value, --color=value
                 This option specifies under what circumstances the parts of a
                 line that matched a pattern should be coloured in the output.
                 By  default,  the output is not coloured. The value (which is
                 optional, see above) may be "never", "always", or "auto".  In
                 the  latter case, colouring happens only if the standard out-
                 put is connected to a terminal. More resources are used  when
                 colouring  is enabled, because pcregrep has to search for all
                 possible matches in a line, not just one, in order to  colour
                 them all.

                 The colour that is used can be specified by setting the envi-
                 ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
                 of this variable should be a string of two numbers, separated
                 by a semicolon. They are copied  directly  into  the  control
                 string  for  setting  colour on a terminal, so it is your re-
                 sponsibility to ensure that they make sense.  If  neither  of
                 the  environment  variables  is  set,  the default is "1;31",
                 which gives red.
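
                 As a rough illustration of how such a value ends up in the
                 output, the Python sketch below wraps a matched substring in
                 a terminal control string. The escape format (ESC [ value m,
                 reset with ESC [ 0 m) and the order in which the two
                 variables are consulted are assumptions, not details taken
                 from this document; colourize is a hypothetical name.

```python
import os


def colourize(text: str, environ=os.environ, default: str = "1;31") -> str:
    """Wrap `text` in a control string built from the colour setting."""
    # Assumed lookup order: PCREGREP_COLOUR, then PCREGREP_COLOR, then default.
    colour = (environ.get("PCREGREP_COLOUR")
              or environ.get("PCREGREP_COLOR")
              or default)
    # The value is copied verbatim, so it is up to the user to make it sensible.
    return f"\x1b[{colour}m{text}\x1b[0m"


if __name__ == "__main__":
    print(colourize("match", environ={}))
    print(colourize("match", environ={"PCREGREP_COLOR": "1;32"}))
```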

       -D action, --devices=action
                 If an input path is not a regular file or a  directory,  "ac-
                 tion"  specifies  how it is to be processed. Valid values are
                 "read" (the default) or "skip" (silently skip the path).

       -d action, --directories=action
                 If an input path is a directory, "action" specifies how it is
                 to  be  processed.   Valid  values are "read" (the default in
                 non-Windows environments, for compatibility with  GNU  grep),
                 "recurse"  (equivalent to the -r option), or "skip" (silently
                 skip the path, the default in Windows environments).  In  the
                 "read"  case,  directories  are read as if they were ordinary
                 files. In some operating systems the effect of reading a  di-
                 rectory  like  this is an immediate end-of-file; in others it
                 may provoke an error.

       -e pattern, --regex=pattern, --regexp=pattern
                 Specify a pattern to be matched. This option can be used mul-
                 tiple times in order to specify several patterns. It can also
                 be used as a way of specifying a single pattern  that  starts
                 with  a hyphen. When -e is used, no argument pattern is taken
                 from the command line; all  arguments  are  treated  as  file
                 names.  There is no limit to the number of patterns. They are
                 applied to each line in the order in which they  are  defined
                 until one matches.

                 If  -f is used with -e, the command line patterns are matched
                 first, followed by the patterns from the file(s), independent
                 of  the order in which these options are specified. Note that
                 multiple use of -e is not the same as a single  pattern  with
                 alternatives. For example, X|Y finds the first character in a
                 line that is X or Y, whereas if the two  patterns  are  given
                 separately,  with X first, pcregrep finds X if it is present,
                 even if it follows Y in the line. It finds Y only if there is
                 no  X  in  the line. This matters only if you are using -o or
                 --colo(u)r to show the part(s) of the line that matched.
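
                 The difference between separate patterns and a single
                 pattern with alternatives can be sketched in Python. The
                 sketch uses Python's re module as a stand-in for PCRE and a
                 hypothetical helper, matched_parts, that mimics the scanning
                 order described above when matched parts are being shown; it
                 is not pcregrep's implementation.

```python
import re


def matched_parts(patterns, line):
    """Collect matched substrings of `line`, trying patterns in order."""
    found = []
    pos = 0
    while True:
        for pattern in patterns:
            rx = re.compile(pattern)
            # First non-empty match of this pattern in the rest of the line;
            # empty-string matches are never recognized.
            m = next((m for m in rx.finditer(line, pos) if m.end() > m.start()),
                     None)
            if m:
                found.append(m.group(0))
                pos = m.end()  # resume immediately after the match
                break
        else:
            return found  # no pattern matched the remainder of the line


if __name__ == "__main__":
    print(matched_parts(["X", "Y"], "aYbX"))  # two patterns, X first
    print(matched_parts(["X|Y"], "aYbX"))     # one pattern with alternatives
```

                 With the two separate patterns, the Y that precedes X is
                 never reported, because scanning resumes after the X match;
                 the single pattern X|Y reports both.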

       --exclude=pattern
                 Files (but not directories) whose names match the pattern are
                 skipped  without  being processed. This applies to all files,
                 whether listed on the command  line,  obtained  from  --file-
                 list, or by scanning a directory. The pattern is a PCRE regu-
                 lar expression, and is matched against the final component of
                 the  file  name,  not the entire path. The -F, -w, and -x op-
                 tions do not apply to this pattern. The option may  be  given
                 any number of times in order to specify multiple patterns. If
                 a file name matches both an --include and an  --exclude  pat-
                 tern, it is excluded. There is no short form for this option.

       --exclude-from=filename
                 Treat  each  non-empty  line  of  the file as the data for an
                 --exclude option. What constitutes a newline when reading the
                 file  is the operating system's default. The --newline option
                 has no effect on this option. This option may be  given  more
                 than once in order to specify a number of files to read.

       --exclude-dir=pattern
                 Directories whose names match the pattern are skipped without
                 being processed, whatever the setting of the --recursive  op-
                 tion.  This applies to all directories, whether listed on the
                 command line, obtained from --file-list,  or  by  scanning  a
                 parent  directory.  The pattern is a PCRE regular expression,
                 and is matched against the final component of  the  directory
                 name,  not the entire path. The -F, -w, and -x options do not
                 apply to this pattern. The option may be given any number  of
                 times  in order to specify more than one pattern. If a direc-
                 tory matches both --include-dir and --exclude-dir, it is  ex-
                 cluded. There is no short form for this option.

       -F, --fixed-strings
                 Interpret  each  data-matching  pattern  as  a  list of fixed
                 strings, separated by newlines, instead of as a  regular  ex-
                 pression. What constitutes a newline for this purpose is con-
                 trolled by the --newline option. The -w (match as a word) and
                 -x  (match whole line) options can be used with -F.  They ap-
                 ply to each of the fixed strings. A line is selected  if  any
                 of the fixed strings are found in it (subject to -w or -x, if
                 present). This option applies only to the patterns  that  are
                 matched  against  the contents of files; it does not apply to
                 patterns specified by any of the --include or  --exclude  op-
                 tions.

       -f filename, --file=filename
                 Read  patterns  from  the  file, one per line, and match them
                 against each line of input. What constitutes a  newline  when
                 reading  the  file  is  the  operating  system's default. The
                 --newline option has no effect on this option. Trailing white
                 space is removed from each line, and blank lines are ignored.
                 An empty file contains  no  patterns  and  therefore  matches
                 nothing. See also the comments about multiple patterns versus
                 a single pattern with alternatives in the description  of  -e
                 above.

                 If  this  option  is  given more than once, all the specified
                 files are read. A data line is output if any of the  patterns
                 match  it.  A  filename  can  be given as "-" to refer to the
                 standard input. When -f is used, patterns  specified  on  the
                 command  line  using  -e may also be present; they are tested
                 before the file's patterns.  However,  no  other  pattern  is
                 taken from the command line; all arguments are treated as the
                 names of paths to be searched.

       --file-list=filename
                 Read a list of  files  and/or  directories  that  are  to  be
                 scanned  from  the  given  file, one per line. Trailing white
                 space is removed from each line, and blank lines are ignored.
                 These  paths  are processed before any that are listed on the
                 command line. The filename can be given as "-"  to  refer  to
                 the standard input.  If --file and --file-list are both spec-
                 ified as "-", patterns are read first. This  is  useful  only
                 when  the  standard  input  is a terminal, from which further
                 lines (the list of files) can be read  after  an  end-of-file
                 indication.  If  this option is given more than once, all the
                 specified files are read.

       --file-offsets
                 Instead of showing lines or parts of lines that  match,  show
                 each  match  as  an  offset  from the start of the file and a
                 length, separated by a comma. In this  mode,  no  context  is
                 shown.  That  is,  the -A, -B, and -C options are ignored. If
                 there is more than one match in a line, each of them is shown
                 separately.  This  option  is mutually exclusive with --line-
                 offsets and --only-matching.
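
                 The output format can be illustrated with a small Python
                 sketch that reports each match as "offset,length". It
                 handles a single pattern, uses Python's re module in place
                 of PCRE, and the name file_offsets is hypothetical.

```python
import re


def file_offsets(data: bytes, pattern: bytes):
    """Return "offset,length" strings for each non-empty match in `data`."""
    results = []
    line_start = 0  # offset of the current line from the start of the file
    for line in data.splitlines(keepends=True):
        for m in re.finditer(pattern, line):
            if m.end() > m.start():
                length = m.end() - m.start()
                results.append(f"{line_start + m.start()},{length}")
        line_start += len(line)
    return results


if __name__ == "__main__":
    for item in file_offsets(b"abc\nxabcx abc\n", rb"abc"):
        print(item)
```

                 Each match on a line is reported separately, which is why
                 the second line of the sample contributes two entries.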

       -H, --with-filename
                 Force the inclusion of the filename at the  start  of  output
                 lines  when searching a single file. By default, the filename
                 is not shown in this case. For matching lines,  the  filename
                 is followed by a colon; for context lines, a hyphen separator
                 is used. If a line number is also being  output,  it  follows
                 the file name.

       -h, --no-filename
                 Suppress the output of filenames when searching multiple files.
                 By default, filenames  are  shown  when  multiple  files  are
                 searched.  For  matching lines, the filename is followed by a
                 colon; for context lines, a hyphen separator is used.   If  a
                 line number is also being output, it follows the file name.

       --help    Output  a  help  message, giving brief details of the command
                 options and file type support, and then exit.  Anything  else
                 on the command line is ignored.

       -I        Treat  binary  files as never matching. This is equivalent to
                 --binary-files=without-match.

       -i, --ignore-case
                 Ignore upper/lower case distinctions during comparisons.

       --include=pattern
                 If any --include patterns are specified, the only files  that
                 are  processed  are those that match one of the patterns (and
                 do not match an --exclude pattern). This option does not  af-
                 fect directories, but it applies to all files, whether listed
                 on the command line, obtained from --file-list, or  by  scan-
                 ning  a  directory. The pattern is a PCRE regular expression,
                 and is matched against the final component of the file  name,
                 not  the entire path. The -F, -w, and -x options do not apply
                 to this pattern. The option may be given any number of times.
                 If  a  file  name  matches both an --include and an --exclude
                 pattern, it is excluded.  There is no short form for this op-
                 tion.
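
                 The interaction between --include and --exclude can be
                 sketched as a small Python predicate. It uses Python's re
                 module as a stand-in for PCRE and assumes an unanchored
                 search against the file name; should_process is a
                 hypothetical name, not part of pcregrep.

```python
import os
import re


def should_process(path, includes=(), excludes=()):
    """Decide whether a file is scanned, given include/exclude patterns."""
    name = os.path.basename(path)  # only the final component is matched
    if any(re.search(p, name) for p in excludes):
        return False  # a file matching both kinds of pattern is excluded
    if includes:
        return any(re.search(p, name) for p in includes)
    return True  # with no include patterns, every non-excluded file is scanned


if __name__ == "__main__":
    print(should_process("src/main.c", includes=[r"\.c$"]))
    print(should_process("src/main.c", includes=[r"\.c$"], excludes=[r"^main"]))
```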

       --include-from=filename
                 Treat  each  non-empty  line  of  the file as the data for an
                 --include option. What constitutes a newline for this purpose
                 is  the  operating system's default. The --newline option has
                 no effect on this option. This option may be given any number
                 of times; all the files are read.

       --include-dir=pattern
                 If  any --include-dir patterns are specified, the only direc-
                 tories that are processed are those that  match  one  of  the
                 patterns  (and  do  not match an --exclude-dir pattern). This
                 applies to all directories, whether  listed  on  the  command
                 line,  obtained from --file-list, or by scanning a parent di-
                 rectory. The pattern is a PCRE  regular  expression,  and  is
                 matched  against  the  final component of the directory name,
                 not the entire path. The -F, -w, and -x options do not  apply
                 to this pattern. The option may be given any number of times.
                 If a directory matches both --include-dir and  --exclude-dir,
                 it is excluded. There is no short form for this option.

       -L, --files-without-match
                 Instead  of  outputting lines from the files, just output the
                 names of the files that do not contain any lines  that  would
                 have  been  output. Each file name is output once, on a sepa-
                 rate line.

       -l, --files-with-matches
                 Instead of outputting lines from the files, just  output  the
                 names of the files containing lines that would have been out-
                 put. Each file name is  output  once,  on  a  separate  line.
                 Searching  normally stops as soon as a matching line is found
                 in a file. However, if the -c (count) option  is  also  used,
                 matching  continues in order to obtain the correct count, and
                 those files that have at least one  match  are  listed  along
                 with their counts. Using this option with -c is a way of sup-
                 pressing the listing of files with no matches.

       --label=name
                 This option supplies a name to be used for the standard input
                 when file names are being output. If not supplied, "(standard
                 input)" is used. There is no short form for this option.

       --line-buffered
                 When this option is given, input is read and  processed  line
                 by line, and the output is flushed after each write. By
                 default, input is read in large chunks, unless pcregrep can
                 determine that it is reading from a terminal (which is
                 currently possible only in Unix-like environments). Output to
                 a terminal is normally automatically flushed by the operating
                 system. This option can be useful when the input or output is
                 attached  to a pipe and you do not want pcregrep to buffer up
                 large amounts of data. However, its use will  affect  perfor-
                 mance, and the -M (multiline) option ceases to work.

       --line-offsets
                 Instead  of  showing lines or parts of lines that match, show
                 each match as a line number, the offset from the start of the
                 line,  and a length. The line number is terminated by a colon
                 (as usual; see the -n option), and the offset and length  are
                 separated  by  a  comma.  In  this mode, no context is shown.
                 That is, the -A, -B, and -C options are ignored. If there  is
                 more  than  one  match in a line, each of them is shown sepa-
                 rately. This option is mutually exclusive with --file-offsets
                 and --only-matching.

       --locale=locale-name
                 This  option specifies a locale to be used for pattern match-
                 ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
                 ronment  variables.  If  no locale is specified, the PCRE li-
                 brary's default (usually the "C" locale) is used. There is no
                 short form for this option.

       --match-limit=number
                 Processing  some  regular  expression  patterns can require a
                 very large amount of memory, leading in some cases to a  pro-
                 gram  crash  if  not enough is available.  Other patterns may
                 take a very long time to search  for  all  possible  matching
                 strings.  The pcre_exec() function that is called by pcregrep
                 to do the matching has two parameters that can limit the  re-
                 sources that it uses.

                 The  --match-limit  option  provides  a means of limiting re-
                 source usage when processing patterns that are not  going  to
                 match, but which have a very large number of possibilities in
                 their search trees. The classic example  is  a  pattern  that
                 uses  nested unlimited repeats. Internally, PCRE uses a func-
                 tion called match() which it calls repeatedly (sometimes  re-
                 cursively).  The limit set by --match-limit is imposed on the
                 number of times this function is called during a match, which
                 has  the  effect  of limiting the amount of backtracking that
                 can take place.

                 The --recursion-limit option is similar to --match-limit, but
                 instead of limiting the total number of times that match() is
                 called, it limits the depth of recursive calls, which in turn
                 limits  the  amount of memory that can be used. The recursion
                 depth is a smaller number than the total number of calls, be-
                 cause  not  all calls to match() are recursive. This limit is
                 of use only if it is set smaller than --match-limit.

                 There are no short forms for these options. The default
                 settings are specified when the PCRE library is compiled; if
                 not changed at that time, the default is 10 million.

       -M, --multiline
                 Allow patterns to match more than one line. When this  option
                 is given, patterns may usefully contain literal newline char-
                 acters and internal occurrences of ^ and  $  characters.  The
                 output  for  a  successful match may consist of more than one
                 line, the last of which is the one in which the match  ended.
                 If the matched string ends with a newline sequence the output
                 ends at the end of that line.

                 When this option is set, the PCRE library is called in  "mul-
                 tiline"  mode.   There is a limit to the number of lines that
                 can be matched, imposed by the way that pcregrep buffers  the
                 input  file as it scans it. However, pcregrep ensures that at
                 least 8K characters or the rest of the document (whichever is
                 the  shorter)  are  available for forward matching, and simi-
                 larly the previous 8K characters (or all the previous charac-
                 ters,  if  fewer  than 8K) are guaranteed to be available for
                 lookbehind assertions. This option does not work  when  input
                 is read line by line (see --line-buffered).
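
                 The following Python sketch is an analogy only: it uses
                 Python's re module, not PCRE, and assumes LF line endings.
                 It shows a pattern containing a literal newline, internal
                 ^ and $, and a hypothetical helper, output_for_match(),
                 that mimics the rule for where multiline output ends.

```python
import re

text = "alpha\nbeta\ngamma\n"

# A pattern containing a literal newline can match only when more than
# one line is visible to the matcher at once.
spanning = re.search(r"alpha\nbeta", text)

# In multiline mode, ^ and $ also match at internal line boundaries.
internal = re.findall(r"^beta$", text, flags=re.MULTILINE)
without_flag = re.findall(r"^beta$", text)


def output_for_match(subject, start, end):
    """Return the whole lines covered by subject[start:end] (LF only)."""
    line_start = subject.rfind("\n", 0, start) + 1
    if end > start and subject[end - 1] == "\n":
        # The match itself ends with a newline: stop at that line's end.
        line_end = end
    else:
        newline = subject.find("\n", end)
        line_end = len(subject) if newline == -1 else newline + 1
    return subject[line_start:line_end]


print(repr(output_for_match(text, spanning.start(), spanning.end())))
```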

       -N newline-type, --newline=newline-type
                 The  PCRE library supports five different conventions for in-
                 dicating the ends of lines. They are the single-character se-
                 quences CR (carriage return) and LF (linefeed), the two-char-
                 acter sequence CRLF, an "anycrlf"  convention,  which  recog-
                 nizes  any of the preceding three types, and an "any" conven-
                 tion, in which any Unicode line ending sequence is assumed to
                 end  a  line.  The  Unicode sequences are the three just men-
                 tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,
                 U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
                 U+2028), and PS (paragraph separator, U+2029).

                 When the PCRE library is built,  a  default  line-ending  se-
                 quence  is specified.  This is normally the standard sequence
                 for the operating system. Unless otherwise specified by  this
                 option,  pcregrep  uses  the library's default.  The possible
                 values for this option are CR, LF,  CRLF,  ANYCRLF,  or  ANY.
                 This  makes  it  possible  to use pcregrep to scan files that
                 have come from other environments without  having  to  modify
                 their  line  endings.  If the data that is being scanned does
                 not agree with the convention set by  this  option,  pcregrep
                 may  behave  in  strange ways. Note that this option does not
                 apply to files specified by the -f, --exclude-from, or  --in-
                 clude-from  options,  which are expected to use the operating
                 system's standard newline sequence.
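
                 As an illustration of the five conventions, the Python
                 sketch below splits a string into lines under each one.
                 The table NEWLINE_PATTERNS and the function split_lines()
                 are hypothetical names, and Python's re module stands in
                 for PCRE; this is not how pcregrep itself is implemented.

```python
import re

# One regular expression per convention. CRLF is listed first in the
# alternations so that it is treated as a single line ending.
NEWLINE_PATTERNS = {
    "CR": r"\r",
    "LF": r"\n",
    "CRLF": r"\r\n",
    "ANYCRLF": r"\r\n|\r|\n",
    # CR, LF, CRLF plus VT, FF, NEL, LS and PS.
    "ANY": r"\r\n|[\r\n\x0b\x0c\x85\u2028\u2029]",
}


def split_lines(text, newline_type):
    """Split text into lines using the named newline convention."""
    return re.split(NEWLINE_PATTERNS[newline_type], text)


sample = "a\r\nb\rc\nd"
for name in NEWLINE_PATTERNS:
    print(name, split_lines(sample, name))
```

                 Note how the same data yields different "lines" under
                 LF and CRLF, which is why a mismatched setting can give
                 unexpected results.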

       -n, --line-number
                 Precede each output line by its line number in the file, fol-
                 lowed  by  a colon for matching lines or a hyphen for context
                 lines. If the filename is also being output, it precedes  the
                 line number. This option is forced if --line-offsets is used.

       --no-jit  If  the  PCRE  library is built with support for just-in-time
                 compiling (which speeds up matching), pcregrep  automatically
                 makes use of this, unless it was explicitly disabled at build
                 time. This option can be used to disable the use  of  JIT  at
                 run  time. It is provided for testing and working round prob-
                 lems.  It should never be needed in normal use.

       -o, --only-matching
                 Show only the part of the line that matched a pattern instead
                 of  the  whole  line. In this mode, no context is shown. That
                 is, the -A, -B, and -C options are ignored. If there is  more
                 than  one  match in a line, each of them is shown separately.
                 If -o is combined with -v (invert the sense of the  match  to
                 find non-matching lines), no output is generated, but the re-
                 turn code is set appropriately. If the matched portion of the
                 line is empty, nothing is output unless the file name or line
                 number  is  being  printed,  in  which  case it is shown on an
                 otherwise  empty line. This option is mutually exclusive with
                 --file-offsets and --line-offsets.

       -onumber, --only-matching=number
                 Show only the part of the line  that  matched  the  capturing
                 parentheses of the given number. Up to 32 capturing parenthe-
                 ses are supported, and -o0 is equivalent to -o without a num-
                 ber.  Because  these options can be given without an argument
                 (see above), if an argument is present, it must be  given  in
                 the  same  shell item, for example, -o3 or --only-matching=2.
                 The comments given for the non-argument case above also apply
                 to  this  case. If the specified capturing parentheses do not
                 exist in the pattern, or were not set in the  match,  nothing
                 is  output  unless  the  file  name  or  line number is being
                 printed.

                 If this option is given multiple times,  multiple  substrings
                 are  output, in the order the options are given. For example,
                 -o3 -o1 -o3 causes the substrings matched by capturing paren-
                 theses  3  and  1  and then 3 again to be output. By default,
                 there is no separator (but see the next option).

       --om-separator=text
                 Specify a separating string for multiple occurrences  of  -o.
                 The  default is an empty string. Separating strings are never
                 coloured.
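
                 The effect of repeated -o options with a separator can be
                 sketched in Python as follows. The function only_matching()
                 is a hypothetical name, Python's re module stands in for
                 PCRE, and the choice to emit the separator only between
                 substrings that were actually set is an assumption.

```python
import re


def only_matching(pattern, line, groups, separator=""):
    """Return one output string per match in line.

    groups lists capturing-parentheses numbers in the order in which
    the corresponding -o options would be given; 0 is the whole match.
    """
    compiled = re.compile(pattern)
    outputs = []
    for m in compiled.finditer(line):
        parts = []
        for number in groups:
            # Skip groups that do not exist or were not set in the match.
            if number <= compiled.groups and m.group(number) is not None:
                parts.append(m.group(number))
        outputs.append(separator.join(parts))
    return outputs


# Analogous to: -o3 -o1 -o3 --om-separator=,
print(only_matching(r"(\d+)-(\d+)-(\d+)", "10-20-30", [3, 1, 3], ","))
```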

       -q, --quiet
                 Work quietly, that is, display nothing except error messages.
                 The  exit  status  indicates  whether or not any matches were
                 found.

       -r, --recursive
                 If any given path is a directory, recursively scan the  files
                 it  contains, taking note of any --include and --exclude set-
                 tings. By default, a directory is read as a normal  file;  in
                 some  operating  systems this gives an immediate end-of-file.
                 This option is a shorthand for setting the -d option to  "re-
                 curse".

       --recursion-limit=number
                 See --match-limit above.

       -s, --no-messages
                 Suppress  error  messages  about  non-existent  or unreadable
                 files. Such files are quietly skipped.  However,  the  return
                 code is still 2, even if matches were found in other files.

       -u, --utf-8
                 Operate  in UTF-8 mode. This option is available only if PCRE
                 has been compiled with UTF-8 support. All patterns (including
                 those  for  any --exclude and --include options) and all sub-
                 ject lines that are scanned must be valid  strings  of  UTF-8
                 characters.

       -V, --version
                 Write the version numbers of pcregrep and the PCRE library to
                 the standard output and then exit. Anything else on the  com-
                 mand line is ignored.

       -v, --invert-match
                 Invert  the  sense  of  the match, so that lines which do not
                 match any of the patterns are the ones that are found.

       -w, --word-regex, --word-regexp
                 Force the patterns to match only whole words. This is equiva-
                 lent  to  having \b at the start and end of the pattern. This
                 option applies only to the patterns that are matched  against
                 the  contents  of files; it does not apply to patterns speci-
                 fied by any of the --include or --exclude options.

       -x, --line-regex, --line-regexp
                 Force the patterns to be anchored (each must  start  matching
                 at  the beginning of a line) and in addition, require them to
                 match entire lines. This is equivalent  to  having  ^  and  $
                 characters at the start and end of each alternative branch in
                 every pattern. This option applies only to the patterns  that
                 are  matched against the contents of files; it does not apply
                 to patterns specified by any of the  --include  or  --exclude
                 options.
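
                 The equivalences described for -w and -x can be sketched
                 in Python. The helpers word_pattern() and line_pattern()
                 are hypothetical, and Python's re module stands in for
                 PCRE. Wrapping the pattern in a non-capturing group has
                 the same effect as anchoring every alternative branch.

```python
import re


def word_pattern(pattern):
    """Wrap pattern so that it matches only whole words (like -w)."""
    return r"\b(?:" + pattern + r")\b"


def line_pattern(pattern):
    """Wrap pattern so that it must match an entire line (like -x)."""
    return r"^(?:" + pattern + r")$"


# -w: "cat" inside a longer word is not a whole-word match.
print(bool(re.search(word_pattern("cat"), "concatenate")))
print(bool(re.search(word_pattern("cat"), "the cat sat")))

# -x: every branch is anchored. Naively writing ^cat|dog$ anchors only
# one end of each branch, so it would still match "hotdog".
print(bool(re.search(line_pattern("cat|dog"), "hotdog")))
print(bool(re.search(r"^cat|dog$", "hotdog")))
```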


ENVIRONMENT VARIABLES

       The environment variables LC_ALL and LC_CTYPE are examined, in that or-
       der, for a locale. The first one that is set is used. This can be over-
       ridden  by the --locale option. If no locale is set, the PCRE library's
       default (usually the "C" locale) is used.
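
       The order of precedence described above can be sketched in Python.
       The function choose_locale() is a hypothetical name; a return value
       of None stands for "use the PCRE library's default".

```python
def choose_locale(environ, cli_locale=None):
    """Pick a locale name following the precedence described above.

    environ is a mapping of environment variables; cli_locale is the
    value of a --locale option, or None if the option was not given.
    """
    if cli_locale is not None:
        return cli_locale
    for name in ("LC_ALL", "LC_CTYPE"):
        if name in environ:
            return environ[name]
    return None


print(choose_locale({"LC_CTYPE": "fr_FR", "LC_ALL": "en_GB"}))
```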


NEWLINES

       The -N (--newline) option allows pcregrep to scan files with  different
       newline conventions from the default. Any parts of the input files that
       are written to the standard output are copied identically,  with  what-
       ever  newline sequences they have in the input. However, the setting of
       this option does not affect the interpretation of  files  specified  by
       the -f, --exclude-from, or --include-from options, which are assumed to
       use the operating system's standard newline sequence, nor does  it  af-
       fect  the  way  in  which pcregrep writes informational messages to the
       standard error and output streams. For these it uses the string "\n" to
       indicate  newlines,  relying on the C I/O library to convert this to an
       appropriate sequence.


OPTIONS COMPATIBILITY

       Many of the short and long forms of pcregrep's options are the same  as
       in  the GNU grep program. Any long option of the form --xxx-regexp (GNU
       terminology) is also available as --xxx-regex (PCRE terminology).  How-
       ever,  the  --file-list, --file-offsets, --include-dir, --line-offsets,
       --locale, --match-limit, -M, --multiline, -N,  --newline,  --om-separa-
       tor,  --recursion-limit,  -u, and --utf-8 options are specific to pcre-
       grep, as is the use of the  --only-matching  option  with  a  capturing
       parentheses number.

       Although  most  of the common options work the same way, a few are dif-
       ferent in pcregrep. For example, the --include option's argument  is  a
       glob  for  GNU grep, but a regular expression for pcregrep. If both the
       -c and -l options are given, GNU grep lists only  file  names,  without
       counts, but pcregrep gives the counts.


OPTIONS WITH DATA

       There are four different ways in which an option with data can be spec-
       ified.  If a short form option is used, the  data  may  follow  immedi-
       ately, or (with one exception) in the next command line item. For exam-
       ple:

         -f/some/file
         -f /some/file

       The exception is the -o option, which may appear with or without  data.
       Because  of this, if data is present, it must follow immediately in the
       same item, for example -o3.

       If a long form option is used, the data may appear in the same  command
       line  item,  separated by an equals character, or (with two exceptions)
       it may appear in the next command line item. For example:

         --file=/some/file
         --file /some/file

       Note, however, that if you want to supply a file name beginning with  ~
       as  data  in a shell command, and have the shell expand ~ to a home di-
       rectory, you must separate the file name from the option,  because  the
       shell does not treat ~ specially unless it is at the start of an item.

       The  exceptions  to the above are the --colour (or --color) and --only-
       matching options, for which the data is optional. If one of  these  op-
       tions  does  have  data,  it  must be given in the first form, using an
       equals character. Otherwise pcregrep will assume that it has no data.
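
       The rules in this section can be sketched as a small parser in
       Python. The function names are hypothetical, and the sketch assumes
       that every option passed to it takes data; it is not pcregrep's own
       option-handling code.

```python
# Long options whose data is optional, as listed in the text above.
OPTIONAL_DATA_LONG = {"--colour", "--color", "--only-matching"}


def parse_short_option(args, i):
    """Return (name, data, next_index) for the short option at args[i]."""
    item = args[i]
    name, rest = item[:2], item[2:]
    if rest:
        # Data follows immediately in the same item, e.g. -f/some/file.
        return name, rest, i + 1
    if name == "-o":
        # -o may appear without data, so the next item is never consumed.
        return name, None, i + 1
    return name, args[i + 1], i + 2


def parse_long_option(args, i):
    """Return (name, data, next_index) for the long option at args[i]."""
    item = args[i]
    if "=" in item:
        name, data = item.split("=", 1)
        return name, data, i + 1
    if item in OPTIONAL_DATA_LONG:
        # Without an equals character these are assumed to have no data.
        return item, None, i + 1
    return item, args[i + 1], i + 2


print(parse_short_option(["-f", "/some/file"], 0))
print(parse_long_option(["--colour", "always"], 0))
```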


MATCHING ERRORS

       It is possible to supply a regular expression that takes  a  very  long
       time  to  fail  to  match certain lines. Such patterns normally involve
       nested indefinite repeats, for example: (a+)*\d when matched against  a
       line  of  a's with no final digit. The PCRE matching function has a re-
       source limit that causes it to abort in these  circumstances.  If  this
       happens, pcregrep outputs an error message and the line that caused the
       problem to the standard error stream. If there are more  than  20  such
       errors, pcregrep gives up.

       The --match-limit option of pcregrep can be used to set the overall re-
       source limit; there is a second option  called  --recursion-limit  that
       sets  a limit on the amount of memory (usually stack) that is used (see
       the discussion of these options above).
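
       To show why such patterns are expensive, the Python sketch below is
       a toy backtracking matcher for the single pattern (a+)*\d. The names
       match_nested() and MatchLimitExceeded, and the way calls and depth
       are counted, are assumptions made for illustration; PCRE's own
       match() function counts differently.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher exceeds its call budget."""


def match_nested(subject, match_limit):
    """Toy matcher for (a+)*\\d anchored at the start of subject.

    Returns (matched, calls, max_depth).
    """
    calls = 0
    max_depth = 0

    def match(pos, depth):
        nonlocal calls, max_depth
        calls += 1
        max_depth = max(max_depth, depth)
        if calls > match_limit:
            raise MatchLimitExceeded(f"more than {match_limit} calls")
        # Find how far a run of 'a' extends from pos.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        # One more iteration of the group (a+): greedy first, then
        # backtracking to shorter runs.
        for stop in range(end, pos, -1):
            if match(stop, depth + 1):
                return True
        # No further iteration: the next character must be a digit.
        return pos < len(subject) and subject[pos].isdigit()

    return match(0, 1), calls, max_depth


for subject in ("aaa1", "a" * 16):
    try:
        print(subject, match_nested(subject, match_limit=10_000))
    except MatchLimitExceeded as exc:
        print(subject, "aborted:", exc)
```

       In this toy, the number of calls doubles with each additional "a"
       when there is no final digit, while the recursion depth grows by
       only one, which is why the two limits are set separately.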


DIAGNOSTICS

       Exit status is 0 if any matches were found, 1 if no matches were found,
       and  2  for syntax errors, overlong lines, non-existent or inaccessible
       files (even if matches were found in other files) or too many  matching
       errors. Using the -s option to suppress error messages about inaccessi-
       ble files does not affect the return code.
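
       The exit status rules can be summarized in a short Python sketch.
       The function exit_status() is a hypothetical name; it takes no
       account of -s, because that option affects only messages.

```python
def exit_status(any_match, any_error):
    """Return the exit status implied by the rules above.

    any_error covers syntax errors, overlong lines, non-existent or
    inaccessible files, and too many matching errors.
    """
    if any_error:
        # Errors take precedence, even if other files contained matches.
        return 2
    return 0 if any_match else 1


print(exit_status(any_match=True, any_error=True))
```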


SEE ALSO

       pcrepattern(3), pcresyntax(3), pcretest(1).


AUTHOR

       Philip Hazel
       University Computing Service
       Cambridge CB2 3QH, England.


REVISION

       Last updated: 03 April 2014
       Copyright (c) 1997-2014 University of Cambridge.