A Detailed Guide to Regular Expressions in grep

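I have collected a range of commonly used grep regular expressions and their combinations; together they cover most everyday use cases.

Let's first look at the regular expression metacharacters: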

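  1. \  escape character, e.g. \+ \< \>

  2. .  matches any single character

  3. [1234abc] [^1234] [1-5] [a-d]  bracket expression: stands for exactly one character from the set (with a leading ^, one character not in the set)

  4. ^  start of line, e.g. ^.k

  5. $  end of line, e.g. .k$

  6. \< and \>  start and end of a word, e.g. \<abc, abc\>, \<are\>. Word boundaries depend on separators: okhelloworld is one word, while ok hello world is three.

  7. |  alternation (either side may match), e.g. (\<are\>)|(\<you\>)

  8. ( and )  grouping

  9. \n  back-reference to the n-th group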
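
Back-references are listed above but not used in the later examples. The following is a minimal sketch, assuming GNU grep; the sample words are invented for illustration. The group \(.\) captures one character and \1 requires the same character to follow immediately, so only words with a doubled letter match.

```shell
# Sample words are made up for this illustration.
doubled=$(printf '%s\n' hello world zzk ok | grep '\(.\)\1')
echo "$doubled"
# hello
# zzk
```

In basic mode the group is written \( \); with -E the same pattern would be written (.)\1.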

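Repetition operators:

  1. *  matches the preceding item zero or more times
  2. ?  matches zero or one time
  3. +  matches one or more times
  4. {n}  matches exactly n times
  5. {n,}  matches n or more times
  6. {m,n}  matches between m and n times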
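
The interval forms {n,} and {m,n} are not exercised later in the article, so here is a small sketch of all three intervals. It assumes a grep that supports -E and -x (both are standard options); the word list is invented. The -x option forces the pattern to cover the whole line, which makes the counts exact.

```shell
# Invented word list: "re" preceded by 0 to 4 letters a.
words=$(printf '%s\n' re are aare aaare aaaare)

exact=$(printf '%s\n' "$words" | grep -Ex 'a{2}re')      # exactly two a
atleast=$(printf '%s\n' "$words" | grep -Ex 'a{3,}re')   # three or more a
between=$(printf '%s\n' "$words" | grep -Ex 'a{1,2}re')  # one or two a

echo "$exact"     # aare
echo "$atleast"   # aaare, aaaare (one per line)
echo "$between"   # are, aare (one per line)
```

Without -x, a{2}re would also match aaare, because the pattern only has to match somewhere inside the line.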

Here are all the commands used in this article, each with a short explanation:

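  • grep "after" profile  prints the lines of the file that contain "after".
  • grep -n "after" profile  does the same and also shows each matching line's number in the file.
  • grep -n "after" profile | grep "then"  among the lines containing "after", prints those that also contain "then".
  • grep -v -n "after" profile  prints the lines that do not contain after, with line numbers.
  • grep "a*re" hello.txt  the * means zero or more of the preceding a, so lines containing are, aare or xred all match (in effect, any line containing re).
  • grep "a.re" hello.txt  the . matches any single character.
  • grep -E "a+re" hello.txt  the + means one or more a.
  • grep "a\+re" hello.txt  with \+ the -E option is not needed.
  • grep "[b-d]" hello.txt  prints lines containing at least one of b, c, d.
  • grep -v "[b-d]" hello.txt  prints lines containing none of b, c, d.
  • grep "?" hello.txt  prints lines containing a question mark.
  • grep "a.re" hello.txt  the . occupies one position and matches any character.
  • grep "..re" hello.txt  prints lines in which re is preceded by any two characters.
  • grep "[xz]k" hello.txt  prints lines containing zk or xk.
  • grep -v '[xz]k' hello.txt  prints lines containing neither zk nor xk.
  • grep "\<[zx]k" hello.txt  prints lines containing a word that starts with zk or xk.
  • grep "^.k" hello.txt  prints lines whose second character is k (anchored at the start of the line).
  • grep ".k$" hello.txt  prints lines ending in k (anchored at the end of the line); the line needs at least two characters, the last being k.
  • grep "\<are\>" hello.txt  prints lines containing the word are.
  • grep "\<are" hello.txt  prints lines containing a word that starts with are.
  • grep "re\>" hello.txt  prints lines containing a word that ends with re.
  • grep -E "are|you" hello.txt  prints lines containing are or you; either one is enough.
  • grep are hello.txt | grep you | grep ok  prints only lines that satisfy all three conditions (are, you and ok).
  • grep "\<a*re\>" hello.txt  the * means zero or more a, so this prints lines containing a whole word made of zero or more a followed by re (re, are, aare, ...).
  • grep "\<a*re\>" hello.txt  is equivalent to grep -E "(\<a*re\>)" hello.txt.
  • grep -E "a{3}" hello.txt  prints lines containing three consecutive a characters.
  • grep -E "a+" hello.txt  prints lines containing one or more a; equivalent to grep "a\+" hello.txt.
  • grep -E "a?" hello.txt  matches zero or one a; equivalent to grep "a\?" hello.txt. Since zero occurrences also count, every line matches.
  • grep -E "*" hello.txt  is meant to print the whole file. A leading * in an extended regular expression is undefined by POSIX, so the result may differ between grep implementations.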

The terminal sessions below show these commands being run.

Let's try them out!

[root@layne tdir]# cp /etc/profile  .
[root@layne tdir]# grep "after" profile # find the lines in the file that contain "after"
if [ "$2" = "after" ] ; then
pathmunge /usr/local/sbin after
pathmunge /usr/sbin after
pathmunge /sbin after
[root@layne tdir]# grep -n "after" profile # also show each matching line's number in the file
16: if [ "$2" = "after" ] ; then
42: pathmunge /usr/local/sbin after
43: pathmunge /usr/sbin after
44: pathmunge /sbin after
[root@layne tdir]# grep -n "after" profile | grep "then" # among the lines containing "after", find those that also contain "then"
16: if [ "$2" = "after" ] ; then
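
One detail of the pipeline above is worth a quick experiment: where -n is placed decides what the numbers mean. The sketch below uses a small temporary file with invented contents as a stand-in for /etc/profile.

```shell
# A tiny stand-in for /etc/profile; the contents are invented.
sample=$(mktemp)
printf '%s\n' 'start' 'if after then' 'run after' 'done' > "$sample"

# -n on the first grep: numbers refer to lines of the original file.
first=$(grep -n 'after' "$sample" | grep 'then')
# -n on the last grep: numbers refer to lines of the filtered stream.
last=$(grep 'after' "$sample" | grep -n 'then')

echo "$first"   # 2:if after then
echo "$last"    # 1:if after then
rm -f "$sample"
```

Note also that with -n on the first grep, the second grep sees the "2:" prefix as part of the line, so a later pattern containing digits or a colon can match the prefix by accident.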

-v inverts the match and prints the lines that do not match

[root@layne tdir]# grep -v -n "after" profile  # lines that do not contain after, with line numbers
1:# /etc/profile
2:
3:# System wide environment and startup programs, for login setup
4:# Functions and aliases go in /etc/bashrc
5:
6:# It's NOT a good idea to change this file unless you know what you
7:# are doing. It's much better to create a custom.sh shell script in
8:# /etc/profile.d/ to make custom changes to your environment, as this
9:# will prevent the need for merging in future updates.
10:
11:pathmunge () {
12: case ":${PATH}:" in
13: *:"$1":*)
14: ;;
15: *)
17: PATH=$PATH:$1
18: else
...
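
A quick way to check that -v is the exact complement of a normal search is to count both sides with -c (a standard option that prints the number of selected lines). This is only a sketch on an invented five-line file.

```shell
# Invented sample file: two lines contain "after", three do not.
sample=$(mktemp)
printf '%s\n' 'a after' 'b' 'c after' 'd' 'e' > "$sample"

with=$(grep -c 'after' "$sample")      # lines that match
without=$(grep -vc 'after' "$sample")  # lines that do not match
total=$(wc -l < "$sample")             # all lines

echo "$with + $without = $((with + without)), total $((total))"
rm -f "$sample"
```

Every line is either selected or rejected, so the two counts always add up to the number of lines in the file.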

Now for some more examples!

Create hello.txt with the following content:

hello world
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
are yyyou ok?
xk
zk
ok
yk
zzk
zxzxk
bxx
cxx
dxx
areyou are youok?
zk kz 1
kz zk 2
okk koo 3
zkkz
kzzk

Match are, aare, xre: zero or more a characters before re

[root@layne tdir]# grep "a*re" hello.txt   # `*` means zero or more of the preceding a
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
are yyyou ok?
areyou are youok?

Match a, then any single character, then re

[root@layne tdir]# grep "a.re" hello.txt  # `.` matches any single character
aaare you ok?
aare you ok
aaaare you ok

Match one or more a followed by re

[root@layne tdir]# grep "a+re" hello.txt
[root@layne tdir]#

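But nothing is found. Why?

Of the repetition operators listed earlier, only the first (*) works in basic regular expressions; items 2 to 6 belong to extended regular expressions.

grep runs in basic mode by default; the -E option switches it to extended mode.

That solves the problem, and one or more a followed by re now matches: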

[root@layne tdir]# grep -E "a+re" hello.txt
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
are yyyou ok?
areyou are youok?
[root@layne tdir]# grep "a\+re" hello.txt # likewise, escaping + with \ gives the same result
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
are yyyou ok?
areyou are youok?

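So in basic mode the characters ?, +, {, |, ( and ) lose their special meaning. They take effect only when grep runs in extended mode via -E, or when they are prefixed with a backslash (\{ \} and \( \) are standard; \?, \+ and \| are GNU extensions).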
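
The difference between the two modes is easy to see on input that contains a literal plus sign. The following sketch assumes GNU grep (for \+ in basic mode); the two input lines are invented.

```shell
# Two invented input lines: one with a literal "+", one with two a's.
input=$(printf '%s\n' 'a+re' 'aare')

basic=$(printf '%s\n' "$input" | grep 'a+re')        # + is a literal character
extended=$(printf '%s\n' "$input" | grep -E 'a+re')  # + means "one or more a"
escaped=$(printf '%s\n' "$input" | grep 'a\+re')     # GNU: \+ acts like ERE +

echo "basic:    $basic"      # basic:    a+re
echo "extended: $extended"   # extended: aare
echo "escaped:  $escaped"    # escaped:  aare
```

In extended mode the line a+re itself does not match, because the pattern requires an a immediately before re.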

[a-d] matches one of a, b, c, d

Match lines containing any of b, c, d (the range includes both endpoints), and lines containing none of them

[root@layne tdir]# grep "[b-d]" hello.txt # lines containing at least one of b, c, d
hello world
abcre you ok?
bxx
cxx
dxx
[root@layne tdir]# grep -v "[b-d]" hello.txt # lines containing none of b, c, d
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
xxre you ok
are yyyou ok?
xk
zk
ok
yk
zzk
zxzxk
areyou are youok?
zk kz 1
kz zk 2
okk koo 3
zkkz
kzzk
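
It is easy to confuse grep -v "[b-d]" with the negated bracket expression [^b-d] from the metacharacter list. They are not the same, as this small sketch on three invented lines suggests.

```shell
# Invented sample lines.
data=$(printf '%s\n' bxx bcd xyz)

# [^b-d]: the line has at least one character outside b-d.
negated_class=$(printf '%s\n' "$data" | grep '[^b-d]')
# -v '[b-d]': the line has no character inside b-d at all.
inverted=$(printf '%s\n' "$data" | grep -v '[b-d]')

echo "$negated_class"   # bxx and xyz (one per line)
echo "$inverted"        # xyz
```

The line bxx contains a b, so -v rejects it, but it also contains x, which is enough for [^b-d] to match.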

Find lines containing a question mark:

[root@layne tdir]# grep "?" hello.txt
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
abcre you ok?
are yyyou ok?
areyou are youok?
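
The search above works because ? is an ordinary character in basic mode. As a sketch of the alternatives (the two input lines are invented): in extended mode the question mark has to be escaped, and -F (fixed strings, a standard option) switches regular expressions off entirely.

```shell
# Invented input: one line with a question mark, one without.
data=$(printf '%s\n' 'ok?' 'ok')

bre=$(printf '%s\n' "$data" | grep '?')       # basic mode: ? is literal
ere=$(printf '%s\n' "$data" | grep -E '\?')   # extended mode: escape it
fixed=$(printf '%s\n' "$data" | grep -F '?')  # -F: plain string search

echo "$bre"     # ok?
echo "$ere"     # ok?
echo "$fixed"   # ok?
```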

. occupies exactly one position and matches any character

[root@layne tdir]# grep  "a.re"  hello.txt
aaare you ok?
aare you ok
aaaare you ok
[root@layne tdir]# grep "a..re" hello.txt
aaare you ok?
aaaare you ok
abcre you ok?
[root@layne tdir]# grep "..re" hello.txt
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
areyou are youok?
[root@layne tdir]# grep "...re" hello.txt
areyou are youok?
aaare you ok?
aaaare you ok
abcre you ok?
areyou are youok?
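
Because . matches any character, it also matches a literal dot. To search for the dot itself it has to be escaped, as in this sketch on invented input.

```shell
# Invented input lines.
data=$(printf '%s\n' 'a.re' 'aare' 'are')

any_char=$(printf '%s\n' "$data" | grep 'a.re')      # . is a wildcard
literal_dot=$(printf '%s\n' "$data" | grep 'a\.re')  # \. is a real dot

echo "$any_char"      # a.re and aare (one per line)
echo "$literal_dot"   # a.re
```

The line are matches neither pattern, since both require one character between a and re.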

Match lines containing zk or xk, and lines containing neither

[root@layne tdir]# grep  "[xz]k"  hello.txt # lines containing zk or xk
xk
zk
zzk
zxzxk
zk kz 1
kz zk 2
zkkz
kzzk
[root@layne tdir]# grep -v '[xz]k' hello.txt # lines containing neither zk nor xk
hello world
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
are yyyou ok?
ok
yk
bxx
cxx
dxx
areyou are youok?
okk koo 3

Match lines containing a word that starts with zk or xk

[root@layne tdir]# grep  "\<[zx]k"  hello.txt
xk
zk
zk kz 1
kz zk 2
zkkz

Match lines whose second character is k (anchored at the start of the line)

[root@layne tdir]# grep  "^.k"  hello.txt
xk
zk
ok
yk
zk kz 1
okk koo 3
zkkz

Match lines ending in k (anchored at the end of the line)

[root@layne tdir]# grep  ".k$"  hello.txt  # the line needs at least two characters, the last being k
aare you ok
aaaare you ok
xxre you ok
xk
zk
ok
yk
zzk
zxzxk
kzzk
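
The two anchors can also be combined to pin down the whole line. A minimal sketch on a few invented lines: ^.k$ matches only lines that are exactly two characters long and end in k.

```shell
# Invented sample lines.
data=$(printf '%s\n' xk zzk 'zk kz 1' ok)

start_only=$(printf '%s\n' "$data" | grep '^.k')   # second character is k
end_only=$(printf '%s\n' "$data" | grep '.k$')     # last character is k
both=$(printf '%s\n' "$data" | grep '^.k$')        # exactly: any char, then k

echo "$both"   # xk and ok (one per line)
```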

Match at word boundaries (match a whole word)

[root@layne tdir]# grep  "\<are\>"  hello.txt # lines containing the word are
are you ok?
areyou are youok?
are yyyou ok?
areyou are youok?
[root@layne tdir]# grep "\<are" hello.txt # match only at the start of a word
are you ok?
areyou ok?
areyou are youok?
are yyyou ok?
areyou are youok?
[root@layne tdir]# grep "re\>" hello.txt # match only at the end of a word
are you ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
are yyyou ok?
areyou are youok?
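
For a plain word such as are, grep also offers the -w option, which selects only whole-word matches. The sketch below, on invented lines and assuming GNU grep for \< and \>, compares it with the explicit boundaries used above.

```shell
# Invented sample lines.
data=$(printf '%s\n' 'are you ok?' 'areyou ok?' 'aare you ok')

boundaries=$(printf '%s\n' "$data" | grep '\<are\>')  # explicit word boundaries
whole_word=$(printf '%s\n' "$data" | grep -w 'are')   # -w: whole words only

echo "$boundaries"   # are you ok?
echo "$whole_word"   # are you ok?
```

The explicit \< and \> remain useful when only one side should be anchored, as in \<are or re\>.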

Matching several keywords at once: OR

[root@layne tdir]# grep -E "are|you" hello.txt # lines containing are or you; either one is enough
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
abcre you ok?
xxre you ok
are yyyou ok?
areyou are youok?
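
An OR search does not strictly need the | operator: grep accepts several patterns through repeated -e options and selects a line if any of them matches. A small sketch on invented lines:

```shell
# Invented sample lines.
data=$(printf '%s\n' 'are ok' 'you ok' 'hello')

alternation=$(printf '%s\n' "$data" | grep -E 'are|you')       # ERE alternation
multi_e=$(printf '%s\n' "$data" | grep -e 'are' -e 'you')      # several patterns

echo "$alternation"   # are ok and you ok (one per line)
echo "$multi_e"       # same two lines
```

The -e form works in basic mode too, which avoids having to escape or enable |.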

Matching several keywords at once: AND

Chain several grep commands with pipes to get an AND match indirectly:

[root@layne tdir]# grep are hello.txt | grep you | grep ok # a line must satisfy all three conditions (are, you and ok) to match
are you ok?
areyou ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
are yyyou ok?
areyou are youok?
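
As a point of comparison, and not something the article itself uses, the same AND condition can be written in a single awk command, where patterns can be joined with &&. The input lines below are invented.

```shell
# Invented sample lines.
data=$(printf '%s\n' 'are you ok?' 'are you' 'you ok')

piped=$(printf '%s\n' "$data" | grep are | grep you | grep ok)   # chained greps
single=$(printf '%s\n' "$data" | awk '/are/ && /you/ && /ok/')   # one awk pass

echo "$piped"    # are you ok?
echo "$single"   # are you ok?
```

Both approaches ignore the order in which the keywords appear on the line.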

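In grep "\<a*re\>" hello.txt, the * means zero or more of the preceding a, so the pattern matches lines containing a whole word made of zero or more a followed by re (re, are, aare, and so on).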

[root@layne tdir]# grep  "\<a*re\>" hello.txt
are you ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
are yyyou ok?
areyou are youok?

Also, grep "\<a*re\>" hello.txt is equivalent to grep -E "(\<a*re\>)" hello.txt

[root@layne tdir]# grep -E "(\<a*re\>)" hello.txt
are you ok?
areyou are youok?
aaare you ok?
aare you ok
aaaare you ok
are yyyou ok?
areyou are youok?
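
To see which word on each line actually satisfied the pattern, the -o option prints only the matched text. This is a sketch assuming GNU grep (-o, \< and \> are not part of POSIX grep); the two input lines are taken from the style of hello.txt.

```shell
# Two sample lines in the style of hello.txt.
matched=$(printf '%s\n' 'areyou are youok?' 'aaare you' | grep -o '\<a*re\>')
echo "$matched"
# are
# aaare
```

On the first line it is the standalone word are that matches, not the beginning of areyou, because \> requires the word to end right after re.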

grep -E "a{3}" hello.txt matches lines that contain three consecutive a characters

[root@layne tdir]# grep -E "a{3}" hello.txt
aaare you ok?
aaaare you ok

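Match one or more a: grep -E "a+" hello.txt, or equivalently grep "a\+" hello.txt

Match zero or one a: grep -E "a?" hello.txt, or equivalently grep "a\?" hello.txt. Because zero occurrences also count as a match, this selects every line.

Match everything: grep -E "*" hello.txt is meant to print the whole file. A leading * in an extended regular expression is undefined by POSIX, so the result may differ between grep implementations.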
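
The effect of a pattern that can match the empty string is easiest to see by counting lines. A minimal sketch on three invented lines, using the standard -c option:

```shell
# Invented sample lines; only "are" contains an a.
data=$(printf '%s\n' hello are xyz)

optional=$(printf '%s\n' "$data" | grep -Ec 'a?')   # zero or one a: every line
required=$(printf '%s\n' "$data" | grep -Ec 'a+')   # one or more a
empty=$(printf '%s\n' "$data" | grep -c '')         # empty pattern: every line

echo "$optional $required $empty"   # 3 1 3
```

The empty pattern is a portable way to select every line, which avoids relying on a leading *.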