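Lately I've been reading through the string-handling functions in base R and came across grep/agrep/grepl/regexpr/gregexpr/regexec/gregexec plus regmatches. The names look so alike that I tried to sort out what the letters in them stand for:
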
|Letters in the function name|Stands for|
| ---- | ---- |
|re|regular expression|
|p|print|
|g|global|
|a|approximate (fuzzy)|
|l|logical|
|reg|regular|
|expr|expression|
|exec|execute|
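
To make the `l` row concrete, here is a small sketch on a made-up vector `x` (hypothetical data, not from this post): `grep` returns the indices of the elements that match, while `grepl` returns one logical value per element.

```r
# Hypothetical example data, chosen only for illustration.
x <- c("a>b", "amu", "amu<")

idx <- grep("[>.<]", x)    # integer indices of the matching elements
hit <- grepl("[>.<]", x)   # logical vector, same length as x

print(idx)
print(hit)
```
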
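
My question: by experimenting, the only thing I can find is that regexpr and regexec return their results in different formats. Is there any other difference between them?
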
```r
my_vector <- c('>阿木<;>曼妮<', 'amu', 'amu<', '>.<')
regexpr(pattern = '[>.<]', my_vector)
## [1]  1 -1  4  1
## attr(,"match.length")
## [1]  1 -1  1  1
gregexpr(pattern = '[>.<]', my_vector)
## [[1]]
## [1] 1 4 6 9
## attr(,"match.length")
## [1] 1 1 1 1
##
## [[2]]
## [1] -1
## attr(,"match.length")
## [1] -1
##
## [[3]]
## [1] 4
## attr(,"match.length")
## [1] 1
##
## [[4]]
## [1] 1 2 3
## attr(,"match.length")
## [1] 1 1 1
regexec(pattern = '[>.<]', my_vector)
## [[1]]
## [1] 1
## attr(,"match.length")
## [1] 1
##
## [[2]]
## [1] -1
## attr(,"match.length")
## [1] -1
##
## [[3]]
## [1] 4
## attr(,"match.length")
## [1] 1
##
## [[4]]
## [1] 1
## attr(,"match.length")
## [1] 1
gregexec(pattern = '[>.<]', my_vector)
## [[1]]
##      [,1] [,2] [,3] [,4]
## [1,]    1    4    6    9
## attr(,"match.length")
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    1    1
##
## [[2]]
## [1] -1
## attr(,"match.length")
## [1] -1
##
## [[3]]
##      [,1]
## [1,]    4
## attr(,"match.length")
##      [,1]
## [1,]    1
##
## [[4]]
##      [,1] [,2] [,3]
## [1,]    1    2    3
## attr(,"match.length")
##      [,1] [,2] [,3]
## [1,]    1    1    1
```
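
One difference the `[>.<]` pattern cannot reveal, because it contains no parenthesised group: as far as I understand the base R documentation, `regexec` also reports the start of each parenthesised sub-expression, whereas `regexpr` reports only the whole match. A minimal sketch, where the vector `x` and the pattern `pat` are made up for illustration and are not part of the example above:

```r
# Hypothetical data and pattern, chosen only to include capture groups.
x   <- c("key=value", "novalue")
pat <- "([a-z]+)=([a-z]+)"   # two parenthesised sub-expressions

m1 <- regexpr(pat, x)   # integer vector: start of the whole match per element
m2 <- regexec(pat, x)   # list: start of the whole match, then of each group

whole  <- regmatches(x, m1)   # matched substrings; non-matching elements are dropped
groups <- regmatches(x, m2)   # list with one character vector per element of x

print(whole)
print(groups)
```

Here `groups[[1]]` holds the whole match followed by the two captured pieces, and `groups[[2]]` is empty because "novalue" does not match. If that reading is right, `gregexec` relates to `regexec` the way `gregexpr` relates to `regexpr`: it does the same for every match rather than only the first, which would explain the matrix-shaped output.
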