syntax.txt 12 KB

RE2 regular expression syntax reference
---------------------------------------

Single characters:
.             any character, possibly including newline (s=true)
[xyz]         character class
[^xyz]        negated character class
\d            Perl character class
\D            negated Perl character class
[[:alpha:]]   ASCII character class
[[:^alpha:]]  negated ASCII character class
\pN           Unicode character class (one-letter name)
\p{Greek}     Unicode character class
\PN           negated Unicode character class (one-letter name)
\P{Greek}     negated Unicode character class

Composites:
xy   «x» followed by «y»
x|y  «x» or «y» (prefer «x»)
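
The «prefer «x»» rule for alternation can be checked with a short sketch. It assumes Go's standard regexp package as a stand-in for RE2 (Go documents its syntax as the one accepted by RE2, except for \C); the helper name firstMatch is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if none.
// Hypothetical helper for illustration; not part of RE2 or of Go's regexp.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

func main() {
	// Alternation is ordered: the first alternative that matches wins,
	// even when a later alternative would match more text.
	fmt.Printf("%q\n", firstMatch(`a|ab`, "ab")) // "a"
	fmt.Printf("%q\n", firstMatch(`ab|a`, "ab")) // "ab"

	// Concatenation: «x» followed by «y».
	fmt.Printf("%q\n", firstMatch(`ab`, "cab")) // "ab"
}
```

Reordering the alternatives is therefore the way to prefer the longer branch under these semantics.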

Repetitions:
x*       zero or more «x», prefer more
x+       one or more «x», prefer more
x?       zero or one «x», prefer one
x{n,m}   «n» or «n»+1 or ... or «m» «x», prefer more
x{n,}    «n» or more «x», prefer more
x{n}     exactly «n» «x»
x*?      zero or more «x», prefer fewer
x+?      one or more «x», prefer fewer
x??      zero or one «x», prefer zero
x{n,m}?  «n» or «n»+1 or ... or «m» «x», prefer fewer
x{n,}?   «n» or more «x», prefer fewer
x{n}?    exactly «n» «x»
x{}      (== x*) NOT SUPPORTED vim
x{-}     (== x*?) NOT SUPPORTED vim
x{-n}    (== x{n}?) NOT SUPPORTED vim
x=       (== x?) NOT SUPPORTED vim

Implementation restriction: The counting forms «x{n,m}», «x{n,}», and «x{n}»
reject forms that create a minimum or maximum repetition count above 1000.
Unlimited repetitions are not subject to this restriction.
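
The difference between «prefer more» and «prefer fewer», and the 1000-count restriction, can be sketched as follows. This assumes Go's standard regexp package, which accepts RE2 syntax (except \C) and documents the same restriction; firstMatch and compiles are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if none.
// Hypothetical helper for illustration.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// compiles reports whether the parser accepts the pattern.
// Hypothetical helper for illustration.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	text := "aaaa"
	patterns := []string{`a*`, `a*?`, `a+`, `a+?`, `a{2,3}`, `a{2,3}?`, `a{2,}?`}
	for _, p := range patterns {
		fmt.Printf("%-8s -> %q\n", p, firstMatch(p, text))
	}
	// a*       -> "aaaa"
	// a*?      -> ""
	// a+       -> "aaaa"
	// a+?      -> "a"
	// a{2,3}   -> "aaa"
	// a{2,3}?  -> "aa"
	// a{2,}?   -> "aa"

	// Counted repetitions above 1000 are rejected at compile time.
	fmt.Println(compiles(`a{1000}`), compiles(`a{1001}`)) // true false
}
```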

Possessive repetitions:
x*+      zero or more «x», possessive NOT SUPPORTED
x++      one or more «x», possessive NOT SUPPORTED
x?+      zero or one «x», possessive NOT SUPPORTED
x{n,m}+  «n» or ... or «m» «x», possessive NOT SUPPORTED
x{n,}+   «n» or more «x», possessive NOT SUPPORTED
x{n}+    exactly «n» «x», possessive NOT SUPPORTED

Grouping:
(re)          numbered capturing group (submatch)
(?P<name>re)  named & numbered capturing group (submatch)
(?<name>re)   named & numbered capturing group (submatch) NOT SUPPORTED
(?'name're)   named & numbered capturing group (submatch) NOT SUPPORTED
(?:re)        non-capturing group
(?flags)      set flags within current group; non-capturing
(?flags:re)   set flags during re; non-capturing
(?#text)      comment NOT SUPPORTED
(?|x|y|z)     branch numbering reset NOT SUPPORTED
(?>re)        possessive match of «re» NOT SUPPORTED
re@>          possessive match of «re» NOT SUPPORTED vim
%(re)         non-capturing group NOT SUPPORTED vim
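
A minimal sketch of how numbered, named, and non-capturing groups interact, assuming Go's standard regexp package as a stand-in for RE2 (it accepts RE2 syntax except \C). The pattern dateRE and the helper namedGroups are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// dateRE has two named (and therefore also numbered) groups,
// separated by a non-capturing group that is not counted.
var dateRE = regexp.MustCompile(`(?P<year>\d{4})(?:-|/)(?P<month>\d{2})`)

// namedGroups maps each group name to the text it captured,
// or returns nil when the pattern does not match.
func namedGroups(re *regexp.Regexp, text string) map[string]string {
	m := re.FindStringSubmatch(text)
	if m == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if name != "" {
			out[name] = m[i]
		}
	}
	return out
}

func main() {
	fmt.Println(dateRE.NumSubexp())                      // 2
	fmt.Println(namedGroups(dateRE, "released 2021/07")) // map[month:07 year:2021]
}
```

Using «(?:re)» for the separator keeps the group numbers of «year» and «month» at 1 and 2.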

Flags:
i  case-insensitive (default false)
m  multi-line mode: «^» and «$» match begin/end line in addition to begin/end text (default false)
s  let «.» match «\n» (default false)
U  ungreedy: swap meaning of «x*» and «x*?», «x+» and «x+?», etc (default false)

Flag syntax is «xyz» (set) or «-xyz» (clear) or «xy-z» (set «xy», clear «z»).

Empty strings:
^        at beginning of text or line («m»=true)
$        at end of text (like «\z» not «\Z») or line («m»=true)
\A       at beginning of text
\b       at ASCII word boundary («\w» on one side and «\W», «\A», or «\z» on the other)
\B       not at ASCII word boundary
\G       at beginning of subtext being searched NOT SUPPORTED pcre
\G       at end of last match NOT SUPPORTED perl
\Z       at end of text, or before newline at end of text NOT SUPPORTED
\z       at end of text
(?=re)   before text matching «re» NOT SUPPORTED
(?!re)   before text not matching «re» NOT SUPPORTED
(?<=re)  after text matching «re» NOT SUPPORTED
(?<!re)  after text not matching «re» NOT SUPPORTED
re&      before text matching «re» NOT SUPPORTED vim
re@=     before text matching «re» NOT SUPPORTED vim
re@!     before text not matching «re» NOT SUPPORTED vim
re@<=    after text matching «re» NOT SUPPORTED vim
re@<!    after text not matching «re» NOT SUPPORTED vim
\zs      sets start of match (= \K) NOT SUPPORTED vim
\ze      sets end of match NOT SUPPORTED vim
\%^      beginning of file NOT SUPPORTED vim
\%$      end of file NOT SUPPORTED vim
\%V      on screen NOT SUPPORTED vim
\%#      cursor position NOT SUPPORTED vim
\%'m     mark «m» position NOT SUPPORTED vim
\%23l    in line 23 NOT SUPPORTED vim
\%23c    in column 23 NOT SUPPORTED vim
\%23v    in virtual column 23 NOT SUPPORTED vim
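
Two points in the table above are easy to miss: «$» behaves like «\z» rather than «\Z» unless «m» is set, and «\b» is an ASCII-only boundary. The sketch below illustrates both, assuming Go's standard regexp package as a stand-in for RE2 (same syntax except \C); matches and compiles are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
// Hypothetical helper for illustration.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the parser accepts the pattern.
// Hypothetical helper for illustration.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// «$» is end of text only; a trailing newline is not skipped.
	fmt.Println(matches(`a$`, "a\n"))     // false
	fmt.Println(matches(`(?m)a$`, "a\n")) // true
	fmt.Println(matches(`a\z`, "a"))      // true

	// «\b» needs an ASCII word character on exactly one side.
	fmt.Println(matches(`\bcat\b`, "a cat sat")) // true
	fmt.Println(matches(`\bcat\b`, "concat"))    // false
	fmt.Println(matches(`\Bcat`, "concat"))      // true
	// «é» is not in «\w», so there is no boundary between " " and "é".
	fmt.Println(matches(`\bété\b`, "un été")) // false

	// Lookaround and «\Z» are rejected at compile time.
	fmt.Println(compiles(`foo(?=bar)`), compiles(`a\Z`)) // false false
}
```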

Escape sequences:
\a           bell (== \007)
\f           form feed (== \014)
\t           horizontal tab (== \011)
\n           newline (== \012)
\r           carriage return (== \015)
\v           vertical tab character (== \013)
\*           literal «*», for any punctuation character «*»
\123         octal character code (up to three digits)
\x7F         hex character code (exactly two digits)
\x{10FFFF}   hex character code
\C           match a single byte even in UTF-8 mode
\Q...\E      literal text «...» even if «...» has punctuation
\1           backreference NOT SUPPORTED
\b           backspace NOT SUPPORTED (use «\010»)
\cK          control char ^K NOT SUPPORTED (use «\001» etc)
\e           escape NOT SUPPORTED (use «\033»)
\g1          backreference NOT SUPPORTED
\g{1}        backreference NOT SUPPORTED
\g{+1}       backreference NOT SUPPORTED
\g{-1}       backreference NOT SUPPORTED
\g{name}     named backreference NOT SUPPORTED
\g<name>     subroutine call NOT SUPPORTED
\g'name'     subroutine call NOT SUPPORTED
\k<name>     named backreference NOT SUPPORTED
\k'name'     named backreference NOT SUPPORTED
\lX          lowercase «X» NOT SUPPORTED
\ux          uppercase «x» NOT SUPPORTED
\L...\E      lowercase text «...» NOT SUPPORTED
\K           reset beginning of «$0» NOT SUPPORTED
\N{name}     named Unicode character NOT SUPPORTED
\R           line break NOT SUPPORTED
\U...\E      upper case text «...» NOT SUPPORTED
\X           extended Unicode sequence NOT SUPPORTED
\%d123       decimal character 123 NOT SUPPORTED vim
\%xFF        hex character FF NOT SUPPORTED vim
\%o123       octal character 123 NOT SUPPORTED vim
\%u1234      Unicode character 0x1234 NOT SUPPORTED vim
\%U12345678  Unicode character 0x12345678 NOT SUPPORTED vim
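
A few of the supported escapes, and two of the unsupported ones, in a short sketch. It assumes Go's standard regexp package as a stand-in for RE2; note that Go does not accept «\C», so that escape is left out. The helpers matches and compiles are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in text.
// Hypothetical helper for illustration.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

// compiles reports whether the parser accepts the pattern.
// Hypothetical helper for illustration.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Three spellings of a character code: hex, octal, braced hex.
	fmt.Println(matches(`^\x41$`, "A"))     // true
	fmt.Println(matches(`^\101$`, "A"))     // true
	fmt.Println(matches(`^\x{3B1}$`, "α")) // true

	// «\*» and «\Q...\E» both take punctuation literally.
	fmt.Println(matches(`^a\*b$`, "a*b"))         // true
	fmt.Println(matches(`^1+1=2$`, "1+1=2"))      // false: «+» is a repetition
	fmt.Println(matches(`^\Q1+1=2\E$`, "1+1=2")) // true

	// «\e» is not supported; the octal form is the suggested replacement.
	fmt.Println(compiles(`\e`), matches(`\033`, "\x1b")) // false true

	// Backreferences are rejected at compile time.
	fmt.Println(compiles(`(a)\1`)) // false
}
```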

Character class elements:
x        single character
A-Z      character range (inclusive)
\d       Perl character class
[:foo:]  ASCII character class «foo»
\p{Foo}  Unicode character class «Foo»
\pF      Unicode character class «F» (one-letter name)

Named character classes as character class elements:
[\d]         digits (== \d)
[^\d]        not digits (== \D)
[\D]         not digits (== \D)
[^\D]        not not digits (== \d)
[[:name:]]   named ASCII class inside character class (== [:name:])
[^[:name:]]  named ASCII class inside negated character class (== [:^name:])
[\p{Name}]   named Unicode property inside character class (== \p{Name})
[^\p{Name}]  named Unicode property inside negated character class (== \P{Name})

Perl character classes (all ASCII-only):
\d  digits (== [0-9])
\D  not digits (== [^0-9])
\s  whitespace (== [\t\n\f\r ])
\S  not whitespace (== [^\t\n\f\r ])
\w  word characters (== [0-9A-Za-z_])
\W  not word characters (== [^0-9A-Za-z_])
\h  horizontal space NOT SUPPORTED
\H  not horizontal space NOT SUPPORTED
\v  vertical space NOT SUPPORTED
\V  not vertical space NOT SUPPORTED

ASCII character classes:
[[:alnum:]]   alphanumeric (== [0-9A-Za-z])
[[:alpha:]]   alphabetic (== [A-Za-z])
[[:ascii:]]   ASCII (== [\x00-\x7F])
[[:blank:]]   blank (== [\t ])
[[:cntrl:]]   control (== [\x00-\x1F\x7F])
[[:digit:]]   digits (== [0-9])
[[:graph:]]   graphical (== [!-~] == [A-Za-z0-9!"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~])
[[:lower:]]   lower case (== [a-z])
[[:print:]]   printable (== [ -~] == [ [:graph:]])
[[:punct:]]   punctuation (== [!-/:-@[-`{-~])
[[:space:]]   whitespace (== [\t\n\v\f\r ])
[[:upper:]]   upper case (== [A-Z])
[[:word:]]    word characters (== [0-9A-Za-z_])
[[:xdigit:]]  hex digit (== [0-9A-Fa-f])
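
The sketch below shows named classes used as character class elements, and what «ASCII-only» means in practice for «\d» and «\w». It assumes Go's standard regexp package as a stand-in for RE2 (same syntax except \C); firstMatch and matches are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in text, or "" if none.
// Hypothetical helper for illustration.
func firstMatch(pattern, text string) string {
	return regexp.MustCompile(pattern).FindString(text)
}

// matches reports whether pattern matches anywhere in text.
// Hypothetical helper for illustration.
func matches(pattern, text string) bool {
	return regexp.MustCompile(pattern).MatchString(text)
}

func main() {
	// ASCII classes, plain and negated (both negation spellings agree).
	fmt.Printf("%q\n", firstMatch(`[[:alpha:]]+`, "42 apples"))  // "apples"
	fmt.Printf("%q\n", firstMatch(`[[:^alpha:]]+`, "abc123def")) // "123"
	fmt.Printf("%q\n", firstMatch(`[^[:alpha:]]+`, "abc123def")) // "123"
	fmt.Printf("%q\n", firstMatch(`[[:punct:]]+`, "wait...what")) // "..."

	// A Perl class combined with other elements inside brackets.
	fmt.Printf("%q\n", firstMatch(`[\d.]+`, "v1.25 ")) // "1.25"

	// «\d» is [0-9] only: U+0663 ARABIC-INDIC DIGIT THREE needs «\p{Nd}».
	fmt.Println(matches(`^\d$`, "\u0663"), matches(`^\p{Nd}$`, "\u0663")) // false true

	// «\w» is ASCII only: add «\p{L}» to accept letters such as «ï».
	fmt.Println(matches(`^\w+$`, "naïve"), matches(`^[\w\p{L}]+$`, "naïve")) // false true
}
```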

Unicode character class names--general category:
C   other
Cc  control
Cf  format
Cn  unassigned code points NOT SUPPORTED
Co  private use
Cs  surrogate
L   letter
LC  cased letter NOT SUPPORTED
L&  cased letter NOT SUPPORTED
Ll  lowercase letter
Lm  modifier letter
Lo  other letter
Lt  titlecase letter
Lu  uppercase letter
M   mark
Mc  spacing mark
Me  enclosing mark
Mn  non-spacing mark
N   number
Nd  decimal number
Nl  letter number
No  other number
P   punctuation
Pc  connector punctuation
Pd  dash punctuation
Pe  close punctuation
Pf  final punctuation
Pi  initial punctuation
Po  other punctuation
Ps  open punctuation
S   symbol
Sc  currency symbol
Sk  modifier symbol
Sm  math symbol
So  other symbol
Z   separator
Zl  line separator
Zp  paragraph separator
Zs  space separator

Unicode character class names--scripts:
Adlam
Ahom
Anatolian_Hieroglyphs
Arabic
Armenian
Avestan
Balinese
Bamum
Bassa_Vah
Batak
Bengali
Bhaiksuki
Bopomofo
Brahmi
Braille
Buginese
Buhid
Canadian_Aboriginal
Carian
Caucasian_Albanian
Chakma
Cham
Cherokee
Chorasmian
Common
Coptic
Cuneiform
Cypriot
Cyrillic
Deseret
Devanagari
Dives_Akuru
Dogra
Duployan
Egyptian_Hieroglyphs
Elbasan
Elymaic
Ethiopic
Georgian
Glagolitic
Gothic
Grantha
Greek
Gujarati
Gunjala_Gondi
Gurmukhi
Han
Hangul
Hanifi_Rohingya
Hanunoo
Hatran
Hebrew
Hiragana
Imperial_Aramaic
Inherited
Inscriptional_Pahlavi
Inscriptional_Parthian
Javanese
Kaithi
Kannada
Katakana
Kayah_Li
Kharoshthi
Khitan_Small_Script
Khmer
Khojki
Khudawadi
Lao
Latin
Lepcha
Limbu
Linear_A
Linear_B
Lisu
Lycian
Lydian
Mahajani
Makasar
Malayalam
Mandaic
Manichaean
Marchen
Masaram_Gondi
Medefaidrin
Meetei_Mayek
Mende_Kikakui
Meroitic_Cursive
Meroitic_Hieroglyphs
Miao
Modi
Mongolian
Mro
Multani
Myanmar
Nabataean
Nandinagari
New_Tai_Lue
Newa
Nko
Nushu
Nyiakeng_Puachue_Hmong
Ogham
Ol_Chiki
Old_Hungarian
Old_Italic
Old_North_Arabian
Old_Permic
Old_Persian
Old_Sogdian
Old_South_Arabian
Old_Turkic
Oriya
Osage
Osmanya
Pahawh_Hmong
Palmyrene
Pau_Cin_Hau
Phags_Pa
Phoenician
Psalter_Pahlavi
Rejang
Runic
Samaritan
Saurashtra
Sharada
Shavian
Siddham
SignWriting
Sinhala
Sogdian
Sora_Sompeng
Soyombo
Sundanese
Syloti_Nagri
Syriac
Tagalog
Tagbanwa
Tai_Le
Tai_Tham
Tai_Viet
Takri
Tamil
Tangut
Telugu
Thaana
Thai
Tibetan
Tifinagh
Tirhuta
Ugaritic
Vai
Wancho
Warang_Citi
Yezidi
Yi
Zanabazar_Square

Vim character classes:
\i   identifier character NOT SUPPORTED vim
\I   «\i» except digits NOT SUPPORTED vim
\k   keyword character NOT SUPPORTED vim
\K   «\k» except digits NOT SUPPORTED vim
\f   file name character NOT SUPPORTED vim
\F   «\f» except digits NOT SUPPORTED vim
\p   printable character NOT SUPPORTED vim
\P   «\p» except digits NOT SUPPORTED vim
\s   whitespace character (== [ \t]) NOT SUPPORTED vim
\S   non-white space character (== [^ \t]) NOT SUPPORTED vim
\d   digits (== [0-9]) vim
\D   not «\d» vim
\x   hex digits (== [0-9A-Fa-f]) NOT SUPPORTED vim
\X   not «\x» NOT SUPPORTED vim
\o   octal digits (== [0-7]) NOT SUPPORTED vim
\O   not «\o» NOT SUPPORTED vim
\w   word character vim
\W   not «\w» vim
\h   head of word character NOT SUPPORTED vim
\H   not «\h» NOT SUPPORTED vim
\a   alphabetic NOT SUPPORTED vim
\A   not «\a» NOT SUPPORTED vim
\l   lowercase NOT SUPPORTED vim
\L   not lowercase NOT SUPPORTED vim
\u   uppercase NOT SUPPORTED vim
\U   not uppercase NOT SUPPORTED vim
\_x  «\x» plus newline, for any «x» NOT SUPPORTED vim

Vim flags:
\c  ignore case NOT SUPPORTED vim
\C  match case NOT SUPPORTED vim
\m  magic NOT SUPPORTED vim
\M  nomagic NOT SUPPORTED vim
\v  verymagic NOT SUPPORTED vim
\V  verynomagic NOT SUPPORTED vim
\Z  ignore differences in Unicode combining characters NOT SUPPORTED vim

Magic:
(?{code})            arbitrary Perl code NOT SUPPORTED perl
(??{code})           postponed arbitrary Perl code NOT SUPPORTED perl
(?n)                 recursive call to regexp capturing group «n» NOT SUPPORTED
(?+n)                recursive call to relative group «+n» NOT SUPPORTED
(?-n)                recursive call to relative group «-n» NOT SUPPORTED
(?C)                 PCRE callout NOT SUPPORTED pcre
(?R)                 recursive call to entire regexp (== (?0)) NOT SUPPORTED
(?&name)             recursive call to named group NOT SUPPORTED
(?P=name)            named backreference NOT SUPPORTED
(?P>name)            recursive call to named group NOT SUPPORTED
(?(cond)true|false)  conditional branch NOT SUPPORTED
(?(cond)true)        conditional branch NOT SUPPORTED
(*ACCEPT)            make regexps more like Prolog NOT SUPPORTED
(*COMMIT)            NOT SUPPORTED
(*F)                 NOT SUPPORTED
(*FAIL)              NOT SUPPORTED
(*MARK)              NOT SUPPORTED
(*PRUNE)             NOT SUPPORTED
(*SKIP)              NOT SUPPORTED
(*THEN)              NOT SUPPORTED
(*ANY)               set newline convention NOT SUPPORTED
(*ANYCRLF)           NOT SUPPORTED
(*CR)                NOT SUPPORTED
(*CRLF)              NOT SUPPORTED
(*LF)                NOT SUPPORTED
(*BSR_ANYCRLF)       set \R convention NOT SUPPORTED pcre
(*BSR_UNICODE)       NOT SUPPORTED pcre