Regex search

Regular expressions (regex) are patterns used to match character combinations in strings. They are powerful tools for searching, validating, and manipulating text.

Regular expression patterns can be embedded in the query string by wrapping them in forward slashes (/), for example /colou?r/.
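As a minimal sketch of the wrapping described above, the Python helper below turns a pattern into a query-string fragment. The function name as_regex_query is hypothetical. Escaping an embedded forward slash as \/ is an assumption based on how Lucene's classic query parser reads the delimiter; this page does not confirm it for data.europa.eu.

```python
def as_regex_query(pattern: str) -> str:
    """Wrap a regular expression in forward slashes for use in a query string."""
    # Assumption: a forward slash inside the pattern has to be escaped so that
    # it is not read as the closing delimiter. The input is expected to contain
    # only unescaped slashes.
    escaped = pattern.replace("/", "\\/")
    return f"/{escaped}/"


print(as_regex_query("colou?r"))  # prints /colou?r/
```

The returned string can then be placed in the search query like any other term.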

The Lucene regular expression engine used by data.europa.eu supports all Unicode characters. However, the following characters are reserved as operators:

Operator  Description                                                   Example
.         Matches any single character (except newline)                 a.c matches abc, a1c
?         Matches 0 or 1 occurrence of the preceding element            colou?r matches color or colour
+         Matches 1 or more occurrences of the preceding element        a+ matches a, aa
*         Matches 0 or more occurrences of the preceding element        ab* matches a, ab, abb
|         Acts as a logical OR between expressions                      cat|dog matches either cat or dog
{}        Specifies the number of occurrences of the preceding element  a{2,4} matches aa, aaa, or aaaa
[]        Defines a character class; matches any one of its characters  [abc] matches a, b, or c
()        Groups expressions so an operator applies to the whole group  (abc)+ matches one or more occurrences of abc
"         Encloses a string to be matched literally                     "hello" matches the exact string hello
\         Escapes a special character so it is treated as a literal     \. matches the period character
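
The examples in the table can be tried out locally. The sketch below uses Python's re.fullmatch as a stand-in for whole-term matching. Python's regex dialect is not Lucene's (double quotes, for instance, have no special meaning in Python), so only operators that behave alike in both are exercised. The helper name matches_whole is hypothetical.

```python
import re

# (pattern, candidate, expected) triples based on the examples in the table.
EXAMPLES = [
    ("a.c", "abc", True),
    ("a.c", "a1c", True),
    ("colou?r", "color", True),
    ("colou?r", "colour", True),
    ("a+", "aa", True),
    ("a+", "", False),
    ("ab*", "a", True),
    ("ab*", "abb", True),
    ("cat|dog", "dog", True),
    ("a{2,4}", "aaa", True),
    ("a{2,4}", "aaaaa", False),
    ("[abc]", "b", True),
    ("(abc)+", "abcabc", True),
    (r"\.", ".", True),
    (r"\.", "a", False),
]


def matches_whole(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


for pattern, text, expected in EXAMPLES:
    assert matches_whole(pattern, text) == expected
```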

To use one of these characters literally within a regular expression, escape it with a preceding backslash or surround it with double quotes.

For example, \\ matches a literal backslash (\).
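
The backslash rule can be sketched as a small Python helper that escapes every reserved character in a piece of text before it is placed in a pattern. The names RESERVED and escape_literal are hypothetical, and the character set is simply the one listed in the operator table.

```python
# Reserved operator characters, as listed in the operator table.
RESERVED = set('.?+*|{}[]()"\\')


def escape_literal(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


print(escape_literal("v1.0 (beta)"))  # prints v1\.0 \(beta\)
print(escape_literal("C:\\data"))     # prints C:\\data
```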

Note that regular expression search is case-sensitive: /cat/ does not match Cat.
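
One possible workaround for case sensitivity, sketched here as an illustration and not as a documented feature, is to expand each letter into a character class that lists both cases. The function name case_insensitive is hypothetical, and the helper is meant for plain words: it does not escape reserved characters.

```python
def case_insensitive(word: str) -> str:
    """Expand each letter into a class holding both of its cases."""
    parts = []
    for ch in word:
        lower, upper = ch.lower(), ch.upper()
        # Only expand characters that have two distinct single-character cases.
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            parts.append(f"[{lower}{upper}]")
        else:
            parts.append(ch)
    return "".join(parts)


print(case_insensitive("cat"))  # prints [cC][aA][tT]
```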

Unsupported operators

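The regular expression engine of data.europa.eu does not support anchor operators such as ^ (beginning of line) or $ (end of line). Patterns are implicitly anchored at both ends: to match a term, the regular expression must match the entire string. For example, ab matches only the term ab, whereas ab.* matches any term that starts with ab.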
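
Because the pattern must cover the entire string, a match anywhere in a term needs explicit padding with .* on both sides. The sketch below illustrates this, again using Python's re.fullmatch as a stand-in for the whole-string behaviour; the helper name contains is hypothetical.

```python
import re


def contains(pattern: str) -> str:
    """Pad a pattern with .* on both sides so it can match anywhere in a term."""
    return f".*({pattern}).*"


# Whole-string matching: "data" alone does not match the longer term.
assert re.fullmatch("data", "metadata") is None
# The padded pattern matches because .* absorbs the surrounding characters.
assert re.fullmatch(contains("data"), "metadata") is not None
# A prefix match only needs padding on the right.
assert re.fullmatch("meta.*", "metadata") is not None
```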