Regex search
Regular expressions (regex) are patterns used to match character combinations in strings. They are powerful tools for searching, validating, and manipulating text.
Regular expression patterns can be embedded in the query string by wrapping them in forward slashes ("/"), for example /colou?r/.
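As a rough illustration of the wrapping step, the Python sketch below builds a slash-delimited pattern and URL-encodes it. The helper names `wrap_regex` and `build_query` and the parameter name `q` are hypothetical and do not come from the data.europa.eu API. Escaping a literal "/" inside the pattern with a backslash is an assumption based on how Lucene-style query parsers usually delimit regular expressions, and the helper expects slashes in the input to be unescaped.

```python
from urllib.parse import urlencode


def wrap_regex(pattern: str) -> str:
    """Wrap a regex pattern in forward slashes for use in a query string."""
    # Assumption: a literal "/" inside the pattern must be backslash-escaped
    # so that it is not read as the closing delimiter.
    return "/" + pattern.replace("/", "\\/") + "/"


def build_query(pattern: str, param: str = "q") -> str:
    """URL-encode the wrapped pattern. The parameter name "q" is hypothetical."""
    return urlencode({param: wrap_regex(pattern)})


print(wrap_regex("colou?r"))   # /colou?r/
print(build_query("colou?r"))  # q=%2Fcolou%3Fr%2F
```

The encoded form matters because "/" and "?" have their own meaning in URLs; when typing into a search box, only the slash-wrapped pattern is needed.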
The Lucene regular expression engine used by data.europa.eu supports all Unicode characters. However, the following characters are reserved as operators:
Operator | Description | Example |
---|---|---|
. | Matches any single character (except newline) | a.c matches abc , a1c |
? | Matches 0 or 1 occurrence of the preceding element | colou?r matches color or colour |
+ | Matches 1 or more occurrences of the preceding element | a+ matches a , aa |
* | Matches 0 or more occurrences of the preceding element | ab* matches a , ab , abb |
\| | Acts as a logical OR between expressions | cat\|dog matches either cat or dog |
{} | Specifies how many times the preceding element may occur | a{2,4} matches aa , aaa , or aaaa |
[] | Defines a character class; matches any single character within the brackets | [abc] matches a , b , or c |
() | Groups expressions so that an operator applies to the whole group | (abc)+ matches one or more occurrences of abc |
" | Encloses text that is matched literally; operators inside the quotes need no escaping | "a.c" matches only the literal text a.c |
\ | Escapes a special character to treat it as a literal | \. matches the period character |
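The examples in the table can be tried out locally. The sketch below uses Python's `re` module as a stand-in for the Lucene engine, which is an approximation: the two engines differ in places (for instance, Python treats double quotes as ordinary characters, so that row is left out), but the basic operators shown here behave the same way. `re.fullmatch` is used because the pattern has to match the entire string.

```python
import re

# (pattern, strings expected to match, strings expected not to match)
EXAMPLES = [
    ("a.c", ["abc", "a1c"], ["ac", "abcd"]),
    ("colou?r", ["color", "colour"], ["colouur"]),
    ("a+", ["a", "aa"], [""]),
    ("ab*", ["a", "ab", "abb"], ["b"]),
    ("cat|dog", ["cat", "dog"], ["catdog"]),
    ("a{2,4}", ["aa", "aaa", "aaaa"], ["a", "aaaaa"]),
    ("[abc]", ["a", "b", "c"], ["d", "ab"]),
    ("(abc)+", ["abc", "abcabc"], ["ab"]),
    (r"\.", ["."], ["a"]),
]


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None


def check_examples() -> bool:
    """Check every positive and negative example in EXAMPLES."""
    return all(
        all(matches(p, s) for s in yes) and not any(matches(p, s) for s in no)
        for p, yes, no in EXAMPLES
    )


print(check_examples())  # True
```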
To use one of these characters literally within a regular expression, escape it with a preceding backslash or surround it with double quotes.
For example, \\ matches a literal backslash ('\').
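When a search term comes from user input, the backslash escaping can be automated. The following is a minimal sketch: the helper name `escape_literal` is hypothetical, and the reserved set is simply the list of operator characters given above.

```python
# Reserved operator characters: . ? + * | { } [ ] ( ) " \
RESERVED = set('.?+*|{}[]()"\\')


def escape_literal(text: str) -> str:
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


print(escape_literal("a.c"))     # a\.c
print(escape_literal("1+1=2?"))  # 1\+1=2\?
print(escape_literal("\\"))      # \\
```

Characters outside the reserved set pass through unchanged, so the result can be combined with operators, for example `escape_literal("v1.0") + ".*"`.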
Note that regular expression search is case sensitive: /Cat/ does not match cat.
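One common workaround for case sensitivity is to expand each letter into a character class that holds both cases. The sketch below assumes the input is a plain word without reserved characters; the helper name `case_insensitive` is hypothetical.

```python
def case_insensitive(word: str) -> str:
    """Expand each cased letter into a two-character class, e.g. c -> [cC]."""
    parts = []
    for ch in word:
        lower, upper = ch.lower(), ch.upper()
        # Only expand letters whose two case forms are single, distinct characters.
        if lower != upper and len(lower) == 1 and len(upper) == 1:
            parts.append(f"[{lower}{upper}]")
        else:
            parts.append(ch)
    return "".join(parts)


print(case_insensitive("cat"))    # [cC][aA][tT]
print(case_insensitive("Data2"))  # [dD][aA][tT][aA]2
```

The resulting pattern grows with the length of the word, so this approach is best kept for short terms.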
Unsupported operators
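The regular expression engine of data.europa.eu does not support anchor operators such as ^ (beginning of line) or $ (end of line). Patterns are implicitly anchored at both ends: to match a term, the regular expression must match the entire string. For example, abc matches only abc, not abcd; to match any term that starts with abc, use abc.*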
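Patterns written for engines that do support anchors can be converted mechanically: drop each anchor, and add .* on every side that was not anchored. The sketch below shows one way to do this; the helper name `to_whole_string` is hypothetical, and it only handles a leading ^ and a trailing unescaped $, not anchors in the middle of a pattern.

```python
def _ends_with_anchor(pattern: str) -> bool:
    """True if the pattern ends with a "$" that is not backslash-escaped."""
    if not pattern.endswith("$"):
        return False
    stripped = pattern[:-1]
    backslashes = len(stripped) - len(stripped.rstrip("\\"))
    return backslashes % 2 == 0


def to_whole_string(pattern: str) -> str:
    """Rewrite an anchored pattern so that it must match the entire string."""
    anchored_start = pattern.startswith("^")
    anchored_end = _ends_with_anchor(pattern)
    core = pattern[1:] if anchored_start else pattern
    if anchored_end:
        core = core[:-1]
    prefix = "" if anchored_start else ".*"
    suffix = "" if anchored_end else ".*"
    # The group keeps alternations such as cat|dog together.
    return f"{prefix}({core}){suffix}"


print(to_whole_string("^abc"))     # (abc).*
print(to_whole_string("abc$"))     # .*(abc)
print(to_whole_string("cat|dog"))  # .*(cat|dog).*
```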