
How to search for datasets

The metadata catalogue of datasets can be explored in three ways: through the search engine (Data tab), through a SPARQL endpoint, and through an API endpoint.

An up-to-date list of all catalogues can always be retrieved by this SPARQL query.

Search manually

You can search for datasets containing one or more keywords by typing the keywords into the search field and then clicking on the search button. The search process will reduce your keywords to their root form, to ensure variants of your keyword also match. For example, "Europa" can also match "Europe", or "walking" can also match "walk" and "walked".

When entering more than one search term, you can combine the terms with the logical operators (connectors) AND and OR, and group them with parentheses (). You can also run a wildcard search or search for an exact phrase.

Logical Operator OR

The OR operator can be used to:

  • connect two or more similar concepts (synonyms)
  • broaden your results, telling the database that ANY of your search terms can be present in the resulting dataset

Example: population OR education OR science

All three circles together represent the result set for this search. The set is large because, with the OR operator, a dataset matches if it contains any one of the words.

Figure: Logical Operator OR

Logical Operator AND

The AND operator can be used to:

  • find sources containing two or more ideas
  • narrow the search

The database will only retrieve items containing both keywords. The AND operator can be used multiple times in one query.

Example: population AND education AND science

Figure: Logical Operator AND

Parentheses ()

Parentheses are used to:

  • perform a more complex search using both AND and OR by placing parentheses around synonyms
  • save time by searching multiple synonyms at once

Example: (environment OR nature) AND (refurbish OR reuse)

This avoids the need to perform multiple searches for combinations of keywords.
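The way AND, OR and parentheses combine can be illustrated with a small evaluator. This is only a sketch and not the portal's implementation: the function names are hypothetical, a dataset is reduced to a plain set of lowercase words, and it assumes that AND binds more tightly than OR when no parentheses are given, which the text above does not state. Adjacent terms without an operator are treated as OR.

```python
import re

def tokenize(query):
    # Split the query into parentheses and bare words (operators included).
    return re.findall(r"[()]|[^\s()]+", query)

def matches(query, words):
    """Return True if the set of lowercase words satisfies the query."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        result = parse_and()
        # An explicit OR, or simply another term, continues the OR chain.
        while peek() is not None and peek() != ")":
            if peek() == "OR":
                pos += 1
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_term()
        while peek() == "AND":
            pos += 1
            rhs = parse_term()
            result = result and rhs
        return result

    def parse_term():
        nonlocal pos
        token = peek()
        if token is None or token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {token!r}")
        pos += 1
        if token == "(":
            result = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        # A plain keyword matches if the dataset contains it.
        return token.lower() in words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unbalanced parentheses")
    return result
```

For instance, `matches("(environment OR nature) AND (refurbish OR reuse)", {"nature", "reuse"})` is true, because one synonym from each group is present, while a dataset containing only "nature" and "water" does not match.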

Figure: Parentheses

Basic rules for using AND, OR operators and parentheses ()

  • OR is implicit: the search function automatically puts an OR between your search terms. This means that a search for population OR education OR science returns the same matches as a search for population education science. In the same way, the following two queries are equivalent:

cat OR dog

cat dog

  • Always enter AND and OR operators in uppercase letters.
  • Never translate the AND and OR operators into another language. Whatever the language of your keywords, the operators must always be written as AND and OR.

Correct: katze OR hund

Incorrect: katze ODER hund

  • Always close your parentheses (). Combining logical operators with an unbalanced number of parentheses leads to incorrect search results.

Correct: (cat AND dog) OR bird

Incorrect: (cat AND dog) OR bird)
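The rules above can be checked mechanically before a query is submitted. The following is a minimal sketch, not part of the portal: the function name `check_query` and the small list of translated operators are illustrative assumptions.

```python
import re

# Illustrative only: a few German translations of the operators.
TRANSLATED_OPERATORS = {"UND", "ODER"}

def check_query(query):
    """Return a list of problems found in a search query (empty if none)."""
    problems = []

    # Rule: every opening parenthesis needs a matching closing one.
    depth = 0
    for char in query:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                break  # a closing parenthesis with no opening partner
    if depth != 0:
        problems.append("unbalanced parentheses")

    for word in re.findall(r"[^\s()]+", query):
        # Rule: AND and OR must be written in uppercase.
        if word.upper() in ("AND", "OR") and word != word.upper():
            problems.append(f"operator not in uppercase: {word}")
        # Rule: operators must not be translated.
        elif word in TRANSLATED_OPERATORS:
            problems.append(f"possibly translated operator: {word}")
    return problems
```

Calling `check_query("(cat AND dog) OR bird)")` reports the unbalanced parentheses, and `check_query("cat or dog")` reports the lowercase operator. Note that a lowercase "or" may also be a legitimate keyword in some languages, so such a check can only warn.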

Wildcard

Wildcard search can improve the search results when you do not know the entire keyword. Use ? (question mark) to replace exactly one character and * (asterisk) to replace zero or more characters.

Example: ois?au dat*

The example above returns datasets that contain, for instance, oisiau data, oisoau dataset, or oiseau datasets. Unlike basic search, wildcard search does not reduce a keyword to its root form, since that cannot be done reliably for a term in which some letters are unknown.

Please note that the more candidate keywords have to be checked for a match (e.g. a*, b*, or c*), the slower the wildcard search becomes. To avoid such an expensive process, a wildcard at the beginning of a keyword (e.g. *ing or ?iseau) is ignored and the search term is treated as it is. Invalid wildcard search terms, e.g. ois?au OR (data* AND organization (unbalanced parentheses), are also treated as they are.
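The meaning of ? and * can be tried out with Python's standard `fnmatch` module, which uses the same two wildcard characters. This is only a sketch of the matching semantics described above, not the portal's search engine; the function name is hypothetical, and treating the match as case-insensitive is an assumption.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, word):
    """Match a single keyword against a pattern using ? and *."""
    if pattern[:1] in ("?", "*"):
        # Leading wildcard: per the text, the term is treated as it is,
        # so the wildcard character is compared literally.
        return pattern.lower() == word.lower()
    # fnmatch also gives "[" a special meaning; escape it so that only
    # ? and * act as wildcards here.
    safe_pattern = pattern.lower().replace("[", "[[]")
    return fnmatchcase(word.lower(), safe_pattern)
```

With this sketch, `wildcard_match("ois?au", "oiseau")` and `wildcard_match("dat*", "datasets")` are both true, while `wildcard_match("*ing", "walking")` is false because the leading wildcard is not expanded.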

Exact Phrase

When you place your keywords in double quotes, they are treated as a phrase: the match is not case sensitive, and all characters are taken literally. The search returns only datasets that contain the keywords in exactly the same order.

Example: "Manual public space"

The example above returns datasets containing exactly “manual public space”. It will not return datasets containing “manual space public”, or “space public manual”, or any other results with the search terms appearing in different sequences than “manual-public-space”. If you search for "public spa*", the results will only list datasets containing “public spa*”.
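The behaviour described above (same order, not case sensitive, wildcard characters taken literally) can be sketched as a comparison of consecutive words. The function name is hypothetical, and splitting on whitespace only is a simplification; how the portal handles punctuation is not stated in the text.

```python
def contains_phrase(text, phrase):
    """Return True if the words of phrase appear consecutively in text."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words over the text and compare it literally.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))
```

Here `contains_phrase("A manual public space guide", "Manual public space")` is true, whereas the reordered text "manual space public" does not match, and "public spa*" matches only text that literally contains the asterisk.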

SPARQL search enables more advanced users to find datasets using SPARQL, the query language for the Resource Description Framework (RDF). SPARQL can help to find specific information in a large amount of RDF data, even if the data is organized in a complex way. For more information see the data.europa.eu SPARQL, and the SPARQL search.
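As a rough illustration of what a SPARQL request looks like, the sketch below builds the URL for a query that lists a few catalogues and their titles. The endpoint address is an assumption and should be checked against the portal's own SPARQL documentation; the query is an example written for this illustration using the DCAT and Dublin Core vocabularies. The code only builds the request URL and does not send it.

```python
from urllib.parse import urlencode

# Assumed endpoint address; verify it in the portal documentation before use.
ENDPOINT = "https://data.europa.eu/sparql"

# Example query: up to ten catalogues with their titles.
QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>

SELECT ?catalogue ?title WHERE {
  ?catalogue a dcat:Catalog ;
             dct:title ?title .
}
LIMIT 10
""".strip()

def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL can be opened with any HTTP client. SPARQL endpoints generally let the client choose the result format through the Accept header, for example application/sparql-results+json.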

Terms of reuse

Most of the data accessible via data.europa.eu is released by the respective data providers using an open licence. Data can be used for free for commercial and non-commercial purposes, provided the source is acknowledged. Specific conditions for reuse, relating mostly to the protection of data privacy and intellectual property, apply to a small amount of data. A link to these conditions can be found for each dataset.

The terms of use can be found in the data.europa.eu copyright notice. As of September 2021, the most common open licences were the Creative Commons ‘CC‑BY‑4.0’ licence, the ‘Data licence Germany – attribution’ licence and Etalab’s Open Licence (used by the French government).
