Title: | Fast and Easy Eurostat Data Import and Search |
---|---|
Description: | Interface to Eurostat’s API (SDMX 2.1) with fast data.table-based import of data, labels, and metadata. On top of the core functionality, data search and data description/comparison functions are also provided. Use <https://github.com/alekrutkowski/eurodata_codegen> — a point-and-click app for rapid and easy generation of richly-commented R code — to import a Eurostat dataset or its subset (based on the eurodata::importData() function). |
Authors: | Aleksander Rutkowski [aut, cre] |
Maintainer: | Aleksander Rutkowski <[email protected]> |
License: | GPL-2 |
Version: | 1.7.0 |
Built: | 2024-11-18 16:12:24 UTC |
Source: | https://github.com/alekrutkowski/eurodata |
Some manipulations of the EurostatDataList data.frame (imported with importDataList), e.g. filtering with package dplyr, may remove the S3 class tag EurostatDataList. This function coerces it back to EurostatDataList after checking that the critical columns (PCode, Dataset name, Link) are present. This is useful if a user wants to print and browse this filtered data.frame as a specially formatted HTML table.
as.EurostatDataList(x, SearchCriteria = "", ...)
x | A (most likely filtered subset of) |
SearchCriteria | A string describing the search criteria used for filtering/subsetting. |
... | Additional arguments to be passed to or from methods (currently not used). |
A data.frame of S3 class EurostatDataList.
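The loss of the class tag and the re-tagging step can be reproduced offline with base R. This is only a sketch on made-up data: retag_data_list() is a hypothetical stand-in and not the package's implementation, the toy rows and links are invented, and the column names are assumed to be exactly the critical columns listed for this function.

```r
# Toy stand-in for the inventory returned by importDataList();
# the rows and the Link values are made up for illustration.
inventory <- data.frame(
  PCode = c("nama_10_gdp", "bop_its6_det"),
  `Dataset name` = c("GDP and main components",
                     "International trade in services"),
  Link = c("https://example.org/a", "https://example.org/b"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
class(inventory) <- c("EurostatDataList", "data.frame")

# as.data.frame() strips any classes that precede "data.frame",
# which is one way the tag can get lost.
plain <- as.data.frame(inventory)

# Hypothetical helper: check the critical columns, then re-add the tag.
retag_data_list <- function(x,
                            required = c("PCode", "Dataset name", "Link")) {
  missing_cols <- setdiff(required, colnames(x))
  if (length(missing_cols) > 0)
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  class(x) <- unique(c("EurostatDataList", class(x)))
  x
}

restored <- retag_data_list(plain)
```

With the package loaded, the equivalent step would be as.EurostatDataList(plain), optionally with a SearchCriteria string describing the filter.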
Search Eurostat datasets and see the result as a table in a browser
browseDataList(subs)
subs | An expression to be passed to |
Side effect (via print) – a table opened in a browser via browseURL.
Value – a list with:
criteria – a string, search criteria,
time – the time of the query,
df – a data.frame, imported via importDataList and filtered based on the conditions specified in subs,
html – a string, with the HTML code that generated the table in a browser.
## Not run:
browseDataList(grepl('servic',`Dataset name`))
browseDataList(grepl('bop',Code) & !grepl('its',Code))
## End(Not run)
## Not run:
browseDataList(grepl('GDP',`Dataset name`) & grepl('main',`Dataset name`) & grepl('international',`Dataset name`) & !grepl('quarterly',`Dataset name`))
browseDataList(grepl('bop',Code) & grepl('its',Code))
## End(Not run)
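To see what such conditions select without opening a browser or contacting Eurostat, they can be tried on a small made-up table. The sketch below assumes the expression is evaluated with the inventory columns in scope, much as base::subset() does; toy_inventory and its rows are invented for illustration.

```r
# Toy stand-in for the inventory returned by importDataList(); rows are made up.
toy_inventory <- data.frame(
  Code = c("bop_its6_det", "bop_c6_q", "nama_10_gdp"),
  `Dataset name` = c("International trade in services",
                     "Balance of payments by country",
                     "GDP and main components"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# The same style of condition as in the examples above,
# evaluated here with base::subset() instead of browseDataList().
bop_without_its <- subset(toy_inventory,
                          grepl('bop', Code) & !grepl('its', Code))
services <- subset(toy_inventory, grepl('servic', `Dataset name`))

bop_without_its$Code
services$Code
```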
Compare specific Eurostat datasets on the basis of information from Metabase
compare(..., import_labels = TRUE, import_dim_labels = TRUE)
... | Two or more Eurostat dataset code names, e.g. |
import_labels | Boolean: should labels for the codes inside dimensions be imported. Default: |
import_dim_labels | Boolean: should the dimensions (e.g. |
A data.table with columns Dim_name, Dim_name_label (if import_dim_labels=TRUE), Dim_val, Dim_val_label (if import_labels=TRUE), and logical columns corresponding to the dataset names in ..., indicating in which datasets a given dimension and dimension value appear and in which they do not.
## Not run:
compare('nama_10_gdp', 'nama_10_pe')
## End(Not run)
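The shape of the returned table (one row per dimension/value pair, one logical column per dataset) can be illustrated offline. The sketch below uses base R data.frames rather than data.table, and a made-up three-column table in the Code / Dim_name / Dim_val layout; the dataset codes ds_a and ds_b are hypothetical, and this is not the package's own code.

```r
# Made-up Metabase-like table: which dataset has which dimension values.
toy_metabase <- data.frame(
  Code     = c("ds_a", "ds_a", "ds_a", "ds_b", "ds_b"),
  Dim_name = c("geo",  "geo",  "unit", "geo",  "na_item"),
  Dim_val  = c("AT",   "BG",   "EUR",  "AT",   "B1G"),
  stringsAsFactors = FALSE
)

# One row per distinct (dimension, value) pair across all datasets.
comparison <- unique(toy_metabase[, c("Dim_name", "Dim_val")])

# Helper building a single key string per (dimension, value) pair.
pair_key <- function(d) paste(d$Dim_name, d$Dim_val, sep = "|")

# Add one logical column per dataset: does the pair occur in that dataset?
for (code in c("ds_a", "ds_b")) {
  in_dataset <- toy_metabase[toy_metabase$Code == code, ]
  comparison[[code]] <- pair_key(comparison) %in% pair_key(in_dataset)
}

comparison
```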
Describe a given Eurostat dataset on the basis of information from Metabase
describe(EurostatDatasetCode, import_labels = !wide, wide = FALSE, import_dim_labels = TRUE)
EurostatDatasetCode | A string with Eurostat dataset code name, e.g. |
import_labels | Boolean: should labels for the codes inside dimensions be imported. Default: if |
wide | Boolean: should each dimension be compressed to one row and all values within each dimension to a single, comma-separated string. Default: |
import_dim_labels | Boolean: should the dimensions (e.g. |
A data.table with columns Dim_name, Dim_name_label (if import_dim_labels=TRUE), either Dim_val (if wide=FALSE) or Dim_values (if wide=TRUE), Dim_val_label (if import_labels=TRUE), and a column whose name is the value of EurostatDatasetCode, with all its values equal to TRUE.
## Not run:
describe('nama_10_gdp')
## End(Not run)
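The difference between the long layout (wide=FALSE) and the compressed layout (wide=TRUE) can be sketched on invented data. The block below is a base R illustration of collapsing all values of each dimension into one comma-separated string; the toy rows are made up and the code is not taken from the package.

```r
# Made-up long description of one dataset: one row per dimension value.
long_view <- data.frame(
  Dim_name = c("geo", "geo", "unit"),
  Dim_val  = c("AT",  "BG",  "EUR"),
  stringsAsFactors = FALSE
)

# Collapse the values of each dimension into a single comma-separated string.
collapsed <- vapply(split(long_view$Dim_val, long_view$Dim_name),
                    paste, character(1), collapse = ", ")

# One row per dimension, mirroring the Dim_values column of the wide layout.
wide_view <- data.frame(
  Dim_name   = names(collapsed),
  Dim_values = unname(collapsed),
  stringsAsFactors = FALSE
)

wide_view
```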
A tool for a quick ad-hoc search.
find(...)
... | A series of unquoted words to be searched either in Eurostat dataset codes or in dataset full names. All words not preceded by minus (-) will be linked with logical AND; all words preceded by a minus entail exclusion (logical NOT), a bit like in Google search. It is possible to search also with phrases that include spaces – in such a case the phrases should be quoted. Partial word/phrase match is applied. See the examples below. |
Side effect (via print) – a text report file opened via file.show.
Value – a list with:
criteria – a string, search criteria,
time – the time of the query,
df – a data.frame, imported via importDataList and filtered based on the conditions specified in ...,
report – a string, with the text report.
## Not run:
find(bop, its)
find(bop,-ybk,its)
find(nama_)
find(nama,10,64)
find('economic indic')
## End(Not run)
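The AND/NOT logic with partial matching can be mimicked offline on a made-up inventory. In the sketch below, search_toy() is a hypothetical helper that takes the included and excluded terms as character vectors (it does not parse unquoted words the way find() does), and it assumes case-sensitive, literal matching, which may differ from the package's behaviour.

```r
# Made-up inventory; the codes and names are for illustration only.
toy_inventory <- data.frame(
  Code = c("bop_its6_det", "bop_its_ybk", "nama_10_gdp"),
  `Dataset name` = c("International trade in services",
                     "Trade in services, yearbook",
                     "GDP and main components"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows matching every 'include' term and
# none of the 'exclude' terms, looking in the code or in the name.
search_toy <- function(df, include = character(), exclude = character()) {
  hit <- function(term)
    grepl(term, df$Code, fixed = TRUE) |
      grepl(term, df[["Dataset name"]], fixed = TRUE)
  keep <- rep(TRUE, nrow(df))
  for (term in include) keep <- keep & hit(term)   # logical AND
  for (term in exclude) keep <- keep & !hit(term)  # logical NOT
  df[keep, , drop = FALSE]
}

# Rough offline analogue of find(bop, -ybk, its)
result <- search_toy(toy_inventory,
                     include = c("bop", "its"), exclude = "ybk")
result$Code
```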
Download and import a Eurostat dataset
importData(EurostatDatasetCode, filters = NULL)
EurostatDatasetCode | A string (upper/lower-case difference is not relevant) with Eurostat dataset code name, e.g. |
filters | Optional: a list of atomic vectors. The names of the elements of the list should correspond to the names of the dimensions of the dataset (defined in |
A Eurostat dataset as a ‘flat’ data.frame. A ‘flat’ dataset has all numeric values in one column, with each row representing one of the available combinations of all dimensions (e.g. if dimensions are: countries, years, sectors, and indicators, there can be a row for value added in retail in Germany in 2013).
## Not run:
# Full dataset import:
importData('nama_10_gdp')
# Import only a subset of a dataset:
importData('bop_its6_det', filters = list(geo=c('AT','BG'), TIME_PERIOD=2014:2020, bop_item='SC'))
## End(Not run)
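What a 'flat' dataset looks like, and what a filters list selects, can be illustrated without a download. The sketch below builds an invented flat table and applies a named list of allowed values per dimension; apply_filters() is a hypothetical local helper and the column name value is made up, so this only mirrors the idea and says nothing about how importData() passes filters to the Eurostat API.

```r
# Invented 'flat' table: one row per combination of dimension values,
# with all numbers in a single column (here called 'value').
flat <- expand.grid(geo = c("AT", "BG", "DE"),
                    TIME_PERIOD = 2019:2021,
                    stringsAsFactors = FALSE)
flat$value <- seq_len(nrow(flat))  # made-up numbers

# A filters list in the same shape as in the example above:
# element names are dimension names, elements are the values to keep.
filters <- list(geo = c("AT", "BG"), TIME_PERIOD = 2020:2021)

# Hypothetical helper: keep the rows whose value in each named
# dimension is among the allowed values for that dimension.
apply_filters <- function(df, filters) {
  for (dim_name in names(filters))
    df <- df[df[[dim_name]] %in% filters[[dim_name]], , drop = FALSE]
  df
}

flat_subset <- apply_filters(flat, filters)
flat_subset
```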
Import and reshape Eurostat inventory of datasets
importDataList()
The imported data.frame reflects the hierarchical structure of datasets (see the columns "Data subgroup, level 0", "Data subgroup, level 1", "Data subgroup, level 2", etc.). It is tagged with S3 class EurostatDataList.
## Not run:
importDataList()
## End(Not run)
Import the appropriate description file for the selected Eurostat dimension, e.g. for "geo" it is "Geopolitical entity (reporting)", for "nace_r2" it is "Classification of economic activities - NACE Rev.2", for "indic_sb" it is "Economical indicator for structural business statistics", etc. Click on "Code lists" just under "Apply download operations on" at https://ec.europa.eu/eurostat/databrowser/bulk?lang=en for the list of all codes.
Each description is imported from inside the XML file (via the path: m:Structure / m:Structures / s:Codelists / s:Codelist / c:Name xml:lang="en") from the respective URL, e.g. for "geo" it is https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/GEO.
importDimLabel(EurostatDimCode)
EurostatDimCode | A string – the code name of the Eurostat dimension, e.g. |
A character vector of length 1: the label/description of EurostatDimCode.
## Not run:
importDimLabel('nace_r2')
## End(Not run)
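The codelist URL quoted above for "geo" suggests a simple pattern: the base address followed by the dimension code in upper case. The sketch below builds such a URL; codelist_url() is a hypothetical helper, and generalising the pattern from the single "geo" example to other dimensions is an assumption, not something confirmed by the package documentation.

```r
# Hypothetical helper: build the codelist URL for a dimension code,
# assuming the upper-cased code is appended to the base address
# (pattern inferred from the "geo" example only).
codelist_url <- function(dim_code) {
  base <- "https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/codelist/ESTAT/"
  paste0(base, toupper(dim_code))
}

codelist_url("geo")
```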
Import the appropriate ‘code list’ for the selected Eurostat dimension, e.g. geo (countries or other geographic entities), nace_r2 (sectors), indic_sb (indicators), etc.
importLabels(EurostatDimCode)
EurostatDimCode | A string – the code name of the Eurostat dimension, e.g. |
A data.frame with 2 columns: codes (with a name determined by EurostatDimCode) and corresponding labels (named with suffix _labels).
## Not run:
importLabels('nace_r2')
## End(Not run)
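A typical use of such a two-column table is to attach labels to imported data with a join on the code column. The sketch below does this with base::merge() on invented data; the column names geo and geo_labels follow the naming rule described above but are assumptions here, as are all the rows and the value column.

```r
# Made-up label table in the documented two-column shape.
toy_labels <- data.frame(
  geo        = c("AT", "BG"),
  geo_labels = c("Austria", "Bulgaria"),
  stringsAsFactors = FALSE
)

# Made-up flat data with a geo dimension.
toy_data <- data.frame(
  geo         = c("AT", "BG", "AT"),
  TIME_PERIOD = c(2019, 2019, 2020),
  value       = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Left join: keep every data row and add the matching label.
labelled <- merge(toy_data, toy_labels, by = "geo", all.x = TRUE)
labelled
```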
The Eurostat “Metabase” shows which datasets contain which dimensions (where a dimension is e.g. geo or nace_r2 or indic_sb) and, within each dataset and dimension, which codes (e.g. which countries for the geo dimension).
importMetabase()
The imported data.frame, which reflects the hierarchical structure described above. It is a ‘flat’ data.frame with 3 columns, where each row corresponds to a combination of:
Code – Eurostat dataset code names, e.g. "nama_10_a64"
Dim_name – Eurostat dimension code names, e.g. "nace_r2"
Dim_val – Eurostat dimension code values, e.g. "EU28" if Dim_name is "geo"; not to be confused with the actual numeric values in the actual datasets
## Not run:
importMetabase()
## End(Not run)