2.1 Data collection
In this tutorial I’ll retrieve data and metadata on world population between 1960 and 2021 from the World Bank, using the WDI package.
We start by loading the population data and the country metadata from the World Bank.
#population data
country_pop <- WDI::WDI(indicator='SP.POP.TOTL',
                        start=1960,
                        end=2022)
#metadata
country_meta <- WDI::WDI_data$country
Now let’s take a brief look at these two large tables.
#data table
glimpse(country_pop)
## Rows: 16,492
## Columns: 5
## $ country <chr> "Africa Eastern and Southern", "Africa Eastern and Souther…
## $ iso2c <chr> "ZH", "ZH", "ZH", "ZH", "ZH", "ZH", "ZH", "ZH", "ZH", "ZH"…
## $ iso3c <chr> "AFE", "AFE", "AFE", "AFE", "AFE", "AFE", "AFE", "AFE", "A…
## $ year <int> 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012…
## $ SP.POP.TOTL <dbl> 694665117, 677243299, 660046272, 643090131, 626392880, 609…
gt::gt(head(country_pop))
country | iso2c | iso3c | year | SP.POP.TOTL |
---|---|---|---|---|
Africa Eastern and Southern | ZH | AFE | 2021 | 694665117 |
Africa Eastern and Southern | ZH | AFE | 2020 | 677243299 |
Africa Eastern and Southern | ZH | AFE | 2019 | 660046272 |
Africa Eastern and Southern | ZH | AFE | 2018 | 643090131 |
Africa Eastern and Southern | ZH | AFE | 2017 | 626392880 |
Africa Eastern and Southern | ZH | AFE | 2016 | 609978946 |
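The row count is easier to interpret once it is broken down by entity and year. The following is a minimal sketch, not part of the original analysis: `summarise_panel()` is a hypothetical helper, and the toy data frame only copies the six rows displayed above so that the snippet runs without a network connection.

```r
# Hypothetical helper (not part of the WDI package): summarise a country-year panel.
summarise_panel <- function(df) {
  list(
    n_rows     = nrow(df),
    n_entities = length(unique(df$iso3c)),
    first_year = min(df$year),
    last_year  = max(df$year)
  )
}

# Toy data: the six rows printed above, copied by hand.
toy_pop <- data.frame(
  country     = "Africa Eastern and Southern",
  iso2c       = "ZH",
  iso3c       = "AFE",
  year        = 2021:2016,
  SP.POP.TOTL = c(694665117, 677243299, 660046272,
                  643090131, 626392880, 609978946)
)

panel_info <- summarise_panel(toy_pop)
str(panel_info)
```

On the real table, `summarise_panel(country_pop)` would be a quick way to check which years were actually returned: the request asks for data up to 2022, while the first rows shown above start at 2021.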
#metadata table
glimpse(country_meta)
## Rows: 299
## Columns: 9
## $ iso3c <chr> "ABW", "AFE", "AFG", "AFR", "AFW", "AGO", "ALB", "AND", "ARB…
## $ iso2c <chr> "AW", "ZH", "AF", "A9", "ZI", "AO", "AL", "AD", "1A", "AE", …
## $ country <chr> "Aruba", "Africa Eastern and Southern", "Afghanistan", "Afri…
## $ region <chr> "Latin America & Caribbean", "Aggregates", "South Asia", "Ag…
## $ capital <chr> "Oranjestad", "", "Kabul", "", "", "Luanda", "Tirane", "Ando…
## $ longitude <chr> "-70.0167", "", "69.1761", "", "", "13.242", "19.8172", "1.5…
## $ latitude <chr> "12.5167", "", "34.5228", "", "", "-8.81155", "41.3317", "42…
## $ income <chr> "High income", "Aggregates", "Low income", "Aggregates", "Ag…
## $ lending <chr> "Not classified", "Aggregates", "IDA", "Aggregates", "Aggreg…
gt::gt(head(country_meta))
iso3c | iso2c | country | region | capital | longitude | latitude | income | lending |
---|---|---|---|---|---|---|---|---|
ABW | AW | Aruba | Latin America & Caribbean | Oranjestad | -70.0167 | 12.5167 | High income | Not classified |
AFE | ZH | Africa Eastern and Southern | Aggregates | | | | Aggregates | Aggregates |
AFG | AF | Afghanistan | South Asia | Kabul | 69.1761 | 34.5228 | Low income | IDA |
AFR | A9 | Africa | Aggregates | | | | Aggregates | Aggregates |
AFW | ZI | Africa Western and Central | Aggregates | | | | Aggregates | Aggregates |
AGO | AO | Angola | Sub-Saharan Africa | Luanda | 13.242 | -8.81155 | Lower middle income | IBRD |
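The metadata output above suggests two things worth handling before any plotting: regional groupings are marked with `region == "Aggregates"`, and `longitude` and `latitude` are stored as character. Below is a minimal base R sketch of one way to deal with both. It uses a toy table that copies only the first three metadata rows shown above, and the names `toy_meta` and `countries_only` are illustrative, not taken from the original analysis.

```r
# Toy metadata: first three rows of the table above, copied by hand.
toy_meta <- data.frame(
  iso3c     = c("ABW", "AFE", "AFG"),
  country   = c("Aruba", "Africa Eastern and Southern", "Afghanistan"),
  region    = c("Latin America & Caribbean", "Aggregates", "South Asia"),
  longitude = c("-70.0167", "", "69.1761"),
  latitude  = c("12.5167", "", "34.5228"),
  stringsAsFactors = FALSE
)

# Coordinates arrive as text; empty strings become NA when converted.
toy_meta$longitude <- as.numeric(toy_meta$longitude)
toy_meta$latitude  <- as.numeric(toy_meta$latitude)

# Keep only actual countries by dropping the aggregate groupings.
countries_only <- toy_meta[toy_meta$region != "Aggregates", ]
print(countries_only)
```

The same two steps should carry over to the full `country_meta` table, assuming its columns match the `glimpse()` output above.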
Looks like a lot of data! Let’s dive a bit deeper and explore the data visually.