An interface that provides pandas pd.DataFrame objects for all dataset and organisation metadata, and for the datasets themselves, in the data.gov.uk CKAN database.
The main goal is to lower the barrier to entry for exploring the data.gov.uk datasets. The collection contains many web
This can be run as a normal Python script, but for the best experience it is recommended to run it in IPython or another interactive console so that the data can be inspected ad hoc.
import datagovuk as dgu
orgs = dgu.organisation_structure()
print(orgs.head())
rsc = dgu.resources()
# Fetch a dataset from data.gov.uk
organogram_reference = rsc[
    (rsc.format == 'CSV') &
    (rsc.name == 'organogram-uk-statistics-authority')
].iloc[0]
organogram = dgu.resource(organogram_reference)
print(organogram)
>>>                                       highlighted  name                         parent                                title
>>> id
>>> 5ea7a4ac-7455-4ab4-8296-b6b600bf9b6e  False        cranfield-university         fc87db43-996f-442b-b4b2-60f0287a9e22  Cranfield University
>>> a42aa1ab-8fbf-4fdf-bbca-07570caa1cfb  False        university-of-edinburgh      fc87db43-996f-442b-b4b2-60f0287a9e22  University of Edinburgh
>>> fc87db43-996f-442b-b4b2-60f0287a9e22  False        academics                    None                                  Academics
>>> 3a9d8dc4-4f45-4d48-928a-6e3f04449dba  False        crown-prosecution-service    b5dbc6b9-f976-4b78-8bab-2ac41e78ed38  Crown Prosecution Service
>>> 486b7bf1-77d8-4ef2-8722-eab1eaf19b2e  False        government-legal-department  b5dbc6b9-f976-4b78-8bab-2ac41e78ed38  Government Legal Department
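The organisation frame printed above is indexed by id and stores each organisation's parent as another id, so the hierarchy can be explored with plain pandas. The following is a minimal sketch under that assumption: it rebuilds three of the rows shown above by hand (a real frame would come from dgu.organisation_structure()), and the parent_title column is a name introduced here for illustration only.

```python
import pandas as pd

# Three rows copied from the printed output above; in practice this frame
# would be returned by dgu.organisation_structure().
orgs = pd.DataFrame(
    {
        "highlighted": [False, False, False],
        "name": ["cranfield-university", "university-of-edinburgh", "academics"],
        "parent": [
            "fc87db43-996f-442b-b4b2-60f0287a9e22",
            "fc87db43-996f-442b-b4b2-60f0287a9e22",
            None,
        ],
        "title": ["Cranfield University", "University of Edinburgh", "Academics"],
    },
    index=pd.Index(
        [
            "5ea7a4ac-7455-4ab4-8296-b6b600bf9b6e",
            "a42aa1ab-8fbf-4fdf-bbca-07570caa1cfb",
            "fc87db43-996f-442b-b4b2-60f0287a9e22",
        ],
        name="id",
    ),
)

# Resolve each parent id to its title by looking it up in the id -> title
# mapping. Top-level organisations have no parent and end up with NaN.
orgs["parent_title"] = orgs["parent"].map(orgs["title"])

# Select the direct children of one organisation, found by its short name.
academics_id = orgs.index[orgs["name"] == "academics"][0]
children = orgs[orgs["parent"] == academics_id]
print(children[["title", "parent_title"]])
```

Because the lookup goes through the index, it only works if the ids are unique, which is what the output above suggests.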
- datagovuk.organisation_structure()
- datagovuk.organisations_groups()
- datagovuk.organisations_users()
- datagovuk.organisations()
- datagovuk.datasets()
- datagovuk.resources()
- datagovuk.resource(df: pd.DataFrame)
Using the datagovuk.resource(...) method requires a handler to be written that understands the dataset being imported.
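How handlers are actually registered is not described in this section, so the following is only a hypothetical sketch of the general idea: a function that turns the raw bytes of a resource into a DataFrame, chosen by the resource's format. The names csv_handler, HANDLERS and parse_resource are assumptions made for illustration and are not part of datagovuk; the sample bytes are toy data.

```python
import io

import pandas as pd


def csv_handler(raw: bytes) -> pd.DataFrame:
    """Parse raw CSV bytes into a DataFrame (hypothetical handler)."""
    return pd.read_csv(io.BytesIO(raw))


# Hypothetical registry mapping a resource format to its handler.
# None of these names come from datagovuk.
HANDLERS = {"CSV": csv_handler}


def parse_resource(fmt: str, raw: bytes) -> pd.DataFrame:
    """Dispatch raw bytes to the handler registered for the given format."""
    try:
        handler = HANDLERS[fmt.upper()]
    except KeyError:
        raise ValueError(f"no handler registered for format {fmt!r}") from None
    return handler(raw)


# Toy input standing in for a downloaded CSV resource.
sample = b"name,value\na,1\nb,2\n"
frame = parse_resource("csv", sample)
print(frame)
```

Keeping one small function per format means an unsupported format fails with a clear error instead of producing a half-parsed frame.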
datagovuk
uses a