A Ruby library for working with Data Packages.
The library is intended to support:
- Parsing and using data package metadata and data
- Validating data packages to ensure they conform with the Data Package specification
Add the gem into your Gemfile:
Require the gem, if you need to:
Parsing a data package descriptor from a remote location:
This assumes that
Similarly you can load a package descriptor from a local JSON file.
The data package descriptor, i.e. the datapackage.json file, is expected to be at the root directory of the data package, and the path attribute of the package's resources will be resolved relative to it.
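As a rough illustration of that resolution rule, the sketch below joins a resource path to the directory that holds the descriptor. The helper name resolve_resource_path and the example paths are hypothetical and are not part of the gem's API; the gem performs its own resolution internally.

```ruby
# Hypothetical helper (not the gem's API): resolve a resource's `path`
# against the directory that contains datapackage.json.
def resolve_resource_path(descriptor_path, resource_path)
  base_dir = File.dirname(File.expand_path(descriptor_path))
  File.expand_path(resource_path, base_dir)
end

# Example paths are made up for illustration.
resolved = resolve_resource_path('/data/my-package/datapackage.json', 'data/prices.csv')
puts resolved
```

Because the base directory is derived from the descriptor's location, moving the whole package directory keeps relative resource paths working.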
You can also load a data package descriptor directly from a Hash:
There is a set of helper methods for accessing data from the package, e.g.:
A data package must contain an array of Data Resources.
You can access the resources in your Data Package either by their name or by their index in the resources array.
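The two lookup styles can be sketched on a raw descriptor Hash using only the Ruby standard library. This is an illustration of the idea, not the gem's accessor methods; the package name, resource names and paths are made up.

```ruby
require 'json'

# A made-up descriptor with two resources.
descriptor = JSON.parse(<<~DESCRIPTOR)
  {
    "name": "example-package",
    "resources": [
      { "name": "prices", "path": "data/prices.csv" },
      { "name": "regions", "path": "data/regions.csv" }
    ]
  }
DESCRIPTOR

resources = descriptor.fetch('resources')

# Lookup by position in the resources array.
by_index = resources[0]

# Lookup by the resource's name; returns nil when no resource matches.
by_name = resources.find { |resource| resource['name'] == 'regions' }

puts by_index['path']
puts by_name['path']
```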
You can then read the source depending on its type. For example, if the resource is local and not multipart, it can be opened as a file:
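A minimal, standard-library-only sketch of that case follows. It assumes a resource counts as local and non-multipart when its path is a single String that is not an http(s) URL; the helper local_single_file? and the sample file are hypothetical and do not reflect how the gem itself classifies resources.

```ruby
require 'tmpdir'

# Hypothetical check (an assumption for this sketch): a single String path
# that is not an http(s) URL is treated as local and non-multipart.
def local_single_file?(resource)
  path = resource['path']
  path.is_a?(String) && !path.match?(%r{\Ahttps?://})
end

lines = Dir.mktmpdir do |package_dir|
  # Create a small sample file so the example is self-contained.
  File.write(File.join(package_dir, 'prices.csv'), "item,price\napple,1.20\n")
  resource = { 'name' => 'prices', 'path' => 'prices.csv' }

  if local_single_file?(resource)
    # Open the resource like any other file, relative to the package root.
    File.open(File.join(package_dir, resource['path'])) do |file|
      file.readlines(chomp: true)
    end
  else
    []
  end
end

puts lines
```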
See the TableSchema documentation for other things you can do with a tabular resource.
If the resource is valid, it will be added to the resources array of the Data Package; if it's invalid, it will not be added, and you should try creating and validating your resource to see why it fails.
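The add-only-if-valid behavior described above can be sketched as follows. The validity rule here (a name plus exactly one of path or data) is a simplified stand-in chosen for the example, and both helper names are hypothetical; the gem validates resources against their profile instead.

```ruby
# Simplified stand-in for profile validation (an assumption for this sketch):
# a resource needs a name and exactly one of `path` or `data`.
def resource_valid?(resource)
  resource.key?('name') && (resource.key?('path') ^ resource.key?('data'))
end

# Hypothetical helper: append the resource only when it is valid.
# Returns the resource on success and nil when it was rejected.
def add_resource(package, resource)
  return nil unless resource_valid?(resource)

  package['resources'] << resource
  resource
end

package = { 'name' => 'example-package', 'resources' => [] }

added    = add_resource(package, { 'name' => 'prices', 'path' => 'data/prices.csv' })
rejected = add_resource(package, { 'path' => 'data/unnamed.csv' })

puts package['resources'].length
```

Returning nil for a rejected resource makes the failure visible to the caller, who can then validate the resource separately to find out why it was refused.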
Data Package and Data Resource descriptors can be validated against JSON schemas that we call profiles.
The profiles from the registry come bundled with the gem. You can reference them in your Data Package descriptor by their identifier in the registry:
- data-package: the default profile for a Data Package
- data-resource: the default profile for a Data Resource
- tabular-data-package: for a Tabular Data Package
- tabular-data-resource: for a Tabular Data Resource
- fiscal-data-package: for a Fiscal Data Package
If you have a custom profile schema you can reference it by its URL:
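To make the two ways of referencing a profile concrete, the sketch below sets the profile property of a descriptor Hash either to a registry identifier or to a URL, and classifies the value. The helper names and the example.org URL are hypothetical; this is not how the gem resolves profiles, only an illustration of the distinction.

```ruby
require 'uri'

# Registry identifiers listed in this README.
def registry_profile_ids
  %w[
    data-package data-resource tabular-data-package
    tabular-data-resource fiscal-data-package
  ]
end

# Hypothetical helper: tell a registry identifier from a custom profile URL.
def profile_kind(profile)
  return :registry if registry_profile_ids.include?(profile)

  URI.parse(profile).is_a?(URI::HTTP) ? :custom_url : :unknown
rescue URI::InvalidURIError
  :unknown
end

by_identifier = { 'name' => 'budget', 'profile' => 'tabular-data-package', 'resources' => [] }
# The URL below is a made-up placeholder for a custom profile schema.
by_url = { 'name' => 'budget', 'profile' => 'https://example.org/profiles/my-profile.json', 'resources' => [] }

puts profile_kind(by_identifier['profile'])
puts profile_kind(by_url['profile'])
```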
Data Resources and Data Packages are validated against their profiles to ensure they respect the expected structure.
The same methods used to check the validity of a Resource (for example iter_errors) are also available for a Package.
The difference is that after a Package descriptor is validated against its profile, each of its resources is also validated against its own profile.
In order for a Package to be valid all its Resources have to be valid.
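The two-stage rule (descriptor first, then every resource) can be sketched with plain Ruby. The checks below are deliberately simplified stand-ins for JSON schema validation, and all three method names are hypothetical rather than the gem's API.

```ruby
# Stand-in for validating the package descriptor against its profile:
# here we only require a non-empty resources array.
def descriptor_ok?(package)
  package['resources'].is_a?(Array) && !package['resources'].empty?
end

# Stand-in for validating a resource against its own profile.
def resource_ok?(resource)
  resource.key?('name') && (resource.key?('path') || resource.key?('data'))
end

# A package is valid only if its descriptor passes and all resources pass.
def package_valid?(package)
  descriptor_ok?(package) && package['resources'].all? { |resource| resource_ok?(resource) }
end

good = { 'resources' => [{ 'name' => 'prices', 'path' => 'data/prices.csv' }] }
bad  = { 'resources' => [{ 'name' => 'prices', 'path' => 'data/prices.csv' }, { 'name' => 'broken' }] }

puts package_valid?(good)
puts package_valid?(bad)
```

A single invalid resource is enough to make the whole package invalid, which mirrors the rule stated above.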
These notes are intended for people who want to contribute to this package. If you just want to use it, you can safely ignore them.
After checking out the repo, run bundle to install dependencies. Then, run rake spec to run the tests.
To install this gem onto your local machine, run bundle exec rake install.
To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
We cache the local schemas from https://specs.frictionlessdata.io/schemas/registry.json. The local schemas should be kept up to date with the remote ones using: