Table schema tooling in Go.
This package uses semantic versioning 2.0.0.
Please use go modules if you're using a version that supports it. To know which version of Go you're running, please run:
If you're running go1.13+, you're good to go!
If you can not upgrade right now, you need to make sure your environment is using Go modules by setting the
GO111MODULE environment variable. In bash, that could be done with the following command:
Have tabular data stored in local files? Remote files? Packages like the csv are going to help on loading the data you need and making it ready for processing.
Supported physical representations:
You would like to use tableschema-go but the physical representation you use is not listed here? No problem! Please create an issue before start contributing. We will be happy to help you along the way.
Got that new dataset and wants to start getting your hands dirty ASAP? No problems, let the schema package try to infer the data types based on the table data.
Want to go faster? Please give InferImplicitCasting a try and let us know how it goes.
There might be cases in which the inferred schema is not correct. One of those cases is when your data use strings like "N/A" to represent missing cells. That would usually make our inferential algorithm think the field is a string.
When that happens, you can manually perform those last minutes tweaks Schema.
After all that, you could persist your schema to disk:
And use the local schema later:
Finally, if your schema is saved remotely, you can also use it:
If you have a lot of data and can no load everything in memory, you can easily iterate trough it:
If you store data in a GZIP file, you can load it compressed using the same
Even better if you could do it regardless the physical representation! The table package declares some interfaces that will help you to achieve this goal:
Class represents field in the schema.
For example, data values can be castd to native Go types. Decoding a value will check if the value is of the expected type, is in the correct format, and complies with any constraints imposed by a schema.
The following example will raise exception the passed-in is less than allowed by
minimum constraints of the field.
Errors will be returned as well when the user tries to cast values which are not well formatted dates.
Values that can't be castd will return an
Casting a value that doesn't meet the constraints will return an
Available types, formats and resultant value of the cast:
|geopoint||default, array, object||[float64, float64]|
|string||default, uri, email, binary||string|
|date||default, any, \<PATTERN>||time.Time|
|datetime||default, any, \<PATTERN>||time.Time|
|time||default, any, \<PATTERN>||time.Time|
Once you're done processing the data, it is time to persist results. As an example, let us assume we have a remote table schema called
summary, which contains two fields:
More detailed documentation about API methods and plenty of examples is available at https://godoc.org/github.com/frictionlessdata/tableschema-go
Found a problem and would like to fix it? Have that great idea and would love to see it in the repository?
Please open an issue before start working
That could save a lot of time from everyone and we are super happy to answer questions and help you alonge the way. Furthermore, feel free to join frictionlessdata Gitter chat room and ask questions.
This project follows the Open Knowledge International coding standards
Before start coding:
- Fork and pull the latest version of the master branch
- Make sure you have go 1.8+ installed and you're using it
- Make sure you dep installed
Before sending the PR:
And make sure your all tests pass.