A Java library for working with Table Schema. Snapshots are available on JitPack.
tableschema-java is a library for parsing CSV and JSON-Array documents into a tabular format according to Table Schema, a format definition based on JSON Schema.
It allows you to read and write tabular data with guarantees of format integrity. It also allows reading and writing CSV free-form, i.e. without a schema.
Cast data from a CSV without a schema:
You can write a `Table` into a CSV file:
You can build a `Schema` instance from scratch or modify an existing one:
You can also build a `Schema` instance with `JSONObject` instances instead of
When using the `addField` method, the schema undergoes validation after every field addition.
If adding a field causes the schema to fail validation, then the field is automatically removed.
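The add-then-validate behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the library: `SketchSchema`, its validation rule (field names must be non-blank and unique), and the boolean return value are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a schema; NOT the tableschema-java Schema class.
class SketchSchema {
    private final List<String> fieldNames = new ArrayList<>();

    // Assumed validation rule for this sketch: names must be non-blank and unique.
    private boolean isValid() {
        Set<String> seen = new HashSet<>();
        for (String name : fieldNames) {
            if (name == null || name.trim().isEmpty() || !seen.add(name)) {
                return false;
            }
        }
        return true;
    }

    // Adds the field, re-validates the whole schema, and removes the field
    // again if the schema no longer validates.
    boolean addField(String name) {
        fieldNames.add(name);
        if (!isValid()) {
            fieldNames.remove(fieldNames.size() - 1);
            return false;
        }
        return true;
    }

    List<String> getFieldNames() {
        return new ArrayList<>(fieldNames);
    }

    public static void main(String[] args) {
        SketchSchema schema = new SketchSchema();
        System.out.println(schema.addField("id"));   // true
        System.out.println(schema.addField("id"));   // false: duplicate is rolled back
        System.out.println(schema.getFieldNames());  // [id]
    }
}
```

The point of the sketch is that a rejected field leaves the schema exactly as it was before the call.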
Alternatively, you might want to build your `Schema` by loading the schema definition from a JSON file:
If you don't have a schema for a CSV and don't want to manually define one then you can generate it:
The type inference algorithm tries to cast each value to the available types, and every successful cast increments a popularity score for that type. At the end, the type with the best score is returned. The algorithm traverses all of the table's rows and attempts to cast every single value. When dealing with large tables, you might want to limit the number of rows that the inference algorithm processes:
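As an illustration of the scoring idea only, and not the library's own code, the sketch below infers the type of a single column. Everything in it is an assumption: the class `InferenceSketch`, the small set of candidate types, the rule that candidates are tried from most to least specific with only the first successful cast scoring a point, and the convention that a negative `rowLimit` means "all rows".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the tableschema-java TypeInferrer.
class InferenceSketch {
    // Candidate types, ordered from most to least specific (assumed subset).
    static final List<String> TYPES = Arrays.asList("integer", "number", "boolean", "string");

    // Returns true if the (non-null) value can be cast to the named type.
    static boolean castsTo(String type, String value) {
        try {
            switch (type) {
                case "integer": Long.parseLong(value); return true;
                case "number": Double.parseDouble(value); return true;
                case "boolean":
                    return value.equalsIgnoreCase("true") || value.equalsIgnoreCase("false");
                default: return true; // "string" accepts anything
            }
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Scores each type by how many values it was the first successful cast for,
    // looking at no more than rowLimit values (negative means no limit).
    static String inferType(List<String> columnValues, int rowLimit) {
        int limit = rowLimit < 0 ? columnValues.size() : Math.min(rowLimit, columnValues.size());
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (int i = 0; i < limit; i++) {
            for (String type : TYPES) {
                if (castsTo(type, columnValues.get(i))) {
                    scores.merge(type, 1, Integer::sum);
                    break;
                }
            }
        }
        String best = "string"; // fallback when no rows were examined
        int bestScore = 0;
        for (Map.Entry<String, Integer> entry : scores.entrySet()) {
            if (entry.getValue() > bestScore) {
                best = entry.getKey();
                bestScore = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("a", "b", "1", "2", "3");
        System.out.println(inferType(column, -1)); // integer (3 votes against 2)
        System.out.println(inferType(column, 2));  // string (only "a" and "b" are seen)
    }
}
```

The second call shows the trade-off of a row limit: inference is cheaper, but the result depends on how representative the first rows are.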
If `List<Object> data` and `String headers` are available, the schema can also be inferred from a `Schema` object:
Row limit can also be set:
Using an instance of `Table` or `Schema` to infer a schema invokes the same method from the `TypeInferrer` singleton:
You can write a `Schema` into a JSON file:
If you have a schema, you can pass it as a parameter when creating the `Table` instance so that the data from the CSV will be cast into the field types defined in the schema:
To check if a given set of values complies with the schema, you can use
If a value in the given set of values cannot be cast to its expected type as defined by the schema, then an `InvalidCastException` is thrown.
Data values can be cast to native Java objects with a Field instance. This allows formats and constraints to be defined for the field in the field descriptor:
Casting a value will check the value is of the expected type, is in the correct format, and complies with any constraints imposed in the descriptor.
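The three checks (type, format, constraints) can be sketched in plain Java for a date value. This is an illustration under stated assumptions rather than the library's `Field` API: the class `CastSketch`, the method `castDate`, the use of a `java.time` pattern such as `yyyy-MM-dd` as the "format", the single `minimum` constraint, and the standard exceptions thrown are all chosen for this example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical sketch; NOT the tableschema-java Field class.
class CastSketch {
    // Casts a string to a LocalDate using the given pattern, then optionally
    // checks a "minimum" constraint (pass null for no minimum).
    static LocalDate castDate(String value, String pattern, LocalDate minimum,
                              boolean enforceConstraints) {
        LocalDate date;
        try {
            // Type and format check: the value must parse with the pattern.
            date = LocalDate.parse(value, DateTimeFormatter.ofPattern(pattern));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException(
                "Cannot cast '" + value + "' using format " + pattern, e);
        }
        // Constraint check: skipped entirely when enforceConstraints is false.
        if (enforceConstraints && minimum != null && date.isBefore(minimum)) {
            throw new IllegalStateException(
                "Value " + date + " violates minimum " + minimum);
        }
        return date;
    }

    public static void main(String[] args) {
        LocalDate minimum = LocalDate.of(2021, 1, 1);
        // Constraints ignored: the cast succeeds although the date is too early.
        System.out.println(castDate("2020-01-15", "yyyy-MM-dd", minimum, false));
        try {
            castDate("2020-01-15", "yyyy-MM-dd", minimum, true);
        } catch (IllegalStateException e) {
            System.out.println("constraint violated: " + e.getMessage());
        }
    }
}
```

Keeping the cast failure and the constraint failure as separate exception types lets a caller distinguish malformed data from well-formed data that is merely out of range.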
A value that can't be cast will raise an
By default, casting a value that does not meet the constraints will raise a
Constraints can be ignored by setting a boolean flag to false:
You can call the `checkConstraintViolations` method to find out which constraints are being violated.
The method returns a map of violated constraints:
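To make the idea of a "map of violated constraints" concrete, here is a small stand-alone sketch for integer values. The constraint keys `minimum`, `maximum` and `enum` come from the Table Schema specification; the class `ConstraintSketch`, the method `violations`, and the exact shape of the returned map (constraint name mapped to the constraint's configured value) are assumptions for this example and may differ from what the library returns.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; NOT the library's checkConstraintViolations method.
class ConstraintSketch {
    // Returns the subset of the given constraints that the value violates.
    static Map<String, Object> violations(int value, Map<String, Object> constraints) {
        Map<String, Object> violated = new LinkedHashMap<>();

        Object min = constraints.get("minimum");
        if (min instanceof Integer && value < (Integer) min) {
            violated.put("minimum", min);
        }
        Object max = constraints.get("maximum");
        if (max instanceof Integer && value > (Integer) max) {
            violated.put("maximum", max);
        }
        Object allowed = constraints.get("enum");
        if (allowed instanceof List && !((List<?>) allowed).contains(value)) {
            violated.put("enum", allowed);
        }
        return violated;
    }

    public static void main(String[] args) {
        Map<String, Object> constraints = new LinkedHashMap<>();
        constraints.put("minimum", 5);
        constraints.put("maximum", 10);
        constraints.put("enum", Arrays.asList(5, 7, 9));

        System.out.println(violations(7, constraints)); // {}
        System.out.println(violations(3, constraints)); // {minimum=5, enum=[5, 7, 9]}
    }
}
```

An empty map means the value satisfies every configured constraint.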
`castValue` uses the `TypeInferrer` singleton to cast the given value into the desired type.
For instance, you can use the `TypeInferrer` singleton to cast a String representation of a number into a float like so:
Found a problem and would like to fix it? Have that great idea and would love to see it in the repository?
Please open an issue before you start working.
It could save a lot of time for everyone, and we are happy to answer questions and help you along the way. Furthermore, feel free to join the frictionlessdata Gitter chat room and ask questions.
This project follows the Open Knowledge International coding standards.
Make sure all tests pass.