United front for open, permissive, high quality CJK datasets and clients.
Cihai is a team effort to incubate open, permissive, high-quality CJK
datasets and clients.
- cihai-handbook - convert your CJK dataset to a datapackages-friendly format.
- Official client libraries. cihai-python is a Python client for cihai + datapackages datasets (cjklib style).
- Public datasets maintained by the cihai team. See cihaidata-unihan on GitHub.
Have a CJK dataset? Consider permissively licensing your dataset and adopting datapackages standards.
For an example of a datapackage + cihai enabled dataset, see https://github.com/cihai/cihaidata-unihan.
Cihai CJK datasets follow dataprotocols conventions:
- datapackages format.
- datapackage.json format - has metadata for the source file.
- json table schema - datapackage.json schema information.
- simple data format - scripts/process.py produces data/unihan.csv.
- (optional) PEP 301: python package format - python package installation.
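To make the conventions above concrete, here is a minimal sketch of how a datapackage.json descriptor with a json table schema might describe a CSV file such as data/unihan.csv. The dataset name, the field names (char, ucn, kDefinition), the sample row, and the helper functions are assumptions for illustration only. They are not taken from cihaidata-unihan or from any cihai client.

```python
import csv
import io
import json

# Hypothetical descriptor: the name and field names are illustrative
# assumptions, not the actual cihaidata-unihan metadata.
DESCRIPTOR = {
    "name": "example-cjk-dataset",
    "resources": [
        {
            "path": "data/unihan.csv",
            "schema": {
                "fields": [
                    {"name": "char", "type": "string"},
                    {"name": "ucn", "type": "string"},
                    {"name": "kDefinition", "type": "string"},
                ]
            },
        }
    ],
}


def field_names(descriptor, resource_index=0):
    """Return the column names declared in a resource's table schema."""
    schema = descriptor["resources"][resource_index]["schema"]
    return [field["name"] for field in schema["fields"]]


def read_rows(csv_text, descriptor):
    """Parse CSV text and check its header against the declared schema."""
    reader = csv.DictReader(io.StringIO(csv_text))
    expected = field_names(descriptor)
    if reader.fieldnames != expected:
        raise ValueError(
            "CSV header %r does not match schema %r"
            % (reader.fieldnames, expected)
        )
    return list(reader)


# Illustrative sample row standing in for the output of a processing script.
SAMPLE_CSV = "char,ucn,kDefinition\n好,U+597D,good\n"

rows = read_rows(SAMPLE_CSV, DESCRIPTOR)

# The descriptor itself is plain JSON, which is what datapackage.json holds.
descriptor_json = json.dumps(DESCRIPTOR, indent=2, ensure_ascii=False)
```

Keeping the schema next to the data lets a client check the CSV header before using the rows, which is the main benefit of shipping datapackage.json alongside the dataset.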
Docs: http://cihai.rtfd.org
Changelog: http://cihai.readthedocs.org/en/latest/history.html