Dataset metadata
The corpus publishes standards-based metadata so that search engines, archive platforms, and dataset tools can interpret the source ledger without depending on an upload to a closed platform.
Metadata files
| Metadata file | Media type | Purpose |
|---|---|---|
| Frictionless Data Package | application/json | Portable dataset descriptor for the source ledger, manifest, and audit files. |
| MLCommons Croissant | application/json | Machine learning dataset metadata for discovery and future platform distribution. |
| schema.org Dataset JSON-LD | application/ld+json | Standalone structured-data record for crawlers and archive platforms. |
| DataCite JSON | application/json | Citation-oriented metadata for DOI/archive preparation. |
| RO-Crate | application/ld+json | Research Object Crate metadata tying the corpus, files, and Packrift publisher record together. |
| Kaggle metadata draft | application/json | Draft upload metadata; publication still requires a license decision and Kaggle account authentication. |
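
As an illustration of the schema.org Dataset JSON-LD row, the sketch below builds a minimal record in Python. It is not the published file: the function name `build_dataset_jsonld`, the dataset name, the description, and the URL are placeholders assumed for the example. Only the publisher name (Packrift) comes from this page, and the `license` property is left out because this release declares no open-data license.

```python
import json


def build_dataset_jsonld(name, description, url, publisher):
    """Return a minimal schema.org Dataset record as a plain dict.

    Hypothetical helper for illustration; the real record may carry
    more properties than the few shown here.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "publisher": {"@type": "Organization", "name": publisher},
        # No "license" key: this release declares no open-data license.
    }


record = build_dataset_jsonld(
    name="Example source ledger",  # placeholder, not the real title
    description="Placeholder description of the corpus.",
    url="https://example.org/dataset",  # placeholder URL
    publisher="Packrift",
)

# Serialize as it would be served with the application/ld+json media type.
jsonld_text = json.dumps(record, indent=2)
print(jsonld_text)
```

Keeping the record as a plain dict until the final `json.dumps` call makes it easy to add or drop properties, such as a license, once a decision has been made.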
License status
No separate open-data license is declared in this release. Dataset-platform publication still requires a Packrift license decision plus account authentication where applicable.
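
The two publication requirements above could be enforced with a small pre-publication check. The following is a sketch under stated assumptions, not part of the release tooling: the function `publication_blockers`, the `license` field name, and the `authenticated` flag are hypothetical, and a given platform's metadata format may name its license field differently.

```python
def publication_blockers(metadata, authenticated):
    """List the reasons a dataset-platform upload should not proceed.

    `metadata` is a dict loaded from a draft metadata file; the
    "license" key is an assumed field name for this example.
    """
    blockers = []
    if not metadata.get("license"):
        blockers.append("license decision pending")
    if not authenticated:
        blockers.append("platform authentication missing")
    return blockers


# Placeholder draft with no license declared, as in this release.
draft = {"title": "Example corpus"}
print(publication_blockers(draft, authenticated=False))
```

An empty list means both requirements are met; any other result names what still has to be resolved before publishing.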