Packrift optimization benchmark corpus

Dataset metadata

The corpus publishes standards-based metadata so that search engines, archive platforms, and dataset tools can read and index the source ledger without depending on an upload to a closed platform.

Metadata files

File | Format | Purpose
Frictionless Data Package | application/json | Portable dataset descriptor for the source ledger, manifest, and audit files.
MLCommons Croissant | application/json | Machine learning dataset metadata for discovery and future platform distribution.
schema.org Dataset JSON-LD | application/ld+json | Standalone structured-data record for crawlers and archive platforms.
DataCite JSON | application/json | Citation-oriented metadata for DOI/archive preparation.
RO-Crate | application/ld+json | Research Object Crate metadata tying the corpus, files, and Packrift publisher record together.
Kaggle metadata draft | application/json | Draft upload metadata; publication still requires a license decision and Kaggle auth.
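
To illustrate what a standalone structured-data record of this kind might look like, here is a minimal Python sketch that builds a schema.org Dataset JSON-LD object. It is not the published record: the helper name build_dataset_jsonld and the description text are hypothetical placeholders, and only the corpus name and publisher name come from this page. The sketch leaves out a license property because this release declares none.

```python
import json


def build_dataset_jsonld(name, description, publisher_name):
    """Return a minimal schema.org Dataset record as a plain dict.

    Hypothetical helper for illustration only; the corpus's real
    JSON-LD file may carry more properties than shown here.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "publisher": {"@type": "Organization", "name": publisher_name},
        # No "license" key is emitted: no separate open-data license
        # is declared in this release.
    }


record = build_dataset_jsonld(
    name="Packrift optimization benchmark corpus",
    # Placeholder text, not the published description.
    description="Placeholder description; replace with the published text.",
    publisher_name="Packrift",
)

# Serialize as it might be served with the application/ld+json media type.
print(json.dumps(record, indent=2))
```

A crawler that understands schema.org would read the @type and name properties first, so those are the minimum worth getting right before adding distribution or citation details.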

License status

No separate open-data license is declared in this release. Dataset-platform publication still requires a Packrift license decision plus account authentication where applicable.
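
Because publication depends on a license decision, a small pre-publication check can flag descriptors that still lack one. The Python sketch below assumes each metadata file has already been parsed into a dict. The function name declares_license is hypothetical, and the key names it inspects ("license", "licenses", "rightsList") are assumptions about where each format usually records licensing at the top level; verify them against the actual files before relying on the result.

```python
# Top-level keys assumed to carry license information in the formats
# listed above. These are assumptions for illustration, not a
# specification of the corpus's files.
LICENSE_KEYS = ("license", "licenses", "rightsList")


def declares_license(descriptor):
    """Return True if the parsed descriptor has a non-empty license field."""
    for key in LICENSE_KEYS:
        value = descriptor.get(key)
        # Empty strings, empty lists, and None all count as undeclared.
        if value:
            return True
    return False


# Usage with hypothetical descriptors:
undeclared = {"name": "example-dataset"}
declared = {"name": "example-dataset", "licenses": [{"name": "CC0-1.0"}]}

print(declares_license(undeclared))
print(declares_license(declared))
```

For the current release, every descriptor would be expected to come back as undeclared, which matches the status described above.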

Primary corpus files