Exporting data from Arraylake
The Arraylake CLI exports data from an Arraylake Repo to a local filesystem path or an object storage destination.
arraylake repo export repo_name destination
To export data, provide a source repo and storage destination:
repo_name: name of the Arraylake Repo to export, in the format 'org/repo'.
destination: export destination, either an s3:// URL or a filesystem path, for example 's3://my/bucket/repo.zarr' or local/path.
al repo export earthmover/ocean s3://my-export-target/ocean.zarr
✓ Initializing target at s3://my-export-target/ocean...succeeded
✓ Generating manifest for full export as of 66341d6b46547256bc60254c...succeeded
✓ Checking out earthmover/ocean...succeeded
⠴ Exporting chunks to s3://my-export-target/ocean... 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1559/1559 0:01:07 0:00:39
Export summary for earthmover/ocean
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Stat ┃ Value ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Objects transferred │ 1559 │
│ Data transferred │ 124.6 MB │
│ Export time │ 39 seconds │
│ Transfer rate │ 3.19 MB/s │
└─────────────────────────────────────────────────┴────────────────────────────┘
If an export operation is interrupted, Arraylake records its progress so that the export can be resumed without repeating the entire operation.
Customize exports
Format
Use the --format flag to choose the export format:
--format: [zarr2 | zarr3alpha]
The default format is zarr2.
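As a rough illustration, an export in the zarr3alpha format might be assembled as in the sketch below. The repo name is reused from the example above, the destination path ./ocean-v3.zarr is hypothetical, and placing the flag after the positional arguments is an assumption. The command is printed rather than executed so that it can be reviewed first.

```shell
# Repo name taken from the earlier example; the destination is hypothetical.
REPO="earthmover/ocean"
DEST="./ocean-v3.zarr"
FORMAT="zarr3alpha"   # documented choices: zarr2 (default) or zarr3alpha

# Assumption: options may follow the positional arguments.
EXPORT_CMD="al repo export $REPO $DEST --format $FORMAT"

# Dry run: print the command. Paste it into a terminal to start the export.
echo "$EXPORT_CMD"
```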
Specifying export versions
Users can export a view of their data from a specific commit or branch. Pass a commit ID or branch name to --ref to export the repo from that reference's state. Pass --from-ref in addition to --ref to export only the changes committed between the two references.
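The two modes described above could be sketched as follows. The branch name main is hypothetical, the commit ID is the one shown in the sample output earlier on this page, and the placement of the flags after the positional arguments is an assumption. Both commands are printed rather than run.

```shell
REPO="earthmover/ocean"
DEST="s3://my-export-target/ocean.zarr"
TO_REF="main"                          # hypothetical branch name
FROM_REF="66341d6b46547256bc60254c"    # commit ID from the sample output above

# Export the repo as it exists at TO_REF.
SNAPSHOT_CMD="al repo export $REPO $DEST --ref $TO_REF"

# Export only the changes committed between FROM_REF and TO_REF.
INCREMENTAL_CMD="al repo export $REPO $DEST --ref $TO_REF --from-ref $FROM_REF"

# Dry run: print both commands for review.
echo "$SNAPSHOT_CMD"
echo "$INCREMENTAL_CMD"
```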
Concurrency
Use --concurrency to set the number of concurrent copy operations during the export. The maximum is 64.
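When the concurrency value comes from a script or environment variable, it may be worth capping it at the documented maximum before building the command. The sketch below assumes a hypothetical requested value of 128 and, as before, only prints the resulting command.

```shell
MAX_CONCURRENCY=64   # documented upper limit for concurrent copy operations
REQUESTED=128        # hypothetical value, e.g. read from a script argument

# Cap the requested value at the documented maximum.
if [ "$REQUESTED" -gt "$MAX_CONCURRENCY" ]; then
  CONCURRENCY=$MAX_CONCURRENCY
else
  CONCURRENCY=$REQUESTED
fi

# Assumption: the flag may follow the positional arguments.
EXPORT_CMD="al repo export earthmover/ocean s3://my-export-target/ocean.zarr --concurrency $CONCURRENCY"

# Dry run: print the command for review.
echo "$EXPORT_CMD"
```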
Additional configuration
Additional configuration of the destination store can be supplied by passing a YAML file in the following format to al repo export:
endpoint_url: my_endpoint_url
access_key_id: my_access_key_id
secret_access_key: my_secret_access_key
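Because this file holds credentials, one reasonable approach is to create it with owner-only permissions. The sketch below writes the placeholder values shown above to a hypothetical file named export-config.yaml. The option used to pass the file to al repo export is not named in this section, so that step is left out.

```shell
CONFIG_FILE="./export-config.yaml"   # hypothetical file name

# Write the file inside a subshell with a restrictive umask so that a newly
# created file is readable and writable by its owner only.
(
umask 077
cat > "$CONFIG_FILE" <<'EOF'
endpoint_url: my_endpoint_url
access_key_id: my_access_key_id
secret_access_key: my_secret_access_key
EOF
)

# Show the result for review. Replace the placeholders with real values.
cat "$CONFIG_FILE"
```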