Managing Repos

Once your Arraylake organization has been fully configured and you have installed the client library, you're ready to start managing data! 🎉

Create a Repo

In this example, the org name is earthmover. If you are running these commands yourself, replace earthmover with your org name.

We are going to create a Repo called ocean to hold oceanography data 🌊.

arraylake repo create earthmover/ocean

Where is Repo data stored?

Arraylake lets you configure the storage location for a Repo's data using org-level bucket configurations. Choose a specific bucket by providing bucket_config_nickname to create_repo. If not specified, the organization's default bucket is used.
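As a minimal sketch of the two cases just described, the helper below forwards an optional bucket nickname to create_repo. The helper name create_repo_in_bucket is hypothetical, and client is assumed to be an Arraylake client object whose create_repo accepts the bucket_config_nickname keyword mentioned above.

```python
def create_repo_in_bucket(client, name, bucket_config_nickname=None):
    """Create a Repo, optionally in a specific org-level bucket configuration.

    Hypothetical helper: `client` is assumed to expose
    create_repo(name, bucket_config_nickname=...), as described in the text.
    """
    if bucket_config_nickname is None:
        # No nickname given: the organization's default bucket is used.
        return client.create_repo(name)
    # Nickname given: the Repo's data goes to that bucket configuration.
    return client.create_repo(name, bucket_config_nickname=bucket_config_nickname)
```

With a real client, a call might look like create_repo_in_bucket(client, "earthmover/ocean", "my-bucket"), where my-bucket stands in for a nickname configured in your org.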

Within the bucket and prefix set by an org-level bucket configuration, data for each new Icechunk Repo is stored under an additional prefix. By default, this extra prefix is the Repo name prefixed with 8 random characters. To choose a specific extra prefix, pass the prefix kwarg to create_repo. For example, for a bucket configured with bucket='my-bucket-name' and prefix='my-bucket-prefix':

  1. create_repo("repo-A") stores data in my-bucket-name/my-bucket-prefix/[8-RANDOM_CHARACTERS]_repo_A
  2. create_repo("repo-B", prefix='zoo') stores data in my-bucket-name/my-bucket-prefix/zoo/
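The prefix rule above can be sketched as a small pure-Python function. This is an illustration only, not Arraylake's implementation: the function name resolve_repo_location is hypothetical, and the alphabet of the random characters, the underscore separator, and any normalization of the Repo name are assumptions.

```python
import secrets
import string


def resolve_repo_location(bucket, bucket_prefix, repo_name, prefix=None):
    """Return the storage location implied by the rule described above.

    Illustrative model only; the exact form of the generated prefix
    is an assumption.
    """
    if prefix is None:
        # Default: 8 random characters followed by the Repo name.
        alphabet = string.ascii_lowercase + string.digits
        random_part = "".join(secrets.choice(alphabet) for _ in range(8))
        prefix = f"{random_part}_{repo_name}"
    # Join bucket, bucket-level prefix, and Repo prefix, ignoring stray slashes.
    parts = [bucket, bucket_prefix, prefix]
    return "/".join(p.strip("/") for p in parts if p)


# Explicit prefix: the location is fully determined by the arguments.
print(resolve_repo_location("my-bucket-name", "my-bucket-prefix", "repo-B", prefix="zoo"))
# Default prefix: the last path component differs on every call.
print(resolve_repo_location("my-bucket-name", "my-bucket-prefix", "repo-A"))
```

Passing an explicit prefix makes the location predictable, which can help when the data needs to be found from outside Arraylake. The default avoids collisions between Repos that share a name.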

Open a Repo

If you're working in Python, you can open a Repo and start interacting with your data. The snippet below assumes client is an authenticated arraylake.Client instance.

repo = client.get_repo("earthmover/ocean")

List Repos

You can list the Repos associated with an organization.

arraylake repo list earthmover

Delete a Repo

Finally, we can delete a repo.

Warning: Deleting a Repo cannot be undone! Use this operation carefully.

arraylake repo delete earthmover/ocean