
Managing Storage

An Arraylake BucketConfig houses the settings for one or more Chunkstores. Each BucketConfig holds the configuration (e.g. object store bucket name, key prefix, access credentials) that enables the Arraylake client and services to read and write array data.

All the Repositories that use a given bucket config are safely isolated from each other. A BucketConfig allows organizations to easily and securely manage storage configuration that is shared between repositories. An organization may have multiple bucket configurations.

Important properties for BucketConfigs include:

  • nickname: a nickname for easy referencing in code, on the command line, and on the web
  • platform: the object storage provider
  • name: the name of the bucket in object storage where chunks will be stored
  • prefix: an optional key prefix
  • extra_config: additional configuration options for the storage backend (e.g. endpoint_url)

For convenience, the Arraylake Python client also supports expressing the platform, name, and prefix components in URI form (i.e. {platform}://{name}[/{prefix}]). Each BucketConfig must have a unique URI: Arraylake will not allow you to create two BucketConfigs pointing to the same location in object storage.
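The URI form can be illustrated with a short Python sketch. The `BucketLocation` class and the `register` helper are hypothetical names invented for this example and are not part of the Arraylake client. The sketch only shows how a platform, name, and optional prefix might map to and from `{platform}://{name}[/{prefix}]`, and what a uniqueness check on URIs could look like.

```python
from dataclasses import dataclass
from typing import Optional, Set
from urllib.parse import urlparse


@dataclass(frozen=True)
class BucketLocation:
    """Hypothetical holder for the three URI components (not an Arraylake class)."""

    platform: str
    name: str
    prefix: Optional[str] = None

    def to_uri(self) -> str:
        # Compose {platform}://{name}[/{prefix}]; the prefix is optional.
        base = f"{self.platform}://{self.name}"
        return f"{base}/{self.prefix}" if self.prefix else base

    @classmethod
    def from_uri(cls, uri: str) -> "BucketLocation":
        parsed = urlparse(uri)
        if not parsed.scheme or not parsed.netloc:
            raise ValueError(f"expected platform://name[/prefix], got {uri!r}")
        # Strip surrounding slashes; an empty path means "no prefix".
        prefix = parsed.path.strip("/") or None
        return cls(platform=parsed.scheme, name=parsed.netloc, prefix=prefix)


def register(existing_uris: Set[str], location: BucketLocation) -> None:
    """Assumed uniqueness rule: reject a second config with the same URI."""
    uri = location.to_uri()
    if uri in existing_uris:
        raise ValueError(f"a bucket config already points at {uri}")
    existing_uris.add(uri)
```

Under these assumptions, `s3://my-bucket/team-a` and `s3://my-bucket/team-b` register as two distinct locations inside one bucket, while a second attempt to register `s3://my-bucket/team-a` raises an error.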

info

As of Arraylake v0.9.5, BucketConfigs support arbitrary key prefixes within a given bucket. Prefixes can be used to create partitions between data in cases when an organization has access to one bucket only.

To make use of this prefix support, simply append the key to your bucket's URI (e.g. s3://my-bucket/my-prefix) when creating the bucket config.

BucketConfigs support a variety of platforms, configuration options, and access modes. For details on configuration for a specific supported object storage provider, see Storage Integrations.

warning

Clients prior to v0.9.5 cannot use Repos that rely on a BucketConfig that has a prefix.

tip

You can modify a BucketConfig as much as you like so long as there are no Repos that rely on it. Once it is in use by one or more Repos, only the BucketConfig's nickname can be modified.
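The modification rule can be modeled as a small guard. The sketch below is illustrative only: `BucketConfigModel`, its `repo_count` field, and `apply_update` are hypothetical names, and the way Arraylake actually tracks or enforces this is not shown here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Fields a caller may try to change (taken from the property list above).
EDITABLE_FIELDS = {"nickname", "platform", "name", "prefix", "extra_config"}


@dataclass
class BucketConfigModel:
    """Hypothetical stand-in for a BucketConfig record (not Arraylake's class)."""

    nickname: str
    platform: str
    name: str
    prefix: Optional[str] = None
    extra_config: Dict[str, Any] = field(default_factory=dict)
    repo_count: int = 0  # assumed bookkeeping: number of Repos relying on this config


def apply_update(config: BucketConfigModel, **changes: Any) -> BucketConfigModel:
    unknown = set(changes) - EDITABLE_FIELDS
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    # Once any Repo relies on the config, only the nickname may change.
    locked = set(changes) - {"nickname"}
    if config.repo_count > 0 and locked:
        raise PermissionError(
            f"config is in use by {config.repo_count} repo(s); "
            f"cannot modify {sorted(locked)}"
        )
    for key, value in changes.items():
        setattr(config, key, value)
    return config
```

With this guard, an unused config accepts any change, while a config with `repo_count > 0` accepts `apply_update(cfg, nickname="prod")` and rejects `apply_update(cfg, name="other-bucket")` without altering the record.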

Create a BucketConfig

If you're just getting started, you probably only need one bucket config for your entire organization (see Organizations and Access for more detail). For the purposes of this example, our org name will be earthmover. If running these commands interactively, replace earthmover with your org name.

For this example, we are going to create a bucket config nicknamed production to hold the chunks for all the repositories with production-quality datasets.

[Figure: The create bucket dialog in the web app.]

In the web app, Org Admins can add BucketConfigs by clicking the "Add Bucket" button in the Buckets section of the Organization Settings page.

List BucketConfigs

You can list BucketConfigs associated with an organization.

[Figure: The organization buckets list in the web app.]

In the web app, users can view existing buckets for their org from the org buckets section (app.earthmover.io/[orgname]/buckets).

The default BucketConfig is marked with a purple badge.

[Figure: The organization settings buckets section in the web app.]

Org Admins can also access and manage the buckets list via the Buckets section on the Organization Settings page (app.earthmover.io/[orgname]/settings).

Clicking on the gear icon in the bucket's row will bring up the settings dialog for that bucket.

Delete a BucketConfig

Finally, we can delete a BucketConfig.

warning

A BucketConfig cannot be deleted while it is in use by any Repo. Deletion cannot be undone, so use this operation carefully.
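The two constraints in this warning (no deletion while in use, and no undo) can be sketched as a guard function. Everything here is hypothetical: `delete_bucket_config_sketch`, the `confirm` flag, and the nickname-to-repos mapping are invented for illustration and do not describe the Arraylake client's real interface.

```python
from typing import Dict, Set


def delete_bucket_config_sketch(
    configs: Dict[str, Set[str]], nickname: str, confirm: bool = False
) -> None:
    """Remove `nickname` from `configs`.

    `configs` maps each bucket config nickname to the set of Repo names
    that rely on it (assumed bookkeeping for this example).
    """
    if nickname not in configs:
        raise KeyError(f"no bucket config named {nickname!r}")
    repos = configs[nickname]
    if repos:
        # Rule 1: a config that any Repo relies on cannot be deleted.
        raise RuntimeError(
            f"{nickname!r} is in use by {sorted(repos)}; remove those Repos first"
        )
    if not confirm:
        # Rule 2: deletion is irreversible, so require an explicit opt-in.
        raise ValueError("deletion cannot be undone; pass confirm=True to proceed")
    del configs[nickname]
```

Requiring an explicit confirmation flag is one common way to make an irreversible operation harder to trigger by accident; whether and how the real client asks for confirmation is not covered by this sketch.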

[Figure: The bucket settings dialog in the web app.]

In the web app, a BucketConfig can be deleted from the bucket's settings dialog.