Managing Storage
An Arraylake BucketConfig houses the settings for one or more Chunkstores. Each BucketConfig holds the configuration (e.g. object store bucket name, key prefix, access credentials) that enables the Arraylake client and services to read and write array data.
All the Repositories that use a given bucket config are safely isolated from each other. A BucketConfig allows organizations to easily and securely manage storage configuration that is shared between repositories. An organization may have multiple bucket configurations.
Important properties for BucketConfigs include:
- nickname: a nickname for easy referencing in code, on the command line, and on the web
- platform: the object storage provider
- name: the name of the bucket in object storage where chunks will be stored
- prefix: an optional key prefix
- extra_config: additional configuration options for the storage backend (e.g. endpoint_url)
For convenience, the Arraylake Python client also supports expressing the platform, name, and prefix components in URI form (i.e. {platform}://{name}[/{prefix}]). Each BucketConfig must have a unique URI. Arraylake will not allow you to create two BucketConfigs pointing to the same location in object storage.
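To make the URI form concrete, here is a minimal sketch that composes and splits a URI of the shape {platform}://{name}[/{prefix}] using only the Python standard library. The helper names build_bucket_uri and split_bucket_uri are hypothetical and are not part of the Arraylake client; this illustrates the format only, not how Arraylake itself parses URIs.

```python
from urllib.parse import urlsplit


def build_bucket_uri(platform, name, prefix=None):
    """Compose {platform}://{name}[/{prefix}] from its parts (hypothetical helper)."""
    uri = f"{platform}://{name}"
    if prefix:
        # Drop stray slashes so the prefix is joined with exactly one "/".
        uri += "/" + prefix.strip("/")
    return uri


def split_bucket_uri(uri):
    """Split a bucket URI back into (platform, name, prefix) (hypothetical helper)."""
    parts = urlsplit(uri)
    # An empty path means no prefix was given.
    prefix = parts.path.strip("/") or None
    return parts.scheme, parts.netloc, prefix


print(build_bucket_uri("s3", "my-bucket", "my-prefix"))
print(split_bucket_uri("s3://my-bucket"))
```

Because each BucketConfig must have a unique URI, two configs may share a platform and bucket name only if their prefixes differ.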
As of Arraylake v0.9.5, BucketConfigs support arbitrary key prefixes within a given bucket. Prefixes can be used to create partitions between data in cases when an organization has access to one bucket only.
To make use of this prefix support, simply append the key prefix to your bucket's URI (e.g. s3://my-bucket/my-prefix) when creating the bucket config.
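As an illustration of partitioning a single bucket with prefixes, the sketch below creates one bucket config per prefix. It relies only on the create_bucket_config call and the keyword arguments shown elsewhere on this page; the helper name create_partitioned_configs, the nicknames, and the prefixes are hypothetical, and the client argument is assumed to be an arraylake Client (or any object with the same method).

```python
def create_partitioned_configs(client, org, bucket_uri, partitions, extra_config):
    """Create one bucket config per key prefix inside a single bucket.

    `partitions` maps a nickname to a key prefix (both chosen by the caller).
    Returns the list of URIs that were submitted.
    """
    created = []
    for nickname, prefix in partitions.items():
        # Append the prefix to the bucket URI, e.g. s3://my-bucket/my-prefix.
        uri = f"{bucket_uri.rstrip('/')}/{prefix.strip('/')}"
        client.create_bucket_config(
            org=org,
            nickname=nickname,
            uri=uri,
            extra_config=extra_config,
        )
        created.append(uri)
    return created
```

For example, calling it with bucket_uri="s3://my-bucket" and partitions={"staging": "staging", "production": "production"} would submit s3://my-bucket/staging and s3://my-bucket/production as two separate bucket configs.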
BucketConfigs support a variety of platforms, configuration options, and access modes. For details on configuration for a specific supported object storage provider, see Storage Integrations.
Clients prior to v0.9.5 cannot use repos which rely on a BucketConfig that has a prefix.
You can modify a BucketConfig as much as you like so long as there are no Repos that rely on it. Once it is in use by one or more Repos, only the BucketConfig's nickname can be modified.
Create a BucketConfig
If you're just getting started, you probably only need one bucket config for your entire organization (see Organizations and Access for more detail). For the purposes of this example, our org name will be earthmover. If running these commands interactively, replace earthmover with your org name.
For this example, we are going to create a bucket config nicknamed production
to hold the chunks for all the repositories with production-quality datasets.
- Web App
- Python
- Python (asyncio)
In the web app, Org Admins can add BucketConfigs by clicking on the "Add Bucket" button in the Buckets section of the Organization Settings page.
from arraylake import Client

client = Client()

# Register a bucket config named "production" for the earthmover org.
client.create_bucket_config(
    org="earthmover",
    nickname="production",
    uri="s3://my-production-data",
    extra_config={"region_name": "us-east-1"},
)
from arraylake import AsyncClient

aclient = AsyncClient()

# Top-level await requires an async context (e.g. a notebook or an async function).
await aclient.create_bucket_config(
    org="earthmover",
    nickname="production",
    uri="s3://my-production-data",
    extra_config={"region_name": "us-east-1"},
)
List BucketConfigs
You can list the BucketConfigs associated with an organization.
- Web App
- Python
- Python (asyncio)
In the web app, users can view existing buckets for their org from the org buckets section (app.earthmover.io/[orgname]/buckets). The default BucketConfig is marked with a purple badge.
Org Admins can also access and manage the buckets list via the Buckets section on the Organization Settings page (app.earthmover.io/[orgname]/settings). Clicking on the gear icon in a bucket's row brings up the settings dialog for that bucket.
client.list_bucket_configs("earthmover")
In the Python Client, the default BucketConfig is indicated by the is_default flag on the resulting list item.
await aclient.list_bucket_configs("earthmover")
As with the synchronous client, the default BucketConfig is indicated by the is_default flag on the resulting list item.
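The is_default flag can be used to pick the default configuration out of the listing. The following is a minimal sketch that assumes each list item exposes is_default as an attribute; the helper name find_default_bucket_config is hypothetical and not part of the Arraylake client.

```python
def find_default_bucket_config(bucket_configs):
    """Return the first item whose is_default flag is set, or None if there is none."""
    for config in bucket_configs:
        # Assumption: is_default is an attribute on each list item.
        if getattr(config, "is_default", False):
            return config
    return None
```

With the synchronous client this could be called as find_default_bucket_config(client.list_bucket_configs("earthmover")). With the async client, await the listing first and pass the result in.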
Delete a BucketConfig
Finally, we can delete a BucketConfig.
A BucketConfig cannot be deleted while it is in use by any Repo. Deletion cannot be undone, so use this operation carefully.
- Web App
- Python
- Python (asyncio)
In the web app, a BucketConfig can be deleted from the bucket's settings dialog.
# Both confirmation flags must be set explicitly; deletion cannot be undone.
client.delete_bucket_config(
    org="earthmover",
    nickname="production",
    imsure=True,
    imreallysure=True,
)
# Async equivalent; both confirmation flags must be set explicitly.
await aclient.delete_bucket_config(
    org="earthmover",
    nickname="production",
    imsure=True,
    imreallysure=True,
)
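Because deletion is irreversible, it can be useful to confirm that the nickname actually exists before issuing the call. The sketch below combines the list and delete calls shown on this page. It assumes each list item exposes a nickname attribute, and the helper name delete_bucket_config_if_present is hypothetical. It does not check whether any Repo uses the config, so the restriction described above still applies.

```python
def delete_bucket_config_if_present(client, org, nickname):
    """Delete the named bucket config only if it is currently listed for the org.

    Returns True when a delete call was issued, False when the nickname was not found.
    """
    # Assumption: each list item exposes its nickname as an attribute.
    listed = {getattr(c, "nickname", None) for c in client.list_bucket_configs(org)}
    if nickname not in listed:
        return False
    client.delete_bucket_config(
        org=org,
        nickname=nickname,
        imsure=True,
        imreallysure=True,
    )
    return True
```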