Data Storage

Arraylake stores data in cloud object storage. When you interact with data through the Arraylake client library, you read from and write to object storage directly.

There are two ways to connect object storage to Arraylake:

  • BYOB - Bring your own bucket. With this option, all of the data lives in your own object storage bucket, in your own cloud.
  • Earthmover Managed (coming soon) - With this option, Earthmover manages the storage bucket on your behalf.

Supported Object Storage

Arraylake works with a wide range of commercial and open-source object storage services, including any S3-compatible object store as well as Google Cloud Storage.

For a full list of supported storage providers, see Storage Integrations.

note

Support for Azure Blob Storage is on the roadmap!

Object Storage Credentials

To access data, the Arraylake client and platform must be able to reach the underlying object storage, which requires credentials.
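
When credentials are kept on the client machine, it can help to confirm that they are discoverable before debugging anything else. The following is a minimal sketch, not part of the Arraylake client: the function name `find_local_credential_sources` and its return labels are hypothetical. It checks only a few well-known locations (the `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables, the AWS shared credentials file, and the `GOOGLE_APPLICATION_CREDENTIALS` key file) and is not an exhaustive view of what each cloud SDK can resolve.

```python
import os
from pathlib import Path


def find_local_credential_sources(env=None, home=None) -> list[str]:
    """Return labels for common credential sources found on this machine.

    Hypothetical helper for illustration; the labels are arbitrary strings.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = []

    # AWS SDKs read static keys from these two environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("aws-env")

    # Shared credentials file: default location, or overridden by env var.
    aws_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE") or home / ".aws" / "credentials"
    )
    if aws_file.is_file():
        found.append("aws-shared-file")

    # Google client libraries read a key file path from this variable.
    gcp_key = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if gcp_key and Path(gcp_key).is_file():
        found.append("gcp-key-file")

    return found


if __name__ == "__main__":
    sources = find_local_credential_sources()
    print("credential sources found:", sources or "none")
```

An empty result does not prove that access will fail, because SDKs can also resolve credentials from other sources such as instance metadata or single sign-on sessions.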

Arraylake supports three types of object storage credentials: self-managed credentials and two forms of delegated credentials.

  • Self-managed credentials - With this option, the Earthmover platform itself has no credentials to the object store. Credentials live solely on the client machine, managed by the user. Credentials must be configured locally according to the cloud provider's instructions.
  • Delegated credentials / credential vending - With this option, the Earthmover platform has credentials to access the object store. It uses these credentials to access data directly on the backend. It also performs credential vending: the backend generates credentials and forwards them to the client, enabling direct access to the object store on the client side. There are two ways to set up delegated credentials:
    • AWS role-based access delegation - This option, available only in AWS, means that the bucket owner creates a specialized IAM role and grants trusted entity status to Earthmover's AWS account ID. This is the preferred method for AWS users.
    • HMAC credentials (coming soon) - With this option, the bucket owner creates a set of S3 credentials (access_key_id, secret_access_key) and uploads them directly to the Earthmover platform.
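
To make the role-based option more concrete, the sketch below builds the kind of IAM trust policy that grants another AWS account trusted entity status on a role. It uses the standard IAM trust policy document format, but everything specific is an assumption: `build_trust_policy` is a hypothetical helper, and `123456789012` is a placeholder, not Earthmover's real account ID. The setup instructions may also require additional conditions that are not shown here.

```python
import json
import re

# Placeholder only: substitute the account ID from the official setup guide.
EXAMPLE_TRUSTED_ACCOUNT_ID = "123456789012"


def build_trust_policy(trusted_account_id: str) -> dict:
    """Return an IAM trust policy that lets the given account assume the role."""
    # AWS account IDs are 12-digit strings; reject anything else early.
    if not re.fullmatch(r"[0-9]{12}", trusted_account_id):
        raise ValueError("AWS account ID must be a 12-digit string")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted entity: the root principal of the other account.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(EXAMPLE_TRUSTED_ACCOUNT_ID), indent=2))
```

A trust policy only controls who may assume the role. The role also needs a separate permissions policy that grants access to the bucket itself.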

warning

Self-managed credentials will soon be deprecated. All customers should migrate to delegated credentials.

To learn more about using these options, see Managing Storage.