Arraylake Quick Start

Get started with Arraylake in 5 minutes by following along with this quick tutorial:

Tip: Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, feel free to book a demo.

Install the Arraylake client and dependencies for the quickstart:

# with pip
pip install "arraylake" xarray netcdf4 dask pooch
# or with conda
conda install -c conda-forge arraylake xarray netcdf4 dask pooch

Note: This tutorial assumes you already have an organization set up with a valid bucket configuration.

Log in:

arraylake auth login

Create your first repository:

arraylake repo create myorg/myrepo

Where myorg is the name of your organization and myrepo is the name of your repository.
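
To make the myorg/myrepo naming convention concrete, here is a minimal sketch that splits a full repository name into its two parts. The helper split_repo_name is hypothetical and not part of the Arraylake client; it only assumes that a full name is one organization and one repository separated by a single slash.

```python
def split_repo_name(full_name: str) -> tuple[str, str]:
    """Split "org/repo" into (org, repo). Hypothetical helper, not an Arraylake API."""
    org, sep, repo_name = full_name.partition("/")
    # Reject names with a missing part or more than one slash.
    if not sep or not org or not repo_name or "/" in repo_name:
        raise ValueError(f"expected 'org/repo', got {full_name!r}")
    return org, repo_name


org, repo_name = split_repo_name("myorg/myrepo")
print(org, repo_name)
```

A check like this can catch a missing organization prefix before the name reaches the CLI or the client.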

Once you've created your repo, you can see it alongside the rest of your organization's repos:

arraylake repo list myorg

Write data with the Python client:

Now we'll switch to Python where we'll put some data into the repo and do a few quick tasks. First we'll connect our Client:

from arraylake import Client

client = Client()
repo = client.get_repo("myorg/myrepo")

(AWS credentials with appropriate write access to your target S3 bucket should be available in your environment.)
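
If a write fails, one quick thing to rule out is missing credentials. The sketch below checks only the standard AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_PROFILE). The function name is hypothetical, and a negative result is not conclusive, since credentials may also come from a shared credentials file or an instance role.

```python
import os


def aws_env_credentials_present(env=None) -> bool:
    """Return True if AWS credentials appear to be set via environment variables.

    Hypothetical helper: it does not verify that the credentials are valid
    or that they grant write access to the target bucket.
    """
    env = os.environ if env is None else env
    has_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(env.get("AWS_SECRET_ACCESS_KEY"))
    has_profile = bool(env.get("AWS_PROFILE"))
    return has_keys or has_profile


if not aws_env_credentials_present():
    print(
        "No AWS credentials found in environment variables; "
        "they may still come from a shared credentials file or an instance role."
    )
```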

Next, we'll pull some of Xarray's tutorial data and dump it into Arraylake:

import xarray as xr

air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")

air_temp.to_zarr(repo.store, group="air_temperature")
rasm.to_zarr(repo.store, group="rasm")
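
As a rough way to reason about the "1mb" chunk setting used above, the sketch below estimates the minimum number of chunks an array would be split into. It assumes the string is read as a target of about 10**6 bytes per chunk and that no chunk exceeds that target. The shape and item size are made-up illustration values, not the sizes of the tutorial datasets.

```python
import math


def min_chunk_count(shape, itemsize, chunk_bytes=10**6):
    """Lower bound on the chunk count if every chunk holds at most chunk_bytes."""
    total_bytes = math.prod(shape) * itemsize
    return math.ceil(total_bytes / chunk_bytes)


# Hypothetical array: 1000 x 100 x 100 elements of 4 bytes each (40,000,000 bytes).
print(min_chunk_count((1000, 100, 100), itemsize=4))  # 40
```

The real chunk count can be higher, because chunk boundaries have to follow the array's dimensions.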

Now that we've put some data into Arraylake, we can commit our changes:

commit_id = repo.commit("My first commit 🥹")

Next, we can go back and access the data in our store:

ds = xr.open_zarr(repo.store, group="rasm")
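
To confirm the round trip, one option is to list the variables in a group along with their shapes. This is a minimal sketch: summarize_group is a hypothetical helper built only on xr.open_zarr, and the usage line assumes the repo object from the earlier steps.

```python
def summarize_group(store, group):
    """Return {variable name: shape} for one group in a Zarr store (hypothetical helper)."""
    import xarray as xr  # imported lazily so the helper can be defined before xarray is loaded

    ds = xr.open_zarr(store, group=group)
    return {name: var.shape for name, var in ds.data_vars.items()}


# Usage, assuming `repo` is the repository object created earlier:
# print(summarize_group(repo.store, "rasm"))
```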

And you are off to the races. Check out the Tutorial to go deeper.