🌱 Quick Start

Get started with Arraylake in 5 minutes by following this quick tutorial.

info

Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, please book a demo to learn more!

This tutorial assumes that your organization administrator has already set up and configured your org.

Install the Arraylake client and the dependencies used in this quickstart:

pip
conda

pip install "arraylake[icechunk]" "xarray>=2025.1.1" netcdf4 dask pooch

conda install -c conda-forge arraylake "zarr>=3" "xarray>=2025.1.1" netcdf4 dask pooch icechunk
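Before moving on, it can help to confirm that the packages actually landed in the active environment. The following is a minimal sketch using only the Python standard library; the package list is copied from the install commands above, and `installed_versions` is a hypothetical helper, not part of Arraylake.

```python
from importlib import metadata

# Distribution names taken from the install commands above.
QUICKSTART_PACKAGES = ["arraylake", "icechunk", "zarr", "xarray", "netcdf4", "dask", "pooch"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(QUICKSTART_PACKAGES).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can be reinstalled with the pip or conda command above.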

Then log in to Arraylake:

CLI
Python

arraylake auth login

In some environments (for example, JupyterHub), the authentication page may not open automatically in the browser. In that case, run arraylake auth login --no-browser instead.

from arraylake import Client

client = Client()
client.login()
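If you script the CLI login, the choice between the plain command and the --no-browser variant can be made automatically. The sketch below assumes that JupyterHub single-user servers export environment variables prefixed with JUPYTERHUB_ (check your own deployment); `login_command` is a hypothetical helper that only builds the command and does not run it.

```python
import os


def login_command(env=None):
    """Build the `arraylake auth login` command, adding --no-browser on JupyterHub."""
    env = os.environ if env is None else env
    command = ["arraylake", "auth", "login"]
    # Assumption: JupyterHub sessions expose JUPYTERHUB_* environment variables.
    if any(key.startswith("JUPYTERHUB_") for key in env):
        command.append("--no-browser")
    return command


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(...) to execute it.
    print(" ".join(login_command()))
```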

Create your first repository:

CLI
Python

arraylake repo create myorg/myrepo

client.create_repo("myorg/myrepo")

Where myorg is the name of your organization and myrepo is the name of your repository.
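Because every command takes the combined org/repo form, a small check can catch malformed names before any call is made. This is a sketch only: `split_repo_name` is a hypothetical helper, and it checks just the overall shape of the name, not any naming rules that Arraylake itself may enforce.

```python
def split_repo_name(full_name):
    """Split 'org/repo' into (org, repo); raise ValueError for any other shape."""
    parts = full_name.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'org/repo', got {full_name!r}")
    org, repo = parts
    return org, repo


if __name__ == "__main__":
    org, repo = split_repo_name("myorg/myrepo")
    print(f"organization={org} repository={repo}")
```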

Once you've created your repo, you can see it alongside the rest of your organization's repos:

CLI
Python

arraylake repo list myorg

client.list_repos("myorg")

If you have been using the CLI, switch to Python now to put data into the repo. First, connect with the Arraylake Client:

from arraylake import Client

client = Client()
repo = client.get_repo("myorg/myrepo")

info

The repo will be created as an Icechunk Repository, and Arraylake will automatically configure the appropriate cloud service credentials for you.

Next, load some of Xarray's tutorial data:

import xarray as xr

air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")

Then use Icechunk to write to the Arraylake Zarr store:

# Start an icechunk session
session = repo.writable_session(branch="main")

# write the data to zarr
air_temp.to_zarr(session.store, group='air_temperature')
rasm.to_zarr(session.store, group='rasm')

# commit the data to Arraylake
first_commit_id = session.commit("My first commit 🥹")

In the future you can access the most recent data in the store:

session = repo.readonly_session(branch="main")
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)

or you can time-travel and access the very first commit:

session = repo.readonly_session(snapshot=first_commit_id)
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)
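The difference between reading a branch and reading a snapshot can be illustrated with a toy in-memory model. This is a conceptual sketch, not Icechunk's implementation: `ToyRepo` and its methods are hypothetical, and the only idea it demonstrates is that a snapshot ID is fixed while a branch moves forward with each commit.

```python
import uuid


class ToyRepo:
    """Toy model: snapshots are immutable, and a branch is a movable pointer to one."""

    def __init__(self):
        self.snapshots = {}             # snapshot id -> (data, commit message)
        self.branches = {"main": None}  # branch name -> latest snapshot id

    def commit(self, branch, data, message):
        """Store a new snapshot on top of the branch tip and advance the branch."""
        parent = self.branches[branch]
        merged = dict(self.snapshots[parent][0]) if parent else {}
        merged.update(data)
        snapshot_id = uuid.uuid4().hex
        self.snapshots[snapshot_id] = (merged, message)
        self.branches[branch] = snapshot_id
        return snapshot_id

    def read(self, *, branch=None, snapshot=None):
        """Return the data at a branch tip or at a specific snapshot."""
        if (branch is None) == (snapshot is None):
            raise ValueError("pass exactly one of branch or snapshot")
        snapshot_id = snapshot if snapshot is not None else self.branches[branch]
        if snapshot_id is None:
            return {}  # branch exists but has no commits yet
        return dict(self.snapshots[snapshot_id][0])


if __name__ == "__main__":
    toy = ToyRepo()
    first = toy.commit("main", {"rasm": "v1"}, "My first commit")
    toy.commit("main", {"rasm": "v2"}, "Update rasm")
    print(toy.read(branch="main"))   # follows the branch to the latest commit
    print(toy.read(snapshot=first))  # pinned to the first commit
```

Reading by branch always follows the latest commit, whereas reading by snapshot returns the same data no matter how many commits come later, which is why it is worth keeping the ID returned by commit.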

And you are off to the races. Check out the User Guide to go deeper, or look at the repository history in the Arraylake web app.