Quick Start

Get started with Arraylake in 5 minutes by following this quick tutorial.

info

Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, please book a demo to learn more!

This tutorial assumes that your organization administrator has already set up and configured your org.

Install the Arraylake client and the dependencies used in this quickstart:

pip install "arraylake[icechunk]" "xarray>=2025.1.1" netcdf4 dask pooch

Log in:

arraylake auth login

In some environments (e.g., JupyterHub), the authentication page may not open automatically in the browser. In that case, use arraylake auth login --no-browser instead.
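If you script the login step, the choice between the two commands above could be sketched as follows. This is only an illustration: login_command is a hypothetical helper, not part of the Arraylake client, and it assumes that a JupyterHub single-user server exports the JUPYTERHUB_USER environment variable.

```python
import os
import shlex


def login_command(env=None):
    """Build the `arraylake auth login` command for the given environment.

    `login_command` is a hypothetical helper, not part of Arraylake.
    """
    env = os.environ if env is None else env
    cmd = ["arraylake", "auth", "login"]
    # Assumption: JupyterHub single-user servers set JUPYTERHUB_USER.
    # There the browser cannot be opened automatically, so add --no-browser.
    if "JUPYTERHUB_USER" in env:
        cmd.append("--no-browser")
    return cmd


# Print the command rather than running it, so it can be reviewed first.
print(shlex.join(login_command()))
```

The helper only builds the argument list; pass it to subprocess.run if you want to execute it.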

Create your first repository:

arraylake repo create myorg/myrepo

Here, myorg is the name of your organization and myrepo is the name of your repository.
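The org/repo naming convention can be checked before calling the CLI or the client. The sketch below uses a hypothetical helper, split_repo_name, which is not part of Arraylake. It only checks the shape of the string (exactly one slash with a non-empty name on each side) and makes no claim about which characters Arraylake itself accepts.

```python
def split_repo_name(full_name):
    """Split "org/repo" into (org, repo).

    Hypothetical helper for illustration; it checks only the overall shape.
    """
    org, sep, repo = full_name.partition("/")
    # Require exactly one "/" with a non-empty name on each side.
    if not sep or not org or not repo or "/" in repo:
        raise ValueError(f"expected 'org/repo', got {full_name!r}")
    return org, repo


org, repo_name = split_repo_name("myorg/myrepo")
print(org, repo_name)
```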

Once you've created your repo, you can see it alongside the rest of your organization's repos:

arraylake repo list myorg

So far you have been using the CLI. Now switch to Python to put data into the repo. First, connect with the Arraylake Client:

from arraylake import Client

client = Client()
repo = client.get_repo("myorg/myrepo")

info

The repo will be created as an Icechunk Repository, and Arraylake will automatically configure the appropriate cloud service credentials for you.

Next, load some of Xarray's tutorial data:

import xarray as xr

air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")

Then use Icechunk to write to the Arraylake Zarr store:

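# Start a writable Icechunk session on the main branch
session = repo.writable_session(branch="main")

# Write the data to Zarr through the session's store,
# so the writes belong to the session that is committed below
air_temp.to_zarr(session.store, group='air_temperature')
rasm.to_zarr(session.store, group='rasm')

# Commit the data to Arraylake and keep the commit ID for later
first_commit_id = session.commit("My first commit 🥹")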

Later, you can read the most recent data on the main branch:

session = repo.readonly_session(branch="main")
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)

Or you can time-travel and read the data as it was at the very first commit:

session = repo.readonly_session(snapshot=first_commit_id)
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)

And you are off to the races. Check out the User Guide to go deeper, or look at the repository history in the Arraylake web app.