Quick Start
Get started with Arraylake in 5 minutes by following this quick tutorial.
Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, please book a demo to learn more!
This tutorial assumes that your organization administrator has already set up and configured your org.
Install the Arraylake client and dependencies for the quickstart:
- pip
pip install arraylake[icechunk] "xarray>=2025.1.1" netcdf4 dask pooch
- conda
conda install -c conda-forge arraylake "zarr>=3" "xarray>=2025.1.1" netcdf4 dask pooch icechunk
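After installing, it can help to confirm that the packages are visible to the Python environment you plan to use. The following is a minimal sketch using only the standard library. The package list is copied from the install commands above, and the helper name installed_versions is hypothetical, not part of Arraylake.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names taken from the install commands above (an assumption about
# what the quickstart needs, not an official requirements list).
QUICKSTART_PACKAGES = ["arraylake", "icechunk", "zarr", "xarray", "netcdf4", "dask", "pooch"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(QUICKSTART_PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Any package reported as not installed should be installed before continuing.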
Log in:
- CLI
arraylake auth login
In some environments, such as JupyterHub, the authentication page may not open in the browser automatically. In that case, use arraylake auth login --no-browser instead.
- Python
from arraylake import Client
client = Client()
client.login()
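If you script the CLI login, for example in a notebook setup cell, the choice between the two command forms can be made explicit. This is a minimal sketch: login_command is a hypothetical helper, and it uses only the arraylake auth login command and the --no-browser flag mentioned above.

```python
import shlex


def login_command(open_browser: bool = True) -> list[str]:
    """Build the CLI login command; add --no-browser when no browser can be opened."""
    cmd = ["arraylake", "auth", "login"]
    if not open_browser:
        cmd.append("--no-browser")
    return cmd


# Show the command that would be run on a remote host such as JupyterHub.
print(shlex.join(login_command(open_browser=False)))
```

The returned list can be passed to subprocess.run if you want to launch the login from Python rather than from a terminal.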
Create your first repository:
- CLI
arraylake repo create myorg/myrepo
- Python
client.create_repo("myorg/myrepo")
Here, myorg is the name of your organization and myrepo is the name of your repository.
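The org/repo naming convention can be checked before calling the client. The following is a small sketch: split_repo_name is a hypothetical helper, and the rule that a name contains exactly one slash is an assumption inferred from the example above.

```python
def split_repo_name(full_name: str) -> tuple[str, str]:
    """Split "myorg/myrepo" into ("myorg", "myrepo").

    Assumes exactly one "/" separates the organization from the repository.
    """
    org, sep, repo = full_name.partition("/")
    if not sep or not org or not repo or "/" in repo:
        raise ValueError(f"expected a name like 'myorg/myrepo', got {full_name!r}")
    return org, repo


org, repo_name = split_repo_name("myorg/myrepo")
print(org, repo_name)
```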
Once you've created your repo, you can see it alongside the rest of your organization's repos:
- CLI
arraylake repo list myorg
- Python
client.list_repos("myorg")
Now, if you have been using the CLI, you can switch to Python and put data into the repo. First, connect with the Arraylake Client:
from arraylake import Client
client = Client()
repo = client.get_repo("myorg/myrepo")
The repo will be created as an Icechunk Repository, and Arraylake will automatically configure the appropriate cloud service credentials for you.
Next, load some of Xarray's tutorial data:
import xarray as xr
air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")
Then use Icechunk to write to the Arraylake Zarr store:
# Start an icechunk session
session = repo.writable_session(branch="main")
# write the data to zarr through the session's store
air_temp.to_zarr(session.store, group='air_temperature')
rasm.to_zarr(session.store, group='rasm')
# commit the data to Arraylake
first_commit_id = session.commit("My first commit 🥹")
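The write-then-commit pattern above can be wrapped in a small function so that several groups land in a single commit. This is a sketch under stated assumptions, not an Arraylake API. The helper write_groups is hypothetical. It assumes repo behaves like the object returned by client.get_repo(), and that writes go through the session's store.

```python
def write_groups(repo, datasets, message, branch="main"):
    """Write each dataset to its own group, then commit once.

    `datasets` maps group names to objects with a `to_zarr` method, such as
    xarray Datasets. Returns whatever `session.commit` returns (the commit ID).
    """
    # One writable session holds all pending writes until commit.
    session = repo.writable_session(branch=branch)
    for group, ds in datasets.items():
        ds.to_zarr(session.store, group=group)
    return session.commit(message)
```

With the tutorial data loaded earlier, a call could look like write_groups(repo, {"air_temperature": air_temp, "rasm": rasm}, "My first commit").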
In the future you can access the most recent data in the store:
session = repo.readonly_session(branch="main")
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)
Or you can time-travel and access the very first commit:
session = repo.readonly_session(snapshot=first_commit_id)
ds = xr.open_zarr(session.store, group='rasm', consolidated=False)
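The two read paths differ only in how the read-only session is selected. The sketch below makes that choice explicit. The helper readonly_session_for is hypothetical, and the branch and snapshot keyword names are taken from the calls shown above rather than checked against a particular Icechunk release.

```python
def readonly_session_for(repo, *, branch=None, snapshot=None):
    """Open a read-only session on either a branch tip or a specific commit."""
    # Exactly one selector must be given; both or neither is ambiguous.
    if (branch is None) == (snapshot is None):
        raise ValueError("pass exactly one of branch or snapshot")
    if branch is not None:
        return repo.readonly_session(branch=branch)
    return repo.readonly_session(snapshot=snapshot)
```

The returned session's store can then be passed to xr.open_zarr with consolidated=False, as in the two snippets above.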
And you are off to the races. Check out the User Guide to go deeper, or look at the repository history in the Arraylake web app.