🌱 Quick Start
Get started with Arraylake in 5 minutes by following this quick tutorial.
Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, please book a demo to learn more!
This tutorial assumes that your organization administrator has already set up and configured your org.
Install the Arraylake client and dependencies for the quickstart:
- pip:
pip install "arraylake" "xarray>=2025.1.1" netcdf4 dask pooch
- conda:
conda install -c conda-forge arraylake "zarr>=3" "xarray>=2025.1.1" netcdf4 dask pooch
Log in:
- CLI:
arraylake auth login
In some environments, such as JupyterHub, the authentication page may not open in the browser automatically. In that case, use arraylake auth login --no-browser instead.
- Python:
from arraylake import Client
client = Client()
client.login()
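If you script the CLI login for hosted notebooks, the choice between the two login commands above can be automated. The sketch below is a hypothetical helper, not part of Arraylake. It assumes that the JUPYTERHUB_USER environment variable, which JupyterHub sets for single-user servers, is a good enough signal that a browser window cannot be opened automatically.

```python
import os
from typing import Mapping, Optional


def login_command(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Build the `arraylake auth login` command for the given environment.

    Hypothetical helper: treats JUPYTERHUB_USER as a sign that the browser
    cannot be opened automatically (an assumption, not an Arraylake rule).
    """
    env = os.environ if env is None else env
    command = ["arraylake", "auth", "login"]
    if "JUPYTERHUB_USER" in env:
        # Fall back to the flag described above for headless environments.
        command.append("--no-browser")
    return command
```

The returned list can be passed to subprocess.run to start the login. The function itself only builds the command and has no side effects.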
Create your first repository:
- CLI:
arraylake repo create myorg/myrepo
- Python:
client.create_repo("myorg/myrepo")
Here myorg is the name of your organization and myrepo is the name of your repository.
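The org/repo naming convention can be checked before calling the API. The following is a minimal sketch of a hypothetical helper (split_repo_name is not an Arraylake function) that only verifies the "org/repo" shape. Arraylake may enforce additional naming rules that are not checked here.

```python
def split_repo_name(full_name: str) -> tuple[str, str]:
    """Split "org/repo" into (org, repo), rejecting malformed names."""
    org, sep, repo = full_name.partition("/")
    # Require exactly one slash with non-empty text on both sides.
    if not sep or not org or not repo or "/" in repo:
        raise ValueError(f"expected 'org/repo', got {full_name!r}")
    return org, repo
```

For example, split_repo_name("myorg/myrepo") yields the organization name and the repository name as separate strings, while a name such as "myrepo" raises ValueError.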
This is an Icechunk Repository, managed by Arraylake.
Once you've created your repo, you can see it alongside the rest of your organization's repos:
- CLI:
arraylake repo list myorg
- Python:
client.list_repos("myorg")
If you have been using the CLI, switch to Python now to put data into the repo. First, connect with the Arraylake Client:
from arraylake import Client
client = Client()
repo = client.get_repo("myorg/myrepo")
Arraylake will automatically obtain the needed cloud service credentials for you to access the data directly in object storage.
Next, load some of Xarray's tutorial data:
import xarray as xr
air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")
Then use Xarray to write to the repo:
# Start an Icechunk session
session = repo.writable_session(branch="main")
# Write the data to Zarr
air_temp.to_zarr(session.store, group='air_temperature')
rasm.to_zarr(session.store, group='rasm')
# Commit the data to Arraylake
first_commit_id = session.commit("My first commit 🥹")
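The write-then-commit steps above can be wrapped in a small function when several groups are written together. This is a sketch under stated assumptions: write_groups is a hypothetical name, repo is expected to behave like the object returned by client.get_repo(), and only the calls already shown in this tutorial (writable_session, to_zarr, commit) are used.

```python
from typing import Any, Mapping


def write_groups(
    repo: Any,
    datasets: Mapping[str, Any],
    message: str,
    branch: str = "main",
) -> Any:
    """Write each dataset to its own group, then commit once.

    `datasets` maps group names to Xarray datasets (hypothetical helper).
    """
    session = repo.writable_session(branch=branch)
    for group, dataset in datasets.items():
        dataset.to_zarr(session.store, group=group)
    # One commit at the end, so all groups are recorded together.
    return session.commit(message)
```

With the tutorial data, the call would look like first_commit_id = write_groups(repo, {"air_temperature": air_temp, "rasm": rasm}, "My first commit"). Committing once after all writes keeps both groups in the same commit, as in the step-by-step version.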
In the future you can access the most recent data in the repo:
session = repo.readonly_session(branch="main")
ds = xr.open_zarr(session.store, group='rasm')
or you can time-travel and access the very first commit:
session = repo.readonly_session(snapshot=first_commit_id)
ds = xr.open_zarr(session.store, group='rasm')
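The two read patterns above differ only in how the session is requested, so they can share one entry point. The sketch below uses a hypothetical helper name (read_session) and assumes readonly_session accepts the branch and snapshot keywords exactly as shown in this tutorial.

```python
from typing import Any, Optional


def read_session(repo: Any, snapshot: Optional[str] = None, branch: str = "main") -> Any:
    """Return a read-only session for a snapshot if given, else for the branch."""
    if snapshot is not None:
        # Time travel: pin the session to one specific commit.
        return repo.readonly_session(snapshot=snapshot)
    # Default: read the most recent data on the branch.
    return repo.readonly_session(branch=branch)
```

For example, xr.open_zarr(read_session(repo).store, group='rasm') reads the latest data, and xr.open_zarr(read_session(repo, first_commit_id).store, group='rasm') reads the first commit.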
And you are off to the races. Check out the User Guide to go deeper, or look at the repository history in the Arraylake web app.