Arraylake Quick Start
Get started with Arraylake in 5 minutes by following along with this quick tutorial.
Arraylake is only available to Earthmover customers. If you're interested in becoming a customer, feel free to book a demo.
Install the Arraylake client and the dependencies used in this quick start:
# with pip
pip install arraylake xarray netcdf4 dask pooch
# or with conda
conda install -c conda-forge arraylake xarray netcdf4 dask pooch
This tutorial assumes you already have an organization set up with a valid bucket configuration. Start by logging in:
arraylake auth login
Create your first repository:
arraylake repo create myorg/myrepo
Here, myorg is the name of your organization and myrepo is the name of your repository.
Once you've created your repo, you can see it alongside the rest of your organization's repos:
arraylake repo list myorg
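If you would rather drive these setup steps from a script, the sketch below assembles the same two CLI invocations as argument lists and only executes them when asked. The helper names `arraylake_setup_commands` and `run_commands` and the `run` flag are hypothetical, introduced for illustration; only the `arraylake repo create` and `arraylake repo list` commands come from this guide.

```python
import shlex
import subprocess


def arraylake_setup_commands(org: str, repo: str) -> list[list[str]]:
    """Build the CLI commands from this guide for a given org and repo.

    Hypothetical helper: it only assembles argument lists and does not
    contact Arraylake by itself.
    """
    if not org or not repo or "/" in org or "/" in repo:
        raise ValueError("org and repo must be non-empty and must not contain '/'")
    return [
        ["arraylake", "repo", "create", f"{org}/{repo}"],
        ["arraylake", "repo", "list", org],
    ]


def run_commands(commands: list[list[str]], run: bool = False) -> None:
    """Print each command; execute it only when run=True."""
    for cmd in commands:
        print(shlex.join(cmd))
        if run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)


commands = arraylake_setup_commands("myorg", "myrepo")
run_commands(commands)  # dry run: prints the commands without executing them
```

Passing `run=True` would execute the commands for real, which assumes the `arraylake` CLI is installed and you are already logged in.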
Write data with the Python client:
Now we'll switch to Python, where we'll put some data into the repo and do a few quick tasks. First, we'll connect our client:
from arraylake import Client
client = Client()
repo = client.get_repo("myorg/myrepo")
(AWS credentials with appropriate write access to your target S3 bucket should be available in your environment.)
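As a quick pre-flight check, the sketch below reports which of the standard AWS credential environment variables are unset. It assumes credentials are supplied through `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; if you rely on a named profile, SSO, or an instance role instead, those variables may legitimately be absent. The function name `missing_aws_env_vars` is hypothetical.

```python
import os
from typing import Mapping, Optional

# Assumption: credentials come from these two standard environment variables.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_env_vars()
if missing:
    print("Not set in this environment:", ", ".join(missing))
else:
    print("AWS credential variables found.")
```

This only checks that the variables exist; it does not verify that the credentials are valid or that they grant write access to the bucket.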
Next, we'll pull some of Xarray's tutorial data and write it to Arraylake:
import xarray as xr
air_temp = xr.tutorial.open_dataset("air_temperature").chunk("1mb")
rasm = xr.tutorial.open_dataset("rasm").chunk("1mb")
air_temp.to_zarr(repo.store, group='air_temperature')
rasm.to_zarr(repo.store, group='rasm')
Now that we've put some data into Arraylake, we can commit our changes:
commit_id = repo.commit("My first commit 🥹")
Next, we can go back and access the data in our store:
ds = xr.open_zarr(repo.store, group="rasm")
And you are off to the races. Check out the Tutorial to go deeper.