Managing Repos
Once your Arraylake organization has been fully configured and you have installed the client library, you're ready to start managing data! 🎉
Create a Repo
For the purposes of this example, our org name will be earthmover. If you are running these commands interactively, replace earthmover with your org name.
For this example, we are going to create a Repo called ocean to hold oceanography data 🌊.
CLI:
    arraylake repo create earthmover/ocean
Python:
    from arraylake import Client
    client = Client()
    client.create_repo("earthmover/ocean")
Python (asyncio):
    from arraylake import AsyncClient
    aclient = AsyncClient()
    await aclient.create_repo("earthmover/ocean")
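The asyncio variant uses a bare await, which works where an event loop is already running (for example in a Jupyter notebook) but not at the top level of a plain Python script. A minimal sketch of one way to run it from a script follows. The helper names create_repo_async and run_create are illustrative and not part of Arraylake, and the client class is passed in as a factory so that it is constructed inside the running loop.

```python
import asyncio


async def create_repo_async(make_client, name):
    """Create a Repo from inside a coroutine.

    make_client is a zero-argument callable that returns an async client,
    such as arraylake.AsyncClient. This helper is hypothetical.
    """
    aclient = make_client()  # constructed while the event loop is running
    await aclient.create_repo(name)


def run_create(make_client, name="earthmover/ocean"):
    # asyncio.run starts an event loop, runs the coroutine to completion,
    # and closes the loop afterwards.
    asyncio.run(create_repo_async(make_client, name))
```

With Arraylake installed and configured, this could be called as run_create(AsyncClient) after from arraylake import AsyncClient.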
Where is Repo data stored?
Arraylake lets you configure the storage location for a Repo's data using org-level bucket configurations.
Choose a specific bucket by passing bucket_config_nickname to create_repo.
If not specified, the organization's default bucket is used.
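As a rough illustration of the bucket selection described above, the helper below forwards bucket_config_nickname to create_repo only when a nickname is supplied, so the organization's default bucket applies otherwise. The helper name create_repo_in_bucket is hypothetical, and client is assumed to be an Arraylake Client (or any object with a compatible create_repo method).

```python
def create_repo_in_bucket(client, name, bucket_nickname=None):
    """Create a Repo, optionally pinning it to a named bucket configuration."""
    kwargs = {}
    if bucket_nickname is not None:
        # bucket_config_nickname selects one of the org-level bucket configurations.
        kwargs["bucket_config_nickname"] = bucket_nickname
    # With no nickname, create_repo is called without the keyword and the
    # organization's default bucket is used.
    return client.create_repo(name, **kwargs)
```

For instance, create_repo_in_bucket(client, "earthmover/ocean", "my-bucket") would target a bucket configuration nicknamed my-bucket, a made-up nickname used only for illustration.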
Within the bucket and prefix set by an org-level bucket configuration, data for new Icechunk Repos is stored under an additional prefix.
By default, this extra prefix is the Repo name prefixed with 8 random characters.
Choose a specific extra prefix by passing the prefix kwarg to create_repo.
For example, for a bucket configured with bucket='my-bucket-name' and prefix='my-bucket-prefix':
- create_repo("repo-A") stores data in my-bucket-name/my-bucket-prefix/[8-RANDOM_CHARACTERS]_repo_A
- create_repo("repo-B", prefix='zoo') stores data in my-bucket-name/my-bucket-prefix/zoo/
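The layout rule above can be modelled with a few lines of plain Python. This is only a sketch for reasoning about paths, not an Arraylake API: the function name expected_repo_location is hypothetical, and the character set of the random part, the "_" separator, and the lack of any name normalization are assumptions. Arraylake itself decides the actual default prefix.

```python
import secrets
import string


def expected_repo_location(bucket, bucket_prefix, repo_name, prefix=None):
    """Approximate the storage location of a new Repo's data."""
    if prefix is None:
        # Assumption: 8 random lowercase letters or digits, joined to the
        # Repo name with an underscore.
        alphabet = string.ascii_lowercase + string.digits
        random_part = "".join(secrets.choice(alphabet) for _ in range(8))
        prefix = f"{random_part}_{repo_name}"
    # Bucket, bucket-level prefix, and Repo-level prefix form one path.
    return "/".join([bucket, bucket_prefix, prefix.strip("/")])
```

Calling expected_repo_location("my-bucket-name", "my-bucket-prefix", "repo-B", prefix="zoo") gives a fixed path, while omitting prefix yields a different random component on each call.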
Open a Repo
If you're working in Python, you can open a Repo and start interacting with your data.
Python:
    repo = client.get_repo("earthmover/ocean")
Python (asyncio):
    arepo = await aclient.get_repo("earthmover/ocean")  # returns an AsyncRepo object
List Repos
You can list the Repos associated with an organization.
CLI:
    arraylake repo list earthmover
Python:
    client.list_repos("earthmover")
Python (asyncio):
    await aclient.list_repos("earthmover")
Delete a Repo
Finally, we can delete a Repo.
Deleting a Repo cannot be undone! Use this operation carefully.
CLI:
    arraylake repo delete earthmover/ocean
Python:
    client.delete_repo("earthmover/ocean", imsure=True, imreallysure=True)
Python (asyncio):
    await aclient.delete_repo("earthmover/ocean", imsure=True, imreallysure=True)
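Because deletion is irreversible, scripts may want an extra guard in front of delete_repo. The sketch below is one possible pattern, not something Arraylake provides: the helper name delete_repo_confirmed is hypothetical, and it only calls delete_repo when the caller has re-typed the full Repo name.

```python
def delete_repo_confirmed(client, name, typed_confirmation):
    """Delete a Repo only if the caller re-typed its full name."""
    if typed_confirmation != name:
        # Refuse before touching the client, so a typo never deletes anything.
        raise ValueError(
            f"Confirmation {typed_confirmation!r} does not match {name!r}; nothing deleted."
        )
    # Both flags are passed explicitly, as in the delete example above.
    client.delete_repo(name, imsure=True, imreallysure=True)
```

In an interactive session the confirmation could come from input("Type the Repo name to confirm: ").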