Organizations, Users, and Access Management
Before using Arraylake, you'll need to configure your organization, users, and access management.
Organizations
An Arraylake org represents a collection of users that maintains multiple repos.
A company or institution will typically have a single Arraylake org, and an Arraylake user can belong to one or more organizations.
The org is the owner of all repos.
When using the Arraylake API, the org identifier prefixes a repository name to uniquely identify a repository. For example, my-company/sentinel-repo represents the sentinel-repo repository belonging to my-company.
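The naming convention above can be illustrated with a short sketch. The helper split_repo_name is hypothetical and not part of the Arraylake API; it only shows how the org prefix and the repo name combine into a single identifier.

```python
def split_repo_name(full_name: str) -> tuple[str, str]:
    """Split '<org>/<repo>' into (org, repo).

    Hypothetical helper for illustration; not an Arraylake function.
    """
    org, sep, repo = full_name.partition("/")
    if not sep or not org or not repo:
        raise ValueError(f"expected '<org>/<repo>', got {full_name!r}")
    return org, repo


org, repo = split_repo_name("my-company/sentinel-repo")
print(org)   # my-company
print(repo)  # sentinel-repo
```

Rejecting names without an org prefix early can make it easier to spot a repository name that was passed without its owning org.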
Arraylake orgs are created by the Earthmover team once your account has been provisioned.
For more depth, have a look at the Arraylake data model.
Users
An Arraylake user is an individual human who can access the Arraylake platform.
Users are uniquely identified by an email address.
User Roles
Currently Arraylake supports two types of roles:
- Regular user (can read and write data and create / delete repos)
- Admin (can manage other users, API keys, and Bucket configuration)
Creating and Managing Users
When your Arraylake org is created, the Earthmover team will create at least one admin user. This admin user can create more users and edit their permissions.
Users are managed via the Organization Settings dashboard:
To add a new user, click the big purple "Add Member" button. Then enter the user email:
To set user permissions, click on the "Permissions" tab and then click the pen symbol to edit permissions.
To delete a user, click the garbage-can symbol.
Creating and Managing API Keys
API keys can be used in place of user accounts to provide access to Arraylake data. API keys can be thought of as "service accounts," suitable for use in automated data processing jobs, dashboards, or other machine-to-machine connections.
To manage API Keys, navigate to the "API Clients" tab:
Here you can view and delete existing API keys.
To create a new API key, click on the big purple "New API Client" button.
Enter a name for the API key and select a lifetime. (Once an API key expires it can no longer be used.)
Once the key has been created, you will be shown a secret token.
About API Keys
- API Keys are associated with shared Service Accounts. They are intended to be used for machine-to-machine authentication on behalf of these accounts, and should not be used for individual user access to services.
- API Keys use an identifier of the form: <name>@<org>.service.earthmover.io. The <name> component of this identifier is defined by the requester. (Note that this is not actually a valid email address.)
- Secret tokens expire after 1 year by default. This is configurable as needed by the requester.
- Secret tokens are a single string, prefixed with the ema_ identifier.
  - Example Token: ema_123456789123456789_123456789123456789123456789
- Tokens should be considered secret, and should not be shared publicly. Owners should take appropriate precautions when distributing them in deployed environments. For example, consider storing the token in a service like AWS SSM Parameter Store so that deployed services or jobs can access it.
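The identifier and token formats described above can be sketched in a few lines of Python. All three helpers are hypothetical and not part of Arraylake, and the name etl-job is made up for illustration. The check only looks at the documented ema_ prefix; it cannot tell whether a token is valid or unexpired.

```python
TOKEN_PREFIX = "ema_"  # prefix documented for Arraylake secret tokens


def service_identifier(name: str, org: str) -> str:
    """Build '<name>@<org>.service.earthmover.io' (hypothetical helper)."""
    return f"{name}@{org}.service.earthmover.io"


def looks_like_token(value: str) -> bool:
    """Cheap sanity check: starts with 'ema_' and has something after it.

    This does not prove that the token is valid or unexpired.
    """
    return value.startswith(TOKEN_PREFIX) and len(value) > len(TOKEN_PREFIX)


def redact(token: str) -> str:
    """Return a log-safe form that never exposes the secret part."""
    return TOKEN_PREFIX + "***" if looks_like_token(token) else "***"


print(service_identifier("etl-job", "my-company"))
# etl-job@my-company.service.earthmover.io
print(redact("ema_123456789123456789_123456789123456789123456789"))
# ema_***
```

Redacting tokens before they reach logs is one simple precaution in the spirit of the guidance above.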
To delete an API key, click the garbage-can symbol.
Client Authentication
Interacting with data in Arraylake from Python requires authentication by the client.
Arraylake supports two methods of authenticating requests to the service:
- User identities: A User authenticates directly
- API Keys: An API token associated with a shared service account is used
Authenticating as a User
To authenticate as a user, use the Arraylake CLI or Python API.
Running the following command will initiate the login flow by directing you to a login page associated with your organization:
- CLI
arraylake auth login
# Or, if running from a remote environment
arraylake auth login --no-browser
- Python
from arraylake import Client
client = Client()
client.login()
This flow yields a code that you enter at the command-line prompt to authenticate your access to the service.
Subsequently, you can use arraylake auth logout to log out, or arraylake auth refresh to refresh your authenticated status.
Authenticating with an API Key
At this time, API Key authentication is available via the Python API only. To authenticate with an API Key, pass the appropriate API token as a parameter to the Client. We recommend storing the token as an environment variable rather than hard-coding the token in code. For example, if the environment variable MY_ARRAYLAKE_API_TOKEN is set, you can access it as follows:
- Python
import os
from arraylake import Client
api_token = os.environ.get('MY_ARRAYLAKE_API_TOKEN')
client = Client(token=api_token)
client.list_repos('my-org')
If present, the ARRAYLAKE_TOKEN environment variable will automatically be detected and used to populate the client token. This is the easiest way to configure access to Arraylake from automated scripts, cron jobs, and CI environments.
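In automated jobs it can help to fail early with a clear message when no token is configured. The sketch below is one way to do that. The helpers resolve_token and require_token are hypothetical, and the precedence between the two variables is an assumption of this sketch, not documented Arraylake behavior.

```python
import os
from typing import Mapping, Optional


def resolve_token(
    env: Mapping[str, str] = os.environ,
    custom_var: str = "MY_ARRAYLAKE_API_TOKEN",
) -> Optional[str]:
    """Return the custom variable if set, else ARRAYLAKE_TOKEN, else None.

    The precedence order is an assumption made for this example.
    """
    return env.get(custom_var) or env.get("ARRAYLAKE_TOKEN")


def require_token(env: Mapping[str, str] = os.environ) -> str:
    """Return a token, or raise before any request is attempted."""
    token = resolve_token(env)
    if not token:
        raise RuntimeError("No Arraylake API token found in the environment")
    return token


# Usage, following the Client call shown earlier:
#   from arraylake import Client
#   client = Client(token=require_token())
```

Passing the environment in as a mapping keeps the helper easy to test without touching the real process environment.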
Enterprise Authentication Support
Arraylake supports social login using Google, GitHub, and Microsoft accounts. For most organizations, no additional configuration is needed. Simply add users to your organization using their email address and instruct them to log in with the appropriate authentication provider.
GitHub
If you plan to let your users log in to Arraylake with GitHub, remember that you'll be adding users via the email address associated with their GitHub account. To find the email address associated with a GitHub user account, you can use the GitHub API. Below is an example showing how this can be done using GitHub's CLI:
gh api \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
/users/jhamman
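The same lookup can be sketched in Python using only the standard library. The function names here are made up for illustration. The sketch assumes an unauthenticated request, which GitHub rate-limits, and it reads the email field of the user profile, which is null when the user has not set a public email address.

```python
import json
import urllib.request
from typing import Optional

GITHUB_API = "https://api.github.com"


def build_request(username: str) -> urllib.request.Request:
    """Build the same request as the gh command above (no auth header)."""
    return urllib.request.Request(
        f"{GITHUB_API}/users/{username}",
        headers={
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


def public_email(profile_json: str) -> Optional[str]:
    """Extract the 'email' field; None if the user has no public email."""
    return json.loads(profile_json).get("email")


def fetch_public_email(username: str) -> Optional[str]:
    """Perform the HTTP call and return the public email, if any."""
    with urllib.request.urlopen(build_request(username), timeout=10) as resp:
        return public_email(resp.read().decode("utf-8"))


# Usage (performs a network request):
#   print(fetch_public_email("jhamman"))
```

If the result is None, ask the user which email address is attached to their GitHub account before adding them.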
Microsoft Active Directory
If your organization uses enterprise Microsoft Active Directory to manage users, you may need to approve Earthmover's OIDC application before your users can log in. This can be done by following this link.