Python SDK Reference
Complete API reference for the MetalCloud Python SDK.
Installation
Terminal
pip install metalcloud
Requires Python 3.9+. The SDK has minimal dependencies and works on macOS, Linux, and Windows.
Client
metalcloud.Client()
The main client for interacting with MetalCloud.
Python
import metalcloud

# Using environment variable (recommended)
client = metalcloud.Client()

# Or with explicit API key
client = metalcloud.Client(api_key="mc_...")

# With custom options
client = metalcloud.Client(
    api_key="mc_...",
    base_url="https://api.metalcloud.space",
    timeout=60
)
| Parameter | Type | Description |
|---|---|---|
| api_key | str | Your API key. Defaults to the METALCLOUD_API_KEY env var. |
| base_url | str | API base URL. Defaults to the production endpoint. |
| timeout | int | Request timeout in seconds. Default: 300. |
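The default-resolution rule in the table (an explicit api_key wins, then the METALCLOUD_API_KEY environment variable) can be sketched as a plain helper. resolve_api_key is illustrative only and not part of the SDK:

```python
import os

def resolve_api_key(explicit_key=None):
    """Return an API key: an explicit argument wins, then the env var."""
    key = explicit_key or os.environ.get("METALCLOUD_API_KEY")
    if key is None:
        raise ValueError("No API key: pass api_key or set METALCLOUD_API_KEY")
    return key
```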
client.inference()
Run inference on a model. Returns a response object, or a generator of chunks when stream=True.
Python
# Basic inference
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Explain quantum computing",
    max_tokens=500
)
print(response.text)

# Streaming inference
for chunk in client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Write a story",
    stream=True
):
    print(chunk.text, end="")

# Chat format
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
| Parameter | Type | Description |
|---|---|---|
| model* | str | Model identifier (e.g., "meta-llama/Llama-3.3-70B"). |
| prompt | str | Text prompt for completion. Use this OR messages. |
| messages | list | Chat messages array. Use this OR prompt. |
| max_tokens | int | Maximum tokens to generate. Default: 256. |
| temperature | float | Sampling temperature (0-2). Default: 1.0. |
| top_p | float | Nucleus sampling threshold. Default: 1.0. |
| stream | bool | Enable streaming responses. Default: False. |
| stop | list[str] | Stop sequences to end generation. |
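Because prompt and messages are mutually exclusive, it can help to validate arguments before calling the API. validate_inference_args below is a hypothetical pre-flight check, not an SDK function:

```python
def validate_inference_args(prompt=None, messages=None):
    """Enforce the prompt-XOR-messages rule: exactly one must be given."""
    if (prompt is None) == (messages is None):
        raise ValueError("Provide exactly one of 'prompt' or 'messages'")
```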
client.list_models()
List available models and their specifications.
Python
models = client.list_models()
for model in models:
    print(f"{model.id}: {model.parameters}B params")
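Assuming each model object exposes the id and parameters fields used above, picking the largest available model is a one-liner. largest_model is illustrative only:

```python
def largest_model(models):
    """Return the model with the most parameters (in billions)."""
    return max(models, key=lambda m: m.parameters)
```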
Job Management
client.get_job()
Get status and details of a submitted job.
Python
job = client.get_job(job_id="job_abc123")
print(job.status)  # "pending", "running", "completed", "failed"
print(job.result)  # Output when completed
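Jobs move from "pending" through "running" to a terminal state, so a common pattern is to poll until completion. This sketch takes the fetch function as an argument (e.g. client.get_job); wait_for_job is not part of the SDK:

```python
import time

def wait_for_job(get_job, job_id, poll_interval=2.0, timeout=600.0):
    """Poll get_job until the job reaches a terminal state or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_job(job_id=job_id)
        if job.status in ("completed", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")
```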
client.list_jobs()
List your recent jobs with optional filtering.
Python
jobs = client.list_jobs(
    limit=10,
    status="completed"
)
for job in jobs:
    print(f"{job.id}: {job.status}")
Error Handling
The SDK raises typed exceptions for different error conditions:
Python
from metalcloud.exceptions import (
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    ModelNotFoundError
)

try:
    response = client.inference(...)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except InsufficientCreditsError:
    print("Add credits to continue")
except ModelNotFoundError:
    print("Unknown model identifier")
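Since RateLimitError carries a retry_after hint, a simple retry loop follows naturally. The helper below is a sketch, parameterized over the exception class so it is not tied to the SDK; call_with_retries is hypothetical:

```python
import time

def call_with_retries(fn, max_attempts=3, rate_limit_exc=Exception):
    """Retry fn, sleeping for the server-suggested retry_after on rate limits."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except rate_limit_exc as e:
            if attempt == max_attempts:
                raise
            # Fall back to exponential backoff if no retry_after hint is set
            time.sleep(getattr(e, "retry_after", 2 ** attempt))
```

In practice you would pass rate_limit_exc=RateLimitError and wrap the client.inference call in a lambda or functools.partial.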