Python SDK Reference

Complete API reference for the MetalCloud Python SDK.

Installation

Terminal
pip install metalcloud

Requires Python 3.9+. The SDK has minimal dependencies and works on macOS, Linux, and Windows.

Client

metalcloud.Client()

The main client for interacting with MetalCloud.

Python
import metalcloud

# Using environment variable (recommended)
client = metalcloud.Client()

# Or with explicit API key
client = metalcloud.Client(api_key="mc_...")

# With custom options
client = metalcloud.Client(
    api_key="mc_...",
    base_url="https://api.metalcloud.space",
    timeout=60
)
Parameter Type Description
api_key str Your API key. Defaults to METALCLOUD_API_KEY env var.
base_url str API base URL. Defaults to production endpoint.
timeout int Request timeout in seconds. Default: 300.

client.inference()

Run inference on a model. Returns a response object, or a generator of streamed chunks when stream=True.

Python
# Basic inference
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Explain quantum computing",
    max_tokens=500
)
print(response.text)

# Streaming inference
for chunk in client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Write a story",
    stream=True
):
    print(chunk.text, end="")

# Chat format
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
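When streaming, the chunks can be reassembled into the full completion. A minimal helper sketch, assuming each chunk exposes a .text attribute as in the streaming example above (the helper name is my own):

```python
def collect_stream(chunks):
    """Join the .text of each streamed chunk into one string."""
    return "".join(chunk.text for chunk in chunks)
```

Usage (sketch): `text = collect_stream(client.inference(..., stream=True))`.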
Parameter Type Description
model* str Model identifier (e.g., "meta-llama/Llama-3.3-70B")
prompt str Text prompt for completion. Use this OR messages.
messages list Chat messages array. Use this OR prompt.
max_tokens int Maximum tokens to generate. Default: 256.
temperature float Sampling temperature (0-2). Default: 1.0.
top_p float Nucleus sampling threshold. Default: 1.0.
stream bool Enable streaming responses. Default: False.
stop list[str] Stop sequences to end generation.

client.list_models()

List available models and their specifications.

Python
models = client.list_models()

for model in models:
    print(f"{model.id}: {model.parameters}B params")
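Since each model reports its parameter count, a model can be chosen programmatically. A sketch (the helper name is hypothetical; it assumes only the .id and .parameters attributes shown above):

```python
def smallest_model_with(models, min_params_b):
    """Return the smallest model with at least `min_params_b` billion parameters,
    or None if no model qualifies."""
    candidates = [m for m in models if m.parameters >= min_params_b]
    return min(candidates, key=lambda m: m.parameters) if candidates else None
```

Usage (sketch): `model = smallest_model_with(client.list_models(), 30)`.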

Job Management

client.get_job()

Get status and details of a submitted job.

Python
job = client.get_job(job_id="job_abc123")
print(job.status)  # "pending", "running", "completed", "failed"
print(job.result)  # Output when completed
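Jobs run asynchronously, so a common pattern is to poll get_job until the job reaches a terminal status. A sketch (the helper name and timing defaults are my own):

```python
import time

def wait_for_job(client, job_id, poll_interval=5.0, timeout=600.0):
    """Poll client.get_job until the job completes or fails, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = client.get_job(job_id=job_id)
        if job.status in ("completed", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")
```

Usage (sketch): `job = wait_for_job(client, "job_abc123")`, then read `job.result` when `job.status == "completed"`.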

client.list_jobs()

List your recent jobs with optional filtering.

Python
jobs = client.list_jobs(
    limit=10,
    status="completed"
)

for job in jobs:
    print(f"{job.id}: {job.status}")
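The listing can also be summarized client-side, for example counting jobs per status. A small stdlib sketch over the job objects shown above:

```python
from collections import Counter

def jobs_by_status(jobs):
    """Count jobs per status, e.g. Counter({'completed': 8, 'failed': 2})."""
    return Counter(job.status for job in jobs)
```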

Error Handling

The SDK raises typed exceptions for different error conditions:

Python
from metalcloud.exceptions import (
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    ModelNotFoundError
)

try:
    response = client.inference(...)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except InsufficientCreditsError:
    print("Add credits to continue")
except ModelNotFoundError:
    print("Unknown model identifier")
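Because RateLimitError carries a retry_after hint, automatic retries are straightforward. A generic sketch (the wrapper and its defaults are my own; pass metalcloud's RateLimitError as rate_limit_exc):

```python
import time

def retry_on_rate_limit(call, rate_limit_exc, max_retries=3, base_delay=1.0):
    """Invoke call(); on a rate-limit error, sleep (honoring e.retry_after when
    set, else exponential backoff) and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except rate_limit_exc as e:
            if attempt == max_retries:
                raise
            delay = getattr(e, "retry_after", None) or base_delay * 2 ** attempt
            time.sleep(delay)

# Usage (sketch):
# from metalcloud.exceptions import RateLimitError
# response = retry_on_rate_limit(
#     lambda: client.inference(model="meta-llama/Llama-3.3-70B", prompt="Hi"),
#     RateLimitError,
# )
```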