Python SDK Reference

Complete API reference for the MetalCloud Python SDK.

Installation

Terminal
pip install metalcloud

Requires Python 3.9+. The SDK has minimal dependencies and works on macOS, Linux, and Windows.

Client

metalcloud.Client()

The main client for interacting with MetalCloud.

Python
import metalcloud

# Using environment variable (recommended)
client = metalcloud.Client()

# Or with explicit API key
client = metalcloud.Client(api_key="mc_...")

# With custom options
client = metalcloud.Client(
    api_key="mc_...",
    base_url="https://api.metalcloud.space",
    timeout=60
)
Parameter Type Description
api_key str Your API key. Defaults to METALCLOUD_API_KEY env var.
base_url str API base URL. Defaults to production endpoint.
timeout int Request timeout in seconds. Default: 300.

client.inference()

Run inference on a model. Returns a response object, or a generator of streamed chunks when stream=True.

Python
# Basic inference
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Explain quantum computing",
    max_tokens=500
)
print(response.text)

# Streaming inference
for chunk in client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Write a story",
    stream=True
):
    print(chunk.text, end="")

# Chat format
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
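When streaming, the chunks can be reassembled into the full completion. A minimal helper sketch, assuming each chunk exposes a .text attribute as in the streaming example above (the helper name is my own):

```python
def collect_stream(chunks):
    """Join the .text of each streamed chunk into one string."""
    return "".join(chunk.text for chunk in chunks)
```

Usage (sketch): `text = collect_stream(client.inference(..., stream=True))`.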
Parameter Type Description
model* str Model identifier (e.g., "meta-llama/Llama-3.3-70B")
prompt str Text prompt for completion. Use this OR messages.
messages list Chat messages array. Use this OR prompt.
max_tokens int Maximum tokens to generate. Default: 256.
temperature float Sampling temperature (0-2). Default: 1.0.
top_p float Nucleus sampling threshold. Default: 1.0.
stream bool Enable streaming responses. Default: False.
stop list[str] Stop sequences to end generation.

client.list_models()

List available models and their specifications.

Python
models = client.list_models()

for model in models:
    print(f"{model.id}: {model.parameters}B params")
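Since each model reports its parameter count, a model can be chosen programmatically. A sketch (the helper name is hypothetical; it assumes only the .id and .parameters attributes shown above):

```python
def smallest_model_with(models, min_params_b):
    """Return the smallest model with at least `min_params_b` billion parameters,
    or None if no model qualifies."""
    candidates = [m for m in models if m.parameters >= min_params_b]
    return min(candidates, key=lambda m: m.parameters) if candidates else None
```

Usage (sketch): `model = smallest_model_with(client.list_models(), 30)`.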

Job Management

client.get_job()

Get status and details of a submitted job.

Python
job = client.get_job(job_id="job_abc123")
print(job.status)  # "pending", "running", "completed", "failed"
print(job.result)  # Output when completed
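Jobs run asynchronously, so a common pattern is to poll get_job until the job reaches a terminal status. A sketch (the helper name and timing defaults are my own):

```python
import time

def wait_for_job(client, job_id, poll_interval=5.0, timeout=600.0):
    """Poll client.get_job until the job completes or fails, or raise on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = client.get_job(job_id=job_id)
        if job.status in ("completed", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")
```

Usage (sketch): `job = wait_for_job(client, "job_abc123")`, then read `job.result` when `job.status == "completed"`.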

client.list_jobs()

List your recent jobs with optional filtering.

Python
jobs = client.list_jobs(
    limit=10,
    status="completed"
)

for job in jobs:
    print(f"{job.id}: {job.status}")
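The listing can also be summarized client-side, for example counting jobs per status. A small stdlib sketch over the job objects shown above:

```python
from collections import Counter

def jobs_by_status(jobs):
    """Count jobs per status, e.g. Counter({'completed': 8, 'failed': 2})."""
    return Counter(job.status for job in jobs)
```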

Error Handling

The SDK raises typed exceptions for different error conditions:

Python
from metalcloud.exceptions import (
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    ModelNotFoundError
)

try:
    response = client.inference(...)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except InsufficientCreditsError:
    print("Add credits to continue")
except ModelNotFoundError:
    print("Unknown model identifier")
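Because RateLimitError carries a retry_after hint, automatic retries are straightforward. A generic sketch (the wrapper and its defaults are my own; pass metalcloud's RateLimitError as rate_limit_exc):

```python
import time

def retry_on_rate_limit(call, rate_limit_exc, max_retries=3, base_delay=1.0):
    """Invoke call(); on a rate-limit error, sleep (honoring e.retry_after when
    set, else exponential backoff) and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except rate_limit_exc as e:
            if attempt == max_retries:
                raise
            delay = getattr(e, "retry_after", None) or base_delay * 2 ** attempt
            time.sleep(delay)

# Usage (sketch):
# from metalcloud.exceptions import RateLimitError
# response = retry_on_rate_limit(
#     lambda: client.inference(model="meta-llama/Llama-3.3-70B", prompt="Hi"),
#     RateLimitError,
# )
```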