Quickstart Guide

Get from zero to running AI inference on Apple Silicon in under 5 minutes.

Prerequisites

Before you begin, make sure you have:

- Python 3 and pip installed
- A MetalCloud API key

Step 1: Install the SDK

Install via pip

The MetalCloud Python SDK is available on PyPI:

Terminal
pip install metalcloud

This installs the metalcloud package and all required dependencies.

Step 2: Authenticate

Set your API key

You can authenticate in two ways:

Option A: Environment Variable (Recommended)

Terminal
export METALCLOUD_API_KEY="your-api-key-here"

Option B: Direct in Code

Python
import metalcloud

client = metalcloud.Client(api_key="your-api-key-here")

🔐 Security Tip

Never commit your API key to version control. Use environment variables or a secrets manager in production.
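
One way to follow that advice is to read the key from the environment and fail fast if it is missing. A minimal sketch (the `load_api_key` helper is our illustration, not part of the SDK):

```python
import os

def load_api_key() -> str:
    # Prefer the environment variable so the key never lives in source code.
    key = os.environ.get("METALCLOUD_API_KEY")
    if not key:
        raise RuntimeError("METALCLOUD_API_KEY is not set")
    return key

# For illustration only; in practice, set the variable in your shell.
os.environ["METALCLOUD_API_KEY"] = "demo-key"
print(load_api_key())  # prints "demo-key"
```

You can then pass the result to `metalcloud.Client(api_key=...)`, or simply rely on the client reading the environment variable itself.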

Step 3: Run Your First Job

Submit an inference request

Here's a complete example that runs Llama 3.3 70B:

Python
import metalcloud

# Initialize the client (uses METALCLOUD_API_KEY env var)
client = metalcloud.Client()

# Run inference
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Explain the benefits of unified memory in Apple Silicon chips.",
    max_tokens=500,
    temperature=0.7
)

# Print the response
print(response.text)

# View usage stats
print(f"Tokens used: {response.usage.total_tokens}")
print(f"Cost: £{response.usage.cost:.4f}")
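
Remote inference calls can fail transiently (network hiccups, capacity limits), so it is worth wrapping them in a retry loop. A sketch with exponential backoff — `TransientError` and `run_once` are hypothetical stand-ins, not SDK names; substitute your real `client.inference(...)` call:

```python
import time

class TransientError(Exception):
    """Hypothetical stand-in for a retryable SDK error."""

attempts = {"n": 0}

def run_once():
    # Stand-in for client.inference(...); fails twice, then succeeds.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TransientError("temporary failure")
    return "ok"

def run_with_retries(fn, retries=3, backoff=0.01):
    # Retry on transient errors, doubling the wait between attempts.
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (2 ** attempt))

print(run_with_retries(run_once))  # prints "ok"
```

Exponential backoff keeps retries cheap for brief glitches while avoiding hammering the service during longer outages.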

Streaming Responses

For real-time output, use streaming:

Python
for chunk in client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Write a haiku about cloud computing.",
    stream=True
):
    print(chunk.text, end="", flush=True)
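
When streaming, you often want the complete text afterwards as well as live output. A minimal sketch of that pattern, using a stand-in generator (`fake_stream` is hypothetical; substitute the `client.inference(..., stream=True)` call and read `chunk.text`):

```python
def fake_stream():
    # Hypothetical stand-in for client.inference(..., stream=True),
    # which yields chunks of generated text.
    for piece in ["Cloud drifts ", "over silicon ", "peaks"]:
        yield piece

chunks = []
for text in fake_stream():
    print(text, end="", flush=True)  # live output
    chunks.append(text)              # keep for later
print()

full_text = "".join(chunks)
```

This way you get real-time display without losing the assembled response for logging or post-processing.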

Next Steps

💬 Need Help?

Join our Discord community or email support@metalcloud.space.