Documentation

Everything you need to deploy AI workloads on Apple Silicon

🚀

Quickstart

Get up and running in under 5 minutes. Install the SDK, authenticate, and run your first job.

Get started →
🐍

Python SDK

Complete reference for the MetalCloud Python SDK. Job submission, streaming, and more.

View reference →
🔌

REST API

Direct API access for any language. OpenAPI spec, authentication, and endpoints.

Explore API →
🤖

Model Library

Pre-configured models ready to deploy. Llama, Mistral, DeepSeek, and more.

Browse models →
๐ŸŽ

MLX Guide

Using Apple's MLX framework on MetalCloud: optimizations, examples, and best practices.

Learn MLX →
💳

Billing & Usage

Understanding your bill, usage tracking, and cost optimization strategies.

View billing →

Quick Example

Run inference in just a few lines of code

Python
import metalcloud

# Initialize client
client = metalcloud.Client()

# Run inference on Llama 70B
response = client.inference(
    model="meta-llama/Llama-3.3-70B",
    prompt="Explain unified memory in Apple Silicon",
    max_tokens=500
)

print(response.text)