Maple Proxy Documentation

Introducing Maple Proxy: OpenAI-Compatible API Access to Private LLMs
Want to use Maple's secure LLMs with your existing OpenAI code? Today we're excited to announce Maple Proxy, a lightweight proxy server that brings OpenAI-compatible API access to Maple's end-to-end encrypted LLM service. With Maple Proxy, developers can integrate private, secure AI completions into their applications using any OpenAI client library, no code changes required.
Why a Proxy?
Maple runs all LLM inference inside Trusted Execution Environments (TEEs), providing hardware-level security and privacy for your AI workloads. This means your prompts and responses are encrypted end-to-end and never accessible to anyone, not even Maple.
However, this security comes with complexity. Every request requires:
- TEE attestation verification to ensure you're talking to genuine secure hardware
- End-to-end encryption negotiation
- Secure key exchange protocols
Maple Proxy handles all this for you. It acts as a bridge between your standard OpenAI-compatible client and Maple's secure infrastructure, managing the attestation handshake and encryption so you don't have to.
Available Models & Pricing
Maple Proxy provides access to a growing selection of state-of-the-art models. For detailed model capabilities and example prompts, see our comprehensive model guide.
Model | Description | Price per Million Tokens
---|---|---
llama-3.3-70b | Therapy notes, daily tasks, general reasoning | $4 input / $4 output |
gpt-oss-120b | ChatGPT creativity & structured data | $4 input / $4 output |
deepseek-r1-0528 | Research, advanced math, coding | $4 input / $4 output |
mistral-small-3-1-24b | Conversations, visual insights | $4 input / $4 output |
qwen2-5-72b | Multilingual tasks, coding | $4 input / $4 output |
qwen3-coder-480b | Specialized coding assistant | $4 input / $4 output |
leon-se/gemma-3-27b-it-fp8-dynamic | Blazing-fast image analysis | $10 input / $10 output |
All pricing is pay-as-you-go. Purchase API credits in increments starting at $10.
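To make the math concrete, here's a quick Python sketch of how per-request cost works out at these rates (an illustration of the arithmetic only, not an official billing calculator):

# Rough per-request cost at the published rates; not an official billing calculator.
PRICE_PER_MILLION = {
    "llama-3.3-70b": (4.00, 4.00),  # (input, output) USD per million tokens
    "leon-se/gemma-3-27b-it-fp8-dynamic": (10.00, 10.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_rate, out_rate = PRICE_PER_MILLION[model]
    return input_tokens / 1_000_000 * in_rate + output_tokens / 1_000_000 * out_rate

# A 2,000-token prompt with a 500-token reply on llama-3.3-70b costs $0.01
print(f"${estimate_cost('llama-3.3-70b', 2_000, 500):.4f}")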
Two Ways to Get Started
Option 1: Desktop App with Built-in Proxy (Easiest for Local Development)
The Maple desktop app includes an integrated proxy server that automatically handles all configuration:
- Download the Maple app from trymaple.ai/downloads
- Sign up and upgrade to Pro, Team, or Max plan (starting at $20/month)
- Navigate to API Management in the app settings
- Purchase API credits (minimum $10)
- Open the Local Proxy tab and click "Start Proxy"

The desktop app will:
- Automatically create and manage API keys
- Start the proxy on localhost:8080 (configurable)
- Handle all TEE attestation and encryption
- Show real-time proxy status and logs
Once running, simply point any OpenAI client to http://localhost:8080/v1:
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="auto-configured-by-desktop-app"
)

response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello, secure world!"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:  # the final chunk may carry no content
        print(chunk.choices[0].delta.content, end="")
Option 2: Standalone Docker Deployment (Best for Production)
For production deployments or CI/CD pipelines, use the standalone Maple Proxy Docker image:
- Pull the Docker image:
docker pull ghcr.io/opensecretcloud/maple-proxy:latest
- Create an API key in your Maple account at trymaple.ai
- Run the proxy:
docker run -p 8080:8080 \
  -e MAPLE_BACKEND_URL=https://enclave.trymaple.ai \
  ghcr.io/opensecretcloud/maple-proxy:latest
- Use with your API key:
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'your-maple-api-key'
});

const response = await client.chat.completions.create({
  model: 'gpt-oss-120b',
  messages: [{ role: 'user', content: 'Explain TEEs in simple terms' }],
  stream: true
});

for await (const chunk of response) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Docker Compose for Production
For production deployments with proper configuration management:
version: '3.8'
services:
  maple-proxy:
    image: ghcr.io/opensecretcloud/maple-proxy:latest
    container_name: maple-proxy
    ports:
      - "8080:8080"
    environment:
      - MAPLE_BACKEND_URL=https://enclave.trymaple.ai
      - MAPLE_ENABLE_CORS=true
      - RUST_LOG=info
    # Note: In production, clients should provide their own API keys
    # Do NOT set MAPLE_API_KEY here for multi-user deployments
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 3s
      retries: 3
API Endpoints
Maple Proxy implements the core OpenAI API endpoints:
List Available Models
curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer YOUR_MAPLE_API_KEY"
Create Chat Completion
curl -N http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer YOUR_MAPLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a haiku about privacy"}
    ],
    "stream": true
  }'
Note: Maple currently supports streaming responses only. All completions will be streamed back to your client.
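If your code expects one complete string rather than a stream, you can collect the chunks yourself. A minimal Python sketch (the complete() helper here is ours, not part of the proxy):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="your-maple-api-key")

def complete(model: str, messages: list) -> str:
    # Accumulate the streamed deltas into a single response string
    stream = client.chat.completions.create(model=model, messages=messages, stream=True)
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk may carry no content
            parts.append(delta)
    return "".join(parts)

print(complete("llama-3.3-70b", [{"role": "user", "content": "Write a haiku about privacy"}]))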
Client Library Examples
Python
from openai import OpenAI
# For desktop app with auto-configured proxy
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="any-string"  # Desktop app handles auth
)

# For standalone proxy
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-maple-api-key"
)

# Streaming is required for all requests
response = client.chat.completions.create(
    model="qwen3-coder-480b",
    messages=[{"role": "user", "content": "Write a Python function to sort a list"}],
    temperature=0.7,
    max_tokens=500,
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Node.js/TypeScript
import OpenAI from 'openai';
const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: process.env.MAPLE_API_KEY
});

// Streaming example with TypeScript
async function streamCompletion() {
  const stream = await client.chat.completions.create({
    model: 'deepseek-r1-0528',
    messages: [{ role: 'user', content: 'Solve this step by step: 25 * 4 + 10' }],
    stream: true,
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
}
Any OpenAI-Compatible Tool
Since Maple Proxy implements the OpenAI API specification, it works with any tool that supports custom OpenAI endpoints:
- LangChain: Set openai_api_base="http://localhost:8080/v1" (see the Python sketch after this list)
- LlamaIndex: Configure api_base="http://localhost:8080/v1"
- Amp: Add a custom OpenAI provider with the Maple Proxy URL
- Open Interpreter: Configure with --api_base http://localhost:8080/v1
- Goose: Block's AI developer agent; configure it with the Maple Proxy endpoint
- And many more...

Production Best Practices
For Local Development
- Use the Maple desktop app for automatic proxy management
- The desktop app handles API key management and configuration
- Enable auto-start in the desktop app for convenience
For Production Deployments
- Use the standalone Docker image for better resource isolation
- Never hardcode API keys in your deployment configuration
- Each client/user should provide their own Maple API key
- Monitor the /health endpoint for service availability (a minimal check is sketched below)
- Use environment variables or secrets management for API keys
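A minimal availability check against the /health endpoint might look like this (a standard-library Python sketch; wire it into whatever monitoring you already run):

import urllib.request

def proxy_healthy(url="http://localhost:8080/health", timeout=3.0):
    # True if the proxy's health endpoint answers with HTTP 200
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

if not proxy_healthy():
    print("maple-proxy is not responding; check the container logs")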
Security Considerations
- API keys are tied to your Maple user account and usage is billed accordingly
- Treat API keys like passwords: never commit them to version control
- Rotate keys regularly using the Maple dashboard
- For multi-tenant applications, require each tenant to provide their own key, as in the sketch below
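For instance, a multi-tenant service might build a client per request from the key the tenant supplies, never from a shared server-side key. A hypothetical sketch:

from openai import OpenAI

def client_for_tenant(tenant_api_key: str) -> OpenAI:
    # Usage is billed to the tenant's own Maple account, not yours
    return OpenAI(base_url="http://localhost:8080/v1", api_key=tenant_api_key)

client = client_for_tenant("key-provided-by-the-tenant")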
How It Works Under the Hood
When you make a request through Maple Proxy:
- Client Request: Your OpenAI client sends a standard API request to the proxy
- TEE Handshake: Proxy establishes a secure connection with Maple's TEE infrastructure
- Attestation: Proxy verifies the TEE's attestation to ensure genuine secure hardware
- Encryption: Your request is encrypted end-to-end before transmission
- Authentication: Proxy validates your Maple API key
- Processing: The LLM processes your request inside the secure enclave
- Response: Encrypted response is decrypted by the proxy and returned in OpenAI format
All of this happens transparently; your application just sees a standard OpenAI API response.
Getting Help
- GitHub: github.com/opensecretcloud/maple-proxy - Report issues and contribute
- Discord: Join our community for support and discussions
- Documentation: Full API reference and examples available in the repository
Start Building with Private AI Today
Maple Proxy makes it simple to add private, secure AI capabilities to your applications without sacrificing developer experience. Whether you're building a chatbot, code assistant, or AI-powered analytics tool, you can now do it with the confidence that your data remains completely private.
Get started in minutes:
- Download Maple or pull the Docker image
- Sign up for a Pro account ($20/month)
- Purchase API credits ($4 per million tokens for most models)
- Start building with any OpenAI-compatible client
Your code stays the same. Your data stays private.
Welcome to the future of secure AI development.