Llama Models

Explore our collection of Llama models for various use cases and deployment options.

Browser Models

Run models directly in your browser with WebGPU

Experience the power of Llama models running directly in your browser. No server is required, so your data stays private and secure.

Llama 3 8B Instruct
Llama 2 7B Chat
Mistral 7B Instruct

CosmWasm Models

Deploy models to CosmWasm blockchains

Deploy Llama models as smart contracts on CosmWasm-enabled blockchains for decentralized AI inference with transparent execution.

Secure & Transparent
Decentralized Inference
Token-Gated Access

API Models

Access models through our API

Integrate Llama models into your applications with our simple REST API, which offers high-performance inference and flexible deployment options.

REST API
Webhook Integration
Rate Limiting & Caching

Fine-tuned Models

Custom-trained models for specific domains

Llama models fine-tuned for specific domains and use cases, offering enhanced performance for specialized tasks.

Legal Assistant
Medical Consultant
Financial Analyst

Quantized Models

Optimized models for efficient deployment

Reduced-precision models that preserve most of the original quality while substantially reducing size and increasing inference speed.

4-bit Quantization
8-bit Quantization
GGUF Format
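
To make the size savings concrete, here is a minimal Python sketch of what reducing precision means. It is an illustration under stated assumptions, not the method used to produce the models listed here: the function names are hypothetical, the size estimate counts weights only (no file metadata or scale factors), and the quantizer uses a single symmetric scale for the whole tensor, whereas formats such as GGUF typically store block-wise scales.

```python
def estimate_weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring metadata and scales."""
    return num_params * bits_per_weight / 8 / 1e9


def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the stored integers and scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    # Weight-only size of an 8-billion-parameter model at different precisions.
    for bits in (16, 8, 4):
        size = estimate_weight_size_gb(8e9, bits)
        print(f"8B parameters at {bits}-bit: ~{size:.0f} GB")

    # Round-trip a few example weights to see the rounding error introduced.
    weights = [0.42, -1.0, 0.07, 0.0]
    quantized, scale = quantize_int8(weights)
    restored = dequantize_int8(quantized, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts weight storage to a quarter, and each restored weight differs from the original by at most half of the scale step.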

Coming Soon

New models in development

Stay tuned for upcoming model releases and features. We're constantly expanding our offerings.

Llama 3 70B
Multimodal Models
Custom Training Pipeline