Open Source Models

Explore and Deploy Open Source AI Models

Llama 3 70B

Meta AI

52,000

Meta's latest large language model, with significantly improved performance and a longer context window than Llama 2

Parameters: 70B

Inference Requirements

GPU: NVIDIA A100

VRAM: 80GB

Disk: 140GB

Throughput: ~40 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB x8

VRAM: 600GB

Disk: 300GB

Batch Size: 16

Optimization Techniques

LoRA, QLoRA, Flash Attention 2, DeepSpeed

Quantization Options

4-bit, 8-bit, GPTQ, AWQ, GGUF

License

Meta Llama 3 Community License
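
As a rough illustration of the 4-bit option above: NF4 quantization via bitsandbytes brings the ~140GB of fp16 weights down to roughly 35-40GB, which is how the model fits on a single 80GB A100. A minimal sketch, assuming the Hugging Face repo id and prompt below (they are not part of this listing):

```python
# Minimal sketch: 4-bit (NF4) inference with transformers + bitsandbytes.
# Assumes access to the gated meta-llama repo and a single 80GB GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumed repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spreads layers across available GPUs if one is not enough
)

inputs = tokenizer("Explain mixture-of-experts in one paragraph.", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```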

Mixtral-8x7B

Mistral AI

12,500

A sparse mixture-of-experts model delivering state-of-the-art performance for its active parameter count

Parameters: 47B (~13B active per token)

Inference Requirements

GPU: NVIDIA A100/H100

VRAM: 48GB

Disk: 100GB

Throughput: ~30 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB x4

VRAM: 240GB

Disk: 200GB

Batch Size: 32

Optimization Techniques

LoRA, QLoRA, Flash Attention 2

Quantization Options

4-bit, 8-bit, GPTQ, AWQ

License

Apache 2.0
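
Since QLoRA is listed above, here is a minimal sketch of attaching LoRA adapters to a 4-bit-loaded base model with peft; the repo id, rank, and choice of attention-only target modules are illustrative assumptions, not tuned recommendations.

```python
# Minimal QLoRA sketch: 4-bit base model + LoRA adapters via peft.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "mistralai/Mixtral-8x7B-v0.1"  # assumed repo id

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # enables gradient checkpointing, casts norms

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections only (assumption)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the 47B total
```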

Gemma 7B

Google

9,800

Google's efficient and powerful open foundation model

Parameters: 7B

Inference Requirements

GPU: NVIDIA RTX 3090

VRAM: 12GB

Disk: 15GB

Throughput: ~45 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100

VRAM: 40GB

Disk: 50GB

Batch Size: 64

Optimization Techniques

LoRA, QLoRA, Flash Attention 2

Quantization Options

4-bit, 8-bit, GPTQ

License

Gemma License
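
The 12GB VRAM figure is consistent with 8-bit weights (roughly 7GB for a 7B model plus activation overhead). A minimal sketch using the transformers pipeline, with the instruction-tuned repo id assumed:

```python
# Minimal sketch: 8-bit text generation on a single ~12GB consumer GPU.
from transformers import pipeline, BitsAndBytesConfig

generator = pipeline(
    "text-generation",
    model="google/gemma-7b-it",   # assumed instruction-tuned repo id
    model_kwargs={"quantization_config": BitsAndBytesConfig(load_in_8bit=True)},
    device_map="auto",
)

print(generator("Write a haiku about GPUs.", max_new_tokens=60)[0]["generated_text"])
```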

Phi-3

Microsoft

9,500

Latest version of Microsoft's compact yet powerful language model with enhanced reasoning and coding capabilities

Parameters: 3.8B

Inference Requirements

GPU: NVIDIA RTX 3060

VRAM: 8GB

Disk: 8GB

Throughput: ~55 tokens/s

Fine-tuning Requirements

GPU: NVIDIA RTX 4090

VRAM: 24GB

Disk: 25GB

Batch Size: 64

Optimization Techniques

LoRA, QLoRA, 8-bit Training

Quantization Options

4-bit, 8-bit, GGUF

License

MIT
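
Because GGUF is listed among the quantization options, a model this small is often served through llama.cpp; a minimal sketch with llama-cpp-python, where the .gguf file name and quantization level are placeholders:

```python
# Minimal sketch: running a quantized GGUF build with llama-cpp-python.
# The .gguf path is a placeholder; a 4-bit quantization (e.g. Q4_K_M) fits comfortably in ~8GB.
from llama_cpp import Llama

llm = Llama(
    model_path="./phi-3-mini-4k-instruct-q4.gguf",  # placeholder file name
    n_ctx=4096,
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm(
    "Write a Python function that checks whether a string is a palindrome.",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```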

Yi-34B

01.AI

5,200

A powerful multilingual model with strong performance across various tasks

Parameters: 34B

Inference Requirements

GPU: NVIDIA A40/A6000

VRAM: 40GB

Disk: 70GB

Throughput: ~40 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB x2

VRAM: 160GB

Disk: 150GB

Batch Size: 48

Optimization Techniques

LoRA, QLoRA, Flash Attention 2

Quantization Options

4-bit, 8-bit, GPTQ, AWQ

License

Apache 2.0
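
For the AWQ option, transformers can load a pre-quantized AWQ checkpoint directly once the autoawq package is installed; the community repo id below is an assumption, not part of this listing.

```python
# Minimal sketch: loading a pre-quantized AWQ checkpoint with transformers.
# Requires the autoawq package; the repo id is an assumed community export.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Yi-34B-AWQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Summarize the key ideas behind activation-aware weight quantization.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=120)[0], skip_special_tokens=True))
```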

Qwen-72B

Alibaba Cloud

7,800

A large language model trained on a massive multilingual corpus with strong performance in both Chinese and English

Parameters: 72B

Inference Requirements

GPU: NVIDIA A100

VRAM: 80GB

Disk: 150GB

Throughput: ~25 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB x8

VRAM: 600GB

Disk: 300GB

Batch Size: 16

Optimization Techniques

LoRA, QLoRA, Flash Attention 2, DeepSpeed

Quantization Options

4-bit, 8-bit, GPTQ, AWQ

License

Qwen License
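
DeepSpeed appears in the list above because full-parameter fine-tuning of a 72B model only fits when weights, gradients, and optimizer states are sharded across the 8 GPUs (ZeRO-3). A minimal sketch of a ZeRO-3 config wired into the Hugging Face Trainer; all values are illustrative assumptions, and a per-device batch size of 2 across 8 GPUs matches the global batch size of 16 listed above.

```python
# Minimal sketch: a ZeRO-3 DeepSpeed config passed to the Hugging Face Trainer.
# "auto" defers those settings to TrainingArguments; values are illustrative.
from transformers import TrainingArguments

ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                                        # shard params, gradients, optimizer states
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

args = TrainingArguments(
    output_dir="qwen72b-sft",        # placeholder
    per_device_train_batch_size=2,   # 2 x 8 GPUs = global batch size 16
    gradient_accumulation_steps=1,
    bf16=True,
    deepspeed=ds_config,             # Trainer handles deepspeed initialization internally
)
# Launch with: torchrun --nproc_per_node=8 train.py   (train.py is a placeholder script)
```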

Baichuan2-13B

Baichuan Inc

6,900

An advanced bilingual language model optimized for Chinese and English tasks

Parameters: 13B

Inference Requirements

GPU: NVIDIA A40

VRAM: 24GB

Disk: 30GB

Throughput: ~40 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB

VRAM: 80GB

Disk: 100GB

Batch Size: 32

Optimization Techniques

LoRA, QLoRA, Flash Attention 2

Quantization Options

4-bit, 8-bit, GPTQ, AWQ

License

Apache 2.0
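
Baichuan2 checkpoints ship their own modeling code, so loading them with transformers typically requires trust_remote_code. The sketch below loads the chat variant in 8-bit and checks peak VRAM against the 24GB figure above; the repo id and prompt are assumptions.

```python
# Minimal sketch: 8-bit load plus a quick check of peak VRAM against the 24GB figure above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "baichuan-inc/Baichuan2-13B-Chat"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # ~13GB of weights
    device_map="auto",
    trust_remote_code=True,  # the repo provides its own modeling/tokenizer code
)

prompt = "用两句话介绍一下你自己。"  # "Introduce yourself in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(**inputs, max_new_tokens=64)
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")  # should sit well under 24 GB in 8-bit
```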

DeepSeek Coder

DeepSeek

4,200

A code-specialized language model trained on a large, high-quality code corpus, with strong performance in code generation and understanding

Parameters: 33B

Inference Requirements

GPU: NVIDIA A40

VRAM: 40GB

Disk: 80GB

Throughput: ~35 tokens/s

Fine-tuning Requirements

GPU: NVIDIA A100 80GB x2

VRAM: 160GB

Disk: 200GB

Batch Size: 32

Optimization Techniques

LoRA, QLoRA, Flash Attention 2

Quantization Options

4-bit, 8-bit, GPTQ

License

Apache 2.0
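
GPTQ is listed above, and the most common way to use it is to load a pre-quantized community export rather than quantizing locally. A minimal sketch; the repo id is an assumption, and loading GPTQ weights through transformers additionally requires optimum plus a GPTQ kernel package.

```python
# Minimal sketch: loading an assumed community GPTQ export of the 33B code model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/deepseek-coder-33B-instruct-GPTQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "# Python: merge two sorted lists into one sorted list\ndef merge(a, b):\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=150)[0], skip_special_tokens=True))
```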

Whisper Large V3

OpenAI

45,800

Latest version of OpenAI's multilingual speech recognition model with improved accuracy and robustness

Parameters: 1.5B

Inference Requirements

GPU: Any CUDA-capable NVIDIA GPU

VRAM: 4GB

Disk: 10GB

Throughput: ~1x realtime

Fine-tuning Requirements

GPU: Any CUDA-capable NVIDIA GPU

VRAM: 8GB

Disk: 20GB

Batch Size: 16

Optimization Techniques

LoRA, Gradient Checkpointing

License

MIT
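
A minimal transcription sketch with the transformers ASR pipeline; the audio file name is a placeholder. Chunked decoding keeps memory close to the 4GB figure above and handles recordings longer than 30 seconds.

```python
# Minimal sketch: long-form transcription with the transformers ASR pipeline.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",
    torch_dtype=torch.float16,
    device="cuda:0",
    chunk_length_s=30,          # process long audio in 30-second windows
)

result = asr("meeting_recording.mp3", return_timestamps=True)  # placeholder file name
print(result["text"])
```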

Stable Diffusion XL Turbo

Stability AI

62,500

The latest SDXL variant, distilled for real-time image generation with single-step inference

Parameters: 2.7B

Inference Requirements

GPU: NVIDIA RTX 3090

VRAM: 12GB

Disk: 25GB

Throughput: ~1s per image

Fine-tuning Requirements

GPU: NVIDIA A100

VRAM: 40GB

Disk: 100GB

Batch Size: 16

Optimization Techniques

DreamBooth, Textual Inversion, LoRA

License

OpenRAIL
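
The ~1s-per-image figure comes from running the distilled model for a single denoising step with classifier-free guidance disabled, which is how the Turbo variant is intended to be used. A minimal diffusers sketch; the prompt and output file name are placeholders.

```python
# Minimal sketch: single-step text-to-image with diffusers.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = pipe(
    prompt="a photograph of a red fox in fresh snow, golden hour",  # placeholder prompt
    num_inference_steps=1,   # single denoising step
    guidance_scale=0.0,      # turbo models run without classifier-free guidance
).images[0]
image.save("fox.png")
```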