Optimized for NVIDIA RTX GPUs, the new models are available in FP8 quantizations that reduce VRAM usage and increase performance by 40%.