Nemotron 3 Ultra: NVIDIA’s Open Reasoning Model
Learn what Nemotron 3 Ultra is, what it can do, what hardware it needs, how to access it, and when to use it for agents, coding, and RAG.
Learn how to use ESMFold2 online through Biohub, Tamarind Bio, APIs, and local developer options for protein structure prediction.
State-of-the-art image generation, in your browser. Bonsai Image 4B is a compressed text-to-image model from PrismML, built for local generation on iPhone, Mac, and GPUs.
Detect and label objects in images and videos. LocateAnything is an NVIDIA vision-language model that finds objects, text, GUI elements, and points in images with natural language prompts.
Whisper AI is OpenAI’s speech recognition model for transcribing, translating, and understanding spoken audio.
DeepSeek OCR 2 is an open-source OCR and document understanding model built for complex layouts, Markdown output, and human-like reading order.
DeepSeek OCR is an open-source vision-language OCR model that converts document images into structured text and Markdown with efficient visual token compression.
LTX-2 is an open-source AI video model that generates synchronized video and audio for creative, research, and production workflows.
VoxCPM is an open-source TTS model family for multilingual speech generation, voice design, and realistic voice cloning.
Breakthrough in Emotionally Expressive and Duration-Controlled Auto-Regressive Zero-Shot Text-to-Speech