MODEL WEIGHTS
197 listings · open vs closed weights · readme & download links
Miso 1
open · Misa Labs
Miso 1 is billed as the most emotive voice model in the world, capable of human-like emotional responses and fast reaction times. It is open source and comes with a full API for developers to build upon.
Gemma 4
open · Google DeepMind
Gemma 4 is an advanced model built from Gemini 3 research, designed to maximize intelligence-per-parameter. It supports multimodal reasoning and is optimized for various applications, including mobile and IoT devices.
LFM2.5-8B-A1B
open · Liquid AI
LFM2.5-8B-A1B is a device-optimized model designed for real-life applications on various devices including phones, laptops, and robots. It features an expanded context length and a hybrid MoE architecture, making it fast and reliable for diverse use cases.
ESM Cambrian
open · Biohub
ESM Cambrian (ESMC) is a next-generation evolutionary scale model that predicts protein structure and facilitates the design of new proteins. It leverages billions of protein sequences to internalize the fundamental properties of protein biology, enabling high-accuracy predictions and innovative protein designs.
LocateAnything
open · The Hong Kong Polytechnic University, Princeton University, Nanjing University, University of Illinois Urbana-Champaign
Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding. LocateAnything performs diverse localization tasks under a unified vision-language model, including document understanding, GUI grounding, dense object detection, and OCR localization.
LongCat Video Avatar 1.5
open · victor
LongCat Video Avatar 1.5 is a model designed for creating animated video avatars. It leverages advanced techniques to generate lifelike representations in video format.
Kronos
open · shiyu-coder
Kronos is the first open-source foundation model for financial candlesticks, trained on data from over 45 global exchanges. It is designed to handle the unique characteristics of financial data, providing a specialized solution for forecasting in financial markets.
Cohere Command A+
open · Cohere Labs
Cohere Command A+ is an open-source LLM optimized for agentic, multilingual, and reasoning-heavy tasks. It supports vision inputs and is designed for efficient deployment on minimal hardware.
Marlin 2B
open · NemoStation
Marlin 2B is a video VLM designed to extract structured information from videos, providing precise scene and event captions with timestamps. It excels in dense captioning and temporal grounding tasks.
Lance
open · ByteDance
Lance is a 3B native unified multimodal model that supports image and video understanding, generation, and editing within a single framework. It is efficient at 3B scale, delivering strong performance across various benchmarks.
Dramabox
open · Resemble AI
Dramabox is an expressive text-to-speech model with voice cloning capabilities. It allows users to control speaker identity, emotion, and delivery through prompts, making it ideal for creating dynamic audio content.
HY-World 2.0
open · Tencent-Hunyuan
HY-World 2.0 is a multi-modal world model framework for generating and reconstructing 3D worlds from various input modalities. It produces editable 3D assets that can be imported into game engines, offering capabilities for both world generation and reconstruction.
Mistral Medium 3.5
open · Mistral AI
Mistral Medium 3.5 is a flagship model designed for instruction-following, reasoning, and coding tasks. It operates as a dense 128B model with a 256k context window, enabling efficient performance in real-world applications.
VibeVoice
open · Microsoft
VibeVoice is a family of open-source frontier voice AI models that includes both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) models. It supports long-form audio processing and multilingual capabilities.
ACE-Step 1.5
open · ACE Music
ACE-Step 1.5 is a highly efficient open-source music foundation model that delivers commercial-grade music generation on consumer hardware. It supports lightweight personalization and runs locally with less than 4GB of VRAM.
DeepSeek V4
open · DeepSeek, Inc.
DeepSeek V4 is an open-source model offering cost-effective 1M context length with enhanced agentic capabilities and world-class reasoning. It includes two variants: V4-Pro and V4-Flash, catering to different performance needs.
Wan2.1
open · Wan-Video
Wan2.1 is an open suite of video foundation models that excels in video generation tasks including Text-to-Video, Image-to-Video, and Video Editing. It is designed to perform efficiently on consumer-grade GPUs while delivering state-of-the-art performance.
Wan 2.7
open · Alibaba Cloud
Wan 2.7 is an advanced AI model for video editing and image generation, allowing users to create and customize visuals with text prompts and multi-image guidance. It supports long-form text generation in multiple languages and offers precise control over color and image editing.
VOID: Video Object and Interaction Deletion
open · Netflix
VOID removes objects from videos along with all interactions they induce on the scene. It handles not just secondary effects like shadows and reflections, but also physical interactions like objects falling when a person is removed.
GLM-5.1
open · Z.ai
GLM-5.1 is a next-generation flagship model for agentic engineering, offering significantly stronger coding capabilities than its predecessor. It excels in handling ambiguous problems and sustains optimization over extended sessions.
wizardlm-13b
open · Microsoft
Microsoft · 1 Arena leaderboard
zephyr-7b-alpha
open · HuggingFace
HuggingFace · 1 Arena leaderboard
zephyr-7b-beta
open · HuggingFace
HuggingFace · 1 Arena leaderboard