llms / directory

MODEL WEIGHTS

197 listings · open vs closed weights · readme & download links

Miso 1

open

Misa Labs

Miso 1 is the most emotive voice model in the world, capable of human-like emotional responses and fast reaction times. It is open source and comes with a full API for developers to build upon.

voice
0 · 0 comments

Gemma 4

open

Google DeepMind

Gemma 4 is an advanced model built from Gemini 3 research, designed to maximize intelligence-per-parameter. It supports multimodal reasoning and is optimized for various applications, including mobile and IoT devices.

language · 12B, 26B, 31B
0 · 0 comments · weights link →

LFM2.5-8B-A1B

open

Liquid AI

LFM2.5-8B-A1B is an on-device model designed for real-world use on phones, laptops, and robots. It features an expanded context length and a hybrid MoE architecture, making it fast and reliable for diverse use cases.

8B · 128,000 ctx
0 · 0 comments

ESM Cambrian

open

Biohub

ESM Cambrian (ESMC) is a next-generation evolutionary scale model that predicts protein structure and facilitates the design of new proteins. It leverages billions of protein sequences to internalize the fundamental properties of protein biology, enabling high-accuracy predictions and innovative protein designs.

protein-language
1 · 0 comments

LocateAnything

open

The Hong Kong Polytechnic University, Princeton University, Nanjing University, University of Illinois Urbana-Champaign

Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding. LocateAnything performs diverse localization tasks under a unified vision-language model, including document understanding, GUI grounding, dense object detection, and OCR localization.

vision-language
0 · 0 comments

LongCat Video Avatar 1.5

open

victor

LongCat Video Avatar 1.5 is a model designed for creating animated video avatars. It leverages advanced techniques to generate lifelike representations in video format.

generative-media
0 · 0 comments

Kronos

open

shiyu-coder

Kronos is the first open-source foundation model for financial candlesticks, trained on data from over 45 global exchanges. It is designed to handle the unique characteristics of financial data, providing a specialized solution for forecasting in financial markets.

language · 499.2M · 512 ctx
0 · 0 comments · weights link →

Cohere Command A+

open

Cohere Labs

Cohere Command A+ is an open-source LLM optimized for agentic, multilingual, and reasoning-heavy tasks. It supports vision inputs and is designed for efficient deployment on minimal hardware.

language · 25B · 128,000 ctx
0 · 0 comments · weights link →

Marlin 2B

open

NemoStation

Marlin 2B is a video VLM designed to extract structured information from videos, providing precise scene and event captions with timestamps. It excels in dense captioning and temporal grounding tasks.

video-language · 2B
0 · 0 comments

Lance

open

ByteDance

Lance is a natively unified 3B multimodal model that supports image and video understanding, generation, and editing within a single framework. Despite its small size, it delivers strong performance across various benchmarks.

multimodal · 3B
0 · 0 comments · weights link →

Dramabox

open

Resemble AI

Dramabox is an expressive text-to-speech model with voice cloning capabilities. It allows users to control speaker identity, emotion, and delivery through prompts, making it ideal for creating dynamic audio content.

text-to-speech · 3.3B
0 · 0 comments · weights link →

Kronos

open

shiyu-coder

Kronos is the first open-source foundation model for financial candlesticks (K-lines), trained on data from over 45 global exchanges. It is designed to handle the unique, high-noise characteristics of financial data.

language · 499.2M · 512 ctx
0 · 0 comments · weights link →

HY-World 2.0

open

Tencent-Hunyuan

HY-World 2.0 is a multi-modal world model framework for generating and reconstructing 3D worlds from various input modalities. It produces editable 3D assets that can be imported into game engines, offering capabilities for both world generation and reconstruction.

generative-media · ~1.2B
0 · 0 comments · weights link →

Mistral Medium 3.5

open

Mistral AI

Mistral Medium 3.5 is a flagship model designed for instruction-following, reasoning, and coding tasks. It operates as a dense 128B model with a 256k context window, enabling efficient performance in real-world applications.

language · 128B · 256,000 ctx
1 · 0 comments · weights link →

VibeVoice

open

Microsoft

VibeVoice is a family of open-source frontier voice AI models that includes both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) models. It supports long-form audio processing and multilingual capabilities.

speech · 64,000 ctx
0 · 0 comments

ACE-Step 1.5

open

ACE Music

ACE-Step 1.5 is a highly efficient open-source music foundation model that delivers commercial-grade music generation on consumer hardware. It supports lightweight personalization and runs locally with less than 4GB of VRAM.

generative-media · 4B
0 · 0 comments · weights link →

DeepSeek V4

open

DeepSeek, Inc.

DeepSeek V4 is an open-source model offering cost-effective 1M context length with enhanced agentic capabilities and world-class reasoning. It includes two variants: V4-Pro and V4-Flash, catering to different performance needs.

language · 1.6T / 284B · 1,000,000 ctx
0 · 0 comments · weights link →

Wan2.1

open

Wan-Video

Wan2.1 is an open suite of video foundation models that excels in video generation tasks including Text-to-Video, Image-to-Video, and Video Editing. It is designed to perform efficiently on consumer-grade GPUs while delivering state-of-the-art performance.

generative-media · 14B
0 · 0 comments · weights link →

Wan 2.7

open

Alibaba Cloud

Wan 2.7 is an advanced AI model for video editing and image generation, allowing users to create and customize visuals with text prompts and multi-image guidance. It supports long-form text generation in multiple languages and offers precise control over color and image editing.

generative-media
0 · 0 comments

VOID: Video Object and Interaction Deletion

open

Netflix

VOID removes objects from videos along with all interactions they induce on the scene. It handles not just secondary effects like shadows and reflections, but also physical interactions like objects falling when a person is removed.

video-to-video · 5B
0 · 0 comments · weights link →

GLM-5.1

open

Z.ai

GLM-5.1 is a next-generation flagship model for agentic engineering, offering significantly stronger coding capabilities than its predecessor. It excels in handling ambiguous problems and sustains optimization over extended sessions.

code · 754B
0 · 0 comments · weights link →

wizardlm-13b

open

Microsoft

Microsoft · 1 Arena leaderboard

language · 13B
0 · 0 comments · weights link →

zephyr-7b-alpha

open

HuggingFace

HuggingFace · 1 Arena leaderboard

language · 7B
0 · 0 comments · weights link →

zephyr-7b-beta

open

HuggingFace

HuggingFace · 1 Arena leaderboard

language · 7B · 16,384 ctx
0 · 0 comments · weights link →