Stability AI is excited to introduce SDXL 0.9, the most advanced development in its Stable Diffusion text-to-image suite of models. Building on the success of the Stable Diffusion XL beta released in April, SDXL 0.9 produces markedly improved image and composition detail over its predecessor.
Although it is designed to run on a modern consumer GPU, SDXL 0.9 opens up new creative use cases for generative AI imagery. It can create hyper-realistic visuals for film, television, music, and instructional videos, as well as for design and industrial use, which puts SDXL at the forefront of real-world applications of AI imagery.
Unmatched Quality and Performance
SDXL 0.9 offers more than enhanced image quality. It also supports image-to-image prompting, inpainting (reconstructing missing parts of an image), and outpainting (creating a seamless extension of an existing image). Together, these capabilities extend what the SDXL series can do well beyond basic text prompting.
The most important change in SDXL 0.9 is a significant increase in parameter count. SDXL 0.9 has one of the largest parameter counts of any open-source image model, with a 3.5B parameter base model and a 6.6B parameter model ensemble pipeline. The final output comes from running two models and aggregating the results: the second-stage model is dedicated to refining the details of the output generated by the first stage.
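The two-stage flow described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not Stability AI's implementation: the announcement does not say how the results are aggregated, so the sketch assumes a simple hand-off from the base stage to the refiner stage. The names `run_two_stage`, `toy_base`, and `toy_refiner` are hypothetical placeholders, and a list of floats stands in for an image.

```python
from typing import Callable, List

# Stand-in for an image representation; real models work on tensors.
Latent = List[float]


def run_two_stage(
    prompt: str,
    base: Callable[[str], Latent],
    refiner: Callable[[str, Latent], Latent],
) -> Latent:
    """Run the base stage, then pass its output to the refiner stage."""
    draft = base(prompt)           # stage 1: produce the initial output
    return refiner(prompt, draft)  # stage 2: refine the details of that output


# Hypothetical placeholder stages so the sketch runs without model weights.
def toy_base(prompt: str) -> Latent:
    # One coarse value per word in the prompt.
    return [float(len(word)) for word in prompt.split()]


def toy_refiner(prompt: str, draft: Latent) -> Latent:
    # "Refine" by nudging every value; a real refiner would be a second model.
    return [value + 0.5 for value in draft]


result = run_two_stage("a red fox", toy_base, toy_refiner)
print(result)  # [1.5, 3.5, 3.5]
```

Passing the stages in as callables keeps the hand-off explicit: either stage could be swapped for a real model without changing `run_two_stage`.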
By comparison, the beta version had a smaller parameter count of 3.1B and used only a single model. SDXL 0.9 also runs on two CLIP models, including OpenCLIP ViT-G/14, one of the largest OpenCLIP models trained to date. This improves 0.9's ability to generate realistic imagery with greater depth and at a higher resolution of 1024x1024.
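To put these parameter counts in perspective, the following back-of-the-envelope sketch converts them into an approximate size for the weights alone. It assumes 16-bit weights (2 bytes per parameter), a precision the announcement does not state, and it ignores activations and any other runtime memory. The helper name `weight_size_gb` is hypothetical.

```python
# Assumption: 16-bit weights. The announcement does not name a precision.
BYTES_PER_PARAM_FP16 = 2


def weight_size_gb(params_billions: float,
                   bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate size of the weights alone, in decimal gigabytes (1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


# Parameter counts (in billions) as quoted in the announcement.
for name, params in [("beta", 3.1), ("0.9 base", 3.5), ("0.9 ensemble", 6.6)]:
    print(f"{name}: ~{weight_size_gb(params):.1f} GB")
# beta: ~6.2 GB
# 0.9 base: ~7.0 GB
# 0.9 ensemble: ~13.2 GB
```

Under that assumption, the base model's weights come to roughly 7 GB. These figures are rough estimates for scale only, not measurements of actual memory use.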
Accessible and User-Friendly
SDXL 0.9's system requirements are modest. It runs on a modern consumer GPU and needs only Windows 10 or 11, or Linux, with 16GB of RAM and an Nvidia GeForce RTX 20 series graphics card (or higher) equipped with at least 8GB of VRAM. Linux users can also use a compatible AMD card with 16GB of VRAM.
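As a rough illustration, the stated minimums can be written as a small check. This is a sketch under assumptions: the `Machine` fields and the function name `meets_sdxl_09_minimums` are hypothetical, and because the announcement does not say which AMD cards count as compatible, the check looks only at operating system and VRAM for AMD.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    os: str          # "windows10", "windows11", or "linux"
    ram_gb: int      # system RAM in GB
    gpu_vendor: str  # "nvidia" or "amd"
    rtx_series: int  # 20, 30, 40, ... for Nvidia GeForce RTX cards; 0 otherwise
    vram_gb: int     # GPU memory in GB


def meets_sdxl_09_minimums(m: Machine) -> bool:
    """Check a machine against the minimums listed in the announcement."""
    if m.os not in {"windows10", "windows11", "linux"} or m.ram_gb < 16:
        return False
    if m.gpu_vendor == "nvidia":
        # GeForce RTX 20 series or higher, with at least 8GB of VRAM.
        return m.rtx_series >= 20 and m.vram_gb >= 8
    if m.gpu_vendor == "amd":
        # The AMD option is stated for Linux only, with 16GB of VRAM.
        # Card compatibility is not specified, so it is not checked here.
        return m.os == "linux" and m.vram_gb >= 16
    return False


print(meets_sdxl_09_minimums(Machine("windows11", 16, "nvidia", 30, 8)))  # True
print(meets_sdxl_09_minimums(Machine("windows11", 32, "amd", 0, 16)))     # False
```

The second machine fails only because the AMD option is listed for Linux users.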
Impressive Launch Statistics and Availability
The beta launch of SDXL drew significant attention: nearly 7,000 users from the Stability AI Discord community generated over 700,000 images. More than 54,000 of those images were submitted to Discord community 'Showdowns', and 3,521 SDXL images were selected as winners.
SDXL 0.9 is now available on the ClipDrop platform by Stability AI. Stability AI API and DreamStudio customers will gain access to the model on June 26, as will other leading image-generating tools such as NightCafe. For now, SDXL 0.9 is available exclusively for research purposes, so that Stability AI can gather feedback and refine the model before its open release. The code to run it will be made publicly available on GitHub.
Researchers interested in accessing these models can apply using the following links: SDXL-0.9-Base model and SDXL-0.9-Refiner. Please log in to your Hugging Face account with your academic email to request access.
Following the release of SDXL 0.9, Stability AI is planning a full open release of SDXL 1.0 targeted for mid-July.