- author: Matthew Berman
Stable Diffusion XL 0.9: A Leap Forward in AI Image Generation
Stability AI has just announced Stable Diffusion XL 0.9, a significant advance in AI-generated image quality. This open-source model produces stunning results comparable to Midjourney-generated images. In this article, we will explore the features of Stable Diffusion XL 0.9, learn how to use it, and compare its output to previous versions.
Improved Image Quality and Composition
Stable Diffusion XL beta was released in April, and in just a few months Stable Diffusion XL 0.9 has taken a massive leap forward in image detail and composition. The pace of improvement in AI models, whether for text or generative art, is staggering, and Stable Diffusion XL 0.9 is no exception. Best of all, it is an open-source project, completely free for everyone to use.
To showcase the improvements, let's look at some examples comparing the beta version to the new Stable Diffusion XL 0.9:
Beta Version vs. New Version (Example 1):
- Left: The beta version.
- Right: The new version.
- The new version exhibits significantly more detail, color, and a stunning bokeh effect in the background.
Beta Version vs. New Version (Example 2):
- Left: The beta version.
- Right: The new version.
- The new version showcases a remarkable level of detail, capturing the intricate fur of a wolf in Yosemite National Park.
Beta Version vs. New Version (Example 3):
- Left: The beta version.
- Right: The new version.
- The new version presents flawless details and improvements in capturing the intricacies of a manicured hand holding a coffee cup.
Expanded Functionality and Use Cases
Stable Diffusion XL 0.9 offers an array of functionalities beyond basic text prompting and image-to-image prompting. It includes:
- In-painting: Allows users to replace portions of an image with generative art.
- Out-painting: Enables seamless extension of existing images, creating AI-generated art extensions.
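Conceptually, both features boil down to handing the model an image plus a mask that marks which pixels should be regenerated: in-painting masks a region inside the image, while out-painting pads the canvas and masks the new border. The following dependency-free sketch illustrates that mask preparation only; the function names and the plain-list image representation are illustrative, not part of any SDXL API:

```python
# Conceptual sketch: in-painting and out-painting both supply the model with
# an image plus a mask. Masked pixels (1) are regenerated; unmasked (0) kept.
# Images are plain 2D lists of pixel values, purely for illustration.

def inpaint_mask(width, height, box):
    """Mask marking a rectangular region *inside* the image for regeneration."""
    x0, y0, x1, y1 = box
    return [[1 if x0 <= x < x1 and y0 <= y < y1 else 0
             for x in range(width)] for y in range(height)]

def outpaint_canvas(image, pad):
    """Extend the image by `pad` pixels on every side; new pixels are masked."""
    h, w = len(image), len(image[0])
    canvas = [[0] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    mask = [[1] * (w + 2 * pad) for _ in range(h + 2 * pad)]
    for y in range(h):
        for x in range(w):
            canvas[y + pad][x + pad] = image[y][x]
            mask[y + pad][x + pad] = 0  # original pixels are preserved
    return canvas, mask
```

A real pipeline would then denoise only the masked pixels, conditioned on the text prompt and the surrounding unmasked content.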
Key Driver for Advancements: Increased Parameter Count
The main driver behind the improved composition and output quality in Stable Diffusion XL 0.9 is a significant increase in parameter count. The new version has one of the largest parameter counts of any open-source image model: a 3.5-billion-parameter base model and a 6.6-billion-parameter ensemble pipeline, in which two models run in sequence and their results are combined. By comparison, the beta version used a single model with 3.1 billion parameters.
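The "ensemble pipeline" can be pictured as a two-stage flow: a base model generates a rough latent, and a second refinement model polishes its detail before decoding. The stubs below are hypothetical stand-ins for real diffusion models, meant only to show the control flow, not the actual SDXL API:

```python
# Illustrative two-stage ensemble flow: a base model produces a rough latent,
# then a refiner model continues denoising it. These stubs merely record which
# stage ran and how many denoising steps accumulated.

def base_model(prompt, steps=40):
    """Stand-in for the base model: returns a rough latent for the prompt."""
    return {"prompt": prompt, "stage": "base", "steps": steps}

def refiner_model(prompt, latent, steps=10):
    """Stand-in for the refiner: continues denoising the base latent."""
    return dict(latent, stage="refined", steps=latent["steps"] + steps)

def ensemble_pipeline(prompt):
    """Run both stages in sequence, as the ensemble pipeline does."""
    latent = base_model(prompt)
    return refiner_model(prompt, latent)
```

The split lets each model specialize: the base handles global composition, and the refiner spends its capacity on fine detail.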
Stable Diffusion XL also uses two CLIP text encoders, including one of the largest open CLIP models trained to date. This added capacity contributes to more realistic imagery, greater depth, and improved resolution of up to 1024x1024 pixels.
System Requirements and Accessibility
Despite its powerful output and advanced model architecture, Stable Diffusion XL 0.9 can run on a modern consumer GPU. The system requirements include:
- Operating System: Windows 10 or 11, or Linux
- RAM: 16 GB (not VRAM)
- Graphics Card: Nvidia GeForce RTX 20 series with 8 GB VRAM
Stable Diffusion XL 0.9 is available on Clipdrop and through the DreamStudio API. The code is already available on Stability AI's GitHub page, so users can download and experiment with it. While Stable Diffusion XL 0.9 is currently available for research purposes only, a fully refined successor, Stable Diffusion XL 1.0, is set for release in mid-July.
Creative Possibilities Explored with Stable Diffusion XL
The Stable Diffusion XL model, as demonstrated on Clipdrop's website, opens up a wide range of creative possibilities. Notable examples include anime-style artwork, realistic images with a skeleton overlay, hyper-realism, and the tilt-shift effect. The images generated by Stable Diffusion XL are on par with Midjourney's results, showcasing the model's exceptional capabilities.
Pricing and Comparison with Midjourney
One key advantage of Stable Diffusion XL is that it is completely free to use, unlike Midjourney, which offers only limited free access. On Clipdrop, users can generate up to 400 images per day, leaving plenty of room to experiment. Clipdrop also offers other AI features, including background removal, picture clean-up, relighting, and image upscaling.
For a direct comparison, the author tested Stable Diffusion XL with prompts taken from the Midjourney website. While the results do not mimic Midjourney's signature look, they show a high level of artistry and quality, and some even surpassed Midjourney's output, underscoring the exceptional progress Stable Diffusion XL has made.
Additional Features: Relighting
Clipdrop also includes exciting additional features alongside Stable Diffusion XL, such as relighting. Users can position virtual lights anywhere around a face or object to manipulate the lighting conditions. This real-time feature lets users adjust each light's intensity, radius, and distance, altering the appearance and mood of the image. The results are striking, further extending the creative potential on offer.
The Future of Stable Diffusion XL
Looking ahead, Stability AI plans to release Stable Diffusion XL 1.0, the fully open version, in mid-July. The progression of this open-source model is remarkable, and the potential for even more astounding improvements is incredibly promising. As open-source AI technologies continue to advance at a rapid pace, Stable Diffusion XL exemplifies the future of AI-generated art.
In conclusion, Stable Diffusion XL 0.9 marks a significant leap forward in AI image generation, delivering stunning, realistic results. Thanks to the model's open-source nature and free access, anyone can explore and create with it. As Stable Diffusion XL progresses, the upcoming release of version 1.0 promises even more remarkable advances in generative AI art.