Today, Bob shared another Alibaba project — "Outfit Anyone".
I personally tried it on Hugging Face, and you can also check out the results: https://huggingface.co/spaces/
Project Introduction
Virtual try-on technology has become a transformative tool, allowing users to experiment with different fashion styles without physically trying them on. However, existing methods often struggle to generate high-fidelity results and maintain detail consistency. Although diffusion models have proven their ability to produce high-quality, realistic images, they still have difficulty with control and consistency in conditional generation scenarios such as virtual try-on. "Outfit Anyone" addresses these limitations with a two-stream conditional diffusion model, enabling it to handle clothing deformation skillfully and produce more realistic results.
Research Methodology
The core of the "Outfit Anyone" method is a conditional diffusion model that takes images of the model, the clothing, and an accompanying text prompt, with the clothing image acting as the control signal. Internally, the network is split into two streams that process the model data and the clothing data independently; these streams then converge in a fusion network that embeds clothing details into the model's feature representation.
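To make the two-stream idea concrete, here is a minimal PyTorch sketch; it is not the project's actual architecture, and all class and variable names (`StreamEncoder`, `TwoStreamFusion`, etc.) are illustrative assumptions. It shows a person image and a garment image encoded by separate streams, with cross-attention standing in for the fusion step that injects garment details into the person features.

```python
import torch
import torch.nn as nn


class StreamEncoder(nn.Module):
    """Small convolutional encoder standing in for one of the two streams."""
    def __init__(self, in_channels: int, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, dim, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class TwoStreamFusion(nn.Module):
    """Toy two-stream module: person and garment images are processed
    independently, then fused so garment details condition the person features."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.person_stream = StreamEncoder(3, dim)
        self.garment_stream = StreamEncoder(3, dim)
        # Fusion stand-in: cross-attention from person features (queries)
        # to garment features (keys/values).
        self.fusion = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        p = self.person_stream(person)      # (B, C, H, W)
        g = self.garment_stream(garment)    # (B, C, H, W)
        b, c, h, w = p.shape
        q = p.flatten(2).transpose(1, 2)    # (B, H*W, C)
        kv = g.flatten(2).transpose(1, 2)   # (B, H*W, C)
        fused, _ = self.fusion(q, kv, kv)   # garment details attended into person features
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return self.out(fused)


# Toy usage with random tensors in place of real images.
model = TwoStreamFusion()
person_img = torch.randn(1, 3, 64, 64)
garment_img = torch.randn(1, 3, 64, 64)
features = model(person_img, garment_img)
print(features.shape)  # torch.Size([1, 64, 32, 32])
```

Cross-attention is only one plausible fusion mechanism; the point of the sketch is the structural split into independent streams that meet in a fusion step.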
On this foundation, the team built "Outfit Anyone," which includes two key components: the "Zero-shot Try-on Network," which generates the initial try-on image, and the "Post-hoc Refiner," which enhances clothing and skin textures in the output.
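This coarse-then-refine flow can be sketched as below. The `ZeroShotTryOnNet` and `PostHocRefiner` classes here are toy stand-ins under assumed names, not the released models: the first produces a coarse try-on image from the person and garment inputs, and the second applies a residual pass that mimics texture refinement.

```python
import torch
import torch.nn as nn


class ZeroShotTryOnNet(nn.Module):
    """Placeholder for the try-on network: maps (person, garment) to a
    coarse try-on image via a single conv over the stacked inputs."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.conv(torch.cat([person, garment], dim=1)))


class PostHocRefiner(nn.Module):
    """Placeholder for the refiner: predicts a residual added back to the
    coarse output, mimicking a detail-enhancement pass."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, coarse: torch.Tensor) -> torch.Tensor:
        return torch.clamp(coarse + 0.1 * self.conv(coarse), 0.0, 1.0)


def outfit_anyone_pipeline(person: torch.Tensor, garment: torch.Tensor):
    """Two-stage flow: coarse try-on image first, texture refinement second."""
    coarse = ZeroShotTryOnNet()(person, garment)
    refined = PostHocRefiner()(coarse)
    return coarse, refined


person = torch.rand(1, 3, 128, 128)   # stand-in for a model photo
garment = torch.rand(1, 3, 128, 128)  # stand-in for a clothing photo
coarse, refined = outfit_anyone_pipeline(person, garment)
print(coarse.shape, refined.shape)
```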
Let's take a look at the different effects:
Real-world scenarios, single garments, full outfits, unusual fashion, various body types, and anime characters.
The official website further showcases the effects before and after using the "Refiner," demonstrating its ability to significantly enhance clothing textures and realism while maintaining consistency in the clothing.
(Before / after Refiner comparison images)
"Outfit Anyone" + "Animate Anyone"