Today, Bob shared another Alibaba project — "Outfit Anyone".
I personally tried it on Hugging Face, and you can also check out the results: https://huggingface.co/spaces/
Project Introduction
Virtual try-on technology has become a transformative tool, allowing users to experiment with different fashion styles without physically trying them on. However, existing methods often struggle to generate high-fidelity results and maintain detail consistency. Although diffusion models have proven their ability to produce high-quality, realistic images, they still have difficulty with control and consistency in conditional generation scenarios such as virtual try-on. "Outfit Anyone" addresses these limitations with a two-stream conditional diffusion model, enabling it to handle clothing deformation skillfully and produce more realistic results.
Research Methodology
The core of the "Outfit Anyone" method is a conditional diffusion model that takes images of the model, the clothing, and an accompanying text prompt, with the clothing image acting as the control signal. Internally, the network is split into two streams that process the model data and the clothing data independently; these streams then converge in a fusion network that embeds clothing details into the model's feature representation.
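To make the two-stream idea concrete, here is a minimal PyTorch sketch; it is not the project's actual architecture, and all class and variable names (`StreamEncoder`, `TwoStreamFusion`, etc.) are illustrative assumptions. It shows a person image and a garment image encoded by separate streams, with cross-attention standing in for the fusion step that injects garment details into the person features.

```python
import torch
import torch.nn as nn


class StreamEncoder(nn.Module):
    """Small convolutional encoder standing in for one of the two streams."""
    def __init__(self, in_channels: int, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, dim, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class TwoStreamFusion(nn.Module):
    """Toy two-stream module: person and garment images are processed
    independently, then fused so garment details condition the person features."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.person_stream = StreamEncoder(3, dim)
        self.garment_stream = StreamEncoder(3, dim)
        # Fusion stand-in: cross-attention from person features (queries)
        # to garment features (keys/values).
        self.fusion = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
        self.out = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        p = self.person_stream(person)      # (B, C, H, W)
        g = self.garment_stream(garment)    # (B, C, H, W)
        b, c, h, w = p.shape
        q = p.flatten(2).transpose(1, 2)    # (B, H*W, C)
        kv = g.flatten(2).transpose(1, 2)   # (B, H*W, C)
        fused, _ = self.fusion(q, kv, kv)   # garment details attended into person features
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return self.out(fused)


# Toy usage with random tensors in place of real images.
model = TwoStreamFusion()
person_img = torch.randn(1, 3, 64, 64)
garment_img = torch.randn(1, 3, 64, 64)
features = model(person_img, garment_img)
print(features.shape)  # torch.Size([1, 64, 32, 32])
```

Cross-attention is only one plausible fusion mechanism; the point of the sketch is the structural split into independent streams that meet in a fusion step.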
On this foundation, the team built "Outfit Anyone," which includes two key components: the "Zero-shot Try-on Network," which generates the initial try-on image, and the "Post-hoc Refiner," which enhances clothing and skin textures in the output.
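This coarse-then-refine flow can be sketched as below. The `ZeroShotTryOnNet` and `PostHocRefiner` classes here are toy stand-ins under assumed names, not the released models: the first produces a coarse try-on image from the person and garment inputs, and the second applies a residual pass that mimics texture refinement.

```python
import torch
import torch.nn as nn


class ZeroShotTryOnNet(nn.Module):
    """Placeholder for the try-on network: maps (person, garment) to a
    coarse try-on image via a single conv over the stacked inputs."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, person: torch.Tensor, garment: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.conv(torch.cat([person, garment], dim=1)))


class PostHocRefiner(nn.Module):
    """Placeholder for the refiner: predicts a residual added back to the
    coarse output, mimicking a detail-enhancement pass."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, coarse: torch.Tensor) -> torch.Tensor:
        return torch.clamp(coarse + 0.1 * self.conv(coarse), 0.0, 1.0)


def outfit_anyone_pipeline(person: torch.Tensor, garment: torch.Tensor):
    """Two-stage flow: coarse try-on image first, texture refinement second."""
    coarse = ZeroShotTryOnNet()(person, garment)
    refined = PostHocRefiner()(coarse)
    return coarse, refined


person = torch.rand(1, 3, 128, 128)   # stand-in for a model photo
garment = torch.rand(1, 3, 128, 128)  # stand-in for a clothing photo
coarse, refined = outfit_anyone_pipeline(person, garment)
print(coarse.shape, refined.shape)
```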
Let's take a look at the different effects:
Real-world scenarios, single garments, full outfits, unusual fashion, various body types, and anime characters.
The official website further showcases the effects before and after using the "Refiner," demonstrating its ability to significantly enhance clothing textures and realism while maintaining consistency in the clothing.
(Before / after Refiner comparison images)
"Outfit Anyone" + "Animate Anyone"