By Yash Thakker

DragDiffusion: Revolutionizing Image Editing with Diffusion Models

Are you interested in the latest developments in image editing? DragDiffusion offers a fresh take on traditional editing methods. Developed through research by Yujun Shi, Chuhui Xue, Jiachun Pan, Wenqing Zhang, Vincent Y. F. Tan, and Song Bai, it enables interactive point-based image editing by applying diffusion models. More information is available on the project page.


What is DragDiffusion?

Please note that DragDiffusion is a research project, not a commercial product. Its primary function is interactive point-based image editing. It is designed to run on an NVIDIA GPU under Linux; other configurations have not yet been tested.
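A quick preflight check can confirm that a machine resembles the configuration described above. The sketch below uses only the Python standard library. The function name check_environment is hypothetical and not part of DragDiffusion, and finding nvidia-smi on the PATH is only a rough hint that an NVIDIA driver is installed, not a guarantee that the tool will run.

```python
import platform
import shutil


def check_environment():
    """Report whether this machine looks like the tested setup (Linux + NVIDIA GPU)."""
    return {
        # platform.system() returns "Linux", "Darwin", "Windows", and so on.
        "is_linux": platform.system() == "Linux",
        # nvidia-smi ships with the NVIDIA driver; its presence is only a hint.
        "has_nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If either value comes back "no", the setup falls outside what the project reports as tested, so results may vary.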

How to Install and Run DragDiffusion?

Installing and running DragDiffusion is straightforward. To install the required libraries, run the following commands:

    conda env create -f environment.yaml
    conda activate dragdiff

Before running DragDiffusion, set up "accelerate" with the following command: accelerate config.

After setting up, two steps are needed to use DragDiffusion:

Step 1: Train a LoRA

To train a LoRA (Low-Rank Adaptation) on your input image, first place the image in a folder that contains nothing else. Then set "SAMPLE_DIR" and "OUTPUT_DIR" in the script "lora/train_lora.sh": "SAMPLE_DIR" should be the directory containing your input image, and "OUTPUT_DIR" should be where you want to save the trained LoRA. You also need to set the "--instance_prompt" option in the same script to a suitable prompt. This prompt does not have to be complicated.
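Because the sample folder must hold exactly one image, it can help to build that folder with a small script rather than by hand. The following is a minimal sketch, assuming a hypothetical helper named prepare_sample_dir and an assumed set of accepted image extensions; neither comes from the DragDiffusion repository. The folder it creates is the one you would then point "SAMPLE_DIR" at.

```python
import shutil
from pathlib import Path

# Assumption: these are the image formats you intend to use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def prepare_sample_dir(image_path, sample_dir):
    """Copy one image into sample_dir, refusing to proceed if the folder is not empty."""
    image_path = Path(image_path)
    sample_dir = Path(sample_dir)
    if image_path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"not a recognized image file: {image_path}")
    sample_dir.mkdir(parents=True, exist_ok=True)
    existing = list(sample_dir.iterdir())
    if existing:
        # The folder must contain only the single input image.
        raise RuntimeError(f"{sample_dir} is not empty: {existing}")
    # copy2 preserves file metadata and returns the destination path.
    return Path(shutil.copy2(image_path, sample_dir / image_path.name))
```

Failing loudly on a non-empty folder is a deliberate choice here: silently training on a stray second image would be harder to notice than an error.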

After the "lora/train_lora.sh" file has been configured properly, run the following command to train a LoRA: bash lora/train_lora.sh.

Step 2: Perform "Drag" Editing

Once the LoRA is trained, run the following command to start the Gradio user interface: python3 drag_ui_real.py. Please refer to the demo video for a detailed explanation of how to perform the "drag" editing.

The editing process involves several steps, from dropping your input image into the left-most box to clicking the "Run" button to initiate the algorithm. The final results will be displayed in the right-most box.

Here is an explanation of the parameters in the user interface:

  • prompt: The prompt describing the user input image (should be the same as the prompt used to train LoRA).

  • lora_path: The path to the trained LoRA.

  • n_pix_step: Maximum number of steps of motion supervision. Increase this value if handle points have not been "dragged" to the desired position.

  • lam: The regularization coefficient that controls how strongly the unmasked region is kept unchanged. Increase this value if the unmasked region has changed more than desired.

  • n_actual_inference_step: Number of DDIM inversion steps performed.
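To build intuition for how lam and n_pix_step behave, here is a toy sketch in plain Python. It is not the DragDiffusion implementation: the real method optimizes a diffusion latent and tracks points in feature space, whereas this example uses flat lists of numbers and a 2D point. All function names are hypothetical, and the convention that mask value 1 marks the editable region (so 0 marks the unmasked region) is an assumption made for illustration.

```python
def regularization_term(latent, original, mask):
    """L1 change in the unmasked region (mask == 0 means the value should stay fixed)."""
    return sum(abs(z - z0) * (1 - m) for z, z0, m in zip(latent, original, mask))


def total_loss(motion_loss, latent, original, mask, lam):
    """Motion supervision loss plus lam times the unmasked-region penalty."""
    return motion_loss + lam * regularization_term(latent, original, mask)


def drag_point(handle, target, n_pix_step, step_size=1.0):
    """Move a handle point toward target, taking at most n_pix_step steps."""
    x, y = handle
    steps = 0
    while steps < n_pix_step:
        dx, dy = target[0] - x, target[1] - y
        dist = (dx * dx + dy * dy) ** 0.5
        if dist < 1e-9:
            break  # the handle has reached the target
        # Never overshoot: the final step is shortened to land on the target.
        scale = min(step_size, dist) / dist
        x, y = x + dx * scale, y + dy * scale
        steps += 1
    return (x, y), steps


if __name__ == "__main__":
    latent, original, mask = [1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1, 0, 0]
    print(total_loss(0.5, latent, original, mask, lam=0.1))
    print(drag_point((0.0, 0.0), (3.0, 4.0), n_pix_step=2))
    print(drag_point((0.0, 0.0), (3.0, 4.0), n_pix_step=10))
```

In this toy setting, a larger lam makes any change outside the mask cost more, and a step budget that is too small leaves the handle short of its target, which mirrors the advice above to raise n_pix_step when points have not been dragged far enough.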

DragDiffusion is a step forward in the world of image editing, introducing a whole new level of interaction and precision to the process. It showcases the power and potential of diffusion models in transforming traditional practices, enhancing user experience, and opening new possibilities in the field.


