Custom Diffusion Hands-On


쫑쫑JJONG 2023. 6. 21. 14:12

First, create a virtual environment with pyenv.

conda activates its base environment by default, so turn that off with the commands below.

conda deactivate
conda config --set auto_activate_base false

Next, set up pyenv on Ubuntu. I followed this guide:

https://aisj.tistory.com/191#--%--pyenv-virtualenv

curl -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash

When the installer finishes, it asks you to add pyenv to your environment variables. Open ~/.bashrc:

nano ~/.bashrc

Add the following lines:
export PYENV_ROOT="$HOME/.pyenv"
command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

# Load pyenv-virtualenv automatically by adding
# the following to ~/.bashrc:
eval "$(pyenv virtualenv-init -)"

source ~/.bashrc

Next, install the libraries that pyenv needs in order to build Python.

sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev \
libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw5-dev \
xz-utils tk-dev libffi-dev liblzma-dev python-openssl git

Then reload the shell.

exec "$SHELL"
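Before moving on, it can help to confirm that the reloaded shell actually sees pyenv and the tools installed above. This is a minimal sketch, assuming bash; check_tools is a hypothetical helper name, not part of pyenv.

```shell
# check_tools: hypothetical helper, not part of pyenv.
# Prints one line per command and returns non-zero if any is missing from PATH.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_tools pyenv git curl. If pyenv is reported as missing, the lines added to ~/.bashrc were probably not loaded in the current shell.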

Then check which Python versions are available to install.

pyenv install --list | grep " 3\.[678]"

Custom Diffusion follows the Stable Diffusion environment,

and Stable Diffusion uses Python 3.8.5, so install that version.

pyenv install 3.8.5

To confirm the installation, run

pyenv versions

Now create the virtual environment.

pyenv virtualenv [Python version] [environment name]
pyenv virtualenv 3.8.5 custom

The environment should now be created.

Then move to the directory you want to work in

mkdir -p ~/custom_diffusion
cd ~/custom_diffusion

and apply the virtual environment there.

pyenv local custom

At this point the environment setup is done.

Custom Diffusion Hands-On

Go to the official GitHub repository below and follow its setup instructions.

https://github.com/adobe-research/custom-diffusion

GitHub - adobe-research/custom-diffusion: Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

Single concept

The single-concept setup is as follows (on a single RTX 4090 GPU).

First, as the repository instructs, clone custom-diffusion and then clone stable-diffusion inside it.

git clone https://github.com/adobe-research/custom-diffusion.git
cd custom-diffusion
git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion
conda env create -f environment.yaml
conda activate ldm
pip install clip-retrieval tqdm

Next, download the checkpoint of the released model.

To keep things separate, I put it in a directory named custom-diffusion/pretrained_model_path.

I downloaded the model with the following command:

wget https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt

More models are said to be available at the link below:

https://www.cs.cmu.edu/~custom-diffusion/assets/models/

Next, download the dataset.

## download dataset
wget https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip
unzip data.zip

The next step in the README is:

## run training (30 GB on 2 GPUs)
bash scripts/finetune_real.sh "cat" data/cat real_reg/samples_cat  cat finetune_addtoken.yaml <pretrained-model-path>

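As its comment says, this command assumes 2 GPUs, so running it on a single GPU fails with an error.

My machine has only one GPU index, but the script asks for two.

So the script needs a small change.

The command runs scripts/finetune_real.sh, so open that file and edit the two python calls to look like this: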

if [ "${ARRAY[4]}" == "finetune.yaml" ]; then
    python -u  train.py \
            --base configs/custom-diffusion/${ARRAY[4]}  \
            -t --gpus 0, \
            --resume-from-checkpoint-custom  ${ARRAY[5]} \
            --caption "${ARRAY[0]}" \
            --datapath ${ARRAY[1]} \
            --reg_datapath "${ARRAY[2]}/images.txt" \
            --reg_caption "${ARRAY[2]}/caption.txt" \
            --name "${ARRAY[3]}-sdv4"
else
    python -u  train.py \
            --base configs/custom-diffusion/${ARRAY[4]}  \
            -t --gpus 0, \
            --resume-from-checkpoint-custom  ${ARRAY[5]} \
            --caption "<new1> ${ARRAY[0]}" \
            --datapath ${ARRAY[1]} \
            --reg_datapath "${ARRAY[2]}/images.txt" \
            --reg_caption "${ARRAY[2]}/caption.txt" \
            --modifier_token "<new1>" \
            --name "${ARRAY[3]}-sdv4"
fi

That is, change

-t --gpus 0,1 \ to -t --gpus 0, \ in both branches.

Keep the trailing comma. Without it, the run fails with an error along these lines:

Error ~ .split(",") cannot be applied to the GPU value

so be careful.
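The same edit can be scripted instead of done by hand. This is a minimal sketch, assuming the script contains the literal text --gpus 0,1 and that your sed accepts -i.bak (GNU sed on Ubuntu does); use_single_gpu is a hypothetical helper name.

```shell
# use_single_gpu: hypothetical helper that rewrites "--gpus 0,1" to "--gpus 0,"
# in the given script, keeping a .bak copy of the original file.
use_single_gpu() {
    local script="$1"
    # The trailing comma is kept on purpose so the value stays a comma-separated list.
    sed -i.bak 's/--gpus 0,1/--gpus 0,/g' "$script"
    # Show the resulting lines for a quick visual check.
    grep -n -- '--gpus' "$script"
}
```

Run it from the custom-diffusion directory as use_single_gpu scripts/finetune_real.sh. The original file is kept as scripts/finetune_real.sh.bak in case you need to go back.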

Running the script again then fails with

RuntimeError: CUDA out of memory

To get around this, open the finetune_addtoken.yaml file passed in the command above:

model:
  base_learning_rate: 1.0e-05
  target: src.model.CustomDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.0120
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: "image"
    cond_stage_key: "caption"
    image_size: 64
    channels: 4
    cond_stage_trainable: True   # Note: different from the one we trained before
    add_token: True
    freeze_model: "crossattn-kv"
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 64 # unused
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions: [ 4, 2, 1 ]
        num_res_blocks: 2
        channel_mult: [ 1, 2, 4, 4 ]
        num_heads: 8
        use_spatial_transformer: True
        transformer_depth: 1
        context_dim: 768
        use_checkpoint: False
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: src.custom_modules.FrozenCLIPEmbedderWrapper
      params:
        modifier_token: <new1>

data:
  target: train.DataModuleFromConfig
  params:
    batch_size: 4
    num_workers: 4
    wrap: false
    train:
      target: src.finetune_data.MaskBase
      params:
        size: 256
    train2:
      target: src.finetune_data.MaskBase
      params:
        size: 256


lightning:
  callbacks:
    image_logger:
      target: train.ImageLogger
      params:
        batch_frequency: 1000
        max_images: 8
        increase_log_steps: False

  trainer:
    max_steps: 300

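As shown above, I changed the image size under data, the resolution of the first stage, and a few related values.

With these changes, memory usage dropped enough for training to run on my GPU.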
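To see why a smaller image size helps, it is useful to estimate the size of the latent that the UNet works on. This is a rough sketch, assuming the autoencoder halves the resolution at every ch_mult level except the last (so four levels give a factor of 8), as in the latent-diffusion encoder; latent_side is a hypothetical helper.

```shell
# latent_side: hypothetical helper. Given an input image side length and the
# number of ch_mult levels, print the side length of the latent feature map,
# assuming one 2x downsampling per level except the last.
latent_side() {
    local size="$1" levels="$2"
    echo $(( size / (1 << (levels - 1)) ))
}

latent_side 256 4   # input side 256 with ch_mult [1, 2, 4, 4]
latent_side 512 4   # input side 512, for comparison
```

Under that assumption a 256 input gives a 32x32 latent and a 512 input gives 64x64, so the 256 setting has a quarter as many latent positions per image.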

Then run the training command from above.

If you followed the steps so far, the command looks like this:

bash scripts/finetune_real.sh "cat" data/cat real_reg/samples_cat  cat finetune_addtoken.yaml pretrained_model_path/sd-v1-4.ckpt

The paths are relative, so run it from the custom-diffusion directory.

Next, run

## save updated model weights
python src/get_deltas.py --path logs/<folder-name> --newtoken 1

to save the updated model weights.

In my case the newest folder in the logs directory was "2023-06-25T20-22-52_cat-sdv4", so I used that as <folder-name>.
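Picking the newest folder can be automated. This is a minimal sketch that relies on the run folders starting with a timestamp, as in the name above; latest_run is a hypothetical helper, not part of the repository.

```shell
# latest_run: hypothetical helper that prints the newest run folder under a
# log directory. It relies on the folder names starting with a timestamp
# (e.g. 2023-06-25T20-22-52_cat-sdv4), so a plain sort is chronological.
latest_run() {
    local logdir="${1:-logs}"
    find "$logdir" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1
}
```

It can then be used as python src/get_deltas.py --path "$(latest_run logs)" --newtoken 1. If several experiments share the logs directory, check the printed folder name before using it.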

Then generate images with the command below.

python sample.py --prompt "<new1> cat playing with a ball" --delta_ckpt logs/<folder-name>/checkpoints/delta_epoch\=000004.ckpt --ckpt <pretrained-model-path>

In my case the checkpoints directory contained delta_epoch=last.ckpt rather than delta_epoch=000004.ckpt, so I ran

python sample.py --prompt "<new1> cat playing with a ball" --delta_ckpt logs/2023-06-25T20-22-52_cat-sdv4/checkpoints/delta_epoch\=last.ckpt --ckpt pretrained_model_path/sd-v1-4.ckpt

The results are then written under the run's folder in logs: six individual images plus one combined image.

Results

Next, I tried training on photos of President Yoon Suk Yeol.

First put the photos in data/president/datasets, then run

bash scripts/finetune_real.sh "man" data/president/datasets real_reg/samples_man  man finetune_addtoken.yaml pretrained_model_path/sd-v1-4.ckpt

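I used man instead of president because the retrieved regularization images should be fairly general, and with president the retrieval kept returning pictures of text or cheese.

Looking at the retrieved data,

the regularization data consists of photos of men from other countries, which is a pity, but I went ahead anyway.

When training finishes, the logs directory contains a new folder for man,

which can be found the same way as before.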

## save updated model weights
python src/get_deltas.py --path logs/2023-06-25T23-56-39_man-sdv4 --newtoken 1

python sample.py --prompt "<new1> person as Neo in the matrix movie" --delta_ckpt logs/2023-06-25T23-56-39_man-sdv4/checkpoints/delta_epoch\=000000.ckpt --ckpt pretrained_model_path/sd-v1-4.ckpt

Method recommended in the paper

bash scripts/finetune_real.sh "Asian man" data/president/dataset  Asian finetune_face.yaml pretrained_model_path/sd-v1-4.ckpt

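Running this fails with

File "/opt/.pyenv/versions/3.8.5/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

My first guess was that the caption cannot contain a space. Note, though, that this command also leaves out the regularization directory (it has five arguments instead of six), so the remaining arguments shift by one position, and that may be the real cause. The version below replaces the space with an underscore and passes real_reg/samples_Asian, and it ran: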

bash scripts/finetune_real.sh "Asian_man" data/president/datasets real_reg/samples_Asian Asian finetune_face.yaml pretrained_model_path/sd-v1-4.ckpt

 python src/get_deltas.py --path logs/2023-06-26T01-49-56_Asian-sdv4 --newtoken 1

python sample.py --prompt "A photo of <new1> " --delta_ckpt logs/2023-06-26T01-49-56_Asian-sdv4/checkpoints/delta_epoch\=000004.ckpt --ckpt pretrained_model_path/sd-v1-4.ckpt
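One thing worth checking when comparing the two finetune_face.yaml commands above: the failing one passes five arguments and the working one passes six (it adds real_reg/samples_Asian). Since finetune_real.sh reads its arguments by position (ARRAY[0] to ARRAY[5] in the excerpt earlier), a missing argument shifts everything after it. Below is a minimal guard, assuming bash; check_finetune_args is a hypothetical wrapper, not part of the repository, and the slot labels are my own reading of the script excerpt.

```shell
# check_finetune_args: hypothetical guard, not part of custom-diffusion.
# Verifies that exactly six positional arguments are given and prints how
# they line up with the ARRAY[...] slots used in finetune_real.sh.
check_finetune_args() {
    if [ "$#" -ne 6 ]; then
        echo "expected 6 arguments, got $#" >&2
        return 1
    fi
    # Labels are descriptive only; they follow the flags each slot feeds.
    local names=(caption datapath reg_dir run_name config checkpoint)
    local i=0 arg
    for arg in "$@"; do
        echo "ARRAY[$i] (${names[$i]}): $arg"
        i=$((i + 1))
    done
}
```

Calling it with the same arguments you intend to pass to finetune_real.sh prints the mapping, which makes a shifted config or checkpoint path easy to spot before a long run starts.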

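Additional notes

After finishing the setup and starting training, a few problems came up.

1. Running jobs on the server in the background

I did not want to keep my own computer on the whole time, so I looked for a way to run training in the background.

tmux and nohup both seem like good options.

I have used nohup before like this: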

nohup python train.py --configs swin_dyhead_baseline_a_g_change.py & tail -f nohup.out

The & sends the job to the background, and tail -f nohup.out follows its output.
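A slightly more explicit variant of the same idea writes the output to a named log file and prints the process ID, so the job can be checked or stopped later. This is a minimal sketch, assuming bash and the standard nohup utility; run_bg is a hypothetical helper.

```shell
# run_bg: hypothetical helper. Usage: run_bg LOGFILE COMMAND [ARGS...]
# Starts COMMAND so that it ignores the hangup signal, sends stdout and
# stderr to LOGFILE, and prints the PID of the background process.
run_bg() {
    local log="$1"
    shift
    nohup "$@" > "$log" 2>&1 &
    echo "$!"
}
```

For example, pid=$(run_bg train.log bash scripts/finetune_real.sh ...) starts training, tail -f train.log follows it, and kill "$pid" stops it. tmux is the better fit when you want to reattach to an interactive session instead of just reading a log.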

https://pinedance.github.io/blog/2022/07/29/bash-background-process
Running commands in Bash with nohup

https://blog.naver.com/myohyun/221407089764
Linux commands that I need, summarized

https://chloro.tistory.com/107
[Background processes - nohup]

https://lab.naminsik.com/274
lab.naminsik

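2. Opening the remote server in VS Code

I need to edit code on the server, and without this, editing becomes far too painful.

Install the VS Code extension for remote development over SSH, then add the server to the SSH config file.

In my case I created ~/.ssh/config and pasted in the following:

Host {host alias}
    HostName {host IP}
    # the port number, i.e. the value passed with -p in your ssh command
    Port {port number}
    User root
    IdentityFile {path to your key}
    ServerAliveInterval 300
    ServerAliveCountMax 96

For example:
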
Host seungjong
    HostName 27.96.***.**
    Port 22
    User seungjong
    IdentityFile ~/Documents/key
    ServerAliveInterval 300
    ServerAliveCountMax 96

https://learn.microsoft.com/ko-kr/azure-sphere/app-development/ssh-build-vscode
Remote build and debug over SSH with Visual Studio Code - Azure Sphere

https://code.visualstudio.com/docs/remote/ssh
Developing on Remote Machines using SSH and Visual Studio Code

License: Attribution, NonCommercial, NoDerivatives