How to fine-tune the pre-trained model on my dataset? #67

Open
felixshing opened this issue Jul 9, 2024 · 8 comments

Comments

@felixshing

Hello, I would like to ask how I can load the pre-trained model and fine-tune it on my self-collected dataset.

@yerfor
Owner

yerfor commented Jul 9, 2024

Hi, you can use the init_from_ckpt option.

@felixshing
Author

felixshing commented Jul 9, 2024

> Hi, you can use the init_from_ckpt option.

Thanks!

I have another question regarding the pre-trained models you provided. Specifically, you included "audio2secc_vae" and "secc2plane_torso_orig". However, in your training guidelines for audio, it is recommended to first train "audio_lm3d_syncnet" and then "audio2motion". Similarly, for motion, the guideline suggests first training "Img-to-Plane" followed by "Motion-to-Video", which includes "secc2plane_head" and "secc2plane_torso".

I am a bit confused about their relationships. Is "audio2secc_vae" equivalent to "audio2motion", and is "secc2plane_torso_orig" equivalent to "secc2plane_torso"?

For audio training, should I:

  1. Train "audio_lm3d_syncnet" myself, and then
  2. When training "audio2motion", provide the checkpoints from both my trained "audio_lm3d_syncnet" and the provided "audio2secc_vae"?

Or, do I not have to train "audio_lm3d_syncnet" at all and just provide "audio2secc_vae" for fine-tuning?

Similarly, for Motion-to-Video training, should I:

  1. Train "Img-to-Plane" myself
  2. Train "secc2plane_head" myself, based on trained "Img-to-Plane"
  3. When training "secc2plane_torso", provide the checkpoints from both my trained "secc2plane_head" and the provided "secc2plane_torso_orig"?

But it seems we can only set one checkpoint for "init_from_ckpt"?

Additionally, does "secc2plane_head" imply inferring only the head area without the torso?

Thank you so much for your help!

@yerfor
Owner

yerfor commented Jul 9, 2024

  1. Yes, "audio2secc_vae" is equivalent to "audio2motion", and "secc2plane_torso_orig" is equivalent to "secc2plane_torso".
  2. For audio training, should I ==> Yes, you need to train a syncnet.
  3. You can skip the image-to-plane pre-training and go through init_from_ckpt => secc2plane_head => secc2plane_torso.
  4. Does "secc2plane_head" imply inferring only the head area without the torso? ==> Yes.
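
The name mapping and training order in the answers above can be summarized in a small lookup. This is only an illustrative sketch: the dictionaries and the helper `stages_without_released_ckpt` are hypothetical names, not part of the repository, and the contents reflect only what is stated in this thread.

```python
# Released checkpoint name -> training stage it corresponds to,
# as confirmed in the answer above.
RELEASED_TO_STAGE = {
    "audio2secc_vae": "audio2motion",
    "secc2plane_torso_orig": "secc2plane_torso",
}

# Training order per branch as described in this thread.
# The image-to-plane pre-training is left out because it can be skipped.
TRAINING_ORDER = {
    "audio": ["audio_lm3d_syncnet", "audio2motion"],
    "motion": ["secc2plane_head", "secc2plane_torso"],
}


def stages_without_released_ckpt(branch):
    """Return the stages of a branch that have no matching released checkpoint."""
    released = set(RELEASED_TO_STAGE.values())
    return [stage for stage in TRAINING_ORDER[branch] if stage not in released]


print(stages_without_released_ckpt("audio"))   # ['audio_lm3d_syncnet']
print(stages_without_released_ckpt("motion"))  # ['secc2plane_head']
```

Read this way, the syncnet is the only audio stage without a released checkpoint, which matches the answer that it has to be trained.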

@felixshing
Author

felixshing commented Jul 9, 2024

Thank you so much for your response! I am still a bit confused about this step:

> You can skip the image-to-plane pre-training, and go through the init_from_ckpt => secc2plane_head => secc2plane_torso.

Where can we get the pre-trained model for image-to-plane? It appears that currently, we only have the pre-trained models for "audio2motion" and "secc2plane_torso".

Additionally, I noticed that during evaluation, the human figure changes each time instead of using the one I provided. Where is this part of the setup, and how can we modify it to use my provided human figure?


Thank you for your time!

@yerfor
Owner

yerfor commented Jul 9, 2024

You can use the provided pre-trained secc2plane_torso to initialize your own secc2plane_head model; just set strict=False.

To use your own human figure, please modify the code in validation_steps.
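
As a rough illustration of the strict=False suggestion, the sketch below shows what non-strict loading does in plain PyTorch. `TorsoModel` and `HeadModel` are tiny hypothetical stand-ins, not the real secc2plane networks, and how the repository passes `strict` through init_from_ckpt is not shown in this thread.

```python
import torch
from torch import nn


class TorsoModel(nn.Module):
    """Hypothetical stand-in for the released torso model."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)    # layers shared with the head model
        self.torso_head = nn.Linear(8, 3)  # layers only the torso model has


class HeadModel(nn.Module):
    """Hypothetical stand-in for the head model to be trained."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(8, 8)
        self.head_decoder = nn.Linear(8, 3)  # layers missing from the checkpoint


torso = TorsoModel()
ckpt_state = torso.state_dict()  # in practice this would come from torch.load(...)

head = HeadModel()
# strict=False copies every key that matches by name and reports the rest
# instead of raising an error.
result = head.load_state_dict(ckpt_state, strict=False)
print("missing keys:", result.missing_keys)
print("unexpected keys:", result.unexpected_keys)
```

Parameters listed under missing keys keep their random initialization, so they still need training. Note that strict=False only tolerates missing or extra names; a key that matches by name but differs in shape still raises a RuntimeError.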

@felixshing
Author

> you can use the provided pre-trained secc2plane_torso to initialize you own secc2plane_head model, just set strict=False.
>
> For using your provided human figure, please modify the code in validation_steps

Thank you for your reply!

I have modified the training logic. However, when I tried to train the secc2plane_head model on my 4090 GPU, I ran into an out-of-memory (OOM) error. Is there any way to reduce the GPU memory requirement during training? I tried reducing "num_workers", but it did not help.

@yerfor
Owner

yerfor commented Jul 9, 2024

You can reduce the batch_size, or you can try amp=True.

@moliq1

moliq1 commented Aug 19, 2024

@yerfor Hi, thank you so much for your wonderful work. I was wondering if you could also release a publicly available syncnet model, so that we can fine-tune on our own datasets more easily?
