
examples/text_to_image RuntimeError: "slow_conv2d_cpu" not implemented for 'Half' #3432

Closed
zdwolfe opened this issue May 14, 2023 · 8 comments
Labels
bug (Something isn't working), stale (Issues that haven't received updates)

Comments

@zdwolfe

zdwolfe commented May 14, 2023

Describe the bug

Following the examples/text_to_image README leads to a reproducible error: RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'.

https://github.com/huggingface/diffusers/tree/main/examples/text_to_image

Reproduction

git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .
cd examples/text_to_image/
accelerate config
cat ~/.cache/huggingface/accelerate/default_config.yaml 
compute_environment: LOCAL_MACHINE
distributed_type: 'NO'
downcast_bf16: 'no'
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
huggingface-cli login
...
Login successful
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export dataset_name="lambdalabs/pokemon-blip-captions"

accelerate launch --mixed_precision="fp16"  train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="sd-pokemon-model" 

Results in

RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'
Steps:   0%| | 0/15000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code

Logs

C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\accelerate\accelerator.py:258: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
  warnings.warn(
05/14/2023 16:15:20 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu

Mixed precision type: fp16

{'dynamic_thresholding_ratio', 'prediction_type', 'thresholding', 'sample_max_value', 'variance_type', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
{'norm_num_groups'} was not found in config. Values will be initialized to default values.
{'num_class_embeds', 'conv_out_kernel', 'mid_block_only_cross_attention', 'time_embedding_dim', 'timestep_post_act', 'resnet_time_scale_shift', 'use_linear_projection', 'projection_class_embeddings_input_dim', 'upcast_attention', 'cross_attention_norm', 'time_cond_proj_dim', 'addition_embed_type_num_heads', 'dual_cross_attention', 'class_embed_type', 'resnet_skip_time_act', 'time_embedding_type', 'resnet_out_scale_factor', 'only_cross_attention', 'class_embeddings_concat', 'conv_in_kernel', 'encoder_hid_dim', 'mid_block_type', 'time_embedding_act_fn', 'addition_embed_type'} was not found in config. Values will be initialized to default values.
{'num_class_embeds', 'conv_out_kernel', 'mid_block_only_cross_attention', 'time_embedding_dim', 'timestep_post_act', 'resnet_time_scale_shift', 'use_linear_projection', 'projection_class_embeddings_input_dim', 'upcast_attention', 'cross_attention_norm', 'time_cond_proj_dim', 'addition_embed_type_num_heads', 'dual_cross_attention', 'class_embed_type', 'resnet_skip_time_act', 'time_embedding_type', 'resnet_out_scale_factor', 'only_cross_attention', 'class_embeddings_concat', 'conv_in_kernel', 'encoder_hid_dim', 'mid_block_type', 'time_embedding_act_fn', 'addition_embed_type'} was not found in config. Values will be initialized to default values.
05/14/2023 16:15:24 - WARNING - datasets.builder - Found cached dataset parquet (C:/Users/wolfe/.cache/huggingface/datasets/lambdalabs___parquet/lambdalabs--pokemon-blip-captions-10e3527a764857bd/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 250.03it/s] 
05/14/2023 16:15:25 - INFO - __main__ - ***** Running training *****
05/14/2023 16:15:25 - INFO - __main__ -   Num examples = 833
05/14/2023 16:15:25 - INFO - __main__ -   Num Epochs = 72
05/14/2023 16:15:25 - INFO - __main__ -   Instantaneous batch size per device = 1
05/14/2023 16:15:25 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 4
05/14/2023 16:15:25 - INFO - __main__ -   Gradient Accumulation steps = 4
05/14/2023 16:15:25 - INFO - __main__ -   Total optimization steps = 15000
Steps:   0%| | 0/15000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "C:\Users\wolfe\src\genai\diffusers\examples\text_to_image\train_text_to_image.py", line 959, in <module>
    main()
  File "C:\Users\wolfe\src\genai\diffusers\examples\text_to_image\train_text_to_image.py", line 823, in main
    latents = vae.encode(batch["pixel_values"].to(weight_dtype)).latent_dist.sample()
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\diffusers\utils\accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\diffusers\models\autoencoder_kl.py", line 164, in encode
    h = self.encoder(x)
        ^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\diffusers\models\vae.py", line 109, in forward
    sample = self.conv_in(sample)
             ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'
Steps:   0%| | 0/15000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\accelerate\commands\launch.py", line 918, in launch_command
    simple_launcher(args)
  File "C:\Users\wolfe\AppData\Local\Programs\Python\Python311\Lib\site-packages\accelerate\commands\launch.py", line 580, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\Users\\wolfe\\AppData\\Local\\Programs\\Python\\Python311\\python.exe', 'train_text_to_image.py', '--pretrained_model_name_or_path=CompVis/stable-diffusion-v1-4', '--dataset_name=lambdalabs/pokemon-blip-captions', '--use_ema', '--resolution=512', '--center_crop', '--random_flip', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--gradient_checkpointing', '--max_train_steps=15000', '--learning_rate=1e-05', '--max_grad_norm=1', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--output_dir=sd-pokemon-model']' returned non-zero exit status 1.   
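
For context, an illustrative sketch (not part of the original log): the failure is not specific to the training script. With a CPU-only PyTorch build, any half-precision convolution fails the same way, which is what the script hits once --mixed_precision="fp16" casts the VAE inputs to fp16 on a CPU device:

import torch

# Illustrative repro (assumption: torch 2.0.1+cpu as in the report). The CPU
# backend has no fp16 convolution kernel, so this raises the same error.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half()   # fp16 weights on CPU
x = torch.randn(1, 3, 64, 64, dtype=torch.float16)   # fp16 input on CPU

try:
    conv(x)
except RuntimeError as e:
    print(e)  # "slow_conv2d_cpu" not implemented for 'Half'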

System Info

- `diffusers` version: 0.17.0.dev0
- Platform: Windows-10-10.0.22621-SP0
- Python version: 3.11.3
- PyTorch version (GPU?): 2.0.1+cpu (False)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.29.1
- Accelerate version: 0.19.0
- xFormers version: not installed
- Using GPU in script?: all (RTX 4090)
- Using distributed or parallel set-up in script?: no
zdwolfe added the bug label on May 14, 2023
@tim-tmds

ultralytics/yolov5#10379 (comment)
Your issue might be similar to the one above.

@zdwolfe
Author

zdwolfe commented May 16, 2023

Thank you @tim-tmds. Looking at ultralytics/yolov5#10379 (comment), I don't believe that's the case. I followed the README instructions for installation and am not intending to use the CPU for training.

@sayakpaul
Member

From your `diffusers-cli env` output:

PyTorch version (GPU?): 2.0.1+cpu (False)

Could you ensure PyTorch has been installed correctly?
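
For instance, a quick check along these lines (an illustrative sketch, not part of the original comment) should report a CUDA build and see the RTX 4090:

import torch

# A "+cpu" version string means the CPU-only wheel is installed; a CUDA-enabled
# build reports something like "2.0.1+cu118" and can see the GPU.
print(torch.__version__)
print(torch.cuda.is_available())        # should be True for a working GPU setup
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))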

@zdwolfe
Author

zdwolfe commented Jun 1, 2023 via email

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the stale label on Jun 26, 2023
github-actions bot closed this as completed on Jul 4, 2023
@woniesong92

Experiencing the same problem

@sayakpaul
Member

You're probably using FP16 on a CPU.
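
A minimal workaround sketch, under that assumption (not from the thread): only request fp16 when PyTorch can actually see a CUDA device, and fall back to full precision otherwise.

import torch

# The CPU backend has no fp16 convolution kernels (hence the "slow_conv2d_cpu"
# Half error above), so use full precision when no GPU is available.
mixed_precision = "fp16" if torch.cuda.is_available() else "no"
print(f'accelerate launch --mixed_precision="{mixed_precision}" train_text_to_image.py ...')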

@lskckkvvks

lskckkvvks commented Feb 15, 2024

Following the tutorial.ipynb from https://github.com/ultralytics/yolov5, I am running into a similar problem.
