Can't train in colab #73

Open
HiImBug opened this issue Oct 5, 2021 · 2 comments

Comments


HiImBug commented Oct 5, 2021

Hi, thank you for sharing this project.
I'm trying to train in Colab but I have a problem. Could somebody help me, please?
I'm sure my folder structure matches the author's.
I have tested in Colab with the following versions:
CUDA: 11.2
PyTorch: 1.6.0
Python: 3.8.5

!CUDA_VISIBLE_DEVICES="0" python tools/trainval.py configs/trainval/tinaface/tinaface_r50_fpn_bn.py

  • vedadet - WARNING - EvalHook is not in modes ['train']
  • vedadet - INFO - Loading weights from torchvision://resnet50
  • vedadet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: backbone.fc.weight, backbone.fc.bias

missing keys in source state_dict: neck.0.lateral_convs.0.conv.weight, neck.0.lateral_convs.0.bn.weight, neck.0.lateral_convs.0.bn.bias, neck.0.lateral_convs.0.bn.running_mean, neck.0.lateral_convs.0.bn.running_var, neck.0.lateral_convs.1.conv.weight, neck.0.lateral_convs.1.bn.weight, neck.0.lateral_convs.1.bn.bias, neck.0.lateral_convs.1.bn.running_mean, neck.0.lateral_convs.1.bn.running_var, neck.0.lateral_convs.2.conv.weight, neck.0.lateral_convs.2.bn.weight, neck.0.lateral_convs.2.bn.bias, neck.0.lateral_convs.2.bn.running_mean, neck.0.lateral_convs.2.bn.running_var, neck.0.lateral_convs.3.conv.weight, neck.0.lateral_convs.3.bn.weight, neck.0.lateral_convs.3.bn.bias, neck.0.lateral_convs.3.bn.running_mean, neck.0.lateral_convs.3.bn.running_var, neck.0.fpn_convs.0.conv.weight, neck.0.fpn_convs.0.bn.weight, neck.0.fpn_convs.0.bn.bias, neck.0.fpn_convs.0.bn.running_mean, neck.0.fpn_convs.0.bn.running_var, neck.0.fpn_convs.1.conv.weight, neck.0.fpn_convs.1.bn.weight, neck.0.fpn_convs.1.bn.bias, neck.0.fpn_convs.1.bn.running_mean, neck.0.fpn_convs.1.bn.running_var, neck.0.fpn_convs.2.conv.weight, neck.0.fpn_convs.2.bn.weight, neck.0.fpn_convs.2.bn.bias, neck.0.fpn_convs.2.bn.running_mean, neck.0.fpn_convs.2.bn.running_var, neck.0.fpn_convs.3.conv.weight, neck.0.fpn_convs.3.bn.weight, neck.0.fpn_convs.3.bn.bias, neck.0.fpn_convs.3.bn.running_mean, neck.0.fpn_convs.3.bn.running_var, neck.0.fpn_convs.4.conv.weight, neck.0.fpn_convs.4.bn.weight, neck.0.fpn_convs.4.bn.bias, neck.0.fpn_convs.4.bn.running_mean, neck.0.fpn_convs.4.bn.running_var, neck.0.fpn_convs.5.conv.weight, neck.0.fpn_convs.5.bn.weight, neck.0.fpn_convs.5.bn.bias, neck.0.fpn_convs.5.bn.running_mean, neck.0.fpn_convs.5.bn.running_var, neck.1.level_convs.0.0.conv.weight, neck.1.level_convs.0.0.bn.weight, neck.1.level_convs.0.0.bn.bias, neck.1.level_convs.0.0.bn.running_mean, neck.1.level_convs.0.0.bn.running_var, neck.1.level_convs.0.1.conv.weight, neck.1.level_convs.0.1.bn.weight, 
neck.1.level_convs.0.1.bn.bias, neck.1.level_convs.0.1.bn.running_mean, neck.1.level_convs.0.1.bn.running_var, neck.1.level_convs.0.2.conv.weight, neck.1.level_convs.0.2.bn.weight, neck.1.level_convs.0.2.bn.bias, neck.1.level_convs.0.2.bn.running_mean, neck.1.level_convs.0.2.bn.running_var, neck.1.level_convs.0.3.conv.weight, neck.1.level_convs.0.3.bn.weight, neck.1.level_convs.0.3.bn.bias, neck.1.level_convs.0.3.bn.running_mean, neck.1.level_convs.0.3.bn.running_var, neck.1.level_convs.0.4.conv.weight, neck.1.level_convs.0.4.bn.weight, neck.1.level_convs.0.4.bn.bias, neck.1.level_convs.0.4.bn.running_mean, neck.1.level_convs.0.4.bn.running_var, bbox_head.cls_convs.0.conv.weight, bbox_head.cls_convs.0.bn.weight, bbox_head.cls_convs.0.bn.bias, bbox_head.cls_convs.0.bn.running_mean, bbox_head.cls_convs.0.bn.running_var, bbox_head.cls_convs.1.conv.weight, bbox_head.cls_convs.1.bn.weight, bbox_head.cls_convs.1.bn.bias, bbox_head.cls_convs.1.bn.running_mean, bbox_head.cls_convs.1.bn.running_var, bbox_head.cls_convs.2.conv.weight, bbox_head.cls_convs.2.bn.weight, bbox_head.cls_convs.2.bn.bias, bbox_head.cls_convs.2.bn.running_mean, bbox_head.cls_convs.2.bn.running_var, bbox_head.cls_convs.3.conv.weight, bbox_head.cls_convs.3.bn.weight, bbox_head.cls_convs.3.bn.bias, bbox_head.cls_convs.3.bn.running_mean, bbox_head.cls_convs.3.bn.running_var, bbox_head.reg_convs.0.conv.weight, bbox_head.reg_convs.0.bn.weight, bbox_head.reg_convs.0.bn.bias, bbox_head.reg_convs.0.bn.running_mean, bbox_head.reg_convs.0.bn.running_var, bbox_head.reg_convs.1.conv.weight, bbox_head.reg_convs.1.bn.weight, bbox_head.reg_convs.1.bn.bias, bbox_head.reg_convs.1.bn.running_mean, bbox_head.reg_convs.1.bn.running_var, bbox_head.reg_convs.2.conv.weight, bbox_head.reg_convs.2.bn.weight, bbox_head.reg_convs.2.bn.bias, bbox_head.reg_convs.2.bn.running_mean, bbox_head.reg_convs.2.bn.running_var, bbox_head.reg_convs.3.conv.weight, bbox_head.reg_convs.3.bn.weight, bbox_head.reg_convs.3.bn.bias, 
bbox_head.reg_convs.3.bn.running_mean, bbox_head.reg_convs.3.bn.running_var, bbox_head.retina_cls.weight, bbox_head.retina_cls.bias, bbox_head.retina_reg.weight, bbox_head.retina_reg.bias, bbox_head.retina_iou.weight, bbox_head.retina_iou.bias

Traceback (most recent call last):
  File "tools/trainval.py", line 65, in <module>
    main()
  File "tools/trainval.py", line 61, in main
    trainval(cfg, distributed, logger)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/assembler/trainval.py", line 86, in trainval
    looper.start(cfg.max_epochs)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedacore/loopers/epoch_based_looper.py", line 29, in start
    self.epoch_loop(mode)
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedacore/loopers/epoch_based_looper.py", line 15, in epoch_loop
    for idx, data in enumerate(dataloader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 359, in __iter__
    return self._get_iterator()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 305, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 944, in __init__
    self._reset(loader, first_iter=True)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 975, in _reset
    self._try_put_index()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1209, in _try_put_index
    index = self._next_index()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 512, in _next_index
    return next(self._sampler_iter)  # may raise StopIteration
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/sampler.py", line 226, in __iter__
    for idx in self.sampler:
  File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/datasets/samplers/group_sampler.py", line 39, in __iter__
    indices = np.concatenate(indices)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate
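
The last frame shows np.concatenate being called on an empty list. The sketch below reproduces that failure mode in isolation. It assumes the sampler builds one index array per group and skips empty groups; the function name grouped_indices is hypothetical and this is not vedadet's actual code.

```python
import numpy as np


def grouped_indices(flag):
    """Collect sample indices group by group (hypothetical sketch of a group sampler)."""
    flag = np.asarray(flag, dtype=np.int64)
    # With an empty flag array, bincount returns an empty array of group sizes.
    group_sizes = np.bincount(flag)
    chunks = []
    for group_id, size in enumerate(group_sizes):
        if size == 0:
            continue  # nothing to sample from this group
        chunks.append(np.where(flag == group_id)[0])
    # np.concatenate raises ValueError when `chunks` is an empty list,
    # which is what happens when the dataset contributed no samples.
    return np.concatenate(chunks)
```

If this is what happens here, the error is a symptom: the sampler probably received a dataset with zero samples, so the place to look would be dataset loading rather than the sampler itself.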

HiImBug changed the title from "Can't train" to "Can't train in colab" on Oct 5, 2021

HiImBug commented Oct 5, 2021

I checked, and there is one folder for images and one for annotations:
(three screenshots attached)
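
Besides screenshots, a quick count of the files the training process can actually see from the Colab working directory may help. This is a minimal sketch: the function name count_files is hypothetical, and the .jpg and .xml extensions are assumptions about the dataset layout that should be adjusted to the real one.

```python
from pathlib import Path


def count_files(root, patterns=("*.jpg", "*.xml")):
    """Count files under `root` (recursively) for each glob pattern."""
    root = Path(root)
    return {pattern: sum(1 for _ in root.rglob(pattern)) for pattern in patterns}


# Example call (the path is a placeholder, not taken from the issue):
# print(count_files("path/to/dataset"))
```

A count of zero for either pattern would suggest the data path in the config does not resolve to the expected folders from the directory where the command is run.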


HiImBug commented Oct 5, 2021

self.group_sizes = []
('self.flag', array([], dtype=int64))
What is self.flag = self.dataset.flag at line 91 of /vedadet/vedadet/datasets/samplers/group_sampler.py?
I can't find any flag field in the config files tinaface_r50_fpn_bn.py or tinaface_r50_fpn_gn_dcn.py.
The error is raised in File "/content/drive/My Drive/Colab Notebooks/vedadet/vedadet/datasets/samplers/group_sampler.py", line 41, in __iter__.
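
On the question of where flag comes from: in mmdetection-style code bases, which vedadet appears to follow (an assumption, not confirmed in this thread), flag is not a config field. The dataset computes it after loading annotations, one entry per image, usually from the aspect ratio. A minimal sketch of that idea, with a hypothetical function name and hypothetical width/height keys:

```python
import numpy as np


def aspect_ratio_flags(image_infos):
    """Return one group flag per image: 1 if wider than tall, else 0 (hypothetical sketch)."""
    flags = np.zeros(len(image_infos), dtype=np.int64)
    for i, info in enumerate(image_infos):
        if info["width"] / info["height"] > 1:
            flags[i] = 1
    return flags
```

Under that assumption, an empty array like the one printed above would mean the dataset loaded zero images, for example because the annotation list is empty, the paths do not resolve, or every image was filtered out.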
