Questions on code (ATL) #10

Open
anjugopinath opened this issue Aug 26, 2021 · 12 comments

Comments

@anjugopinath

anjugopinath commented Aug 26, 2021

Could you answer the below questions please?

  1. What do the following keywords in lib/ult/ult.py indicate?

i) Neg_select
ii) pos_h_boxes
iii) neg_h_boxes
iv) pattern_type
v) pattern_channel

  2. In which file do you specify the input path for the training dataset? In this case,
    'HOI-CL/Data/hico_20160224_det/images/train2015/' ?
    I executed python tools/Train_ATL_HOCO.py

Thank You.

@zhihou7
Owner

zhihou7 commented Aug 26, 2021

Sorry for the confusing variable names.

Following iCAN, we augment the no_interaction samples. For example, if an image contains a person, an apple, and a desk, the dataset annotates <person, eat, apple> but not <person, no_interaction, desk>. We therefore augment the data by adding the pair <person, no_interaction, desk> as a negative sample.
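
A minimal sketch of the idea, assuming each detected human and object is identified by a plain name and annotations are (human, verb, object) triples. The function name `add_no_interaction_pairs` is hypothetical and is not taken from the repository:

```python
from itertools import product


def add_no_interaction_pairs(humans, objects, annotated_triples):
    """Return a no_interaction triple for every unannotated human-object pair."""
    # Pairs that already carry some annotated interaction.
    annotated = {(h, o) for h, _verb, o in annotated_triples}
    return [
        (h, "no_interaction", o)
        for h, o in product(humans, objects)
        if (h, o) not in annotated
    ]


humans = ["person"]
objects = ["apple", "desk"]
annotated_triples = [("person", "eat", "apple")]

negatives = add_no_interaction_pairs(humans, objects, annotated_triples)
print(negatives)  # [('person', 'no_interaction', 'desk')]
```

In the real data the pairs are bounding boxes rather than names, so matching would be done on box identity or overlap instead of string equality.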

i) Neg_select: This follows the iCAN code (https://arxiv.org/abs/1808.10437). Neg_select is related to Pos_augment. Pos_augment is the number of positive HOI samples (annotated samples), while Neg_augment is the number of negative HOI samples (augmented no_interaction samples, i.e. no_interaction pairs that do not exist in the annotation). "Augment" means that we augment the boxes via random cropping.
ii) pos_h_boxes: These correspond to Pos_augment.
iii) neg_h_boxes: These are the human boxes of the negative interaction samples (augmented no_interaction samples).
iv) pattern_type: This has no effect. pattern_type is fixed to 0; it is redundant code.
v) pattern_channel: This has no effect either. In the released code, pattern_channel is fixed to 2.
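
One common way to implement this kind of box augmentation is to jitter the box randomly and keep only candidates that still overlap the original well. The sketch below is written under that assumption and is not necessarily what ult.py does. The function names, the 10% shift and scale ranges, and the 0.7 IoU threshold are all illustrative choices:

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def jitter_box(box, num_augment, image_w, image_h,
               min_iou=0.7, rng=None, max_tries=200):
    """Return the original box plus up to num_augment jittered copies."""
    if rng is None:
        rng = np.random.default_rng(0)
    box = np.asarray(box, dtype=float)
    w, h = box[2] - box[0], box[3] - box[1]
    out = [box]
    tries = 0
    while len(out) < num_augment + 1 and tries < max_tries:
        tries += 1
        # Shift the centre by up to 10% of the box size and rescale by +-10%.
        dx, dy = rng.uniform(-0.1, 0.1, size=2) * np.array([w, h])
        scale = rng.uniform(0.9, 1.1)
        cx, cy = (box[0] + box[2]) / 2 + dx, (box[1] + box[3]) / 2 + dy
        half_w, half_h = w * scale / 2, h * scale / 2
        cand = np.array([cx - half_w, cy - half_h, cx + half_w, cy + half_h])
        # Keep the candidate inside the image.
        cand[[0, 2]] = np.clip(cand[[0, 2]], 0, image_w - 1)
        cand[[1, 3]] = np.clip(cand[[1, 3]], 0, image_h - 1)
        if iou(box, cand) >= min_iou:
            out.append(cand)
    return np.stack(out)


boxes = jitter_box([50, 50, 150, 200], num_augment=5, image_w=640, image_h=480)
print(boxes.shape)
```

Under this reading, Pos_augment and Neg_augment would play the role of `num_augment` for positive and negative samples.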

The path is used in ult.py, Test_HICO.py (testing), and tools.py (co-occurrence matrix). You can search for "cfg.DATA_DIR" in these files.

Sorry for the confusing code. Feel free to ask if you have any further questions.

@anjugopinath
Author

Hi,

Thank You for the quick response!
I had some more questions. Thank You in advance.

  1. I put a breakpoint inside this function. But, it wasn't hit. What is it used for?
    def Generate_action_HICO(action_list):
        action_ = np.zeros(600)
        for GT_idx in action_list:
            action_[GT_idx] = 1
        action_ = action_.reshape(1, 600)
        return action_

  2. If I need to train ATL on a new dataset, are the two .pkl files listed below the only additional input files required apart from the images themselves? Also, could you explain what the annotations are?

    Trainval_Neg_HICO.pkl
    image

    Trainval_GT_HICO.pkl
    image

Thank You.

@zhihou7
Owner

zhihou7 commented Aug 27, 2021

Both files are annotations.

Trainval_Neg_HICO.pkl is the annotation for negative samples (augmented, unlabeled no_interaction pairs).
Trainval_GT_HICO.pkl is the annotation for positive samples (annotated instances).

In Trainval_Neg_HICO.pkl, the key is the image id and the value is <image_id, HOI_category, human_box, object_box, ...>. We do not use the remaining fields.

Trainval_GT_HICO.pkl is a list. Each item represents one annotation: <image_id, HOI_category, human_box, object_box, ...>. Again, we do not use the remaining fields.
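
A rough sketch of reading data in this layout, using toy records instead of the real files. It assumes the first four fields appear in the order given above, that boxes are [x1, y1, x2, y2], and that each image id in the negative file maps to a list of records. All values are made up for illustration:

```python
import pickle

# Toy stand-in for Trainval_GT_HICO.pkl: a list of annotation records.
gt_records = [
    [1, 10, [0, 0, 50, 100], [40, 30, 80, 60]],
    [2, 25, [5, 5, 60, 120], [70, 20, 110, 70]],
]

# Toy stand-in for Trainval_Neg_HICO.pkl: image id -> negative records (assumed).
neg_records = {
    1: [[1, 57, [0, 0, 50, 100], [90, 10, 130, 50]]],
}

# Round-trip through pickle to mimic loading the files from disk.
gt = pickle.loads(pickle.dumps(gt_records))
neg = pickle.loads(pickle.dumps(neg_records))

for image_id, hoi_category, human_box, object_box, *_unused in gt:
    negatives = neg.get(image_id, [])  # an image may have no negatives
    print(image_id, hoi_category, human_box, object_box, len(negatives))
```

With the real files you would call `pickle.load` on an open binary file instead. If they were written by Python 2, passing `encoding="latin1"` may be needed.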

@anjugopinath
Author

anjugopinath commented Sep 1, 2021

Hi,

In this image:
image
The person is interacting with the glass bottle and the pipe.
There are other bottles, a grater, a scrubber, a knife, etc. Should I add 'no_interaction' annotations for every item with which there is no interaction? In that case, should I also add no_interaction for bottle, given that another bottle is being interacted with?

@zhihou7
Owner

zhihou7 commented Sep 3, 2021

Yes. For the other bottle (no interaction and no annotation), we currently include a no_interaction entry in Trainval_Neg_HICO.pkl whenever we have the object box (here, the bottle). Those are the negative samples.

@anjugopinath
Author

Hi,

Thank You for the reply. Can I train the model if I do not have Trainval_Neg annotations for the new dataset I am using?

@zhihou7
Owner

zhihou7 commented Sep 9, 2021

You can train the model, but you might suffer from class imbalance and missing labels. For affordance recognition, I find that removing the negative samples has only a limited effect.

@anjugopinath
Author

anjugopinath commented Sep 14, 2021

Hi,
Thank You for the reply.
Can you answer the below questions please?
1.
image

image

Can you explain the contents of the above two .pkl files? Specifically, the mapping, e.g. 0:0, 1:0, etc.
Also, how are they related to the five files below?

  1. hico_list_obj.txt
    image

  2. hico_list_vb.txt
    image

  3. hico_list_hoi.txt
    image

  4. 24_verbs.txt
    image

  5. 21_verbs.txt
    image

  6. What is prior_mask.pkl used for?

  7. Is hoi_coco_list_num.txt required when training for ATL?

Thank You.

@zhihou7
Owner

zhihou7 commented Sep 14, 2021

hoi_to_obj.pkl and hoi_to_verb.pkl store the co-occurrence matrix of HICO-DET, i.e. which object and which verb correspond to each HOI. The names behind the ids used in hoi_to_obj.pkl and hoi_to_verb.pkl are provided in hico_list_obj.txt, hico_list_vb.txt, and hico_list_hoi.txt. Note that the ids in the pkl files start from 0, while the ids in the txt files start from 1.
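
To illustrate the off-by-one between the two id schemes, here is a small sketch with made-up tables. It assumes the pkl files can be read as dictionaries from HOI id to object or verb id, as the "0:0, 1:0" entries in the question suggest. The names and ids below are toy values, not the real HICO-DET lists:

```python
# Toy stand-ins for hoi_to_obj.pkl / hoi_to_verb.pkl (0-based ids).
hoi_to_obj = {0: 0, 1: 0, 2: 1}
hoi_to_verb = {0: 0, 1: 1, 2: 1}

# Toy stand-ins for hico_list_obj.txt / hico_list_vb.txt (1-based ids).
obj_names = {1: "airplane", 2: "bicycle"}
verb_names = {1: "board", 2: "ride"}


def describe_hoi(hoi_id):
    """Map a 0-based HOI id to its (verb, object) names."""
    # Add 1 to move from the 0-based pkl ids to the 1-based txt ids.
    obj = obj_names[hoi_to_obj[hoi_id] + 1]
    verb = verb_names[hoi_to_verb[hoi_id] + 1]
    return verb, obj


print(describe_hoi(2))  # ('ride', 'bicycle')
```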

24_verbs.txt and 21_verbs.txt list the verb names in V-COCO (HOI-COCO).

prior_mask.pkl is similar to hoi_to_obj.pkl, but for V-COCO. prior_mask.pkl is provided by previous work.

hoi_coco_list_num.txt is not required for training ATL; it just demonstrates the long-tailed distribution.

@anjugopinath
Author

Thank You for your reply. Do I need to create 24_verbs.txt, 21_verbs.txt, and prior_mask.pkl for training ATL?

@anjugopinath
Author

anjugopinath commented Sep 20, 2021

  1. What is the difference between
    self.num_classes = 600
    and
    self.compose_num_classes = 600
    in ResNet101_HICO.py, inside class ResNet101()?

  2. Are the weights in self.HO_weight (size 1 by 600) randomly initialized?

I saw this comment:
"We copy from TIN. calculated by log(1/(n_c/sum(n_c)) c is the category and n_c is
the number of positive samples."
What is TIN?

@zhihou7
Owner

zhihou7 commented Sep 21, 2021

"num_classes" is the number of annotated classes, while "self.compose_num_classes" is the number of HOI types you want to compose (it can be larger or smaller than self.num_classes).

TIN is "Transferable Interactiveness Knowledge for Human-Object Interaction Detection". The weights aim to balance the data; this is a traditional re-balancing strategy for imbalanced data.
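
As a sketch of the formula quoted in the question, log(1/(n_c/sum(n_c))), the weights can be computed from per-class positive counts as follows. The function name and the toy counts are illustrative only, and every class is assumed to have at least one positive sample (a zero count would give an infinite weight):

```python
import numpy as np


def class_balance_weights(positive_counts):
    """Compute log(1 / (n_c / sum(n_c))) for each class c."""
    counts = np.asarray(positive_counts, dtype=np.float64)
    freq = counts / counts.sum()
    return np.log(1.0 / freq)


counts = [1000, 100, 10]  # toy counts for three classes
weights = class_balance_weights(counts).reshape(1, -1)  # shape (1, num_classes)
print(weights)
```

Rare classes receive larger weights than frequent ones, which is what makes this a re-balancing strategy.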
