LongModelForMaskedLM should inherit from MODEL #2

Open
eric00hahn opened this issue Aug 4, 2021 · 0 comments

Hi,
In scripts/run_long_mlm.py, the class LongModelForMaskedLM should inherit from MODEL:

class LongModelForMaskedLM(MODEL):
     def __init__(self, config):
         super().__init__(config)
         print(f"\n{color.YELLOW}Converting models to Longformer is currently only tested for RoBERTa like architectures.{color.END}")
         for i, layer in enumerate(self.roberta.encoder.layer):
             layer.attention.self = LongModelSelfAttention(config, layer_id=i)

instead of

class LongModelForMaskedLM():
     def __init__(self, config):
         super().__init__(config)
         print(f"\n{color.YELLOW}Converting models to Longformer is currently only tested for RoBERTa like architectures.{color.END}")
         for i, layer in enumerate(self.roberta.encoder.layer):
             layer.attention.self = LongModelSelfAttention(config, layer_id=i)
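
To illustrate why the missing base class matters, here is a minimal sketch that does not use the actual script or the transformers library. `StandInModel`, `WithoutBase`, `WithBase`, and `try_construct` are hypothetical names invented for this example; `StandInModel` only plays the role of MODEL. Without a base class, `super().__init__(config)` resolves to `object.__init__`, which accepts no extra arguments.

```python
class StandInModel:
    """Hypothetical stand-in for MODEL; it only stores the config."""

    def __init__(self, config):
        self.config = config


class WithoutBase():
    def __init__(self, config):
        # No explicit base class, so this call reaches object.__init__,
        # which rejects the extra `config` argument with a TypeError.
        super().__init__(config)


class WithBase(StandInModel):
    def __init__(self, config):
        # Reaches StandInModel.__init__, which accepts `config`.
        super().__init__(config)


def try_construct(cls, config):
    """Return 'ok' if cls(config) succeeds, otherwise the TypeError text."""
    try:
        cls(config)
        return "ok"
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_construct(WithoutBase, {"hidden_size": 8}))
    print(try_construct(WithBase, {"hidden_size": 8}))
```

In the real script the class would additionally lack attributes such as `self.roberta`, so the loop over `self.roberta.encoder.layer` could not run either.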

Also, I think the regexes in

 def is_roberta_based_model(model_name: str) -> str:
     """Validate if the model to pre-train is of roberta architecture."""
     if re.search("(?i)(xlm)\D(roberta)", model_name) == 'xlm-roberta':
         model_name = 'xlm-roberta'
     elif re.search("(?i)(roberta)", model_name) == 'roberta':
         model_name = 'roberta'
     else:
         model_name = 'none'
     return model_name

are broken: re.search returns a match object or None, never a string, so both comparisons are always False and the function returns 'none' even for 'xlm-roberta-base'.
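
One possible fix, shown only as a sketch and not as the maintainers' chosen solution, is to test whether `re.search` found a match instead of comparing its result to a string. The patterns are kept from the original function, minus the unused capture groups; the early returns are a stylistic assumption.

```python
import re


def is_roberta_based_model(model_name: str) -> str:
    """Return 'xlm-roberta', 'roberta', or 'none' based on the model name."""
    # re.search returns a Match object or None, so check truthiness
    # rather than comparing the result with a string.
    if re.search(r"(?i)xlm\Droberta", model_name):
        return "xlm-roberta"
    if re.search(r"(?i)roberta", model_name):
        return "roberta"
    return "none"


if __name__ == "__main__":
    for name in ("xlm-roberta-base", "roberta-large", "bert-base-uncased"):
        print(name, "->", is_roberta_based_model(name))
```

The XLM check has to come first, since any name matching `xlm\Droberta` also matches the plain `roberta` pattern.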
