
Hyperparameters in DrQA - Performance not as described #10

Open
gustavhartz opened this issue Nov 12, 2021 · 0 comments
Thanks for sharing your work. I tried to run the DrQA notebook, which by the way has excellent descriptions. I spun up an Azure ML Standard_NC6 instance (6 cores, 56 GB RAM, 380 GB disk) with a single NVIDIA Tesla K80 GPU to see whether I could replicate the results you list after 5 epochs, but I get much worse performance. I suspect you may have used a different set of hyperparameters for your training.

The notebook contains the following:

```python
HIDDEN_DIM = 128
EMB_DIM = 300
NUM_LAYERS = 3
NUM_DIRECTIONS = 2
DROPOUT = 0.3

optimizer = torch.optim.Adamax(model.parameters())
```

I suspect you might have trained with a learning rate different from Adamax's default (2e-3 in PyTorch)? Hope you still remember something about the configuration :)
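For reference, a minimal sketch of how the learning rate would be passed explicitly (the `2e-3` here is just PyTorch's documented default for Adamax, not a claim about what was actually used; the `Linear` layer is a stand-in for the real DrQA model):

```python
import torch

# Tiny stand-in model; the notebook builds a 3-layer bidirectional RNN instead.
model = torch.nn.Linear(4, 2)

# torch.optim.Adamax defaults to lr=2e-3, so the notebook's call
# `Adamax(model.parameters())` trains with that value. Any other rate
# would have had to be passed explicitly, e.g.:
optimizer = torch.optim.Adamax(model.parameters(), lr=2e-3)

print(optimizer.param_groups[0]["lr"])  # 0.002
```

So if a non-default rate was used in the original training run, it should be visible in the optimizer construction line of whatever notebook version produced the reported results.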
