Add ability to customize BaseFeaturizer pbar by passing dict #893

Merged: 5 commits, Dec 7, 2022

Conversation

janosh
Member

@janosh janosh commented Nov 18, 2022

I would like to set

tqdm(lst, position=0, leave=True)

so that Slurm log files aren't filled with thousands of progress-bar lines and instead contain only the last iteration's line, which shows the total wall time.

With this PR, that'll be possible by passing

df_features = featurizer.featurize_dataframe(
    df, input_col, pbar=dict(position=0, leave=True)
)
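For readers wondering how a `pbar` argument that accepts either a bool or a dict might be forwarded to tqdm, here is a minimal sketch. It is not the code merged in this PR: `resolve_pbar_kwargs` and `iterate_with_pbar` are hypothetical names used only for illustration, and the only assumption about tqdm is that it accepts `desc`, `position` and `leave` as keyword arguments.

```python
from typing import Any, Iterable, Optional, Union


def resolve_pbar_kwargs(pbar: Union[bool, dict], desc: str) -> Optional[dict]:
    """Hypothetical helper: turn a bool-or-dict `pbar` into tqdm kwargs.

    Returns None when the progress bar is disabled.
    """
    if isinstance(pbar, dict):
        # User-supplied keys override the default description.
        return {"desc": desc, **pbar}
    return {"desc": desc} if pbar else None


def iterate_with_pbar(
    items: Iterable[Any], pbar: Union[bool, dict] = True, desc: str = "Featurizer"
) -> Iterable[Any]:
    """Hypothetical helper: wrap `items` in tqdm unless the bar is disabled."""
    kwargs = resolve_pbar_kwargs(pbar, desc)
    if kwargs is None:
        return items
    # Imported lazily so a disabled bar does not require tqdm.
    from tqdm import tqdm

    return tqdm(items, **kwargs)
```

With a helper like this, `pbar=True` keeps the default bar, `pbar=False` disables it, and `pbar=dict(position=0, leave=True)` passes those options through to tqdm unchanged.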

Old logs

...
MultipleFeaturizer: 100%|█████████▉| 12859/12874 [1:27:44<00:07,  2.14it/s]
MultipleFeaturizer: 100%|█████████▉| 12860/12874 [1:27:44<00:06,  2.04it/s]
MultipleFeaturizer: 100%|█████████▉| 12861/12874 [1:27:45<00:06,  2.00it/s]
MultipleFeaturizer: 100%|█████████▉| 12862/12874 [1:27:45<00:06,  1.97it/s]
MultipleFeaturizer: 100%|█████████▉| 12863/12874 [1:27:46<00:05,  1.95it/s]
MultipleFeaturizer: 100%|█████████▉| 12864/12874 [1:27:46<00:05,  1.94it/s]
MultipleFeaturizer: 100%|█████████▉| 12865/12874 [1:27:47<00:04,  1.93it/s]
MultipleFeaturizer: 100%|█████████▉| 12866/12874 [1:27:47<00:04,  1.92it/s]
MultipleFeaturizer: 100%|█████████▉| 12867/12874 [1:27:48<00:03,  1.98it/s]
MultipleFeaturizer: 100%|█████████▉| 12868/12874 [1:27:48<00:03,  1.95it/s]
MultipleFeaturizer: 100%|█████████▉| 12869/12874 [1:27:49<00:02,  2.25it/s]
MultipleFeaturizer: 100%|█████████▉| 12870/12874 [1:27:52<00:05,  1.35s/it]
MultipleFeaturizer: 100%|█████████▉| 12871/12874 [1:27:53<00:03,  1.03s/it]
MultipleFeaturizer: 100%|█████████▉| 12872/12874 [1:27:53<00:01,  1.26it/s]
MultipleFeaturizer: 100%|█████████▉| 12873/12874 [1:27:53<00:00,  1.59it/s]
MultipleFeaturizer: 100%|██████████| 12874/12874 [1:27:53<00:00,  1.77it/s]
MultipleFeaturizer: 100%|██████████| 12874/12874 [1:27:53<00:00,  2.44it/s]

New logs

MultipleFeaturizer: 100%|██████████| 12874/12874 [1:27:53<00:00,  2.44it/s]

@ardunn
Contributor

ardunn commented Nov 19, 2022

Looks ok to me, think we can get most of the tests passing though?

@janosh
Member Author

janosh commented Nov 19, 2022

I think they're all timing out after 6h. Not sure there's anything I can do about that. Don't think any of my changes would break existing tests.

@janosh janosh merged commit 93cde56 into hackingmaterials:main Dec 7, 2022
@janosh janosh deleted the featurizer-customize-pbar branch December 7, 2022 02:10