Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop script and throw error if used same cli parameter twice #1823

Merged
merged 5 commits into from
Apr 10, 2023

Conversation

pavel-karatsiuba
Copy link
Contributor

When the user accidentally passes to the command line the same parameter more than one time then he can get unexpected scraper behavior.

This PR is checking parameters and if one parameter is used more than one time then the script stops and an error throws.

fix: #1818

@pavel-karatsiuba pavel-karatsiuba force-pushed the sanitize-double-used-cli-parameter branch from 37c3f85 to df9cbe7 Compare April 1, 2023 11:17
@pavel-karatsiuba
Copy link
Contributor Author

Attention: the yargs parser can take more than one same parameter. It is processed such parameters as an Array of values. So Some of mwoffline parameters can be passed more than one time. These are: 'articleList', 'articleListToIgnore', 'addNamespaces'

@codecov
Copy link

codecov bot commented Apr 1, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (8674287) 70.90% compared to head (4ff19c6) 70.90%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1823   +/-   ##
=======================================
  Coverage   70.90%   70.90%           
=======================================
  Files          24       24           
  Lines        2622     2622           
  Branches      594      594           
=======================================
  Hits         1859     1859           
  Misses        657      657           
  Partials      106      106           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@kelson42
Copy link
Collaborator

kelson42 commented Apr 2, 2023

@pavel-karatsiuba Why no review is requested here? Are you over?

But I'm not happy with this PR for many reasons:

  • I hardly understand the relation between this PR and the bug it should fix Duplicate in cmd line arguments are not checked (was: --verbose does not accept empty value) #1818
  • If there is an architecture decision to take, I expect you to wait this is validated. Here you have taken on you to chose solution (1) which is IMO the wrong one. Why you don't have taken solution (2), which is the builtin solution (reading Disallow multiple arguments (array) for individual key yargs/yargs#1318 offers many ways)?
  • Where have you read that any of these arguments --articleList, --articleListToIgnore, --addNamespaces allow multiple occurences? Are you mixing "multiple instances" with "comma separated values"?!
  • AFAIK, there is only one argument which is legitim to be duplicated and this is --format... and this is written in the usage(): Each --format argument....
  • In the test, why using execa which is super high level, expensive, etc... instead of the testing yargs with the right configuration?
  • Error message Parameter ... can't take value "["https://bm.wikipedia.org","https://bm.wikipedia.org"] is cryptic for a non-tech user. You should write Parameter ... can only be used once.

@pavel-karatsiuba
Copy link
Contributor Author

  • Duplicate in cmd line arguments are not checked (was: --verbose does not accept empty value) #1818 here you are using mw command. In the other PR I saw that mw is equal to ./node_modules/.bin/ts-node-esm ./src/cli.ts [email protected] --osTmpDir=/dev/shm --verbose --optimisationCacheUrl=http://s3.us-west-1.wasabisys.com/?bucketName=org-kiwix-dev-mwoffliner&keyId=6JE56A5IWY5Z3H3L3KH6&secretAccessKey=.... So the problem is not about --verbose but about using --verbose twice.
  • Those decisions are equal. But if I use duplicate-arguments-array option then doubled parameter will be silently ignored and used last from the command. My decision is to show the error if you use the same parameter twice.
  • You are right. Only format parameter can be used more than one time. Because of this, we can't use duplicate-arguments-array option.
  • Yes, such tests are not acceptable.
  • I will change the message.

@pavel-karatsiuba
Copy link
Contributor Author

cli.ts is written in such a manner that it can't be tested partially. So before changing tests, the cli.ts should be redesigned. It should be split by a couple of files and inside cli.ts should be only executed one function.

@kelson42
Copy link
Collaborator

kelson42 commented Apr 5, 2023

@pavel-karatsiuba

  • PR is still not assigned nor requested for review... I have basically no clue what is current status.
  • Why not using the check()?
  • Unit test sanitizeDoubleUsedParametersfunction not cli.ts
  • Test the case which should not generate an error which is with many --format

@pavel-karatsiuba
Copy link
Contributor Author

The test was changed.

If I use .check() then we did not get any benefits. But we will get another format of the error message.
We can use .check() to sanitize all parameters, then the format of the error will be the same for all sanitizing errors. But this should be another ticket.

@kelson42 kelson42 force-pushed the sanitize-double-used-cli-parameter branch from 2b702ac to b7ed180 Compare April 10, 2023 10:14
@kelson42 kelson42 self-requested a review April 10, 2023 10:14
@kelson42 kelson42 force-pushed the sanitize-double-used-cli-parameter branch from e7335fe to 4ff19c6 Compare April 10, 2023 11:05
@kelson42 kelson42 merged commit 6b9260a into main Apr 10, 2023
@kelson42 kelson42 deleted the sanitize-double-used-cli-parameter branch April 10, 2023 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate in cmd line arguments are not checked (was: --verbose does not accept empty value)
2 participants