New spider settings to prevent blacklisting while maintaining near-identical runtimes #198
Findings from the new crawler settings

Monday scheduled crawl
- Without the new settings (ran Oct 2nd): 6 minutes, 13 seconds
- With the new settings: 6 minutes, 10 seconds

Tuesday scheduled crawl
- Current state (ran Oct 3rd): 69 minutes, 31 seconds
- With the new settings: 70 minutes, 22 seconds. Note that in this run CNSS did not have previous hashes available, whereas Tuesday's baseline run did, so the new-settings time would otherwise have been shorter.
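The settings themselves aren't shown in this comment. As a hedged sketch only (assuming a Scrapy-based crawler, which the term "spider" suggests), the knobs typically tuned to avoid blacklisting while keeping runtimes close to baseline look like this:

```python
# Hypothetical settings sketch -- the actual values changed in this PR are
# not shown in the comment. These are standard Scrapy settings commonly
# tuned to avoid getting blacklisted.

# Identify the crawler honestly and honor robots.txt.
USER_AGENT = "example-crawler (+https://example.org/about)"  # hypothetical UA
ROBOTSTXT_OBEY = True

# Limit per-domain concurrency and pace requests so each host sees a
# modest request rate.
CONCURRENT_REQUESTS_PER_DOMAIN = 2
DOWNLOAD_DELAY = 0.5              # seconds between requests to one domain
RANDOMIZE_DOWNLOAD_DELAY = True   # jitter the delay to look less robotic

# AutoThrottle adapts the delay to observed server latency, which is how
# overall runtimes can stay near identical while still backing off under load.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1.0
AUTOTHROTTLE_MAX_DELAY = 10.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 2.0

# Retry transient failures rather than re-hitting the host immediately.
RETRY_ENABLED = True
RETRY_TIMES = 2
```

With AutoThrottle enabled, the fixed delay acts as a floor while the extension adjusts pacing per host, which is consistent with the near-identical durations reported above.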
Additionally, added an empty oneoff.txt file so it no longer needs to be recreated each time a new branch is made for one-off testing.