Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove indentation from statistics logging and print the data in tables #322

Merged
merged 5 commits into from
Aug 6, 2024

Conversation

TymeeK
Copy link
Contributor

@TymeeK TymeeK commented Jul 17, 2024

Description

The purpose of the PR is to fix the indentation of statistics logging. It was originally 8 space indentation but now it is changed to be all on one line.

Issues

This fixes issue #306

Testing

I went through the unit test files and logged out what the output would be for the statistics loggin.

Example:

[crawlee.basic_crawler.basic_crawler] INFO  Final request statistics: {"requests_finished": 3, "requests_failed": 0, "retry_histogram": [3], "request_avg_failed_duration": null, "request_avg_finished_duration": 0.000984, "requests_finished_per_minute": 7348, "requests_failed_per_minute": 0, "request_total_duration": 0.002951, "requests_total": 3, "crawler_runtime": 0.024498}

Checklist

  • Changes are described in the CHANGELOG.md
  • CI passed

@B4nan
Copy link
Member

B4nan commented Jul 18, 2024

FYI we had a quick poll about this and in the end we agreed we want to print this as a formatted table instead of JSON, so we won't be merging this one.

Something like the following (but with snake_case instead of camelCase - this snippet was built based on the JS version)

┌──────────────────────────────────┬────────┐
│ requestsFinished                 │ 59     │
│ requestsFailed                   │ 0      │
│ retryHistogram                   │ [ 59 ] │
│ requestAvgFailedDurationMillis   │ null   │
│ requestAvgFinishedDurationMillis │ 1071   │
│ requestsFinishedPerMinute        │ 67     │
│ requestsFailedPerMinute          │ 0      │
│ requestTotalDurationMillis       │ 63215  │
│ requestsTotal                    │ 59     │
│ crawlerRuntimeMillis             │ 53197  │
└──────────────────────────────────┴────────┘

@janbuchar
Copy link
Collaborator

We should be able to print tables using rich.Table - see https://rich.readthedocs.io/en/stable/tables.html

@fnesveda fnesveda added the t-tooling Issues with this label are in the ownership of the tooling team. label Jul 19, 2024
@TymeeK
Copy link
Contributor Author

TymeeK commented Jul 20, 2024

Thanks for your feedback! I am brand new to open source. I did make some furthers changes by using a table to print it out as a formatted table. Should I open a new PR or update this one?

@janbuchar
Copy link
Collaborator

Thanks for your feedback! I am brand new to open source. I did make some furthers changes by using a table to print it out as a formatted table. Should I open a new PR or update this one?

Awesome! Just commit the changes you made and push it to your fork and the PR will be updated automatically

src/crawlee/basic_crawler/basic_crawler.py Outdated Show resolved Hide resolved
src/crawlee/statistics/models.py Outdated Show resolved Hide resolved
@janbuchar janbuchar linked an issue Jul 23, 2024 that may be closed by this pull request
@janbuchar janbuchar changed the title Fix: Removed indentation from statistics logging fix: Removed indentation from statistics logging Aug 6, 2024
@janbuchar janbuchar changed the title fix: Removed indentation from statistics logging fix: Remove indentation from statistics logging Aug 6, 2024
@janbuchar janbuchar requested a review from vdusek August 6, 2024 13:06
@janbuchar janbuchar changed the title fix: Remove indentation from statistics logging fix: Remove indentation from statistics logging and print the data in tables Aug 6, 2024
Copy link
Collaborator

@vdusek vdusek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@janbuchar janbuchar merged commit 359b515 into apify:master Aug 6, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t-tooling Issues with this label are in the ownership of the tooling team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better format statistics logging
5 participants