documentation improvements #5

Open
idallen opened this issue Oct 16, 2022 · 2 comments

Comments

@idallen

idallen commented Oct 16, 2022

This is a wonderful tool. I couldn't figure out the whole big Sarracenia thing, but I got PySarra working in minutes. Here are some things that would have let me get it working in seconds instead of minutes, along with some open questions that documentation would help answer:

  1. Explain that this program has the http://dd.weather.gc.ca/ host name hard-coded into it and it won't work for other hosts.
  2. Explain that a topic has to be a folder name, not a file name. You will get notifications for changes to every file in the folder, and you have to pick the ones of interest to you. I presume there is no way to ask for changes to just one file in a folder, and that is why we need the regexp to select the messages we want to see?
  3. Give an example regexp. The ones I've tried don't appear to work. Are they anchored, or not? [Answer: They are anchored to the beginning of the message, because they are applied with Python's re.match() function.]
  4. Write an example processor that fetches the URL and saves it into a local file. (This is especially helpful for those of us who don't speak Python very well yet.)

What a great program!
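
To illustrate the anchoring question in item 3, here is a minimal sketch of how `re.match()` and `re.search()` differ. The message text is a made-up stand-in for a notification body (the real layout may differ), and the file name in the pattern is only an example.

```python
import re

# Hypothetical notification body; the real message layout may differ.
message = "20221016120000 https://dd.weather.gc.ca /citypage_weather/xml/ON/s0000430_e.xml"

pattern = re.compile(r"citypage_weather/xml/ON/s0000430_e\.xml")

# re.match() only succeeds if the pattern matches starting at position 0,
# so a pattern that describes text in the middle of the message fails.
anchored = pattern.match(message)

# Prefixing ".*" lets match() skip over the leading text.
prefixed = re.compile(r".*" + pattern.pattern).match(message)

# re.search() scans the whole string, so no prefix is needed.
floating = pattern.search(message)

print(anchored is None, prefixed is not None, floating is not None)
```

With a tool that applies `re.match()`, a regexp describing something in the middle of the message needs a leading `.*` to match.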
@JohnTheNerd
Owner

thanks for your suggestions!

  • I wasn't aware of any other valid Sarracenia servers. I could create an optional config entry for such use cases and default to amqps://dd.weather.gc.ca if nothing is provided?
  • technically, AMQP topic exchanges support the # character as a wildcard in the binding key (the topic you subscribe with), but I never managed to get it working. hence I do not know of a way to filter for individual files - I'll document that accordingly. if you get it working, please let me know and/or submit a pull request!
  • the regex is simply compiled at startup and passed to re.match() alongside the entire message body - I'll document that too, and I'll create another entry in the example config that uses a regex
  • I'm a little worried about automatically downloading files, for two reasons:
    • where do I save it? if this is not configurable, it's not going to be very useful for most. if it is configurable, it needs to be in the config file, requiring processor-specific configuration
    • since this tool is explicitly meant to abstract away the complexity and make it effort-free to use, misconfiguration is a very real possibility and (if auto-downloading is implemented) could easily result in unnecessary load on the remote server
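
On the wildcard point above: in AMQP 0-9-1 topic exchanges, keys are dot-separated words, and the binding key used when subscribing may contain `*` (exactly one word) or `#` (zero or more words). The sketch below only models those matching rules in plain Python. It does not talk to a broker, it is not PySarra code, and the `v02.post...` keys are assumed examples of what a topic might look like. Whether the Datamart broker accepts such bindings is not confirmed here.

```python
def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if routing_key matches binding_key under topic-exchange rules."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p: int, w: int) -> bool:
        # All pattern words consumed: succeed only if all key words are too.
        if p == len(pattern):
            return w == len(words)
        if pattern[p] == "#":
            # "#" may absorb zero or more words; try every possible split.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        # "*" stands for exactly one word; anything else must match literally.
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


# Hypothetical keys, for illustration only.
print(topic_matches("v02.post.citypage_weather.xml.#", "v02.post.citypage_weather.xml.ON"))
print(topic_matches("v02.post.*", "v02.post.citypage_weather.xml"))
```

Because these wildcards work on whole dot-separated words, they can narrow a subscription to a folder-like prefix, but a regexp on the message body is still the simpler way to pick out one file.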

@idallen
Author

idallen commented Oct 17, 2022

I didn't know that there is only one Sarracenia server, or that your code is tied to that system. Yes, putting in an optional config entry some time in the future would be useful for people to experiment with their own servers. No rush.
I thought the regex was floating, which is why mine didn't work until I put .* in the front. I think re.search() would be more intuitive.
Aye, never mind about the downloading of files. I realized that what I really wanted was to be able to parse the currentConditions XML and output a CSV line with selected information each hour, and while I was going to pipe the URL into Perl and do it there, I figured I'd learn some Python and do it all in the process() function. I have a working version that does what I want, but I don't really know Python yet so it's quite ad-hoc.
MSC Datamart has the hourly currentConditions updates, and there are also "bulk" files that contain historical data (which I can't find on MSC Datamart), and there is an awkward one-day gap between the two. I can't find a way to download the past 24 hours of hourly currentConditions, which is why I'm having to use your program to collect it.
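
For anyone attempting the same currentConditions-to-CSV idea, here is a rough sketch using only the standard library. The sample XML is a hypothetical, heavily trimmed stand-in, and its element names are assumptions that should be checked against a real file; `conditions_to_csv_line` and `FIELDS` are names invented for this example. Fetching the URL inside `process()` is deliberately left out.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, trimmed sample; verify element names against a live file.
SAMPLE_XML = """<siteData>
  <currentConditions>
    <station code="yow">Ottawa Macdonald-Cartier Int'l Airport</station>
    <dateTime name="observation" zone="UTC">
      <timeStamp>20221017120000</timeStamp>
    </dateTime>
    <temperature units="C">7.3</temperature>
    <relativeHumidity units="%">81</relativeHumidity>
  </currentConditions>
</siteData>"""

# Paths (relative to <currentConditions>) of the values to export.
FIELDS = ("dateTime/timeStamp", "temperature", "relativeHumidity")


def conditions_to_csv_line(xml_text: str) -> str:
    """Extract the selected fields and return them as one CSV line."""
    root = ET.fromstring(xml_text)
    current = root.find("currentConditions")
    if current is None:
        raise ValueError("no currentConditions element found")
    # findtext() returns the default when an element is missing,
    # so absent readings become empty CSV cells instead of errors.
    row = [current.findtext(path, default="").strip() for path in FIELDS]
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()


print(conditions_to_csv_line(SAMPLE_XML), end="")
```

Appending each returned line to a file once per notification would build up the hourly history; using the `csv` module rather than joining strings by hand keeps values containing commas properly quoted.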
