
Generic "predicate query" to decide whether to collect a metric? #41

Closed
ringerc opened this issue Jun 11, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@ringerc
Collaborator

ringerc commented Jun 11, 2024

I'm looking at adopting pg_exporter as a potential replacement for the CloudNative-PG built-in scraper I currently use. It looks pretty impressive, other than the lack of test coverage and test facilities.

One of the few things it doesn't do that the CNPG scraper does is permit the use of a generic boolean expression query to decide "should I run this metric query or not" (added by cloudnative-pg/cloudnative-pg#4503). pg_exporter has tags, but the tags must be specified from the outside by whatever provisions the service, and apply to all DBs being scraped.

I'm wondering if you'd be open to submission of a patch to implement that, assuming that my initial PoC with this scraper finds it otherwise a good fit for my needs.

I need to express things like "collect this metric set if the foo extension is installed and is at least version 2.1", "collect this metric only when running on (some specific vendor postgres flavour)" or "only try to collect this if this specific table exists and has this specific column".

This isn't possible in the plain Postgres SQL dialect due to its lack of bare procedural blocks and the inability of a DO block to return a result set. Hacks with refcursors can be used to get a result out of a DO block, but only with multiple statements, which this scraper doesn't support (and probably shouldn't, as it opens a whole can of worms).

I propose an extension to the syntax like:

predicate_queries:
  - predicate_query: |
      SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = 'pg_catalog' AND table_name = 'foo')
query: |
  SELECT a, b, c FROM pg_catalog.some_vendor_extension;

The predicate queries would be run in the order specified. If any returns zero rows or a single-column boolean false, collection of the metric is skipped. If it returns true, evaluation proceeds to the next predicate (if any) and then to the metric query itself. A multi-column or multi-row result logs an error the first time it occurs and then disables the metric for the lifetime of the process.
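The evaluation rule above can be sketched roughly as follows (Python used only for illustration; pg_exporter itself is written in Go, and `should_collect` and its `state` dict are hypothetical names, not the project's API):

```python
def should_collect(predicate_results, state):
    """Decide whether to run a metric query, given the row sets returned by
    each predicate query in configured order. `state` tracks the
    log-once-then-disable behaviour. Hypothetical sketch only."""
    for rows in predicate_results:
        # A malformed result (more than one row, or a row with more than one
        # column) is treated as a config error: log once, disable the metric.
        if len(rows) > 1 or (rows and len(rows[0]) != 1):
            if not state.get("error_logged"):
                print("error: predicate query must return at most one boolean")
                state["error_logged"] = True
            state["disabled"] = True
            return False
        # Zero rows, or a single-column false, skips collection this round.
        if not rows or not rows[0][0]:
            return False
    # All predicates returned true: proceed to collect the metric.
    return True
```

Note that "zero rows" and "boolean false" are deliberately equivalent here, so existence-style predicates (`SELECT 1 FROM pg_extension WHERE ...`) and boolean predicates (`SELECT EXISTS (...)`) both work.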

The purpose of multiple queries is to permit things like (a bit contrived):

predicate_queries:
  - predicate_query: |
      SELECT 1 FROM pg_extension WHERE extname = 'foo';
  - predicate_query: |
      SELECT foo_is_enabled FROM foo.foo_is_enabled()

where it's first necessary to know that some database object exists before attempting to access it, since Postgres requires that names resolve at parse/plan time, so you can't lazy-bind possibly-undefined names in e.g. a CASE.

Predicate query results would support the ttl key and, if set, use the cache. If the main metric query reports an error, that would automatically invalidate the predicate query cache. E.g.

predicate_queries:
  - predicate_query: |
      SELECT 1 FROM pg_extension WHERE extname = 'foo';
    ttl: 300

I'm also tempted to add separate positive and negative cache TTLs since it's common to have a metric initially not scrape-able due to an extension not yet being created, but once it's scrape-able once it stays that way. It's probably unnecessary though, and exponential backoff might be better if it proves to be required.
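A split positive/negative TTL could be sketched like this (again a hypothetical helper for illustration, not pg_exporter code; `PredicateCache` and its parameters are made-up names):

```python
import time

class PredicateCache:
    """Cache a predicate's boolean outcome with separate TTLs for positive
    and negative results, so a not-yet-installed extension is re-checked
    more often than an installed one. Hypothetical sketch only."""

    def __init__(self, positive_ttl=300.0, negative_ttl=30.0, clock=time.monotonic):
        self.positive_ttl = positive_ttl
        self.negative_ttl = negative_ttl
        self.clock = clock
        self.value = None    # last cached result (True/False), None = empty
        self.expires = 0.0   # monotonic deadline for the cached value

    def get(self, evaluate):
        """Return the cached result, re-running evaluate() once expired."""
        now = self.clock()
        if self.value is None or now >= self.expires:
            self.value = bool(evaluate())
            ttl = self.positive_ttl if self.value else self.negative_ttl
            self.expires = now + ttl
        return self.value

    def invalidate(self):
        """Drop the cached result, e.g. after the main metric query errors."""
        self.value = None
```

The `invalidate()` hook corresponds to the automatic cache invalidation on metric-query error described above.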

If I get the chance I may also add named query indirection, so one predicate query set can be shared across multiple scrapers to reduce verbosity. I'm not sure it'll be worth it; will see.

Thoughts? The in-house implementation of this logic turned out to be very simple to do, and should translate well to this scraper.

@Vonng
Owner

Vonng commented Jun 15, 2024

I think predicate query is an excellent idea! It could bring a lot of flexibility.

Here are some of my rough thoughts:

  • Should we allow multiple Predicate Queries? I can understand the point of adding multiple check queries. In that case, can we condense all the Predicate Queries into a single one using SQL's Boolean short-circuit logic? Since readability isn't important here, I think using a single Predicate Query might offer better performance and maintainability.
  • The changes to the config file should be forward-compatible. We could add a predicate field to the Collector definition. If this field is present, the corresponding predicate query will be executed, and branch logic will be applied. Personally, I feel that maintaining a separate set of reusable Predicate Queries might be redundant because most default Collectors should use the existing mechanism.
  • TTL is necessary, but we could use a fixed value like 100ms, or the same TTL as the main query. The check should not take too long, so a small fixed value could be OK.

Of course, these are just my rough ideas. I’m very open to further discussion, and I would greatly welcome any implementations or contributions. 😊

@Vonng Vonng added the enhancement New feature or request label Jun 15, 2024
@ringerc
Collaborator Author

ringerc commented Jun 17, 2024

I've written a proof of concept PR for this: #47

@Vonng Vonng closed this as completed in 03465d7 Jun 19, 2024