Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

lf- · 2024-03-23T06:49:26Z

Is your feature request related to a problem? Please describe.

We have no good way of bisecting hydra anymore since the grahamc service died a year ago. But we have the data, it is just locked behind the hydra API being busted, and can relatively easily simply be obtained from the database. Although there is definitely merit in doing the https://pad.lassul.us/opendata-projects and making an open hydra read replica, this requires effort.

I have written a tool https://github.com/lf-/necktangler, and a ~5s sql query to run against hydra, where the result could simply be put in a s3 bucket on a nightly cron job and we could call it a day.

Describe the solution you'd like

Put this query in a cron job somewhere with hydra DB access, and run it with:

psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=trunk-combined --csv -f hydra-query.sql > data-nixos-unstable.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=unstable-small --csv -f hydra-query.sql > data-nixos-unstable-small.csv
psql $SOME_CONNECTION_DETAILS -v job=unstable -v project=nixpkgs -v jobset=trunk --csv -f hydra-query.sql > data-nixpkgs-unstable.csv
psql $SOME_CONNECTION_DETAILS -v job=darwin-tested -v project=nixpkgs -v jobset=nixpkgs-23.11-darwin --csv -f hydra-query.sql > data-nixpkgs-23.11-darwin.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=release-23.11 --csv -f hydra-query.sql > data-nixos-23.11.csv
psql $SOME_CONNECTION_DETAILS -v job=tested -v project=nixos -v jobset=release-23.11-small --csv -f hydra-query.sql > data-nixos-23.11-small.csv
# s3 upload to some bucket somewhere

select
    jsei.revision as sha, b.timestamp as advance_ts, concat(j.project, ':', j.name, ':', b.job) as hydra_job, b.id as hydra_build_id
    from jobsets j
        inner join builds b on j.id = b.jobset_id
        inner join jobsetevalmembers jsem on jsem.build = b.id
        inner join jobsetevals jse on jsem.eval = jse.id
        inner join jobsetevalinputs jsei on jsei.eval = jse.id
    where
    b.job = :'job' and
    j.project = :'project' and
    j.name = :'jobset' and
    jsei.name = 'nixpkgs' and
    b.finished = 1 and
    -- arbitraily chosen start in time; there are some bogus commit IDs from
    -- 2014 or so, which we would rather not deal with
    b.timestamp > extract(epoch from '2018-01-01T00:00:00Z'::timestamptz)
    order by b.timestamp asc

(nixpkgs-unstable is 8571 rows, nixos-unstable is 3000 rows, from several years. imo we have no reason to shard this any time soon)

Describe alternatives you've considered

Going and building a public read replica of hydra (sounds like effort).

Being unable to bisect hydra.

Additional context

The text was updated successfully, but these errors were encountered:

RaitoBezarius · 2024-03-23T14:52:04Z

An alternative could be to add this particular API endpoint in the Hydra code and deploy it fast.

lf- added the enhancement label Mar 23, 2024

lf- changed the title ~~Put hydra advances query result in a public s3 bucket somewhere~~ Put hydra advances query result in a public s3 bucket somewhere on a cron job Mar 23, 2024

mweinelt added the new-service Request for a new service to be ran on the NixOS infra label Mar 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

lf- commented Mar 23, 2024 •

edited

Loading

RaitoBezarius commented Mar 23, 2024

Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

Put hydra advances query result in a public s3 bucket somewhere on a cron job #405

Comments

lf- commented Mar 23, 2024 • edited Loading

RaitoBezarius commented Mar 23, 2024

lf- commented Mar 23, 2024 •

edited

Loading