cmd,server: Send stream health events to a metadata queue #1946

Merged
merged 26 commits into from
Aug 13, 2021

Conversation

victorges
Member

@victorges victorges commented Jul 6, 2021

What does this pull request do? Explain your changes. (required)
This is the first step toward implementing a stream health dashboard. The idea
is to send events about stream transcoding to an AMQP queue (exchange). That
queue will later be consumed by another (Go) service, which will aggregate the
information so it can be displayed to the end user.

Specific updates (required)
In the order of the commits:

  • Add code to broadcaster.go to gather information about transcode attempts
  • Create a higher-level RabbitMQ (producer) client [1] under the server/event package [2]
  • Configure a producer instance based on CLI flags
  • Actually send the stream health transcode events
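As a rough illustration of the last three steps, here is a minimal sketch of a producer abstraction and a transcode event serialized as JSON. Everything in it is an assumption made for illustration: the `Producer` interface, the `TranscodeEvent` fields, the `stream_health.transcode` routing key, and the in-memory producer standing in for a real RabbitMQ client are all hypothetical and not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// TranscodeEvent is a hypothetical payload; the field names are assumptions.
type TranscodeEvent struct {
	Type       string `json:"type"`
	ManifestID string `json:"manifestId"`
	SeqNo      uint64 `json:"seqNo"`
	Timestamp  int64  `json:"timestamp"` // Unix time in milliseconds
	Success    bool   `json:"success"`
}

// Producer is a hypothetical abstraction over the queue client, so that
// callers do not depend on a specific AMQP library.
type Producer interface {
	Publish(key string, body []byte) error
}

// memoryProducer records published messages instead of sending them.
// It stands in for a real RabbitMQ-backed producer in this sketch.
type memoryProducer struct {
	keys   []string
	bodies [][]byte
}

func (p *memoryProducer) Publish(key string, body []byte) error {
	p.keys = append(p.keys, key)
	p.bodies = append(p.bodies, body)
	return nil
}

// sendTranscodeEvent serializes the event as JSON and publishes it under
// an assumed routing key.
func sendTranscodeEvent(p Producer, evt TranscodeEvent) error {
	body, err := json.Marshal(evt)
	if err != nil {
		return fmt.Errorf("marshal transcode event: %w", err)
	}
	return p.Publish("stream_health.transcode", body)
}

func main() {
	p := &memoryProducer{}
	evt := TranscodeEvent{
		Type:       "transcode",
		ManifestID: "abc123",
		SeqNo:      42,
		Timestamp:  time.Now().UnixNano() / int64(time.Millisecond),
		Success:    true,
	}
	if err := sendTranscodeEvent(p, evt); err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Printf("%s %s\n", p.keys[0], p.bodies[0])
}
```

Keeping the producer behind a small interface like this is one way to let tests swap in a recording implementation without a running broker.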

[1] In an ideal world, we would just use an existing library that handles
all the protocol complexity, but I couldn't find one with enough developer
activity to be safe to depend on.
[2] Is it OK to create sub-packages like that, or should it all be under server? Or
maybe live somewhere else and have server depend on it?

How did you test each of these updates (required)

  • So far I've only verified that the existing tests don't break with ./test.sh
  • Still pending: add tests for the new queue producer/stream health code,
    and test the whole thing running end to end.
  • (Is it runnable locally?)

Does this pull request close any open issues?
Implements #1939

Checklist:

@victorges victorges changed the base branch from master to vg/fix/webhook-logs July 6, 2021 00:57
Base automatically changed from vg/fix/webhook-logs to master July 6, 2021 20:44
@victorges victorges force-pushed the vg/feat/metadata-queue branch 2 times, most recently from bdf9c65 to b104da0 Compare July 7, 2021 22:18
@victorges victorges requested a review from yondonfu July 7, 2021 22:33
@victorges victorges requested a review from jailuthra July 7, 2021 22:36
@victorges victorges force-pushed the vg/feat/metadata-queue branch 13 times, most recently from 05c14bf to cafd5d4 Compare July 10, 2021 00:33
@victorges victorges changed the title WIP: feat/metadata queue cmd,server: Send stream health events to a metadata queue Jul 10, 2021
@victorges victorges marked this pull request as ready for review July 10, 2021 00:43
@victorges victorges requested a review from iameli July 10, 2021 00:45
Member

@iameli iameli left a comment


The more I think about this the more I think that we should probably be pushing an event for every transcode attempt and failure. It'll be good for the Stream Health service to be able to evaluate how things are performing in real-time, not only after multiple retries are happening. (Though there oughta be some sort of correlation ID so it can figure out which attempts go with which successes, I suppose.)

I'll leave review of the actual code structure here to @yondonfu and @jailuthra
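To make the correlation ID idea concrete, here is a minimal sketch of how a consumer might group per-attempt events. The `AttemptEvent` and `Summary` types and the `summarize` helper are hypothetical names introduced for illustration; the PR does not define them.

```go
package main

import "fmt"

// AttemptEvent is a hypothetical per-attempt event. CorrelationID ties
// together all attempts made for the same segment.
type AttemptEvent struct {
	CorrelationID string
	Attempt       int
	Success       bool
}

// Summary aggregates the attempts that share a correlation ID.
type Summary struct {
	Attempts  int
	Succeeded bool
}

// summarize groups events by correlation ID, counting attempts and
// recording whether any of them succeeded.
func summarize(events []AttemptEvent) map[string]Summary {
	out := make(map[string]Summary)
	for _, e := range events {
		s := out[e.CorrelationID]
		s.Attempts++
		s.Succeeded = s.Succeeded || e.Success
		out[e.CorrelationID] = s
	}
	return out
}

func main() {
	events := []AttemptEvent{
		{CorrelationID: "seg-1", Attempt: 1, Success: false},
		{CorrelationID: "seg-1", Attempt: 2, Success: true},
		{CorrelationID: "seg-2", Attempt: 1, Success: false},
	}
	for id, s := range summarize(events) {
		fmt.Printf("%s: attempts=%d succeeded=%v\n", id, s.Attempts, s.Succeeded)
	}
}
```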

cmd/livepeer/livepeer.go (outdated)
@iameli
Member

iameli commented Jul 12, 2021

It occurs to me that we'll probably want to develop this in conjunction with the service that consumes it — inevitably we'll want to make tweaks as everything comes together. So maybe we'll keep this branch open for a little bit if folks don't object.

@victorges victorges force-pushed the vg/feat/metadata-queue branch 3 times, most recently from d03fe63 to 503ca09 Compare July 13, 2021 22:42
 - Include both orch eth address and transcoder URI
 - transcodeAttemptInfo as an out parameter of transcodeSegment
 - named returns+defer trick to avoid repetitions of time&error setting
 - Remove defer from retry loop
 - Fix error field type on transcode info
 - Make transcode attempt info a ret-val instead of ptr

Still keep the named-rets+defer logic though which
I guess was the real deal
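The "named returns + defer" pattern mentioned in these commit messages can be sketched as follows. This is a generic Go illustration, not the PR's code: `attemptInfo`, `transcodeOnce`, and `errSimulated` are hypothetical names. The deferred closure runs after the return values are set, so the duration and error are recorded once rather than before every `return`. Because the closure lives in the per-attempt function, no `defer` is needed inside the retry loop.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// attemptInfo is a hypothetical record of a single transcode attempt.
type attemptInfo struct {
	Duration time.Duration
	Err      error
}

// errSimulated is a sentinel error used only by this sketch.
var errSimulated = errors.New("simulated transcode failure")

// transcodeOnce runs work and returns info about the attempt by value.
// The named results let the deferred closure fill in the duration and
// final error on every return path.
func transcodeOnce(work func() error) (info attemptInfo, err error) {
	start := time.Now()
	defer func() {
		info.Duration = time.Since(start)
		info.Err = err
	}()
	if err = work(); err != nil {
		return info, fmt.Errorf("transcode failed: %w", err)
	}
	return info, nil
}

func main() {
	// The retry loop calls the per-attempt function; it contains no defer.
	for attempt := 1; attempt <= 2; attempt++ {
		info, err := transcodeOnce(func() error {
			if attempt == 1 {
				return errSimulated
			}
			return nil
		})
		fmt.Printf("attempt %d: took=%v err=%v\n", attempt, info.Duration, err)
		if err == nil {
			break
		}
	}
}
```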
The library also closes that Go channel when the AMQP
channel is closed, so we need to handle that in case it
happens before the actual channel-closed notification arrives.
@victorges
Member Author

victorges commented Aug 13, 2021

Yeah weird, let me try rebasing to see if it works.

Edit: Apparently it did 🤔

victorges and others added 15 commits August 13, 2021 18:28
Final timestamp so it doesn't need to be derived
and byte size of segments.
Also add first character of manifest ID as separate topic segment
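One way to picture the extra topic segment is a routing key builder like the sketch below. The key layout (`stream_health.transcode.<first char>.<manifest ID>`) and the `routingKey` helper are assumptions made for illustration; the actual key format used by the PR is not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// keyPrefix is an assumed prefix for stream health transcode events.
const keyPrefix = "stream_health.transcode"

// routingKey builds a topic routing key in which the first character of
// the manifest ID is its own segment, followed by the full manifest ID.
func routingKey(manifestID string) (string, error) {
	if manifestID == "" {
		return "", errors.New("empty manifest ID")
	}
	return fmt.Sprintf("%s.%c.%s", keyPrefix, manifestID[0], manifestID), nil
}

func main() {
	key, err := routingKey("abc123")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(key)
}
```

With a layout like this, a consumer on an AMQP topic exchange could bind with a pattern such as `stream_health.transcode.a.*` to receive only streams whose manifest ID starts with `a`, which gives a simple way to shard consumers. The sketch assumes manifest IDs contain no `.` characters, since dots separate topic segments.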
It's not really a queue per se, but it's just the easiest
way to refer to and reason about it.
 - Create the helper stubTestTranscoder to avoid repeating logic
 - Repeat the pattern on other tests to create a single rtmpConnection
   and just keep changing the sessManager on it instead.
 - Some additional guarantees in metadata queue tests
@victorges victorges merged commit 67db52b into master Aug 13, 2021
@victorges victorges deleted the vg/feat/metadata-queue branch August 13, 2021 22:14

4 participants