WX-1333 Improve logging visibility for load management #7253

Merged (12 commits) on Nov 16, 2023

Conversation

aednichols
Collaborator

  • Add load status logging to PipelinesApiRequestManager.scala, which previously had none
  • Add high load logging to IOActor, which previously only had back-to-normal logging
  • Add load logging to ServiceRegistryActor, which collects load messages from their various sources and routes them to sinks such as JobTokenDispenserActor

aednichols requested a review from a team as a code owner on November 15, 2023, 03:13
- val load = if (workQueue.size > LoadConfig.PAPIThreshold) HighLoad else NormalLoad
+ val load = if (workQueue.size > LoadConfig.PAPIThreshold) {
+   log.warning(s"PAPI Request Manager notifying HighLoad with queue size ${workQueue.size} exceeding limit of ${LoadConfig.PAPIThreshold}")
+   HighLoad
Contributor

It looks like this is run regularly and not just when the status changes. Do you think it would be possible to log this only when the state changes? That could reduce the number of messages to where we could log the NormalLoad message as INFO, to see those without deliberately turning on debug logging.

Collaborator Author

Solid idea, implemented.
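
The "log only when the state changes" suggestion above can be sketched as follows. This is a minimal illustration and not the code merged in this PR: `LoadTransitionLogger`, its `threshold` and `emit` parameters, and the message wording are all hypothetical, and the load levels are simplified stand-ins for the ones named in the diff.

```scala
// Hypothetical, simplified stand-ins for the load levels named in the diff.
sealed trait LoadLevel
case object NormalLoad extends LoadLevel
case object HighLoad extends LoadLevel

// Remembers the last reported level and emits a message only on a transition.
// `threshold` and `emit` are illustrative parameters, not Cromwell APIs.
final class LoadTransitionLogger(threshold: Int, emit: String => Unit) {
  private var lastLoad: LoadLevel = NormalLoad

  def monitorQueueSize(queueSize: Int): LoadLevel = {
    val load: LoadLevel = if (queueSize > threshold) HighLoad else NormalLoad
    if (load != lastLoad) {
      load match {
        case HighLoad =>
          emit(s"WARN: notifying HighLoad with queue size $queueSize exceeding limit of $threshold")
        case NormalLoad =>
          emit(s"INFO: back to NormalLoad with queue size $queueSize (limit $threshold)")
      }
      lastLoad = load
    }
    load
  }
}

object TransitionDemo {
  def main(args: Array[String]): Unit = {
    val monitor = new LoadTransitionLogger(threshold = 10, emit = println)
    // Only the two transitions (normal -> high, high -> normal) produce output.
    Seq(5, 11, 12, 13, 3, 2).foreach(size => monitor.monitorQueueSize(size))
  }
}
```

Because repeated checks at the same level stay silent, the back-to-normal message is rare enough that logging it at INFO should not flood the logs.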

@@ -96,7 +96,13 @@ class PipelinesApiRequestManager(val qps: Int Refined Positive, requestWorkers:
   }

   def monitorQueueSize() = {
-    val load = if (workQueue.size > LoadConfig.PAPIThreshold) HighLoad else NormalLoad
+    val load = if (workQueue.size > LoadConfig.PAPIThreshold) {
+      log.warning(s"PAPI Request Manager notifying HighLoad with queue size ${workQueue.size} exceeding limit of ${LoadConfig.PAPIThreshold}")
Contributor

I'm curious why this is WARN and the new IoActor message is INFO - is this one scarier?

Collaborator Author

I do think PAPI backups are much less common and therefore more suspicious. That said, I am making the IoActor one a warning too for consistency.

@@ -107,6 +110,15 @@ class ServiceRegistryActor(globalConfig: Config) extends Actor with ActorLogging
       sender() ! ServiceRegistryFailure("Message is not a ServiceRegistryMessage: " + fool)
   }

+  private def debugLogLoadMessages(msg: ServiceRegistryMessage, sender: ActorRef): Unit = {
+    msg match {
Contributor

THWiseman, Nov 15, 2023

Optional thought: maybe we'd prefer to log the LoadMetrics themselves (or some toString method on them) rather than building custom messages based on their loadLevel here. That would be slightly more robust in the event that we add a new LoadMetric.

Collaborator Author

I guess I wrote the code this way when I wanted different log levels for the two, but they're both debug now. Implemented.

> amm
Loading...
Welcome to the Ammonite Repl 2.5.11 (Scala 3.2.2 Java 11.0.20.1)
@ trait LoadLevel 
defined trait LoadLevel

@ case object NormalLoad extends LoadLevel 
defined object NormalLoad

@ val x = NormalLoad 
x: ammonite.$sess.cmd1.NormalLoad.type = NormalLoad

@ x.toString 
res3: String = "NormalLoad"
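
Building on the REPL session above, here is a rough sketch of logging the metric itself instead of writing one message per load level. The `LoadMetric` shape (a `monitorName` string plus a `loadLevel`), the `LoadLogging.describe` helper, and the message wording are assumptions made for illustration; they are not taken from the Cromwell source.

```scala
// Hypothetical, simplified stand-ins; the real Cromwell types may differ.
sealed trait LoadLevel
case object NormalLoad extends LoadLevel
case object HighLoad extends LoadLevel

// Assumed shape: a monitor name plus its current level.
final case class LoadMetric(monitorName: String, loadLevel: LoadLevel)

object LoadLogging {
  // Relies on the compiler-generated toString of case classes and case
  // objects, so a newly added load level needs no change here.
  def describe(metric: LoadMetric, senderPath: String): String =
    s"Received $metric from $senderPath"
}

object DescribeDemo {
  def main(args: Array[String]): Unit = {
    // prints: Received LoadMetric(IO,HighLoad) from /user/io
    println(LoadLogging.describe(LoadMetric("IO", HighLoad), "/user/io"))
  }
}
```

The trade-off is that the generated `toString` format is fixed by the compiler, so the log text follows the type's field order rather than a hand-written sentence.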

aednichols merged commit 14c3184 into develop on Nov 16, 2023
33 of 35 checks passed
aednichols deleted the aen_wx_1333 branch on November 16, 2023, 23:07

3 participants