Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contextual logging #2121

Merged
merged 16 commits into from
Dec 14, 2021
185 changes: 185 additions & 0 deletions clog/clog.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@
/*
Package clog provides Context with logging information.
*/
package clog
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about creating a dedicated unit test for this file, like clog_test.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What it will test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should at least:

  1. Create this context
  2. Use your functions to update it (AddManifestID(), AddOrchSessionID())
  3. Execute messageFromContext() and assert the result

You can think if something else can be tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!


import (
"context"
"fmt"
"strconv"
"strings"
"sync"

"github.com/golang/glog"
)

// unique type to prevent assignment.
type clogContextKeyT struct{}

var clogContextKey = clogContextKeyT{}

const (
// standard keys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of a standard key? Why orchSessionID is not a standard key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the meaning of a standard key?

These constants exists only for one reason - to be able to print them in sorted order.

Why orchSessionID is not a standard key?

It is. I just forgot to add it to the stdKeysOrder array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh, I see, then I think you can remove this comment // standard keys and maybe rename stdKeys, because this std has apparently no meaning, which can be confusing.

You can just call them clogKeys or keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are 'standard' because we define them in the package, and have separate AddXXX methods for them - as opposed to general AddVal method which allows user to add any key.

manifestID = "manifestID"
sessionID = "sessionID"
nonce = "nonce"
seqNo = "seqNo"
orchSessionID = "orchSessionID" // session id generated on orchestrator for broadcaster
)

// Verbose is a boolean type that implements Infof (like Printf) etc.
type Verbose bool

var stdKeys map[string]bool
var stdKeysOrder = []string{manifestID, sessionID, nonce, seqNo, orchSessionID}

func init() {
stdKeys = make(map[string]bool)
for _, key := range stdKeysOrder {
stdKeys[key] = true
}
}

type values struct {
mu sync.RWMutex
vals map[string]string
}

func newValues() *values {
return &values{
vals: make(map[string]string),
}
}

// Clone creates new context with parentCtx as parent and
// logging details from logCtx
func Clone(parentCtx, logCtx context.Context) context.Context {
cmap, _ := logCtx.Value(clogContextKey).(*values)
newCmap := newValues()
if cmap != nil {
cmap.mu.RLock()
for k, v := range cmap.vals {
newCmap.vals[k] = v
}
cmap.mu.RUnlock()
}
return context.WithValue(parentCtx, clogContextKey, newCmap)
}

func AddManifestID(ctx context.Context, val string) context.Context {
return AddVal(ctx, manifestID, val)
}

func AddSessionID(ctx context.Context, val string) context.Context {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this function is never used, why don't we set session ID anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. We have vague definition of session id, and we don't it go-livepeer codebase, but we have one returned from webhook, so I've used it 02fd269

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

return AddVal(ctx, sessionID, val)
}

func AddNonce(ctx context.Context, val uint64) context.Context {
return AddVal(ctx, nonce, strconv.FormatUint(val, 10))
}

func AddSeqNo(ctx context.Context, val uint64) context.Context {
return AddVal(ctx, seqNo, strconv.FormatUint(val, 10))
victorges marked this conversation as resolved.
Show resolved Hide resolved
}

func AddOrchSessionID(ctx context.Context, val string) context.Context {
return AddVal(ctx, orchSessionID, val)
}

func AddVal(ctx context.Context, key, val string) context.Context {
cmap, _ := ctx.Value(clogContextKey).(*values)
if cmap == nil {
cmap = newValues()
ctx = context.WithValue(ctx, clogContextKey, cmap)
}
Comment on lines +89 to +94
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that bothers me about the "lazy init" approach is that every interface we give needs to receive & return a context.Context instance. I'm not 100% which is my preference, but have you considered/WDYT of a separate function to initialize the context like

ctx = clog.InitFields(ctx)

and then it can just be used to add fields without worrying about re-setting the ctx value like

clog.AddVal(ctx, "a", "b")

or

clog.Fields(ctx).Add("a", "b")

or even ⛓️

clog.Fields(ctx)
    .Add("a", "b")
    .Add("c", "d")

In which clog.Fields would panic if the fields have not been initialized. Also maybe InitFields could even be the same function as "clone": if there are already fields there, it would create new ones and clone the values, for a "sub-log-context".

Note that we could also allow both interfaces to be used, with the disclaimer that, if you use Fields you must know that your ctx has already been initalized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now, after context is initialised, you can just do clog.AddVal(ctx, "a", "b") - this will work just fine because anyway it will return same context. I didn't want explicit initialisation because I didn't want to have accidental panic in production.

cmap.mu.Lock()
cmap.vals[key] = val
cmap.mu.Unlock()
Comment on lines +95 to +97
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great if we could avoid this lock somehow, as logging ideally shouldn't be something you need to be too cautious about performance issues. I wonder if no lock at all wouldn't be sufficient here, since that also considering the current interface which returns a context.Context on every added field, doesn't look like something that we would/should be calling concurrently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't look like something that we would/should be calling concurrently.

I thought so, but was forced to add lock after panics 😄 Turned out we do use same context from different go-routines.
Anyway, when I've first created this package, I didn't know how much information will end up in the logging context.
Turned out not much - right now I'm using only one non-standard key, and probably we will not add many more.
So I think we can get rid of map completely and just use pre-allocated array containing key/value pairs.
Or keep map and locks, but pre-format message on every AddXXX call. Or something else - but it is discussion for the next issue/PR 😄

return ctx
}

func Warningf(ctx context.Context, format string, args ...interface{}) {
glog.WarningDepth(1, formatMessage(ctx, false, format, args...))
}

func Errorf(ctx context.Context, format string, args ...interface{}) {
glog.ErrorDepth(1, formatMessage(ctx, false, format, args...))
}

func Fatalf(ctx context.Context, format string, args ...interface{}) {
glog.FatalDepth(1, formatMessage(ctx, false, format, args...))
}

func Infof(ctx context.Context, format string, args ...interface{}) {
infof(ctx, false, format, args...)
}

// Infofe if last argument is not nil it will be printed as " err=%q"
func Infofe(ctx context.Context, format string, args ...interface{}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any known convention with this e at the end logging functions? If not, then maybe we don't need this Infofe() and (v Verbose) Infofe() at all? I see they are not used in many places and maybe we can add err=%v there in the string message. What do you think? To me, this looks confusing.

Copy link
Contributor Author

@darkdarkdragon darkdarkdragon Dec 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is convention that I've just invented 😄
This exists to avoid having err=<nil> in the logs. We will get it if we're doing Infof("err=%v", err) in case err could be nil.
So to avoid this I need to write

if err != nil {
  glog.Infof("message var1=%s err=%v", var1, err)
} else {
  glog.Infof("message var1=%s", var1)
}`

Infofe helps to avoid this duplication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I see. I'd probably prefer having err=<nil> since it's kind-of magic behavior of the library you create. In other words, as a user of clog you can be confused that suddenly err= is automatically added at the end. Anyway, it's minor and probably opinion-based, so I leave it up to you (and @victorges) to decide.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You would prefer not to have err=<nil> in the logs once you will start digging through actual logs 😄
When searching through logs using err= as a query it's much better to see lines with actual errors 😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When searching through logs using err= as a query it's much better to see lines with actual errors 😄

This will be so much less insane now that we have an actual tool for querying logs by their actual fields and not just text search 😄
But I think it's nice to avoid the err=<nil> lines anyway :)

It is convention that I've just invented 😄

Since it is a new convetion and thus not something that is self-explanatory, WDYT of using a more explicit name for the function? I like Infoferr for example.

And another approach is providing an AddErr function that only adds the error to the log if it is not nil. On your example, it could be:

ctx = clog.AddErr(ctx, err)
clog.Infof(ctx, "message var1=%s", var1)

or with the chaining idea in the other comment:

clog.Fields(ctx).AddErr(err)
    .Infof("message var1=%s", var1)

infof(ctx, true, format, args...)
}

func V(level glog.Level) Verbose {
return Verbose(glog.V(level))
}

// Infof is equivalent to the global Infof function, guarded by the value of v.
// See the documentation of V for usage.
darkdarkdragon marked this conversation as resolved.
Show resolved Hide resolved
func (v Verbose) Infof(ctx context.Context, format string, args ...interface{}) {
if v {
infof(ctx, false, format, args...)
}
}

func (v Verbose) Infofe(ctx context.Context, format string, args ...interface{}) {
if v {
infof(ctx, true, format, args...)
}
}

func infof(ctx context.Context, lastErr bool, format string, args ...interface{}) {
glog.InfoDepth(2, formatMessage(ctx, lastErr, format, args...))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already asked about it, but it gets lost somewhere during the refactor ;) Why is the depth here 2 while in Warningf, Fatalf and Infof you use 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Warningf call stack looks like this: calling code->clog.Warningf->glog.WarningDepth
For Infof: calling code->clog.Infof->infof->glog.InfoDepth
As you can see there is one more function in the chain, so call stack is deeper by one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, I see. Consider adding a comment to the code to explain it.

}

func messageFromContext(ctx context.Context, sb *strings.Builder) {
if ctx == nil {
return
}
cmap, _ := ctx.Value(clogContextKey).(*values)
if cmap == nil {
return
}
cmap.mu.RLock()
for _, key := range stdKeysOrder {
if val, ok := cmap.vals[key]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you print the standard keys in some predefined order. However, not standard keys are printed in random order. Is that intended?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that intended?

It is consequence of go's map being random. I could overcome it by different ways, but I've decided that it's not worth it. And anyway, right now only standard keys are used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we prepare the clog only for standard keys now, then let's maybe print only them? And skip printing the random keys? I think we should make it clear if the user of clog is allowed to create custom keys or not. If yes, then be consistent and print them in some order (i.e. alphabetic) if not, then just do not log the custom keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It is allowed to create custom keys.
  2. I don't think it's worth it. To do that we will need to write more code and suffer runtime efficiency. We can change it later if we will decide that this is a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It is allowed to create custom keys.

Do we use it anywhere? Why do we need it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be quite annoying to go through logs where the keys have a random order on every line. Especially when for example going through a stream of the same log type, I normally get used to find the data I want on the same position on the line (cause fields generally have have the same length) and just move around.

Totally something that we can change in implementation later without breaking compatibility, so not necessary now, but giving my +1 here that we probably want to do some deterministic key order here soon. We could create an auxiliary keys slice, sort that, and then print the fields the same way we do for the stdKeysOrder above. That's exactly how logrus does it (src) and I'd guess how json package does it too for serialization.

sb.WriteString(key)
sb.WriteString("=")
sb.WriteString(val)
sb.WriteString(" ")
}
}
for key, val := range cmap.vals {
if _, ok := stdKeys[key]; !ok {
sb.WriteString(key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Consider extracting to a function, these 4 lines are the same as in line 166.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could, but I don't see much value in it.

sb.WriteString("=")
sb.WriteString(val)
sb.WriteString(" ")
}
}
cmap.mu.RUnlock()
}

func formatMessage(ctx context.Context, lastErr bool, format string, args ...interface{}) string {
var sb strings.Builder
messageFromContext(ctx, &sb)
var err interface{}
if lastErr && len(args) > 0 {
err = args[len(args)-1]
args = args[:len(args)-1]
}
sb.WriteString(fmt.Sprintf(format, args...))
if err != nil {
sb.WriteString(fmt.Sprintf(" err=%q", err))
}
return sb.String()
}
victorges marked this conversation as resolved.
Show resolved Hide resolved
39 changes: 39 additions & 0 deletions clog/clog_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
package clog

import (
"context"
"errors"
"testing"

"github.com/stretchr/testify/assert"
)

func TestStdKeys(t *testing.T) {
assert := assert.New(t)
ctx := AddManifestID(context.Background(), "manID")
ctx = AddSessionID(ctx, "sessionID")
ctx = AddNonce(ctx, 1038)
ctx = AddOrchSessionID(ctx, "orchID")
ctx = AddSeqNo(ctx, 9427)
ctx = AddVal(ctx, "customKey", "customVal")
msg := formatMessage(ctx, false, "testing message num=%d", 452)
assert.Equal("manifestID=manID sessionID=sessionID nonce=1038 seqNo=9427 orchSessionID=orchID customKey=customVal testing message num=452", msg)
ctxCloned := Clone(context.Background(), ctx)
ctxCloned = AddManifestID(ctxCloned, "newManifest")
msgCloned := formatMessage(ctxCloned, false, "testing message num=%d", 4521)
assert.Equal("manifestID=newManifest sessionID=sessionID nonce=1038 seqNo=9427 orchSessionID=orchID customKey=customVal testing message num=4521", msgCloned)
// old context shouldn't change
msg = formatMessage(ctx, false, "testing message num=%d", 452)
assert.Equal("manifestID=manID sessionID=sessionID nonce=1038 seqNo=9427 orchSessionID=orchID customKey=customVal testing message num=452", msg)
}

func TestLastErr(t *testing.T) {
assert := assert.New(t)
ctx := AddManifestID(context.Background(), "manID")
var err error
msg := formatMessage(ctx, true, "testing message num=%d", 452, err)
assert.Equal("manifestID=manID testing message num=452", msg)
err = errors.New("test error")
msg = formatMessage(ctx, true, "testing message num=%d", 452, err)
assert.Equal("manifestID=manID testing message num=452 err=\"test error\"", msg)
}
12 changes: 6 additions & 6 deletions cmd/livepeer/livepeer.go
Original file line number Diff line number Diff line change
Expand Up @@ -276,7 +276,7 @@ func main() {
if *testTranscoder {
err := core.TestNvidiaTranscoder(devices)
if err != nil {
glog.Fatalf("Unable to transcode using Nvidia gpu=%s err=%v", strings.Join(devices, ","), err)
leszko marked this conversation as resolved.
Show resolved Hide resolved
glog.Fatalf("Unable to transcode using Nvidia gpu=%q err=%q", strings.Join(devices, ","), err)
}
}
// FIXME: Short-term hack to pre-load the detection models on every device
Expand Down Expand Up @@ -1079,19 +1079,19 @@ func getServiceURI(n *core.LivepeerNode, serviceAddr string) (*url.URL, error) {
// TODO probably should put this (along w wizard GETs) into common code
resp, err := http.Get("https://api.ipify.org?format=text")
if err != nil {
glog.Errorf("Could not look up public IP err=%v", err)
glog.Errorf("Could not look up public IP err=%q", err)
return nil, err
}
defer resp.Body.Close()
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
glog.Errorf("Could not look up public IP err=%v", err)
glog.Errorf("Could not look up public IP err=%q", err)
return nil, err
}
addr := "https://" + strings.TrimSpace(string(body)) + ":" + RpcPort
inferredUri, err := url.ParseRequestURI(addr)
if err != nil {
glog.Errorf("Could not look up public IP err=%v", err)
glog.Errorf("Could not look up public IP err=%q", err)
return nil, err
}
if n.Eth == nil {
Expand All @@ -1102,12 +1102,12 @@ func getServiceURI(n *core.LivepeerNode, serviceAddr string) (*url.URL, error) {
// On-chain lookup and matching with inferred public address
addr, err = n.Eth.GetServiceURI(n.Eth.Account().Address)
if err != nil {
glog.Errorf("Could not get service URI; orchestrator may be unreachable err=%v", err)
glog.Errorf("Could not get service URI; orchestrator may be unreachable err=%q", err)
return nil, err
}
ethUri, err := url.ParseRequestURI(addr)
if err != nil {
glog.Errorf("Could not parse service URI; orchestrator may be unreachable err=%v", err)
glog.Errorf("Could not parse service URI; orchestrator may be unreachable err=%q", err)
ethUri, _ = url.ParseRequestURI("http://127.0.0.1:" + RpcPort)
}
if ethUri.Hostname() != inferredUri.Hostname() || ethUri.Port() != inferredUri.Port() {
Expand Down
2 changes: 1 addition & 1 deletion common/db.go
Original file line number Diff line number Diff line change
Expand Up @@ -766,7 +766,7 @@ func (db *DB) SelectEarliestWinningTicket(sender ethcommon.Address, minCreationR
)
if err := row.Scan(&senderString, &recipient, &faceValue, &winProb, &senderNonce, &recipientRand, &recipientRandHash, &sig, &creationRound, &creationRoundBlockHash, &paramsExpirationBlock); err != nil {
if err.Error() != "sql: no rows in result set" {
return nil, fmt.Errorf("could not retrieve earliest ticket err=%v", err)
return nil, fmt.Errorf("could not retrieve earliest ticket err=%q", err)
}
// If there is no result return no error, just nil value
return nil, nil
Expand Down
3 changes: 2 additions & 1 deletion common/types.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package common

import (
"context"
"encoding/json"
"math/big"
"net/url"
Expand Down Expand Up @@ -74,7 +75,7 @@ type OrchestratorPool interface {
// GetInfo gets info for specific orchestrator
GetInfo(uri string) OrchestratorLocalInfo
GetInfos() []OrchestratorLocalInfo
GetOrchestrators(int, Suspender, CapabilityComparator, ScorePred) ([]*net.OrchestratorInfo, error)
GetOrchestrators(context.Context, int, Suspender, CapabilityComparator, ScorePred) ([]*net.OrchestratorInfo, error)
Size() int
SizeWith(ScorePred) int
}
Expand Down
2 changes: 1 addition & 1 deletion core/capabilities_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -282,7 +282,7 @@ func (os *stubOS) GetInfo() *net.OSInfo {
return &net.OSInfo{StorageType: net.OSInfo_StorageType(os.storageType)}
}
func (os *stubOS) EndSession() {}
func (os *stubOS) SaveData(string, []byte, map[string]string, time.Duration) (string, error) {
func (os *stubOS) SaveData(context.Context, string, []byte, map[string]string, time.Duration) (string, error) {
return "", nil
}
func (os *stubOS) IsExternal() bool { return false }
Expand Down
12 changes: 6 additions & 6 deletions core/core_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,14 +57,14 @@ func TestTranscode(t *testing.T) {
md := &SegTranscodingMetadata{Profiles: videoProfiles, AuthToken: stubAuthToken()}

// Check nil transcoder.
tr, err := n.sendToTranscodeLoop(md, ss)
tr, err := n.sendToTranscodeLoop(context.TODO(), md, ss)
if err != ErrTranscoderAvail {
t.Error("Error transcoding ", err)
}

// Sanity check full flow.
n.Transcoder = NewLocalTranscoder(tmp)
tr, err = n.sendToTranscodeLoop(md, ss)
tr, err = n.sendToTranscodeLoop(context.TODO(), md, ss)
if err != nil {
t.Error("Error transcoding ", err)
}
Expand Down Expand Up @@ -106,18 +106,18 @@ func TestTranscodeSeg(t *testing.T) {

// Test offchain mode
require.Nil(n.Eth) // sanity check the offchain precondition of a nil eth
res := n.transcodeSeg(conf, seg, md)
res := n.transcodeSeg(context.TODO(), conf, seg, md)
assert.Nil(res.Err)
assert.Nil(res.Sig)
// sanity check results
resBytes, _ := n.Transcoder.Transcode(md)
resBytes, _ := n.Transcoder.Transcode(context.TODO(), md)
for i, trData := range res.TranscodeData.Segments {
assert.Equal(resBytes.Segments[i].Data, trData.Data)
}

// Test onchain mode
n.Eth = &eth.StubClient{}
res = n.transcodeSeg(conf, seg, md)
res = n.transcodeSeg(context.TODO(), conf, seg, md)
assert.Nil(res.Err)
assert.NotNil(res.Sig)
// check sig
Expand Down Expand Up @@ -147,7 +147,7 @@ func TestTranscodeLoop_GivenNoSegmentsPastTimeout_CleansSegmentChan(t *testing.T

mid := ManifestID(md.AuthToken.SessionId)

_, err := n.sendToTranscodeLoop(md, ss)
_, err := n.sendToTranscodeLoop(context.TODO(), md, ss)
require.Nil(err)
segChan := getSegChan(n, mid)
require.NotNil(segChan)
Expand Down
Loading