Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth,pm: avoid watchPoolSizeChange nil pointer error #1833

Merged
merged 2 commits into from
Apr 12, 2021

Conversation

kyriediculous
Copy link
Contributor

@kyriediculous kyriediculous commented Apr 8, 2021

What does this pull request do? Explain your changes. (required)
This issue might be caused by an unexpected return value from the ETH provider.

The error is caused by the following expression

if poolSize := sm.tm.GetTranscoderPoolSize(); poolSize.Cmp(lastPoolSize) != 0

Both values are fetched from TimeWatcher.GetTranscoderPoolSize()

The value of TimeWatcher.transcoderPoolSize is initialised on node startup and refetched from the ETH provider when events are received.

The contract binding BondingManagerSession.GetTranscoderPoolSize() returns the following return *(new(*big.Int)), err.
This means that even though the return value (a pointer) can't be nil because it is initialised , but value the pointer refers to can still be and this is what is returned. If for some reason there is a nil error and an unexpected value from the contract call a nil pointer could still occur.

Specific updates (required)

  • Wrap the contract call in case it returns nil, nil

How did you test each of these updates (required)

Does this pull request close any open issues?
Fixes #1828

Checklist:

@yondonfu
Copy link
Member

yondonfu commented Apr 8, 2021

This means that even though the return value (a pointer) can't be nil because it is initialised , but value the pointer refers to can still be and this is what is returned. If for some reason there is a nil error and an unexpected value from the contract call a nil pointer could still occur.

Did you actually confirm that the contract binding is returning a nil value even when the error is nil?

The nil pointer error in #1828 is being trigger in this line. I believe poolSize could be nil or lastPoolSize could be nil. The latter makes more sense to me because the first time that the pool size in the TimeWatcher is set is in the Watch() method which is called in a goroutine. So, if lastPoolSize is initialized in LocalSenderMonitor before the pool size is initialized in the TimeWatcher then lastPoolSize would be nil.

I think this indicates another problem - LocalSenderMonitor depends on state in TimeWatcher that might not be initialized yet. One way to address this particular problem could be to move the state initialization logic (such as this line) into the TimeWatcher constructor which is called synchronously before Watch() is called in a goroutine. Additionally, it may make sense to default the return value of TimeWatcher.GetTranscoderPoolSize() to some non-nil value to guard against nil pointer errors just to be safe (i.e. 0).

@kyriediculous
Copy link
Contributor Author

Did you actually confirm that the contract binding is returning a nil value even when the error is nil?

No but it could be possible if a ethereum client is used with potential non-default behaviour here I guess (or some erroneous code). You're probably right that a properly implemented client likely doesn't return nil, nil here.

So yeah a race condition is more likely to be the cause here.

@yondonfu
Copy link
Member

No but it could be possible if a ethereum client is used with potential non-default behaviour here I guess (or some erroneous code). You're probably right that a properly implemented client likely doesn't return nil, nil here.

I think the contract binding implementation should guarantee that the big.Int value is always non-nil.

@kyriediculous kyriediculous force-pushed the nv/nil-poolsize branch 2 times, most recently from 023b229 to 0777ed9 Compare April 12, 2021 15:23
@kyriediculous kyriediculous marked this pull request as ready for review April 12, 2021 15:23
Copy link
Member

@yondonfu yondonfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good. Just 2 minor comments.

eth/stubclient.go Show resolved Hide resolved
eth/watchers/timewatcher.go Outdated Show resolved Hide resolved
Copy link
Member

@yondonfu yondonfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after squashing

@kyriediculous kyriediculous merged commit 218e13f into master Apr 12, 2021
@kyriediculous kyriediculous deleted the nv/nil-poolsize branch April 12, 2021 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nil pointer in LocalSenderMonitor watchPoolSizeChange()
2 participants