Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce SessionState size better with MessagePack + LZ4 #632

Merged
merged 16 commits into from
Jul 13, 2020

Conversation

NickMeves
Copy link
Member

@NickMeves NickMeves commented Jun 25, 2020

The MessagePack + Compression portion of overhauling sessions encoding. (#523)

Description

(Controversial) Makes the compression stage mandatory and always happens. I think we either do LZ4 compression or we don't, no need for optional command line flag bloat.

This assumes ciphers are mandatory moving forward: with #414

Encode/Decode codepath itself is rather light. Most of the added code is supporting detecting a legacy V5 JSON session coming in and handling the fallback.

This PR also streamlines the RedisSessionStore rogue encryption of sessions to use the GCMCipher that is now available.

Motivation and Context

Lots of users having issues with large cookie sessions, particularly with Azure.

This moves to encoding with MessagePack to give a binary encoding format (JSON required the encrypted cookie to be Base64 encoded to be a valid string for JSON, increasing cookie size by 33%)

Additionally, it moves encryption to be on the full session state object once - instead of on a per field basis. This saves the 16 bytes where all encrypted fields store the initialization vector for each field.

LZ4 compression selected due to its super fast decompress performance. Since we likely mint a session cookie once, and then constantly read it (and lz4 decompress) for its lifespan -- lz4 had the best profile of various compression algorithms.

How Has This Been Tested?

Unit tests added for session state.

NOTE: leaving this PR in draft until I address oauth2proxy_test.go issues. Lots of unit tests rans without a cipher since that was a previously allowed codepath. This PR firmly removes the ability to have no cipher in SessionState code (matching what our options validator now does as well). Just need to update the affected unit tests.

Checklist:

  • My change requires a change to the documentation or CHANGELOG.
  • I have updated the documentation/CHANGELOG accordingly.
  • I have created a feature (non-master) branch for my PR.

@JoelSpeed
Copy link
Member

@NickMeves It would be good to see some comparison of compressed vs non-compressed sessions to justify the forced LZ4 compression, do you have any notes to show how much benefit we gain from this?

@NickMeves
Copy link
Member Author

NickMeves commented Jun 25, 2020

@NickMeves It would be good to see some comparison of compressed vs non-compressed sessions to justify the forced LZ4 compression, do you have any notes to show how much benefit we gain from this?

@JoelSpeed Here is the data I got in testing a few months back with my earlier POC. I think this data is still valid with this iteration of the design:

Some rough numbers after testing various sessions with a generic OIDC provider:
Current _oauth2_proxy cookie size: ~5000 bytes
Compressed: ~2700 bytes
Compressed (without LZ4 compression, just base64 tricks + messagePack): ~3500 bytes

I also tested configurable compression levels since our compress:decompress ratio is 1:many -- so I was hoping using compression level 9 (slowest but most space savings in theory) would be useful to our execution profile.

But it saved like 10 bytes vs lowest level, so that was a dud 😄

@NickMeves
Copy link
Member Author

If I recall, for the hot second I was base64 decoding the segments in the 3 tokens types if able to (header & payload usually) to yield better compression with like fields inside the base64 data got it down to like 2400 characters. So very minor improvement from the LZ4 of base64 tokens that got us to ~2700.

LZ4 does pretty well even on the base64 tokens since they all have such similar headers and payload leading characters that they turn into the same runs of base64 encoded data. JWT headers especially.

@NickMeves
Copy link
Member Author

I currently have the compression only happening on cookie sessions -- I made it an encode/decode parameter and didn't do it in the Redis store (since large sessions don't really matter there)

@JoelSpeed
Copy link
Member

Ok, sounds like an approximate 25% saving, sounds good to me! Do you want me to review this PR as is or wait until it's ready?

@NickMeves
Copy link
Member Author

You can start taking a peek -- I think everything that's here is decently ready. I just need to look at how much damage I did to the oauth2proxy_test.go by making ciphers mandatory in the encode/decode functions.

Another things I'm curious your thoughts @JoelSpeed -

Do we support legacy JSON sessions trickling in from V5 & V6.0 versions? Or do we rip the bandaid and only deal with messagepack moving forward and force a user reauthentication? I think most the complexity in this PR is from supporting a JSON fallback and support Redis's former custom encryption logic.

@JoelSpeed
Copy link
Member

Do we support legacy JSON sessions trickling in from V5 & V6.0 versions? Or do we rip the bandaid and only deal with messagepack moving forward and force a user reauthentication? I think most the complexity in this PR is from supporting a JSON fallback and support Redis's former custom encryption logic.

I will have to take a look and see how complex it is, ideally we would have migration logic so that we don't have to consider this a breaking change, but if it's particularly complex (or happens to land with a bunch of other breaking changes), then we may decide to drop it 🤔

@NickMeves
Copy link
Member Author

@JoelSpeed I did a big cleanup of oauthproxy_test.go in this commit: b0a1d64

Tests are passing now. I just want to leave it in draft for a bit longer until I have a chance to do some manual testing against a dummy httpbin to do some analysis on the cookies.

@NickMeves
Copy link
Member Author

NickMeves commented Jun 27, 2020

@JoelSpeed I did a big cleanup of oauthproxy_test.go in this commit: b0a1d64

Tests are passing now. I just want to leave it in draft for a bit longer until I have a chance to do some manual testing against a dummy httpbin to do some analysis on the cookies.

And I see you nearly did the same thing here after the fact 😄

https://github.com/oauth2-proxy/oauth2-proxy/pull/577/files#diff-0e9edf6c1eec1500bb43dfb219d10c52R1758

Great minds think alike!

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Added a few suggestions but nothing major I don't think

Please make sure to manually test the upgrades for this to see if they work, preferably for cookie and redis if you can

A lot of the test changes look very familiar, we are definitely going to have some clashes with some of my PRs 😅 Do you have time to review #577, I'd like to get that in and I think that might change some bits here

// Compress the packed encoding
// The Compress:Decompress ratio is 1:Many. LZ4 gives fastest decompress speeds
buf := new(bytes.Buffer)
zw := lz4.NewWriter(nil)
zw.Header = lz4.Header{
BlockMaxSize: 65536,
CompressionLevel: 0,
}
zw.Reset(buf)

reader := bytes.NewReader(packed)
_, err = io.Copy(zw, reader)
if err != nil {
return nil, err
}
err = zw.Close()
if err != nil {
return nil, err
}

compressed, err := ioutil.ReadAll(buf)
if err != nil {
return nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to extract this to a function called something like compress? (or alternate because that would clash with the function parameter) Would make the flow of the function nicer I think
This function could look like

// Marshal to MessagePack
packed, err := msgpack.Marshal(s)
if err != nil {
	return nil, err
}

if compress {
	packed, err = compress(packed)
	if err != nil {
		return nil, err
	}
}

return c.Encrypt(packed)

Comment on lines 120 to 97
// LZ4 Decompress
reader := bytes.NewReader(decrypted)
buf := new(bytes.Buffer)
zr := lz4.NewReader(nil)
zr.Reset(reader)
_, err = io.Copy(buf, zr)
if err != nil {
return nil, err
}

packed, err = ioutil.ReadAll(buf)
if err != nil {
return nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, would be good to see a decompress helper so the if statement can shrink down to

if compressed {
  packed, err = decompress(packed)
  if err != nil {
    return nil, err
  }
}

Comment on lines 154 to 155
// Backward compatibility with using unencrypted Email or User
// Decryption errors will leave original string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we should drop this now, since we are dropping support for unencrypted sessions in the next release? Everyone should be upgrading via v6 so this should be redundant I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do it! -- I have no qualms about using this as the opportunity to purge unencrypted sessions. We should roll this & #575 together then (after V6 gives a rollout window of encrypted sessions + SHA256 HMAC)

That would likely make the forced reauth for stragglers cleanest and result in no potential errors since their session will be tossed out with the SHA1 cookie failing to validate.

if tc.Error {
for name, tc := range testCases {
t.Run(name, func(t *testing.T) {
var err error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well declare this inline on 205 _, err := ...

@@ -353,3 +274,24 @@ func TestIntoEncryptAndIntoDecrypt(t *testing.T) {
})
}
}

func compareSessionStates(t *testing.T, left *SessionState, right *SessionState) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice extraction to a helper 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Those pesky Time based fields made a pure object compare not work as I had hoped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than left, right, we should call these expected and got? Or expected and actual? That would tie up with the assert right? I think in assert the parameter order is t testing.T, expected interface{}, actual interface{}?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At one point I use this helper to compare if taking the encode/decode route with 2 different paths still results in sessions states that are equal:

compareSessionStates(t, decoded, decodedCompressed)

That is the only reason I didn't use the expected terminology. But I suppose I am expecting the decoded without compression to equal the decoded with compression -- so I can switch to that naming convention

@NickMeves NickMeves marked this pull request as ready for review June 27, 2020 19:56

// legacyV5LoadSessionFromTicket loads the session based on the ticket value
// This falls uses V5 style encryption of Base64 + AES CFB
func (store *SessionStore) legacyV5LoadSessionFromTicket(ctx context.Context, value string) (*sessions.SessionState, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I think I need to circle back on this one later. I don't think I have test coverage for this legacy fallback.

@JoelSpeed
Copy link
Member

@NickMeves can you do a rebase on latest master and fix the CI issue? I'd like to see the tests passing on this. Gonna give another review now anyway

Comment on lines 126 to 132
if err != nil || !utf8.ValidString(*s) {
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the decryption succeeded but the string isn't valid UTF8, err is nil here so we return nil, nil, I don't think that's the intention, perhaps this should be split into two if statements?

Comment on lines 172 to 189
compressed, err := ioutil.ReadAll(buf)
if err != nil {
return nil, err
}

return compressed, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're not embellishing the error message with any extra detail, you could just return ioutil.ReadAll(buf)

func DecodeSessionState(data []byte, c encryption.Cipher, compressed bool) (*SessionState, error) {
decrypted, err := c.Decrypt(data)
if err != nil {
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth embellishing the errors with some extra data so we can tell where the errors came from, eg:

Suggested change
return nil, err
return nil, fmt.Errorf("error decrypting data: %v, err")

I would suggest doing that for the three error returns in this method as a minimum, preferably also in the encode too

Comment on lines 190 to 208
payload, err := ioutil.ReadAll(buf)
if err != nil {
return nil, err
}

return payload, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, could just return, WDYT?

Comment on lines 100 to 149
for name, ss := range testCases {
t.Run(name, func(t *testing.T) {
encoded, err := ss.EncodeSessionState(c, false)
assert.NoError(t, err)
encodedCompressed, err := ss.EncodeSessionState(c, true)
assert.NoError(t, err)
assert.GreaterOrEqual(t, len(encoded), len(encodedCompressed))

decoded, err := DecodeSessionState(encoded, c, false)
assert.NoError(t, err)
decodedCompressed, err := DecodeSessionState(encodedCompressed, c, true)
assert.NoError(t, err)

compareSessionStates(t, decoded, decodedCompressed)
compareSessionStates(t, decoded, &ss)
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth possibly checking that trying to decode with a different cipher results in an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be sufficient covered already by these tests in the cipher section, what do you think?

func TestGCMtoCFBErrors(t *testing.T) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the middle encryption layer in these tests is sorta passed over and we never explicitly test changing the cipher breaks the functions (vs our compression size & different output checks in the middle stage).

I'll give it a whirl adding it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up doing this -- and took the opportunity to test with all valid AES secret lengths and ran the tests through all our potential ciphers. Before I was using the const secret & just the AES-CFB cipher.

@@ -353,3 +274,24 @@ func TestIntoEncryptAndIntoDecrypt(t *testing.T) {
})
}
}

func compareSessionStates(t *testing.T, left *SessionState, right *SessionState) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than left, right, we should call these expected and got? Or expected and actual? That would tie up with the assert right? I think in assert the parameter order is t testing.T, expected interface{}, actual interface{}?

@NickMeves NickMeves requested a review from a team as a code owner July 3, 2020 17:18
@NickMeves
Copy link
Member Author

a4ee514 adds the unit tests for the legacy redis session decoding.

I tried to get the common test cases unified under tests/legacy_helpers.go. But I'm not in sold that me trying to DRY myself didn't result in more headaches... had to change a bunch of stuff to then avoid cyclic imports.

@NickMeves
Copy link
Member Author

For the codeclimate issues - I want to fix those in a separate PR that moves the persistent store serialization + cookie ticket logic to a central place so all stores can use it (and just need to worry about how to Save/Load/Del key + bytes)

@NickMeves
Copy link
Member Author

NickMeves commented Jul 10, 2020

@JoelSpeed This is ready for the finish line when you have time for a final look.

Ran the following interactive tests with the docker-compose environment to confirm the legacy fallback worked as planned:
Cookies: v5.1.0 & v6.0.0 -> this 👍
Redis: v5.1.0 & v6.0.0 -> this 👍

Cookie v5.1.0 settings that didn't result in tokens being sent (ie no cookie_refresh or pass_authorization_header) caused an error that resulted in a reauthentication (as per our discussion in this PR to remove support for that legacy case)

As far as compression with the docker-compose contrib environment:
Cookie went from about ~3300 -> ~1800

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic all looks good, left some stylistic comments

- [#632](https://github.com/oauth2-proxy/oauth2-proxy/pull/632) There is backwards compatibility to sessions from v5
- Any unencrypted sessions from before v5 that only contained a Username & Email will trigger a reauthentication

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this count as a breaking change or an important note? It's not breaking from v6.0.0 is it? Would this require a major version bump or minor?

Copy link
Member Author

@NickMeves NickMeves Jul 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll move to Important Notes -- it is only semi-breaking (is a reauth breaking...?) from a subset of v5 users that had a valid v6 cookie-secret but didn't have the 4 options that triggered a cipher to be set. It will cause a reauthentication for those JSON sessions that come in with only an unencrypted user/email.

All other v5 use cases had options that had full sessions that would be handled with the fallback or other issues that other v6 changes forced them to re-authentication anyway

func (s *SessionState) EncodeSessionState(c encryption.Cipher, compress bool) ([]byte, error) {
packed, err := msgpack.Marshal(s)
if err != nil {
return nil, errors.Wrap(err, "error marshalling session state to msgpack")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the built in errors package if you are going to wrap errors, preferred style is fmt.Errorf("<descr>: %w", err). Is there any reason to wrap this error? Are we unwrapping it later somewhere?

var ss SessionState
err = msgpack.Unmarshal(packed, &ss)
if err != nil {
return nil, errors.Wrap(err, "error unmarshalling data to session state")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment re error wrapping as above

reader := bytes.NewReader(payload)
_, err := io.Copy(zw, reader)
if err != nil {
return nil, errors.Wrap(err, "error copying lz4 stream to buffer")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above re error wrap

}
err = zw.Close()
if err != nil {
return nil, errors.Wrap(err, "error closing lz4 writer")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above re error wrap

@@ -1,6 +1,9 @@
package redis
package redis_test
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to making this part of the test package? I never really understood why this helps, I assume the compiler ignores all the test code anyway?

Copy link
Member Author

@NickMeves NickMeves Jul 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna revisit the legacy_helper -- I was thinking trying to consolidate the legacy test cases into 1 area would help DRY up the code... but it caused more headaches than I think its worth for tests we'll likely fully deprecate in V7 and rip out.

It caused circular imports -- which made me need to change the package of the test code that consumed them to *_test to avoid the circular import compile error.

)

func TestLegacyV5DecodeSession(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the rest of this test suite uses Ginkgo and Gomega, have you considered using either of those?

You can use gomega assertions withiout Ginkgo by initialising a new Gomega

g := NewWithT(t)
err := ...
g.Expect(err).ToNot(HaveOccurred())

t.Run("", func (t *testing.T) {
  // Use a subtest gomega to fail just the subtest
  gs := NewWithT(t)
  ...
})

Not a blocker, but I would prefer the consistency

ciphertext := make([]byte, len(value))
block, err := aes.NewCipher(ticket.Secret)
if err != nil {
return nil, fmt.Errorf("error initiating cipher block %s", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, fmt.Errorf("error initiating cipher block %s", err)
return nil, fmt.Errorf("error initiating cipher block: %v", err)

@@ -0,0 +1,102 @@
package tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these only used for redis? Should we keep them in the redis package if so?

Comment on lines 1466 to 1484
if result == hmacauth.ResultNoSignature {
w.Write([]byte("no signature received"))
_, err = w.Write([]byte("no signature received"))
} else if result == hmacauth.ResultMatch {
w.Write([]byte("signatures match"))
_, err = w.Write([]byte("signatures match"))
} else if result == hmacauth.ResultMismatch {
w.Write([]byte("signatures do not match:" +
_, err = w.Write([]byte("signatures do not match:" +
"\n received: " + headerSig +
"\n computed: " + computedSig))
} else {
panic("Unknown result value: " + result.String())
panic("unknown result value: " + result.String())
}
if err != nil {
panic(err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could make this a switch that sets the value of a string, then handle the error writing afterwards, WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give it a quick whirl! But I think this will get a much bigger overhaul when I tackle all this with this issue: #648

Thinking back, I feel like a few PRs in the past have added headers to the SignatureHeaders yet didn't mention that this would be a breaking change requiring any users of this functionality to update how the backend code validated the signature.

So I'm really starting to think no one uses this -- or any few users are very advanced and know to check the code itself to see what headers they need to hmacauth on the backend (since we don't document it).

@NickMeves
Copy link
Member Author

NickMeves commented Jul 11, 2020

@JoelSpeed This commit takes care of the circular import issue and gets us back to the old way where our session & redis tests can be in those packages and have access to unit test the internal methods again:

61e0f2a

Only downside is it sits in the sessions package directly so the CreateLegacyV5TestCases function is exposes publicly. But the moment I don't have that in the sessions package, using a SessionState in the Output of the test table would need a sessions import -- making a circular imports when sessions tests import this test creator method 🤷‍♂️

This feels like the least bad option while still DRYing up these repeated legacy V5 tests?

@JoelSpeed
Copy link
Member

There's a linter issue, need to make the type it's complaining about public, which is ok as this will be temporary. Other than that, and the changelog needing a conflict resolution, all looks good, let's get those fixed and get this merged!

Nick Meves added 16 commits July 13, 2020 12:04
Assumes ciphers are now mandatory per oauth2-proxy#414. Cookie & Redis sessions
can fallback to V5 style JSON in error cases. TODO: session_state.go
unit tests & new unit tests for Legacy fallback scenarios.
More aggressively assert.NoError on all
validation.Validate(opts) calls to enforce legal
options in all our tests.
Add additional NoError checks wherever error return
values were ignored.
Invalid CFB decryptions can result in garbage data
that 1/100 times might cause message pack unmarshal
to not fail and instead return an empty session.
This adds more rigor to make sure legacy sessions
cause appropriate errors.
Refactor common legacy JSON test cases to a
legacy helpers area under session store tests.
Placing these helpers under the sessions pkg removed
all the circular import uses in housing it under the
session store area.
@NickMeves
Copy link
Member Author

There's a linter issue, need to make the type it's complaining about public, which is ok as this will be temporary. Other than that, and the changelog needing a conflict resolution, all looks good, let's get those fixed and get this merged!

@JoelSpeed Ready to roll! 🎉

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants