Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to read from bepop from file content #46

Closed
matejsp opened this issue Jul 1, 2024 · 12 comments
Closed

Unable to read from bepop from file content #46

matejsp opened this issue Jul 1, 2024 · 12 comments

Comments

@matejsp
Copy link

matejsp commented Jul 1, 2024

Given the following schema and using: github.com/200sc/bebop v0.6.0:

message TestBugInline {
    1 -> bool OnOff;
}

message TestBug {
    1 -> TestBugInline Bla;
}

And code to write and read from schema:

func TestUnit_SnapshotFlow(t *testing.T) {

	t.Run("test bug", func(t *testing.T) {
		tmpFile, err := os.Create("test.bepop")

		onOff := true
		inline := encoding.TestBugInline{
			OnOff: &onOff,
		}
		snapshotLE := encoding.TestBug{
			Bla: &inline,
		}
		err = snapshotLE.EncodeBebop(tmpFile)
		assert.NoError(t, err)

		// reading to []byte works as expected
		reader, err := os.Open(tmpFile.Name())
		buf := make([]byte, 1024)
		n, err := reader.Read(buf)
		assert.NoError(t, err)
		buf = buf[:n]
		snapshotLE = encoding.TestBug{}
		err = snapshotLE.UnmarshalBebop(buf)
		assert.NoError(t, err)
		assert.True(t, *snapshotLE.Bla.OnOff)

                 // reading from Reader interface return error
		reader, err = os.Open(tmpFile.Name())
		snapshotLE = encoding.TestBug{}
		err = snapshotLE.DecodeBebop(reader)
		assert.True(t, *snapshotLE.Bla.OnOff)
		assert.NoError(t, err)
	})
}

I get:

        	Error:      	Received unexpected error:
        	            	EOF

Now the file content is as follows:
image

@matejsp
Copy link
Author

matejsp commented Jul 1, 2024

09 00 00 00 -> length of TestBug mesage

01 -> index of Bla field

03 00 00 00 -> length of TestBugInline
01 -> index of OnOff
01 -> boolean value of OnOff
00 -> end of TestBugInline message

00 -> end of TestBug message

@matejsp
Copy link
Author

matejsp commented Jul 1, 2024

I have debugged bepop lib a bit more and noticed that what actually happens is that ErrorReader is not wrapped again in case it is already ErrorReader (case for nested messages).

func NewErrorReader(r io.Reader) *ErrorReader {
	if er, ok := r.(*ErrorReader); ok {
		return er
	}
	return &ErrorReader{
		Reader: r,
		buffer: make([]byte, 8),
	}
}

So basically the following lines corrupts ErrorReader from parent DecodeBebop call.

r := iohelp.NewErrorReader(ior)
bodyLen := iohelp.ReadUint32(r)
r.Reader = &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}

I fixed the autogenerated code by unwrapping the ErrorReader in case it is already ErrorReader:

if er, ok := ior.(*iohelp.ErrorReader); ok {
	ior = er.Reader
}
r := iohelp.NewErrorReader(ior)
bodyLen := iohelp.ReadUint32(r)
r.Reader = &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}

or another possible solution is:

r := iohelp.NewErrorReader(nil)
r.Reader = ior
bodyLen := iohelp.ReadUint32(r)
r.Reader = &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}

@200sc
Copy link
Owner

200sc commented Jul 2, 2024

I can run the test and reproduce the failure. But, we do not want nested messages to allocate their own error reader. I'll look into it.

@200sc
Copy link
Owner

200sc commented Jul 2, 2024

@matejsp This is working as intended right now; the EOF is accurate, the file reached its end. Many libraries that work with readers return EOF in success cases; in Bebop's case we return io.ErrUnexpectedEOF if there's a problem.

I'm open to discarding EOF errors from Decode methods if the method is at a valid endpoint, if you want to write that PR.

@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

@200sc
Problem is that EOF is not accurate because LimitedReader from nested message is leaking into parent (since you are changing the reader instance inside ErrorReader. This LimitedReader is corrupted after the call with having N=0 remaining bytes left (should be 1 for 0 byte) in the parent decode.

TestBug decode:
ErrorReader {
   LimitedReader{N: 7, R: reader}
}

Entering nested message call:
ErrorReader {
   LimitedReader{N: 4, R: LimitedReader(7, R: reader)
}

Reading couple of bytes from reader

Exiting nested message decode call:
ErrorReader {
   LimitedReader{N: 0, R: LimitedReader(N: 1, R: reader)
}

-> error in TestBug decode because N = 0 ... so we hit EOF error.

The correct ErrorReader struct after nested call would be:
ErrorReader {
   LimitedReader(N: 1, R: reader)
}

If you don't want to nested message to have its own ErrorReader ...
we need to remove outer LimitedReader after exit from inner message.

Perhaps alternative solution would be to return Reader instance back to its original value using defer:

func (bbp *TestBug) DecodeBebop(ior io.Reader) (err error) {
	r := iohelp.NewErrorReader(ior)
	bodyLen := iohelp.ReadUint32(r)
	r.Reader = &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}
	defer func() { r.Reader = ior }()
	for {
		switch iohelp.ReadByte(r) {
		case 1:
			...
		default:
			r.Drain()
			return r.Err
		}
	}
}

or after draining the outer limited reader:

func (bbp *TestBug) DecodeBebop(ior io.Reader) (err error) {
	r := iohelp.NewErrorReader(ior)
	bodyLen := iohelp.ReadUint32(r)
	r.Reader = &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}
	for {
		switch iohelp.ReadByte(r) {
		case 1:
			...
		default:
			r.Drain()
                        r.Reader = ior 
			return r.Err
		}
	}
}

@200sc 200sc mentioned this issue Jul 2, 2024
@200sc
Copy link
Owner

200sc commented Jul 2, 2024

@matejsp Please review the linked PR with a patch.

@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

@200sc I have tested and yes by reseting limitReader on each for loop before reading ReadByte) is preserving the limited reader and it works in our case so this resolves this bug.

@200sc
Copy link
Owner

200sc commented Jul 2, 2024

Merged, v0.6.1 including this patch is released. Thank you for the report!

@200sc 200sc closed this as completed Jul 2, 2024
@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

@200sc It is still not ok for one case when you read in a for loop:

5 -> Instruction[] Instructions;

Code:

	limitReader := &io.LimitedReader{R: r.Reader, N: int64(bodyLen)}
	for {
		r.Reader = limitReader
		switch iohelp.ReadByte(r) {
		case 1:
			bbp.KillSwitchOn = new(bool)
			*bbp.KillSwitchOn = iohelp.ReadBool(r)
		case 2:
			bbp.Instructions = new([]Instruction)
			*bbp.Instructions = make([]Instruction, iohelp.ReadUint32(r))
			for i := range *bbp.Instructions {
                                // missing another r.Reader = limitReader
				((*bbp.Instructions)[i]), err = MakeInstruction(r)
				if err != nil {
					return err
				}
			}

@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

@200sc Can you reopen the issue please.

@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

I was thinking about the solution. What I don't like is that you are fixing the Reader from call sites instead of making DecodeBepop function non destructive.

I tried the following solution and it works for both cases (single message and slices):
image

@200sc 200sc reopened this Jul 2, 2024
@200sc 200sc mentioned this issue Jul 2, 2024
@matejsp
Copy link
Author

matejsp commented Jul 2, 2024

@200sc
I have regenerated the code with the following changes:

image

=
And run our tests and all pass now. So it seems that now we have a perfect fix :D

@200sc 200sc closed this as completed Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants