
write blob based on session id #4615

Merged
merged 6 commits into from
Apr 10, 2024

Conversation

butonic (Contributor) commented Apr 5, 2024

I'm trying to fix owncloud/ocis#8606.

This PR will now always use the session blobid to determine the target destination.

I also changed the signature of the Blobstore interfaces to get rid of the *node.Node struct in the parameters.

However, I no longer pass the expected blobsize to Download(). Only the s3 blobstore would use it to check that it got the correct amount of bytes. IMO that should be the responsibility of the caller.

And for that Blobstore interface change I had to first fix the mock generation: #4614

fun ... all day long ...

I tried to grasp what happens now when two uploads are finished in reverse order. AFAICT the size diff needs to be tested. Maybe we can even determine the size change later ... some thoughts are currently in the code:

in pkg/storage/utils/decomposedfs/upload/session.go

// NodeExists returns whether or not the node existed during InitiateUpload.
// FIXME If two requests try to write the same file they both will store a new
// random node id in the session and try to initialize a new node when
// finishing the upload. The second request will fail with an already exists
// error when trying to create the symlink for the node in the parent directory.
// A node should be created as part of InitiateUpload. When listing a directory
// we can decide if we want to skip the entry, or expose upload progress
// information. But that is a bigger change and might involve client work.
func (s *OcisSession) NodeExists() bool {
	return s.info.Storage["NodeExists"] == "true"
}

and again in pkg/storage/utils/decomposedfs/upload/store.go

	if session.NodeExists() { // TODO this is wrong. The node should be created when the upload starts, the revisions should be created independently of the node
		// we do not need to propagate a change when a node is created, only when the upload is ready.
		// that still creates problems for desktop clients because if another change causes propagation it will detect an empty file
		// so the first upload has to point to the first revision with the expected size. The file cannot be downloaded, but it can be overwritten (which will create a new revision and make the node reflect the latest revision)
		// any finished postprocessing will not affect the node metadata.
		// *thinking* but then initializing an upload will lock the file until the upload has finished. That sucks.
		// so we have to check if the node has been created meanwhile (well, only in case the upload does not know the nodeid ... or the NodeExists array that is checked by session.NodeExists())
		// FIXME look at the disk again to see if the file has been created in between, or just try initializing a new node and do the update existing node as a fallback. <- the latter!

		f, err = store.updateExistingNode(ctx, session, n, session.SpaceID(), uint64(session.Size()))
		if f != nil {
			appctx.GetLogger(ctx).Info().Str("lockfile", f.Name()).Interface("err", err).Msg("got lock file from updateExistingNode")
		}
	} else {
		f, err = store.initNewNode(ctx, session, n, uint64(session.Size()))
		if f != nil {
			appctx.GetLogger(ctx).Info().Str("lockfile", f.Name()).Interface("err", err).Msg("got lock file from initNewNode")
		}
	}

That is why this PR is in draft. I think it is less wrong, but I don't want to merge it yet, because I haven't fully understood the implications of this change.

I'd like to better understand, and actually test, the corner cases when two uploads are trying to create the same new file, and how they can be safely handled without causing data loss. That has a lot of variability, e.g. when the request was made with or without preconditions ...


update-docs bot commented Apr 5, 2024

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

Signed-off-by: Jörn Friedrich Dreyer <[email protected]>
butonic (Contributor, Author) commented Apr 9, 2024

Fixing this opens up the possibility of metadata corruption. The following test will fail when listing the directory after the first upload has finished postprocessing with outcome delete (or abort), because the file is completely deleted.

		It("correctly calculates the size when the second upload finishes, even if the first is deleted", func() {
			// finish postprocessing of second upload
			con <- events.PostprocessingFinished{
				UploadID: secondUploadID,
				Outcome:  events.PPOutcomeContinue,
			}
			// wait for upload to be ready
			ev, ok := (<-pub).(events.UploadReady)
			Expect(ok).To(BeTrue())
			Expect(ev.Failed).To(BeFalse())

			// check processing status
			resources, err := fs.ListFolder(ctx, rootRef, []string{}, []string{})
			Expect(err).ToNot(HaveOccurred())
			Expect(len(resources)).To(Equal(1))

			item := resources[0]
			Expect(item.Path).To(Equal(ref.Path))
			Expect(utils.ReadPlainFromOpaque(item.Opaque, "status")).To(Equal(""))

			// size should match the second upload
			Expect(item.Size).To(Equal(uint64(len(file2Content))))

			// finish postprocessing of first upload
			con <- events.PostprocessingFinished{
				UploadID: uploadID,
				//				Outcome:  events.PPOutcomeDelete, // This will completely delete the file
				Outcome: events.PPOutcomeAbort, // This as well ... fck
			}
			// wait for upload to be ready
			ev, ok = (<-pub).(events.UploadReady)
			Expect(ok).To(BeTrue())
			Expect(ev.Failed).To(BeTrue())

			// check processing status
			resources, err = fs.ListFolder(ctx, rootRef, []string{}, []string{})
			Expect(err).ToNot(HaveOccurred())
			Expect(len(resources)).To(Equal(1))

			// size should still match the second upload
			Expect(item.Size).To(Equal(uint64(len(file2Content))))
		})

I need to add logic that prevents that ... and correctly calculates the size. I'm still worried the sizeDiff is reverted incorrectly in corner cases.

Signed-off-by: Jörn Friedrich Dreyer <[email protected]>
Signed-off-by: Jörn Friedrich Dreyer <[email protected]>
@butonic butonic marked this pull request as ready for review April 9, 2024 13:53
@butonic butonic requested review from a team, labkode and glpatcern as code owners April 9, 2024 13:53
butonic (Contributor, Author) commented Apr 9, 2024

@micbar we should backport this to ocis 5

-func (bs *Blobstore) path(node *node.Node) (string, error) {
-	if node.BlobID == "" {
-		return "", fmt.Errorf("blobstore: BlobID is empty")
+func (bs *Blobstore) path(spaceID, blobID string) (string, error) {
Contributor

@butonic This changed signature is actually a problem for the posixfs driver, as it doesn't store blobs separately from the actual nodes. In the feature branch it now uses node.InternalPath() to get the path to read the blob from, in contrast to what this copy-pasta stub implementation in edge does.

I guess we could set the blob id to the node id and use that to look up the path, but you could also argue that the blobstore interface works with Nodes and shouldn't assume that blob ids are gonna be used.

butonic (Contributor, Author)

IMO the blobstore interface is way too constrained. We already have a hack in the ocis filestore implementation that 'moves' the file. s3ng could use a reader. posix might also want to move ... well, it has to copy the bytes because of the user id.

Taking downloads into account we may want to forward range request headers to the actual s3 backend.

So maybe a parameter struct that carries the different properties makes sense?

	if err != nil {
		log.Error().Err(err).Str("node", n.ID).Str("uploadID", ev.UploadID).Msg("reading node for session failed")
	}
	if latestSession == session.ID() {
Contributor

Hm, why do we only revert the size diff if the failed session is the latest session? At a first glance we seem to unconditionally apply the size diff for all uploads, no?

Contributor

oh, I see. The size diff that is propagated before the postprocessing starts is calculated against the current size of the node.

hm... which could be either the original size or the size of the currently being processed upload, depending on the timing. Ugh...

butonic (Contributor, Author)

When initiating an upload we always update the file size. If a new upload is initiated after postprocessing of the first has started but before it ended, we must not change the size back: the new upload session's size diff is based on the current size of the node, so reverting the first diff would make the node size wrong. That is why we must not revert it.

Signed-off-by: Jörn Friedrich Dreyer <[email protected]>
Signed-off-by: Jörn Friedrich Dreyer <[email protected]>
@butonic butonic merged commit 767b090 into cs3org:edge Apr 10, 2024
9 checks passed
@butonic butonic deleted the fix-blobstore branch April 10, 2024 14:58
micbar (Member) commented Apr 23, 2024

@butonic Please backport, that was not released yet.
