Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We must block on all pending work during flush() #39

Open
dwlsalmeida opened this issue Jul 11, 2023 · 1 comment
Open

We must block on all pending work during flush() #39

dwlsalmeida opened this issue Jul 11, 2023 · 1 comment

Comments

@dwlsalmeida
Copy link
Collaborator

This appears to be a bug in our implementation. If we flush, we must

a) submit any leftover work
b) block on said work before returning

Notice that after we return from flush, a lot of our state is (or will be) invalid, so we must complete the work before returning.

For h.264, for example, the current code will not check cur_pic_opt, nor will it block on it. This means that if (for whatever reason), cur_pic_opt is Some, that picture will be lost.

@dwlsalmeida
Copy link
Collaborator Author

dwlsalmeida commented Jul 27, 2023

This part seems to be fixed:

For h.264, for example, the current code will not check cur_pic_opt, nor will it block on it. This means that if (for whatever reason), cur_pic_opt is Some, that picture will be lost.

        // Finish the current picture if there is one pending.
        if let Some(cur_pic) = self.codec.current_pic.take() {
            self.finish_picture(cur_pic)?;
        }

But the core issue here still remains. Actually, the more I think about this, the more I see that the problem is in a lower layer.

We should try to avoid this scenario:

let a = <some PictureSync>
let b = <some PictureEnd, uses A as reference>
drop(a); // Race condition: was B actually done with the picture before we reclaimed A's surface?

With the current code, there is no way to tell Rust of B's relationship on A. In particular, as we're simply sending their surface IDs to the libva driver using an unsafe block, no guarantees apply.

A way to solve this is to patch this guy:

/// Inner type for [`Picture`], that is, the part that exists in all states.
struct PictureInner<T> {
    /// Timestamp of the picture.
    timestamp: u64,
    /// A context associated with this picture.
    context: Rc<Context>,
    /// Contains the buffers used to decode the data.
    buffers: Vec<Buffer>,
    /// Contains the actual decoded data. Note that the surface may be shared in
    /// interlaced decoding.
    surface: Rc<T>,
    /// The list of surfaces we depend on. These will not be dropped nor reused until we release our reference.
    dependencies: Vec<Rc<T>>,
}

We can populate dependencies because we get access to the Picture backing the references in all backends:

    fn build_pic_param(
        _: &Slice<impl AsRef<[u8]>>,
        current_picture: &PictureData,
        current_surface_id: libva::VASurfaceID,
        dpb: &Dpb<VADecodedHandle<M>>,
        rps: &RefPicSet<VADecodedHandle<M>>,
        sps: &Sps,
        pps: &Pps,
    ) -> anyhow::Result<(BufferType, [PictureHEVC; 15])> {
        let curr_pic = Self::fill_va_hevc_pic(current_picture, current_surface_id, rps);

        let mut reference_frames = vec![];

        for ref_pic in dpb.get_all_references() {
            let surface_id = ref_pic.1.borrow().surface_id();  <--------- we can get the Picture associated with ref_pic here if we want
            let ref_pic = Self::fill_va_hevc_pic(&ref_pic.0.borrow(), surface_id, rps);
            reference_frames.push(ref_pic);
        }

Under my proposed solution, we could, for example:

for ref_pic in dpb.get_all_references() {
            let surface_id = ref_pic.1.borrow().surface_id();
            current_picture.add_dependency(&ref_pic.1.borrow()); // Ok, we hold a reference and this surface will not be reused/dropped
            let ref_pic = Self::fill_va_hevc_pic(&ref_pic.0.borrow(), surface_id, rps);
            reference_frames.push(ref_pic);
        }

Now, after we transition from PictureEnd to PictureSync we do not care about dependencies anymore because all data has been processed by the driver already. So in the transition from PictureEnd to PictureSync we can clear them:

impl<T> Picture<PictureEnd, T> {
    /// Syncs the picture, ensuring that all pending operations are complete when this call returns.
    pub fn sync<D: SurfaceMemoryDescriptor>(
        self,
    ) -> Result<Picture<PictureSync, T>, (VaError, Self)>
    where
        T: Borrow<Surface<D>>,
    {
        let res = self.surface().sync();
        self.dependencies.clear(); // OK, surfaces are free to be used again

        match res {
            Ok(()) => Ok(Picture { // Or alternatively, we should clear them here, because the user may retry on Err
                inner: self.inner,
                phantom: PhantomData,
            }),
            Err(e) => Err((e, self)),
        }
    }
}

@Gnurou WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant