Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

We can possibly get rid of a lot of generics through Cow<'_, [u8]> #49

Open
dwlsalmeida opened this issue Aug 11, 2023 · 7 comments
Open
Labels
enhancement New feature or request

Comments

@dwlsalmeida
Copy link
Collaborator

I am experimenting with a new design for the AV1 code. One that uses Cow<'_, [u8]> in place of T: AsRef<[u8]>. This means that we can remove this T: AsRef<[u8]> noise from a lot of functions and traits, while making it possible to still build Slices, Frames and etc out of both borrowed and owned data without copying.

What's more, Cow<'_, [u8]> dereferences to &[u8] so nothing fundamentally changes in the code as far as I can see. Not only that, but we de facto do not even support anything other than &[u8] in the decoders, as introducing T in our backend traits would make them not object safe. Using Cow<'_, [u8]> circumvents this problem, as there's no generics involved and lifetimes do not break object safety.

If the current AV1 design proves successful, we should maybe backport these changes to the other codecs as a way to clean up the codebase by a considerable amount.

@dwlsalmeida dwlsalmeida added the enhancement New feature or request label Aug 11, 2023
@dwlsalmeida dwlsalmeida changed the title We can possibly get rid of a lot of generics through Cow We can possibly get rid of a lot of generics through Cow<'_, [u8]> Aug 11, 2023
@Gnurou
Copy link
Collaborator

Gnurou commented Sep 25, 2023

IIRC the point of using AsRef<[u8]> was to allow any input type that can provide a reference to the stream as input. Switching to Cow would mean that the data now must be in a Vec<u8>, which is probably going to be the case 99% of the time indeed, but might force us to perform a copy if that condition is not met.

Would using Cow allow us to avoid a copies in cases where AsRef<[u8]> makes us do one? That would be the case for using Cow in the code. Otherwise I suspect we can remove all the generics if we force NaluReader to with with a &[u8] and remove its own generic. Since all NaluReaders are build and used locally, I think that should work. #51 does this, is that all the simplification you had in mind?

@dwlsalmeida
Copy link
Collaborator Author

dwlsalmeida commented Sep 25, 2023

@Gnurou Hi Alex!

My idea is

pub data: Cow<'a, [u8]>,

By switching to Cow, we remove the T as you did in #51, but users can switch to an owned variant of the slice/frame/obu etc through to_owned if they so desire by paying for the copy.

Note that the input is still &[u8], so they can still use the borrowed versions cost free, which is what we currently do anyways.

pub fn parse_obu<'a>(&mut self, data: &'a [u8]) -> anyhow::Result<Option<Obu<'a>>> {

This will also remove this <[u8]> from many parts of the code such as

    /// Called to dispatch a decode operation to the backend.
    #[allow(clippy::too_many_arguments)]
    fn decode_slice(
        &mut self,
        picture: &mut Self::Picture,
        slice: &Slice<&[u8]>, <-----------------------
        sps: &Sps,
        pps: &Pps,
        dpb: &Dpb<Self::Handle>,
        ref_pic_list0: &[DpbEntry<Self::Handle>],
        ref_pic_list1: &[DpbEntry<Self::Handle>],
    ) -> StatelessBackendResult<()>;

To summarize:

a) we keep the number of copies equal
b) we provide a convenience for a price (as the parsers can be used by any other project, in theory)
c) we clean up our code

@Gnurou
Copy link
Collaborator

Gnurou commented Oct 3, 2023

Thanks for the clarifications. A few remarks:

The current AV1 parser does not need to use Cow - i.e. line 107 that you quoted can just as well be

pub data: &'a [u8], 

And the code builds just as fine. Users that need their own version of the data can use e.g. Vec's From<&[u8]> to achieve the same effect. Cow is typically used in scenarios where you might need to change some borrowed data (and thus create your own copy) at runtime. This is not the scenario we are facing with the parsers, where the data is always static.

That being said, the base idea is still valid and we should consider using it in other parsers - only we probably just need to replace AsRef<[u8]> by &'a [u8].

Both Cow<'a, [u8]> and &'a [u8] introduce a lifetime, so we would need to make sure that this works well with the existing parsers, but if AV1 can accomodate it I suppose the others could too.

@Gnurou
Copy link
Collaborator

Gnurou commented Oct 3, 2023

So I tried to implement this a bit and I think I understand what you want - I have updated PR #51 to include the change to &'a [u8]. Please let me know if that's what you had in mind.

Introducing Cow<'a, [u8]> would not simplify things any further IMHO since the lifetime would still be there anyway.

@dwlsalmeida
Copy link
Collaborator Author

dwlsalmeida commented Oct 6, 2023

@Gnurou in terms of removing the generics, that was mostly what I had in mind, yes. As for Cow:

The current AV1 parser does not need to use Cow

Oh it doesn't need to, yes. It's just an added convenience for users. I explain better below.

Users that need their own version of the data can use e.g. Vec's From<&[u8]> to achieve the same effect.

Not sure I understand. For example:

pub struct Frame<'a> {
     /// The bitstream data for this frame.
     bitstream: &'a [u8],

You can't have an owned version of Frame. Even Using Vec::From<&[u8]> would not help, because

a) that will create an owned version of the bitstream, not an owned version of Frame.
b) the resulting Vec can contain data for multiple frames.

The above assumes that you mean using Vec::from passing in bitstream here:

   /// Parse a single frame from the chunk in `data`.
    pub fn parse_frame<'a>(&mut self, bitstream: &'a [u8]) -> anyhow::Result<Frame<'a>> {
Introducing Cow<'a, [u8]> would not simplify things any further IMHO since the lifetime would still be there anyway.

True you can't get rid of lifetimes, but you can get owned Frames/Slices/Tiles etc with Cow if you want through to_owned().

To make my point a bit more clear (sorry for being a bit verbose here)

let frame: Cow<'static, u8> = Frame {
  bitstream: frame.bitstream.to_owned(),
  ..frame
  };

With the design I introduced in the AV1 parser, this will also only clone the part of the bitstream that generated the parsed Frame/slice/tile/etc, as we subslice the bitstream when creating the Cow:

        Ok(Some(Obu {
             header,
             data: Cow::from(&data[..start_offset + obu_size]),
             start_offset,
             size: obu_size,
         }))
     }

@Gnurou
Copy link
Collaborator

Gnurou commented Oct 13, 2023

Mmm one thing I don't quite understand is, since the lifetime remain (and thus an owned Frame will be subject the the same lifetime as a borrowed one), and we don't modify the bitstream anyway, what is the point of making a copy of the bitstream data?

@dwlsalmeida
Copy link
Collaborator Author

dwlsalmeida commented Oct 13, 2023

@Gnurou

since the lifetime remain (and thus an owned Frame will be subject the the same lifetime as a borrowed one)

Correct me if I am wrong, but if you call to_owned, you get Frame<'static>, so no.

Playground link to corroborate my point above: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=950ef57fd86788da7a22901353696124

and we don't modify the bitstream anyway, what is the point of making a copy of the bitstream data?

Our code still creates a borrowed version e.g.:

             data: Cow::from(&data[..start_offset + obu_size]),

So no copies are being made. Again, a mere convenience if anyone wants to have an owned Frame/slice/tile, at the cost of copying the data. Frankly, #51 also works, because it does away with the generic parameter already. If you want to keep it at that (i.e.: if you feel Cow is not worth the trouble here), then I am also OK with that.

One thing that comes to mind is this excerpt from the h.265 VAAPI code:

#[derive(Default)]
pub struct BackendData {
    // We are always one slice behind, so that we can mark the last one in
    // submit_picture()
    last_slice: Option<(
        SliceParameterBufferHEVC,
        Option<SliceParameterBufferHEVCRext>,
        Vec<u8>, <---------
    )>,

    va_references: [PictureHEVC; 15],
}

Look how we have to keep a Vec instead of the slice, which is very cumbersome, e.g.:

       let slice_data = Vec::from(slice.nalu().as_ref());

        self.replace_last_slice(slice_param, slice_param_ext, slice_data);

Which, again, stems from our inability to have an owned Slice type. I think that would benefit from the Cow approach, for example.

bgrzesik added a commit to bgrzesik/cros-codecs that referenced this issue Mar 5, 2024
bgrzesik added a commit to bgrzesik/cros-codecs that referenced this issue Mar 8, 2024
bgrzesik added a commit to bgrzesik/cros-codecs that referenced this issue Mar 8, 2024
This changes will enable the decoder to copy Nalu/Slice structs to internal
state instead of using Vec.

Addresses partially chromeos#49
Gnurou pushed a commit that referenced this issue Mar 13, 2024
This changes will enable the decoder to copy Nalu/Slice structs to internal
state instead of using Vec.

Addresses partially #49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants