Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

occasional big jump in twcc reference time #2841

Closed
yin-zhang opened this issue Dec 23, 2021 · 10 comments
Closed

occasional big jump in twcc reference time #2841

yin-zhang opened this issue Dec 23, 2021 · 10 comments
Labels

Comments

@yin-zhang
Copy link

yin-zhang commented Dec 23, 2021

Occasionally WebRTC reports

[471:567] [38699] (transport_feedback_adapter.cc:187): Unexpected feedback timestamp received.

I suspect it is caused by the following code fragment in rtcp.c:

/* (use only 23 bits of reference_time) */
janus_set3(data, reference_time_pos, (reference_time & 0x007FFFFF));

According the the spec, the reference time should be a 24-bit signed integer. Maybe we should simply change the above code to:

/* (use 24 bits of reference_time) */
janus_set3(data, reference_time_pos, (reference_time & 0x00FFFFFF));

Otherwise, when the last 24 bit goes from 0x007FFFFF to 0x00800000, webrtc will complain unexpected timestamp.

@cppdev-1
Copy link

cppdev-1 commented Jan 1, 2022

@yin-zhang Could you reply to the last message there?

@lminiero
Copy link
Member

@yin-zhang apologies for this very late reply, I see the issue was opened during my holiday break and I must have missed this...
I think that the 0x007FFFFF mask is correct, exactly because as you pointed out the time is a signed integer, and since this is a report we generate ourselves, IIRC we always generate positive values there. Besides, reference_time_pos is defined as a size_t, which means it's always positive, and so using 0x00FFFFFF would actually definitely generate broken results at times (the signed but might be set incorrectly if the value is too high).

That said, we won't know if this is the cause until we can have a look at a packet it complains about. Any chance you can get an unencrypted capture of a message that causes this issue for you? Not sure if browsers (Chrome?) have a way to do that, but in Janus you can do a pcap capture of all the incoming/outoing RTP and RTCP traffic for a handle using the Admin API: you can refer to this blogpost for some info on how to capture the traffic.

@lminiero
Copy link
Member

lminiero commented Feb 9, 2022

@yin-zhang any update?

@yin-zhang
Copy link
Author

Sorry for the delay in my reply! After changing the mask to 0x00FFFFFF, the "Unexpected feedback timestamp received." went away completely ... So I just made the change and didn't bother to investigate it further.

@yin-zhang
Copy link
Author

If we use 0x007FFFFF as the mask, then the reference time becomes a 23-bit positive integer. Whenever the last 24 bit goes from 0x007FFFFF to 0x00800000, the reference time will go from 0x007FFFFF to 0. For a 24-bit signed integer, it should go from 0x007FFFFF to 0x00800000.

Note that no matter which mask we use, reference_time_pos is positive.

@lminiero
Copy link
Member

lminiero commented Mar 9, 2022

The problem is that the 0x00FFFFFF would be improper, here, because reference_time is defined in the code as guint64 (so uint64_t), which means it can never be negative. Since it's assumed it cannot be negative, it's right that we enforce a 23-bit positive integer in our code, and the first bit will always be 0.

@yin-zhang
Copy link
Author

yin-zhang commented Apr 15, 2022

Please check the WebRTC source code src/modules/rtp_rtcp/source/rtcp_packet/transport_feedback.cc
where

constexpr int64_t kTimeWrapPeriodUs = (1ll << 24) * kBaseScaleFactor;

void TransportFeedback::SetBase(uint16_t base_sequence, int64_t ref_timestamp_us) {
  RTC_DCHECK_EQ(num_seq_no_, 0);`
  RTC_DCHECK_GE(ref_timestamp_us, 0);`
  base_seq_no_ = base_sequence;
  base_time_ticks_ = (ref_timestamp_us % kTimeWrapPeriodUs) / kBaseScaleFactor;
  last_timestamp_us_ = GetBaseTimeUs();
}

int64_t TransportFeedback::GetBaseDeltaUs(int64_t prev_timestamp_us) const {
  int64_t delta = GetBaseTimeUs() - prev_timestamp_us;

  // Detect and compensate for wrap-arounds in base time.
  if (std::abs(delta - kTimeWrapPeriodUs) < std::abs(delta)) {
    delta -= kTimeWrapPeriodUs;  // Wrap bactkwards.
  } else if (std::abs(delta + kTimeWrapPeriodUs) < std::abs(delta)) {
    delta += kTimeWrapPeriodUs;  // Wrap forwards.
  }
  return delta;
}

You see everything is using (1<<24)*64ms as the wrap around period. If you enforce a 23-bit positive integer, then the wrap around period should be (1<<23)*64ms. And the spec shouldn't really state that the reference time is a "reference time: 24 bits Signed integer indicating ..."

@yin-zhang
Copy link
Author

If the reference time is initialized once for an RTCP context and then hold constant afterwards, then there is no issue. But right now, the reference time is computed for each TWCC packet. Hence the mismatch in wrap around period would cause a big jump when the last 23 bits wraps around.

@lminiero
Copy link
Member

Sorry for this late response.

I'm still dubious about this. If you see the Chrome code you shared, they use an int64_t variable as a reference, and so they indeed use a signed integer as part of their computations: in that case, it makes sense for them to write the whole 24 bits. In our case, we use an unsigned variable instead:

https://github.com/meetecho/janus-gateway/blob/master/src/rtcp.c#L1582

since it's always going to be positive. In fact, for each packet it's computed out of the stat->timestamp property:

https://github.com/meetecho/janus-gateway/blob/master/src/rtcp.c#L1631-L1637

and timestamp is an unsigned 64-bit integer as well:

https://github.com/meetecho/janus-gateway/blob/master/src/rtcp.h#L293
https://github.com/meetecho/janus-gateway/blob/master/src/ice.c#L2647

As a consequence, since we only work with unsigned integers, it makes no sense for us to write all the 24-bits, since that field is for a signed integer instead, and so in our case the first bit is always 0. That's why we need to only write the other 23 bits instead.

Not sure if there's anything I'm missing, but that's my understanding of it. Pinging @murillo128 as well since he's the one who contributed the TWCC code at the time, and so he may correct me if I'm wrong.

@lminiero
Copy link
Member

Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants