-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: BigEndian/Network integer types #95207
Comments
Tagging subscribers to this area: @dotnet/area-system-numerics Issue DetailsBackground and motivationOften when we need to parse some data coming from network - packet header, IP addresses, etc.. we have to use one of the API: API Proposalnamespace System.Net;
[StructLayout(LayoutKind.Sequential)]
public readonly struct NetInt<T> // Or BEInt
: IEquatable<NetInt<T>>,
IComparable<NetInt<T>>,
IComparisonOperators<NetInt<T>, NetInt<T>, bool>,
IBitwiseOperators<NetInt<T>, NetInt<T>, NetInt<T>>
// And maybe some other numeric interfaces
where T : unmanaged, IBinaryInteger<T> // maybe also IUnsignedNumber<T>?
{
private readonly T _value;
private T ConvertedValue
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => Convert(_value);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private NetInt(T value) => _value = value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public override int GetHashCode() => _value.GetHashCode();
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public bool Equals(NetInt<T> other) => _value == other._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public override bool Equals(object? obj) => obj is NetInt<T> other && Equals(other);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int CompareTo(NetInt<T> other) => ConvertedValue.CompareTo(other.ConvertedValue);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator ==(NetInt<T> left, NetInt<T> right) => left._value == right._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator !=(NetInt<T> left, NetInt<T> right) => left._value != right._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator <(NetInt<T> left, NetInt<T> right) => left.ConvertedValue < right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator <=(NetInt<T> left, NetInt<T> right) => left.ConvertedValue <= right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator >(NetInt<T> left, NetInt<T> right) => left.ConvertedValue > right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator >=(NetInt<T> left, NetInt<T> right) => left.ConvertedValue >= right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt<T> operator &(NetInt<T> left, NetInt<T> right) => new(left._value & right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt<T> operator |(NetInt<T> left, NetInt<T> right) => new(left._value | right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt<T> operator ^(NetInt<T> left, NetInt<T> right) => new(left._value ^ right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt<T> operator ~(NetInt<T> value) => new(~value._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static explicit operator T(NetInt<T> value) => value.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static explicit operator NetInt<T>(T value) => new(Convert(value));
[MethodImpl(MethodImplOptions.AggressiveInlining)]
// I benchmarked this and it has the same performance as direct call of BinaryPrimitives.ReverseEndianness for specific types
// and it doesn't do any memory allocations
private static T Convert(T value)
{
if (!BitConverter.IsLittleEndian)
return value;
if (typeof(T) == typeof(byte) ||
typeof(T) == typeof(sbyte))
return value;
if (typeof(T) == typeof(short))
return (T)(object)BinaryPrimitives.ReverseEndianness((short)(object)value);
if (typeof(T) == typeof(ushort))
return (T)(object)BinaryPrimitives.ReverseEndianness((ushort)(object)value);
if (typeof(T) == typeof(int))
return (T)(object)BinaryPrimitives.ReverseEndianness((int)(object)value);
if (typeof(T) == typeof(uint))
return (T)(object)BinaryPrimitives.ReverseEndianness((uint)(object)value);
if (typeof(T) == typeof(nint))
return (T)(object)BinaryPrimitives.ReverseEndianness((nint)(object)value);
if (typeof(T) == typeof(nuint))
return (T)(object)BinaryPrimitives.ReverseEndianness((nuint)(object)value);
if (typeof(T) == typeof(long))
return (T)(object)BinaryPrimitives.ReverseEndianness((long)(object)value);
if (typeof(T) == typeof(ulong))
return (T)(object)BinaryPrimitives.ReverseEndianness((ulong)(object)value);
if (typeof(T) == typeof(Int128))
return (T)(object)BinaryPrimitives.ReverseEndianness((Int128)(object)value);
if (typeof(T) == typeof(UInt128))
return (T)(object)BinaryPrimitives.ReverseEndianness((UInt128)(object)value);
if (typeof(T) == typeof(char))
return (T)(object)BinaryPrimitives.ReverseEndianness((char)(object)value);
throw new NotSupportedException();
}
} API UsageHandling network headers: [StructLayout(LayoutKind.Sequential)]
public readonly struct UDPHeader
{
public readonly NetInt<ushort> SourcePort;
public readonly NetInt<ushort> DestinationPort;
public readonly NetInt<ushort> Length;
public ushort Checksum
}
...
UDPHeader header = ...
var port = (ushort)header.SourcePort;
... IPAddress/IPNetwork operations, e.g. IPNetwork.Contains could probably be simplified: public bool Contains(IPAddress address)
{
ArgumentNullException.ThrowIfNull(address);
if (address.AddressFamily != BaseAddress.AddressFamily)
{
return false;
}
// This prevents the mask computation and checking
if (PrefixLength == 0)
{
return true;
}
static bool Contains<T>(IPAddress baseAddress, IPAddress otherAddress) where T : unmanaged, IBinaryInteger<T>
{
Unsafe.SkipInit(out NetInt<T> baseAddressValue);
Unsafe.SkipInit(out NetInt<T> otherAddressValue);
baseAddress.TryWriteBytes(MemoryMarshal.AsBytes(new Span<NetInt<T>>(ref baseAddressValue)), out int bytesWritten);
Debug.Assert(bytesWritten == Unsafe.SizeOf<T>());
otherAddress.TryWriteBytes(MemoryMarshal.AsBytes(new Span<NetInt<T>>(ref otherAddressValue)), out bytesWritten);
Debug.Assert(bytesWritten == Unsafe.SizeOf<T>());
NetInt<T> mask = (NetInt<T>)(T.AllBitsSet << (Unsafe.SizeOf<T>() * 8 - PrefixLength));
return baseAddressValue == (otherAddressValue & mask);
}
return address.AddressFamily == AddressFamily.InterNetwork
? Contains<uint>(BaseAddress, address)
: Contains<UInt128>(BaseAddress, address);
} I believe IPAddress implementation can also be re-worked as well utilizing these new types Alternative Designsnamespace System.Net;
public interface INetworkInteger<T, TBase> // or IBigEndianInteger
: IEquatable<T>,
IComparable<T>,
IComparisonOperators<T, T, bool>,
IBitwiseOperators<T, T, T>
where T : unmanaged, INetworkInteger<T, TBase>
where TBase : unmanaged, IBinaryInteger<TBase> // maybe also IUnsignedNumber<TBase>?
{
static abstract explicit operator T(TBase value);
static abstract explicit operator TBase(T value);
}
[StructLayout(LayoutKind.Sequential)]
public readonly struct NetInt32 : INetworkInteger<NetInt32, UInt32> // or BEInt32
{
private readonly UInt32 _value;
private UInt32 ConvertedValue
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
get => Convert(_value);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private NetInt32(UInt32 value) => _value = value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public override int GetHashCode() => _value.GetHashCode();
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public bool Equals(NetInt32 other) => _value == other._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public override bool Equals(object? obj) => obj is NetInt32 other && Equals(other);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public int CompareTo(NetInt32 other) => ConvertedValue.CompareTo(other.ConvertedValue);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator ==(NetInt32 left, NetInt32 right) => left._value == right._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator !=(NetInt32 left, NetInt32 right) => left._value != right._value;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator <(NetInt32 left, NetInt32 right) => left.ConvertedValue < right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator <=(NetInt32 left, NetInt32 right) => left.ConvertedValue <= right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator >(NetInt32 left, NetInt32 right) => left.ConvertedValue > right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool operator >=(NetInt32 left, NetInt32 right) => left.ConvertedValue >= right.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt32 operator &(NetInt32 left, NetInt32 right) => new(left._value & right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt32 operator |(NetInt32 left, NetInt32 right) => new(left._value | right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt32 operator ^(NetInt32 left, NetInt32 right) => new(left._value ^ right._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static NetInt32 operator ~(NetInt32 value) => new(~value._value);
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static explicit operator UInt32(NetInt32 value) => value.ConvertedValue;
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static explicit operator NetInt32(UInt32 value) => new(Convert(value));
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static UInt32 Convert(UInt32 value) => BitConverter.IsLittleEndian ? BinaryPrimitives.ReverseEndianness(value) : value;
}
public struct NetInt16 ...
public struct NetInt64 ...
public struct NetInt128 ... RisksNo response
|
This is ultimately along the same lines as #86084. Having a type like this doesn't really "fix" anything, has many of the same underlying issues, and adds needless overhead that would cause it to not be used in the .NET Libraries or ASP.NET Core. The underlying storage format of a type doesn't truly matter, it's an implementation detail. We could store the data as big-endian, as little-endian, or even as some completely different layout. On most hardware, Therefore, something like What really matters is that on serialization/deserialization boundaries you read and write your data correctly. For something that is Code that is just arbitrarily handling values generally shouldn't be doing endianness swapping. It should only be done at the boundaries where you're reading/writing the value to a byte stream. It should be done to follow the spec'd format, so that network packets are written using big-endian and something like reading an The APIs built into the .NET Libraries or ASP.NET largely handle this for you already, especially in things like the HTTP headers it exposes types and properties for. The places it doesn't handle it is around the raw content (which could itself be encoded in little-endian, if the content type spec's it to be that way) and where the user is meant to use the dedicated APIs for reading/writing the parts of that data given the raw stream. |
I read the thread you mentioned but not quite convinced. How do you explain Linux kernel having _beXX, _leXX types? |
Wrapper types are not always "transparent". They can change the ABI, cause more copying or JIT work to be done, and more. The JIT often makes them "free", but that isn't always possible.
This would solve part of the problem, but also reduce the usability and utility of such types.
These are typedefs, effectively a You're not getting anything that an XML documentation comment (IntelliSense) won't already surface to you or that shouldn't be handled transparently by the backend. The Linux Kernel is a very lowlevel codebase and a different domain entirely from what .NET typically targets. API review has already had significant discussion on exposing types like this. Our stance remains that endianness is a concern of serialization and is not something that warrants its own type being exposed. There are too many potential pitfalls, holes, bugs, and other issues that can be introduced by such types. The clearest and easiest thing is for APIs to take the singular type and for the serialization/deserialization layer to do the conversion if the underlying specification expects or provides the data in a particular format. |
Then what would be the "recommended" way to deserialize the UDPHeader struct for my example above? Lets say I need to marshal from some unmanaged code or read from Span? |
Likely the same way endianness or serialization is handled for other types in general. You'd expose some explicitly named methods for reading/writing the data to/from a stream or span or other destination buffer. The write API would take care of taking the data from machine endianness format and ensuring it is written with the format expected for the raw byte sequence. The read API would do the inverse and take it from the spec'd format and convert it to the destination format. This is how serialization works in most languages and scenarios. It reduces the costs/overheads, ensures the data is in a regular and efficient format for programmatic use, and helps avoid or minimize many types of bugs or other issues that could occur. |
This is notably also how HTTP headers are basically handled today. It's how the PE file readers work today, it's how the underlying image or video file format/codec reader/writers work (both in .NET and in native), and so on. It is essentially a long solved problem, with most languages, operating systems, and frameworks handling it the same way. |
Background and motivation
Often when we need to parse some data coming from network - packet header, IP addresses, etc.. we have to use one of the API:
IPAddress.NetworkToHostOrder
orBinaryPrimitives.ReverseEndianness
together withBitConverter.IsLittleEndian
or some other similar API, and often there can be confusion what the endianness does current value have in our code. But if we'd have a dedicated data type that would eliminate the confusion and simplify the codeAPI Proposal
API Usage
Handling network headers:
IPAddress/IPNetwork operations, e.g. IPNetwork.Contains could probably be simplified:
I believe IPAddress implementation can also be re-worked as well utilizing these new types
Alternative Designs
Risks
No response
The text was updated successfully, but these errors were encountered: