Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal : Full RFC 4122 for Guids #23868

Closed
sandorfr opened this issue Oct 17, 2017 · 21 comments
Closed

Proposal : Full RFC 4122 for Guids #23868

sandorfr opened this issue Oct 17, 2017 · 21 comments
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime
Milestone

Comments

@sandorfr
Copy link

Description

Provide full implementation of RFC 4211 for UUID as part of the Guid type.

Having hash based predictible Guids (v5) which could be derived from string and don't collide with random based guids (v4) can be extremely useful in some scenarios. RFC4122 proposes a standard around UUIDs to address that.

Possible Use Cases

Here are some of the use cases I see:

  • Slug projection to Guid primary key (eliminate the need to maintain an index for that matter, allows using the primary key which is usually the clustered index to get the data)
  • Mixing potentially identical ids but namespaced in the same keyspace.
  • Reducing a composite key to a single Guid.
  • More broadly every scenario where a hash is a good fit (not from a security perspective) but where we need guids.

Proposed API

public partial struct Guid {
    public static readonly Guid DnsNamespace // 6ba7b810-9dad-11d1-80b4-00c04fd430c8
    public static readonly Guid UrlNamespace // 6ba7b811-9dad-11d1-80b4-00c04fd430c8

    public static Guid NewGuid();

    public static Guid NewUuidV1();
    public static Guid NewUuidV4();
    public static Guid NewUuidV5(string value, Guid namespace);
    public static Guid NewUuidV5(byte[] value, Guid namespace);
}

Details

We would obviously keep the existing NewGuid() which in windows calls UuidCreate. It seems that it uses UUID v4 under the hood.

In addition we would add NewGuidV1, NewGuidV4 et NewGuidV5 methods. They would map to the adequate UUID generation method.

Related Issues

Open Questions

Currently the byte order from the constructor and ToByteArray is a bit unatural as it just reflect the underlying struct as we could expect a more natural byte order. It could be worth it to add methods overload to read and write guid using the natural order.

See also

Updates

  • December 12, 2017: Provide a first API proposal
  • January 18, 2017: Add use cases
@joperezr
Copy link
Member

@sandorfr are you interested on writing up a formal api proposal so that we can take a look at it?

@sandorfr
Copy link
Author

sandorfr commented Nov 21, 2017 via email

@sandorfr
Copy link
Author

Just updated the description accordingly.

@joperezr
Copy link
Member

@sandorfr from your proposal can be extremely useful in some scenarios can you please add some sample scenarios where this would be useful to help on the review?

@sandorfr
Copy link
Author

@joperezr I've added a section with possible use cases

@shmuelie
Copy link
Contributor

shmuelie commented Sep 11, 2019

In a project I had a need for a V4 UUID (implementation of an internal communication protocal). In my research I found that while System.Guid is a V4 UUID there is no guarantee that this will always be true. I therefore built my own Uuid type. While I can't say it's fully optimized, it is full compliant with RFC 4122 and provides logic for creating versions 2, 3, 4, and 5. I've copied the API below. (The implementation uses unsafe code in some places to prevent creating arrays.)

    /// <summary>
    ///     An Universally Unique IDentifier
    /// </summary>
    /// <remarks>
    ///     <para>Represents a RFC-4122 compliant Universally Unique IDentifier (UUID). An UUID is a 128-bit integer (16 bytes) that can be used across all computers and networks wherever a unique identifier is required. Such an identifier has a very low probability of being duplicated.</para>
    ///     <para>A <see cref="Guid"/> is a <see cref="Uuid"/>, version 4.</para>
    /// </remarks>
    /// <seealso cref="IEquatable{T}" />
    /// <seealso cref="IComparable{T}"/>
    /// <seealso cref="IComparable"/>
    /// <seealso cref="Guid"/>
    /// <seealso href="https://tools.ietf.org/html/rfc4122">RFC-4122</seealso>
    [StructLayout(LayoutKind.Sequential)]
    [Serializable]
    [DebuggerDisplay("{DebuggerDisplay,nq}")]
    public readonly struct Uuid : IEquatable<Uuid>, IComparable<Uuid>, IComparable
    {
        /// <summary>
        ///     A UUID with all bits set to zero.
        /// </summary>
        /// <value>00000000-0000-0000-0000-000000000000</value>
        /// <remarks><para>Same as <see cref="Uuid.Uuid()"/></para></remarks>
        public static readonly Uuid Nil = new Uuid();

        /// <summary>
        ///     Name string is a fully-qualified domain name
        /// </summary>
        /// <value>6ba7b810-9dad-11d1-80b4-00c04fd430c8</value>
        public static readonly Uuid DNS = new Uuid(0x6ba7b810, 0x9dad, 0x11d1, 0x80, 0xb4, 0x00c0, 0x4fd430c8);

        /// <summary>
        ///     Name string is a URL.
        /// </summary>
        /// <value>6ba7b811-9dad-11d1-80b4-00c04fd430c8</value>
        public static readonly Uuid Url = new Uuid(0x6ba7b811, 0x9dad, 0x11d1, 0x80, 0xb4, 0x00c0, 0x4fd430c8);

        /// <summary>
        ///     Name string is an ISO OID.
        /// </summary>
        /// <value>6ba7b812-9dad-11d1-80b4-00c04fd430c8</value>
        public static readonly Uuid Oid = new Uuid(0x6ba7b812, 0x9dad, 0x11d1, 0x80, 0xb4, 0x00c0, 0x4fd430c8);

        /// <summary>
        ///     Name string is an X.500 DN (in DER or a text output format).
        /// </summary>
        /// <value>6ba7b814-9dad-11d1-80b4-00c04fd430c8</value>
        public static readonly Uuid X500 = new Uuid(0x6ba7b814, 0x9dad, 0x11d1, 0x80, 0xb4, 0x00c0, 0x4fd430c8);
        
        /// <summary>
        ///     Initializes a new instance of the <see cref="Uuid"/> struct.
        /// </summary>
        /// <param name="time_low">The low field of the timestamp.</param>
        /// <param name="time_mid">The middle field of the timestamp.</param>
        /// <param name="time_hi_and_version">The high field of the timestamp multiplexed with the version number.</param>
        /// <param name="clock_seq_hi_and_reserved">The high field of the clock sequence multiplexed with the variant.</param>
        /// <param name="clock_seq_low">The low field of the clock sequence.</param>
        /// <param name="node_hi">The first two bytes of the spatially unique node identifier.</param>
        /// <param name="node_low">The last four bytes of the spatially unique node identifier.</param>
        private Uuid(uint time_low, ushort time_mid, ushort time_hi_and_version, byte clock_seq_hi_and_reserved, byte clock_seq_low, ushort node_hi, uint node_low);

        /// <summary>
        ///     Initializes a new instance of the <see cref="Uuid"/> struct.
        /// </summary>
        /// <param name="time_low">The low field of the timestamp.</param>
        /// <param name="time_mid">The middle field of the timestamp.</param>
        /// <param name="time_hi_and_version">The high field of the timestamp multiplexed with the version number.</param>
        /// <param name="clock_seq_hi_and_reserved">The high field of the clock sequence multiplexed with the variant.</param>
        /// <param name="clock_seq_low">The low field of the clock sequence.</param>
        /// <param name="node">The spatially unique node identifier.</param>
        private unsafe Uuid(uint time_low, ushort time_mid, ushort time_hi_and_version, byte clock_seq_hi_and_reserved, byte clock_seq_low, byte[] node);

        /// <summary>
        ///     Initializes a new instance of the <see cref="Uuid"/> struct.
        /// </summary>
        /// <param name="bytes">A 16-element byte array containing values with which to initialize the UUID.</param>
        private unsafe Uuid(byte[] bytes);

        /// <summary>
        ///     Initializes a new instance of the <see cref="Uuid"/> struct.
        /// </summary>
        /// <param name="bytes">A 16-element byte array containing values with which to initialize the UUID.</param>
        /// <param name="time_hi_and_version">The high field of the timestamp multiplexed with the version number.</param>
        /// <param name="clock_seq_hi_and_reserved">The high field of the clock sequence multiplexed with the variant.</param>
        /// <remarks><para>Bytes 6-8 (zero indexed) of <paramref name="bytes"/> are ignored. <paramref name="clock_seq_hi_and_reserved"/> and <paramref name="time_hi_and_version"/> are used instead.</para></remarks>
        private unsafe Uuid(byte[] bytes, ushort time_hi_and_version, byte clock_seq_hi_and_reserved);

        /// <summary>
        ///     Creates a new truly-random or pseudo-random (version 4) UUID.
        /// </summary>
        /// <returns>A new UUID.</returns>
        public static Uuid CreateRandom();

        /// <summary>
        ///     Creates a new time-based (version 1) UUID.
        /// </summary>
        /// <returns>A new UUID.</returns>
        public static Uuid CreateTimeBased();

        /// <summary>
        ///     Returns a 16-element byte array that contains the value of this instance.
        /// </summary>
        /// <returns>A 16-element byte array.</returns>
        /// <remarks><para>Note that the order of bytes returned is network order, regardless of the system.</para></remarks>
        public unsafe byte[] ToByteArray();

        /// <summary>
        ///     Creates a new MD5 name-based (version 3) UUID.
        /// </summary>
        /// <param name="namespaceID">The namespace UUID.</param>
        /// <param name="name">The name for the new UUID.</param>
        /// <returns>A new UUID.</returns>
        public static Uuid CreateMd5NameBased(in Uuid namespaceID, string name);

        /// <summary>
        ///     Creates a new SHA1 name-based (version 5) UUID.
        /// </summary>
        /// <param name="namespaceID">The namespace UUID.</param>
        /// <param name="name">The name for the new UUID.</param>
        /// <returns>A new UUID.</returns>
        public static Uuid CreateSha1NameBased(in Uuid namespaceID, string name);

        /// <summary>
        ///     Gets the version.
        /// </summary>
        /// <value>
        ///     The version.
        /// </value>
        /// <remarks>
        ///     <list type="table">
        ///         <listheader>
        ///             <term>Version</term>
        ///             <description>Description</description>
        ///         </listheader>
        ///         <item>
        ///             <term>1</term>
        ///             <description>The time-based version.</description>
        ///         </item>
        ///         <item>
        ///             <term>2</term>
        ///             <description>DCE Security version, with embedded POSIX UIDs.</description>
        ///         </item>
        ///         <item>
        ///             <term>3</term>
        ///             <description>The name-based version that uses MD5 hashing.</description>
        ///         </item>
        ///         <item>
        ///             <term>4</term>
        ///             <description>The randomly or pseudo-randomly generated version.</description>
        ///         </item>
        ///         <item>
        ///             <term>5</term>
        ///             <description>The name-based version that uses SHA-1 hashing.</description>
        ///         </item>
        ///     </list>
        /// </remarks>
        public int Version { get; }

        /// <summary>
        ///     Gets the timestamp.
        /// </summary>
        /// <value>
        ///     The timestamp.
        /// </value>
        /// <remarks>
        ///     <para>For UUID version 1, this is represented by Coordinated Universal Time (UTC) as a count of 100-nanosecond intervals since 00:00:00.00, 15 October 1582 (the date of Gregorian reform to the Christian calendar) in a 60-bit value.</para>
        ///     <para>For UUID version 3 or 5, the timestamp is a 60-bit value constructed from a name.</para>
        ///     <para>For UUID version 4, the timestamp is a randomly or pseudo-randomly generated 60-bit value.</para>
        /// </remarks>
        public long Timestamp { get; }

        /// <summary>
        ///     Gets the clock sequence.
        /// </summary>
        /// <value>
        ///     The clock sequence.
        /// </value>
        /// <remarks>
        ///     <para>For UUID version 1, the clock sequence is used to help avoid duplicates that could arise when the clock is set backwards in time or if the node ID changes.</para>
        ///     <para>For UUID version 3 or 5, the clock sequence is a 14-bit value constructed from a name.</para>
        ///     <para>For UUID version 4, clock sequence is a randomly or pseudo-randomly generated 14-bit value.</para>
        /// </remarks>
        public short ClockSequence { get; }

        /// <summary>
        ///     Gets the node.
        /// </summary>
        /// <value>
        ///     The node.
        /// </value>
        /// <remarks>
        ///     <para>For UUID version 1, the node field consists of an IEEE 802 MAC address, usually the host address.  For systems with multiple IEEE 802 addresses, any available one can be used.  The lowest addressed octet (octet number 10) contains the global/local bit and the unicast/multicast bit, and is the first octet of the address transmitted on an 802.3 LAN.</para>
        ///     <para>For systems with no IEEE address, a randomly or pseudo-randomly generated value may be used.  The multicast bit must be set in such addresses, in order that they will never conflict with addresses obtained from network cards.</para>
        ///     <para>For UUID version 3 or 5, the node field is a 48-bit value constructed from a name.</para>
        ///     <para>For UUID version 4, the node field is a randomly or pseudo-randomly generated 48-bit value.</para>
        /// </remarks>
        public long Node { get; }

        /// <summary>
        ///     Converts to string.
        /// </summary>
        /// <returns>
        ///     A <see cref="string" /> that represents this instance.
        /// </returns>
        public override string ToString();

        /// <summary>
        ///     Indicates whether the current object is equal to another object of the same type.
        /// </summary>
        /// <param name="other">An object to compare with this object.</param>
        /// <returns>
        ///     <see langword="true"/> if the current object is equal to the <paramref name="other"/> parameter; otherwise, <see langword="false"/>.
        /// </returns>
        public bool Equals(Uuid other);

        /// <summary>
        ///     Determines whether the specified <see cref="object" />, is equal to this instance.
        /// </summary>
        /// <param name="obj">The <see cref="object" /> to compare with this instance.</param>
        /// <returns>
        ///     <see langword="true" /> if the specified <see cref="object" /> is equal to this instance; otherwise, <see langword="false" />.
        /// </returns>
        public override bool Equals(object obj);

        /// <summary>
        ///     Returns a hash code for this instance.
        /// </summary>
        /// <returns>
        ///     A hash code for this instance, suitable for use in hashing algorithms and data structures like a hash table.
        /// </returns>
        public override int GetHashCode();

        /// <summary>
        ///     Converts the string representation of a UUID to the equivalent <see cref="Uuid"/> structure.
        /// </summary>
        /// <param name="input">The UUID to convert.</param>
        /// <param name="result">The structure that will contain the parsed value. If the method returns <see langword="true"/>, <paramref name="result"/> contains a valid <see cref="Uuid"/>. If the method returns <see langword="false"/>, <paramref name="result"/> equals <see cref="Nil"/>.</param>
        /// <returns><see langword="true"/> if the parse operation was successful; otherwise, <see langword="false"/>.</returns>
        public static bool TryParse(string input, out Uuid result);

        /// <summary>
        ///     Implements the operator ==.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator ==(Uuid left, Uuid right);

        /// <summary>
        ///     Implements the operator !=.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator !=(Uuid left, Uuid right);

        /// <summary>
        ///     Compares the current instance with another object of the same type and returns an integer that indicates whether the current instance precedes, follows, or occurs in the same position in the sort order as the other object.
        /// </summary>
        /// <param name="other">An object to compare with this instance.</param>
        /// <returns>
        ///     <para>A value that indicates the relative order of the objects being compared. The return value has these meanings:</para>
        ///     <list type="table">
        ///         <listheader>
        ///             <term>Value</term>
        ///             <description>Meaning</description>
        ///         </listheader>
        ///         <item>
        ///             <term>Less than zero</term>
        ///             <description>This instance precedes <paramref name="other"/> in the sort order.</description>
        ///         </item>
        ///         <item>
        ///             <term>Zero</term>
        ///             <description>This instance occurs in the same position in the sort order as <paramref name="other"/>.</description>
        ///         </item>
        ///         <item>
        ///             <term>Greater than zero</term>
        ///             <description>This instance follows <paramref name="other"/> in the sort order.</description>
        ///         </item>
        ///     </list>
        /// </returns>
        /// <remarks><para>Lexical ordering is not temporal ordering!</para></remarks>
        public int CompareTo(Uuid other);

        /// <summary>
        ///     Compares this instance to a specified object and returns an indication of their relative values.
        /// </summary>
        /// <param name="obj">An object to compare, or <see langword="null"/>.</param>
        /// <returns>
        ///     <para>A signed number indicating the relative values of this instance and <paramref name="obj"/>.</para>
        ///     <list type="table">
        ///         <listheader>
        ///             <term>Return value</term>
        ///             <description>Description</description>
        ///         </listheader>
        ///         <item>
        ///             <term>A negative integer</term>
        ///             <description>This instance is less than <paramref name="obj"/>.</description>
        ///         </item>
        ///         <item>
        ///             <term>Zero</term>
        ///             <description>This instance is equal to <paramref name="obj"/>.</description>
        ///         </item>
        ///         <item>
        ///             <term>A positive integer</term>
        ///             <description>This instance is greater than <paramref name="obj"/>, or <paramref name="obj"/> is <see langword="null"/>.</description>
        ///         </item>
        ///     </list>
        /// </returns>
        /// <exception cref="ArgumentException"><paramref name="obj"/> is not a <see cref="Uuid"/>.</exception>
        public int CompareTo(object obj);

        /// <summary>
        ///     Implements the operator &gt;.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator >(Uuid left, Uuid right);

        /// <summary>
        ///     Implements the operator &lt;.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator <(Uuid left, Uuid right);

        /// <summary>
        ///     Implements the operator &gt;=.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator >=(Uuid left, Uuid right);

        /// <summary>
        ///     Implements the operator &lt;=.
        /// </summary>
        /// <param name="left">The left.</param>
        /// <param name="right">The right.</param>
        /// <returns>
        ///     The result of the operator.
        /// </returns>
        public static bool operator <=(Uuid left, Uuid right);

        /// <summary>
        ///     Performs an implicit conversion from <see cref="Guid"/> to <see cref="Uuid"/>.
        /// </summary>
        /// <param name="guid">The unique identifier.</param>
        /// <returns>
        ///     The result of the conversion.
        /// </returns>
        public static implicit operator Uuid(Guid guid);

        /// <summary>
        ///     Performs an explicit conversion from <see cref="Uuid"/> to <see cref="Guid"/>.
        /// </summary>
        /// <param name="uuid">The UUID.</param>
        /// <returns>
        ///     The result of the conversion.
        /// </returns>
        /// <exception cref="InvalidCastException">Only random (version 4) UUIDs can be converted to GUIDs.</exception>
        public static explicit operator Guid(Uuid uuid);

        /// <summary>
        ///     Gets the debugger display text.
        /// </summary>
        /// <value>
        ///     The debugger display text.
        /// </value>
        /// <remarks><para>Gives nicer display for pre-defined UUIDs.</para></remarks>
        [DebuggerBrowsable(DebuggerBrowsableState.Never)]
        private string DebuggerDisplay { get; }
    }
}

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@om-ha
Copy link

om-ha commented Feb 17, 2020

For reference, found a good implementation here

@om-ha
Copy link

om-ha commented Feb 17, 2020

Please note version 4 requires specific bitwise manipulation. Also to be cryptographically secure it must not be generated with Guid.NewGuid();

@sandorfr
Copy link
Author

For reference, found a good implementation here

Nice. The real problem here is to get the proposal/API approved to make this a thing :)

@om-ha
Copy link

om-ha commented Feb 18, 2020

@sandorfr In fact I ended up using GuidOne. Which accounts for Endianness unlike other libraries. It completed my Stackoverflow answer here.

@shmuelie
Copy link
Contributor

I'm happy to provide my implementation if wanted.

@maryamariyan maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 23, 2020
@joperezr joperezr modified the milestones: 5.0.0, Future Jul 6, 2020
@joperezr joperezr removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2020
@GrabYourPitchforks
Copy link
Member

Thanks all for the discussion. Some random thoughts on this:

  • I don't think we should add any surface area to the Guid type. If we add factory methods, it's going to add confusion. "What's the difference between all of these, and which one should I use?" is going to come up.

  • For the same reason, I wouldn't add a Uuid type. The range of valid Guid and Uuid values would be identical, so IMO there's no reason to have two exchange types. It would again lead to confusion.

  • Creation of a v5 UUID involves using the SHA-1 hash algorithm. Introducing new code paths that leverage SHA-1 is generally forbidden across Microsoft codebases unless it's needed for some core compatibility or interop scenario. The demand for version 5 UUIDs is not high enough to justify an exception here.

Given all of these I'm inclined to say that we should close this issue. If there's evidence that there's significant community need for this functionality and that it's too onerous for devs to roll the implementation themselves, then we should reconsider. But so far I don't think either threshold is met.

@PSanetra
Copy link

I don't think we should add any surface area to the Guid type. If we add factory methods, it's going to add confusion. "What's the difference between all of these, and which one should I use?" is going to come up.

I think that is just a reason to document those new methods.

For the same reason, I wouldn't add a Uuid type. The range of valid Guid and Uuid values would be identical, so IMO there's no reason to have two exchange types. It would again lead to confusion.

I agree

Creation of a v5 UUID involves using the SHA-1 hash algorithm. Introducing new code paths that leverage SHA-1 is generally forbidden across Microsoft codebases unless it's needed for some core compatibility or interop scenario. The demand for version 5 UUIDs is not high enough to justify an exception here.

I guess the reason behind avoiding SHA-1 is that it is partially broken for cryptographic use cases. In case of creating UUID v5 this seems to be irrelevant as the hash is truncated to 128 bit anyway for creating the UUID.

@shmuelie
Copy link
Contributor

I don't think we should add any surface area to the Guid type. If we add factory methods, it's going to add confusion. "What's the difference between all of these, and which one should I use?" is going to come up.

Personally, I agree

For the same reason, I wouldn't add a Uuid type. The range of valid Guid and Uuid values would be identical, so IMO there's no reason to have two exchange types. It would again lead to confusion.

I disagree. This is a square and rectangle case. All Guids are valid Uuids but not all Uuids are valid Guids.

Creation of a v5 UUID involves using the SHA-1 hash algorithm. Introducing new code paths that leverage SHA-1 is generally forbidden across Microsoft codebases unless it's needed for some core compatibility or interop scenario. The demand for version 5 UUIDs is not high enough to justify an exception here.

v3 UUID is created using MD5, another cracked algorithm. I would argue that since in both cases it is not being used for cryptography but ID usage.

Given all of these I'm inclined to say that we should close this issue. If there's evidence that there's significant community need for this functionality and that it's too onerous for devs to roll the implementation themselves, then we should reconsider. But so far I don't think either threshold is met.

How often this is needed I can't say, but implementing correctly can be a pain, even more so if you must do so in a performant way.

@sandorfr
Copy link
Author

I don't think we should add any surface area to the Guid type. If we add factory methods, it's going to add confusion. "What's the difference between all of these, and which one should I use?" is going to come up.

I strongly disagree with this one. That's a terrible reason not to offer an API. Yes there will be people who can't read the docs, it's not for them that you should design things.

Introducing new code paths that leverage SHA-1 is generally forbidden across Microsoft codebases unless it's needed for some core compatibility or interop scenario.

SHA-1 vulnerability in the context of cryptography really is irrelevant here.

If there's evidence that there's significant community need for this functionality and that it's too onerous for devs to roll the implementation themselves, then we should reconsider. But so far I don't think either threshold is met.

Given how long it's been opened and the low activity and reactions on this, I agree with this one.
Also there's some decent implementations in some packages mentioned above.

@GrabYourPitchforks
Copy link
Member

I would argue that since in both cases it is not being used for cryptography but ID usage.

SHA-1 vulnerability in the context of cryptography really is irrelevant here.

@SamuelEnglard @sandorfr I understand the arguments here, but "it's not used for cryptographic purposes" is not a valid argument for granting an exception. "Broken" algorithms like SHA1 are forbidden for all purposes across Microsoft code bases unless it's needed for legacy compat reasons or to support some core scenario for the product. (That last category is generally for allowing things like networking protocols.)

Unless we can demonstrate clearly that usage of MD5 or SHA1 in UUID generation is a critical scenario that we absolutely must support, we would not get signoff from release management to ship this functionality.

@shmuelie
Copy link
Contributor

shmuelie commented Aug 2, 2020

I understand the arguments here, but "it's not used for cryptographic purposes" is not a valid argument for granting an exception. "Broken" algorithms like SHA1 are forbidden for all purposes across Microsoft code bases unless it's needed for legacy compat reasons or to support some core scenario for the product. (That last category is generally for allowing things like networking protocols.)

I can't say I agree but at the same time I'm not the person who implemented these rules.

Also there's some decent implementations in some packages mentioned above.

Getting approval to include a dependency for exactly one type isn't always easy. If it comes as part of .NET though, no arguing needed.

@KalleOlaviNiemitalo
Copy link

.NET Runtime already implements a SHA-1 GUID algorithm as part of EventSource, albeit with a hardcoded namespace. I guess that is justified by legacy compat.

/// <summary>
/// Implements the SHA1 hashing algorithm. Note that this
/// implementation is for hashing public information. Do not
/// use this code to hash private data, as this implementation does
/// not take any steps to avoid information disclosure.
/// </summary>
private struct Sha1ForNonSecretPurposes
{
private long length; // Total message length in bits
private uint[] w; // Workspace
private int pos; // Length of current chunk in bytes
/// <summary>
/// Call Start() to initialize the hash object.
/// </summary>
public void Start()
{
this.w ??= new uint[85];
this.length = 0;
this.pos = 0;
this.w[80] = 0x67452301;
this.w[81] = 0xEFCDAB89;
this.w[82] = 0x98BADCFE;
this.w[83] = 0x10325476;
this.w[84] = 0xC3D2E1F0;
}
/// <summary>
/// Adds an input byte to the hash.
/// </summary>
/// <param name="input">Data to include in the hash.</param>
public void Append(byte input)
{
this.w[this.pos / 4] = (this.w[this.pos / 4] << 8) | input;
if (64 == ++this.pos)
{
this.Drain();
}
}
/// <summary>
/// Adds input bytes to the hash.
/// </summary>
/// <param name="input">
/// Data to include in the hash. Must not be null.
/// </param>
#if ES_BUILD_STANDALONE
public void Append(byte[] input)
#else
public void Append(ReadOnlySpan<byte> input)
#endif
{
foreach (byte b in input)
{
this.Append(b);
}
}
/// <summary>
/// Retrieves the hash value.
/// Note that after calling this function, the hash object should
/// be considered uninitialized. Subsequent calls to Append or
/// Finish will produce useless results. Call Start() to
/// reinitialize.
/// </summary>
/// <param name="output">
/// Buffer to receive the hash value. Must not be null.
/// Up to 20 bytes of hash will be written to the output buffer.
/// If the buffer is smaller than 20 bytes, the remaining hash
/// bytes will be lost. If the buffer is larger than 20 bytes, the
/// rest of the buffer is left unmodified.
/// </param>
public void Finish(byte[] output)
{
long l = this.length + 8 * this.pos;
this.Append(0x80);
while (this.pos != 56)
{
this.Append(0x00);
}
unchecked
{
this.Append((byte)(l >> 56));
this.Append((byte)(l >> 48));
this.Append((byte)(l >> 40));
this.Append((byte)(l >> 32));
this.Append((byte)(l >> 24));
this.Append((byte)(l >> 16));
this.Append((byte)(l >> 8));
this.Append((byte)l);
int end = output.Length < 20 ? output.Length : 20;
for (int i = 0; i != end; i++)
{
uint temp = this.w[80 + i / 4];
output[i] = (byte)(temp >> 24);
this.w[80 + i / 4] = temp << 8;
}
}
}
/// <summary>
/// Called when this.pos reaches 64.
/// </summary>
private void Drain()
{
for (int i = 16; i != 80; i++)
{
this.w[i] = BitOperations.RotateLeft(this.w[i - 3] ^ this.w[i - 8] ^ this.w[i - 14] ^ this.w[i - 16], 1);
}
unchecked
{
uint a = this.w[80];
uint b = this.w[81];
uint c = this.w[82];
uint d = this.w[83];
uint e = this.w[84];
for (int i = 0; i != 20; i++)
{
const uint k = 0x5A827999;
uint f = (b & c) | ((~b) & d);
uint temp = BitOperations.RotateLeft(a, 5) + f + e + k + this.w[i]; e = d; d = c; c = BitOperations.RotateLeft(b, 30); b = a; a = temp;
}
for (int i = 20; i != 40; i++)
{
uint f = b ^ c ^ d;
const uint k = 0x6ED9EBA1;
uint temp = BitOperations.RotateLeft(a, 5) + f + e + k + this.w[i]; e = d; d = c; c = BitOperations.RotateLeft(b, 30); b = a; a = temp;
}
for (int i = 40; i != 60; i++)
{
uint f = (b & c) | (b & d) | (c & d);
const uint k = 0x8F1BBCDC;
uint temp = BitOperations.RotateLeft(a, 5) + f + e + k + this.w[i]; e = d; d = c; c = BitOperations.RotateLeft(b, 30); b = a; a = temp;
}
for (int i = 60; i != 80; i++)
{
uint f = b ^ c ^ d;
const uint k = 0xCA62C1D6;
uint temp = BitOperations.RotateLeft(a, 5) + f + e + k + this.w[i]; e = d; d = c; c = BitOperations.RotateLeft(b, 30); b = a; a = temp;
}
this.w[80] += a;
this.w[81] += b;
this.w[82] += c;
this.w[83] += d;
this.w[84] += e;
}
this.length += 512; // 64 bytes == 512 bits
this.pos = 0;
}
}
private static Guid GenerateGuidFromName(string name)
{
#if ES_BUILD_STANDALONE
if (namespaceBytes == null)
{
namespaceBytes = new byte[] {
0x48, 0x2C, 0x2D, 0xB2, 0xC3, 0x90, 0x47, 0xC8,
0x87, 0xF8, 0x1A, 0x15, 0xBF, 0xC1, 0x30, 0xFB,
};
}
#else
ReadOnlySpan<byte> namespaceBytes = new byte[] // rely on C# compiler optimization to remove byte[] allocation
{
0x48, 0x2C, 0x2D, 0xB2, 0xC3, 0x90, 0x47, 0xC8,
0x87, 0xF8, 0x1A, 0x15, 0xBF, 0xC1, 0x30, 0xFB,
};
#endif
byte[] bytes = Encoding.BigEndianUnicode.GetBytes(name);
Sha1ForNonSecretPurposes hash = default;
hash.Start();
hash.Append(namespaceBytes);
hash.Append(bytes);
Array.Resize(ref bytes, 16);
hash.Finish(bytes);
bytes[7] = unchecked((byte)((bytes[7] & 0x0F) | 0x50)); // Set high 4 bits of octet 7 to 5, as per RFC 4122
return new Guid(bytes);
}

@GrabYourPitchforks
Copy link
Member

I guess that is justified by legacy compat.

Correct. You have no idea the sheer amount of paperwork we have to deal with every single release because of this and other compat-only usages. There's an increasing amount of bureaucracy every year. I think the goal is to make it so obnoxious that at a certain point we relent and say "fine, we're yanking it!" :)

@mdonoughe
Copy link
Contributor

Note: the EventSource implementation of this algorithm is not compliant with the RFC. It generates UUIDs that match ETW, which is what matters for EventSource, but it does not generate UUIDs that match standard implementations. The problems are related to endianness of the byte arrays and not setting the RFC variant flag.

@GrabYourPitchforks
Copy link
Member

Triaged this morning with the area owners. We don't believe this belongs in the BCL, and as mentioned in this thread there are good fully-featured packages in the ecosystem which provide this capability.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 20, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-needs-work API needs work before it is approved, it is NOT ready for implementation area-System.Runtime
Projects
None yet
Development

No branches or pull requests

10 participants