Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docstrings + Process XOR/ROL optimisations #10

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

arekbulski
Copy link
Member

@arekbulski arekbulski commented Apr 5, 2018

Analog implementation to Python PR.

@GreyCat There are 2 methods that remain without docstrings (marked ???). If you would please tell me what to write there, I would include it.

Benchmarks for loops:

using System;
using System.Diagnostics;
using System.Runtime;

namespace demo
{
    class MainClass
    {
        static bool ByForArrayLength (byte[] data)
        {
            for (int i = 0; i < data.Length; i++)
                if (data [i] != 0)
                    return false;
            return true;
        }

        static bool ByForLocalLength (byte[] data)
        {
            int len = data.Length;
            for (int i = 0; i < len; i++)
                if (data [i] != 0)
                    return false;
            return true;
        }

        static bool ByForeach (byte[] data)
        {
            foreach (byte b in data)
                if (b != 0)
                    return false;
            return true;
        }

        static void Measure (Action work, string description)
        {
            var watch = Stopwatch.StartNew ();
            work.Invoke ();
            Console.WriteLine ("{0,-40}: {1} ms", description, watch.Elapsed.TotalMilliseconds);
        }

        public static void Main (string[] args)
        {
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            byte[] data = new byte[256 * 1024 * 1024];

            Measure (() => ByForArrayLength (data), "For with byte[].Length property");
            Measure (() => ByForLocalLength (data), "For with Length in local variable");
            Measure (() => ByForeach (data), "Foreach over byte[]");
        }
    }
}
$ mcs Program.cs -optimize
$ ./Program.exe 
For with byte[].Length property         : 427,2671 ms
For with Length in local variable       : 331,5512 ms
Foreach over byte[]                     : 272,0288 ms

Benchmarks for rotations:

using System;
using System.Diagnostics;
using System.Runtime;

namespace demo
{
    class MainClass
    {
        static void ByShifts (byte[] data)
        {
            int len = data.Length;
            int amount = 1;
            int anti = 7;

            for (int i = 0; i < len; i++)
            {
                byte x = data[i];
                data[i] = (byte)((x << amount) | (x >> anti));
            }
        }

        static void ByLookup (byte[] data)
        {
            int len = data.Length;
            byte[] translate = new byte[256];

            for (int i = 0; i < len; i++)
                data[i] = translate[data[i]];
        }

        static void Measure (Action work, string description)
        {
            var watch = Stopwatch.StartNew ();
            work.Invoke ();
            Console.WriteLine ("{0,-40}: {1} ms", description, watch.Elapsed.TotalMilliseconds);
        }

        public static void Main (string[] args)
        {
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            byte[] data = new byte[256 * 1024 * 1024];

            Measure (() => ByShifts (data), "Loop with << >> |");
            Measure (() => ByLookup (data), "Loop with translate[...]");
        }
    }
}
$ mcs Program.cs -optimize
$ ./Program.exe 
Loop with << >> |                       : 808,9077 ms
Loop with translate[...]                : 513,9755 ms

Requesting review from @koczkatamas @LogicAndTrick @Arlorean @KOLANICH .

/// It's based off a <code>BinaryReader</code>, which is a little-endian reader.
/// </summary>
public partial class KaitaiStream : BinaryReader
{
#region Constructors

/// <summary>
/// Create a KaitaiStream backed by abstract stream. It could be in-memory buffer or open file.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an abstract stream
an in-memory buffer
an open file

KaitaiStream.cs Outdated
for (int i = 0, j = 0; i < value.Length; i++, j = (j + 1) % keyLen)
if (key.Length == 1)
return ProcessXor(data, key[0]);
if (key.Length <= 64 && ByteArrayZero(key))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimisations are based on conjecture/assumption that keys are much shorter than data. That makes this check cheap in comparison.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also ByteArrayZero is short evaluation, it breaks loop on first non-zero.

Copy link

@KOLANICH KOLANICH Apr 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimisations are based on conjecture/assumption that keys are much shorter than data. That makes this check cheap in comparison.

We don't need the assumption you are using ( 1 < key.Length <= 64 <=> Pr( key === zero bit string ) > key.Length/value.Length ) to be an inherent part of the runtime.

KaitaiStream.cs Outdated
///</summary>
public KaitaiStream(string file) : base(File.Open(file, FileMode.Open, FileAccess.Read, FileShare.Read))
/// <summary>
/// Create a KaitaiStream backed by a closed file (in read-only binary-mode).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's "closed file"? It actually remains open.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a less known Python terminology. In general, functions that work on files are divided into those that operate on (currently) closed files and take filenames and those that operate on (currently) open files and take integer file descriptors/handles.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

KaitaiStream.cs Outdated
/// <summary>
/// Used internally.
/// Caches the current plaftorm endianness and allows emited bytecode to be optimised. Thanks to @Arlorean.
/// </summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably warrants a <remarks>...</remarks> section with a longer explanation of how this works, or at least a pointer to that discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added link, thanks, fixed.

KaitaiStream.cs Outdated
private int BitsLeft = 0;

/// <summary>
/// Used internally.
/// Caches the current plaftorm endianness and allows emited bytecode to be optimised. Thanks to @Arlorean.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plaftorm => platform, emited => emitted; arguably, optimised => optimized, but that's UK vs US...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I'm from the UK but I try to make all my spelling US so it's consistent with the rest of .NET and other libraries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also speak US English. Sloppy typing, thanks. Fixed.

KaitaiStream.cs Outdated
/// <summary>
/// Used internally.
/// </summary>
private static void compute_single_rotations()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lower_underscore is totally against any .NET coding standards, I'm afraid.

Copy link
Contributor

@Arlorean Arlorean Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree again. Best to not use underscores in C# code precomputedSingleRotations woud be better, for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just copied the name from Python. It was a literal translation. Fixed.

KaitaiStream.cs Outdated
if (amount > 7 || amount < -7) throw new ArgumentException("Rotation of more than 7 cannot be performed.", "amount");
if (amount < 0) amount += 8; // Rotation of -2 is the same as rotation of +6
if (groupSize < 1)
throw new Exception("group size must be at least 1 to be valid");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw new ArgumentException(...)
OK now I'm being picky!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Contributor

@Arlorean Arlorean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @GreyCat on his comments about naming.
I've verified the results of the optimizations on Windows with Visual C# Compiler (2.6.0.62329) and they're all good.

KaitaiStream.cs Outdated
#endregion

#region Stream positioning

/// <summary>
/// Check if the stream position is at the end of the stream
/// Check if the stream position is at the end of the stream (at EOF).
/// WARNING: This requires a seekable and tellable stream.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.NET does not use "tell" and thus you can expect that nobody would knows what "tellable" is. Microsoft actually calls both "seekable" and "tellable" => "supports seeking", and that actually makes sense to use that wording at least here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our own documentation (KS docs PR regarding seeking) we do use these terms, seekable and tellable, so there is no reason why we cant make it consistent with our own terminology, instead of Microsoft's. And I do keep Microsoft .NET in high regard. Its also consistent with docstrings in Python and in future, other runtimes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've already told you in these PRs that it's a bad idea to extrapolate C-style or Python-style terminology to end users who don't know anything about that in the first place, and I'm repeating it here. How many times do I need to repeat that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, dont yell at me. If you are adamant that it should be changed, then be explicit. You only old me your opinion on this, and I told you mine. I will fix it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly don't want to yell at anyone or anything like that. It just looked like we had no discussion there and you just kept pushing that C/Python jargonism where it does not belong (and that would have raised lots of questions in future). I've stated my position in kaitai-io/kaitai_struct_doc#14, and you've just ignored my comment, continuing to use it here and there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didnt make the connection. That and just got overall sloppy... 😪

KaitaiStream.cs Outdated
/// <summary>
/// Check if byte array is all zeroes.
/// </summary>
public static bool ByteArrayZero(byte[] a)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably IsByteArrayZero would be a better name, as otherwise it kind of clashes with Microsoft's naming of functions like ZeroMemory, which actually sets all members of array to zero. Also, I'm not sure that it should be public.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed name to IsByteArrayZero. Fixed. The other helper methods are already public.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other static methods like Mod, ByteArrayCompare, etc, are public because they are actually invoked in class sources generated by ksc. But then again, IsByteArrayZero might be useful to somebody per se...

KaitaiStream.cs Outdated
private ulong Bits = 0;
/// <summary>
/// Used internally.
/// </summary>
private int BitsLeft = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like "Number of bits left in Bits buffer" + a cross-reference to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is cross reference?

@arekbulski arekbulski force-pushed the patch-xorrotate branch 2 times, most recently from f912c61 to dc6488a Compare April 6, 2018 14:37
@arekbulski arekbulski changed the title docstrings, process xor, rotate optimisations Docstrings + Process XOR/ROL optimisations Apr 6, 2018
KaitaiStream.cs Outdated
public KaitaiStream(string file) : base(File.Open(file, FileMode.Open, FileAccess.Read, FileShare.Read))
/// <summary>
/// Create a KaitaiStream backed by a currently closed file (in read-only binary-mode).
/// </summary>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "currently closed" is still very confusing. I've definitely not a Python person, but for C# guys that would be just as confusing. From OS point of view, "currently closed" file just does not exist as a resource: it takes no memory buffers, no handles, no nothing. Why don't we rephrase it in a somewhat more straightforward voice, i.e. something

Create a KaitaiStream by opening a file in read-only binary mode. This KaitaiStream will be backed by FileStream.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@kaitai-io kaitai-io deleted a comment from GreyCat Apr 6, 2018
@kaitai-io kaitai-io deleted a comment from GreyCat Apr 6, 2018
@kaitai-io kaitai-io deleted a comment from GreyCat Apr 6, 2018
@kaitai-io kaitai-io deleted a comment from GreyCat Apr 6, 2018
@kaitai-io kaitai-io deleted a comment from GreyCat Apr 6, 2018
@arekbulski
Copy link
Member Author

arekbulski commented Apr 11, 2018

It seems that @GreyCat was once again right and I was half wrong. You were right, that benchmarking C# code is more tricky than I thought, although I would like to point out in my defense that its for reasons that nobody pointed out explicitly and possibly nobody even understood.

It is my current belief that warmup has very little impact as far as (1) CPU cache is concerned, since test data is 2GB in size and CPU L2 cache is ~1MB and (2) JIT cost is also smallish compared to overall run time. However, it seems that there is one other reason to warmup the tested method: When you allocate new byte[...], it seems that it works similar to malloc, it reserves address space but not allocate actual memory pages. C# guaranatees that its elements are defaulted to 0 but that might have been implemented by intercepting page fault on first access instead of memset-ing it in advance. Thats why first method that run in the suite was slower. I fixed it in below code:

using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime;

namespace demo
{
    class MainClass
    {

        static void ByModulo (byte[] data, byte[] key)
        {
            int len = data.Length;
            int keylen = key.Length;

            for (int i = 0; i < len; i++)
                data[i] = (byte) (data[i] ^ key[i % keylen]);
        }

        static void ByModulo2 (byte[] data, byte[] key)
        {
            for (int i = 0; i < data.Length; i++)
                data[i] = (byte) (data[i] ^ key[i % key.Length]);
        }

        static void ByCheckWrap (byte[] data, byte[] key)
        {
            int len = data.Length;
            int keylen = key.Length;

            int k = 0;
            for (int i = 0; i < len; i++)
                data[i] = (byte) (data[i] ^ key[k]);
                k++;
                if (k == keylen)
                    k = 0;
        }

        static void ByCheckWrap2 (byte[] data, byte[] key)
        {
            int k = 0;
            for (int i = 0; i < data.Length; i++)
                data[i] = (byte) (data[i] ^ key[k]);
                k++;
                if (k == key.Length)
                    k = 0;
        }

        static void Measure (Action work, string description)
        {
            work.Invoke ();
            var watch = Stopwatch.StartNew ();
            work.Invoke ();
            var elapsed = watch.Elapsed.TotalMilliseconds;
            Console.WriteLine ("{0,-40}: {1} ms", description, elapsed);
        }

        public static void Main (string[] args)
        {
            byte[] data = new byte[int.MaxValue];
            byte[] key = new byte[10];
            Measure (() => ByModulo (data, key), "Loop with data[i % keylen]");
            Measure (() => ByModulo2 (data, key), "Loop with data[i % key.Length]");
            Measure (() => ByCheckWrap (data, key), "Loop with data[k]; if(k==keylen)k=0");
            Measure (() => ByCheckWrap2 (data, key), "Loop with data[k]; if(k==key.Length)k=0");
        }
    }
}
$ mcs --version
Mono C# compiler version 4.6.2.0
$ mcs Program.cs -optimize
$ ./Program.exe 
Loop with data[i % keylen]              : 20878 ms
Loop with data[i % key.Length]          : 20450 ms
Loop with data[k]; if(k==keylen)k=0     : 4329 ms
Loop with data[k]; if(k==key.Length)k=0 : 4190 ms

@arekbulski arekbulski force-pushed the patch-xorrotate branch 2 times, most recently from 7e2f45c to a6bbce0 Compare April 11, 2018 23:23
@arekbulski
Copy link
Member Author

All commits were amended to reflect last benchmarks. Previous changset was 52308c2.

@GreyCat
Copy link
Member

GreyCat commented Apr 12, 2018

@arekbulski, I would keep insisting that you're missing the point with benchmarks here. The platform is wrong (given that 95% of .NET installations use Microsoft's .NET, it's only natural to benchmark these), the methodology is all wrong both from point of view of general statistics, and from .NET's environment point of view, etc. This alone is a total disaster, for example, which makes all tests effectively useless:

            int k = 0;
            for (int i = 0; i < data.Length; i++)
                data[i] = (byte) (data[i] ^ key[k]);
                k++;
                if (k == key.Length)
                    k = 0;

If you don't see it, please take a longer look at this particular piece of code.

I'm by no means a .NET expert, but even what I know is enough to see that it is not the way to do it. I can teach you how to do it, I can do these benchmarks myself, but that would take a considerable chunk of my time that I'd rather spend on some real-life tasks. This is, of course, interesting from educational perspective, both for me, you and anyone who might be interested in microbenchmarking, but there are like 400 other pressing issues in the main repo, and I do believe they're more important than this task that nobody even ever requested.

@Arlorean
Copy link
Contributor

My experience in real-world .NET applications on Windows is that the amount of memory you turn over, which puts pressure on the Garbage Collector, often outweighs any micro-optimizations at the loop level like this. We use tools like dotTrace and dotMemory to pin down performance problems, but more often than not, it's a change in algorithm, rather than a change in loop structuring that provides the most performance gains.

The Roslyn team have some guidelines which may help. The key one being don't use LINQ as it allocates memory which causes performance issues.

This brings me to an overall concern about the KaitaiStruct C# API. As much as I love it, I don't love the amount of memory that gets allocated when parsing a file. A solution along the lines of FlatBuffers would be preferable since it performs zero memory allocations but still gives easy access to the data structures stored in essentially a byte array.

@GreyCat
Copy link
Member

GreyCat commented Apr 16, 2018

@Arlorean Unfortunately, it's not that easy. Projects like Cap'n Proto or FlatBuffers use carefully tuned representation of certain data laid out in memory in a manner that would be fast to access on a reasonably modern x86_64 box. If you'll try to apply the same principle to arbitrary binary data, chances are that you'll end up with very slow code due to every parsed.Bar call taking a long time due to data alignment issues.

Also, that's not a single answer for everything anyway. You can't escape creating a string object in C# if you indeed want a string, not a byte array. Actually, in C# you can't escape allocating new byte array and copying slice of the data there every time you want a very basic byte array in the middle of memory.

That said, I think that it's still a very good idea to try implementing such "zero-object creation", even if it would be of limited use and not a good fit for every one. I don't think that ksy format per se has any blockers for implementing that right now.

@arekbulski
Copy link
Member Author

arekbulski commented Apr 17, 2018

@GreyCat
Okay okay, take it easy. I will get you some .NET proper benchmarks. I looked at the code you quoted for few days, what is so wrong with it? The cases do arrays in-place but that doesnt make the results relative wrong.

Please note that only 2 items are (should be) in question: XOR(bytearray) and ROL(groupsize=1). Other changes are either easily justifiable (like returning entire array is cheaper than copying it) or add instead of replacing features (like ROL >1 groups).

@Arlorean
Note that in this PR there is no LINQ code, there is no real allocations either (rotation needs some indecies but those are small and short lived). Albeit not related to this PR, it is worthy discussion.

@GreyCat
You dont need a subarray copy to decode a unicode string, you can do that on a "slice" using: Encoding.GetString (Byte[], Int32, Int32)

@GreyCat
Copy link
Member

GreyCat commented Apr 17, 2018

@arekbulski

I looked at the code you quoted for few days, what is so wrong with it?

Ok, the catch is that you've used Python formatting syntax, missing {}s, and this is C#. Formatted properly, your code does the following:

int k = 0;
for (int i = 0; i < data.Length; i++)
{
    data[i] = (byte) (data[i] ^ key[k]);
}
k++;
if (k == key.Length)
{
    k = 0;
}

That k wrapped increment fell out of the loop.

@arekbulski
Copy link
Member Author

arekbulski commented Apr 17, 2018

Benchmarks of fixed code and fixed methodology, although still ran on Mono.

using System;
using System.Diagnostics;
using System.Runtime;

namespace demo
{
    class MainClass
    {
        static void ByBitwiseXor (byte[] data)
        {
            int len = data.Length;
            byte x = 1;

            for (int i = 0; i < len; i++) {
                data[i] = (byte) (data[i] ^ x);
            }
        }

        static void ByShifts (byte[] data)
        {
            int len = data.Length;
            int amount = 1;
            int anti = 7;

            for (int i = 0; i < len; i++) {
                byte x = data[i];
                data[i] = (byte)((x << amount) | (x >> anti));
            }
        }

        static void ByLookup (byte[] data)
        {
            int len = data.Length;
            byte[] translate = new byte[256];
            for (int i = 0; i < 256; i++)
                translate[i] = (byte) (i ^ 1);

            for (int i = 0; i < len; i++) {
                data[i] = translate[data[i]];
            }
        }

        static void ByModulo (byte[] data, byte[] key)
        {
            int len = data.Length;
            int keylen = key.Length;

            for (int i = 0; i < len; i++) {
                data[i] = (byte) (data[i] ^ key[i % keylen]);
            }
        }

        static void ByModulo2 (byte[] data, byte[] key)
        {
            for (int i = 0; i < data.Length; i++) {
                data[i] = (byte) (data[i] ^ key[i % key.Length]);
            }
        }

        static void ByCheckWrap (byte[] data, byte[] key)
        {
            int len = data.Length;
            int keylen = key.Length;

            int k = 0;
            for (int i = 0; i < len; i++) {
                data[i] = (byte) (data[i] ^ key[k]);
                k++;
                if (k == keylen)
                    k = 0;
            }
        }

        static void ByCheckWrap2 (byte[] data, byte[] key)
        {
            int k = 0;
            for (int i = 0; i < data.Length; i++) {
                data[i] = (byte) (data[i] ^ key[k]);
                k++;
                if (k == key.Length)
                    k = 0;
            }
        }

        static void Measure (Action work, string description)
        {
            GCSettings.LatencyMode = GCLatencyMode.LowLatency;
            work.Invoke ();
            var watch = Stopwatch.StartNew ();
            work.Invoke ();
            var elapsed = watch.Elapsed.TotalMilliseconds;
            Console.WriteLine ("{0,-40}: {1:F0} ms", description, elapsed);
        }

        public static void Main (string[] args)
        {
            byte[] data = new byte[int.MaxValue];
            byte[] key = new byte[10];
            Measure (() => ByBitwiseXor (data), "Loop with ^ key");
            Measure (() => ByShifts (data), "Loop with << >> |");
            Measure (() => ByLookup (data), "Loop with translate[...]");
            Measure (() => ByModulo (data, key), "Loop with data[i % keylen]");
            Measure (() => ByModulo2 (data, key), "Loop with data[i % key.Length]");
            Measure (() => ByCheckWrap (data, key), "Loop with data[k]; if(k==keylen)k=0");
            Measure (() => ByCheckWrap2 (data, key), "Loop with data[k]; if(k==key.Length)k=0");
        }
    }
}
$ mcs --version
Mono C# compiler version 4.6.2.0
$ mcs Program.cs -optimize
$ ./Program.exe 
Loop with ^ key                         : 2980 ms
Loop with << >> |                       : 5417 ms
Loop with translate[...]                : 3910 ms
Loop with data[i % keylen]              : 20341 ms
Loop with data[i % key.Length]          : 20650 ms
Loop with data[k]; if(k==keylen)k=0     : 4599 ms
Loop with data[k]; if(k==key.Length)k=0 : 5645 ms

The benchmark shows that array lookup is faster than shifts (in ROL) but not faster than simple xoring (in XOR). It also shows that local variable and instance field .Length have identical performance, but only in loop variables, so i is optimized this way but k is not. The PR code is not yet updated according to these results. MERGE ON HOLD.

@arekbulski
Copy link
Member Author

arekbulski commented Apr 18, 2018

The code was updated according to the benchmarks. READY TO MERGE (AGAIN! :).

@kaitai-io kaitai-io deleted a comment from KOLANICH Apr 20, 2018
@kaitai-io kaitai-io deleted a comment from KOLANICH Apr 20, 2018
pluskal added a commit to pluskal/kaitai_struct_csharp_runtime that referenced this pull request Mar 18, 2021
…-ends-with-exception' into 'master'

Resolve "Forward PipeReader.SeekAsync on completed ReadResult ends with exception"

Closes kaitai-io#10

See merge request marta/kaitai_struct_csharp_runtime!20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants