[FEAT]: drop obsoleted and exotic AVX512F #444

Closed
claudioandre-br opened this issue Jun 30, 2024 · 3 comments
Labels
binaries Binaries will be impacted keep open Do NOT close automatically.

Comments

@claudioandre-br
Member

claudioandre-br commented Jun 30, 2024

Description

Is the AVX512BW build really better? I would say no (but I have little information). So, I compared non-OpenMP builds of commit 9950d782a7c6e3cf3184e163b706779bb15d8afd (see Additional Context below) on this CPU:

model name      : Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
[...]
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_stale_data retbleed gds
bogomips        : 5999.99

More investigation is desirable, but is there any reason to maintain two binaries for AVX512? I don't see any.


That said, I plan to remove one of the AVX512 binaries, probably the AVX512BW one. At the end of the day, they offer (basically) the same performance.

Alternatives Considered

  • Remove AVX512F (but it is the "level 1" AVX512, so it is probably better to keep it?)
  • Remove AVX512BW

Additional Context

-       if ($verbose == 1) {
+       if ($verbose == 1 && ($kr < 0.9 || $kr > 10)) {
                printf "Ratio:\t%.5f real, %.5f virtual\t$id\n", $kr, $kv;
        }
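Output of relbench with the change above, which in verbose mode prints only ratios below 0.9 or above 10:
$ ./relbench -v avx512f.txt avx512bw.txt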
Ratio:	0.89533 real, 0.89533 virtual	BKS, BouncyCastle:Raw
Ratio:	0.89675 real, 0.89675 virtual	HAVAL-128-4:Raw
Ratio:	0.89234 real, 0.89234 virtual	HAVAL-256-3:Raw
Ratio:	0.88959 real, 0.88959 virtual	NT:Raw
Ratio:	0.85729 real, 0.85729 virtual	PST, custom CRC-32:Raw
Ratio:	0.89568 real, 0.89568 virtual	Raw-MD5:Raw
Ratio:	0.88947 real, 0.88947 virtual	bitshares, BitShares Wallet:Only one salt
Ratio:	0.84717 real, 0.84717 virtual	dynamic_1:Many salts
Ratio:	0.87747 real, 0.87747 virtual	dynamic_5:Many salts
Ratio:	0.85823 real, 0.85823 virtual	dynamic_14:Many salts
Ratio:	0.84385 real, 0.84385 virtual	dynamic_20:Many salts
Ratio:	0.86657 real, 0.87092 virtual	dynamic_20:Only one salt
Ratio:	0.81111 real, 0.81111 virtual	dynamic_32:Many salts
Ratio:	0.86595 real, 0.86595 virtual	dynamic_32:Only one salt
Ratio:	0.88713 real, 0.88713 virtual	dynamic_160:Raw
Ratio:	0.88158 real, 0.87731 virtual	dynamic_180:Raw
Ratio:	0.88763 real, 0.89211 virtual	dynamic_190:Raw
Ratio:	0.88889 real, 0.88448 virtual	dynamic_210:Raw
Ratio:	0.88401 real, 0.88829 virtual	dynamic_240:Raw
Ratio:	0.87973 real, 0.87973 virtual	dynamic_300:Raw
Ratio:	0.84864 real, 0.84864 virtual	dynamic_1008:Many salts
Ratio:	0.88010 real, 0.88010 virtual	dynamic_1008:Only one salt
Ratio:	0.79302 real, 0.79302 virtual	dynamic_1012:Many salts
Ratio:	0.87527 real, 0.87527 virtual	dynamic_1012:Only one salt
Ratio:	0.84772 real, 0.84772 virtual	dynamic_1013:Many salts
Ratio:	0.89303 real, 0.89303 virtual	dynamic_1013:Only one salt
Ratio:	0.82138 real, 0.82138 virtual	dynamic_1034:Many salts
Ratio:	0.87467 real, 0.87467 virtual	dynamic_1034:Only one salt
Ratio:	0.87096 real, 0.87096 virtual	dynamic_2010:Many salts
Ratio:	0.87438 real, 0.87438 virtual	itunes-backup, Apple iTunes Backup:Raw
Ratio:	0.84691 real, 0.84691 virtual	leet:Many salts
Ratio:	0.89507 real, 0.89507 virtual	md5crypt, crypt(3) $1$ (and variants):Many salts
Ratio:	0.89180 real, 0.88852 virtual	monero, monero Wallet:Raw
Ratio:	0.83397 real, 0.83397 virtual	mssql05, MS SQL 2005:Only one salt
Ratio:	0.88348 real, 0.88348 virtual	mssql12, MS SQL 2012/2014:Many salts
Ratio:	0.87887 real, 0.87887 virtual	mssql12, MS SQL 2012/2014:Only one salt
Ratio:	0.87328 real, 0.87328 virtual	multibit, MultiBit or Coinomi Wallet:Many salts
Ratio:	0.76986 real, 0.76986 virtual	plaintext, $0$:Raw
Ratio:	0.88289 real, 0.88289 virtual	sapb, SAP CODVN B (BCODE):Many salts
Ratio:	0.87599 real, 0.87599 virtual	sapb, SAP CODVN B (BCODE):Only one salt
Ratio:	0.88409 real, 0.88409 virtual	xsha512, Mac OS X 10.7:Many salts
Number of benchmarks:		605
Minimum:			0.76986 real, 0.76986 virtual
Maximum:			1.29946 real, 1.29946 virtual
Median:				0.98733 real, 0.98679 virtual
Median absolute deviation:	0.01849 real, 0.01861 virtual
Geometric mean:			0.97784 real, 0.97793 virtual
Geometric standard deviation:	1.05101 real, 1.05107 virtual
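
For anyone who wants to post-process a listing like the one above, here is a minimal sketch in Python. It is not part of relbench; the function names are hypothetical, and the line format is assumed from the output shown here (a "Ratio:" prefix, real and virtual ratios, then the benchmark name). The thresholds default to the 0.9 and 10 used in the diff.

```python
import re

# Assumed line shape, taken from the listing above:
# "Ratio:<TAB>0.89533 real, 0.89533 virtual<TAB>BKS, BouncyCastle:Raw"
RATIO_LINE = re.compile(r"^Ratio:\s+([0-9.]+) real, ([0-9.]+) virtual\s+(.+)$")


def parse_ratios(text):
    """Return (name, real_ratio, virtual_ratio) tuples for every Ratio line."""
    rows = []
    for line in text.splitlines():
        match = RATIO_LINE.match(line.strip())
        if match:
            real, virtual, name = match.groups()
            rows.append((name, float(real), float(virtual)))
    return rows


def outliers(rows, low=0.9, high=10.0):
    """Keep rows whose real ratio is below `low` or above `high`."""
    return [row for row in rows if row[1] < low or row[1] > high]


# The first line is copied from the listing; the second is made up
# purely to show a row that the filter drops.
sample = (
    "Ratio:\t0.76986 real, 0.76986 virtual\tplaintext, $0$:Raw\n"
    "Ratio:\t1.00000 real, 1.00000 virtual\tmade-up-example:Raw\n"
)

for name, real, virtual in outliers(parse_ratios(sample)):
    print(f"{name}: {real:.5f} real, {virtual:.5f} virtual")
```

Note that the summary statistics at the end of the listing cover all 605 benchmarks, so they cannot be recomputed from the filtered lines alone.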

A new run:
1.txt
2.txt

@claudioandre-br claudioandre-br added the binaries Binaries will be impacted label Jun 30, 2024
@solardiz
Member

I suggest that you keep both. What you see in any one relbench is mostly noise, but you do also see a ~2% reduction in geometric mean.
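
To make the summary statistics concrete, here is a minimal sketch of how such figures can be computed from a list of speed ratios. It is not relbench's code: the function names are hypothetical, and using the population (rather than sample) variance of the logarithms is an assumption.

```python
import math
import statistics


def geometric_mean(ratios):
    """exp of the arithmetic mean of the logs, so 2.0 and 0.5 cancel out."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def geometric_stddev(ratios):
    """exp of the standard deviation of the logs (population variance assumed)."""
    logs = [math.log(r) for r in ratios]
    mean = sum(logs) / len(logs)
    variance = sum((x - mean) ** 2 for x in logs) / len(logs)
    return math.exp(math.sqrt(variance))


def median_abs_deviation(ratios):
    """Median of the absolute distances from the median."""
    center = statistics.median(ratios)
    return statistics.median([abs(r - center) for r in ratios])


def percent_change(ratio):
    """Express a ratio as a percentage change relative to 1.0."""
    return (ratio - 1.0) * 100.0


# The geometric mean reported in the run above, as a percentage.
print(round(percent_change(0.97784), 1))
```

The geometric mean is the natural average for ratios because a format that doubles in speed and one that halves cancel out exactly, which an arithmetic mean would not give.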

Looking at the code, I see AVX512BW enables a more efficient endianness conversion, which probably speeds up SIMD SHA somewhat - not enough to be seen among the formats that became 10% slower, but perhaps by a few percent.

We also check for AVX512BW as a heuristic in DES_bs.h to distinguish CPUs vs. 2nd gen Xeon Phi. If you only leave AVX512F, then we'll be thrashing the cache on CPUs unnecessarily. IIRC, I tuned this in actual cracking runs on different devices, not just with --test.

Finally, I think for AVX512F we don't have vcmpeq_epi8_mask, which means a fallback from SIMD to scalar code in mgetl, which you did not benchmark. We do not have an intermediate fallback to AVX2 because we use the maximum length vtype.
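
To illustrate what is lost without a byte-equality mask, here is a small Python simulation. It is not the project's mgetl code, and it assumes the routine scans a buffer for a terminator byte such as a newline. All names are hypothetical. The chunked version mimics comparing 64 bytes at once and getting one mask bit per byte; the scalar version is the byte-at-a-time fallback.

```python
def eq_mask(chunk, needle):
    """Simulate a byte-equality compare: bit i is set when chunk[i] == needle."""
    mask = 0
    for i, byte in enumerate(chunk):
        if byte == needle:
            mask |= 1 << i
    return mask


def find_byte_chunked(buf, needle, width=64):
    """Scan `width` bytes per step, as a 512-bit vector compare would."""
    for start in range(0, len(buf), width):
        mask = eq_mask(buf[start:start + width], needle)
        if mask:
            # Index of the lowest set bit gives the first match in the chunk.
            return start + (mask & -mask).bit_length() - 1
    return -1


def find_byte_scalar(buf, needle):
    """Fallback: inspect one byte per iteration."""
    for i, byte in enumerate(buf):
        if byte == needle:
            return i
    return -1


data = b"a" * 70 + b"\n" + b"b\n"
print(find_byte_chunked(data, 0x0A), find_byte_scalar(data, 0x0A))
```

Both functions return the same index; the difference on real hardware is that the vector compare needs one step per 64 bytes, while the scalar loop needs one step per byte.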

@claudioandre-br
Member Author

claudioandre-br commented Jun 30, 2024

Thinking the other way around: is the F build relevant?

  • Can anyone cite relevant examples of CPUs that have AVX512F but not AVX512BW?

That is, CPUs that people actually buy (or that exist in some cloud)? Are both binaries really necessary?

@solardiz
Member

If you do insist on dropping one of these, then drop AVX512F. This will exclude 2nd gen Xeon Phi support (or cause a fallback to AVX2 in our binary builds), but those are obsoleted and exotic enough that the few people who still have them will generally make their own builds from source.
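
The effect of dropping the AVX512F build can be sketched as a simple selection over CPU feature flags. This is a hedged illustration only, not how the project's binaries or launcher actually choose a build: the function names and the list of shipped builds are assumptions, while the flag names (avx512f, avx512bw, avx2, avx, sse2) are those Linux reports in /proc/cpuinfo.

```python
def cpu_flags(cpuinfo_text):
    """Extract the feature flags of the first CPU from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_build(flags, shipped=("avx512bw", "avx2", "avx", "sse2")):
    """Return the first shipped build (best first) that the CPU supports.

    The default `shipped` tuple is a hypothetical list with AVX512F dropped.
    """
    for isa in shipped:
        if isa in flags:
            return isa
    return None


# Abbreviated, illustrative flag lines (real ones are much longer).
phi_like = "flags\t\t: fpu sse2 avx avx2 avx512f avx512cd"
xeon_like = "flags\t\t: fpu sse2 avx avx2 avx512f avx512cd avx512bw avx512vl"

print(pick_build(cpu_flags(phi_like)))   # AVX512F without AVX512BW
print(pick_build(cpu_flags(xeon_like)))  # AVX512F and AVX512BW
```

With AVX512F removed from the shipped list, a CPU that reports avx512f but not avx512bw lands on the AVX2 build, which matches the fallback described above.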

Current Intel CPUs either have AVX512BW or don't have any AVX-512.

For future Intel CPUs, the separation isn't going to be across the F vs. BW axis anyway, but rather across 256-bit vs. 512-bit support within AVX10, which we aren't fully prepared for.

https://www.tomshardware.com/news/intels-new-avx10-brings-avx-512-capabilities-to-e-cores
https://twitter.com/InstLatX64/status/1807342697964835292/photo/1

@claudioandre-br claudioandre-br added the keep open Do NOT close automatically. label Jul 6, 2024
@claudioandre-br claudioandre-br changed the title from "[FEAT]: Is the AVX512BW build really better?" to "[FEAT]: drop obsoleted and exotic AVX512F" Jul 6, 2024
claudioandre-br added a commit that referenced this issue Jul 16, 2024
Drop the obsolete and exotic AVX512F. Current Intel CPUs either have
AVX512BW or don't have any AVX-512. For future Intel CPUs, the separation
isn't going to be across the F vs. BW.

Close #444.

Signed-off-by: Claudio André <[email protected]>
claudioandre-br added a commit that referenced this issue Jul 23, 2024
Drop the obsolete and exotic AVX512F. Current Intel CPUs either have
AVX512BW or don't have any AVX-512. For future Intel CPUs, the separation
isn't going to be across the F vs. BW.

Close #444.

Signed-off-by: Claudio André <[email protected]>