NASAM: Not Another Strange Acronym Mixer
The various abbreviations became more and more unpronounceable, so I give you "NASAM". This is a new mixer that has been tested quite thoroughly. Since I have only two machines at my disposal (a Ryzen 7 1700 and an i7-5820), testing takes quite some time. I will complete this posting with data on sparse gammas and other interesting things as I think of them.
This is the fastest permutation function I am aware of that passes RRC-64-42-TF2-0.94.
Tommy Ettinger's function Pelican also passes RRC-64-42-TF2-0.94 and has similar speed.
```c
uint64_t nasam(uint64_t x) {
  // ror64(a, r) is a 64-bit rotation of a by r bits.
  x ^= ror64(x, 25) ^ ror64(x, 47);
  x *= 0x9E6C63D0676A9A99UL;
  x ^= x >> 23 ^ x >> 51;
  x *= 0x9E6D62D06F6A9A9BUL;
  x ^= x >> 23 ^ x >> 51;
  return x;
}
```
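The snippet assumes a ror64 helper and the stdint.h types; a minimal sketch of such a rotation (my wording, not part of the original code) could be:

```c
#include <stdint.h>

// Rotate 'a' right by 'r' bits; the (64 - r) & 63 form keeps r == 0 well defined.
// Most compilers recognize this pattern and emit a single rotate instruction.
static inline uint64_t ror64(uint64_t a, unsigned r) {
  return (a >> r) | (a << ((64 - r) & 63));
}
```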
A few variations on NASAM may also be of interest:
1. Adding a constant after the first multiplication ("rrma2xsm2xs" in this post).
2. Xoring a constant prior to the rotating/xoring ("xNASAM" in this post).
3. Xoring a constant prior to the rotating/xoring and after the last xor-shift ("xNASAMx" in this post).
Assuming the statistical quality of variants 1--3 above is at least as good as that of NASAM, the most interesting thing to analyze is of course speed. xNASAMx also has another property that might be of interest in some use cases.
For speed measurements, I've used Chris Wellons's shootout program, described here.
| Mixer | i7-5820, gcc 9.2.1 | i7-5820, clang 8.0.1 | i7-5820, % | Ryzen 7 1700, gcc 9.2.1 | Ryzen 7 1700, clang 8.0.1 | Ryzen 7 1700, % |
|---|---|---|---|---|---|---|
| baseline | 10165 | 10213 | - | 8639 | 8681 | - |
| splitmix64 | 6276 | 7334 | 100.00% | 6471 | 6971 | 100.00% |
| rrmxmx | 6091 | 6064 | 83.04% | 5949 | 6361 | 91.24% |
| NASAM | 4412 | 4489 | 61.20% | 4385 | 4816 | 69.08% |
| Pelican | 4320 | 4382 | 59.74% | 4256 | 4638 | 66.53% |
| xNASAM | 4156 | 4313 | 58.81% | 3959 | 4632 | 66.45% |
| rrma2xsm2xs | 4278 | 4214 | 58.33% | 4083 | 4475 | 64.20% |
| xNASAMx | 3921 | 4067 | 55.46% | 3762 | 3806 | 54.60% |

Higher numbers are better. The % columns show each mixer's best result relative to splitmix64's best result on the same machine.
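As a rough sketch of how such throughput numbers can be obtained in principle (this is not Wellons's shootout program, just a minimal, hypothetical timing loop that assumes the nasam function above is in scope):

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

// Hypothetical timing loop: chain the calls through a data dependency so
// the work cannot be optimized away, then report a throughput figure
// (higher is better).
static double mixes_per_second(uint64_t (*mix)(uint64_t), uint64_t iters) {
  struct timespec t0, t1;
  uint64_t x = 0x123456789ABCDEF0ULL;
  clock_gettime(CLOCK_MONOTONIC, &t0);
  for (uint64_t i = 0; i < iters; i++)
    x = mix(x + i);                 // dependency chain across iterations
  clock_gettime(CLOCK_MONOTONIC, &t1);
  volatile uint64_t sink = x;       // keep the final value alive
  (void)sink;
  double secs = (double)(t1.tv_sec - t0.tv_sec)
              + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
  return (double)iters / secs;
}

int main(void) {
  printf("nasam: %.0f mixes/s\n", mixes_per_second(nasam, 100000000ULL));
  return 0;
}
```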
rrma2xsm2xs
The preliminary name of NASAM was rrm2xsm2xs, for "2 rotations, multiply, 2-term xor-shift, multiply, 2-term xor-shift". The "rrma..." variant simply adds a constant after the first multiply:
```c
uint64_t rrma2xsm2xs(uint64_t x) {
  // ror64(a, r) is a 64-bit rotation of a by r bits.
  x ^= ror64(x, 25) ^ ror64(x, 47);
  x = x * 0x9E6C63D0676A9A99UL + C; // Avoids the trivial fixpoint at 0.
  x ^= x >> 23 ^ x >> 51;
  x *= 0x9E6D62D06F6A9A9BUL;
  x ^= x >> 23 ^ x >> 51;
  return x;
}
```
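Here, C is a nonzero constant that is left unspecified above. Purely as an illustration, any such constant could be plugged in, for example:

```c
// Illustrative choice only; 2^64 divided by the golden ratio is a common but
// arbitrary pick and has not been tested as C in this post.
#define C 0x9E3779B97F4A7C15UL
```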
It is worth noting that the speed penalty compared to NASAM is small (4%--7%).
xNASAM
The same as NASAM, but with a constant xored in before the mixing steps. (Adding instead of xoring would merely be equivalent to a linear offset in the sequence.) This is a much more promising construction where the constant could be used for something resembling independent streams. Some care will have to be taken; I think there's a good way of segmenting a 64-bit stream index into an increment part and an xor part, for a total of $2^{64}$ distinct streams, each with period $2^{64}$.
```c
uint64_t xnasam(uint64_t x, uint64_t c) {
  x ^= c;
  // ror64(a, r) is a 64-bit rotation of a by r bits.
  x ^= ror64(x, 25) ^ ror64(x, 47);
  x *= 0x9E6C63D0676A9A99UL;
  x ^= x >> 23 ^ x >> 51;
  x *= 0x9E6D62D06F6A9A9BUL;
  x ^= x >> 23 ^ x >> 51;
  return x;
}
```
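To make the stream idea concrete, here is a hypothetical counter-based wrapper around xnasam (the names and structure are mine, and it does not implement the increment/xor segmentation hinted at above): each generator instance keeps a 64-bit counter and its own xor constant.

```c
#include <stdint.h>

// Hypothetical wrapper: 'ctr' walks through all 2^64 inputs (period 2^64),
// while 'stream' is the per-stream constant xored in by xnasam.
typedef struct {
  uint64_t ctr;
  uint64_t stream;
} xnasam_gen;

static uint64_t xnasam_next(xnasam_gen *g) {
  return xnasam(g->ctr++, g->stream);
}
```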
xNASAMx
The same as NASAM, but with a constant xored in both before and after the mixing steps. This has the interesting property that it masks, to a small extent, the counter and the increment used. The idea is similar to the XEX encryption mode, with NASAM playing the role of an unkeyed block cipher. This has zero cryptographic security, of course. Still, it may be useful if all you need is to make deducing the counter and/or increment mildly inconvenient. The final xor step makes the function harder to invert without knowledge of the constant used.
This version is also the slowest of the mixers published here, at 54%--56% of the speed of SplitMix and 80%--90% of the speed of NASAM. I rather doubt that it behaves any better than xNASAM when it comes to using the constant to generate distinct streams.
```c
uint64_t xnasamx(uint64_t x, uint64_t c) {
  x ^= c;
  // ror64(a, r) is a 64-bit rotation of a by r bits.
  x ^= ror64(x, 25) ^ ror64(x, 47);
  x *= 0x9E6C63D0676A9A99UL;
  x ^= x >> 23 ^ x >> 51;
  x *= 0x9E6D62D06F6A9A9BUL;
  x ^= x >> 23 ^ x >> 51;
  x ^= c;
  return x;
}
```
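Viewed through the XEX analogy, xNASAMx is just NASAM with the same constant whitening both input and output; the following reformulation (the _alt name is mine) is equivalent to the function above:

```c
// Equivalent to xnasamx above: XEX-style whitening around the unkeyed
// NASAM permutation.
uint64_t xnasamx_alt(uint64_t x, uint64_t c) {
  return nasam(x ^ c) ^ c;
}
```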
Below, 256 subtests of 4 TB each (1 PB in total) didn't show any anomalies. This of course doesn't mean that there aren't any; just that RRC-64-42-TF2-0.94 is too weak a test to detect them.
ror64(reverse(x, R) ^

| Offset | FORWARD: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | REVERSED: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 16 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 32 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 48 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
ror64(reverse(x, R) ^

| Offset | FORWARD: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | REVERSED: 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 16 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 32 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| 48 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |