måndag 16 december 2019

Stronger, better, morer, Moremur; a better Murmur3-type mixer.

I'll keep this brief: The original constants used in MurmurHash3 and Stafford's Variant 13 are not as good as they could have been. This affects the strength of SplitMix64.

The testing procedure I've used is described in an earlier post. There is one extension, compared to the earlier post: every rotation and reversal is also complemented for a total of 256 subtests. I'll call the extended version "RRC-64-40-TF2-0.94" for "Reversed, rotated, complemented-64 bits, 2^40 bytes, PractRand 0.94 with -tf 2". Without further ado, here's "Moremur", a statistically stronger but equally fast mixer:

uint64_t moremur(uint64_t x) {
  x ^= x >> 27;
  x *= 0x3C79AC492BA7B653UL;
  x ^= x >> 33;
  x *= 0x1C69B3F74AC4AE35UL;
  x ^= x >> 27;

  return x;
}
One strong flaw with this construction, i.e. the first operation being an xor-shift, is that there are many increments that will lead to few or none of the less significant bits being one. This in turn implies that the less significant bits will be zero after the first two operations, xor-shift and multiply. Particularly bad increments have this form: $\gamma = (2^s + 1)a, a \lt 2^s$, $s$ being the first shift and $a$ being sparse.

In the table below, Moremur outperforms Murmur3 and Variant13 by quite some margin.

Table 1: Test length where PractRand fails
$\gamma$ Murmur3 Variant13 Moremur
0x0000000000000001
$2^{17}$ $2^{19}$ $2^{33}$
0x0000000000000003
$2^{17}$ $2^{17}$ $2^{34}$
0x0000000000000005
$2^{17}$ $2^{18}$ $2^{34}$
0x0000000000000009
$2^{16}$ $2^{18}$ $2^{34}$
0x0000010000000001
$2^{18}$ $2^{21}$ $2^{41}$
0xffffffffffffffff
$2^{17}$ $2^{19}$ $2^{33}$
0x0000000000ffffff
$2^{17}$ $2^{23}$ $2^{37}$
0xffffff0000000001
$2^{18}$ $2^{23}$ $2^{43}$
0x0000000000555555
$2^{18}$ $2^{28}$ $2^{42}$
0x1111111111110001
$2^{25}$ $2^{30}$ $> 2^{46}$
0x7777777777770001
$2^{28}$ $2^{34}$ $> 2^{45}$
0x7f7f7f7f33333333
$2^{30}$ $2^{33}$ $> 2^{46}$
0x5555550000000001
$2^{21}$ $2^{27}$ $2^{45}$
0xc45a11730cc8ffe3
$2^{39}$ $> 2^{42}$ $2^{?}$
0x2b13b77d0b289bbd
$2^{39}$ $> 2^{42}$ $2^{?}$
0x40ead42ca1cd0131
$2^{40}$ $> 2^{42}$ $2^{?}$
As can be seen from the tables below, these constants are a great improvement over the ones I've been aware of so far (original MurmurHash3 and Variant13). Moremur still fails RRC-64-40-TF2-0.94 but far from as badly as the earlier constants do.
Moremur failures, $2^{40}$ bytes maximum.
ror64(reverse(x, R) ^ 0x0000000000000000, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 3319191922222425282728291616161716171919202827283429293536373738
16 1717181919191920212223232425262736353539403431313940393837343436
32 2829303131323333343434343433323228343332323133333434343535363736
48 3228282829292929303030303131313231313133363736283334313128252219
ror64(reverse(x, R) ^ 0xFFFFFFFFFFFFFFFF, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 3319191922222425282918281516161716171921212628303431343436363737
16 1718181919191919212223242425262736353539403431313940393837343436
32 2829303032333334343535343432333228343332323133333434343536363636
48 3228282828282929293030303131323231313133353534283232313028252219
Murmur3 failures, $2^{40}$ bytes maximum.
ror64(reverse(x, R) ^ 0x0000000000000000, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1718181817161616161515171514151515171818181717161614141414141417
16 1414141414151515151516161616151517171617171717181917181817161414
32 1616171716161414161514151515151515161718181717171716141414141515
48 1514141414141515151515161717171717171717171717181819191818181716
ror64(reverse(x, R) ^ 0xFFFFFFFFFFFFFFFF, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1717181717151616161515171415151514171818171716171615141414141617
16 1414141415151515151516161616151517171617171717181818171615151414
32 1617171716141414151514151415151415151718181717171716151414141415
48 1414141414141515151515161717171717171716171717171819191818181715
Variant 13, $2^{40}$ bytes maximum.
ror64(reverse(x, R) ^ 0x0000000000000000, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1917181818171818181818191916171716171818181919202019202019202019
16 1717161716161717161716171717181820192118192119222021222120222219
32 1919191919191919191920201920191918181819192020192020192020202019
48 1717161717171617161717171818191921181821192220211918181818181818
ror64(reverse(x, R) ^ 0xFFFFFFFFFFFFFFFF, r)
FORWARDREVERSED
Offset 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0 1917181818171818181818181916171716171818181919202019202019202019
16 1717161716171717161716171717171820192118182119222021222120222219
32 1919191919191919191920201920201918181819192020192020192020192019
48 1617161617171617161717171818191921191921192220211918181818181818
The constants were found through a stochastic hill climber I wrote using CUDA. The first testing I did of Murmur3 and Variant13 revealed that they had fairly weak higher order avalanche properties. Letting the fitness function include metrics for 1st to 4th order avalanche proved to be quite successful. The CUDA program was used for sieving; once some possibly good candidates turned up, I ran them through RRC-64-40-TF2-0.94. The constants above were the best I could find; the fitness function, combining several tests, was too insensitive to make any further improvements possible with the approach I used.

Conclusion

The shifts and constants above give a permutation that is statistically closer to a pseudo-random permutation than the ones used in SplitMix64, java.util.SplittableRandom and java.util.ThreadLocalRandom. In particular, the number of bad gammas/increments should be considerably fewer.

Applying the RRC-64-test to the degski64 and Lea64 mixers

In LXM: better splittable pseudorandom number generators (and almost as fast) by Guy L Steele and Sebastiano Vigna, approaches for pseud...