The whole point of these formats is to compress data into less
space, so let's not leave the job of actually packing it into the
space-saving form to the client code.
It is identical to the 32-bit format, except with more precision
and range due to using more bits. This format should comfortably
store any color information with precision easily exceeding the
limits of human vision.
This also makes encoding faster. However, it no longer rounds to
the nearest representable value when encoding, and instead
floors. This seems like a reasonable tradeoff: if you want more
precision... you should use a format with more precision.
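
As a rough illustration of the tradeoff, here is a sketch of quantizing a value in [0, 1] to a fixed number of bits by flooring versus rounding. The function names and the plain fixed-point layout are hypothetical and are not taken from the actual encoder; `bits` is assumed to be between 1 and 24.

```rust
/// Quantizes `x` in [0.0, 1.0] to `bits` bits by flooring.
/// Cheaper, but the error can approach one full quantization step.
fn quantize_floor(x: f32, bits: u32) -> u32 {
    let max = (1u32 << bits) - 1;
    // `as u32` truncates toward zero, which is a floor for non-negative input.
    ((x * (max as f32)) as u32).min(max)
}

/// Quantizes `x` in [0.0, 1.0] to `bits` bits by rounding to nearest.
/// One extra add, but the error is at most half a quantization step.
fn quantize_round(x: f32, bits: u32) -> u32 {
    let max = (1u32 << bits) - 1;
    ((x * (max as f32) + 0.5) as u32).min(max)
}
```

With 8 bits, an input of 0.5 scales to 127.5, so flooring gives 127 while rounding gives 128.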
It was a worthwhile experiment, but for it to really work it needs
a proper luma-chroma separation, which is both slower than I want
and requires knowing the colorspace being used.
I might have another go at this based on the TIFF LogLuv color
format, requiring XYZ as input.
Before this it needed SSE 4.1, which is not strictly present on
all x86-64 platforms. This will still compile the faster path if
SSE 4.1 is available, but has an alternate path as well for all
x86-64 platforms.
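
A minimal sketch of the compile-time selection described here, assuming the two paths are chosen with `cfg` attributes. The function `mul_lanes` and the choice of a 32-bit lane multiply are hypothetical examples, and the fallback is plain scalar code rather than whatever the real alternate path uses.

```rust
/// Multiplies four 32-bit lanes, keeping the low 32 bits of each product.
/// Fast path: only compiled when SSE 4.1 is enabled at build time.
#[cfg(all(target_arch = "x86_64", target_feature = "sse4.1"))]
fn mul_lanes(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_mullo_epi32, _mm_storeu_si128};
    let mut out = [0u32; 4];
    // The cfg above guarantees the SSE 4.1 instruction is available.
    unsafe {
        let va = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let vb = _mm_loadu_si128(b.as_ptr() as *const __m128i);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, _mm_mullo_epi32(va, vb));
    }
    out
}

/// Alternate path: used whenever SSE 4.1 is not enabled at build time.
#[cfg(not(all(target_arch = "x86_64", target_feature = "sse4.1")))]
fn mul_lanes(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    [
        a[0].wrapping_mul(b[0]),
        a[1].wrapping_mul(b[1]),
        a[2].wrapping_mul(b[2]),
        a[3].wrapping_mul(b[3]),
    ]
}
```

Building with something like `RUSTFLAGS="-C target-feature=+sse4.1"` would select the first version; a default x86-64 build gets the second.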
The biggest one is avoiding a bunch of bit reversals by keeping
numbers in bit-reversed form for as long as we can.
Also reduced the hashing rounds: just 2 rounds seems to be enough
for a reasonable amount of statistical independence on both the
scrambling and shuffling. I tested each one independently, with
the other disabled. This makes sense because in normal contexts 3
rounds is enough, but in this case both act as input to yet
another hash, which is effectively doing more rounds.
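
To illustrate the bit-reversal saving mentioned above, here is a sketch assuming a hash that operates on bit-reversed values, in the style of hash-based Owen scrambling. The function names and the two-round hash with its placeholder even constants are assumptions for illustration only, not the ones used in this code.

```rust
/// A hypothetical two-round hash that expects its input in bit-reversed form.
/// The multipliers are even placeholder constants, not tuned values.
fn hash_reversed(mut x: u32, seed: u32) -> u32 {
    x = x.wrapping_add(seed);
    x ^= x.wrapping_mul(0x6c50_b47c);
    x ^= x.wrapping_mul(0xb82f_1e52);
    x
}

/// One scramble in normal bit order: reverse, hash, reverse back.
fn scramble(x: u32, seed: u32) -> u32 {
    hash_reversed(x.reverse_bits(), seed).reverse_bits()
}

/// Two scrambles done naively: four bit reversals in total.
fn scramble_twice_naive(x: u32, seed_a: u32, seed_b: u32) -> u32 {
    scramble(scramble(x, seed_a), seed_b)
}

/// Same result, but the value stays bit-reversed between the two
/// hashes, so only two bit reversals are needed.
fn scramble_twice_fused(x: u32, seed_a: u32, seed_b: u32) -> u32 {
    let r = x.reverse_bits();
    hash_reversed(hash_reversed(r, seed_a), seed_b).reverse_bits()
}
```

The two middle reversals in the naive version cancel each other out, which is why dropping them does not change the result.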
Turns out it causes interference with the Sobol sampler.
Also tweaked some other things about sampling after removing
golden ratio sampling, to make things better.
Due to the undefined behavior of shifting a number by its
bit-width, the Sobol sampler would panic when sample index
`1 << 15` was requested.
This fixes it without introducing any additional checks or
operations.
This limits the number of samples per dimension to 2^16, but that
should be more than enough for any rendering situation. And this
reduces the direction numbers table size by a factor of 4.
This commit also takes advantage of the reduced bit space to
provide even better Owen scrambling, by utilizing the unused
16 bits for better mixing.
- Added an additional scramble round to the Owen scrambling, with
new optimized constants.
- Reordered the dimensions of the direction numbers to improve 2D
projections between adjacent dimensions. Only did this for
dimensions under ~40.
- Updated constants for Owen scrambling, based on better optimization
criteria.
- Increased randomness for the higher bits in the Owen scrambling.
- A simple and efficient implementation of Cranley-Patterson rotation
for the Sobol sampler.
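
For reference, a Cranley-Patterson rotation just adds a per-dimension offset to each sample and wraps the result back into [0, 1). A minimal sketch follows, with hypothetical function names; whether the sampler applies it to fixed-point integers or to floats is an assumption here, so both forms are shown.

```rust
/// Cranley-Patterson rotation on a sample stored as a 32-bit fixed-point
/// fraction in [0, 1): integer overflow performs the wrap-around for free.
fn cranley_patterson_u32(sample: u32, offset: u32) -> u32 {
    sample.wrapping_add(offset)
}

/// The same rotation on floats, with both inputs assumed to be in [0, 1).
fn cranley_patterson_f32(sample: f32, offset: f32) -> f32 {
    let x = sample + offset;
    if x >= 1.0 { x - 1.0 } else { x }
}
```

For example, rotating 0.75 by 0.5 wraps around to 0.25 in both versions.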
I forgot to add this in. It wasn't noticeable, since the QMC
sequences did use the seed, and we probably don't ever get to
the random values for 15+ light bounces. But it seems worth
fixing anyway!
This produces identical results, but generates the direction
vectors from the original sources at build time. This makes
the source code quite a bit leaner, and will also make it easier
to play with other direction vectors in the future if the
opportunity arises.
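
A rough sketch of what build-time generation can look like, assuming a Cargo build script that emits a Rust source file containing the table. The helper `render_table` and the table name are hypothetical, and parsing of the original source data is left out.

```rust
/// Renders a slice of values as Rust source text for a `pub const` array,
/// suitable for writing to a generated file from a build script.
fn render_table(name: &str, values: &[u32]) -> String {
    let mut src = format!("pub const {}: [u32; {}] = [\n", name, values.len());
    for v in values {
        // One zero-padded hexadecimal literal per line.
        src.push_str(&format!("    0x{:08x},\n", v));
    }
    src.push_str("];\n");
    src
}
```

In a `build.rs`, the returned string would typically be written to a file under the directory named by the `OUT_DIR` environment variable, and then pulled into the crate with `include!(concat!(env!("OUT_DIR"), "/vectors.rs"))`, where the file name `vectors.rs` is just a placeholder.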