This limits the number of samples per dimension to 2^16, but that
should be more than enough for any rendering situation. And this
reduces the direction numbers table size by a factor of 4.
This commit also takes advantage of the reduced bit space to
provide even better Owen scrambling, by utilizing the unused
16 bits for better mixing.
- Added an additional scramble round to the Owen scrambling, with
new optimized constants.
- Reordered the dimensions of the direction numbers to improve 2D
projections between adjacent dimensions. Only did this for
dimensions under ~40.
- Updated constants for Owen scrambling, based on better optimization
criteria.
- Increased randomness for the higher bits in the Owen scrambling.
- A simple and efficient implementation of Cranley-Patterson rotation
for the Sobol sampler.
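
For illustration, a minimal sketch of a Cranley-Patterson rotation,
not the code from this commit. It assumes samples are held as 32-bit
fixed point (the integer n stands for n / 2^32), in which case
"add an offset modulo 1" is just a wrapping add. The names `cp_rotate`
and `to_unit_f64` are hypothetical.

```rust
/// Cranley-Patterson rotation on a sample stored as 32-bit fixed point,
/// where the integer `n` represents the value n / 2^32 in [0, 1).
/// Wrapping addition is exactly "add the offset, then take modulo 1".
fn cp_rotate(sample: u32, offset: u32) -> u32 {
    sample.wrapping_add(offset)
}

/// Converts a 32-bit fixed-point sample to a float in [0, 1).
/// (Assumption: f64 is used so the conversion is exact.)
fn to_unit_f64(n: u32) -> f64 {
    n as f64 / 4_294_967_296.0
}
```

For example, rotating 0.75 by 0.5 wraps around to 0.25. The offset
would typically be a per-pixel, per-dimension random value.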
This produces identical results, but generates the direction
vectors from the original sources at build time. This makes
the source code quite a bit leaner, and will also make it easier
to play with other direction vectors in the future if the
opportunity arises.
1. Use better constants for the hash-based Owen scrambling.
2. Use golden ratio sampling for the wavelength dimension.
On the use of golden ratio sampling:
Since hero wavelength sampling uses multiple equally-spaced
wavelengths, and most samplers only consider the spacing of
individual samples, those samplers weren't actually doing a
good job of distributing all the wavelengths evenly. Golden
ratio sampling, on the other hand, does this effortlessly by
its nature, and the resulting reduction of color noise is huge.
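
A rough sketch of the idea, under assumptions: the wavelength
dimension is driven by a golden ratio sequence, and the hero
wavelengths are derived from that one value by equal rotations.
The function names and the choice of four wavelengths are
hypothetical, and the mapping from [0, 1) to actual wavelengths
is left out.

```rust
/// Fractional part of the golden ratio, (sqrt(5) - 1) / 2.
const GOLDEN_RATIO_FRAC: f64 = 0.618_033_988_749_894_9;

/// Returns the `i`-th point of the golden ratio sequence in [0, 1).
fn golden_ratio_sample(i: u32) -> f64 {
    (i as f64 * GOLDEN_RATIO_FRAC).fract()
}

/// Derives four equally spaced values in [0, 1) from a single sample,
/// as hero wavelength sampling does (assumption: four wavelengths).
fn hero_wavelengths(u: f64) -> [f64; 4] {
    [
        u,
        (u + 0.25).fract(),
        (u + 0.5).fract(),
        (u + 0.75).fract(),
    ]
}
```

Sample `i` of a pixel would then use
`hero_wavelengths(golden_ratio_sample(i))`, possibly with a
per-pixel offset added before taking the fractional part.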
The previous implementation was fundamentally broken because it
was mixing the bits in the wrong direction. This fixes that.
The constants have also been updated. I created a (temporary)
implementation of slow but full Owen scrambling to test against,
and these constants appear to give results consistent with that
on all the test scenes I rendered. It is still, of course,
possible that my full implementation was flawed, so more validation
in the future would be a good idea.
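
To make the "direction" issue concrete, here is a hedged sketch of
hash-based Owen scrambling in the Laine-Karras style. Owen scrambling
needs each output bit to depend only on the input bits above it,
while add and multiply naturally propagate from low bits to high
bits, so the value is bit-reversed before and after hashing. This is
one plausible reading of the fix, not the actual code. The constants
are placeholders for illustration, not the ones from this commit,
and the function names are hypothetical.

```rust
/// Hash in which every output bit depends only on input bits at the
/// same or lower positions. All multipliers are even, so `x * c` only
/// feeds strictly lower bits into each position.
fn low_to_high_hash(mut x: u32, seed: u32) -> u32 {
    x = x.wrapping_add(seed);
    x ^= x.wrapping_mul(0x6c50_b47c); // placeholder constant
    x ^= x.wrapping_mul(0xb82f_1e52); // placeholder constant
    x ^= x.wrapping_mul(0xc7af_e638); // placeholder constant
    x ^= x.wrapping_mul(0x8d22_f6e6); // placeholder constant
    x
}

/// Approximate Owen scramble: reversing the bits turns the hash's
/// low-to-high dependency into the high-to-low one that is required.
fn owen_scramble(x: u32, seed: u32) -> u32 {
    low_to_high_hash(x.reverse_bits(), seed).reverse_bits()
}
```

A quick sanity check for any such hash: two inputs that share their
top k bits must produce outputs that share their top k bits.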
This gives better variance than random digit scrambling, at a
very tiny runtime cost (so tiny it's lost in the noise of the
rest of the rendering process).
The important thing here is that I figured out how to use the
scrambling parameter properly to decorrelate pixels. Using the
same approach as with Halton (just adding an offset into the sequence)
is very slow with Sobol, since moving into the higher samples is
more computationally expensive. So using the scrambling parameter
instead was important.
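
As a sketch of the general idea only: instead of offsetting the
sample index per pixel, keep the index small and derive a per-pixel
scramble value. The example below uses the first Sobol dimension
(the van der Corput sequence) with XOR scrambling. Both function
names and the pixel hash are hypothetical stand-ins, not the code
in this commit.

```rust
/// First Sobol dimension (van der Corput, base 2) as 32-bit fixed
/// point, with random digit scrambling applied via XOR.
fn scrambled_van_der_corput(index: u32, scramble: u32) -> u32 {
    index.reverse_bits() ^ scramble
}

/// Derives a scramble value from pixel coordinates.
/// (Assumption: an arbitrary integer hash, chosen for illustration.)
fn pixel_seed(x: u32, y: u32) -> u32 {
    let mut h = x.wrapping_mul(0x9E37_79B1) ^ y.wrapping_mul(0x85EB_CA6B);
    h ^= h >> 16;
    h = h.wrapping_mul(0x7FEB_352D);
    h ^= h >> 15;
    h
}
```

Every pixel then reads samples 0, 1, 2, ... of the same sequence,
but through a different scramble, so the stratification within a
pixel is preserved while neighboring pixels differ.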
It has a slight color cast to it at the moment, I believe due to
incorrect color space conversions, not because of the upsampling
method itself. So Meng upsampling is still the active method
at the moment.
Tests random vectors, and makes sure that encoding/decoding
round trip only introduces precision errors below a certain
threshold.
Pretty confident that the implementation is correct now.
This way the executable code can be worked with directly, instead
of via the python file that generates the rust code.
Also introduced some small optimizations.
Rust 1.27 stabilized a variety of CPU intrinsics, including SIMD
on x86/64 platforms. This commit moves to using those intrinsics
for the optimized Float4 implementation. This means Psychopath
now compiles on stable Rust with all optimizations. Yay!
Might move this into the main source base at some point, but
I'm not totally sure about the correctness of the table yet, so
would like to generate it for now.
They are now generated by a build.rs script from nothing but the
colorspace's primaries, which makes it super easy to add more
colorspaces. So easy that I added three more: ACES AP0, ACES AP1
and Rec.2020.
This lays the foundation for supporting output to different
colorspaces.
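
For reference, a sketch of the standard derivation of an RGB-to-XYZ
matrix from chromaticity coordinates, assuming that is the kind of
data being generated. It takes the white point as an input alongside
the three primaries. All names here are hypothetical, and this is
not the build.rs code itself.

```rust
type Mat3 = [[f64; 3]; 3];

/// Converts an xy chromaticity to XYZ with Y = 1.
fn xy_to_xyz(c: (f64, f64)) -> [f64; 3] {
    [c.0 / c.1, 1.0, (1.0 - c.0 - c.1) / c.1]
}

/// Inverts a 3x3 matrix via its adjugate. Assumes a non-zero determinant.
fn invert(m: &Mat3) -> Mat3 {
    // Signed cofactor, using cyclic indices to absorb the sign.
    let cof = |r: usize, c: usize| {
        let (r1, r2) = ((r + 1) % 3, (r + 2) % 3);
        let (c1, c2) = ((c + 1) % 3, (c + 2) % 3);
        m[r1][c1] * m[r2][c2] - m[r1][c2] * m[r2][c1]
    };
    let det = m[0][0] * cof(0, 0) + m[0][1] * cof(0, 1) + m[0][2] * cof(0, 2);
    let mut inv = [[0.0; 3]; 3];
    for r in 0..3 {
        for c in 0..3 {
            inv[r][c] = cof(c, r) / det; // transpose of the cofactor matrix
        }
    }
    inv
}

/// Builds the RGB-to-XYZ matrix from xy chromaticities of the three
/// primaries and the white point.
fn rgb_to_xyz_matrix(r: (f64, f64), g: (f64, f64), b: (f64, f64), w: (f64, f64)) -> Mat3 {
    let (pr, pg, pb, pw) = (xy_to_xyz(r), xy_to_xyz(g), xy_to_xyz(b), xy_to_xyz(w));
    // Columns are the unscaled primaries.
    let m = [
        [pr[0], pg[0], pb[0]],
        [pr[1], pg[1], pb[1]],
        [pr[2], pg[2], pb[2]],
    ];
    let inv = invert(&m);
    // Per-primary scale so that RGB = (1, 1, 1) maps to the white point.
    let mut s = [0.0; 3];
    for i in 0..3 {
        s[i] = inv[i][0] * pw[0] + inv[i][1] * pw[1] + inv[i][2] * pw[2];
    }
    let mut out = m;
    for row in out.iter_mut() {
        for c in 0..3 {
            row[c] *= s[c];
        }
    }
    out
}
```

With the Rec.709 primaries and a D65 white point, the middle row of
the result is the familiar set of luminance weights.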
This is more a peace-of-mind thing than anything else. But it
also lets us make the number of LDS dimensions lower without
worrying, which in turn makes the code smaller.
After implementation, it does appear to make rendering noticeably
slower compared to what I was doing before. At very
low sampling rates it does provide a bit of visual improvement,
but by the time you get to even just 16 samples per pixel its
benefits seem to disappear.
Due to the slowdown and the minimal gains, I'll be removing
this in the next commit. But I want to commit it so I don't
lose the code, since it was an interesting experiment with
some promising results.