Commit Graph

159 Commits

Author SHA1 Message Date
96b8dd84b9 Cleaned up the u32 trifloat implementation.
This also makes encoding faster.  However, it no longer does
rounding to the nearest precision when encoding, and instead does
flooring.  This seems like a reasonable tradeoff: if you want more
precision... you should use a format with more precision.
2020-09-18 21:04:16 +09:00
f13ffac7bd Removed the experimental luma-chroma color format.
It was a worthwhile experiment, but for it to really work it needs
a really proper luma-chroma separation, which is both slower than
I really want, and requires knowing the colorspace being used.

I might make another go at this based on the TIFF LogLUV color
format, requiring XYZ as input.
2020-09-18 17:57:13 +09:00
c1f516c2b6 Use a better chroma formula for the RGB32 format.
This makes much better use of the bit space.
2020-09-13 11:22:48 +09:00
bd6cf359b4 Some code clean-up in the RGB32 encoding/decoding code. 2020-09-13 06:44:42 +09:00
d6ab9d06be More work on the packed HDR RGB 32-bit format.
Switched to a different chroma encoding, which is notably faster
and never produces negative values when decoded.
2020-09-11 21:57:43 +09:00
339568ec0c Remove stale comments. 2020-09-10 22:42:20 +09:00
7066c38189 Implement an experimental packed HDR RGB 32-bit storage format. 2020-09-10 22:36:20 +09:00
99b6ddfa54 Run clippy on Sobol sampler, and fix/silence warnings. 2020-05-01 16:07:19 +09:00
78acaa7b63 Make Sobol SIMD code work on all x86-64 platforms.
Before this it needed SSE 4.1, which is not strictly present on
all x86-64 platforms.  This will still compile the faster path if
SSE 4.1 is available, but has an alternate path as well for all
x86-64 platforms.
2020-05-01 15:32:18 +09:00
fd75a72655 More code tidying of the Sobol sampler.
In particular, clearer and more concise documentation.
2020-05-01 10:42:02 +09:00
45241784fb Code tidying on the Sobol sampler.
Also swapped the sample index and dimension parameters in the function
signature.  This feels more intuitive.
2020-04-30 22:46:47 +09:00
1f75e7854e Properly hash all four scramble values in the 4d Sobol sampler. 2020-04-25 18:12:35 +09:00
72adbedbb4 Accelerate the Sobol sampler with SIMD on x86_64. 2020-04-24 23:32:43 +09:00
0dfe916523 Preparing for SIMD accelerated Sobol sampling.
This implements the 4-wide API, and moves the renderer over to it.
But the actual implementation is still scalar.
2020-04-24 21:05:29 +09:00
78b5cf4c53 Faster lk hash for the sobol sampler.
This gets about the same quality as the previous hash, but in
far fewer operations.
2020-04-24 10:47:53 +09:00
3ffaf7bee5 Forgot to use a wrapping multiply instead of a straight multiply. 2020-04-23 13:52:04 +09:00
29fcc69ae1 Move seed hashing into the lk_scramble function. 2020-04-23 13:36:04 +09:00
b776bf56b8 Improved lk scrambling function.
This actually gets very close to the behavior of a full per-bit
hash, except that you still need a fairly random seed.
2020-04-23 08:31:03 +09:00
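A minimal sketch of a Laine-Karras style hash used for Owen scrambling, to illustrate the idea in the commit above. The function names and the multiplier constants are assumptions for illustration only, not the ones used in this repository. The key property is that every step lets a bit influence only the bits above it, so applying the hash to a bit-reversed value scrambles each bit based only on the more significant bits, as Owen scrambling requires.

```rust
/// Laine-Karras style hash. The add and the multiplies by even constants
/// only carry information from lower bits to higher bits.
/// The constants here are illustrative, not taken from this repository.
pub fn lk_hash(mut x: u32, seed: u32) -> u32 {
    x = x.wrapping_add(seed);
    x ^= x.wrapping_mul(0x6c50_b47c);
    x ^= x.wrapping_mul(0xb82f_1e52);
    x ^= x.wrapping_mul(0xc7af_e638);
    x ^= x.wrapping_mul(0x8d22_f6e6);
    x
}

/// Hash-based approximation of an Owen scramble: reverse the bits so the
/// most significant bit becomes the lowest, hash, then reverse back.
pub fn owen_scramble(x: u32, seed: u32) -> u32 {
    lk_hash(x.reverse_bits(), seed).reverse_bits()
}
```

As the commit message notes, a function of this shape still wants a well-mixed `seed`; passing small sequential seeds directly would be a weak choice.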
22b7151919 Use independent Sobol sequences for different sets of dimensions.
This reduces correlation artifacts, while preserving good convergence
for the most part.
2020-04-22 18:48:12 +09:00
aecff883ab Misc optimizations on the Sobol sampler.
The biggest one is avoiding a bunch of bit reversals by keeping
numbers in bit-reversed form for as long as we can.

Also reduced the hashing rounds: just 2 rounds seems to be enough
for a reasonable amount of statistical independence on both the
scrambling and shuffling.  I tested both independently, keeping
the other with no scrambling/shuffling respectively.  This makes
sense because in normal contexts 3 is enough, but in this case
both act as input to yet another hash which is effectively doing
more rounds.
2020-04-22 16:21:50 +09:00
660576ec2b Make Sobol seeding more robust.
This way incremental seeds can be passed (e.g. 0, 1, 2, etc.) and
still get statistically independent Sobol sequences.
2020-04-22 14:19:57 +09:00
085d1d655e A single unified Sobol implementation.
This version of Sobol implements both Owen scrambling and index
permutation, allowing for multiple statistically independent
Sobol sequences.
2020-04-19 01:11:43 +09:00
f36b71184a Use better seeding for slow version of Owen scramble as well. 2020-04-17 13:18:37 +09:00
c4e1bedd43 Improve Owen scrambling by seeding with add instead of xor.
Also removed some unnecessary complexity from the implementation,
and use different constants.
2020-04-17 10:42:35 +09:00
e46fc5a4d6 Cleaning up Sobol sampling code.
In particular, removing some things I tried when the golden ratio
sampling was causing problems, but that are now no longer needed.
2020-03-21 09:00:43 +09:00
3916043f33 Removed golden ratio sampling.
Turns out it causes interference with the Sobol sampler.

Also tweaked some other things about sampling after removing
golden ratio sampling, to make things better.
2020-03-19 19:49:45 +09:00
e014df2b1a Fix rare panic in Sobol sampler.
Due to the undefined behavior of shifting a number by its
bit-width, the Sobol sampler would panic when sample index
`1 << 15` was requested.

This fixes it without introducing any additional checks or
operations.
2020-03-19 09:59:19 +09:00
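A small illustration of the hazard described in the commit above, not the actual fix (which, per the message, needed no extra checks). In Rust, shifting an integer by its full bit-width is an overflow: it panics in debug builds, while the `wrapping_*` variant masks the shift amount. The helper name `shl_or_zero` is hypothetical.

```rust
/// Shifts `x` left by `n` bits, yielding 0 when `n` is at least the
/// bit-width of `u16`, rather than panicking or wrapping the shift amount.
pub fn shl_or_zero(x: u16, n: u32) -> u16 {
    // `checked_shl` returns `None` when `n >= 16`.
    x.checked_shl(n).unwrap_or(0)
}

/// Same shift using `wrapping_shl`, which masks `n` to `n % 16`.
/// With `n == 16` this returns `x` unchanged, which is rarely what is wanted.
pub fn shl_wrapping(x: u16, n: u32) -> u16 {
    x.wrapping_shl(n)
}
```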
7daa133e15 Only use 16 bit integers for generating Sobol samples.
This limits the number of samples per dimension to 2^16, but that
should be more than enough for any rendering situation.  And this
reduces the direction numbers table size by a factor of 4.

This commit also takes advantage of the reduced bit space to
provide even better Owen scrambling, by utilizing the unused
16 bits for better mixing.
2020-03-19 08:58:42 +09:00
9e14e164e7 More improvements to Sobol sampling.
- Added an additional scramble round to the Owen scrambling, with
  new optimized constants.
- Reordered the dimensions of the direction numbers to improve 2d
  projections between adjacent dimensions.  Only did this for
  dimensions under ~40.
2020-03-18 12:05:42 +09:00
de5a643a2a Implement Sobol sampler that does both Owen and Cranley-Patterson.
This seems to work well in practice, and only takes one more addition
operation.
2020-03-17 20:00:09 +09:00
e3152e6f9c Add alternative Sobol direction numbers to the repo.
Also switching to one of the alternates, as it seems to give
better results than the one I was using before.
2020-03-17 16:49:19 +09:00
04a5dbff43 Generate sobol direction numbers from text file.
This way the numbers aren't cluttering up the source file, and
this also makes it easier to play with other direction numbers.
2020-03-17 15:21:15 +09:00
9c20c7a02f Improve Owen scrambling and add Cranley-Patterson rotation.
- Updated constants for Owen scrambling, based on better optimization
  criteria.
- Increased randomness for the higher bits in the Owen scrambling.
- A simple and efficient implementation of Cranley-Patterson rotation
  for the Sobol sampler.
2020-03-17 12:26:02 +09:00
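A minimal sketch of Cranley-Patterson rotation, assuming samples are kept as 32-bit fixed-point fractions of [0, 1). Under that assumption the rotation is a single wrapping addition of a per-pixel random offset. The function names and the fixed-point representation are assumptions, not necessarily what this repository does.

```rust
/// Rotates a sample by `offset`, both stored as 32-bit fixed-point
/// fractions in [0, 1). Wrapping addition is addition modulo 1.
pub fn cranley_patterson(sample: u32, offset: u32) -> u32 {
    sample.wrapping_add(offset)
}

/// Converts a 32-bit fixed-point fraction to an `f32` in [0, 1).
/// Only the top 24 bits are kept so the result is exactly representable
/// and can never round up to 1.0.
pub fn to_unit_f32(x: u32) -> f32 {
    (x >> 8) as f32 * (1.0 / (1u32 << 24) as f32)
}
```

For example, rotating 0.75 (`0xC000_0000`) by 0.5 (`0x8000_0000`) wraps around to 0.25 (`0x4000_0000`).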
420e078b70 Some cleanup of the comments in the last commit. 2020-03-15 22:37:34 +09:00
047e66d9aa Reworked Sobol sampler implementation.
This produces identical results, but generates the direction
vectors from the original sources at build time.  This makes
the source code quite a bit leaner, and will also make it easier
to play with other direction vectors in the future if the
opportunity arises.
2020-03-15 22:01:06 +09:00
ffc77ee1d5 Implement optimization for sobol sampler.
This significantly increases the sobol sampler's speed, especially
for higher sample counts.
2020-03-15 18:05:39 +09:00
c9f24e2728 Silence code style warning on generated Halton sampler code. 2020-03-15 17:17:39 +09:00
9b4781c81d Multiple improvements to sampling.
1. Use better constants for the hash-based Owen scrambling.
2. Use golden ratio sampling for the wavelength dimension.

On the use of golden ratio sampling:
Since hero wavelength sampling uses multiple equally-spaced
wavelengths, and most samplers only consider the spacing of
individual samples, those samplers weren't actually doing a
good job of distributing all the wavelengths evenly.  Golden
ratio sampling, on the other hand, does this effortlessly by
its nature, and the resulting reduction of color noise is huge.
2020-03-15 14:47:40 +09:00
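A rough sketch of the idea in the commit above: a golden ratio sequence drives the wavelength dimension, and the hero wavelength is expanded into equally spaced companions. The function names, the choice of four wavelengths, and the wavelength range passed in are all assumptions for illustration.

```rust
/// Fractional part of the golden ratio, (sqrt(5) - 1) / 2.
const GOLDEN_RATIO_FRACT: f64 = 0.618_033_988_749_894_9;

/// The `i`-th point of the golden ratio sequence, in [0, 1).
pub fn golden_ratio_sample(i: u32) -> f64 {
    (i as f64 * GOLDEN_RATIO_FRACT).fract()
}

/// Expands one sample `u` in [0, 1) into four equally spaced wavelengths
/// (an assumed count) that wrap around within [min_nm, max_nm).
pub fn hero_wavelengths(u: f64, min_nm: f64, max_nm: f64) -> [f64; 4] {
    let range = max_nm - min_nm;
    [0.0, 0.25, 0.5, 0.75].map(|offset| min_nm + (u + offset).fract() * range)
}
```

A caller would feed `golden_ratio_sample(i)` for sample index `i` into `hero_wavelengths`, so successive samples fill the gaps between the equally spaced wavelengths of earlier ones.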
a3ea90afdc Fixed broken Owen scrambling.
The previous implementation was fundamentally broken because it
was mixing the bits in the wrong direction.  This fixes that.

The constants have also been updated.  I created a (temporary)
implementation of slow but full owen scrambling to test against,
and these constants appear to give results consistent with that
on all the test scenes I rendered on.  It is still, of course,
possible that my full implementation was flawed, so more validation
in the future would be a good idea.
2020-03-12 20:20:56 +09:00
9ba51cd43a Improvements to the Owen scrambling. 2020-03-11 23:22:07 +09:00
db9efc6a55 Removed unsafe code from sobol sampler and improved its documentation. 2020-03-11 21:30:14 +09:00
b081424ba6 Implemented Owen scrambling for the Sobol sampler.
This gives better variance than random digit scrambling, at a
very tiny runtime cost (so tiny it's lost in the noise of the
rest of the rendering process).
2020-03-11 18:29:46 +09:00
4a6284be40 Switch to sobol sampler.
The important thing here is that I figured out how to use the
scrambling parameter properly to decorrelate pixels.  Using the
same approach as with halton (just adding an offset into the sequence)
is very slow with sobol, since moving into the higher samples is
more computationally expensive.  So using the scrambling parameter
instead was important.
2020-02-22 09:18:45 +09:00
022c913757 Split out memory arena into an external crate. 2019-12-27 10:43:03 +09:00
c753890bb0 Fix/silence various clippy warnings. 2019-08-01 14:18:26 +09:00
95e7d6bdea Silence some silly clippy warnings on generated code. 2019-08-01 13:43:32 +09:00
f021def789 Do better chromatic adaptation for input RGB colors. 2019-07-26 16:01:56 +09:00
f42eedfd08 Make better use of the types in the glam crate.
Appears to give a tiny performance boost.
2019-07-25 11:36:52 +09:00
88e7365bc4 Switched from in-tree float4 lib to glam. 2019-07-22 22:30:37 +09:00
4adc81b66b Minor code pretty-ing in the Jakob spectral upsampler. 2019-07-09 16:24:02 +09:00
70721be8e0 Moved separate functions in halton sampler inline into the match.
Doesn't really have much impact, but makes me feel better for some
reason.
2019-07-07 17:39:34 +09:00
103775f0e9 Some cleanup and improvements to the trifloat sub-crate. 2019-07-07 16:27:44 +09:00
e31ec6eb4e Added a new trifloat type that uses 48 bits and is signed. 2019-07-07 14:02:09 +09:00
152d265c82 Switched all uninitialized memory to use MaybeUninit. 2019-07-06 13:46:54 +09:00
874b07df02 Filled in missing methods on the fall-back non-SIMD code. 2019-06-29 07:48:33 +09:00
b09f9684d1 Remove non-SIMD BVH4, and keep more bool calculations in SIMD format. 2019-06-29 07:22:22 +09:00
c5d23592b9 Keep Bool4 in its native format instead of converting to a bitmask.
This gives a small performance boost.
2019-06-28 22:56:51 +09:00
cd50e0dd11 Added some useful shuffle ops to Float4. 2019-06-21 22:14:18 +09:00
50f09a6134 Removed full Jakob implementation and moved table loading to build time.
The "light" version of Jakob still remains, which uses a much smaller
table.
2019-06-21 21:45:13 +09:00
5eeaec0a8b Use fmadd method in Jakob spectrum eval. 2019-06-19 17:49:52 +09:00
b3cc5c070a Added fused multiply-add method to Float4. 2019-06-19 17:45:04 +09:00
48e015996f Initial implementation of Jakob 2019 spectral upsampling.
It has a slight color cast to it at the moment, I believe due to
incorrect color space conversions, not because of the upsampling
method itself.  So Meng upsampling is still the active method
at the moment.
2019-06-09 19:51:43 +09:00
4aa002bb92 Reorganizing spectral upsampling crate for multiple algorithms. 2019-06-08 18:25:35 +09:00
fdad8f71bb Renamed spectral upsampling sub-crate. 2019-06-02 07:28:43 +09:00
508cda6021 Better path usage and "extern crate" removal in sub-crates. 2018-12-16 13:14:06 -08:00
5fb349cc49 Second step transitioning to Rust 2018. 2018-12-16 12:07:11 -08:00
8deb1e87bb First step transitioning to Rust 2018. 2018-12-16 12:02:20 -08:00
c73db2edbe Fix/silence a bunch of clippy warnings in the main crate. 2018-12-15 23:26:12 -08:00
d57c896151 Silence/fix clippy warnings in mem_arena sub-crate. 2018-12-15 22:34:51 -08:00
53424b393d Silence clippy warnings in spectra_xyz sub-crate. 2018-12-15 22:22:29 -08:00
f9d75f490c Silenced warnings in color sub-crate. 2018-12-15 22:06:32 -08:00
f2e591a91f Fixed clippy warnings in math3d. 2018-12-15 21:56:48 -08:00
8b6181d262 Fixed Clippy warnings in float4. 2018-12-15 21:41:16 -08:00
589a67caa4 Run latest rustfmt on code. No functional changes. 2018-12-08 13:23:44 -08:00
e9b495e729 Silence some clippy warnings on generated code and large preformatted data. 2018-12-08 13:21:41 -08:00
ea75e3ed21 Added benchmarks for both Trifloat and Oct32 encoding/decoding. 2018-11-29 11:09:48 -08:00
a6cae26c34 Added property tests for Oct32 encoding/decoding.
Tests random vectors, and makes sure that encoding/decoding
round trip only introduces precision errors below a certain
threshold.

Pretty confident that the implementation is correct now.
2018-11-29 09:50:38 -08:00
8e15dba29d Implementation of the Oct32 encoding of unit vectors.
The code still needs testing, but initial toying around suggests
that it's working correctly.
2018-11-28 23:41:12 -08:00
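A sketch of octahedral encoding of unit vectors into 32 bits, assuming two 16-bit components packed into a `u32`. This is the standard octahedral mapping written from scratch, not the repository's code; the function names, bit layout, and rounding are assumptions.

```rust
fn sign(x: f32) -> f32 {
    if x < 0.0 { -1.0 } else { 1.0 }
}

/// Encodes a unit vector: project onto the octahedron |x|+|y|+|z| = 1,
/// fold the lower hemisphere over the diagonals, quantize to 2 x 16 bits.
pub fn encode(v: [f32; 3]) -> u32 {
    let inv_l1 = 1.0 / (v[0].abs() + v[1].abs() + v[2].abs());
    let (mut u, mut w) = (v[0] * inv_l1, v[1] * inv_l1);
    if v[2] < 0.0 {
        let (pu, pw) = (u, w);
        u = (1.0 - pw.abs()) * sign(pu);
        w = (1.0 - pu.abs()) * sign(pw);
    }
    // Map [-1, 1] to [0, 65535], rounding to the nearest step.
    let quantize = |x: f32| ((x * 0.5 + 0.5) * 65535.0).round() as u32;
    (quantize(u) << 16) | quantize(w)
}

/// Decodes back to a (re-normalized) unit vector.
pub fn decode(packed: u32) -> [f32; 3] {
    let u = (packed >> 16) as f32 / 65535.0 * 2.0 - 1.0;
    let w = (packed & 0xffff) as f32 / 65535.0 * 2.0 - 1.0;
    let z = 1.0 - u.abs() - w.abs();
    // A negative z means the point was folded; undo the fold.
    let (x, y) = if z < 0.0 {
        ((1.0 - w.abs()) * sign(u), (1.0 - u.abs()) * sign(w))
    } else {
        (u, w)
    };
    let len = (x * x + y * y + z * z).sqrt();
    [x / len, y / len, z / len]
}
```

A round trip such as `decode(encode(v))` should land within a small quantization error of `v`, which is the kind of property a randomized test can check.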
c0cb071251 Further optimizations to the trifloat implementation.
Also improved documentation.
2018-11-28 15:31:21 -08:00
27521f44a6 Cleanup and better docs for trifloat. 2018-11-23 23:12:06 -08:00
ff9a56977a Use bit fiddling to avoid some expensive operations in trifloat encoding/decoding. 2018-11-23 22:31:28 -08:00
3d1ade21c2 Better naming for the trifloat functions. 2018-11-23 21:38:27 -08:00
3fb22fdefa Implemented a "tri-float" encoding, similar to RGBE.
This implementation trades less range for more precision, giving
9 bits to each mantissa instead of just 8 bits as in RGBE.
2018-11-23 20:01:15 -08:00
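A minimal sketch of a shared-exponent format along these lines, assuming a 32-bit word holding three 9-bit mantissas and one 5-bit exponent. The bit layout, the exponent bias, and the function names are assumptions, and this version floors when quantizing rather than rounding.

```rust
/// Assumed exponent bias; the stored exponent is `e + EXP_BIAS` in 0..=31.
const EXP_BIAS: i32 = 10;

/// Packs non-negative RGB into [r:9][g:9][b:9][exp:5], high to low bits.
/// Negative and NaN components are treated as zero.
pub fn encode(rgb: [f32; 3]) -> u32 {
    let largest = rgb[0].max(rgb[1]).max(rgb[2]);
    if !(largest > 0.0) {
        return 0;
    }
    // Pick the shared exponent so that largest / 2^e falls in [0.5, 1.0).
    let e = ((largest.log2().floor() + 1.0) as i32).clamp(-EXP_BIAS, 31 - EXP_BIAS);
    let scale = 2.0f32.powi(9 - e);
    // Float-to-int casts truncate, i.e. floor for non-negative values.
    let quantize = |v: f32| ((v.max(0.0) * scale) as u32).min(511);
    (quantize(rgb[0]) << 23)
        | (quantize(rgb[1]) << 14)
        | (quantize(rgb[2]) << 5)
        | (e + EXP_BIAS) as u32
}

/// Unpacks the three mantissas and applies the shared exponent.
pub fn decode(packed: u32) -> [f32; 3] {
    let e = (packed & 0x1f) as i32 - EXP_BIAS;
    let scale = 2.0f32.powi(e - 9);
    [
        ((packed >> 23) & 0x1ff) as f32 * scale,
        ((packed >> 14) & 0x1ff) as f32 * scale,
        ((packed >> 5) & 0x1ff) as f32 * scale,
    ]
}
```

Because the exponent is shared, components much smaller than the largest one lose precision, which is the usual tradeoff of RGBE-style formats.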
498c1ea8d9 Running latest rustfmt. No functional change. 2018-10-30 22:31:25 -07:00
b14b1b13b5 Cleaned up some of the SIMD code in spectra_xyz. 2018-07-01 16:49:41 -07:00
3f55df7225 Some basic SIMD optimizations for XYZ->Spectrum conversion. 2018-07-01 15:50:34 -07:00
ef7084e694 Reorganized xyz_spectra crate a bit.
This way the executable code can be worked with directly, instead
of via the python file that generates the rust code.

Also introduced some small optimizations.
2018-07-01 14:29:19 -07:00
6d21a30840 Formatting with newer cargo fmt.
No meaningful code change, only formatting.
2018-06-24 21:18:10 -07:00
df27f7b829 Moved matrix transpose and inverse code into Float4 crate.
This allows for more optimized implementations, taking advantage
of SIMD intrinsics.
2018-06-24 21:06:32 -07:00
8e791259b3 Sped up Float4::h_sum for platforms with SSE3.
Since this is used heavily during matrix multiplication, gives a
nice little speed boost.
2018-06-24 16:45:21 -07:00
27d1b2286b Switch to stable SIMD intrinsics.
Rust 1.27 stabilized a variety of CPU intrinsics, including SIMD
on x86/64 platforms.  This commit moves to using those intrinsics
for the optimized Float4 implementation.  This means Psychopath
now compiles on stable Rust with all optimizations.  Yay!
2018-06-24 15:32:09 -07:00
bbf832a3d8 Fixed compile error in float4 lib, and updated to latest simd crate. 2018-03-04 13:32:31 -08:00
c990672dfe Fix compiler warnings. 2018-03-04 13:06:22 -08:00
97d3304149 Run new rustfmt on codebase. 2018-03-04 13:00:55 -08:00
f39589ab72 Small refactor of float4 crate to make it easier to read. 2018-03-04 12:27:35 -08:00
09daf617ef Implemented a non-SIMD BVH4. Perf appears to be identical to BVH. 2017-07-01 15:08:05 -07:00
a4a73713d2 Created crate for BVH node traversal order calculations.
Might move this into the main source base at some point, but
I'm not totally sure about the correctness of the table yet, so
would like to generate it for now.
2017-07-01 12:44:19 -07:00
011405e131 Implemented robust ray origin calculation for bounced rays.
We take a small performance hit for this, but given that it's
making things meaningfully more correct I feel like it's more
than worth it.
2017-06-19 22:28:44 -07:00
b5f2237676 Reformatted sub-crates with new rustfmt as well. 2017-06-15 22:21:25 -07:00
b8321beaad Split colorspace transform functions out into their own crate.
They are now generated by a build.rs script from nothing but the
colorspace's primaries, which makes it super easy to add more
colorspaces.  So easy that I added three more: ACES AP0, ACES AP1
and Rec.2020.

This lays the foundation for supporting output to different
colorspaces.
2017-06-11 03:03:23 -07:00