Very small triangles were being missed because of the
not-so-robust ray-triangle intersection algorithm I was using.
Switched to the algorithm from the paper "Watertight
Ray/Triangle Intersection" by Woop et al. Happily, the new
algorithm doesn't seem to measurably slow down renders at all.
They are now generated by a build.rs script from nothing but the
colorspace's primaries, which makes it super easy to add more
colorspaces. So easy that I added three more: ACES AP0, ACES AP1
and Rec.2020.
This lays the foundation for supporting output to different
colorspaces.
All scene data collection is now done in a single sweep of frame
changing. Previous commits were already working towards this, and
but now it's done. Yay!
Over-all, switching to this approach gives huge speed boosts on
large scenes with animation, rigs, dependencies, etc. For such
scenes, frame changing is very expensive.
This eliminates writing temp files to disk for any part of the
Blender/Psychopath integration.
The option to export to a file still exists, however, by
specifying an export output path.
This is more a piece-of-mind thing than anything else. But it
also lets us make the number of LDS dimensions lower without
worrying, which in turn makes the code smaller.
After implementation, it does appear to make rendering slower
by a noticable bit compared to what I was doing before. At very
low sampling rates it does provide a bit of visual improvement,
but by the time you get to even just 16 samples per pixel its
benefits seem to disappear.
Due to the slow down and the minimal gains, I'll be removing
this in the next commit. But I want to commit it so I don't
lose the code, since it was an interesting experiment with
some promising results.
I couldn't make the BVH4 faster than the BVH, and the bitstack
was bloating the AccelRay struct. Removing the bitstack gives
a small but noticable speedup in rendering.
Specifically, LightPath is now significantly smaller, and
resultingly faster to process.
Also finally fixed the bug where without light sources the light
from the sky wouldn't affect surfaces.
If the average surface area of all the time samples is close enough
to the surface area of their union, just take the union and use that.
This both makes the BVH smaller in memory (time samples don't
propigate up the tree beyond their usefulness) and makes it
faster since traversal can avoid interpolating BBoxes when there's
only one BBox for a node.
Reduced from 64 to 42. This still allows each BVH to hold 4.4
trillion elements, but it guarantees that the accel ray's
traversal bitstack can accommodate at least two nested max-depth
trees.