In the past two months, I've been experimenting on and off with GPU-based volume raycasting.
What I'm realizing through these experiments is that GPU-based raycasting is extremely fragment-shader-intensive, mainly because of dynamic branching and the hundred-plus loop iterations per pixel during ray integration, so it's reasonable to expect that reducing the fragment shader load can drastically improve overall performance.
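For concreteness, here is a minimal CPU-style C sketch of the kind of loop my fragment shader runs per pixel (MAX_STEPS, sample_volume, and vec3 are illustrative, not my actual shader code):

/* Per-pixel ray integration: every pixel pays for the loop overhead and
   the dynamic branch, which is why the fragment stage is the bottleneck. */
#define MAX_STEPS 128

typedef struct { float x, y, z; } vec3;

extern float sample_volume(vec3 p);   /* assumed volume density lookup */

float integrate_ray(vec3 origin, vec3 dir, float step)
{
    float accum = 0.0f;
    vec3 p = origin;
    for (int i = 0; i < MAX_STEPS; i++) {      /* 100+ iterations per pixel */
        float density = sample_volume(p);
        if (density > 0.0f) {                  /* dynamic branch */
            accum += density * step;
            if (accum >= 1.0f) break;          /* early ray termination */
        }
        p.x += dir.x * step;
        p.y += dir.y * step;
        p.z += dir.z * step;
    }
    return accum;
}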
One such optimization would be empty-space skipping, since only a small portion of all voxels contributes to the final image. And I am really curious how Ken did empty-space skipping in Voxlap. Would you tell me what kind of data structures / algorithms you employed for empty-space skipping, Ken? (Please don't just say "Read the source code!")
Also, I'd like to ask whether you employed variable step sizes for raycasting in Voxlap. Did you use some sort of adaptive, nonlinear step size or just a constant step size?
Best regards,
Awesoken at
what kind of data structures / algorithms you employed for empty-space skipping
I use a form of run-length encoding along vertical columns. Search for "vbuf format" in VOXLAP5.C. This fully describes my compression format. My rendering algorithm renders directly from this compressed data structure. I raycast along vertical planes (perpendicular to the ground), similar to a standard Comanche-style heightmapping demo. With 6 degrees of freedom, these planes project to diagonal lines on the screen (a 4 degree of freedom engine always projects to vertical lines, which is much easier to deal with).
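Roughly, in simplified C, the idea looks like this - this is only an illustration of the general scheme, not the actual vbuf layout (again, see VOXLAP5.C for that):

/* Each vertical column is a sequence of "slabs": a run of air followed by
   a run of colored surface voxels.  Skipping empty space costs a single
   addition per slab. */
typedef struct {
    unsigned char airlen;    /* number of empty voxels to skip */
    unsigned char collen;    /* number of colored (surface) voxels */
    /* collen color bytes follow the header (variable length) */
} slabhdr;

extern void visit_voxel(int z, unsigned char color);  /* illustrative callback */

void walk_column(const unsigned char *col, int height)
{
    int z = 0;
    while (z < height) {
        const slabhdr *s = (const slabhdr *)col;
        z += s->airlen;                       /* empty-space skip */
        const unsigned char *color = col + sizeof(slabhdr);
        for (int i = 0; i < s->collen; i++)
            visit_voxel(z + i, color[i]);
        z += s->collen;
        col += sizeof(slabhdr) + s->collen;   /* advance to the next slab */
    }
}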
Also, I'd like to ask whether you employed variable step sizes for raycasting in Voxlap. Did you use some sort of adaptive, nonlinear step size or just a constant step size?
In Voxlap, I sample at the grid intersections. If you think about it, there's no reason to sample anywhere else - it would be wasteful. The raycasting works similarly to a Wolf3D-style engine.
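In simplified, illustrative C, grid-intersection sampling is a 3D DDA, something like this (the sketch assumes no ray direction component is exactly zero):

#include <math.h>

extern int is_solid(int x, int y, int z);    /* illustrative voxel lookup */

int raycast_dda(float ox, float oy, float oz, float dx, float dy, float dz)
{
    int x = (int)floorf(ox), y = (int)floorf(oy), z = (int)floorf(oz);
    int sx = dx > 0 ? 1 : -1, sy = dy > 0 ? 1 : -1, sz = dz > 0 ? 1 : -1;
    /* ray-parameter distance between successive grid planes on each axis */
    float tdx = fabsf(1.0f / dx), tdy = fabsf(1.0f / dy), tdz = fabsf(1.0f / dz);
    /* ray-parameter distance to the first grid plane on each axis */
    float tx = (sx > 0 ? (x + 1 - ox) : (ox - x)) * tdx;
    float ty = (sy > 0 ? (y + 1 - oy) : (oy - y)) * tdy;
    float tz = (sz > 0 ? (z + 1 - oz) : (oz - z)) * tdz;

    for (int i = 0; i < 256; i++) {          /* bounded iteration count */
        if (is_solid(x, y, z)) return 1;     /* hit: sample only at cell entry */
        if (tx < ty && tx < tz)      { x += sx; tx += tdx; }
        else if (ty < tz)            { y += sy; ty += tdy; }
        else                         { z += sz; tz += tdz; }
    }
    return 0;                                /* miss */
}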
Slang at
Thanks for your reply.
So you didn't employ any geometric data structures like octrees or blocks at all, but integrated both the color data of the surface voxels and the additional data for empty-space skipping into a single RLE-compressed data structure.
Awesoken said
I sample at the grid intersections. If you think about it, there's no reason to sample anywhere else - it would be wasteful.
Ah, it sure is! :D I've been sampling inside the unit cubes during ray integration even though I was rendering only opaque surface voxels... That's a huge performance loss.
Now, I am also very curious how you detect floating objects in Voxlap. I guess it would be too slow to traverse all the adjacent voxels for each surface voxel. Would you briefly tell me your algorithm for floating-object detection?
Awesoken at
Whenever voxels are removed from the map (solid to air), I add a bounding box of the modified region to a dirty box list. At the end of each frame, I merge dirty boxes (to avoid processing them twice) and check the regions for floating objects. I work directly with the RLE data so I can process a whole vertical column of voxels at once. I like to call these vertical columns 'slabs' (i.e. (x,y,z0) to (x,y,z1)). I visit every slab intersecting the dirty box, and start a floodfill search from any slab that hasn't yet been visited. I abandon the search as soon as any part of the floodfill hits ground - in this case, the floodfilled region can't be floating. If another floodfill search hits an already visited slab, it is also not floating. To determine which slabs have already been visited, I use a hash table, which allows for a faster clear than the brute force alternative.
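Condensed into illustrative C (none of these names are Voxlap's, and collision handling via a stored key is omitted for brevity), the hash table trick looks like this:

#define HASH_SIZE 4096

/* Visited set keyed by slab position.  'stamp' lets the whole table be
   "cleared" by bumping one counter instead of wiping every bucket. */
static int stamp[HASH_SIZE];
static int curstamp = 1;

static unsigned hash_slab(int x, int y, int z0)
{
    return (((unsigned)x * 73856093u) ^ ((unsigned)y * 19349663u)
          ^ ((unsigned)z0 * 83492791u)) % HASH_SIZE;
}

/* Returns 1 if the slab was newly marked, 0 if already visited. */
static int mark_visited(int x, int y, int z0)
{
    unsigned h = hash_slab(x, y, z0);
    if (stamp[h] == curstamp) return 0;
    stamp[h] = curstamp;
    return 1;
}

static void clear_visited(void) { curstamp++; }  /* O(1) clear between searches */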
Slang at
Thank you very much for the description. :D
As GPU vendors are now working on more generalized streaming processors and new programming architectures (NVIDIA's CUDA and DAAMIT's CTM), I think hardware-accelerated real-time voxel rendering will finally see the light of day in the near future, and we will get at least one, hopefully two, orders of magnitude of performance increase.
The future is voxels, isn't it? All the polygon engines are chasing nothing but 'reality', and as a result game development costs are rising to tens of millions of dollars, so we are losing 'novelty' in the industry. I strongly believe that voxels will change this situation. (Will voxels eventually join the 'reality race' too...?)
Slang at
Re: Empty-space skipping & variable step sizes in Voxlap
By the way, let me post my latest results of GPU-based real-time voxel raycasting:
After Ken's reply, I modified my raycasting shader to sample at grid intersections, and now all the voxels look perfectly like cubes ;D For each pixel, I calculate its depth value in the fragment shader, so it is easy to integrate the voxels with polygons: just enable depth buffering.
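The depth calculation itself is simple; here is the math in C form, assuming a standard OpenGL perspective projection with near/far planes n and f (my shader does the equivalent and writes the result to the depth output):

/* Convert a ray hit's eye-space depth d (distance along the view axis,
   not the Euclidean ray length) into the value written to the depth
   buffer, matching what the fixed-function pipeline writes for polygons. */
float frag_depth_from_eye_depth(float d, float n, float f)
{
    /* NDC depth in [-1,1] for a point at eye-space z = -d */
    float z_ndc = (f + n) / (f - n) - (2.0f * f * n) / ((f - n) * d);
    return 0.5f * z_ndc + 0.5f;   /* remap to the [0,1] window depth range */
}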
Also check this out: Real-time footage video (.m1v, 9.30 MB) http://www.quakewars.jp/slang/files/videos/flx_vrc_01.m1v
Thanks,
Edited by Slang at
Awesoken at
Looks nice. I would be curious to see a comparison in frame rate between your shader, Slab6, and JFBuild (which can also render voxel models as cubes in OpenGL, just not with shaders).
Slang at
Awesoken said at
I would be curious to see a comparison in frame rate between your shader, Slab6, and JFBuild (which can also render voxel models as cubes in OpenGL, just not with shaders).
The frame rate of my raycasting shader is still quite slow at this point. On a GeForce 6800 GT, I get 20 fps at worst (of course, it rapidly increases as the number of pixels produced by the voxel raycasting decreases), and that is at a window size of only 320x240. It's really tough to render 640x480 = 307,200 pixels with lots of dynamic branching per pixel, given the NV40's dynamic branching cost. (I am not sure what the GeForce 8800's dynamic branching performance is like, since I have no money to buy the card :))
Currently, I am using uniform blocks of 4x4x4 voxels each for empty-space skipping, but it doesn't contribute much to performance, so I am going to look into other techniques and see how they play out.
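For reference, my block skipping is just a coarse occupancy test, roughly like this in C (the grid dimensions and names are illustrative):

#define BS 4   /* block size in voxels */

/* 1 if the corresponding 4x4x4 block contains any surface voxel */
extern unsigned char occupancy[64][64][64];

static int block_empty(int x, int y, int z)
{
    return occupancy[x / BS][y / BS][z / BS] == 0;
}

/* In the per-pixel loop, an empty block lets the ray advance up to BS
   cells at once - but the extra lookup adds another dynamic branch per
   step, which is probably why the win is small on NV40. */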