Developer Diary | Posted: 01/11/2005
Visibility Determination System
Introduction
In this issue of the developer's diary I will mention a solution of a visibility determination system (VSD), which I have recently implemented to the engine for the game El-Matador.
For the most of 3D games (except of few exceptions, naturally) VSD system is quite a key component that affects a speed of a depiction of the game. VSD system deals with a very important task - from thousands of objects, which together make millions of polygons it has to determine expressly those, which the player can see. That means that if everything what lies in a visual pyramid was always processed for a complex environment, you may have any PC you want, but FPS you get would not be too interactive. If you don't believe it, you may start Doom3, write "noclip" in the console and then go out of the room and look around a little. If you reach a higher speed than 1 FPS, well done! VSD system is very important for the games that take place rather in an indoor environment (Doom3), the games that take place in an outdoor environment (FarCry) use more LOD (Level Of Detail) system. If the game takes place in a combined environment, what is just our case, everything is a little more complicated.
I was always very interested of VSD systems and related issues, so I have been following actual trends for some 10 years and I must say, when I implemented BSP-PVS system for our first 3D engine few years ago, I didn't even dream that we would solve visibilities by such an exotic method as we are doing now :
"Spatial databases" & VSD
The 3D scene must include information of a spatial layout of objects - so-called spatial index (or spatial database, as you wish) to allow us any reasonable activity (depictures, calculation of collisions etc.). This spatial index is a data structure, which enables you to make some operations very effectively.
You can see it as an abecedarian phone book (which is basically a one-dimensional index). If you try to find somebody's phone number in a phone book which is not in an alphabetical order, you will not be probably very successful. The same is with the spatial indexes except of the fact that there are many methods of implementing of these indexes. From the non-technical and practical point of view, the principal difference that we (and in particular the designers) are interested in is whether this index is static or dynamic. If you use a static way of the spatial division of the scene, which is represented e.g. by BSP trees (BSP method is used e.g. by engines going from Quake), you must make a compilation of the scene to the resulting format after each bigger adjustment of the map, which may be a very uneasy, slow and potentially defective process. We have implemented quite a new way of the spatial division for Matador in our engine (before we also used BSP) - a dynamic AABB tree. This structure is fully dynamic and any handling with the objects of the scene has an immediate response. You can insert objects to the scene, move, delete them and you never need any "compilation".
The static methods of the representation of the scene usually use pre-calculated (static) structures to determine the visibility. In case of Quake engine and its derivates we used the legendary PVS (Potentially Visible Set) structure, which include information from which part of the map you can see other parts of the map. Such structure is super-effective, which somewhat balances the uneasy re-counting after each bigger change of the map. As we use a dynamic method of the division of the scene in our engine, also the determination of the visibility is fully dynamic. Instead of determine what we can see the system determines what you cannot see. All this goes together well. PVS systems use so-called portals (in fact these are windows by which the system decide where you can see), our system uses so-called antiportals (alias occluders), which determine where you cannot see.
Occlusion culling
The system of the visibility in our engine comes under the category of so-called image space occlusion culling systems, which use the fact that some objects shade the view very effectively and thus inhibit display of objects lying behind them. What we appreciate in such systems is the fact that they may be quite easily (using some optimizations) used for any 3D environment. It doesn't matter whether this is an indoor or outdoor environment. Almost always you can find suitable objects that shade you the view. Typhoon2 engine uses a modified Hierarchical Occlusion Maps (HOM) system, which is a very effective way of the implementation of the occlusion culling. This simply works so that the objects, which are good occluders for the actual frame (so they well shade in the view), depict to a special z-buffer, from which a hierarchic z-buffer is made (this is similar to making of mipmaps at textures, but the rules for the filtration are different) against which the visibility of other parts of the scene (again by a hierarchic way, due to the spatial index). The hierarchic z-buffers use almost all modern graphic cards in the level of pixels. That is because this structure allows determining very effectively whether a part of the scene is overlaid by a nearer part of the scene and so it doesn't need to be worked up. It seems to me that some readers may think this hierarchy is too much. Simply said, at all algorithms and programming applies: The hierarchic is fast, the non-hierarchic is slow :)
Exotic part of occlusion culling :)
To make the occlusion system working as effectively as possible we have to do few things. The first is that a rasterization of the occluders to the special z-buffer (much smaller than a resulting resolution of the picture) must be as effective as possible. This can be reached very easily by using of specially modelled parts of the scene and objects with very low number of polygons for the occluders and by drawing of these occluders with a software rastizer written by hand and optimized (utilization of SSE instructions) (such things were programmed when the graphic accelerators didn't exist yet). Though the engine includes this system, the modelling of simplified versions of models - occluders took us much time, so we implemented the whole by a slightly different way. Currently the system works like this: the occluders are drawn by a graphical accelerator, that does it very quickly and we don't need to use the simplified versions of models. However it cannot manage (at least not on AGP bus and an actual version of Direct3D) a reading of z-buffer data from the graphic card back to the main PC storage. But evaluating the results, in the actual situation using of occluders GPU pays for the rasterization.
Maybe the only disadvantage of the occlusion culling systems is that they work well in the environment where many objects shade the view, but they work much worse in the environment with big quanta of objects that do not shade the view at all. Just these are the objects, which the designers like to put thousands of them to maps (any boxes, cans, chairs etc.).
The problem is that if you enter a room with several hundreds of such objects, the occlusion system looses so much time by finding whether the given object is visible, that it is no more effective. We solve this problem in our engine by keeping statistics of the visibility in a special cache, which is then used to optimize the requirements for the calculation of the visibility of each object. So if you run around a host of tiny objects, the system knows that they were visible during the last x seconds and it draws them automatically without solving their visibility.
I think that's enough for today, so let's leave something for the next time.
Thank you for reading this.
Petr
Introduction
In this issue of the developer's diary I will mention a solution of a visibility determination system (VSD), which I have recently implemented to the engine for the game El-Matador.
For the most of 3D games (except of few exceptions, naturally) VSD system is quite a key component that affects a speed of a depiction of the game. VSD system deals with a very important task - from thousands of objects, which together make millions of polygons it has to determine expressly those, which the player can see. That means that if everything what lies in a visual pyramid was always processed for a complex environment, you may have any PC you want, but FPS you get would not be too interactive. If you don't believe it, you may start Doom3, write "noclip" in the console and then go out of the room and look around a little. If you reach a higher speed than 1 FPS, well done! VSD system is very important for the games that take place rather in an indoor environment (Doom3), the games that take place in an outdoor environment (FarCry) use more LOD (Level Of Detail) system. If the game takes place in a combined environment, what is just our case, everything is a little more complicated.
I was always very interested of VSD systems and related issues, so I have been following actual trends for some 10 years and I must say, when I implemented BSP-PVS system for our first 3D engine few years ago, I didn't even dream that we would solve visibilities by such an exotic method as we are doing now :
"Spatial databases" & VSD
The 3D scene must include information of a spatial layout of objects - so-called spatial index (or spatial database, as you wish) to allow us any reasonable activity (depictures, calculation of collisions etc.). This spatial index is a data structure, which enables you to make some operations very effectively.
You can see it as an abecedarian phone book (which is basically a one-dimensional index). If you try to find somebody's phone number in a phone book which is not in an alphabetical order, you will not be probably very successful. The same is with the spatial indexes except of the fact that there are many methods of implementing of these indexes. From the non-technical and practical point of view, the principal difference that we (and in particular the designers) are interested in is whether this index is static or dynamic. If you use a static way of the spatial division of the scene, which is represented e.g. by BSP trees (BSP method is used e.g. by engines going from Quake), you must make a compilation of the scene to the resulting format after each bigger adjustment of the map, which may be a very uneasy, slow and potentially defective process. We have implemented quite a new way of the spatial division for Matador in our engine (before we also used BSP) - a dynamic AABB tree. This structure is fully dynamic and any handling with the objects of the scene has an immediate response. You can insert objects to the scene, move, delete them and you never need any "compilation".
The static methods of the representation of the scene usually use pre-calculated (static) structures to determine the visibility. In case of Quake engine and its derivates we used the legendary PVS (Potentially Visible Set) structure, which include information from which part of the map you can see other parts of the map. Such structure is super-effective, which somewhat balances the uneasy re-counting after each bigger change of the map. As we use a dynamic method of the division of the scene in our engine, also the determination of the visibility is fully dynamic. Instead of determine what we can see the system determines what you cannot see. All this goes together well. PVS systems use so-called portals (in fact these are windows by which the system decide where you can see), our system uses so-called antiportals (alias occluders), which determine where you cannot see.
Occlusion culling
The system of the visibility in our engine comes under the category of so-called image space occlusion culling systems, which use the fact that some objects shade the view very effectively and thus inhibit display of objects lying behind them. What we appreciate in such systems is the fact that they may be quite easily (using some optimizations) used for any 3D environment. It doesn't matter whether this is an indoor or outdoor environment. Almost always you can find suitable objects that shade you the view. Typhoon2 engine uses a modified Hierarchical Occlusion Maps (HOM) system, which is a very effective way of the implementation of the occlusion culling. This simply works so that the objects, which are good occluders for the actual frame (so they well shade in the view), depict to a special z-buffer, from which a hierarchic z-buffer is made (this is similar to making of mipmaps at textures, but the rules for the filtration are different) against which the visibility of other parts of the scene (again by a hierarchic way, due to the spatial index). The hierarchic z-buffers use almost all modern graphic cards in the level of pixels. That is because this structure allows determining very effectively whether a part of the scene is overlaid by a nearer part of the scene and so it doesn't need to be worked up. It seems to me that some readers may think this hierarchy is too much. Simply said, at all algorithms and programming applies: The hierarchic is fast, the non-hierarchic is slow :)
Exotic part of occlusion culling :)
To make the occlusion system working as effectively as possible we have to do few things. The first is that a rasterization of the occluders to the special z-buffer (much smaller than a resulting resolution of the picture) must be as effective as possible. This can be reached very easily by using of specially modelled parts of the scene and objects with very low number of polygons for the occluders and by drawing of these occluders with a software rastizer written by hand and optimized (utilization of SSE instructions) (such things were programmed when the graphic accelerators didn't exist yet). Though the engine includes this system, the modelling of simplified versions of models - occluders took us much time, so we implemented the whole by a slightly different way. Currently the system works like this: the occluders are drawn by a graphical accelerator, that does it very quickly and we don't need to use the simplified versions of models. However it cannot manage (at least not on AGP bus and an actual version of Direct3D) a reading of z-buffer data from the graphic card back to the main PC storage. But evaluating the results, in the actual situation using of occluders GPU pays for the rasterization.
Maybe the only disadvantage of the occlusion culling systems is that they work well in the environment where many objects shade the view, but they work much worse in the environment with big quanta of objects that do not shade the view at all. Just these are the objects, which the designers like to put thousands of them to maps (any boxes, cans, chairs etc.).
The problem is that if you enter a room with several hundreds of such objects, the occlusion system looses so much time by finding whether the given object is visible, that it is no more effective. We solve this problem in our engine by keeping statistics of the visibility in a special cache, which is then used to optimize the requirements for the calculation of the visibility of each object. So if you run around a host of tiny objects, the system knows that they were visible during the last x seconds and it draws them automatically without solving their visibility.
I think that's enough for today, so let's leave something for the next time.
Thank you for reading this.
Petr




