
It includes a profiler that actually does quite a good job, mainly because it has a simple sampling mode as well. That mode does not measure the time spent in every function but rather takes samples. A sample is one small measurement that only records the call stack at a specific moment.
The sampling rate is approximately 160/s, so I ran the test for about 10 minutes, which gives roughly 96,000 samples.
Specs:
- Windows XP, compiling with Visual Studio 2008, Core Duo 2 GHz, 2 GB RAM
- Dedicated mode, unlimited fps, no clients, 30 bots, level loading excluded
Results:
I have attached three jpegs (yeah, there is no way to export the results sorted and in a useful form). A short explanation of 'inclusive' and 'exclusive': 'exclusive' counts only the samples taken in the particular function itself, i.e. the function was on top of the call stack when the sample was taken. 'inclusive' also counts the samples where the function was further down the call stack because one of its child functions was running.
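To make the two columns a bit more concrete, here is a minimal sketch of how such counts could be derived from raw call stack samples. This is not how the Visual Studio profiler is implemented; the names CallStack, SampleCounts, countSamples and inclusiveShare are made up for illustration, and I assume each sample lists the outermost caller first and the currently running function last.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: one sample is the call stack at one instant,
// outermost caller first, currently executing function last.
typedef std::vector<std::string> CallStack;

struct SampleCounts {
    std::map<std::string, int> exclusive; // function was on top of the stack
    std::map<std::string, int> inclusive; // function was anywhere on the stack
};

SampleCounts countSamples(const std::vector<CallStack>& samples) {
    SampleCounts counts;
    for (const CallStack& stack : samples) {
        if (stack.empty()) continue;
        // Exclusive: only the function that was actually running.
        ++counts.exclusive[stack.back()];
        // Inclusive: every function on the stack, counted once per sample
        // even if it appears several times through recursion.
        std::set<std::string> seen(stack.begin(), stack.end());
        for (const std::string& name : seen) {
            ++counts.inclusive[name];
        }
    }
    return counts;
}

// Fraction of all samples in which the function was somewhere on the stack.
double inclusiveShare(const SampleCounts& counts, const std::string& name,
                      int totalSamples) {
    assert(totalSamples > 0);
    std::map<std::string, int>::const_iterator it = counts.inclusive.find(name);
    int hits = (it == counts.inclusive.end()) ? 0 : it->second;
    return static_cast<double>(hits) / totalSamples;
}
```

With this counting, a function near the root of the call tree gets a high inclusive count but a low exclusive count, while a leaf like a string comparison has nearly equal values in both columns.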
Interpretation:
I have done some research on why dynamic_cast might be so expensive and especially why it uses _strcmp. My worst fears were confirmed:
Microsoft uses string comparisons of the whole class name (and templates can make names very long)! How stupid is this for a compiler that performs quite well in other applications?
http://www.gamedev.net/community/forums ... _id=508788
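To get a feeling for how long these names become, here is a small sketch that compares the length of the RTTI name of a plain class with that of a nested template. The types Plain, Wrapper and Nested and the helper rttiNameLength are hypothetical and only serve as an example. Note that the format of type_info::name() is implementation-defined and that whether dynamic_cast really compares these strings depends on the compiler and its version, so this only illustrates the name length, not the cast cost itself.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical polymorphic base class with a short name.
struct Plain {
    virtual ~Plain() {}
};

// Hypothetical template whose instantiations carry the full
// argument list in their type name.
template <typename T>
struct Wrapper : Plain {};

typedef Wrapper<std::map<std::string, std::vector<std::string> > > Nested;

// Length of the implementation-defined name stored in the type's RTTI record.
template <typename T>
std::size_t rttiNameLength() {
    return std::strlen(typeid(T).name());
}
```

Calling rttiNameLength<Plain>() and rttiNameLength<Nested>() and printing both values shows the difference on a given compiler. If casts are resolved by comparing such names, every additional template argument makes each comparison longer.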
Another thing to note is the SceneNode update function. Either it is badly implemented in OGRE or it really does need that much performance.
Anyway I don't see a way around it.
Notes:
When running standalone, the time spent in OGRE outweighs anything else (about 80% or so).
Here we go:
http://www.orxonox.net/wiki/reto/Profiling