Can't find the bug.

The Jo
General DuGalle
Posts: 121
Joined: Mon Mar 01, 2010 7:43 pm

Can't find the bug.

Post by The Jo » Tue Jul 26, 2011 9:54 pm

The bug occurs in the ai branch when bots are in a level with pickups. After about a minute (timefactor = 3, #bots > 10) the game suddenly crashes. I could find neither the real cause nor a workaround.


=======================================================
= time: Tue Jul 26 23:38:16 2011
=======================================================
Program received signal SIGSEGV, Segmentation fault.
0x00781e24 in orxonox::WorldEntity::getWorldPosition (this=0xb42df18) at /home/jo/orxonox/IDEai/src/orxonox/worldentities/WorldEntity.cc:609
609 return this->node_->_getDerivedPosition();
(gdb)
#0 0x00781e24 in orxonox::WorldEntity::getWorldPosition (this=0xb42df18) at /home/jo/orxonox/IDEai/src/orxonox/worldentities/WorldEntity.cc:609
#1 0x00597577 in orxonox::AIController::tick (this=0xcd68558, dt=0.107454002) at /home/jo/orxonox/IDEai/src/orxonox/controllers/AIController.cc:243
#2 0x005bfe3c in orxonox::GSRoot::update (this=0x9b63c50, time=...) at /home/jo/orxonox/IDEai/src/orxonox/gamestates/GSRoot.cc:134
#3 0x011359be in orxonox::Game::updateGameStates (this=0x9ade5a8) at /home/jo/orxonox/IDEai/src/libraries/core/Game.cc:268
#4 0x011351c4 in orxonox::Game::run (this=0x9ade5a8) at /home/jo/orxonox/IDEai/src/libraries/core/Game.cc:198
#5 0x0051378e in orxonox::main (strCmdLine=...) at /home/jo/orxonox/IDEai/src/orxonox/Main.cc:105
#6 0x080494a8 in main (argc=1, argv=0xbf9806f4) at /home/jo/orxonox/IDEai/src/Orxonox.cc:78
(gdb)
Fail. Fail again. Fail better.

x3n
Baron Vladimir Harkonnen
Posts: 810
Joined: Mon Oct 30, 2006 5:40 pm

Re: Can't find the bug.

Post by x3n » Wed Jul 27, 2011 8:58 am

Crashes like this usually happen if you call a function on an invalid pointer. In this case you call wPoint->getWorldPosition(), hence we have to assume that wPoint is an invalid pointer.

An invalid pointer is one whose address is either garbage (e.g. because the pointer was never initialized, or was initialized incorrectly) or one that used to point to an existing object which has since been destroyed.

Looking at your code, wPoint is a value in the waypoints_ vector, and the only location where you add values to this vector is line 1189 in ArtificialController.cc. This function gets called to add PickupSpawners and at first glance it looks correct.

Now I don't know too much about PickupSpawners, but is it possible that they get destroyed during the game, e.g. if the pickup is picked up and respawn is disabled? This would explain why the pointer becomes invalid.

I can't answer this for sure, but you could try and add a line of output to the destructor of PickupSpawner to see if one gets destroyed. You may also print the "this" pointer of the destroyed PickupSpawner to the console to see if it matches the wPoint pointer in the crash log.
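A minimal sketch of that destructor output, using a made-up stand-in class rather than the real PickupSpawner (the name SpawnerStandIn and the destroyedCount counter are assumptions for illustration only). The printed address can be compared with the "this=0x..." values in the gdb backtrace:

```cpp
#include <iostream>

// Hypothetical stand-in for PickupSpawner; the real class in
// src/modules/pickup/PickupSpawner.cc has more members and base classes.
class SpawnerStandIn {
public:
    ~SpawnerStandIn() {
        // Print the address of the object being destroyed so it can be
        // matched against the pointer values shown in the crash log.
        std::cerr << "SpawnerStandIn destroyed: "
                  << static_cast<const void*>(this) << std::endl;
        ++destroyedCount;  // illustration only: counts destructor calls
    }

    static int destroyedCount;
};

int SpawnerStandIn::destroyedCount = 0;
```

If the address printed here shows up later as the "this" of a crashing call, the pointer was most likely left dangling after the object was destroyed.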

If this turns out to be the problem, you could fix it by using weak pointers. I'm not entirely sure about the syntax, but I think you could define the waypoints vector this way: std::vector<WeakPtr<WorldEntity> > waypoints_; (Note that we need the space between > and > because >> is the shift operator.) The rest of the code should remain more or less the same. Weak pointers become 0 if the object is destroyed, so you should take care of this and occasionally remove them from the waypoint vector.
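To illustrate the idea, here is a rough sketch using the standard library's std::weak_ptr in place of the Orxonox WeakPtr, whose exact interface may differ. Entity, Waypoints, pruneExpired and firstAliveWaypoint are hypothetical names made up for this example:

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for WorldEntity.
struct Entity {
    float x = 0.0f;
};

using Waypoints = std::vector<std::weak_ptr<Entity>>;

// Remove all entries whose target object has already been destroyed.
inline void pruneExpired(Waypoints& waypoints) {
    waypoints.erase(
        std::remove_if(waypoints.begin(), waypoints.end(),
                       [](const std::weak_ptr<Entity>& w) { return w.expired(); }),
        waypoints.end());
}

// Return the first waypoint that still exists, or an empty pointer if
// none is left. The caller has to check the result before using it.
inline std::shared_ptr<Entity> firstAliveWaypoint(Waypoints& waypoints) {
    pruneExpired(waypoints);
    if (waypoints.empty()) {
        return std::shared_ptr<Entity>();
    }
    return waypoints.front().lock();
}
```

The important part is the check before use: a weak pointer never keeps the object alive, it only tells you whether the object is still there.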

If this is not the problem, I need to have a closer look and try it at home.
Fabian 'x3n' Landau, Orxonox developer

Mozork
Mogthorgar, the mighty
Posts: 134
Joined: Wed Sep 24, 2008 3:27 pm

Re: Can't find the bug.

Post by Mozork » Wed Jul 27, 2011 9:12 am

PickupSpawners are destroyed when they run out of pickups to spawn. So this might very well be the problem.

The Jo
General DuGalle
Posts: 121
Joined: Mon Mar 01, 2010 7:43 pm

Re: Can't find the bug.

Post by The Jo » Wed Jul 27, 2011 7:39 pm

OK, I was able to fix this one with your help. But while testing I caught a different bug (which I haven't been able to reproduce so far). It occurred after quite a while, when I let 20 bots gather pickups at timefactor 3 while waiting for the previous bug to occur again.

=======================================================
= time: Wed Jul 27 21:15:29 2011
=======================================================
Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb)
#0 0x00000000 in ?? ()
#1 0x009b265a in orxonox::Pickupable::drop (this=0xbde98c0, createSpawner=true) at /home/jo/orxonox/IDEai/src/orxonox/interfaces/Pickupable.cc:255
#2 0x01d70736 in orxonox::MetaPickup::changedUsed (this=0xd317ce8) at /home/jo/orxonox/IDEai/src/modules/pickup/items/MetaPickup.cc:148
#3 0x009b202c in orxonox::Pickupable::setUsed (this=0xd317ce8, used=true) at /home/jo/orxonox/IDEai/src/orxonox/interfaces/Pickupable.cc:135
#4 0x01d1424e in orxonox::Pickup::changedPickedUp (this=0xd317ce8) at /home/jo/orxonox/IDEai/src/modules/pickup/Pickup.cc:208
#5 0x009b27b8 in orxonox::Pickupable::setPickedUp (this=0xd317ce8, pickedUp=true) at /home/jo/orxonox/IDEai/src/orxonox/interfaces/Pickupable.cc:287
#6 0x009b2390 in orxonox::Pickupable::pickup (this=0xd317ce8, carrier=0xb946ba0) at /home/jo/orxonox/IDEai/src/orxonox/interfaces/Pickupable.cc:227
#7 0x01d50db6 in orxonox::PickupSpawner::trigger (this=0xb04b8d8, pawn=0xb9468c0) at /home/jo/orxonox/IDEai/src/modules/pickup/PickupSpawner.cc:319
#8 0x01d506ca in orxonox::PickupSpawner::tick (this=0xb04b8d8, dt=0.0385529995) at /home/jo/orxonox/IDEai/src/modules/pickup/PickupSpawner.cc:203
#9 0x008c0e54 in orxonox::GSRoot::update (this=0x9987878, time=...) at /home/jo/orxonox/IDEai/src/orxonox/gamestates/GSRoot.cc:134
#10 0x0104d9be in orxonox::Game::updateGameStates (this=0x98875a8) at /home/jo/orxonox/IDEai/src/libraries/core/Game.cc:268
#11 0x0104d1c4 in orxonox::Game::run (this=0x98875a8) at /home/jo/orxonox/IDEai/src/libraries/core/Game.cc:198
#12 0x008140be in orxonox::main (strCmdLine=...) at /home/jo/orxonox/IDEai/src/orxonox/Main.cc:105
#13 0x080494a8 in main (argc=1, argv=0xbfaf50e4) at /home/jo/orxonox/IDEai/src/Orxonox.cc:78
(gdb)
Fail. Fail again. Fail better.

Mozork
Mogthorgar, the mighty
Posts: 134
Joined: Wed Sep 24, 2008 3:27 pm

Re: Can't find the bug.

Post by Mozork » Wed Jul 27, 2011 7:50 pm

Seems to be a problem with a MetaPickup. I'll look into it.

x3n
Baron Vladimir Harkonnen
Posts: 810
Joined: Mon Oct 30, 2006 5:40 pm

Re: Can't find the bug.

Post by x3n » Thu Jul 28, 2011 7:55 am

Hum, yesterday I told you in IRC that the bug is probably not in your code (because it seems to happen entirely in the pickup system). However, looking at your code again, line 109 in ArtificialController.cc drew my attention:

Code:

this->waypoints_[i]->destroy();
What you're doing here is deleting the waypoint object itself. Since you use existing objects (like a PickupSpawner) as waypoints, you effectively destroy parts of the level, and that's probably not what you want. destroy() only makes sense if you create the waypoint objects yourself and later destroy them again, but not for existing objects.
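One way to keep the two cases apart, sketched with standard C++ and made-up names (Entity, WaypointList, addExisting and createOwned are all assumptions, not Orxonox code): the list only destroys the waypoints it created itself and merely forgets the ones it borrowed from the level.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for WorldEntity that counts living instances.
struct Entity {
    static int alive;
    Entity() { ++alive; }
    ~Entity() { --alive; }
    Entity(const Entity&) = delete;
    Entity& operator=(const Entity&) = delete;
};

int Entity::alive = 0;

class WaypointList {
public:
    // Use an object that belongs to the level; the list never destroys it.
    void addExisting(Entity* entity) { waypoints_.push_back(entity); }

    // Create a waypoint that belongs to this list; it is destroyed
    // together with the list.
    Entity* createOwned() {
        owned_.push_back(std::unique_ptr<Entity>(new Entity()));
        waypoints_.push_back(owned_.back().get());
        return waypoints_.back();
    }

    std::size_t size() const { return waypoints_.size(); }

private:
    std::vector<Entity*> waypoints_;              // all waypoints, non-owning
    std::vector<std::unique_ptr<Entity>> owned_;  // only the ones created here
};
```

When a WaypointList goes out of scope, only the entries in owned_ are deleted; objects passed to addExisting() stay untouched.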

The crash probably happens because the pickup assumes the pickup spawner still exists, but it was deleted because it was used as a waypoint. Even though deleting the spawner is unexpected behavior in this case, I suggest Damian still fix it because I can imagine cases where we deliberately want to destroy pickup spawners, e.g. if there are pickups on a mothership and the mothership gets destroyed. ;)
Fabian 'x3n' Landau, Orxonox developer

The Jo
General DuGalle
Posts: 121
Joined: Mon Mar 01, 2010 7:43 pm

Re: Can't find the bug.

Post by The Jo » Thu Jul 28, 2011 3:37 pm

Of course it is not a good idea to delete PickupSpawners that I have not created, which is why I fixed this issue. The WaypointController should still work as intended. But since the bug was triggered in the middle of testing, I'm quite sure that it couldn't have been triggered by the ArtificialController's destructor, which is only called when I leave the game.

By the way, there's another issue I couldn't analyse. As with the previous one, the crash happened while the game was running.
=======================================================
= time: Thu Jul 28 17:11:23 2011
=======================================================
Program received signal SIGABRT, Aborted.
0x007e6416 in __kernel_vsyscall ()
(gdb)
#0 0x007e6416 in __kernel_vsyscall ()
#1 0x04597e71 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0x0459b2ea in abort () at abort.c:121
#3 0x06cd40b5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4 0x06cd1fa5 in ?? () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#5 0x06cd1fe2 in std::terminate() () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#6 0x06cd2be5 in __cxa_pure_virtual () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#7 0x0632cf32 in orxonox::OrxonoxClass::~OrxonoxClass (this=0xc3effdc, __in_chrg=<value optimized out>) at /home/jo/orxonox/IDEai/src/libraries/core/OrxonoxClass.cc:74
#8 0x00f6e49d in orxonox::SpaceShip::~SpaceShip (this=0xc3efb88, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>) at /home/jo/orxonox/IDEai/src/orxonox/worldentities/pawns/SpaceShip.cc:97
#9 0x00f6e599 in orxonox::SpaceShip::~SpaceShip (this=0xc3efb88, __in_chrg=<value optimized out>, __vtt_parm=<value optimized out>) at /home/jo/orxonox/IDEai/src/orxonox/worldentities/pawns/SpaceShip.cc:106
#10 0x0632d081 in orxonox::OrxonoxClass::destroy (this=0xc3effdc) at /home/jo/orxonox/IDEai/src/libraries/core/OrxonoxClass.cc:88
#11 0x00c87763 in orxonox::PawnManager::preUpdate (this=0x97c60b8, time=...) at /home/jo/orxonox/IDEai/src/orxonox/PawnManager.cc:56
#12 0x00c8a9b9 in orxonox::Singleton<orxonox::PawnManager>::preUpdateSingleton (this=0x97c60b8, time=...) at /home/jo/orxonox/IDEai/src/libraries/util/Singleton.h:147
#13 0x00c8a66c in orxonox::ClassScopedSingletonManager<orxonox::PawnManager, (orxonox::ScopeID::Value)0, false>::preUpdate (this=0x11edf80, time=...) at /home/jo/orxonox/IDEai/src/libraries/util/ScopedSingletonManager.h:184
#14 0x062e709f in orxonox::ScopedSingletonManager::preUpdate<(orxonox::ScopeID::Value)0> (time=...) at /home/jo/orxonox/IDEai/src/libraries/util/ScopedSingletonManager.h:97
#15 0x062e39ad in orxonox::Core::preUpdate (this=0x96c9770, time=...) at /home/jo/orxonox/IDEai/src/libraries/core/Core.cc:403
#16 0x062f81b9 in orxonox::Game::run (this=0x96c95a8) at /home/jo/orxonox/IDEai/src/libraries/core/Game.cc:188
#17 0x00c801be in orxonox::main (strCmdLine=...) at /home/jo/orxonox/IDEai/src/orxonox/Main.cc:105
#18 0x080494a8 in main (argc=1, argv=0xbf89ede4) at /home/jo/orxonox/IDEai/src/Orxonox.cc:78
(gdb)
Fail. Fail again. Fail better.

x3n
Baron Vladimir Harkonnen
Posts: 810
Joined: Mon Oct 30, 2006 5:40 pm

Re: Can't find the bug.

Post by x3n » Thu Jul 28, 2011 4:18 pm

Ah yes, of course, you're right... I thought AIControllers were destroyed when a bot dies, but in fact they're not. So we have two really strange crashes with unknown cause... maybe there's a bug somewhere else which overwrites part of the memory where it shouldn't (for example because of a wrong pointer).

This would be really hard to find, but maybe you can narrow it down by looking for situations where it crashes often and situations where it doesn't crash... answering questions like these can help: How often do you get these crashes? Does it depend on the number of bots? Does it depend on the game speed? Only in your branch or also in the trunk? Do they happen in other levels as well? Only if there are pickups? Does it depend on the number of pickups? Only if bots use the steerable rocket? etc.

If you believe the bug could be in your code you may even try to guess where it is... Is there some code where you store pointers? Is there a chance that these pointers point to a deleted object? Do you cast pointers? Is there a chance that you use a static_cast in a situation where you should use a dynamic_cast? (Note: use static_cast if you cast e.g. from SpaceShip to WorldEntity, but use dynamic_cast if you cast from WorldEntity to SpaceShip.) Do you use wrong C-style casts somewhere (they look like this: (SpaceShip*)pointer, which can crash if pointer is not a SpaceShip)? Do you have an array/vector in your code where you try to access elements which are out of bounds (e.g. element #10 in an array with 9 elements)?
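A small sketch of the cast and bounds checks mentioned above. The classes here are simplified stand-ins that only borrow the names WorldEntity and SpaceShip; Asteroid and the helper functions asEntity, asShip and elementOrNull are made up for the example:

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-ins, not the real Orxonox classes.
struct WorldEntity {
    virtual ~WorldEntity() = default;  // polymorphic base, needed for dynamic_cast
};
struct SpaceShip : WorldEntity {};
struct Asteroid : WorldEntity {};  // hypothetical second subclass

// Derived to base: always valid, so static_cast is fine.
inline WorldEntity* asEntity(SpaceShip* ship) {
    return static_cast<WorldEntity*>(ship);
}

// Base to derived: checked at run time; yields a null pointer if the
// object is not actually a SpaceShip.
inline SpaceShip* asShip(WorldEntity* entity) {
    return dynamic_cast<SpaceShip*>(entity);
}

// Bounds-checked access: returns a null pointer instead of reading
// past the end of the vector.
inline WorldEntity* elementOrNull(const std::vector<WorldEntity*>& entities,
                                  std::size_t index) {
    return index < entities.size() ? entities[index] : nullptr;
}
```

In both helpers the caller gets a null pointer in the bad case and can test for it, instead of silently working on memory that belongs to something else.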


Edit: You don't have to answer all these questions for me, it's more of a guide to finding the bug yourself. I'll have a look myself as well, but I probably don't have time today.
Fabian 'x3n' Landau, Orxonox developer
