GameMonkey Script

GameMonkey Script Forums
All times are UTC




 Post subject: How to debug GM crashes?
PostPosted: Fri Jan 20, 2006 11:58 am 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
I still occasionally get crashes in GM. I'm not entirely sure how to debug them. Here's the stack trace.

#0 0x08a4af0c in ?? ()
#1 0x4d095fda in gmGCColorSet::BlackenNextGray ()
from ./omni-bot/omnibot_et.so
#2 0x4d09651c in gmGarbageCollector::BlackenGrays ()
from ./omni-bot/omnibot_et.so
#3 0x4d09658a in gmGarbageCollector::Collect () from ./omni-bot/omnibot_et.so
#4 0x4d09aa04 in gmMachine::CollectGarbage () from ./omni-bot/omnibot_et.so
#5 0x4d09a6d4 in gmMachine::Execute () from ./omni-bot/omnibot_et.so
#6 0x4d072928 in ScriptManager::Update () from ./omni-bot/omnibot_et.so
#7 0x4d03e74d in IGameManager::UpdateGame () from ./omni-bot/omnibot_et.so
#8 0x4d04c2a1 in BotUpdate () from ./omni-bot/omnibot_et.so

I used to get these quite frequently and Greg helped me debug them in the past but it's a bit annoying to still get them. Greg, you got any advice on how to debug these?


 Post subject:
PostPosted: Fri Jan 20, 2006 12:30 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Btw, I'm running the latest:

GameMonkey Script current beta (v1.24c updated 11-Jan-06)


 Post subject:
PostPosted: Fri Jan 20, 2006 12:44 pm 
Offline

Joined: Thu Jan 01, 2004 4:31 pm
Posts: 307
I also get random crashes, but mostly they're when the application exits. To counter this, I often have to manually perform a full GC sweep before I start trying to deallocate normal game objects.


 Post subject:
PostPosted: Fri Jan 20, 2006 1:14 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
I used to hit asserts on exit because apparently it wants C++-owned objects removed from the system before you delete the machine. Other than that, I hit asserts in the built-in Vector3 library long ago, which I got rid of by not calling shutdown on it.

But these crashes in the GC code as shown have haunted me for some time. It's gotten less frequent but they are still there.


 Post subject:
PostPosted: Fri Jan 20, 2006 2:44 pm 
Offline

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 707
I don't have any good ideas to help debug. When an object is not handled correctly for whatever reason, it will sooner or later crash in one of the BlackenNextGray() type functions. Enabling GC debug mode will typically assert just before the crash, but won't tell you the cause, which could be some time back.

From my recent experience debugging GC issues, it appears that most have been caused by 1) lack of a WriteBarrier or 2) C++-owned gmObjects not being traced (with GetNextObject()) during GC. (2) should be easier to find, as a pointer to a known object might be trackable. (1) might still be located as the bad object, or a child of it.

The WriteBarrier MUST be applied whenever a variable that references an object changes. (In practice, this doesn't happen for stack operations and only happens when the GC is actually running a cycle.) So, for any container such as a gmTable, when A = B, the WriteBarrier is called with A.

I think a code path that runs a classic GC using the incremental interface could help debug or verify the integrity of the GC. I don't know of any better way at present to verify the GC integrity or prevent the user (person binding user types) from making mistakes.

The most helpful way to debug these issues is to find a 100% reproducible case. DrEvil, you should not need to use the Vector3 shutdown; in fact, I believe it was removed in a recent version. Downgraded, if you have to call a full collect before deallocating objects, you are just hiding a bug: the GC object list gets shuffled somewhat, which by chance prevents the bug from being exposed.


Last edited by Greg on Sat Jan 21, 2006 12:03 am, edited 1 time in total.

 Post subject:
PostPosted: Fri Jan 20, 2006 10:18 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Yea I didn't use the shutdown vec3 for long, and these days I'm using my own Vec3 bindings. I'll try to reproduce it this weekend.


 Post subject:
PostPosted: Sat Jan 21, 2006 7:11 am 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Well, I've found a reproducible crash, but I have no idea what is causing it.

Code:
>   omnibot_ff.dll!gmGCObjBase::SetNext(gmGCObjBase * a_next=0x23ac930c)  Line 101 + 0x29 bytes   C++
    omnibot_ff.dll!gmGCColorSet::GrayThisObject(gmGCObjBase * a_obj=0x23ac930c)  Line 220   C++
    omnibot_ff.dll!gmGCColorSet::GrayAWhite(gmGCObjBase * a_obj=0x23ac930c)  Line 389   C++
    omnibot_ff.dll!gmGarbageCollector::GetNextObject(gmGCObjBase * a_obj=0x23ac930c)  Line 289 + 0x32 bytes   C++
    omnibot_ff.dll!gmMachine::ScanRootsCallBack(gmMachine * a_machine=0x21fd9aa0, gmGarbageCollector * a_gc=0x23ac9228)  Line 174   C++
    omnibot_ff.dll!gmGarbageCollector::Collect()  Line 546 + 0x1b bytes   C++
    omnibot_ff.dll!gmMachine::CollectGarbage(bool a_forceFullCollect=false)  Line 1203 + 0xb bytes   C++
    omnibot_ff.dll!gmMachine::Execute(unsigned int a_delta=15)  Line 1051   C++
    omnibot_ff.dll!ScriptManager::Update()  Line 214   C++


I've tracked it to this code

I've verified the toklist has a valid string in it.

Function is:
Code:
typedef std::vector<String> StringVector;
int CommandReciever::DispatchCommand(const StringVector &_tokList)
{
   gmMachine *pMachine = ScriptManager::GetInstance()->GetMachine();
   // Returns the "Commands" table under the global table.
   gmTableObject *pCommandsTable = ScriptManager::GetInstance()->GetGlobalCommandsTable();
   if(pCommandsTable)
   {
      pMachine->EnableGC(false);
      gmCall call;
      if(call.BeginTableFunction(pMachine, _tokList[0].c_str(), pCommandsTable))
      {
         // Add all the params (the tokens after the command name) as one table.
         gmTableObject *pParamTable = pMachine->AllocTableObject();
         if(_tokList.size() > 1)
         {
            for(obuint32 i = 1; i < _tokList.size(); ++i)
            {
               pParamTable->Set(pMachine, i-1, gmVariable(pMachine->AllocStringObject(_tokList[i].c_str())));
            }
         }

         call.AddParamTable(pParamTable);
         call.End();
      }

      pMachine->EnableGC(true);
      return 1;
   }
   return 0; // assumed: the posted excerpt omits the function braces and this fall-through return
}


The actual BeginTableFunction call ends up failing because no function matching the provided name exists, which is fine. The crash appears to come from BeginTableFunction, because that's the only thing that gets hit.

Even if I commented out the Disable and Enable of the GC it still crashes.

I don't know why this is happening. If I try a BeginTableFunction on a table after the init of the script system it works fine.


 Post subject:
PostPosted: Sat Jan 21, 2006 7:13 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
The actual crash looks like it could just be a logic error in GM.

Here's the code it bombs on.

Code:
#if DEPTH_FIRST
  // in void gmGCColorSet::GrayThisObject(gmGCObjBase* a_obj)

  // Put the gray object at the head of the gray list.
  a_obj->SetPrev(m_scan->GetPrev());
  a_obj->SetNext(m_scan);
  m_scan->GetPrev()->SetNext(a_obj);
  m_scan->SetPrev(a_obj);
#else // BREADTH_FIRST


When it crashes, m_scan->GetPrev() is null, so the third statement, m_scan->GetPrev()->SetNext(a_obj), is what bombs.

Any idea what could cause this?
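
To make the failure mode concrete, here is a minimal sketch of the same four-step splice on a toy doubly linked list. Node, IsLinked and InsertBefore are hypothetical names for illustration only, not the real gmGCObjBase or gmGCColorSet code. It assumes a circular list with a sentinel, so a correctly linked node never has a null prev; a null prev therefore suggests the scan node was unlinked or overwritten earlier, and the splice is only where the damage shows up.

```cpp
#include <cassert>

// Hypothetical stand-in for a GC list node; not the real gmGCObjBase.
struct Node {
    Node* prev = nullptr;
    Node* next = nullptr;
};

// True when the node is linked consistently in both directions.
inline bool IsLinked(const Node* n) {
    return n && n->prev && n->next &&
           n->prev->next == n && n->next->prev == n;
}

// Insert obj immediately before scan, using the same four steps as the
// splice quoted in the post. Returns false instead of dereferencing a
// null prev when scan is not properly linked.
inline bool InsertBefore(Node* scan, Node* obj) {
    assert(obj != nullptr);
    if (!IsLinked(scan)) {
        return false;              // list is already corrupt; report it
    }
    obj->prev = scan->prev;
    obj->next = scan;
    scan->prev->next = obj;        // the statement that crashes if prev is null
    scan->prev = obj;
    return true;
}
```

A check like this does not fix anything, but calling it in a debug build right before the splice would turn the access violation into a reportable condition at the point where the corruption is first visible.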


 Post subject:
PostPosted: Sat Jan 21, 2006 10:56 pm 
Offline

Joined: Thu Jan 01, 2004 4:31 pm
Posts: 307
Have you ruled out gmBind as a cause?


 Post subject:
PostPosted: Sun Jan 22, 2006 2:37 am 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Not really. I have no idea what's going on.


 Post subject:
PostPosted: Sun Jan 22, 2006 3:58 am 
Offline

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 707
The crash is going to be in the GC linked list operation as you show. (Or in the ASSERT if debug checks enabled.)

Since the GC and gmObjects are a closed system now, it's near impossible to accidentally 'new' or 'delete' a gmObject manually, so the cause is almost certainly a missing WriteBarrier somewhere causing an object to be freed while it is still valid. The write barrier can't be overused (that would just waste time and have no effect), but if it is missing, bad things are guaranteed to happen sooner or later.

I haven't had a close look at gmBind's implementation recently to see if there are any GC suspects. I'll try and have a look soon.

What we're looking for is a container of gmVariables, where the variables can be reference types (gmObjects). The Set interface to such a container must apply the WriteBarrier. It could also be C++-owned gmObject*s that are not responding to a GC cycle and so are deemed disconnected and treated as garbage. The new experimental gmGCRoot<> is designed to prevent this from occurring by wrapping a gmObject* in appropriate code.
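
As a rough illustration of the second case, the sketch below shows the general idea of wrapping a natively held pointer so that it is always reported to the collector. It is a toy model under stated assumptions: Obj, ToyMachine and RootHandle are hypothetical names, and this is not how gmGCRoot<> is actually implemented. The handle registers the object as a root on construction and unregisters it on destruction, so C++ code cannot forget to report it during a cycle.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical GC-managed object; "marked" stands in for "reachable".
struct Obj {
    bool marked = false;
};

// Hypothetical machine that keeps a registry of natively held objects.
struct ToyMachine {
    std::unordered_multiset<Obj*> nativeRoots;

    // Called at the start of a cycle: every registered object is
    // reported as reachable, so it cannot be treated as garbage.
    void ScanRoots() {
        for (Obj* o : nativeRoots) {
            o->marked = true;
        }
    }
};

// RAII handle: registers on construction, unregisters on destruction.
class RootHandle {
public:
    RootHandle(ToyMachine& m, Obj* o) : m_(m), o_(o) {
        assert(o_ != nullptr);
        m_.nativeRoots.insert(o_);
    }
    ~RootHandle() {
        // Remove exactly one registration, so two handles to the same
        // object do not unregister each other.
        auto it = m_.nativeRoots.find(o_);
        if (it != m_.nativeRoots.end()) {
            m_.nativeRoots.erase(it);
        }
    }
    RootHandle(const RootHandle&) = delete;
    RootHandle& operator=(const RootHandle&) = delete;

    Obj* get() const { return o_; }

private:
    ToyMachine& m_;
    Obj* o_;
};
```

An object held only through a raw pointer, with no handle, is never visited by ScanRoots() in this model, which is the situation described above where a live object is deemed disconnected.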


 Post subject:
PostPosted: Sun Jan 22, 2006 10:56 am 
Offline

Joined: Thu Jan 01, 2004 4:31 pm
Posts: 307
Greg - would you mind putting a paragraph together about what the write barrier is/does and how to use it? I'm still pretty clueless about it all...

I'll look into gmBind, it's not got any write barrier operations so could possibly be suspect #1


 Post subject:
PostPosted: Sun Jan 22, 2006 11:55 am 
Offline

Joined: Mon Dec 15, 2003 1:38 pm
Posts: 707
An attempt at briefly explaining a WriteBarrier...

Incremental garbage collectors do not have complete control of the system while running a garbage collection cycle, so they must have some other means of maintaining the integrity of the graph of objects that determine what is and is not garbage, or what is or is not connected to a graph root.

A pointer variable references an object. When the value of such a pointer is stored, a Write Barrier may be used, or when such a pointer is loaded, a Read Barrier may be used to perform a special action required by the garbage collector.

If A and B are pointers to objects, the statement A = B could apply WriteBarrier(A, B) if needed.

The gmTable, for example, stores a Key and Value pair, so the Write Barrier is applied when a Set occurs to both the Key and Value. Note that only reference types matter, we care about Objects, not Variables. Also note that even a NULL assignment matters because when A = NULL, we still care about the old A before it becomes NULL.

GameMonkey's Incremental GC method was chosen because of its high performance and minimal requirements. When the GC is actually running (which is rarely), it requires a Write Barrier (WB) to be applied, and only on the Left Hand Side, that is, the old value of a pointer before it is assigned a new value. (The WB only needs to be applied to non-root objects, so the thread stack, which is a root, and is regularly operated on, does not need apply a WB.)
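
The old-value barrier described here can be sketched with a toy tri-color model. Everything in the block is an assumption for illustration (Color, Obj, ToyGC and SetSlot are hypothetical names, and each object has a single reference slot for brevity); it is not GameMonkey's actual code. The barrier looks only at the value about to be overwritten, and only does work while a cycle is running.

```cpp
#include <cassert>
#include <vector>

// Toy tri-color model; all names here are hypothetical.
enum class Color { White, Gray, Black };

struct Obj {
    Color color = Color::White;
    Obj*  slot  = nullptr;     // a single outgoing reference, for brevity
};

struct ToyGC {
    bool running = false;          // true only while a collect cycle is active
    std::vector<Obj*> grayList;    // objects waiting to be scanned

    // Barrier on the OLD value: if a cycle is running and the object about
    // to be overwritten has not been seen yet, gray it so that it survives
    // this cycle even though the reference to it is going away.
    void WriteBarrier(Obj* oldValue) {
        if (running && oldValue && oldValue->color == Color::White) {
            oldValue->color = Color::Gray;
            grayList.push_back(oldValue);
        }
    }
};

// Container-style setter: apply the barrier first, then do the store.
// Assigning nullptr still goes through the barrier, because it is the
// old referent that matters.
inline void SetSlot(ToyGC& gc, Obj& holder, Obj* newValue) {
    gc.WriteBarrier(holder.slot);
    holder.slot = newValue;
}
```

In this model a container that writes holder.slot directly, bypassing SetSlot, is the "missing WriteBarrier" case: the old referent may stay white and be freed while something else still points at it.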

For more general and more specific GC information, I recommend reading 'GameMonkeyGarbageCollection.pdf' and 'Non Compacting Memory Allocation and Real Time Garbage Collection' by Mark S Johnstone, 1996, the paper on which the GC implementation used in GM was based. You can also google '"write barrier" "read barrier"' for related links.

As a side note, I believe that Microsoft uses a generational, compacting version of a related algorithm for the .Net managed language environment.

I hope that made some sense?


 Post subject:
PostPosted: Tue Jan 24, 2006 5:05 pm 
Offline

Joined: Fri Jan 14, 2005 2:28 am
Posts: 439
Grr, I'm pulling my hair out, not able to find what is causing this crash. Greg, any way I could get some help again by chance? :)


 Post subject: WriteBarrier
PostPosted: Fri Jul 30, 2010 1:26 am 
Offline

Joined: Tue Nov 11, 2008 9:03 am
Posts: 20
Sorry to resurrect an old thread.
Can you clarify whether the write barrier has to be applied:
- when a reference is moved inside a container (the gmVariable changes location)?
- when a reference is duplicated (the gmVariable is copied to another place in the container)?
etc.?
Thanks. :)

