Author |
Message |
Agile Guest
Offline
Posted: Sun Mar 23, 2008 6:25 pm Post subject: Deadlock?
So I thought I had the "deadlock" situation under control by putting pthread_mutex_lock and pthread_mutex_unlock around every call to LLAdd and LLRemove, but I just got a deadlock today when my buddy and I were playing around in the zone. Can someone explain to me what exactly this is and how to fix it?
-me
tcsoccerman Server Help Squatter
Age:33 Gender: Joined: Jan 15 2007 Posts: 694 Location: Atlantis Offline
Posted: Sun Mar 23, 2008 7:47 pm
As far as I know, it happens after any crash. Review any modules you made or added. Remove modules to see if it still happens. Recreate the deadlock situation. Test. Debug. Yes, it sucks.
JoWie Server Help Squatter
Gender: Joined: Feb 25 2004 Posts: 215 Offline
Posted: Sun Mar 23, 2008 8:03 pm
It can happen in two ways:
1. The main thread (the thread you are most likely working in) hangs or crashes.
2. Some thread holds a mutex lock and never unlocks it (or just takes too long to unlock it).
The deadlock module aborts ASSS if the main thread hasn't responded for over 10 seconds.
Bak ?ls -s 0 in

Age:26 Gender: Joined: Jun 11 2004 Posts: 1826 Location: USA Offline
Posted: Mon Mar 24, 2008 1:35 am
Well here's how a by-definition deadlock occurs:
Both threads need to acquire two locks, lock A and B. Thread 1 executes first and acquires lock A. After some time, thread 2 starts to execute (after all, one processor can only execute one thread at a time so it switches between them). It acquires lock B and then tries to get lock A. It can't get lock A since thread 1 holds the lock so thread 2 must stop executing until lock A is free. Thread 1 then executes and tries to acquire lock B. Since thread 2 holds lock B it can't get the lock. Now both threads are stuck forever.
Now this situation heavily depends on when tasks get preempted by the scheduler so it's a difficult bug to find and fix. Also, as mentioned before, if you never release a lock somewhere it can occur (as a thread will never be able to acquire the lock). An infinite loop inside a locked region (a critical section) may also cause a deadlock somewhere. And who said multi-threaded programming was supposed to be easy (until transactional memory becomes popular)!?!
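The two-lock scenario described above can be sketched with plain pthread mutexes. This is only an illustration under stated assumptions: `lock_a`, `lock_b`, `worker`, and `run_two_workers` are hypothetical names and not ASSS code. The deadlock-prone ordering is described in a comment; the code itself has both threads take the locks in the same order (A, then B), which is one common way to rule out the circular wait.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;

/* Deadlock-prone pattern (do NOT do this):
 *   thread 1: lock A, then lock B
 *   thread 2: lock B, then lock A
 * If each thread gets its first lock before the other gets its second,
 * both wait forever. */

/* Safe pattern: every thread acquires A before B. */
static void *worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock_a);
        pthread_mutex_lock(&lock_b);
        shared_counter++;               /* critical section */
        pthread_mutex_unlock(&lock_b);  /* release in reverse order */
        pthread_mutex_unlock(&lock_a);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter. */
static int run_two_workers(void)
{
    pthread_t t1, t2;
    shared_counter = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Because both workers use the same acquisition order, `run_two_workers()` always returns; swapping the order in only one of the threads would reintroduce the timing-dependent hang.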
I'd say do a
lm->log(L_DRIVEL,"got lock for MyLinkedList at line %i in file %s.\n", __LINE__, __FILE__);
when you acquire locks and
lm->log(L_DRIVEL,"released lock for MyLinkedList at line %i in file %s.\n", __LINE__, __FILE__);
whenever you release a lock. Then disable the deadlock module so it doesn't restart your server. When the program hangs, check the log to see if there's some lock that you haven't released (also do the same for arenalist locks and playerlist locks). If all locks are okay, do the same every time you enter and leave a function in your code (or at least the popular ones, like callbacks and interface functions). That way you can check for infinite loops. Lastly, it wouldn't surprise me if ASSS had deadlocks in the main code, since they're some of the hardest bugs to find and fix. However, since others aren't experiencing all these deadlocks, it's most likely your code.
_________________
SubSpace Discretion: A Third Generation SubSpace Client
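The lock logging suggested in this post could be wrapped in a pair of macros so that every acquire and release is recorded with its file and line. This is a rough sketch, not ASSS code: `logged_lock`, `logged_unlock`, `LOCK_LOGGED`, and `UNLOCK_LOGGED` are hypothetical names, and it writes to `stderr` rather than `lm->log` (an assumption made so the block stands alone; `stderr` is normally unbuffered, so the last message is visible even if the process hangs).

```c
#include <pthread.h>
#include <stdio.h>

/* Logs before and after acquiring, so a thread stuck waiting on the
 * mutex leaves a "waiting" line with no matching "got" line. */
static int logged_lock(pthread_mutex_t *m, const char *name,
                       const char *file, int line)
{
    int rc;
    fprintf(stderr, "waiting for lock %s at line %d in file %s\n",
            name, line, file);
    rc = pthread_mutex_lock(m);
    fprintf(stderr, "got lock %s at line %d in file %s (rc=%d)\n",
            name, line, file, rc);
    return rc;
}

static int logged_unlock(pthread_mutex_t *m, const char *name,
                         const char *file, int line)
{
    int rc = pthread_mutex_unlock(m);
    fprintf(stderr, "released lock %s at line %d in file %s (rc=%d)\n",
            name, line, file, rc);
    return rc;
}

/* #m turns the mutex variable name into a string for the log. */
#define LOCK_LOGGED(m)   logged_lock(&(m), #m, __FILE__, __LINE__)
#define UNLOCK_LOGGED(m) logged_unlock(&(m), #m, __FILE__, __LINE__)
```

With a mutex variable such as `my_list_mtx`, replacing `pthread_mutex_lock(&my_list_mtx)` with `LOCK_LOGGED(my_list_mtx)` keeps the call sites short while recording where each lock was taken.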
[Attachment: deadlock.PNG]
Agile Guest
Offline
Posted: Mon Mar 24, 2008 10:27 am
Okay, so this is the situation: any time I kill a fake player, the server "deadlocks".
This wasn't always the case so I assume I added something that is causing this, but I have no idea what it could be. The deadlock described by definition does not occur, because I tried adding the messages and all locks are locked/released as they should be.
Next I tried adding messages at the beginning of the relevant bot functions to see where the hang occurs. No hints there either; no messages are sent or received before the hang.
So I don't really know what to do now. It seems like the Bot_Killed function just isn't being called, which doesn't make sense to me because it's worked before and nothing has changed in it. But when the bot dies, the server freezes instantly and "u killed it" is never sent.
Note: Just to be sure, I also tried again by logging it through logman, and there's no message in the console log either.
So, what is the next step? I'm at a loss.
dmg->AddFake(bd->p, &bd->pos, Bot_Killed, Bot_Respawn, bd);

void Bot_Killed(Player *p, Player *killer, void *clos)
{
    chat->SendMessage(killer, "u killed it");
    BotData *bd = clos;
    void *v;
    chat->SendMessage(killer, "u killed it");
    stats->IncrementStat(killer, STAT_FLAG_POINTS, p->position.bounty);
    stats->SendUpdates(v);
    chat->SendMessage(killer, "u killed it");
    kill_bot(bd);
    if (get_bases(0) < 1)
    {
        chat->SendArenaMessage(ALLARENAS, "RTZ game has been won by team 1!");
        new_game();
    }
    else if (get_bases(1) < 1)
    {
        chat->SendArenaMessage(ALLARENAS, "RTZ game has been won by team 0!");
        new_game();
    }
}
The hang occurs when the bot dies, but none of the Bot_Killed code is executed.
-me
Bak ?ls -s 0 in

Age:26 Gender: Joined: Jun 11 2004 Posts: 1826 Location: USA Offline
Posted: Mon Mar 24, 2008 11:56 am
Try using printf instead of logman or SendArenaMessage. Also make sure you end your lines with a \n so it'll flush the stream.
logman and SendArenaMessage do buffering, so they aren't instant.
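Whether a trailing \n actually flushes depends on how stdout is buffered: it is typically line-buffered only when attached to a terminal, and fully buffered when redirected to a file or pipe. A minimal sketch that does not rely on that, assuming a hypothetical helper named `trace_point`, flushes explicitly after each message:

```c
#include <stdio.h>

/* Prints a trace message and forces it out immediately, so it is
 * visible even if the process hangs right after this call.
 * Returns the number of characters printed (negative on error). */
static int trace_point(const char *where)
{
    int n = printf("reached %s\n", where);
    fflush(stdout);  /* do not depend on '\n' triggering a flush */
    return n;
}
```

Calling `trace_point("Bot_Killed")` as the first statement of a callback is one way to use it. Writing to `stderr` with `fprintf` is another option, since `stderr` is not fully buffered by default.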
Dr Brain Flip-flopping like a wind surfer

Age:39 Gender: Joined: Dec 01 2002 Posts: 3502 Location: Hyperspace Offline
Posted: Mon Mar 24, 2008 6:42 pm
Try using the Hyperspace version from monotone, branch asss.asss.hs. We had an issue with destroying fakes from a locked context, and I committed fixes to monotone to solve it.
The patch to fake.c is all you really need, actually. http://asss.yi.org/viewmtn/viewmtn.py/revision/info/ae7e0babc0bc7d9862e7e0c3fe1ce0709d4abc1f
_________________
Hyperspace Owner
Smong> so long as 99% deaths feel lame it will always be hyperspace to me
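One common way a call made from a locked context can hang is that the callee tries to take a mutex the calling thread already holds. The sketch below does not reproduce the fake.c patch; it only illustrates, assuming a plain pthread mutex and a hypothetical function named `relock_result`, how an error-checking mutex reports such a relock attempt with EDEADLK instead of blocking forever.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700  /* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <errno.h>
#include <pthread.h>

/* Locks an error-checking mutex twice from the same thread and
 * returns the result of the second lock attempt. */
static int relock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mtx;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mtx, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mtx);        /* first lock succeeds */
    rc = pthread_mutex_lock(&mtx);   /* same thread locks again */

    pthread_mutex_unlock(&mtx);      /* release the one lock we hold */
    pthread_mutex_destroy(&mtx);
    return rc;
}
```

With a default mutex the second lock would simply never return, which looks exactly like the freeze described in this thread. Temporarily switching a suspect mutex to the error-checking type while debugging is one way to turn that hang into a visible error code.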
Bak ?ls -s 0 in

Age:26 Gender: Joined: Jun 11 2004 Posts: 1826 Location: USA Offline
Posted: Tue Mar 25, 2008 8:26 am
Wait, is the problem when you kill a fake player, or when you make the fake player leave the arena?
Animate Dreams Gotta buy them all! (Consumer whore)

Age:37 Gender: Joined: May 01 2004 Posts: 821 Location: Middle Tennessee Offline
Posted: Wed Mar 26, 2008 11:21 am
Bak wrote: "Try using printf instead of logman or SendArenaMessage. Also make sure you end your lines with a \n so it'll flush the stream.
logman and SendArenaMessage do buffering, so they aren't instant."
Does \n really flush the stream? =\ My professors told me the difference between using \n and using std::endl was that endl would flush the stream.
Bak ?ls -s 0 in

Age:26 Gender: Joined: Jun 11 2004 Posts: 1826 Location: USA Offline
Animate Dreams Gotta buy them all! (Consumer whore)

Age:37 Gender: Joined: May 01 2004 Posts: 821 Location: Middle Tennessee Offline
Posted: Wed Mar 26, 2008 11:26 am
Oh. I assumed since they were both streams, they'd operate basically the same as far as flushing.
Smong Server Help Squatter

Joined: 1043048991 Posts: 0x91E Offline
Posted: Wed Apr 02, 2008 10:06 am
v isn't initialised, but I doubt you would notice any side effects. You may want to use stats->SendUpdates(NULL) instead. Also, are you attaching the points_kill module to the arena? You seem to be duplicating some of the code here.
Can you post the code for this?
As for \n flushing, I would agree it probably depends on the OS/environment.
_________________
ss news