Server Help

ASSS Questions - Ball bug

Mr Ekted - Fri Mar 24, 2006 3:53 am
Post subject: Ball bug
I've been dealing with this now for a long time to no avail. I was trying to think about the problem from the client's point of view, since I've never seen it happen with subgame. Under asss it happens every few minutes. Here are the symptoms:

- I catch the ball and fly with it for 5+ seconds.
- I pass the ball. I see it fly from my ship.
- I hear the catch sound immediately, and I have the ball again.
- I turn my ship.
- I see the ball fly out from my ship along the original path without pressing any keys.

Here's my theory for what is happening:

The client obeys ball packets from the server. If it gets a packet saying it is carrying the ball, then it picks up the ball, even if it doesn't make sense. Likewise, if it gets a packet saying it has shot the ball, it will apply the information in the packet to a prior unknown ball pass, even if it thought it was carrying the ball.

However, the client SHOULD be ignoring ball packets with timestamps earlier than any previous ball packets it SENT or RECEIVED. Otherwise we'd be seeing all sorts of anomalies.
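
A minimal sketch of the kind of staleness check described above. This is not code from Continuum or asss: the names `ball_time_is_older`, `ball_clock`, and `ball_clock_accept` are hypothetical, and it assumes ball times are 32-bit tick counters that can wrap around.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from any real client or server.
 * Returns nonzero when pkt_time is strictly older than last_time.
 * Times are assumed to be 32-bit tick counters that may wrap, so the
 * comparison uses the signed difference instead of a plain '<'. */
static int ball_time_is_older(uint32_t pkt_time, uint32_t last_time)
{
    return (int32_t)(pkt_time - last_time) < 0;
}

/* Hypothetical client-side state: the newest ball time seen so far,
 * whether it came from a packet we sent or one we received. */
struct ball_clock {
    uint32_t newest;
    int      valid;   /* 0 until the first packet has been recorded */
};

/* Returns 1 if the packet should be applied, 0 if it should be dropped. */
static int ball_clock_accept(struct ball_clock *c, uint32_t pkt_time)
{
    if (c->valid && ball_time_is_older(pkt_time, c->newest))
        return 0;             /* stale: an earlier time than one already seen */
    c->newest = pkt_time;
    c->valid = 1;
    return 1;
}
```

The sketch only covers packets that carry a meaningful time. Packets whose time field is 0 would need separate handling that is not shown here.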

It is conceivable, and probably a common occurrence, for a ball shoot packet from the client to be "crossing by" a ball carry packet from the server, so that even after the client has shot the ball, it receives an older ball update saying it is still carrying the ball.

There seem to be two possibilities that could cause this bug. One, that asss is sending a bad timestamp on a few ball packets. The code doesn't bear this out, but I'm not certain. Two, that the client improperly handles SOME ball packets that have expired. If the latter, what's different about the subgame ball packets and the asss ball packets?

Any thoughts?
Dr Brain - Fri Mar 24, 2006 2:47 pm
Post subject:
Have you talked to anyone in HZ? They're the asss powerball experts. I never really used 'em seriously in my zone.
i88gerbils - Fri Mar 24, 2006 4:43 pm
Post subject:
Hmm, I didn't notice anything when Powerball was in the HZ sub-arena on that one day (granted we were playing small pb, not real pb). I'm not an expert on balls, but has anyone else experienced this?
Mr Ekted - Fri Mar 24, 2006 5:11 pm
Post subject:
Yes, it happened in HZ, just not as much. I believe they made ball packets reliable. This may mask the symptoms but I think it's a very poor solution. The server and the clients have to be able to deal with ball packets coming in with all sorts of crazy timings and sequences. In the 9 years I used subgame I never saw this happen once, and it doesn't use reliable ball packets.
Mr Ekted - Sun Mar 26, 2006 2:12 am
Post subject:
Here's a packet capture of the double catch happening. asss is sending all the right packets, but the client still tries to send a second pass packet (with a new timestamp) 950ms after the first one. Players visually saw the ship catch the ball a second time.


Code:
SENDING Ball Packet:
2e 00 4e 1d a6 26 00 00 00 00 04 00 00 00 00 00

SENDING Ball Packet:
2e 00 28 1c 96 26 00 00 00 00 04 00 00 00 00 00

SENDING Ball Packet:
2e 00 e6 1a 81 26 00 00 00 00 04 00 00 00 00 00

PASS:
1f 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09

SENDING Ball Packet:
2e 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09

PASS:
1f 00 7d 19 6a 26 e7 e3 31 ff 04 00 7c 44 ac 09

SENDING Ball Packet:
2e 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09

SENDING Ball Packet:
2e 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09

SENDING Ball Packet:
2e 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09

SENDING Ball Packet:
2e 00 ac 1a 7e 26 e7 e3 31 ff 04 00 1d 44 ac 09
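
For anyone reading the dump, here is a rough decoder sketch. The field layout (type, ball id, x, y, x speed, y speed, player id, time, all little-endian) is an assumption inferred from the 16-byte packets above, and the names `ball_fields` and `decode_ball` are made up for illustration.

```c
#include <stdint.h>

/* ASSUMED layout, inferred from the 16-byte dumps in the capture:
 * type, ball id, x, y, x speed, y speed, player id, time (little-endian). */
struct ball_fields {
    uint8_t  type;
    uint8_t  ballid;
    int16_t  x, y;
    int16_t  xspeed, yspeed;
    int16_t  player;
    uint32_t time;
};

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 16-byte ball packet into its fields. */
static struct ball_fields decode_ball(const uint8_t raw[16])
{
    struct ball_fields f;
    f.type   = raw[0];
    f.ballid = raw[1];
    f.x      = (int16_t)rd16(raw + 2);
    f.y      = (int16_t)rd16(raw + 4);
    f.xspeed = (int16_t)rd16(raw + 6);
    f.yspeed = (int16_t)rd16(raw + 8);
    f.player = (int16_t)rd16(raw + 10);
    f.time   = rd32(raw + 12);
    return f;
}

/* The two PASS packets from the capture, in the order they were logged. */
static const uint8_t pass1[16] = {
    0x1f, 0x00, 0xac, 0x1a, 0x7e, 0x26, 0xe7, 0xe3,
    0x31, 0xff, 0x04, 0x00, 0x1d, 0x44, 0xac, 0x09
};
static const uint8_t pass2[16] = {
    0x1f, 0x00, 0x7d, 0x19, 0x6a, 0x26, 0xe7, 0xe3,
    0x31, 0xff, 0x04, 0x00, 0x7c, 0x44, 0xac, 0x09
};
```

Under that layout, `decode_ball(pass2).time - decode_ball(pass1).time` comes out to 95 ticks, which lines up with the 950ms gap if a tick is 10ms. The server's outgoing packets after the second PASS still repeat the fields of the first one.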

Cyan~Fire - Sun Mar 26, 2006 11:12 am
Post subject:
So is the 2nd pass the only difference from the times it doesn't happen?
i88gerbils - Sun Mar 26, 2006 11:46 am
Post subject:
Do you get any Malicious statements?
Mr Ekted - Sun Mar 26, 2006 12:22 pm
Post subject:
The ball code correctly issues a warning that the player is trying to pass a ball they are not carrying.
Bak - Sun Mar 26, 2006 2:03 pm
Post subject:
Perhaps the S2C_BALL packet was lost on its way back to the shooter, and Continuum is coded to attempt to refire it if it doesn't receive ballfire confirmation from the server. The fire packet timestamp needed to be changed, or else it would be ignored by the server, so it was updated to the new value.

But since the server ignores your attempt to fire a ball you aren't carrying, you eventually receive the BallTimer packet which tells you the ball is indeed on the correct original course and it jumps to the correct position.

You could try increasing the BallTimer so it happens every 10 seconds instead of every 2 or 4. If this increases the time between when you catch the ball again and when it jumps to the correct position, then you know the timer is what is resetting it to its correct position.
Mr Ekted - Sun Mar 26, 2006 2:55 pm
Post subject:
Bak wrote:
Perhaps the S2C_BALL packet was lost on its way back to the shooter, and Continuum is coded to attempt to refire it if it doesn't receive ballfire confirmation from the server. The fire packet timestamp needed to be changed, or else it would be ignored by the server, so it was updated to the new value.


Why would it be ignored? If the server receives the 2nd pass packet with the original time, it is just as valid now as then. No one else can catch the ball until the server decides.
stag shot - Mon Apr 10, 2006 10:49 pm
Post subject: solution at long last, or not
Well, after a reply by Priitk, it seems we have figured out the double clutch bug...It comes down to this piece of code in balls.c:

Code:

   bd->state = BALL_CARRIED;
   bd->x = p->position.x;
   bd->y = p->position.y;
   bd->xspeed = 0;
   bd->yspeed = 0;
   bd->carrier = p;
   bd->freq = p->p_freq;
   bd->time = 0;
   send_ball_packet(arena, bp->ballid, NET_UNRELIABLE | BALL_SEND_PRI);


EDIT: After debugging Subgame further, this theory was proven wrong. It appears ASSS is handling ball packets properly in balls.c.
BlueGoku - Mon Apr 10, 2006 11:25 pm
Post subject:
Sweet. Thanks for the update. We'll be adding this and making our balls unreliable again in the next update.
i88gerbils - Tue Apr 11, 2006 12:07 am
Post subject:
nice. I can't resist but to say that my balls are always reliable.
Mine GO BOOM - Tue Apr 11, 2006 12:54 am
Post subject:
i88gerbils wrote:
nice. I can't resist but to say that my balls are always reliable.

Yeah, but do you have yellow or blue balls?
Chambahs - Tue Apr 11, 2006 2:10 am
Post subject:
Now you have to trash your own post MGB :P
Bak - Tue Apr 11, 2006 2:37 am
Post subject:
did priitk say why this was happening?
Mr Ekted - Tue Apr 11, 2006 12:52 pm
Post subject:
I assume the clients (VIE and Cont) treat zero speed specially for some unknown reason.
stag shot - Sat Apr 15, 2006 6:12 pm
Post subject: recant
I've edited my above post...There was some confusion, so I went ahead and debugged Subgame further and discovered the ball packets appear to be correct in asss (as is). We are now convinced there is an out-of-order packet being sent from net.c causing this strange bug. We are in the process of logging this from net.c.

stag
stag shot - Sun Apr 16, 2006 1:24 pm
Post subject: the REAL solution
After further digging, it was discovered the ball packets were not ALL being sent with equal priority(!). This caused ball packets to arrive out-of-order when the bandwidth limiting kicked in. Anyways, here's the patch for balls.c:

At the end of send_ball_packet, change

Code:

net->SendToArena(arena, p, (byte*)&bp, sizeof(bp), rel);

to

net->SendToArena(arena, p, (byte*)&bp, sizeof(bp), NET_UNRELIABLE | BALL_SEND_PRI);

I believe grel removed the "rel" parameter entirely in the official revision (either way is fine). This patch increased ball reliability quite a bit and finally fixed the dreaded double clutch!

stag
numpf - Sun Apr 16, 2006 4:08 pm
Post subject: Re: the REAL solution
stag shot wrote:
This caused ball packets to arrive out-of-order when the bandwidth limiting kicked in.
It's not just when BW limiting kicks in; it can, and probably usually does, come from normal thread switching. There is a thread dedicated to sending outbound packets. I don't know which thread ball handling runs in, but it's definitely a different one. Packets in the sending thread are processed higher-priority first. So the sequence to get the bug is:

timer for balls goes off, carry packet A gets queued
before the outgoing thread can run, a client throw B is handled and queued

Before stag's patch, B was set at a higher priority than A, so it would always send() first.
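
The reordering can be reproduced with a toy model. This is only a sketch, not the real net.c: the queue, the names `outq_push` and `outq_drain`, and the priority values are all hypothetical. It assumes the sender drains higher priorities first and keeps queue order among packets of equal priority.

```c
#include <stddef.h>

/* Toy model of an outgoing queue. Names and values are hypothetical. */
enum { PRI_LOW = 1, PRI_HIGH = 2, MAX_Q = 8 };

struct pkt {
    char label;   /* 'A' = carry packet from the timer, 'B' = throw */
    int  pri;
};

struct outq {
    struct pkt items[MAX_Q];
    size_t     count;
};

/* Append a packet to the queue. Returns 0 on success, -1 if full. */
static int outq_push(struct outq *q, char label, int pri)
{
    if (q->count >= MAX_Q)
        return -1;
    q->items[q->count].label = label;
    q->items[q->count].pri = pri;
    q->count++;
    return 0;
}

/* Drain the queue: higher priority first, queue order preserved among
 * packets of equal priority. Writes the labels in send order into out,
 * which must hold at least MAX_Q + 1 bytes, and NUL-terminates it. */
static void outq_drain(struct outq *q, char *out)
{
    size_t n = 0;
    for (int pri = PRI_HIGH; pri >= PRI_LOW; pri--)
        for (size_t i = 0; i < q->count; i++)
            if (q->items[i].pri == pri)
                out[n++] = q->items[i].label;
    out[n] = '\0';
    q->count = 0;
}
```

Queue A at `PRI_LOW` and then B at `PRI_HIGH`, and the drain sends B before A even though A was queued first. Queue both at the same priority and they go out in the order they were queued, which is the effect of the patch.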

It is a flaw in the ball protocol that it relies on packets send()'d in sequence to arrive in sequence. However, in practice they arrive out-of-order so rarely (Ekted> 0.00001%) that in however many years of playing PB I only saw double clutching on subgame once, a few months ago when I was really lagged.

Another issue that is probably from sequencing is a rare bug we saw in the proball subarena. In that arena, the ball spawn and player spawn areas are very small, in the center. If the ball spawns and you warp to C you will generally catch the ball immediately. Sometimes when a player would warp, catch the ball, and die almost instantly, the ball would be dropped where they had been before they warped, halfway across the map. Subgame depends on the sequence of packets for death/position/balldrop.

I would guess one could find 2+ other places in asss where it mixes priorities for packets that subgame (incorrectly) assumed would be sequential, though probably without as problematic an effect as double clutching. Flags and turreting come to mind.

-numpf
Grelminar - Sun Apr 16, 2006 4:34 pm
Post subject:
Flags and turrets are both all reliable, except for the turf ownership summary packet, which isn't very important. So is death.
numpf - Sun Apr 16, 2006 4:55 pm
Post subject:
Grelminar wrote:
Flags and turrets are both all reliable, except for the turf ownership summary packet, which isn't very important. So is death.
Good for them; you are missing the point. It's not just a question of a flag packet being in the proper sequence with another flag packet. Position packets are not reliable. They generally affect everything. If a driver warps and kicks off his turrets almost immediately afterwards, is it possible the gunners get split into pre- and post-warp positions? Or together but separate from the driver? I gave examples that I think are good candidates; I do not know for sure. The point is that large portions of the code should get a lookover for this issue, and users/zone ops should be aware of it so they can diagnose it if they see it.

-numpf
Grelminar - Sun Apr 16, 2006 7:52 pm
Post subject:
I'm well aware of your point, I just chose not to address it directly in that post.

That post was to point out that there can't be any issues among combinations of {flags, turrets, death}, not that there can't be any more issues at all.

Solving all these problems at once would take dropping the priority system, and I'm not sure that's the best thing to do yet. If it turns out to be, there needs to be some way to handle the things that it's used for (e.g., sending important position packets before unimportant ones, preventing unrels from crowding out rels).

For the first, buffering in the position packet code, rather than in the network layer, might be a solution. There are other reasons why that may be a good idea.

For the second, special handling of unrel vs. rel in the bandwidth limiter might suffice.
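
One way the second idea could look, as a rough sketch only. Nothing here is asss code: `bw_limit`, `bw_try_send`, and the notion of a fixed reserve are assumptions made for illustration. The limiter keeps part of each interval's byte budget for reliable traffic, so unreliable packets cannot use it up.

```c
#include <stddef.h>

/* Hypothetical per-connection limiter state for one send interval. */
struct bw_limit {
    size_t budget;       /* bytes that may still be sent this interval */
    size_t rel_reserve;  /* part of the budget only reliable packets may use */
};

/* Returns 1 and charges the budget if a packet of len bytes may be sent
 * now, otherwise returns 0 and leaves the state unchanged. */
static int bw_try_send(struct bw_limit *bw, size_t len, int reliable)
{
    size_t avail = bw->budget;

    /* Unreliable packets may not dip into the reserved share. */
    if (!reliable)
        avail = bw->budget > bw->rel_reserve
              ? bw->budget - bw->rel_reserve
              : 0;

    if (len > avail)
        return 0;

    bw->budget -= len;

    /* Reliable packets spend the reserve first, so it shrinks as it is used. */
    if (reliable)
        bw->rel_reserve -= len < bw->rel_reserve ? len : bw->rel_reserve;

    return 1;
}
```

With a budget of 100 and a reserve of 40, unreliable packets can send at most 60 bytes in the interval, and a reliable packet of 40 bytes still fits afterwards. Whether a fixed reserve is the right policy is an open question; the point is only that this use of priorities could move into the limiter.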
All times are -5 GMT