Server Help

ASSS Questions - ball phasing

BlueGoku - Tue Jul 26, 2005 7:13 pm
Post subject: ball phasing
Not sure if this is noticeable to any zones that don't revolve around a powerball, but for HZ it's pretty much the only thing stopping us from moving to ASSS right this minute. The bug I'm talking about is when the ball is passed to someone and it seems to phase through, but then a second later the person it phased through magically has it again. Don't get me wrong though, this happens with subgame as well, but it occurs way, way, way less. With subgame it is related to your ploss, though with ASSS it happens all the time, and doesn't really seem to be in direct relation with ploss or ping. It was happening to people who have low ping and low ploss.

Bak was saying something about it being related to the way ASSS handles ball packets, though I could be mistaking. Hopefully there's a way to fix it, but if not, I can't see how I can support a move to ASSS since it would be such a major problem for us in HZ.
Mine GO BOOM - Tue Jul 26, 2005 8:59 pm
Post subject:
I remember Grelminar at one point mentioning a way he wanted to deal with ball pickups of putting them on hold for a couple of cycles (ms), but the code shows it doesn't wait at all, and just forwards almost any ball request.

My recommendation? Try adding debugging code to print timestamps to all the ball-related events. It shouldn't be hard to quickly add in a new module that has all the soccer-related callbacks registered that will print either arena messages or to a log file.

What I really think this is, is a problem with net.c. It has shown (though I havn't source dived much to find a possible reason) that it can introduce artifical lag (buffering). I'm guessing this is probably related to some lock on some player or something somewhere in which it causes a specific thread to either not get access for a couple of cycles, or another one in which it spins for too long.

A possible, but very, very ugly hack, for a way to check this is to add a logging method to the rwlock.c in which everytime a function locks/unlocks/waits for a lock, it will log which thread is doing such, and at what time. Then, you could write up a quick script in which would parse this log, and show the average time certain locks sit, which thread locks things for too long, etc. A lot of nice statistics in which could show if a certain lock was spinning too long while three other things, or more importantly the send/recv threads, were waiting for it to finish. Either way, this should result in interesting information.
Grelminar - Wed Jul 27, 2005 4:20 am
Post subject:
Mine GO BOOM wrote:
...but the code shows it doesn't wait at all, and just forwards almost any ball request.


Well, that's how the protocol works :)

Seriously, the ball protocol was one of the last things to get added to SS, and it shows. It's not as clean as most of the rest of it, and there's a lot of implicit stuff that I probably don't fully understand. I don't think I'm doing anything wrong with balls specifically, although it's possible. Probably best would be for me to add some debugging code, and then watch people pass balls around and see if I can figure out what's going on.

Re: MGB's comments: I highly doubt this is due to any sort of lock contention. If that were the case, everything would be laggy, not just balls. Also, I don't think any of the locks are held for time scales that would affect packet timings so much.

More likely is that the ball packets are just getting buffered because they're going over their bandwidth limit. Could you send me the output of "?netstats", at a few different points in time?

The bandwidth code is still new, and the parameters may need tweaking. You might want to try adding something like this to global.conf:

Code: Show/Hide
Net:PriLimit1=20
Net:PriLimit2=40


That should let more ball packets through on the first try, possibly at the expense of some position/weapons. Eventually, something like 25/35 or 30/30 might be best, but starting with 20/40 will at least give a clear signal if this is the problem. (The current defaults for these values are 40/20.)
BlueGoku - Wed Jul 27, 2005 9:46 am
Post subject:
MGB: I'll get our devs to add that debugging code in, thanks.

Grel: There aren't any people on right now, so these netstats are just taken while only I'm shooting. Not sure if they'll be helpful. I'll get some more once someone gets on if needed.

Code: Show/Hide
netstats: pings=947350  pkts sent=8948  pkts recvd=22981
netstats: bw out=14  bw in=8
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7520/674/172/17/26/23/9/28
netstats: pri=45/35472/1824/1469002/11772
Goal by BlueGoku!! No Assist!
  BlueGoku> ?netstats
netstats: pings=947403  pkts sent=9005  pkts recvd=23454
netstats: bw out=14  bw in=8
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7565/675/173/17/26/25/9/28
netstats: pri=45/35712/1888/1469442/11832
  BlueGoku> ?netstats
netstats: pings=947443  pkts sent=9056  pkts recvd=23844
netstats: bw out=14  bw in=8
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7605/678/173/17/27/26/9/28
netstats: pri=45/35952/1920/1469858/11880
  BlueGoku> ?netstats
netstats: pings=947542  pkts sent=9162  pkts recvd=24741
netstats: bw out=14  bw in=9
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7692/681/173/17/27/28/9/28
netstats: pri=45/36448/1984/1470425/11940
  BlueGoku> ?netstats
netstats: pings=947572  pkts sent=9219  pkts recvd=25049
netstats: bw out=14  bw in=9
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7731/693/173/17/27/29/9/28
netstats: pri=45/36608/2016/1470740/12156
  BlueGoku> ?netstats
netstats: pings=947798  pkts sent=9452  pkts recvd=26746
netstats: bw out=14  bw in=10
netstats: buffers used=2/52 (3.8%)
netstats: grouped=7924/697/173/17/30/31/9/28
netstats: pri=45/37648/2208/1471747/12348
  BlueGoku> ?netstats
netstats: pings=947947  pkts sent=9608  pkts recvd=28029
netstats: bw out=15  bw in=10
netstats: buffers used=2/52 (3.8%)
netstats: grouped=8052/699/173/17/32/34/9/28
netstats: pri=45/38336/2352/1472542/12510
  BlueGoku> ?netstats
netstats: pings=947963  pkts sent=9661  pkts recvd=28211
netstats: bw out=15  bw in=10
netstats: buffers used=2/52 (3.8%)
netstats: grouped=8079/720/175/17/32/35/9/28
netstats: pri=45/38400/2368/1472807/12900
  BlueGoku> ?netstats
netstats: pings=948353  pkts sent=11070  pkts recvd=29757
netstats: bw out=18  bw in=11
netstats: buffers used=2/52 (3.8%)
netstats: grouped=9418/731/179/17/33/36/9/28
netstats: pri=60/40827/2368/1888648/13032

BlueGoku - Wed Jul 27, 2005 10:00 am
Post subject:
Someone logged on and I was able to test passing to each other over and over. The phase thing never occured. Which isn't all that surprising because it wasn't happening while we were testing with under 5 people anyway. The bandwidth thing seems to be most logical. Anyway, here are various netstats from our passing session.

Code: Show/Hide
netstats: pings=948595  pkts sent=12383  pkts recvd=31160
netstats: bw out=19  bw in=11
netstats: buffers used=2/52 (3.8%)
netstats: grouped=10632/776/182/18/34/39/9/28
netstats: pri=75/63227/2463/1893828/13338
  BlueGoku> ?netstats
netstats: pings=948712  pkts sent=14230  pkts recvd=32806
netstats: bw out=20  bw in=12
netstats: buffers used=4/52 (7.7%)
netstats: grouped=11681/936/259/38/35/40/9/28
netstats: pri=75/76347/10772/1897138/14910
  BlueGoku> ?netstats
netstats: pings=948797  pkts sent=16036  pkts recvd=34513
netstats: bw out=20  bw in=12
netstats: buffers used=2/52 (3.8%)
netstats: grouped=12761/1088/326/58/49/41/9/28
netstats: pri=75/91675/18068/1900966/16296
  BlueGoku> ?netstats
netstats: pings=948931  pkts sent=17995  pkts recvd=36559
netstats: bw out=21  bw in=13
netstats: buffers used=2/52 (3.8%)
netstats: grouped=14171/1171/372/91/88/51/9/28
netstats: pri=75/112331/23060/1909465/17298
  BlueGoku> ?netstats
netstats: pings=949107  pkts sent=20697  pkts recvd=39425
netstats: bw out=22  bw in=14
netstats: buffers used=2/52 (3.8%)
netstats: grouped=16233/1301/420/139/128/63/10/28
netstats: pri=75/143963/28308/1918507/18360
  BlueGoku> ?netstats
netstats: pings=949218  pkts sent=22389  pkts recvd=41197
netstats: bw out=23  bw in=15
netstats: buffers used=2/52 (3.8%)
netstats: grouped=17033/1410/535/215/153/66/11/28
netstats: pri=75/153835/37012/1931781/20028
  BlueGoku> ?netstats
netstats: pings=949310  pkts sent=23789  pkts recvd=42649
netstats: bw out=23  bw in=15
netstats: buffers used=3/52 (5.8%)
netstats: grouped=17884/1501/614/262/159/67/11/28
netstats: pri=75/165979/41908/1939401/20982
  BlueGoku> ?netstats
netstats: pings=949322  pkts sent=23913  pkts recvd=42775
netstats: bw out=23  bw in=15
netstats: buffers used=2/52 (3.8%)
netstats: grouped=18001/1505/614/262/159/68/11/28
netstats: pri=75/167723/41908/1939758/20994

Grelminar - Thu Jul 28, 2005 3:37 am
Post subject:
Wow, those numbers are pretty unexpected.. over 90% of s2c traffic is reliable? I expect they'd be a little different during a busy game. Two snapshots of ?netstats, taken 5-10 minutes apart, during a busy time, would be interesting. I'd also like to see per-type stats, but I'd have to add new code to get that information.
BlueGoku - Thu Jul 28, 2005 10:54 am
Post subject:
Alright I'll see if I can get those to you today. I'm not sure how busy you meant, but the best I can probably do is 7-8 people, or maybe a full 6 on 6 game.
BlueGoku - Mon Sep 26, 2005 2:43 pm
Post subject:
Sorry to bump up this thread, but we're about to make the permanent move to ASSS and we've actually been using ASSS for the last couple of days due to some SSC biller problems we've had, namely we're changing servers and for some reason our old IP has been removed from the SSC firewall, but the new one hasn't been added yet either.

I set the prilimits to 20/40 and the ball phase was still happening quite a bit. I then played around with it and set it at 15/45 and it's still happening, but a lot less, maybe once every 6-7 minutes. Now, this is playable, but I'd rather it stopped happening at all. Here are two snapshots at a busy time, around 5 minutes apart with 15/45 prilimits:

netstats: pings=131311 pkts sent=2589956 pkts recvd=681257
netstats: bw out=23512 bw in=4026
netstats: buffers used=74/1241 (6.0%)
netstats: grouped=625560/705187/598584/366591/167116/59141/15797/5393
netstats: pri=78720/106836939/658968/4095166/37974

netstats: pings=134838 pkts sent=2876344 pkts recvd=748188
netstats: bw out=25524 bw in=4294
netstats: buffers used=94/1283 (7.3%)
netstats: grouped=677388/779767/669837/414821/190874/68276/18151/6203
netstats: pri=86805/119730202/721647/4534792/40674

Thanks in advance.
Anonymous - Tue Sep 27, 2005 2:25 pm
Post subject:
One thing we have done to at least make this better is change the Soccer:SendDelay to a much lower value. This makes the ball appear in the correct place after doing this weird phase sooner. Of course we would prefer to eliminate these "phasewarps" as much as possible.
Grelminar - Wed Sep 28, 2005 4:26 am
Post subject:
Hmm. First of all, those numbers look much more reasonable. 96.2% medium-priority unrel traffic (positions/weapons), 0.5% high-priority unrel (balls, some positions/weapons), 3.3% reliable. Also, lowering Soccer:SendTime has some precedent; I see PB does it. How low are you setting it? Less than 250 is probably too low.

Now, high priority unrel is using much much less than the 45% you've allocated to it, so I think the effect of those prilimits settings is just to cause more nonessential stuff to be dropped. And if you're still having issues, then maybe the limits are just getting too high. Another few settings you might want to play with are:

Net:LimitScale - default 512, settting it lower will make bandwidth limits grow more slowly and be less variable. Maybe try 400 and then 300?

Net:LimitMaximum - bytes/second, default 102400, bandwidth limits are clipped by this value on the top end. You could try 15k or even 10k or 5k, to see what happens.

Also, there's always the possibility that this behavior is a bug in the ball packet handling, and not due to network-related effects. I'd have to add a bunch of debugging stuff to a server, and then watch it while the phasing was happening, to understand it better.
BlueGoku - Thu Sep 29, 2005 2:41 am
Post subject:
Well, we're thinking of moving pretty soon actually, so is there any way you could possibly set up that debug code? Arnk and I can get it set up on our server so you don't actually have to go through all the trouble of getting enough people to test or wait for the phasing to occur.
Mr Ekted - Thu Sep 29, 2005 2:00 pm
Post subject:
Ball phasing happens client-side. If the ball passes by clients A and B, they each decide who they THINK should be catching it (by position). If they both think the other should have caught it, they both decide to "back off" for a few seconds, so they don't unintentionally "steal it". In fact, ball phasing happens on every catch. It's just normally too quick to see.

The way to fix the protocol is to have all clients within prox send a catch packet. The server starts timing when it receives the first one, and waits for some small period of time (say 50ms) to see if it gets any more. The closest catch within that period wins and the server sends the appropriate catch packet back.

The side effect of this change is that people with really high latencies will get the ball "stolen" from them often when ships are close together.
Anonymous - Thu Sep 29, 2005 2:30 pm
Post subject:
This type of phasing is not the classic phase you have described. One person all alone can try to catch the ball, but it will not appear on their ship until it has passed through their ship noticably. The same thing happens with passing, someone will pass it, but it won't actually leave their ship for a little bit.
Mr Ekted - Thu Sep 29, 2005 3:26 pm
Post subject:
Ah ok. That's almost definitely server-side. icon_smile.gif
BlueGoku - Sat Oct 01, 2005 7:59 pm
Post subject:
Yep, guest beat me to it. While it happens a lot less with the changes we've made as stated earlier in the thread, it still happens a noticeable amount.
Grelminar - Sun Oct 02, 2005 12:28 am
Post subject:
Perhaps I should try to observe the phenomena in HZ first. What are your busy times? I can try to drop by tomorrow (Sunday), some time between 3 and 9 eastern time. Midweek is difficult, but if there are people around late (after midnight eastern), it might work.
BlueGoku - Sun Oct 02, 2005 11:28 am
Post subject:
We should have at least a full game going between 6-8 EST. Our population has really taken a hit lately, since we've lost SSC access and priitk hasn't bothered setting us up with it again yet.
All times are -5 GMT
View topic
Powered by phpBB 2.0 .0.11 © 2001 phpBB Group