Weird rumble scores
Being fair means no one is getting advantage over it. And the ability to withstand occasionally skipped turns is part of the competition.
Since no one can guarantee that robots are always running with sufficient resources, I’ve always held that robot authors should assume low-performance computers.
"no one is getting advantage over it" is certainly untrue. Consider a bot that uses very little CPU in the main thread but creates lots of GC overhead, in a battle versus a bot that uses most of the typical CPU allotment but doesn't create much GC overhead. If this is on a system where the GC thread can affect the time available to the main thread (i.e. there isn't enough spare CPU for the GC thread), then the first bot gains an advantage from the GC overhead it creates, since that overhead causes more skipped turns for the other bot than for itself. This is a somewhat extreme example, but the point is that skipped turns caused by GC overhead are anything but fairly distributed.
You are right. Apart from creating many threads that do a lot of work when it’s others’ turn, creating a lot of objects to increase GC overhead does affect others’ bots as well, making the result a little bit random.
But I doubt GC overhead makes much of a difference. The most unreproducible scores I’ve seen always came from some rare exception, say 1 in 1000. Once it happens, it drives the score of some random pairing close to 0. Averaged with a few normal scores, it really looks like the score is decreasing for no reason.
But there’s always a reason, and it mostly comes from the specific bot rather than the clients, since not everyone is affected.
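To make the averaging effect concrete, here is a small worked example. The scores (three normal battles at 80% and one crashed battle at 0%) are made-up illustrative numbers, not values from the rumble, and `PairingAverage` is a hypothetical name.

```java
// Illustrative only: the scores below are assumed, not taken from the rumble.
class PairingAverage {
    /** Mean of the per-battle percentage scores for one pairing. */
    static double average(double[] scores) {
        double sum = 0;
        for (double s : scores) {
            sum += s;
        }
        return sum / scores.length;
    }

    public static void main(String[] args) {
        double[] normal = {80, 80, 80};        // three ordinary battles
        double[] withCrash = {80, 80, 80, 0};  // same, plus one battle lost to an exception
        System.out.println(average(normal));    // 80.0
        System.out.println(average(withCrash)); // 60.0
    }
}
```

With only a handful of battles per pairing, a single near-zero result moves the average by a large amount, which can look like an unexplained drop.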
So my advice is that you output exceptions to a file and check whether there are any. Skipped turns can be counted as well. I was doing this in older bots, and concluded that GC overhead & skipped turns aren’t really the problem, but exceptions are.
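A minimal sketch of that kind of bookkeeping, kept independent of Robocode so it runs on its own. `BattleDiagnostics` and its methods are hypothetical names, not part of any library. In a real bot, `recordSkippedTurn` would presumably be called from the `onSkippedTurn` handler, `recordException` from a catch block around the per-turn logic, and the `Writer` would need to wrap a `RobocodeFileOutputStream` on a file obtained from `getDataFile`, since bots are not allowed to open arbitrary files.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of Robocode: counts skipped turns and
// logs exceptions to whatever Writer it is given.
class BattleDiagnostics {
    private final PrintWriter out;
    private int skippedTurns;
    private int exceptions;

    BattleDiagnostics(Writer sink) {
        this.out = new PrintWriter(sink, true); // auto-flush on println
    }

    /** Intended to be called from the bot's skipped-turn handler. */
    void recordSkippedTurn(long turn) {
        skippedTurns++;
        out.println("SKIPPED turn=" + turn);
    }

    /** Intended to be called from a catch block around the per-turn logic. */
    void recordException(long turn, Throwable t) {
        exceptions++;
        out.println("EXCEPTION turn=" + turn + " " + t);
        t.printStackTrace(out);
        out.flush();
    }

    int getSkippedTurns() { return skippedTurns; }

    int getExceptions() { return exceptions; }

    String summary() {
        return "exceptions=" + exceptions + " skippedTurns=" + skippedTurns;
    }

    public static void main(String[] args) {
        StringWriter log = new StringWriter(); // stands in for the data file
        BattleDiagnostics d = new BattleDiagnostics(log);
        d.recordSkippedTurn(41);
        try {
            throw new IllegalStateException("simulated bug");
        } catch (RuntimeException e) {
            d.recordException(42, e);
        }
        System.out.println(d.summary()); // exceptions=1 skippedTurns=1
        System.out.print(log);           // the log lines and stack trace
    }
}
```

Comparing the two counters across many battles is what lets you tell a pairing ruined by an exception from one that merely skipped a few turns.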
For the specific case I mentioned of ags.Glacier 0.3.0 versus lxx.Emerald 0.6.5, when it was at 1 battle, I thought it was likely some rare exception as you say, but then a 2nd battle came in with about the same low score as the 1st battle, and yet I'm not able to reproduce a result anything like that in many, many tests. This leads me to believe that there is most likely something significantly different about the environment those two battles were run in, as compared to my own environment.
I think there must have been a bad client running at some point. BeepBoop 0.14a is identical to 0.14, but is getting 0.3 points better in the rumble now. Even weirder, I just noticed that BeepBoop 0.13 has 85% against eem.awful, which disables itself at the beginning of each battle!