From Robowiki

Desired queries

I'm currently interested in queries to enable tweets about RoboRumble entrants once they've achieved a stable rating. For that, I think I'd need the following:

  • PARAMS: Bot/Version, Rumble; RETURN: Battle count
  • PARAMS: Bot/Version, Rumble; RETURN: APS
  • PARAMS: Bot/Version, Rumble; RETURN: Ranking (eg, 9th)

These might be nice, so I'll post them for sake of brainstorming, but they're not needed by me:

  • PARAMS: Bot/Version, Rumble; RETURN: Glicko-2
  • PARAMS: Bot/Version, 1v1 or Melee; RETURN: Lowest weight class
    • Can be deduced from querying with each of the 4 rumbles, anyway.
  • PARAMS: Bot/Version, Rumble; RETURN: Pairings complete? (boolean)
    • Can be assumed complete for battle count > 2000.

--Voidious 14:37, 6 August 2009 (UTC)

Any reason you list the first three separately? Wouldn't it make sense to have one query get all three of those things about a bot? Since a number is so much smaller than an HTTP header, wouldn't it be easier to just get all of them at once? --Rednaxela 21:14, 6 August 2009 (UTC)
No particular reason, it could certainly get them all at once. One potential benefit to separating out battle count is that I wouldn't need APS / ranking until battle count reached a certain threshold, and I think ranking is a lot more overhead than the other two (requires sorting all the bots by APS). --Voidious 21:32, 6 August 2009 (UTC)
One basic query could get you the entire row (minus rank #) from any rankings table, based on bot name and game type. Or you could get the whole table and sort it yourself by whatever column you wish. I'd rather keep the number of query types small and their functions general, and then users consuming that data can do sorting, comparing, etc. --Darkcanuck 05:45, 7 August 2009 (UTC)
Yep, makes sense. Those two queries (get row, get all rows for a given bot/rumble) would do everything I need. --Voidious 12:48, 7 August 2009 (UTC)

The one I'd mostly be interested in would be the following:

  • PARAMS: Rumble, Timestamp(optional); RETURN: All pairing details (bot names, score, survival, maybe battle count) in Rumble (that had a battle since Timestamp).

It would make it easy to get/update test data for doing analysis of how bots do against other bots.

--Rednaxela 15:43, 6 August 2009 (UTC)

Pairing data (ie summarized battles for each pair) or raw battle data? It sounds very expensive... How would you use this? --Darkcanuck 05:45, 7 August 2009 (UTC)
Pairing data, I mean, not raw battle data. I was thinking of doing a huge clustering of all 269011 pairings, across 734 dimensions (one for each bot). Essentially, the same data as RatingDetailsJson, except 1) the pairings for all bots, not just one, 2) being able to filter out data that's not new, and 3) no need for Ranking/ExpectedScore/PBI. Once this huge clustering is computed, the cluster centers could be used to categorize bots, and potentially reveal hidden categories of bots that aren't well known on the wiki. I really think the result of such a clustering would be immensely interesting :) --Rednaxela 06:28, 7 August 2009 (UTC)

I was also interested in pairing data for all bots. It could be used to prioritize battles in a more efficient manner than simply aiming for 2000 battles. For example, pairings that never occurred would have priority over other battles, pairings with fewer battles would have more priority, and/or bots paired with retired bots would have priority. The rankings would stabilize A LOT faster (hours instead of days) and the BATTLESPERBOT parameter would no longer be needed. But the only ways to retrieve this pairing data right now are hammering the "query=pairing" endpoint once for each bot, or parsing the ranking pages. --MN 20:10, 3 July 2011 (UTC)

Actually that's the best way to get the pairings -- returning them all in one glob would be way too much data... I never did implement Rednaxela's query because filtering out old results would likely be too much for the server. I'm working on another web project right now and may have some ideas to improve the rumble database, but it's been a while. --Darkcanuck 21:17, 3 July 2011 (UTC)
I think a few hundred thousand pairings would not overload the database/server, but the download may take a while depending on the connection (about 1 to 30 megabytes, depending on the format). 10 clients downloading this, like, every 2 hours, seems viable. --MN 21:15, 8 July 2011 (UTC)
Bandwidth isn't an issue (we have tons) but memory usage and disk I/O operations by MySQL are what really slow things down. I've done big queries (manually) before and they can be really, really slow, even without joins or complex filters -- just based on the size of the database. I think the current API still gives you what you need, albeit with more calls. Send me an email and I'll get you an API key, we can always reduce the throttling rate to speed things up if necessary. I also make daily database dumps, maybe you'd rather use that directly? --Darkcanuck 22:24, 8 July 2011 (UTC)

Actually, the BATTLESPERBOT method has an annoying flaw when mini/micro/nanobots are added to the rumble. Since there are fewer bots there, the client keeps generating mini/micro/nano battles long after those rankings have stabilized. --MN 20:10, 3 July 2011 (UTC)

I agree with you -- I posted a complaint about the "unfairness" of the smaller leagues (taking CPU time away from the main rankings) but there wasn't enough support to change it. It was a long time ago (I haven't posted much in the past year or so) so I'm not sure where to link to... --Darkcanuck 21:12, 3 July 2011 (UTC)
I was willing to create a "better" client, but it will be useless if the server stays the same. I was also willing to create an efficient server in open source Java EE, but only if the Darkcanuck/RRServer development is dead. --MN 21:15, 8 July 2011 (UTC)
Definitely not dead; I'd call it "stable". There have been zero client updates in a long time (until this week) so there hasn't been a pressing need to make any changes. I would certainly be interested in a better client, I think there's lots of room to improve the client/server interface (the current server is based on being backwards-compatible with the old one). Maybe start a new page to outline what you'd like to change? --Darkcanuck 22:24, 8 July 2011 (UTC)
It was here: Talk:RoboRumble#Nanos get little love when things get busy. Well, mathematically, fewer bots in the division doesn't mean you need fewer battles to stabilize. A single pairing would take 2k battles to get as accurate a result as we want our overall APS ranking to be. So if we want to support Mini/Micro/Nano rankings, I think they deserve to get enough battles for an accurate ranking. My view is that the right solution is for the RoboRumble client to offer greater control of what battles you run. I think you should be able to ignore Mini-only battles, or only run your own bots, etc. --Voidious 21:33, 3 July 2011 (UTC)
Intuitively, pairings with different bots offer more information than pairings with the same bot. You don't need 2k battles to get an accurate result, but that number is there to ensure at least 1 pairing with each bot in an 800+ bot rumble. It is already possible to customize the client and have greater control, but that doesn't solve the problem if you want to coordinate multiple contributors in stabilizing the rankings. The default configuration needs to be able to do the work. --MN 21:15, 8 July 2011 (UTC)
I agree having more and varied bots in the rumble gives a more rounded view of any bot's prowess. But mathematically, it's all about the variance in the pairings. Consider two results for a pairing with variance x compared to two results for different pairings, each with variance x. Both sets of results have the same statistical accuracy. If you need 2k battles for 800 pairings with average variance x to get a result accurate to some threshold, you also need 2k battles for a single pairing with variance x to get the same level of accuracy. I understand that different pairings have different variance and it would be really fancy shmancy to take that into account too, as Rednaxela tries to below, but I was trying to ignore that tangent in my first response. =) --Voidious 20:13, 11 July 2011 (UTC)
Actually, "to ensure at least 1 pairing" is incorrect IIRC, since there is a separate mechanism to get battles for missing pairings. It is true that an additional battle in a pairing that already exists is less valuable than filling in a missing pairing, but 1-battle-per-pairing is still less numerically stable as the number of pairings gets smaller.
What would really be ideal is a system that considered that the standard deviation across a bot's battles varies from bot to bot. A bot that has a greater standard deviation within a pairing contributes more uncertainty to the overall score, and thus the pairing needs more priority for extra battles. I envision something like the following:
  • All missing pairings get top priority automatically
  • All pairings with only one battle get the next most priority, because at least 2 battles per pairing allows for better assessment of which pairings need still-more-battles.
  • For every robot, get the average standard deviation of all pairings it's involved in; we'll call this the robot's "unpredictability factor". Then for every pairing, take the average of the two bots' "unpredictability factor" values, to create what I would term the "de-noised estimated standard deviation" for each pairing.
  • (That step was because the small count of battles makes the direct standard deviation for a pairing unpredictable in and of itself. Generally however, unpredictably scoring bots will tend to score unpredictably across most of their pairings, and vice versa.)
  • For each pairing, square the value of the "de-noised estimated standard deviation" to get the "de-noised estimated variance" for the pairing; let's call that "V". If you were to divide V by the number of battles already in the pairing, you would get the variance of the average score of the pairing. Instead, however, we want to know how valuable one more battle would be to the pairing. Let "B" be the number of battles already in the pairing, then calculate V/B - V/(B+1). This gives us the "estimated variance drop for adding one more battle to the pairing". You then divide that by the number of robots in the league this calculation is being performed for, to get the variance decrease for each bot's score. Consider this to be the "priority" for the pairing!
  • To incorporate multiple leagues (i.e. main/micro/mini/nano), which overlap in pairings, perform all of the above calculations separately for each. Assuming you value the score variance equally for all leagues, simply sum the priorities for each pairing, because their priorities are proportional to the variance decrease.
.... wow that was long...
And there you have it! A system to prioritize battles, which should pick the battles that reduce score variance the most automatically! It should be statistically robust I believe :)
Perhaps I should make an implementation of this whole calculation? I could put it on my test rumble server and have it tell clients the high-priority battles via the priority battles system :)
--Rednaxela 01:19, 9 July 2011 (UTC)
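Rednaxela's per-pairing calculation above can be sketched in a few lines of Python. This is only an illustration of the scheme as described, not actual rumble server code; the function names and data shapes are made up:

```python
def pairing_priority(battles, sd_a, sd_b, league_size):
    """Priority of adding one more battle to a pairing.

    battles     -- battles already in the pairing (B)
    sd_a, sd_b  -- each bot's "unpredictability factor" (its average
                   standard deviation across all of its pairings)
    league_size -- number of robots in the league
    """
    if battles == 0:
        return float('inf')                 # missing pairings: top priority
    sd = (sd_a + sd_b) / 2.0                # de-noised estimated std deviation
    v = sd * sd                             # de-noised estimated variance (V)
    drop = v / battles - v / (battles + 1)  # V/B - V/(B+1)
    return drop / league_size               # per-bot score variance decrease

def combined_priority(per_league):
    """Overlapping leagues (main/mini/micro/nano): sum the per-league
    priorities for a pairing, since each is proportional to a variance drop."""
    return sum(pairing_priority(*args) for args in per_league)
```

As the comments note, priorities fall off as a pairing accumulates battles, so high-variance and thin pairings naturally float to the top of the queue.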
Interesting... I've previously considered adding the first part: getting bots up to 2-3 battles per pairing would be very easy to implement on the server without performance impact, and requires zero client code changes (since clients blindly execute priority battles sent by the server). The rest sounds good; a pseudo-code implementation might be helpful to clarify what data would be needed per upload. If it could be implemented with an online or incremental algorithm (so that you don't have to query the detailed battle history, just update the pairing summary) then there should be no performance impact. --Darkcanuck 05:40, 9 July 2011 (UTC)
Indeed an interesting idea. If I understand it correctly, no changes to clients are necessary, only on the server side. It would make the rankings of new and almost-new bots stabilize as quickly as possible. One note though: does this imply that an older bot with a stable performance never gets any 'random battles' anymore? --GrubbmGait 09:17, 9 July 2011 (UTC)
I think Rednaxela's idea could theoretically provide a priority value for every pairing in the rumble, removing the need for random battles. But I think we would still want a fraction of each client's battles to be random so that we don't get stuck on paths where only a small subset of bots are being evaluated.
If no one's opposed to the 2 battle minimum per pairing, I can add that to the server today? --Darkcanuck 15:58, 9 July 2011 (UTC)
Yes, what my idea proposes is having a priority value for every pairing. I don't think random battles are still needed though. It shouldn't be possible to get stuck on a small subset of bots unless that small subset of bots *actually* needs that many more battles to have a stable result. If it's needed for a stable result, then it's needed for a stable result I figure.
About adding a 2 battle minimum per pairing to the current server right now I'm in support of it :) --Rednaxela 21:57, 9 July 2011 (UTC)
I've finished programming it actually, just getting some testing time in on my dev server, will update the live server soon. I've made a few changes to the priority code such that:
  • only one (random) priority battle is sent per upload (used to be 10, which could lead to a lot of duplicates as a bot neared full pairings)
  • that battle will always be a missing pairing for either of the two bots just uploaded (if there are any missing)
  • otherwise, if pairings for that league are incomplete (eg. after a bot removal) then a battle for a bot which needs a scoring update will be sent
  • finally, 50% of the time the pairing with the fewest battles for either bot will be requested
  • the other 50% of the time the client can do its thing (random pairings, focusing on bots with < 2000 battles)
It's not as fancy as your suggestion, but should help to fill out thin pairings. The % is just a guess (probably too high?), will adjust lower if bots aren't getting to 2000 fast enough (they should end up getting highest priority anyway). --Darkcanuck 22:10, 9 July 2011 (UTC)
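The rule list above boils down to a simple cascade. Here is a rough Python sketch of the selection logic as described (the real server is PHP; the parameter names and the idea of passing in precomputed candidate lists are illustrative):

```python
import random

def pick_priority_battle(missing, needs_update, thinnest, rng=random):
    """Choose the single priority battle to send back after an upload.

    missing      -- missing pairings for either uploaded bot
    needs_update -- bots needing a scoring update (league incomplete)
    thinnest     -- the lowest-battle-count pairing for either bot
    """
    if missing:                   # missing pairings always come first
        return rng.choice(missing)
    if needs_update:              # e.g. pairings incomplete after a removal
        return rng.choice(needs_update)
    if rng.random() < 0.5:        # 50%: fill out the thinnest pairing
        return thinnest
    return None                   # 50%: let the client do its own thing
```

Returning None models the case where the client falls back to random pairings and the under-2000-battles focus.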
Hard to find single battle pairings now, isn't it? ;) --Darkcanuck 14:43, 10 July 2011 (UTC)
Seems to work great; even pairings with 2 battles are not easy to find. One note though: it seems that bots that should be retired still get battles, preventing the retirement. See e.g. the 3 versions of XanderCat. --GrubbmGait 11:01, 11 July 2011 (UTC)
Yeah, I've only been replacing a single entry on the participants page, but looks like every older version sticks around. Same thing with mn.Combat I believe. -- Skotty 12:15, 11 July 2011 (UTC)
Thanks for the feedback. Priority battles and retirement have always conflicted, I should have remembered that. Fixed now, will work on a more elegant fix in the next few days. The client really should be updated to ignore priority battles for retired bots (since it's the client that does the retirement!) ... --Darkcanuck 16:51, 11 July 2011 (UTC)
Nice! Very cool. --Voidious 20:13, 11 July 2011 (UTC)
About labeling bots unpredictable instead of individual pairings... It may be the best we can do given the amount of data we have for bots vs pairings, but I think it is far from optimal. I feel like the closer to even a pairing is, the higher the variance is. For pairings that are even, it would never surprise me to see a score swing from 40% to 60%. But if one of those two bots scores 95% vs a third bot, a score of 90% would seem hugely unlikely, but it's because of the pairing, not either of the bots themselves. I wonder if in practice, your formulas would come up with all bots having almost the same unpredictability, or if the unpredictability will correlate more directly to overall APS than to anything specific about a bot. --Voidious 20:13, 11 July 2011 (UTC)
WOW, a few days and you have already done a lot more than what I was thinking myself. But in case you still want to know, I was thinking in prioritizing battles like this:
minimum weight = 1
weight of this pairing = max(battles of all pairings) - battles of this pairing + minimum weight
% probability of pairing being chosen = weight of this pairing / sum(weights of all pairings)
The idea was making the number of battles for each pairing equalize in a decentralized manner (no need for a priority list generated on the server to minimize repeated work). --MN 17:20, 16 July 2011 (UTC)
Thinking a little more, the weighting could be done proportionally to the contribution the battle would have on a bot score:
number of pairings = number of bots in this league - 1
battles of this pairing after this battle = battles of this pairing + 1
weight of this pairing = 1 / battles of this pairing after this battle / number of pairings
% probability of pairing being chosen = weight of this pairing / sum(weights of all pairings of all leagues)
The method above captures what Voidious was arguing about bots needing the same amount of battles no matter the number of bots in the rumble. Also, it doesn't need 2 battles in a pairing to start guessing the next battle, nor a centralized priority list generation. --MN 23:38 16 July 2011 (UTC)
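MN's second weighting scheme above can be written out as a short sketch. The code follows the formulas literally; the function names and the dict-of-pairings shape are illustrative, not part of any existing client:

```python
import random

def pairing_weight(battles, bots_in_league):
    """Weight of a pairing: proportional to the contribution one more
    battle would make to a bot's score (1 / (battles+1) / pairings)."""
    num_pairings = bots_in_league - 1
    return 1.0 / (battles + 1) / num_pairings

def choose_pairing(pairings, bots_in_league, rng=random):
    """Pick the next pairing with probability proportional to its weight.
    Each client can run this locally; no central priority list needed.

    pairings -- dict mapping a (bot, bot) pair to its battle count
    """
    pairs = list(pairings)
    weights = [pairing_weight(pairings[p], bots_in_league) for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]
```

Since missing pairings (0 battles) get the largest weight, the battle counts equalize over time in exactly the decentralized manner described.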

Compressing the data

I'm thinking, some of this stuff could be quite a bit of data. Perhaps it would be better to put it all into a zip file before returning it? --Skilgannon 19:48, 6 August 2009 (UTC)

Err... zip file? A raw deflate or gzip stream would be less overhead and much saner. Zip files are for archiving groups of files, not compressing raw data. Actually, if the web server is set up correctly, it can transparently compress anything coming out of it as it sends to the client, so long as the client said in the HTTP header that it accepts compressed responses (which, as a note, Firefox does on any web page). Anyway, might I suggest that the query results be encoded in JSON, since this is a very simple low-overhead format, which PHP has a builtin function to serialize to? --Rednaxela 20:57, 6 August 2009 (UTC)

I'm pretty sure my server is set up to send a gzip stream if the client can handle it. No zip files though, just a plain text stream that other apps can consume easily. JSON was exactly what I was thinking; it's a simple structure and already used by ABC's very cool LRP graphs. (Try using RatingDetailsJson instead of RatingsDetails to see an example.) --Darkcanuck 05:45, 7 August 2009 (UTC)
Just tested here, and yes, it does gzip it if the client asks nicely :) --Rednaxela 06:22, 7 August 2009 (UTC)
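For reference, "asking nicely" just means sending an Accept-Encoding header and undoing the compression if the server applied it. A minimal Python sketch (generic HTTP handling, nothing specific to the rumble server):

```python
import gzip
import urllib.request

def decode_body(body, content_encoding):
    """Undo transparent gzip compression if the server applied it."""
    if content_encoding == 'gzip':
        body = gzip.decompress(body)
    return body.decode('utf-8')

def fetch(url):
    """Fetch a URL, telling the server we accept gzip-compressed responses."""
    req = urllib.request.Request(url, headers={'Accept-Encoding': 'gzip'})
    with urllib.request.urlopen(req) as resp:
        return decode_body(resp.read(), resp.headers.get('Content-Encoding'))
```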


Well, I was eager to flesh out the rest of the logic in the Rumble tweeting script, so I got it working with parsing the RR server pages. (Developed with local copies, then final tests on the actual URLs.) I figured that if I wrote it right, it would be very simple to switch over to the query API later. I've only run it at long intervals as of last night (and killed it for now), but I think the result is pretty cool: twitter/roborumble. It watches the 1v1 and Melee participants lists; when a new version's added, it fetches / caches the rating info for the old version; when the new version hits 2000 battles in General, it tweets. It will also print the same style of results for the bot's lowest weight class if it's a MiniBot or lower, and slightly different text for a new bot (not new version). --Voidious 17:44, 9 August 2009 (UTC)

You can re-enable it, nothing to do with Diamond's troubles. --Darkcanuck 19:50, 9 August 2009 (UTC)

I also wanted to mention another idea I had that could make use of the query interface. I know that I have become super lazy in updating Diamond/Version History, perhaps in part because pasting the URLs and transcribing all the APS/ranking info feels tedious. Well, maybe I could write a MediaWiki extension that would let you write something like [[RumbleLink|minimelee| 1.04]], and it would formulate the link, fetch the rating/ranking info, and insert a line like I have on Diamond's version history. (I'm thinking it would run once when you make the edit, not every time the page is viewed.) Or I could do it via a form like the UseMod table converter, if the MediaWiki extension doesn't work out. --Voidious 17:44, 9 August 2009 (UTC)

That would be pretty neat. Will be working on the server today, might take a stab at the interface... --Darkcanuck 19:50, 9 August 2009 (UTC)


Ok, the first two query types are live. Future ones to come are for pairings (full set for one bot, plus single pairing result) and battles (full set for one pairing). Rednaxela's request for recently updated pairs will be evaluated after that. Request your keys, play with the interface (gently) and post your feedback here. The API is not yet set in stone, I'm willing to make changes to make it useable. --Darkcanuck 05:34, 16 August 2009 (UTC)

Note that the MIME type for JSON is application/json. The server may produce a 406 Not Acceptable with a client request that sends "Accept: application/json". I did get that sometimes when I tried it (via Telnet/NetCat) » Nat | Talk » 10:06, 16 August 2009 (UTC)

Interesting, I'll look into it. The API right now sends nothing special as headers, just the data. --Darkcanuck 17:44, 16 August 2009 (UTC)

This is great, Darkcanuck, thanks. I'm just tinkering with it now. I have a question about Glicko-2. Right now, DrussGT in the rumble shows a Glicko-2 rating of 2187.7 (3, 0.003). This is the info I get for DrussGT when I query the "roborumble" rankings:

  battles: 2147
  bot_id: 1222
  count_wins: 736
  created: 2009-08-10 13:45:44
  name: jk.mega.DrussGT 1.3.10
  pairings: 738
  rating_classic: 1854.737
  rating_glicko: 1879.878
  rating_glicko2: 1878.26
  rd_glicko: 0.895
  rd_glicko2: 3.372
  score_dmg: 79.848
  score_pct: 86.827
  score_survival: 93.978
  state: 1
  timestamp: 2009-08-16 15:51:12
  vol_glicko2: 3.371

How is the Glicko-2 displayed via the web related to the data shown above? (1878.26 is very close to 300 below what the web server gives, but not exactly.) I actually don't need Glicko-2, anyway, but the discrepancy caught my eye when testing this stuff. --Voidious 17:10, 16 August 2009 (UTC)

Good catch! A long time ago Skilgannon suggested scaling Glicko2 to match the ELO scale (of the day). I calculated a scaling factor using the numbers from that time period but didn't want to mess around in the database, so the scaling is applied when the table is displayed. Looks like the query interface is missing that -- easy to fix. I also have a few new queries ready to go on my test server, plus some formatting changes -- will upload in the next hour or so. --Darkcanuck 17:42, 16 August 2009 (UTC)

I've got my rumble Twitter script updated to use the query API. One comment and one question about error messages:

  • I almost feel silly even mentioning this one, but: message: ERROR: Could not find bot "voidious.Dookious 3.0 in database! Just missing a closing quote.
  • What does the error message look like for rate limit exceeded?

As of now, if it encounters any error besides "Invalid robot name" or "Could not find bot", it immediately exits (and, after a few changes, won't get confused next iteration... I think). I doubt it would hit the rate limits very often, but if enough bots were entered/updated at once, it would. Once I learned how to work with JSON, it was all very easy to work with. Very cool. --Voidious 20:02, 16 August 2009 (UTC)

I may have broken your script with the latest update -- after adding the new queries I went back and re-arranged the JSON data for all of them so that they're consistent. Basically, the data field will always contain an array of recordsets, regardless of whether the query returns multiple (eg. rankings, details) or a single (eg. participant, pairing) record. And the ranking field has been moved to become part of each record. Fixed the missing quote - thanks! The rate error will set "error=1" and the message will be "Too many requests per hour/minute (<number>) - try again later." Try hitting it 10 times in a browser to see it in action. Nat, I've added the content type, which means that I had to switch to using the JSONView FF extension to test the results. =) --Darkcanuck 20:25, 16 August 2009 (UTC)
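With that consistent envelope, client-side handling becomes uniform. A sketch, assuming only the field names mentioned in this discussion (data, error, message):

```python
import json

def parse_query_response(text):
    """Parse a query API response: `data` always holds an array of
    records, and a set `error` flag comes with a human-readable
    `message`. Field names as described above; this is a client sketch."""
    resp = json.loads(text)
    if resp.get('error'):
        raise RuntimeError(resp.get('message', 'unknown error'))
    return resp['data']
```

A script like the rumble tweeter could catch the RuntimeError, check the message for the rate-limit text, and sleep before retrying.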

Yes, that was some funny timing. =) Anyway, updated and working again. --Voidious 20:39, 16 August 2009 (UTC)

Which queries are you using and how often? Some of them are less than optimal, so I want to know if they're worth improving (it doesn't matter so much for the page views). And if there's a better way to present some of the data, let me know. --Darkcanuck 21:12, 16 August 2009 (UTC)

I'm using both of the first two queries, "rankings" and "participant". The script is currently running at 15 minute intervals. I can't imagine it ever going over the 60 per hour limit, and I just updated it to sleep for a minute if it exceeds the per-minute limit. There are basically 3 different things it does:

  • If a new version of a bot is added to the participants list, it tries to fetch the rating / ranking of the old version, both in General and in its smallest weight class. This is the part that can generate a lot of queries. It first checks the full rankings, and if the bot's already retired, it queries the bot specifically and deduces its ranking based on APS. It goes down each weight class until it's not present to deduce the smallest weight class. So for MicroBots and NanoBots, when the old version's already retired, it's going to do 8 queries. This could be optimized down to a max of 6, since I don't need the ranking for the weight classes in between, but it would require a little refactoring.
  • For new bots/versions that it's watching until they hit 2,000 battles, it just queries the General rankings to get the battle count. This could query the bot specifically, instead, but that would also create more (albeit cheaper) queries when there are multiple bots on the watchlist. If you think it's preferable, I'll change it to do the participant query instead.
  • Once a bot hits 2,000 battles, it uses the rating/ranking that it just fetched, and also looks for the rating/ranking in its smallest weight class. So up to 3 additional rankings queries when that happens (since there's no way it's retired).

During one iteration of the script, it caches any rankings queries it does. If a few 1v1 NanoBots and a few Melee NanoBots were all added at the same time, it would be a lot of queries, but that seems like a rare occurrence. I can think of a few more ways to optimize it down, too, but right now, it's not ever coming anywhere close to 60 per hour. In the last 72 hours of watching Melee/1v1 rumbles, it's done ~280 queries (URL fetches before, API queries now). --Voidious 22:41, 16 August 2009 (UTC)
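The ranking deduction for an already-retired bot described above is just a count of active bots with a higher APS. A sketch, assuming the rankings query's `score_pct` field is the APS value (as the earlier DrussGT example suggests):

```python
def deduce_ranking(bot_aps, rankings):
    """Rank a retired bot would hold: one plus the number of bots in the
    current rankings whose APS (score_pct) beats it."""
    return 1 + sum(1 for r in rankings if r['score_pct'] > bot_aps)
```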

Was just curious, your usage seems very moderate to me. =) The 'participant' query is the ugliest one on the server side -- since it's the one you're using the most I'll spend some time tweaking it. --Darkcanuck 22:58, 16 August 2009 (UTC)
Well, I'm glad to hear that. =) It seems moderate to me, too, but I'm not the one spending machine resources running the server. The participant query must still be cheaper than rankings, right? And as of now, most queries of mine are probably rankings, since that's how it's checking battle count. Switching those to participant could be much easier on the server, even if it's more queries, right? --Voidious 01:19, 17 August 2009 (UTC)
Actually, no. The participants query reused a function I already had built (actually all the queries do) but for some convoluted reasons it actually pulls the whole rankings list for that game (which gets cached) then extracts the specific bot. And if the bot was retired, it goes back to the db and gets just that lone record. So the way things stand today, participant queries can actually cost more than rankings. But don't change your setup -- it makes sense and I need to make a better participant query anyway. --Darkcanuck 01:59, 17 August 2009 (UTC)

Duplicates in rating details

I'm hitting a strange issue. When I fetch rating details for Diamond 1.5.38c, I'm getting duplicates of some bots for old versions that aren't in the rumble, such as xandercat.XanderCat 6.5 and 7.3, cs.Nene MC42k7 and MC48k7, cs.ags.Etna 1.0, and lxx.Tomcat 3.10.145 and 3.12.151. Those bots only show one version vs Diamond when I view details on the web [1]. Not a big deal - I can easily exclude them from BedMaker for now. =) --Voidious 04:31, 19 August 2011 (UTC)

Er, unless that's as designed? Wouldn't be hard for me to join against the rumble details if so... --Voidious 04:34, 19 August 2011 (UTC)
I don't remember why I did it that way, but the code indicates that it's by design. You can filter out retired pairings (not really duplicates) using the "state" field: it should be "1" for active pairings. --Darkcanuck 05:08, 19 August 2011 (UTC)