Archived talk:Lukious 20090521
From old wiki Lukious
Congrats on top-10 with the first version of this. I have a few questions. Since I've known about this project a while I got a bit precise that it includes DynamicClustering. Or if it is I who don't understant that concept correctly. Could you elaborate some on the DC implementation of your gun/movement? Also, I saw that the TC2K6 scores wasn't stellar. Can we expect MC2K6 scores soon? Or, even better, hook Dookious' gun on this movement and see what happens. -- PEZ
Thanks. As I understand DynamicClustering, you also use it in Ali, right? The cluster is the n most-similar scans, and that cluster is what's used to decide targeting/movement. Ali doesn't use anything like gaussian KernelDensity estimations, but if you look at those formulas Albert posted, you could say that the "Uniform" formula is what it uses to decide firing angle.
The only thing unique about my DC implementation is that I am using a 2D formula for KernelDensity estimations, with one dimension as the GuessFactor, and the other as the difference in distance from the current scan. I haven't done extensive research to show that it works better, but it seems to in small amounts of research, and it feels better to me (intuitively) to weight the scans by distance somehow. I also don't have to worry much about the number of scans in my cluster, since the distance stuff is taken into account by the KernelDensity stuff.
I have a problem with the MC2K6 where Lukious gets REALLY REALLY slow after a couple hundred rounds against the learning guns, because he's processing data on the scans every single tick. (I can and will come up with a way to cache the closest scans...) I can tell you that his most recent WSC2K6 scores were 99.13 / 97.69 / 95.59 = 97.47, and the most recent full MC was 37.87 / 36.26 / 52.06 / 29.42 for the rest of them. He still has a long way to go before the gun or movement could rival Dookious, but I do think the DC-based movement shows a lot of promise.
OK. As I had imagined DynamicClustering it was about dynamically weighting the segmentation dimensions. By somehow figuring out which dimensions where best "describing" a particular opponents movement. -- PEZ
Well, I wouldn't mind hearing what ABC has to say about it, but I don't think that's a necessary component of DynamicClustering. That seems like a DynamicSegmentation type of idea that could be part of a DC system. In my mind, it seems that with segmented stats arrays, you predefine the "clusters" with your segmentations, while in these log-based systems, the clusters are formed on the fly (er, dynamically =).) -- Voidious
Yeah, maybe. But to me it's just clustering. Using predefined clusters isn't clustering, is it? That's just VisitCountStats. That's why I thought the Dynamic part was about the weighting of the dimensions. -- PEZ
Well, you're right about "clustering", and it's just terminology, but... I still say the clusters are "dynamic" even without any dynamic weighting. Each time you make a decision, you redefine the cluster. With dynamic weighting, you also redefine how you form the cluster. DCBot uses fixed weights on the different factors of the distances, as does Chalk, and both are considered to be using DynamicClustering. -- Voidious
Also, I wonder if "dynamic" is because you form a new cluster with every evaluation, instead of just redefining the clusters every so often. But I realize I should probably just shut up and wait for ABC to clarify what he meant, since he coined the term in terms of Robocode... -- Voidious
Yes, I call it dynamic even without the dynamic weighting of the dimensions. The cluster you choose is never the same, unlike in VisitCountStats. And it sounds better :). Anyway, I'm glad more people are experimenting with it, I believe it has potential to go beyond GF guns, I just never managed to realise it. Great work on Lukious, top 10 with the first version is amazing! -- ABC
OK. Thanks for the clarification. Though now you're confusing me again with that reference to GF guns... Ali uses a DynamicClustering gun right? But to say it's not a GF gun would be quite misleading. In fact Ali's gun and CassiusClay's are very, very similar. The only thing that really separates them is the clustering part. Dynamic vs Fixed clustering if you like. They are both StatisticalTargeting guns using GuessFactors. Yeah, yeah, only terminology. But I'm sort of big on terminology. =) But factwise I agree with you. Intuitively DC should have a higher potential than fixed buffers. And I think the key to harvest a bigchunk of the potential is DynamicClusterDimensions. Which dimensions to choose and how to weight them. Though I guess both can be expressed in weights (the ones you don't want at all you just weight to zero). -- PEZ
Sorry, GF guns as in "fixed clusters statistical guns using guess factors and stats buffers", as in the big majority of the rumble top bots, from FloodMini do Dookious. I just got used to calling that a GF gun ;). -- ABC
Big majority? I count to 5 of the top-10 using DynamicClustering. That's only a big majority in US presidential elections. =) -- PEZ
Hehe, but that is a recent trend, for a long time Shadow was very lonely at the top... :) -- ABC
Well, SilverSurfer has been there for quite a while. But I'll stop giving you a hard time now that you have realized your clustering method is now mainstream. =) -- PEZ
<deleted absurd comment> You don't even need a majority vote to win a US presidential election, but the winner usually gets the majority. But seriously, I too think that the DC system could have a higher potential than segmented stats buffers, both in learning speed and accuracy. Time will tell if I can figure out how to make it happen, though! -- Voidious
- Umm, did I just imagine us talking about "traditional GF" guns, PEZ? I didn't think I was that old, but I can't find the conversation... ;) -- Voidious
Ali's gun is not a traditional GF gun, that's for sure. Ali + traditional GF gun = CC. -- ABC
Hmm... yeah, I know how Ali's gun works, I think I have a faulty memory chip in my brain... -- Voidious
So you guys are saying that the "fixed clusters statistical gun using guess factors and stats buffers" a.k.a traditional GF-gun I just started working on, actually already is outdated? -- GrubbmGait
If you consider that the top 3 bots is the rumble use it, I wouldn't say it's outdated, no. Many other gun variants have been attempted, but until someone manages to outscore those 3 bots, it's very much state of the art actually. -- ABC
I think that what I'm saying is that it's hard to see where to make any quantum leaps in performance with traditional VisitCountStats. It's about striking that gold source of the exact right segentations and granularities. But DC-type stats seems to have the potential of getting us out of that boring state. It's yet to be proven though.
Note that I stubbornly avoid Ali's gun a non-traditional GF gun. Everyting about the GFs there actully is very traditional. It's only the storagre of scan info that's different from the "traditional" VisitCountStats path. Also note that TronsGun was around long before FloodMini so which one is more traditional anyway?
So, for you, the most defining characteristic of a GF gun (as invented by Paul Evans and made mainstream by Kawigi) is the guess factor computation and not the statistical storage/clustering method? Not for me, a "traditional" GF gun is the combination GF+VisitCountStats, anything that deviates from that is not traditional GF targeting. -- ABC
It's because I come from Marshmallow which used several guns of which one was a bit like LaserTargeting. That is, it wasn't using GFs per se, but VisitCountStats definately. It also used Waves (indeed CustomEvent based Waves even). But it was all before I had heard of GFs or Waves. Once I understood the GF concept I started using that instead of raw bearing offsets. (Indeed I don't even use "real" GFs today even.) This main Marshmallow gun was the first gun I ever thought up when going beyond HeadOnTargeting. So it was very, very much like what you call a traditional GF gun. But it wasn't using GFs so it doesn't click with me to call it a GF gun traditional or not. I could very easily, extremely easily even, exchange the GFs in Ali for raw bearing offsets a la Marshmallow. I could do the same in CassiusClay. I guess for me the distinctive divides are:
- DynamicClustering vs VisitCountStats
- GuessFactors vs ScanLogTracing (relink that if there's already a page for it).
I think the only not-so-clever combination here is VisitCountStats + ScanLogTracing, but I might be wrong on that.
As for "traditional". The only thing that makes VisitCountStats more traditional would be that it was OpenSource a while before Musashi was released. Then again Axe was one for creating huge amounts of code which made the impact of Musashi being OpenSource a bit less than it might have been if Ali would have been released earlier. But Ali has been around a while now too. I think you might confuse "traditional" with not-choosen-by-ABC. =)
Ok, I see where you'r coming from. But for me the defining moment was when Paul described SandboxDT's gun after the IBM Rumble ended, probably before your time even. He was using GFs, virtual bullets and VisitCountStats at that time. It was leagues ahead of anything done until then, and didn't win the competition only because of bad luck (10 round battles). The best gun before that was a simple PatternMatcher. It took a while before anyone could understand it, but the traditional GF gun (for me) was pretty much defined, only the addition of waves (Iiley was probably the first to describe them) changed since then. Another milestone bot, imho, was Cigaret, it used waves and probably some ScanLogTracing, but I never could understand the code (mini = obfuscated, for me) and Iiley never managed to explain it well enough. Iirc, the GF gun mania started with David Alves and his Duelist series, Iiley's FloodGrapher, and then Kawigi's FloodMini.
Your right about the not-chosen-by-ABC part, tough ;). I made TronsGun in an attempt at being different, I tend to avoid the "mainstream" . :) --ABC
Cigaret uses a scan-log with "incremental" GFs. The only thing really different from Ali is that instead of picking the best N matches, like Ali does, Cigaret picks the best match (using squared euclidian distances) and uses that for backtracing what GF to fire at. I think DuelistMicro does much the same thing. Actually Paul didn't describe SandboxDT's gun back then. He described SandboxLump. But you're correct it was before my time. And FloodMini probably didn't spring out of empty air. I would think SandboxMini being open source had a lot to do with it. What FloodMini brought to me was an understanding on how simple you could do stuff that I had done in such extreme complexity in Marshmallow. The main difference probably was this wiki. With FloodMini I had both the bot code and the wiki-active author. I never really understood the code in SandboxMini, though I guess that today it would be like reading a textbook for children.
Anyhow, just because the first well-known GF guns were also using VisitCountStats there's no reason to merry the two concepts like they are the same. And just because the first well-known DynamicClustering gun used log backtracing there's no reason to merry those concepts either. I've seen people call DynamicClustering for PatternMatching just because they see the common log-back-tracing there. I work for a nomenclature/terminology where we try to identify the basic techniques and describe them in a way where people don't hesitate to combine them. I think the reason it felt like I was a genious innovator when I created Ali's gun was that what I did was parting with the established GFs == VisitCountStats misconceptions (or the DynamicClusteriung == ScanLogTracing if you like). If the wiki community had separated the basic concepts earlier I am sure we would been farther on our quest for better gunning today.
How would you defined Toad's gun?
- it clusters scans using information gain for one of the dimensions to build the segmentation tree. Those clusters are re-created each round.
- in each node it stores scans in an array according to the hitting GF
A "tree-based dynamic clustering/segmentation gf gun"? You made it, you name it :).
I agree with you PEZ, GF gun does not necessarily mean VisitCountStats. But still, if you go to the TC results page, f.e., most (if not all) bots with GF in the gun type column are "traditional GF guns", as in Waves+VisitCountStats+GFs... -- ABC
I think ToadsGun has a nice ring to it =) About the "traditional GF" gun stuff... for me, "traditional" implies that it is something common, so "traditional GF" implies segmented stats arrays using GuessFactors. Heck, even "traditional learning gun" would probably seem like it means that kind of gun to me, too. But I like PEZ's point that it may be good to be more descriptive in our terminology, even if it were only for the sake of newcomers. -- Voidious
Yes, ABC. That's why I have started to not brand my gun "GF" in the TC listings. I've been itching to change all the other "GF" references too, but I'm waiting for a consensus. Instead of "Gun Type" I would like for the column to be labeled "Gun Tecnhiques". Then CC's entry would say "VCS, GF, VG". And Ali's would be "DC, GF". Shadow's would be "DC, GF" too?
Florent, that gun sounds really interesting. I've been wanting to create something like that for a while, but I'm totally in the dark at how to start. You have described it in more detail somewhere? If not, you should. You're breaking new ground. On promising land I would say.
Shadow's gun is not GF, It's exactly like DCBot, with slightly different dimensions and a VG-like decision for the gunHeat dimension. Shadow's movement is DC/GF wavesurfing. -- ABC
I described the segmentation tree when I was working on it Toad/SegmentationTree, and there is also the source code for it Toad/SegmentationTreeCode. If you have more questions I'll try to make a more detailed description. -- Florent
Reading that page I realize for how long I have banged my head against this GF is not VSC issue. =) FS seems like an odd alternative. Ali is using that. It's the clusering that's dynamic. -- PEZ
Yeah, now that I know what VCS is, that's a better descriptor. Although I would not call what Ali does "segmentation" at all. They're just attributes, and the clustering takes the place of segmentation completely. -- Voidious
Wow, that's a novel way to think of it. =) Seriously. What's in segmentation for you? For me it is the dimensions (or attributes) you use to try define the the targeting "situation". Ali and CC uses the same dimensions for that. If they are "just" attributes for Ali then they are "just" attributes in CC too. Clustering, wether predetermined or dynamic, acts on these dimensions/attributes. Right? -- PEZ
Well, to me - and I think the literal definition is this - segmenting means breaking each of those attributes up into segments. Ali records the raw value itself, instead, and uses a formula on that to dynamically decide on which scans are closest. I would say that Ali and CC both use attributes / dimensions, but only CC actually segments them. Still just a matter of terminology ;), but that's how I understand the term "segmentation". -- Voidious
Ouch, that Lukious 1.10DCape is slooooooow :-( , but it's gun is better than expected :-) -- GrubbmGait
Yeah, I knew it would be slow - sorry. Dooki's gun is the part of him that isn't slow. ;) I've got two clients running, for what it's worth, but they are surely wasting some cycles on ola.Puffin until he's stabilized. Considering how much more work has gone into polishing Dooki's gun, I think the results are quite promising! I may resurrect Lukious from post-rewrite Dookious. (A DC arms race with Simonton sounds fun. =)) -- Voidious
- Oh no! After being embarrassed in the micro race by Skilgannon, I'm feeling a bit shy to step up to another plate right now. -- Simonton
- What I learned from that race was to NEVER be afraid to revert (to the original code, copy-paste). So many ideas I had there actually hurt my rating. And also, the bottom bots are more important than the top bots. Dodging NanoAndew better than anybody else is what got me my no. 1. -- Skilgannon
Also, other conversations aside, your name isn't showing up on the changes list. But agreed, Dookious' movement is impressive. --Chase-san
I'm wondering what you would get in the rumble with Raiko's gun....just for fun, and to see how far I have to go in my movement. -- Skilgannon
- Feel free to post that combo as a temp RoboRumble entrant - such tests can give you a lot of info! Dookious is pretty well structured, so it's very easy to plug the gun or movement into another tank with just a couple minutes of work. Actually, throwing Dooki's gun onto your movement might give you the same perspective with a lot less CPU time (since Dooki's gun is a reasonable speed while his movement is on the slow side). -- Voidious
Question: I understand everything in your distance formula except this one. What's the general idea here? And also, is it true that your "position ratio" is
min(2, WallDistance / MEA). Thanks for sharing all your secrets! :) -- Simonton
v4 += 1D * Math.abs(Math.min(
Math.abs(Math.min(scan1.getGunHeat(), 1.4) - Math.min(scan2.getGunHeat(), 1.4)), Math.abs((1.4 - (Math.min(scan1.getGunHeat(), 1.4))) - Math.min(scan2.getGunHeat(), 1.4)) ) / .7);
Wow, I barely recognize that code! With VisitCountStats, I can wait firing waves higher, but DynamicClustering kinda requires a different approach. The idea here is to compare the "ticks away from firing time" for each scan. I usually fire at 1.9, so I assume a max gun heat of 1.4. So gun heat of 1.3 would be the same distance from firing time as 0.1. There might be an off-by-1 issue there, as well. For the position ratio (just my internal name for WallDistance - should probably rename it), I cap it at 2 in the calculation because I never use higher values, but it's actually capped at 1 in the gun. I think of it more as the GuessFactor you would be at when you hit the wall. -- Voidious
One thing I didn't even consider happening is that Luki's gun + Dooki's movement would be the PremierLeague champ, but that is apparently the case, sweeping Dookious, Phoenix, and Shadow. Its only loss is to SilverSurfer, which, interestingly, is a bot Dookious handles quite easily. -- Voidious
Well, this is interesting: I'm finally rewriting Lukious to be based off of the rewritten Dookious. I found a bug where instead of trimming the scan log whenever by 1 when it went over the max size, it wipes the whole scan log instead. This might explain why it still did well in the rumble (probably reset once or twice per match), but the TC scores just weren't up to par. -- Voidious
- Huh. Yeah. That would make a difference =) -- Simonton
- Um..does this mean that now it will be even slower? =) -- Skilgannon
- I doubt it will be slower. Dooki's gun was almost twice as fast after the rewrite; I imagine Lukious will inherit some of the speed improvements. I don't think his gun was that slow in the first place, either. (I know you're half-kidding, but just FYI. =)) -- Voidious
- I use a quite simple approach to improve my movement execution speed and it has no downsides. After calculating the danger for one way (e.g. clocwise direction) Garm checks for every other way if the danger is greater or equal than the one from the first way. If that is the case the rest of the way will be skipped and you don't need to do any more calculations for that way. In most cases this method saves you from calculating all the six ways for the second wave, which results in a big speed improvement! You can even improve this methode if you start with the way which had the lowest danger in the last tick. --Krabb
- That's a really good idea! So the movement option I chose last time, I calculate that as normal, predicting danger across first two waves; for the second and third options, I first compare the highest level danger, because if that's greater than total danger of the first, I can just stop. Right? Given how much higher the first wave is weighted, I bet that is often. Thanks much, Krabb! -- Voidious
- I'm not sure I understand. Surely you don't need to recalculate the option you chose last time, because it hasn't changed? Shouldn't the precise prediction know where you're going to go? -- Skilgannon
- Yeah, I thought that too once, until I went to implement it. Since I weight the first and second wave based on time to impact, the danger actually does change from tick to tick even for the same option. -- Voidious
- edit conflict This is true, if the center of your orbit has not changed. I think Dookious uses the source of the last wave fired, so he shouldn't have to re-calculate until the enemy fired again. So you should still be able to store something like the bin your going to hit & spare yourself the precise prediction, right? -- Simonton
- Sure, you could cache just the precise prediction, which I have considered. I think I couldn't find any way to do it that didn't seem ugly and a lot of overhead for a moderate speed increase. -- Voidious
I've gotten really lazy on posting the results of my gun work, but it's mainly because I haven't really made any serious progress. I've got a dynamic weighting system that I think works and is bug free. I've tested it vs unweighted results and seen solid proof that it works, but I've also since fixed some major bugs with it since then. The gun performs reasonably well as long as I don't compare it to Dookious, Phoenix, David Alves/PhoenixDC, or Simonton/DCResearch. =) While I have every intention of chasing those top guns, I think I should leave it for now and get to work with the movement. -- Voidious
See, the reason you sit at the top of the rumble is because you're all about the top of the rumble. I love the challenges. I love trying to be the best in the TC's. I, though, will soon focus on the MC's. Maybe then it'll be time to watch out. (Not that David isn't already giving you enough pressure) -- Simonton
I do quite enjoy the TC's, as well, but you're right, I'll always put the RoboRumble (or real battles) over the challenges - for a few reasons. One reason (by example): Dookious and CassiusClay were outscoring Ascendant in the TC's long before their guns' rumble performance were even within 10 points of Ascendant. (At the time of Tyranius, I think Dooki / CC were already the TC kings.) Another reason: I'm sweating over my DC gun here, running tons of tests, when it is already within maybe 10-15 RoboRumble points of Dookious; meanwhile, that clearly leaves him about 70 points behind in the movement department. And if the strength of DC is in learning quicker / more accurately on limited amounts of data, movement could well be where it has more potential to outshine VisitCountStats. As for the MC's, I think they have traditionally been a far less accurate indicator of good movement than the TC's are of targeting, though I think MC2K7 could be a major step up in that regard.
Still, the fact that there aren't as many rumble points in gunnery is one of the reasons I do like the TCs. It's simply fun to make a great gun, even if it won't pay off in the current RoboRumble system. Plus it keeps you and David busy... =)
Hmm ... as a side note - is it more important to decay stats for movement? My hunch is that there's a lot more learning guns out there than there are learning movements. (I don't have a great idea how to do this well in a DC system yet.) -- Simonton
Yes, it's definitely more important - though I'm saying that based mostly on experience using VisitCountStats. I also am not sure of a good way to decay movement stats with DynamicClustering. -- Voidious
Only use the last N scans instead of your full 40000? --David Alves
I think that's what I do in the current Lukious movement, but I'm not entirely sure it's the best way to go about it... There really is no exact equivalent of the low RollingAverage I've found to work well in VCS. -- Voidious
I think the only thing you could do is using absolute game time as an attribute. --Krabb
- Or have you tried to number every wave? --Krabb
- Good call - upon reviewing, that is actually what the current Lukious movement does (total ticks as an attribute). I'm still not sure it's really such a great way to do it, though. -- Voidious
- What I would do (if I ever get into the DC world) is to build my 50 (or whatever) point cluster using a smaller log, but then weight the points according to how recent they are, and how close they were to that wave. -- Skilgannon
The way I do it is I use a LinkedList to store my scans, always inserting them at the top. I remove them from the bottom when the log is too big. Then I just only look at the top N if I want to do Anti-surfing. (Though I haven't done that in a while, DC seems better suited to hitting random movers than surfers) --David Alves
I'd like to see a Lukious 1.20DCape ... I can't help but wonder what it will score now that your aren't completely clearing the logs every now and again. Also, it's getting boring seeing Dookious at the top of the PL the whole time =) -- Skilgannon
- Oops, I'd forgotten about that comment. I doubt it would've been any different than 1.10DCape, but I might try a 1.21Dcape... -- Voidious
The 1.21 gun doesn't use GuessFactors, path retracing, or raw bearing offsets to project a firing angle. =) So that's kinda neat, though it's nothing too major. That was showing an improvement in the TargetingChallengeRM. A bigger improvement came when I ditched the current dynamic weighting system and used the weights / attributes from 1.19.13 in the /Rewrite, which were just intuitively set / tuned.
The firing angle stuff calls it a "GuessVector" in my code, but it's also just like a 2D GuessFactor. I store the X component and the Y component (relative to the original abs bearing) of the displacement vector and divide that by bullet time; a value of (8, 0) would be like GF1, (-8, 0) would be GF-1, (0, 8) would be moving directly away from me. Once I have the N closest scans, I convert each scan's GuessVector into a firing angle for the current scan, ignoring any that would result in the enemy going out of bounds.
I didn't expect this to be a radical improvement over GuessFactors or path retracing, but I think it's cool and I'm glad it works pretty well. In a melee implementation, it might be good to store the displacement relative to the bot's original heading as opposed to relative to its absolute bearing to me, as enemy movements are not relative to me lke they are in 1v1.
So this is sort of like GuessFactor2D? The only thing you're using this for is checking the out-of-bounds, correct? -- Skilgannon
- I think GuessFactor2D is a good description, but I think it's a bit different from what that page actually describes (I have no DynamicSegmentation - or segmentation at all). Yeah, the main advantage is out of bounds checking; in fact, I think it might be mathematically equivalent to GuessFactors other than the out of bounds checking, but I'm not completely sure. -- Voidious
Why not just store angle and distance, instead of x and y? Wouldn't that make the out-of-bounds check simpler? Pre-release Firebird did that, but doing the bounds checking didn't seem to help me, so I dropped it. Does it yield any benefit in your case? --David Alves
Yeah, storing the end distance (well, multiplier of original distance) plus GuessFactor would give you equivalent information. I didn't think of that until afterwards or I might've done that instead. I got 87.00 over 50 seasons (up from 86.84) in the TCRM with this as the only change - probably just barely beyond the margin of error for the numbers of seasons, and since I also feel better about it this way, I'm keeping it. -- Voidious
Hang on, if you're precise-predicting the MEA, this shouldn't really do anything, should it? You should never get an out-of-bounds shot because the MEA takes that into account....or does it? Unless in your history they were *retreating*, and now they are up against a wall or something...but shouldn't wall attributes take care of this? -- Skilgannon
Right, you'd never get an out of bounds shot with precise MEA anyway. But the past shot's GF = 1 might equal .2 radians while the current GF = 1 = .02 radians. In the new system, if the enemy were up against a wall, that data would be ignored for GF=1 instead of turning a .2 radians into a .02 radians. -- Voidious
Just looking at the TargetingChallengeRM, I noticed that Lukious has the top score against FloodMini and Cigaret, both of whom react to bullet fire. This makes me believe that you are clustering on a GunHeat dimension, which I have found *does* help against the bullet dodgers, but reduces the score drastically against almost everybody else. I'd be interested on how Lukious does in the challenge (and in the rumble) without this dimension included in the gun... -- Skilgannon
Another quick suggestion, this time to speed Lukious up: how about 'caching' the locations of scans in the gun, instead of re-calculating them every scan? By this I mean, for a dimension that is weighted more heavily, multiply the location by the weight when storing it instead of multiplying the distance by the weight each and every time you retrieve the scan? Even your "distance from zero gunheat" could be kept in a linear fashion, and just that would remove 5 decisions (from the Math.min calls) for every scan you processed, approximately 100,000 per firing tick by the end of the match. Cleaning up stuff like this would allow you to aim more than once per fire, and possibly even do what you do in Dookious with waiting for last tick to line up with this tick. Although adding a tree would still be best ;-) -- Skilgannon
From old wiki Lukious/Rewrite
What exactly do you mean by "keeping the list of closest scans sorted"? Closest in terms of match closeness, time, or what? I'm not seeing how any of those gives you a speedup, you still have to check all stored scans, right? --David Alves
Ok, my bad for being so terse. Yes, match closeness. And yes, you still have to check all the scans. It's just one element - the updating of maximum distance among the closest scans - that is sped up here. As you're iterating through the 20,000 (or whatever) total scans, you need to keep track of what the maximum distance in the closestScans collection is, so that you can check if the distance for the current scan (in the loop through the 20,000) is less than that maximum; if it is, you sub the new scan into the closestScans array and then update the new maxDistance. I used to just iterate through all 50 elements in closestScans to find the maximum again after I changed one out. Now, I keep the closestScans list sorted by distance; if a new scan has a distance < maxDist, I start from the end of closestScans, shuffle the existing scans up, and insert the new scan into the correct spot. So maxDistance is just the distance on the last element in closestScans.
So you have far fewer total iteretions in all those recalculate-maxDistance loops and a lot fewer data reads, but a lot more data writes (updating references in closestScans). I actually wouldn't be 100% sure off the top of my head which "should be" worse, but I was able to go from ~10,000 to ~25,000 scans without skipping turns after I made that change.
Thanks for the update! I know that the non-surfer scores are way more important, but improving the surfer scores is so fun! I don't understand the whole Gaussian distribution thing, I may have to look into that, just to learn what it means. But a note ... keeping your cluster sorted could result in O(log n), if you used a tree instead of an array to store them. -- Simonton
- Or better yet, a heap! -- Simonton
- Good idea! I'll look into it. -- Voidious
- For what it's worth, I greatly overestimated the value of optimizing that portion of the getClosestScans method, anyway. =) -- Voidious
Added another nice speed optimization to my list of tips above, took me from 25k to 40k scans. -- Voidious
Added info for 1.19.20 and noted for 1.19.15 that I'd enabled a bunch of attributes that weren't previously. I think all these extra attributes might be a problem - some of them have mutual information, yet they are being weighted individually, as if they are all independent variables. Thus in 1.19.21, 5 of them are disabled. Also, the entropy based weights in 1.19.21 are not quite linear (Math.pow(x, 0.75), not quite square root either). Fingers crossed =) I know entropy is a widely used way of judging what we want to judge - the information gain of a specific attribute in a data set - but I'm starting to doubt it is really what I want to use to determine my dynamic weights. I'll stick with it a little longer... -- Voidious
- Another thing I want to try, very similar to something Simonton has mentioned: instead of measuring total entropy of a segmented VisitCountStats buffer, take the entropy of a closest scans cluster found with just that attribute; then the weight for that attribute could be a RollingAverage of the information gain (measured vs maximum entropy or an unsegmented buffer) each time I do that. (You'd still have to cut the GuessFactors into bins to get the entropy.) It should be roughly as expensive as aiming to do this for all attributes - both would compute distance for each attribute across all scans - so I could just do it on a different tick. Even limiting it to like past 1,000 scans could work just as well. It would also be interesting to see if this really gives any different results from my current setup. -- Voidious
- This would still have the issue with mutual information skewing the overall weights - e.g., dist last 8 ticks and time since zero velocity probably measure many of the same situations the same way, so weighting them each by individual entropy will net you more weighting than you want. But if you did limit this part to 500 or 1,000 scans (instead of 40,000), that might be enough time to do some fancy mutual information calculations in there, as well. At first glance, that seems much easier in this setup than in the VisitCountStats buffers setup, where you'd need every combination of two-segment buffers in order to track / calculate mutual information. Hmm! -- Voidious
- Umm, or would you? I may need to get out the pen and paper tonight to strengthen my grasp of entropy and such. =) -- Voidious
- What attributes do you plan to segment across within these clusters? Or by "an unsegmented buffer" do you mean a buffer of all scans...? -- Simonton
I dunno if i understand DC-guns correctly, but what about segmenting your scans (or should i say clustering :) )? Searching over ~20k scans gives me a headache :/ You wrote something about sorting your scans above, just ignore this post if you meant exactly this... I think it could be done in the same way like we segment our GF-stats: Put every scan in a multidimensional array (with as many dimensions as you have attributes, a multidimensional array of ArrayLists would do the job). If you want to find the closest scan to your current scan, just search in segments around the segment in which the current scan is located. That should save you alot of needless searching :D You could first calculate the distance of your current scan to the "middle" of each nearby segment (I hope you get what i mean) and than search in all segments whith a short distance. May be it is faster to store your scans in a tree structure, but I don't know much about woods :P --Krabb
- Hmm ... this idea could be just what we need to make room for more serious analyses of the data (every firing tick)! -- Simonton
- On second thought, let's see ... 26,000 scans, we'll say 5 dimensions ... 11,881,376,000,000,000,000,000 array positions, 46 bytes per scan ... 497,078,232,001.5 terabytes of memory. Maybe that's why we don't do that ... But we could do it in, like, 2 dimensions to perhaps cut off totally wrong scans ... -- Simonton
- I think you got me wrong, we dont need one array positions for each scan in each dimension. The idear was to put similar scans in one segment. If you have three attributes (=3 dimensions): distance, speed and acceleration, and you devide ech dimension in 5 parts you would have 5^3= 125 array positions. Ech position is one ArrayList of scans (instead of your bin-array in a gf gun). You would store your scan attributes and your gf(or whatever you use for aiming) in each "scan" class. You need to calculate the distance of your new scan (which should be used for aiming) to each segment in order to skip all scans in segments which are far away. --Krabb
- (Edit Conflict) I don't think that's quite right. From what I understand, it would be 5 dimensions ^ 4 slices each (or whatever you decide), like current VisitCountStats guns. But instead of an array of doubles, you'd have an array of scans. Then, to find the closest scans, you wouldn't search your whole backlog - just findthe segment that matches the current scan, then search that one and all the neighboring groups for the closest scans. You're assuming the closest scans will always be close in the segmentation, which is probably true. It might work really well! -- Voidious
- mom :) -Krabb
- wow. -- Simonton
- Yes, but a simple array would probably be faster. As we don't want to sort the scans entirely. --Krabb
- Wowie, that is quite a tree. Its like it was made for Clustering. --Chase-san
- Krabb - you think arrays could do it faster than O(log n)? -- Simonton
- Hmm, i don't know much about such things (for now) :) But inserting a new scan would be independent of n. I can't say much about searching, because it depends on the number of scans in each segment wich is dependent on the number of slices in each dimension. But i think it is faster! But you could use a tree for each array point (instead of the proposed ArrayList), wich should speed things up a bit. --Krabb
- If you assume, that the scans are distributed steadly and that the closest scan is in the neighboring segments, you should get O(log(n/slices^dimension)*3^dimension) for the seach (if you use trees for each array point), wich should be faster :). But the closest scan is not necessarily in the 9 neighboring segments and the scans are not distributed steadly... --Krabb
Surprisingly, the methodology I use in 1.19.24 is not that much code or that ugly. It is about as intensive as aiming. With 10 attributes, I'm doing 54 getClosestScans (10 with one attribute, 44 with two), cluster size 50 and log size 1,000, and an equal number of entropy calculations. The results are nothing too impressive, though. My next test is using this system for the info gain but not doing all the mutual information stuff - this also lets you use much bigger log size for the initial information gain calculation. -- Voidious