Understanding Depth-First Branch and Bound implementation to StarCraft 2


The problem is that I'm finding it difficult to understand how DFBB works, what the parameters and output should be for this case.
I'm working on creating an AI for the game StarCraft 2 that will handle the build order in the game (for team Terran). I was planning to follow the approach described in the link (see below) which followed a very similar thing that I was going for. To summarize what I'm planning to do:
I will be given a list of the different types of buildings that need to be built. Buildings cost minerals and gas (the currencies in the game), some buildings have prerequisites (meaning other buildings need to be built before it's possible to build them), and each takes a certain amount of time to build.
In the article they used Depth-First Branch and Bound to figure out the optimal build order, meaning the fastest way possible to build the buildings in that list. This was their pseudocode:
Where the state S is represented by S = (current game time, resources available, actions in progress but not completed, worker income data). How S´ is derived is described in the article and is done through three functions, so that bit I understand.
As mentioned earlier I'm struggling to understand what the starting status S, goal G, time limit t and bound b should be represented by in the pseudocode that they are describing.
I only know three things for sure: the list of buildings that need to be built, what consumables I have at the moment (minerals and gas), and my resources (that is, the buildings I already have in the game). This should then be applied to the algorithm somehow, but it is unclear what the input to the function should be. The output should be a list sorted in the right order, so that if I were to build the buildings in the order they come in, it should all work out and be done in the optimal possible time.
For example, should I iterate through the list of buildings and run DFBB on every element, with the goal then being to see whether the building can be built? But what should the time limit be set to, and what does bound mean in this case? Is it simply the cost?
Please explain how this function should be run on the list in order to find the optimal path of building it. The article is fairly easy to read, but I need some help understanding how it is meant to work and how I can apply it to my problem.
Link to article: https://ai.dmi.unibas.ch/research/reading_group/churchill-buro-aiide2011.pdf

Starting status S is the initial state at the start of the game. I believe you have 100 minerals, a Command Center and 12(?) SCVs, so that's your start.
The goal G is the list of buildings you want to have. The goal is satisfied when every building in G is also in S.
The time limit t is the amount of real time you are willing to spend to get a result. If you set it to 5 seconds it will probably give you a sub-optimal solution, but it will do it in 5 seconds. If the algorithm finishes the search sooner, it returns earlier. If you don't care, leave it out, but make sure you write solutions to a file in case something happens.
Bound b is the in-game time limit for building everything. You initially set it to infinity or some obvious value (like 10 minutes?). When you find a solution, b gets updated, so every new solution you find MUST be faster (in-game) than the previous one.
A few notes. Make sure that the possible actions (children in step 9) include doing nothing (waiting for more resources) and building an SCV.
Another thing that might be missing is a correct modelling of SCV movement speed. The units need to move to a place to build something and it also takes time for them to get back to mining.
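To make the roles of S, G, t and b concrete, here is a minimal sketch of the search. It is not the article's implementation: the building table, the constant INCOME rate and the function name dfbb are assumptions made up for illustration, gas and SCV production are left out, and "doing nothing" is folded into each action by fast-forwarding to the moment the building is affordable and its prerequisites are finished.

```python
import math
import time

# Hypothetical data: name -> (mineral cost, build time in seconds, prerequisites).
BUILDINGS = {
    "SupplyDepot": (100, 21, ()),
    "Barracks": (150, 46, ("SupplyDepot",)),
    "Factory": (150, 43, ("Barracks",)),
}
INCOME = 10.0  # assumed constant minerals per second


def dfbb(goal, minerals, time_limit=5.0):
    """Return (finish_time, order) of the fastest build found for `goal`."""
    deadline = time.monotonic() + time_limit   # t: wall-clock budget
    best = {"time": math.inf, "order": None}   # b: best in-game time so far

    def search(now, minerals, done, order):
        # `done` maps each started building to its in-game completion time.
        if time.monotonic() > deadline:
            return
        if all(g in done for g in goal):
            finish = max(done.values(), default=0.0)
            if finish < best["time"]:          # tighter bound found
                best["time"], best["order"] = finish, list(order)
            return
        for name, (cost, duration, prereqs) in BUILDINGS.items():
            if name in done or any(p not in done for p in prereqs):
                continue
            # Waiting is folded in: start once affordable and prerequisites are done.
            affordable_at = now + max(0.0, (cost - minerals) / INCOME)
            start = max([affordable_at] + [done[p] for p in prereqs])
            end = start + duration
            if end >= best["time"]:            # cannot beat the bound: prune
                continue
            done[name] = end
            order.append(name)
            search(start, minerals + (start - now) * INCOME - cost, done, order)
            order.pop()
            del done[name]

    search(0.0, minerals, {}, [])
    return best["time"], best["order"]


print(dfbb(["Factory"], minerals=50.0))
```

So you call it once with the whole goal list, not once per building. The returned order is the build order, and the returned time is the final value of b.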

Related

Simultaneous recursion on equivalent graphs using Python

I have a structure, looking a lot like a graph but I can 'sort' it. Therefore I can have two graphs, that are equivalent, but one is sorted and not the other. My goal is to compute a minimal dominant set (with a custom algorithm that fits my specific problem, so please do not link to other 'efficient' algorithms).
The thing is, I search for dominant sets of size one, then two, etc until I find one. If there isn't a dominant set of size i, using the sorted graph is a lot more efficient. If there is one, using the unsorted graph is much better.
I thought about using threads/multiprocessing, so that both graphs are explored at the same time and once one finds an answer (no solution or a specific solution), the other one stops and we go to the next step or end the algorithm. This didn't work, it just makes the process much slower (even though I would expect it to just double the time required for each step, compared to using the optimal graph without threads/multiprocessing).
I don't know why this didn't work and wonder if there is a better way, one that maybe doesn't even require the use of threads/multiprocessing. Any clue?

If you don't want an algorithm suggestion, then lazy evaluation seems like the way to go.
Set the two up in a data structure with something like class_instance.next_step(work_to_do_this_step), where a class instance is a solver for one graph type. You'll need two of them. You can then have each graph move one "step" (whatever you define a step to be) forward. By careful selection (possibly dynamic, based on how things are going) of what a step is, you can efficiently alternate between how much work/time is being spent on the sorted vs unsorted graph approaches. Of course this is only useful if there is at least a chance that either algorithm may finish before the other.
In theory if you can independently define what those steps are, then you could split up the work to run them in parallel, but it's important that each process/thread is doing roughly the same amount of "work" so they all finish about the same time. Though writing parallel algorithms for these kinds of things can be a bit tricky.
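A minimal sketch of that alternation, using Python generators in place of a next_step method (each next() call is one "step"). The names run_alternating and toy_solver are made up for illustration, and toy_solver only pretends to work; a real solver would yield None after each chunk of its search and yield its answer when done.

```python
def run_alternating(sorted_solver, unsorted_solver):
    """Advance two generator-based solvers in turn.

    Each solver yields None after a step without an answer and yields
    its answer (anything other than None) once it has one.
    Returns (name, answer) for whichever solver answers first.
    """
    solvers = {"sorted": sorted_solver, "unsorted": unsorted_solver}
    while True:
        for name, solver in solvers.items():
            answer = next(solver)
            if answer is not None:
                return name, answer


def toy_solver(steps_needed, answer):
    """Hypothetical stand-in that 'works' for steps_needed steps."""
    for _ in range(steps_needed):
        yield None          # one step of work, nothing found yet
    yield answer


print(run_alternating(toy_solver(5, "no dominant set"), toy_solver(2, {1, 4})))
```

Everything runs in one thread, so there is no process start-up or data-copying overhead, and the total cost is at most roughly twice that of the faster solver.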

Sounds like you're not doing what you describe. Possibly you're waiting for BOTH to finish somehow? Try doing that, and seeing if the time changes.

Is tree datatype what I need in this case ? [TicTacToe Game]

I'm trying my hand at an algorithm that plays TicTacToe against itself and learns from the games it wins. When it wins, it goes back over all the moves it made and increases their probability for the next time the same situation comes up.
I have never done anything like that before. My idea is that I need every combination of possible moves.
In the first round the PC has to choose from a list of 9 elements, each representing one of the tiles on the board. Then the other player can choose from 8. But there have to be 9 different lists player two can choose from: when player one chose number 2, player two is allowed to choose from the list of elements that does not include number 2.
So in the first row I need 1 list of 9 elements. In the second I need 9 lists of 8 elements each, and so on.
This becomes pretty big, so I need to create those combinations automatically.
My idea was to create lists which contain either more lists or the elements to choose from. Then I can navigate through those lists to tell the player which list (or path in a big list of lists) to choose from. I'm not really sure if there is an easy way to do this, especially the creation of those lists. I couldn't find a way yet. Then I saw the tree datatype, which seems to be powerful, but I'm not sure if it is the right one for what I'm looking for. I hope you can give me some advice.
Edit: to make it clear, I know there is the minimax algorithm etc. What I wanted to do is let the game play against itself a lot and let it find its own way of learning, just from the result of whether it won or not.

The approach you plan to follow might be considered an Ant Colony Optimization algorithm. As your description points out, the idea is to explore available paths according to some heuristic and then backtrack along the path followed to increase/decrease the probability of that same path being taken again in subsequent iterations, effectively weighting the graph edges (the state tree of TicTacToe in this case). At the end of the process the winning paths will have a greater weight than the losing ones, which would allow your engine to play TicTacToe well by following the heaviest edges. Here are some links if you're interested: wiki, seminar slides.
IMO the nature of the algorithm requires some kind of tree/graph data structure to ease backtracking and neighbor discovery and I would personally go for that instead of using lists of lists. To that effect you may try the NetworkX library, for example.
Separately, I agree with @martin-wettstein's comments that taking advantage of board symmetries would reduce the number of board states to be considered and would improve performance at the cost of slightly more complicated logic.
Indeed I implemented the same approach as you some time ago and it was really fun, so good luck at it.
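For what it's worth, here is a minimal sketch of the edge-weighting idea that avoids building the whole tree up front. It uses a plain dictionary keyed by board state instead of NetworkX, and creates entries lazily the first time a position is visited. The names weights, choose_move and play_and_learn and the reward of 1.0 are my own choices for illustration.

```python
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

# weights[board][move] is the preference for playing `move` on `board`.
weights = defaultdict(dict)


def winner(board):
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def choose_move(board, rng):
    """Pick a free tile at random, proportionally to its learned weight."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    table = weights[board]
    for move in moves:
        table.setdefault(move, 1.0)   # unseen moves start with equal weight
    return rng.choices(moves, weights=[table[m] for m in moves])[0]


def play_and_learn(rng, reward=1.0):
    """Play one self-play game and reinforce the winner's moves."""
    board, player, history = (" ",) * 9, "X", []
    while winner(board) is None and " " in board:
        move = choose_move(board, rng)
        history.append((board, move, player))
        board = board[:move] + (player,) + board[move + 1:]
        player = "O" if player == "X" else "X"
    won = winner(board)
    for state, move, who in history:   # backtrack along the path played
        if who == won:
            weights[state][move] += reward
    return won


rng = random.Random(0)
results = [play_and_learn(rng) for _ in range(200)]
print(len(weights), "positions seen so far")
```

The dictionary is effectively the tree: each key is a node and each move is an edge to the child position, so only positions that were actually reached take up memory.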

Graph search - find most productive route [closed]

I'm working on a graph search problem that can be distilled to the following simpler example:
Updated to clarify based on response below
The Easter Bunny is hopping around the forest collecting eggs. He knows how many eggs to expect from every bush, but every bush has a unique number of eggs. It takes the Easter Bunny 30 minutes to collect from any given bush. The Easter Bunny searches for eggs 5 days a week, up to 8 hours per day. He typically starts and ends in his burrow, but on Tuesday he plans to end his day at his friend Peter Rabbit's burrow. Mrs. Bunny gave him a list of a few specific bushes to visit on specific days/times - these are intermediate stops that must be hit, but do not list all stops (maybe 1-2 per day). Help the Easter Bunny design a route that gives him the most eggs at the end of the week.
Given Parameters: undirected graph (g), distances between nodes are travel times, 8 hours of time per day, 5 working days, list of (node,time,day) tuples (r) , list of (startNode, endNode, day) tuples (s)
Question: Design a route that maximizes the value collected over the 5 days without going over the allotted time in any given day.
Constraints: visit every node in r on the prescribed time/day. for each day in s, start and end at the corresponding nodes, whose collection value is 0. Nodes cannot be visited more than once per week.
Approach: Since there won't be very many stops, given the time at each stop and the travel times (maybe 10-12 on a large day) my first thought was to brute force all routes that start/stop at the correct points, and just run this 5 times, removing all visited nodes. From there, separately compute the collected value of each allowable route. However, this doesn't account for the fact that my "best" route on day one may ruin a route that would be best on day 5, given required stops on that day.
To solve that problem I considered running one long search by concatenating all the days and just starting from t = 0 (beginning of week) to t = 40 (end of week), with the start/end points for each day as intermediate stops. This gets too long to brute force.
I'm struggling a little with how to approach the problem - it's not a TSP problem - I'm only going to visit a fraction of all nodes (maybe 50 of 200). It's also not a dijkstra's pathing problem, the shortest path typically would be to go nowhere. I need to maximize the total collected value in the allotted time making the required intermediate stops. Any thoughts on how to proceed would be greatly appreciated! Right now I've been approaching this using networkx in python.
Edit following response
In response to your edit - I'm looking for an approach to solve the problem - I can figure out the code later. I'm leaning towards A* over MDFS, because I don't need to just find one path (that will be relatively quick), I need to find an approximation of the best path. I'm struggling to create a heuristic that captures the time constraint (stay under the time required to be at the next stop) but also maximizes eggs. I don't really want the shortest path, I want the "longest" path with the most eggs. In evaluating where to go next, I can easily compute eggs/min and move to the bush with the best rate, but I need to figure out how to encourage it to slowly move towards the target. There will always be a solution - I could hop to the first bush, sit there all day and then go to the required stop (their placement and the time between them are such that it is always solvable).

The way the problem is posed doesn't make full sense. It is indeed a graph search problem to maximise a sum of numbers (subject to other constraints) and it possibly can be solved via brute force as the number of nodes that will end up being traversed is not necessarily going to climb to the hundreds (for a single trip).
Each path is probably a few nodes long because of the 30 min constraint at each stop. With 8 hours in a day and negligible distances between the bushes that would amount to a maximum of 16 stops. Since the edge costs are not negligible, it means that each trip should have <<16 stops.
What we are after is the maximum sum of 5 days harvest (max of five numbers). Each day's harvest is the sum of collected eggs over a "successful" path.
A successful path is defined as the one satisfying all constraints which are:
The path begins and ends on the same node. It is therefore a cycle EXCEPT for Tuesday. Tuesday's harvest is a path.
The cycle of a given day contains the nodes specified in Mrs Bunny's list for that day.
The sum of travel times is less than 8 hrs including the 30min harvesting time.
Therefore, you can use a modified Depth First Search (DFS) algorithm. DFS, on its own can produce an exhaustive list of paths for the network. But, this DFS will not have to traverse all of them because of the constraints.
In addition to the nodes visited so far, this DFS keeps track of the "travel time" and "eggs" collected so far and at each "hop" it checks that all constraints are satisfied. If they are not, then it backtracks or abandons the traversed path. This backtracking action "self-limits" the enumerated paths.
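As a rough sketch of that modified DFS: the graph is assumed to be a dict of dicts of travel minutes, eggs a dict of eggs per bush, and the function name best_route and its parameters are mine. Required stops are only checked for presence on the route, not for the prescribed time of day, so treat this as a starting point.

```python
def best_route(graph, eggs, start, end, time_budget,
               stop_time=30, required=(), visited=None):
    """Return (total_eggs, path) of the best route found, or (-1, None).

    graph: {node: {neighbour: travel_minutes}}; start and end are burrows
    with no eggs and no stop time. `visited` pre-populates the DFS memory,
    e.g. with bushes already harvested earlier in the week.
    """
    best = (-1, None)
    visited = set(visited or ())
    required = set(required)

    def dfs(node, elapsed, collected, path):
        nonlocal best
        for nxt, travel in graph[node].items():
            arrival = elapsed + travel
            if nxt == end:
                # Reaching the end burrow closes the route.
                if (arrival <= time_budget and required <= set(path)
                        and collected > best[0]):
                    best = (collected, path + [end])
                continue
            if nxt == start or nxt in visited:
                continue
            leave = arrival + stop_time
            if leave > time_budget:
                continue            # constraint violated: abandon this hop
            visited.add(nxt)
            dfs(nxt, leave, collected + eggs[nxt], path + [nxt])
            visited.remove(nxt)     # backtrack

    dfs(start, 0, 0, [start])
    return best


graph = {
    "burrow": {"a": 10, "b": 20},
    "a": {"burrow": 10, "b": 10, "c": 60},
    "b": {"burrow": 20, "a": 10},
    "c": {"a": 60},
}
eggs = {"a": 5, "b": 7, "c": 100}
print(best_route(graph, eggs, "burrow", "burrow", time_budget=120))
```

In this toy graph bush c is the richest, but reaching it and harvesting would take 130 minutes, so the time check cuts that branch and the search settles on visiting a and b.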
If the reasoning is so far inline with the problem (?), here is why it doesn't seem to make full sense. If we were to repeat the weekly harvest process for M times to determine the best visiting daily strategy then we would be left with the problem of determining a sufficiently large M to have covered the majority of paths. Instead we could run the DFS once and determine the route of maximum harvest ONCE, which would then lead to the trivial solution of 4*CycleDailyHarvest + TuePathHarvest. The other option would be to relax the 8hr constraint and say that Mr Bunny can harvest UP TO 8hr a day and not 8hr exactly.
In other words, if all parameters are static, then there is no reason to run this process multiple times. For example, if each bush was to give "up to k eggs" following a specific distribution, maybe we could discover an average daily / weekly visiting strategy with the largest yield. (Or my perception of the problem so far is wrong, in which case, please clarify).
Tuesday's task is easier, it is as if looking for "the path between source and target whose time sum is approximately 8hrs and sum of collected eggs is max". This is another sign of why the problem doesn't make full sense. If everything is static (graph structure, eggs/bush, daily harvest interval) then there is only one such path and no need to examine alternatives.
Hope this helps.
EDIT (following question update):
The update doesn't radically change the core of the previous response which is "Use a modified DFS (for the potential of exhaustively enumerating all paths / cycles) and encode the constraints as conditions on metrics (travel time, eggs harvested) that are updated on each hop". It only modifies the way the constraints are represented. The most significant alteration is the "visit each bush once per week". This would mean that the memory of DFS (the set of visited nodes) is not reset at the end of a cycle or the end of a day but at the end of a week. Or in other words, the DFS now can start with a pre-populated visited set. This is significant because it will reduce the number of "viable" path lengths even more. In fact, depending on the structure of the graph and eggs/bush the problem might even end up being unsolvable (i.e. zero paths / cycles satisfying the conditions).
EDIT2:
There are a few "problems" with that approach which I would like to list here. I think they are valid points that you may not have considered yet, and I don't mean them in an argumentative way:
"I don't need to just find one path (that will be relatively quick), I need to find an approximation of the best path." and "I want the "longest" path with the most eggs." are a little bit contradicting statements but on average they point to just one path. The reason I am saying this is because it shows that either the problem is too difficult or not completely understood (?)
A heuristic will only help in creating a landscape. We still have to traverse the landscape (e.g. steepest descent / ascent) and there will be plenty of opportunity for oscillations as the algorithm might get trapped between two "too-low", "too-high" alternatives or discovery of local-minima / maxima without an obvious way of moving out of them.
A*s main objective is still to return ONE path and it will have to be modified to find alternatives.
When operating over a graph, it is impossible to "encourage" the traversal to move towards a specific target because the "traversing agent" doesn't know where the target is and how to get there in the sense of a linear combination of weights (e.g. "If you get too far, lower some Xp which will force the agent to start turning left heading back towards where it came from". When Mr Bunny is at his burrow he has all K alternatives, after the first possible choice he has K-M1 (M1
The MDFS will help in tracking the different ways these sums are allowed to be created according to the choices specified by the graph. (Afterall, this is a graph-search problem).
Having said this, there are possibly alternative, sub-optimal (in terms of computational complexity) solutions that could be adopted here. The obvious (but dummy one) is, again, to establish two competing processes that impose self-control. One is trying to get Mr Bunny AWAY from his burrow and one is trying to get Mr Bunny BACK to his burrow. Both processes are based on the above MDFS and are tracking the cost of MOVEAWAY+GOBACK and the path they produce is the union of the nodes. It might look a bit like A* but this one is reset at every traversal. It operates like this:
AWAY STEP:
Start an MDFS outwards from Mr Bunny's burrow and keep track of distance / egg sum, move to the lowestCost/highestReward target node.
GO BACK STEP:
Now, pre-populate the visited set of the GO BACK MDFS and try to get back home via a route NOT TAKEN SO FAR. Keep track of cost / reward.
Once you reach home again, you have a possible collection path. Repeat the above while the generated paths are within the time specification.
This will result in a palette of paths which you can mix and match over a week (4 repetitions + TuesdayPath) for the lowestCost / highestReward options.
It's not optimal because you might get repeating paths (the AWAY of one trip being the BACK of another) and because this quickly eliminates visited nodes it might still run out of solutions quickly.

Genetic Algorithm in Optimization of Events

I'm a data analysis student and I'm starting to explore Genetic Algorithms at the moment. I'm trying to solve a problem with GA but I'm not sure about the formulation of the problem.
Basically I have a variable whose state is 0 or 1 (0 means it is in the normal range of values, 1 means it is in a critical state). When the state is 1 I can apply one of 3 solutions (call them Solution A, B and C), and for each solution I know the time at which the solution was applied and the time at which the state of the variable went back to 0.
So for the problem I have a data set containing each critical event (state 1), the solution applied, the time interval (in minutes) from the critical event to the application of the solution, and the time interval (in minutes) from the application of the solution until the state returns to 0.
I want to use a genetic algorithm to find which solution is the best and fastest for a critical event. If possible, I would also like to rank the solutions, so that if in the future one solution can't be applied I can always apply the second best, for example.
I'm thinking of developing the solution in Python since I'm new to GA.
Edit: Specifying the problem (responding to AMack)
Yes, it is more or less that, but with some nuances. For example, function A may be the most suitable for making the variable go to F, but because other problems exist with the variable, more than one solution is applied. So in the data that I receive for an event of V, sometimes 3 or 4 functions are applied, but only 1 or 2 of them are specialized for the problem that I want to analyze. My objective is to build decision support for which solution to use when a given problem appears. But there can be more than one optimal solution, because for some events function A acts very fast, while in other cases of the same event function A doesn't produce a fast response and function C is better. So in the end I want a solution that indicates the best solutions to the problem, not only the fastest one, because the solution that is fastest in the majority of cases is sometimes not the fastest for the same issue with a different background.

I'm unsure of what your question is, but here are the elements you need for any GA:
A population of initial "genomes"
A ranking function
Some form of mutation, crossing over within the genome, and reproduction.
If a critical event is always the same, your GA should work very well. That being said, if you have a different critical event but the same genome you will run into trouble. GAs evolve functions towards the best possible solution for a given set of conditions. If you constantly run the GA so that it may adapt to each unique situation you will find a greater degree of adaptability, but have a speed issue.
You have a distinct advantage using Python because string manipulation (what you'll probably use for the genome) is easy; however, Python is slow.
If the genome is short, the initial population is small, and there are very few generations this shouldn't be a problem. You lose possibly better solutions that way but it will be significantly faster.
have fun...

You should take a look at the GARAGe Michigan State. They are a GA research group with a fair number of resources in terms of theory, papers, and software that should provide inspiration.

To start, let's make sure I understand your problem.
You have a set of sample data, each element containing a time series of a binary variable (we'll call it V). When V is set to True, a function (A, B, or C) is applied which returns V to its False state. You would like to apply a genetic algorithm to determine which function (or solution) will return V to False in the least amount of time.
If this is the case, I would stay away from GAs. GAs are typically used for some kind of function optimization / tuning. In general, the underlying assumption is that what you permute is under your control during the algorithm's application (i.e., you are modifying parameters used by the algorithm that are independent of the input data). In your case, my impression is that you just want to find out which of your (I assume) static functions perform best in a wide variety of cases. If you don't feel your current dataset provides a decent approximation of your true input distribution, you can always sample from it and permute the values to see what happens; however, this would not be a GA.
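To illustrate what I mean by simply measuring which function performs best, here is a minimal sketch that ranks the solutions by their median recovery time. The (solution, minutes) record layout, the sample numbers and the name rank_solutions are assumptions on my part, since I have not seen your data.

```python
from collections import defaultdict
from statistics import median


def rank_solutions(events):
    """Rank solutions from fastest to slowest median recovery time.

    events: iterable of (solution_name, minutes_until_state_returns_to_0).
    """
    times = defaultdict(list)
    for solution, minutes in events:
        times[solution].append(minutes)
    return sorted(times, key=lambda name: median(times[name]))


# Made-up sample records, purely for illustration.
events = [("A", 5), ("A", 30), ("B", 10), ("B", 12), ("C", 40)]
print(rank_solutions(events))
```

The ranking also gives you the fallback order: if the first entry can't be applied, take the next one. To account for different backgrounds, you could group the records by background before ranking.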
Having said all of this, I could be wrong. If anyone has used GAs in verification like this, please let me know. I'd certainly be interested in learning about it.

Efficient scheduling of university courses

I'm currently working on a website that will allow students from my university to automatically generate valid schedules based on the courses they'd like to take.
Before working on the site itself, I decided to tackle the issue of how to schedule the courses efficiently.
A few clarifications:
Each course at our university (and I assume at every other university) consists of one or more sections. So, for instance, Calculus I currently has 4 sections available. The number of sections, and whether or not the course has a lab, drastically affects the scheduling process.
Courses at our university are represented using a combination of subject abbreviation and course code. In the case of Calculus I: MATH 1110.
The CRN is a code unique to a section.
The university I study at is not mixed, meaning males and females study in (almost) separate campuses. What I mean by almost is that the campus is divided into two.
The datetimes and timeranges dicts are meant to decrease calls to datetime.datetime.strptime(), which was a real bottleneck.
My first attempt consisted of the algorithm looping continuously until 30 schedules were found. Schedules were created by randomly choosing a section from one of the inputted courses, and then trying to place sections from the remaining courses to try to construct a valid schedule. If not all of the courses fit into the schedule i.e. there were conflicts, the schedule was scrapped and the loop continued.
Clearly, the above solution is flawed. The algorithm took too long to run, and relied too much on randomness.
The second algorithm does the exact opposite of the old one. First, it generates a collection of all possible schedule combinations using itertools.product(). It then iterates through the schedules, crossing off any that are invalid. To ensure assorted sections, the schedule combinations are shuffled (random.shuffle()) before being validated. Again, there is a bit of randomness involved.
After a bit of optimization, I was able to get the scheduler to run in under 1 second for an average schedule consisting of 5 courses. That's great, but the problem begins once you start adding more courses.
To give you an idea, when I provide a certain set of inputs, the number of possible combinations is so large that iterating over itertools.product() does not finish in a reasonable amount of time, and it eats up 1GB of RAM in the process.
Obviously, if I'm going to make this a service, I'm going to need a faster and more efficient algorithm. Two that have popped up online and in IRC: dynamic programming and genetic algorithms.
Dynamic programming cannot be applied to this problem because, if I understand the concept correctly, it involves breaking up the problem into smaller pieces, solving these pieces individually, and then bringing the solutions of these pieces together to form a complete solution. As far as I can see, this does not apply here.
As for genetic algorithms, I do not understand them much, and cannot even begin to fathom how to apply one in such a situation. I also understand that a GA would be more efficient for an extremely large problem space, and this is not that large.
What alternatives do I have? Is there a relatively understandable approach I can take to solve this problem? Or should I just stick to what I have and hope that not many people decide to take 8 courses next semester?
I'm not a great writer, so I'm sorry for any ambiguities in the question. Please feel free to ask for clarification and I'll try my best to help.
Here is the code in its entirety.
http://bpaste.net/show/ZY36uvAgcb1ujjUGKA1d/
Note: Sorry for using a misleading tag (scheduling).

Scheduling is a very famous constraint satisfaction problem that is generally NP-Complete. A lot of work has been done on the subject, even in the same context as you: Solving the University Class Scheduling Problem Using Advanced ILP Techniques. There are even textbooks on the subject.
People have taken many approaches, including:
Dynamic programming
Genetic algorithms
Neural networks
You need to reduce your problem space and complexity. Make as many assumptions as possible (max number of classes, block-based timing, etc.). There is no silver bullet for this problem but it should be possible to find a near-optimal solution.
Some semi-recent publications:
QUICK scheduler a time-saving tool for scheduling class sections
Scheduling classes on a College Campus

Did you ever read anything about genetic programming? The idea behind it is that you let the 'thing' you want solved evolve, just by itself, until it has grown to the best solution(s) possible.
You generate a thousand schedules, of which usually zero are anywhere in the right direction of being valid. Next, you change 'some' courses, randomly. From these new schedules you select some of the best, based on ratings you give according to the 'goodness' of the schedule. Next, you let them reproduce, by combining some of the courses on both schedules. You end up with a thousand new schedules, but all of them a tiny fraction better than the ones you had. Let it repeat until you are satisfied, and select the schedule with the highest rating from the last thousand you generated.
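A minimal sketch of that loop, to give an idea of its size. The section tuples (CRN, day, start hour, end hour), the conflicts helper, the example sections and all the parameter values are made up for illustration. A schedule is rated by its number of clashing pairs, so lower is better and 0 means valid.

```python
import random


def conflicts(a, b):
    """Hypothetical section format: (crn, day, start_hour, end_hour)."""
    return a[1] == b[1] and a[2] < b[3] and b[2] < a[3]


def evolve(options, rng, population_size=30, generations=20):
    """options: one list of sections per course. Returns the best schedule found."""
    def penalty(schedule):
        # Number of clashing pairs; 0 means the schedule is valid.
        return sum(conflicts(a, b)
                   for i, a in enumerate(schedule)
                   for b in schedule[i + 1:])

    population = [tuple(rng.choice(sections) for sections in options)
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=penalty)
        parents = population[:population_size // 2]    # keep the better half
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, len(options)) if len(options) > 1 else 0
            child = list(mother[:cut] + father[cut:])  # combine two schedules
            i = rng.randrange(len(options))
            child[i] = rng.choice(options[i])          # randomly change a course
            children.append(tuple(child))
        population = parents + children
    return min(population, key=penalty)


options = [
    [("M1", "Mon", 9, 10), ("M2", "Mon", 11, 12)],
    [("P1", "Mon", 9, 10), ("P2", "Tue", 9, 10)],
    [("C1", "Mon", 11, 12)],
]
print(evolve(options, random.Random(0)))
```

Because the better half always survives, the best rating in the population never gets worse from one generation to the next, but there is no guarantee that a valid schedule is found.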
There is randomness involved, I admit, but the schedules keep getting better, no matter how long you let the algorithm run. Just like real life and organisms there is survival of the fittest, and it is possible to view the different general 'threads' of the same kind of schedule, that is about as good as another one generated. Two very different schedules can finally 'battle' it out by cross breeding.
A project involving school schedules and genetic programming:
http://www.codeproject.com/Articles/23111/Making-a-Class-Schedule-Using-a-Genetic-Algorithm
I think they explain pretty well what you need.
My final note: I think this is a very interesting project. It is quite difficult to make, but once done it is just great to see your solution evolve, just like real life. Good luck!

The way you're currently generating combinations of sections is probably throwing up huge numbers of combinations that are excluded by conflicts between more than one course. I think you could reduce the number of combinations that you need to deal with by generating the product of the sections for only two courses first. Eliminate the conflicts from that set, then introduce the sections for a third course. Eliminate again, then introduce a fourth, and so on. This should see a more linear growth in the processing time required as the number of courses selected increases.
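A minimal sketch of that incremental approach. The section tuples (CRN, day, start hour, end hour), the conflicts helper and every course except MATH 1110 are made up for illustration; plug in your own conflict check.

```python
def conflicts(a, b):
    """Hypothetical section format: (crn, day, start_hour, end_hour)."""
    return a[1] == b[1] and a[2] < b[3] and b[2] < a[3]


def compatible_schedules(courses):
    """courses: {course: [section, ...]}. Returns all conflict-free schedules."""
    partial = [()]                      # start with one empty schedule
    for sections in courses.values():
        # Extend every surviving partial schedule by one more course,
        # dropping combinations as soon as they clash.
        partial = [
            schedule + (section,)
            for schedule in partial
            for section in sections
            if not any(conflicts(section, chosen) for chosen in schedule)
        ]
    return partial


courses = {
    "MATH 1110": [("M1", "Mon", 9, 10), ("M2", "Mon", 11, 12)],
    "PHYS 1010": [("P1", "Mon", 9, 10), ("P2", "Tue", 9, 10)],
    "CHEM 1010": [("C1", "Mon", 11, 12)],
}
for schedule in compatible_schedules(courses):
    print([section[0] for section in schedule])
```

A clash between two courses is discarded once, instead of being rediscovered in every full combination that contains it.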

This is a hard problem. If you google something like 'course scheduling problem paper' you will find a lot of references. Genetic algorithm - no, dynamic programming - yes. GAs are much harder to understand and implement than standard DP algos. Usually people who use GAs out of the box don't understand standard techniques. Do some research and you will find different algorithms. You might be able to find some implementations. Coming up with your own algorithm is way, way harder than putting some effort into understanding DP.

The problem you're describing is a Constraint Satisfaction Problem. My approach would be the following:
Check whether there are any incompatibilities between courses; if so, record them as constraints or arcs
While no solution is found:
Select the course with the fewest constraints (that is, the fewest incompatibilities with other courses)
Run the AC-3 algorithm to reduce the search space
I've tried this approach with sudoku solving and it worked (solved the hardest sudoku in the world in less than 10 seconds)
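A minimal sketch of how this could look for courses. The section tuples (CRN, day, start hour, end hour), the course names other than MATH 1110 and the helper names are made up for illustration. For simplicity the sketch branches on the course with the fewest remaining sections, which may not be exactly the selection rule described above.

```python
from collections import deque


def conflicts(a, b):
    """Hypothetical section format: (crn, day, start_hour, end_hour)."""
    return a[1] == b[1] and a[2] < b[3] and b[2] < a[3]


def ac3(domains, neighbors):
    """Remove sections that clash with every section of some other course."""
    queue = deque((x, y) for x in domains for y in neighbors[x])
    while queue:
        x, y = queue.popleft()
        kept = [vx for vx in domains[x]
                if any(not conflicts(vx, vy) for vy in domains[y])]
        if len(kept) < len(domains[x]):
            domains[x] = kept
            if not kept:
                return False            # a course has no usable section left
            queue.extend((z, x) for z in neighbors[x] if z != y)
    return True


def solve(domains, neighbors):
    """Return {course: section} for one valid schedule, or None."""
    domains = {course: list(sections) for course, sections in domains.items()}
    if not ac3(domains, neighbors):
        return None
    open_courses = [c for c in domains if len(domains[c]) > 1]
    if not open_courses:
        return {course: sections[0] for course, sections in domains.items()}
    course = min(open_courses, key=lambda c: len(domains[c]))
    for section in domains[course]:     # try each remaining section in turn
        result = solve({**domains, course: [section]}, neighbors)
        if result is not None:
            return result
    return None


courses = {
    "MATH 1110": [("M1", "Mon", 9, 10), ("M2", "Mon", 11, 12)],
    "PHYS 1010": [("P1", "Mon", 9, 10), ("P2", "Tue", 9, 10)],
    "CHEM 1010": [("C1", "Mon", 11, 12)],
}
# Every pair of courses can clash, so each course is a neighbor of all others.
neighbors = {c: [d for d in courses if d != c] for c in courses}
print(solve(courses, neighbors))
```

In this toy input AC-3 alone settles everything: CHEM has one section, which rules out M2, which leaves M1, which in turn rules out P1.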
