I am using the Google OR-Tools vehicle routing implementation and am trying to incorporate traffic into my time matrix by using the Google Maps API. However, the Google Maps API limits how large a time matrix you can build, how many requests you can make in a given period, and so on.
I know that the Google OR-Tools VRP expects this time matrix, but I don't need the travel times between all combinations of origins and destinations. For example, I am inputting pickup/dropoff pairs, for which it makes no sense to calculate the travel time from a dropoff back to its assigned pickup. I could also skip the travel times between locations that are far apart (beyond some maximum distance I'd establish). Not calling the API for these combinations, and instead putting placeholder constants in the time matrix, would reduce the computational complexity.
Can this routing model be run in a loop, such that in the first iteration I only calculate the travel times between the most likely assignments, in each loop each driver gets assigned a pickup/dropoff pair, and in the next loop the travel times between already-made assignments no longer need to be calculated? I don't even know whether this would change the computation time.
Has anyone else had this problem before? I'd be interested in hearing any advice and/or additional heuristics to use.
The VRP travel matrix requires, as input, the distance (or travel time) between every pair of visit locations. You can reduce the complexity of the problem by assuming the distance from A to B equals the distance from B to A; this also halves the number of calls to the Google Maps API. Note that the travel matrix must always be a full square matrix, and under this assumption it is symmetric.
The distance between locations is required for the VRP solving heuristic to find the next optimal node to visit.
If you are certain that some locations will never be visited after certain other locations, you can set the distance between those locations to Big M (e.g., sys.maxsize). However, be careful with the direction constraints (pickup/dropoff constraints): if you set Big M between two locations that are linked by such a constraint, the solver will definitely fail.
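As an illustration only, here is a minimal sketch of building such a matrix; the fetch_travel_time helper and the forbidden_pairs set are hypothetical stand-ins, not OR-Tools or Maps API names. Note that in a symmetric matrix a Big-M entry blocks both directions, so per the warning above it should only be used for pairs disallowed both ways (e.g., locations beyond your maximum distance), never for pickup/dropoff pairs:
```python
import sys

def fetch_travel_time(origin, destination):
    """Stand-in for a real Google Maps Distance Matrix API call
    (hypothetical helper; replace with your actual API client)."""
    return 600  # pretend every leg takes 600 seconds

def build_time_matrix(locations, forbidden_pairs):
    """Build a square, symmetric time matrix, placing a large constant
    on pairs that should never follow one another in either direction."""
    n = len(locations)
    big_m = sys.maxsize // 4  # large, with headroom so sums cannot overflow
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: halves the API calls
            if (i, j) in forbidden_pairs or (j, i) in forbidden_pairs:
                t = big_m
            else:
                t = fetch_travel_time(locations[i], locations[j])
            matrix[i][j] = t
            matrix[j][i] = t  # mirror: assume A -> B == B -> A
    return matrix
```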
Query:
I want to estimate the trajectory of a person wearing an IMU between point a and point b. I know the exact location of point a and point b in an x,y,z space and the time it takes the person to walk between the points.
Is it possible to reconstruct the trajectory of the person moving from point a to point b using the data from an IMU and the time?
This question is too broad for SO. You could write a PhD thesis answering it, and I know people who have.
However, yes, it is theoretically possible.
That said, there are a few things you'll have to deal with:
Your system is going to discretize time on some level. The result is that your estimate of position will be non-smooth. Increasing sampling rates is one way to address this, but this frequently increases the noise of the measurement.
Possible paths are non-unique. Knowing the time it takes to travel from a to b slightly constrains the IMU data, but you are still left with an infinite family of possible routes between the two points. Since you mention that you're considering a person walking between two points with z-components, perhaps you can constrain the route using knowledge of topography and roads?
IMUs function by integrating accelerations to velocities and velocities to positions. If the accelerations have measurement errors, and they always do, then the error in your estimate of the position will grow over time. The longer you run the system for, the more the results will diverge. However, if you're able to use roads/topography as a constraint, you may be able to restart the integration from known points in space; that is, if you can detect 90 degree turns on a street grid, each turn gives you the opportunity to tie the integrator back to a feasible initial condition.
Given the above, perhaps the most important question you have to ask yourself is how much error you can tolerate in your path reconstruction. Low-error estimates are going to require better (i.e. more expensive) sensors, higher sampling rates, and higher-order integrators.
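To make the drift issue in point 3 concrete, here is a minimal dead-reckoning sketch in plain NumPy. It assumes the accelerations have already been rotated into the world frame and gravity-compensated, which a real system must do using the gyroscope (and usually a magnetometer):
```python
import numpy as np

def dead_reckon(accel, dt, v0=None, p0=None):
    """Naively double-integrate world-frame accelerations (N x 3) into
    velocities and positions. Any constant bias in `accel` grows
    linearly in velocity and quadratically in position."""
    v0 = np.zeros(3) if v0 is None else v0
    p0 = np.zeros(3) if p0 is None else p0
    vel = v0 + np.cumsum(accel * dt, axis=0)
    pos = p0 + np.cumsum(vel * dt, axis=0)
    return vel, pos

# A stationary subject with a tiny accelerometer bias still "moves":
dt = 0.01                         # 100 Hz sampling
accel = np.full((6000, 3), 1e-3)  # 60 s of a 0.001 m/s^2 bias
_, pos = dead_reckon(accel, dt)
print(pos[-1])                    # ~1.8 m of spurious displacement per axis
```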
I have J projects, running over T weeks, each project using an amount of resources S in week t (given by a predefined matrix). I have to allocate these projects to I clusters, where each cluster's capacity C_i is known. Furthermore, each project belongs to a certain subgroup G.
This appears to be a 0-1 multiple knapsack problem, right? However, there are some differences: every project has to be assigned to exactly one cluster, and once assigned, it cannot be moved. This usually results in over-packing the knapsacks (clusters), in violation of the "knapsack cannot exceed capacity" constraint, yielding only infeasible solutions. Thus, it has an impact on the objectives.
My objectives are, in order of priority:
1) Minimize the total number of occurrences of resource requests exceeding a cluster's capacity. In layman's terms, minimize the number of times that the clusters' capacities are violated.
2) Resource requests should be spread across clusters as evenly as possible.
3) Projects in the same group should be spread across clusters as much as possible.
Now, for my questions:
Am I right to assume this is a 0-1 Multiple Knapsack Problem? Am I also right to assume it is linear? I could not find any similar case studies in the literature so far of this exact variation of the problem.
I have implemented some beginner-level code that generates random project-to-cluster allocations and creates a Pareto-optimal front. The next step is implementing a simple multi-objective optimization algorithm. I am not sure how to even begin, as I have not encountered anything close to this in the literature. I am quite a beginner in Python, so even reading through the library documentation for PyGMO, DEAP, or even SciPy seems too complex for me. Any suggestions?
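Not a full answer, but as a concrete starting point: evaluating the three objectives for a single candidate allocation can be done in plain Python before reaching for PyGMO or DEAP. The data layout below (a J x T demand matrix, a capacity list, and a group list) is an assumption based on the description above:
```python
from collections import Counter

def evaluate(assignment, demand, capacity, groups):
    """Score one candidate allocation on the three objectives.
    assignment: assignment[j] = cluster index of project j
    demand:     demand[j][t]  = resources project j needs in week t
    capacity:   capacity[i]   = capacity of cluster i
    groups:     groups[j]     = subgroup of project j"""
    I, T = len(capacity), len(demand[0])
    load = [[0.0] * T for _ in range(I)]
    for j, i in enumerate(assignment):
        for t in range(T):
            load[i][t] += demand[j][t]
    # Objective 1: count of (cluster, week) cells where capacity is violated.
    violations = sum(load[i][t] > capacity[i] for i in range(I) for t in range(T))
    # Objective 2: how unevenly total load is spread (lower = more even).
    totals = [sum(row) for row in load]
    spread = max(totals) - min(totals)
    # Objective 3: same-group projects sharing a cluster (lower = more spread).
    clumping = sum(c - 1 for c in Counter(zip(groups, assignment)).values())
    return violations, spread, clumping

# e.g. one random candidate, as in your random-solution generator:
# assignment = [random.randrange(len(capacity)) for _ in range(len(demand))]
```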
I'm working on a graph search problem that can be distilled to the following simpler example:
Updated to clarify based on response below
The Easter Bunny is hopping around the forest collecting eggs. He knows how many eggs to expect from every bush, and each bush has a unique number of eggs. It takes the Easter Bunny 30 minutes to collect from any given bush. The Easter Bunny searches for eggs 5 days a week, up to 8 hours per day. He typically starts and ends in his burrow, but on Tuesday he plans to end his day at his friend Peter Rabbit's burrow. Mrs. Bunny gave him a list of a few specific bushes to visit at specific days/times; these are intermediate stops that must be hit, but the list does not cover all stops (maybe 1-2 per day). Help the Easter Bunny design a route that gives him the most eggs at the end of the week.
Given Parameters: undirected graph (g), distances between nodes are travel times, 8 hours of time per day, 5 working days, list of (node, time, day) tuples (r), list of (startNode, endNode, day) tuples (s)
Question: Design a route that maximizes the value collected over the 5 days without going over the allotted time in any given day.
Constraints: visit every node in r at the prescribed time/day. For each day in s, start and end at the corresponding nodes, whose collection value is 0. Nodes cannot be visited more than once per week.
Approach: Since there won't be very many stops, given the time at each stop and the travel times (maybe 10-12 on a large day), my first thought was to brute-force all routes that start/stop at the correct points, and just run this 5 times, removing all visited nodes. From there, separately compute the collected value of each allowable route. However, this doesn't account for the fact that my "best" route on day one may ruin a route that would be best on day 5, given the required stops on that day.
To solve that problem I considered running one long search by concatenating all the days and just starting from t = 0 (beginning of week) to t = 40 (end of week), with the start/end points for each day as intermediate stops. This gets too long to brute force.
I'm struggling a little with how to approach the problem. It's not a TSP problem; I'm only going to visit a fraction of all nodes (maybe 50 of 200). It's also not a Dijkstra shortest-path problem; the shortest path would typically be to go nowhere. I need to maximize the total collected value in the allotted time while making the required intermediate stops. Any thoughts on how to proceed would be greatly appreciated! Right now I've been approaching this using networkx in Python.
Edit following response
In response to your edit: I'm looking for an approach to solve the problem; I can figure out the code later. I'm leaning towards A* over MDFS, because I don't just need to find one path (that would be relatively quick), I need to find an approximation of the best path. I'm struggling to create a heuristic that captures the time constraint (stay under the time required to be at the next stop) while also maximizing eggs. I don't really want the shortest path, I want the "longest" path with the most eggs. In evaluating where to go next, I can easily compute eggs/min and move to the bush with the best rate, but I need to figure out how to encourage the search to move gradually towards the target. There will always be a solution: I could hop to the first bush, sit there all day, and then go to the target (the placement/times are such that the problem is always solvable).
The way the problem is posed doesn't make full sense. It is indeed a graph search problem to maximise a sum of numbers (subject to other constraints) and it possibly can be solved via brute force as the number of nodes that will end up being traversed is not necessarily going to climb to the hundreds (for a single trip).
Each path is probably only a few nodes long because of the 30-minute constraint at each stop. With 8 hours in a day and negligible distances between the bushes, that would amount to a maximum of 16 stops. Since the edge costs are not negligible, each trip should have far fewer than 16 stops.
What we are after is the maximum total harvest over the 5 days (the sum of five daily harvests). Each day's harvest is the sum of collected eggs over a "successful" path.
A successful path is defined as one satisfying all the constraints, which are:
The path begins and ends on the same node; it is therefore a cycle, EXCEPT for Tuesday, whose harvest is a path.
The cycle of a given day contains the nodes specified in Mrs Bunny's list for that day.
The sum of travel times, including the 30-minute harvesting time at each stop, is less than 8 hrs.
Therefore, you can use a modified Depth First Search (DFS) algorithm. DFS on its own can produce an exhaustive list of paths for the network, but this DFS will not have to traverse all of them because of the constraints.
In addition to the nodes visited so far, this DFS keeps track of the "travel time" and "eggs" collected so far and at each "hop" it checks that all constraints are satisfied. If they are not, then it backtracks or abandons the traversed path. This backtracking action "self-limits" the enumerated paths.
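A minimal sketch of such a constrained DFS in plain Python (the adjacency-dict graph layout and the fixed 30-minute stop are assumptions taken from the problem statement; the check for Mrs Bunny's mandatory stops is omitted for brevity):
```python
HARVEST_TIME = 0.5  # hours spent collecting at each bush
DAY_LIMIT = 8.0     # hours available per day

def dfs(graph, eggs, node, end, time_used, collected, visited, best):
    """graph: {node: {neighbor: travel_hours}}; eggs: {node: egg_count}.
    Exhaustively enumerates routes from `node` that finish at `end`
    within DAY_LIMIT, recording the best egg total in best[0]."""
    for nxt, travel in graph[node].items():
        if nxt == end:
            # Closing the route at the burrow (collection value 0 there).
            if time_used + travel <= DAY_LIMIT:
                best[0] = max(best[0], collected)
            continue
        cost = travel + HARVEST_TIME
        if nxt in visited or time_used + cost > DAY_LIMIT:
            continue  # constraint violated: abandon this branch
        visited.add(nxt)
        dfs(graph, eggs, nxt, end, time_used + cost,
            collected + eggs[nxt], visited, best)
        visited.remove(nxt)  # backtrack

# Cycle day:  best = [0]; dfs(g, eggs, burrow, burrow, 0.0, 0, {burrow}, best)
# Tuesday:    best = [0]; dfs(g, eggs, burrow, peters_burrow, 0.0, 0, {burrow}, best)
```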
If the reasoning so far is in line with the problem (?), here is why it doesn't seem to make full sense: if we were to repeat the weekly harvest process M times to determine the best daily visiting strategy, we would be left with the problem of determining an M sufficiently large to have covered the majority of paths. Instead, we could run the DFS once and determine the route of maximum harvest ONCE, which would then lead to the trivial solution of 4*CycleDailyHarvest + TuePathHarvest. The other option would be to relax the 8-hour constraint and say that Mr Bunny can harvest UP TO 8 hours a day, not exactly 8 hours.
In other words, if all parameters are static, then there is no reason to run this process multiple times. For example, if each bush was to give "up to k eggs" following a specific distribution, maybe we could discover an average daily / weekly visiting strategy with the largest yield. (Or my perception of the problem so far is wrong, in which case, please clarify).
Tuesday's task is easier, it is as if looking for "the path between source and target whose time sum is approximately 8hrs and sum of collected eggs is max". This is another sign of why the problem doesn't make full sense. If everything is static (graph structure, eggs/bush, daily harvest interval) then there is only one such path and no need to examine alternatives.
Hope this helps.
EDIT (following question update):
The update doesn't radically change the core of the previous response which is "Use a modified DFS (for the potential of exhaustively enumerating all paths / cycles) and encode the constraints as conditions on metrics (travel time, eggs harvested) that are updated on each hop". It only modifies the way the constraints are represented. The most significant alteration is the "visit each bush once per week". This would mean that the memory of DFS (the set of visited nodes) is not reset at the end of a cycle or the end of a day but at the end of a week. Or in other words, the DFS now can start with a pre-populated visited set. This is significant because it will reduce the number of "viable" path lengths even more. In fact, depending on the structure of the graph and eggs/bush the problem might even end up being unsolvable (i.e. zero paths / cycles satisfying the conditions).
EDIT2:
There are a few "problems" with that approach, which I would like to list here along with what I think are valid points not yet covered by your viewpoint, and not in an argumentative way:
"I don't need to just find one path (that will be relatively quick), I need to find an approximation of the best path." and "I want the "longest" path with the most eggs." are somewhat contradictory statements, but on average they point to just one path. The reason I am saying this is that it suggests either that the problem is too difficult or that it is not completely understood (?)
A heuristic will only help in creating a landscape. We still have to traverse that landscape (e.g. steepest descent / ascent), and there will be plenty of opportunity for oscillations: the algorithm might get trapped between two "too-low" / "too-high" alternatives, or discover local minima / maxima without an obvious way of moving out of them.
A*'s main objective is still to return ONE path, and it would have to be modified to find alternatives.
When operating over a graph, it is impossible to "encourage" the traversal to move towards a specific target, because the "traversing agent" doesn't know where the target is or how to get there in the sense of a linear combination of weights (e.g. "if you get too far, lower some Xp, which will force the agent to start turning left, heading back towards where it came from"). When Mr Bunny is at his burrow he has all K alternatives; after the first choice he has K-M1 (M1 being the options ruled out by that choice), and so on.
The MDFS will help in tracking the different ways these sums are allowed to be created, according to the choices specified by the graph. (After all, this is a graph-search problem.)
Having said this, there are possibly alternative, sub-optimal (in terms of computational complexity) solutions that could be adopted here. The obvious (but naive) one is, again, to establish two competing processes that impose self-control: one trying to get Mr Bunny AWAY from his burrow and one trying to get Mr Bunny BACK to his burrow. Both processes are based on the above MDFS and track the cost of MOVEAWAY+GOBACK; the path they produce is the union of the nodes. It might look a bit like A*, but this one is reset at every traversal. It operates like this:
AWAY STEP:
Start an MDFS outwards from Mr Bunny's burrow, keep track of the distance / egg sum, and move to the lowestCost/highestReward target node.
GO BACK STEP:
Now, pre-populate the visited set of the GO BACK MDFS and try to get back home via a route NOT TAKEN SO FAR. Keep track of cost / reward.
Once you reach home again, you have a possible collection path. Repeat the above while the generated paths are within the time specification.
This will result in a palette of paths which you can mix and match over a week (4 repetitions + TuesdayPath) for the lowestCost / highestReward options.
It's not optimal, because you might get repeating paths (the AWAY of one trip being the BACK of another), and because it eliminates visited nodes so quickly it might run out of solutions early.
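For illustration, here is a loose, simplified rendering of the AWAY / GO BACK idea in plain Python. It deviates from the description above in that the GO BACK step only checks for (and finally takes) the cheapest feasible route home rather than running a second full MDFS, and harvesting on the return leg is omitted:
```python
def cheapest_home(graph, node, home, banned, budget):
    """Cheapest travel time from `node` back to `home` through nodes
    not in `banned`, or None if nothing fits in `budget`."""
    if node == home:
        return 0.0
    best = None
    for nxt, t in graph[node].items():
        if (nxt in banned and nxt != home) or t > budget:
            continue
        sub = cheapest_home(graph, nxt, home, banned | {nxt}, budget - t)
        if sub is not None and (best is None or t + sub < best):
            best = t + sub
    return best

def away_back(graph, eggs, start, time_limit=8.0, harvest=0.5):
    """One AWAY / GO BACK trip: greedily hop to the unvisited neighbour
    with the best eggs-per-hour rate, as long as a feasible route home
    still exists; then head home. Returns (outward path, eggs collected)."""
    path, visited, used, got = [start], {start}, 0.0, 0
    while True:
        options = sorted(
            ((eggs[n] / (t + harvest), n, t)
             for n, t in graph[path[-1]].items() if n not in visited),
            key=lambda o: o[0], reverse=True)
        for _, nxt, t in options:
            spend = t + harvest
            back = cheapest_home(graph, nxt, start, visited | {nxt},
                                 time_limit - used - spend)
            if back is not None:  # this hop still lets Mr Bunny get home
                path.append(nxt)
                visited.add(nxt)
                used += spend
                got += eggs[nxt]
                break
        else:  # no feasible AWAY hop left: go home
            return path, got
```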
I am attempting to create a program for asynchronous carpool scheduling. For example, given a single long street represented by this line:
A----------------------B---------------------------C-------------------------D
If Alice, Bob, and Chad all need to be at point D at the same time, and start from points A, B, and C respectively, but Bob doesn't have a car, the program needs to output that the best solution is for Alice to pick Bob up. If the Google Maps API is used to determine the distance and time between two points (precision may be approximate), what is the best way to determine the most efficient set of actions?
In this simplified scenario, there is a rather limited number of possibilities (Alice picks Bob up, Chad picks Bob up, Bob walks), and it is fairly simple to iterate over them and evaluate them based on time or distance traveled. But when the number of possibilities runs into the thousands or millions, that approach doesn't scale well. I have been working for months trying to develop a more sophisticated algorithm, but due to my limited knowledge I have not been very successful.
I suggest generalizing the problem; here's how I understand it:
Given a graph G that describes connections between locations L = {l_0, ..., l_n} on a map, and a set of participants P = {p_0, ..., p_n}, find the best itinerary I = {l_s, ..., l_d} for each participant to travel from their respective source location l_s to a common destination l_d, such that all participants reach the destination regardless of their means of travel. Some participants may need to take a detour to pick up other participants, in which case we want to find the solution with the lowest total cost (as in travel time required).
Here's a rough sketch of how I would attempt to solve this problem:
build a graph of all nodes and how they are connected (e.g. in your example the graph is G = [(A,B),(B,C),(C,D)]).
add a weight to each edge which describes say its distance in some normalized way (metric, constant speed/time, ...)
for each participant specify the maximum speed available to them depending on the mode they can travel with (e.g. car, bike, walk).
for the participants with a car (A, C), calculate the actual time it takes each to reach the target destination from their respective source locations, using a shortest-path algorithm such as A*. This is actually a two-step process; step 1: calculate the shortest path, step 2: calculate the time it takes to travel each edge.
Google's map API would probably already provide all or most of the data you need to do these calculations (or even do the calculations for you).
As a result you get the best itinerary for each participant. Now we have to consider participants without a car (here, B). For that we have to find the participant who should take a detour such that the detour incurs the lowest cost. Thus,
calculate for each of A, C two shortest-path routes, i.e. step 1: source location => detour location (B), step 2: detour location (B) => destination location (D)
select the best option, i.e. minimize for total travel time
Since you tagged this as a Python question, you may want to check out NetworkX or graph-tool for the implementation.
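For instance, the detour comparison could look like this in NetworkX (the edge weights are made-up travel times for the street example above):
```python
import networkx as nx

# Weighted street graph from the example; weights are travel times by car.
G = nx.Graph()
G.add_weighted_edges_from([("A", "B", 10), ("B", "C", 12), ("C", "D", 11)])

def travel_time(source, target):
    return nx.shortest_path_length(G, source, target, weight="weight")

drivers = ["A", "C"]   # Alice and Chad have cars
passenger = "B"        # Bob needs a ride
destination = "D"

# Cost of each driver detouring through the passenger's location:
# source -> passenger, then passenger -> destination.
detours = {
    d: travel_time(d, passenger) + travel_time(passenger, destination)
    for d in drivers
}
best_driver = min(detours, key=detours.get)
print(best_driver, detours)  # 'A' picks Bob up in this toy example
```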
Disclaimer: Obviously this leaves out quite a lot of details, and may even be completely off. Feedback appreciated. To paraphrase Donald Knuth, I only thought about this, I didn't implement or test it.
We are working on a project which involves running a shortest path algorithm on a big map.
We are using A* with the air-distance heuristic for now.
Our project involves receiving updates to links in the database.
Currently we restart the search on each link update, or at a predefined interval.
Is there a way to adapt the A* algorithm so that the search can be updated without restarting it on every update received? Is there a better algorithm suited to this task?
Disclosure: This is part of a student project.
Thank you.
You might be looking for a routing algorithm (that by nature deals with constantly changing graphs).
One way to achieve it is with the Distance Vector Routing Protocol (a distributed version of the Bellman-Ford algorithm), which works as follows [1]:
Periodically, every vertex sends its "distance vector" to its neighbors [the vector indicates how much it 'costs' to travel from the sending vertex to each other vertex].
Its neighbors try to update their routing tables [which record, for each target, through which edge it is best to move].
For your case, each node knows the fastest way to get to its neighbors (1 if the graph is unweighted), and it (the vertex) adds this number to each entry in the distance vector, in order to know how to get to each destination and how much time it will take. Every time a modification arrives, the relevant node invokes a new iteration of the protocol until it re-converges.
Note, however, that this algorithm is uninformed. It deals well with changing graphs, but with certain limitations; there is still the count-to-infinity problem.
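For intuition, here is a toy synchronous simulation of the distance-vector update rule in plain Python. A real implementation is asynchronous and message-driven, and this sketch omits the usual mitigations (split horizon, poisoned reverse) for the count-to-infinity problem:
```python
INF = float("inf")

def distance_vector(graph):
    """graph: {node: {neighbor: edge_cost}}. Returns dist[u][v] = best
    known cost from u to v once the protocol has converged."""
    nodes = list(graph)
    dist = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    changed = True
    while changed:  # iterate until no vector changes (convergence)
        changed = False
        for u in nodes:
            for n, cost in graph[u].items():
                # u receives n's vector and relaxes routes through n
                for v in nodes:
                    if cost + dist[n][v] < dist[u][v]:
                        dist[u][v] = cost + dist[n][v]
                        changed = True
    return dist

# When an edge weight changes, update `graph` and re-run the loop;
# only the affected vectors will actually change.
g = {"a": {"b": 1}, "b": {"a": 1, "c": 2}, "c": {"b": 2}}
print(distance_vector(g)["a"]["c"])  # 3
```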
[1] The explanation of the algorithm is based on an explanation I provided some time back in this thread, with some modifications. (It is the same suggested algorithm after all.)