Activity selection using Greedy Algorithm in Python - python

Given the problem below, I have the following approach; however, it does not pass all the test cases.
Problem Statement: A club has planned to organize several events. The volunteers are given a list of activities and the starting time and ending time of those activities.
Write a Python function that accepts the activity list, start_time list and finish_time list. The function should find and return the largest list of activities that can be performed by a single person.
Assume that a person can work on only a single activity at a time. If an activity performed by a person ends at x unit time, then he/she can take up the next activity only if it starts at a time greater than or equal to x+1.
def find_maximum_activities(activity_list,start_time_list, finish_time_list):
    activities = list(zip(activity_list, start_time_list, finish_time_list))
    activities.sort(key = lambda x: x[2])
    finish = 0
    result = []
    for i in activities:
        if finish <= i[1]:
            result.append(i[0])
            finish = i[2]
    return result
activity_list=[1,2,3,4,5,6,7]
start_time_list=[1,4,2,3,6,8,6]
finish_time_list=[2,6,4,5,7,10,9]
result=find_maximum_activities(activity_list,start_time_list, finish_time_list)
print("The maximum set of activities that can be completed:",result)

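You are not updating the finish variable.
# Presumably activities is a list of (start, finish) tuples and d maps each
# tuple to its activity name; d is not defined anywhere in this snippet.
activities.sort(key=lambda x: x[1])
finish = -1
result = []
for i in activities:
    if finish <= i[0]:
        result.append(d[i])
        finish = i[1]
Try the above snippet.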
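The problem statement says the next activity must start at x+1 or later, which for integer times means strictly after the previous finish, while the comparison in the question also accepts a start equal to the previous finish. A minimal sketch of the greedy selection under that stricter reading follows; the name select_activities is hypothetical and not taken from either post.

```python
def select_activities(activity_list, start_time_list, finish_time_list):
    """Greedy selection: earliest finish first, next start must exceed the last finish."""
    activities = sorted(
        zip(activity_list, start_time_list, finish_time_list),
        key=lambda item: item[2],  # sort by finish time
    )
    result = []
    last_finish = None  # nothing chosen yet
    for name, start, finish in activities:
        # "start >= x + 1" for integer times is the same as "start > x"
        if last_finish is None or start > last_finish:
            result.append(name)
            last_finish = finish
    return result


print(select_activities(
    [1, 2, 3, 4, 5, 6, 7],
    [1, 4, 2, 3, 6, 8, 6],
    [2, 6, 4, 5, 7, 10, 9],
))  # prints [1, 4, 5, 6]
```

With the question's sample data, the non-strict comparison would instead pick activities 1, 3, 2, 5 and 6, some of which start exactly when the previous one ends.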

I don't believe this is a greedy problem.
IMO, it is a DP problem.
When you process an activity, you should already have computed the answer for every activity that starts after it.
So process the activities in decreasing order of start time.
The answer for a given activity is then 1 + max(answer over all activities that start after this one ends).
Make max(answer over all activities that start after this one ends) an O(1) or O(log(n)) operation.
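The recurrence described in this answer can be sketched with a suffix table and a binary search. This is one possible reading of the answer, not the answerer's code: the name max_activity_count is hypothetical, the sketch keeps the "next start at x+1 or later" rule from the problem statement, and it returns only the count rather than the list of activities.

```python
from bisect import bisect_right


def max_activity_count(start_time_list, finish_time_list):
    """Return the largest number of non-overlapping activities (DP over start times)."""
    intervals = sorted(zip(start_time_list, finish_time_list))
    starts = [start for start, _ in intervals]
    n = len(intervals)
    # best[i] = best count achievable using only intervals[i:]
    best = [0] * (n + 1)
    for i in range(n - 1, -1, -1):  # decreasing order of start time
        _, finish = intervals[i]
        # first interval whose start is strictly greater than this finish
        nxt = bisect_right(starts, finish)
        # either skip interval i, or take it and continue from nxt
        best[i] = max(best[i + 1], 1 + best[nxt])
    return best[0]


print(max_activity_count([1, 4, 2, 3, 6, 8, 6], [2, 6, 4, 5, 7, 10, 9]))  # prints 4
```

Storing the suffix maximum in best makes the "max over all later activities" lookup O(1), and the binary search that finds where "later" begins is O(log n).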

Related

Is there a quick way to loop through lists that need to be sorted after each iteration of a for loop?

Objective
I am currently in the process of creating a simulation in Python to test various load balances for agents to obtain incoming work within service level. I will simplify the scenario in hopes that someone can steer me in the right direction.
Our calls are broken up into priorities: Alpha, Beta, Gamma. Alpha being the highest priority and Gamma being the least priority. To ensure we don't miss service levels, sometimes we will "lock down" our staff so they can only handle Alpha volume.
What I've done
All classes are built: Day, Interval, Call, Agent.
I've successfully created a day composed of 48, half hour intervals. Each interval has its own staff, average handle time, and forecast volume attributes. There are various methods that I've added to track if the call is owned or completed and when the agent becomes available next.
Simulating the day
The calls are in a list sorted by the time they are presented (ascending). I am looping through this list and finding how many calls are unhandled before or at the time the call is offered. At time (t), there may exist any number of calls in the buffer. This list is sorted by oldest call of highest priority, so calls[0] is the call that needs to be obtained first and foremost:
for call in curr_day.Calls:
    for active_call in [c for c in sorted(curr_day.Calls, key=lambda x: x.Priority, reverse=False) if c.Presented_Time <= call.Presented_Time and c.Completed is False and c.Owned is False]:
Once I get the list of available calls, I then loop through my agents and find a suitable match for the oldest call of highest priority. The agent list is sorted by Busy_Until. At agents[0] is the agent who has the lowest value for Busy_Until and is first to get another call.
for agent in [agt for agt in sorted(curr_day.Staff, key=lambda x: x.Busy_Until, reverse=False)]:
To check if the call is a match for the agent, I have some if statements to validate first.
if active_call.Priority in agent.Lb:
    if agent.Busy_Until <= active_call.Presented_Time:
        agent.assign_call(active_call, active_call.Presented_Time)
        break
    else:
        agent.assign_call(active_call, agent.Busy_Until)
        break
Notes
The act of assigning an alarm to an agent changes the agent's Busy_Until attribute to the time they got the call + average handle time of the call. The call is then tagged as owned, so it can't be seen again by other agents. Agent.Lb is a list of all priorities an agent is open to. If the agent's Busy_Until attribute is less than or equal to the call's presented time, give it to them. If the agent's Busy_Until attribute is greater than the call's presented time, give it to them when they're done.
Issues
This method takes WAY too long to complete. In fact, I didn't see the program complete execution after waiting 10 minutes. We see about 6,000 calls a day, so looping through all of them is very time intensive. I don't see any other way to do this; though, I know there has to be an elegant, efficient way to do so.
Complete code (minus classes)
curr_day = Day()
for call in curr_day.Calls:
    for active_call in [c for c in sorted(curr_day.Calls, key=lambda x: x.Priority, reverse=False) if c.Presented_Time <= call.Presented_Time and c.Completed is False and c.Owned is False]:
        for agent in [agt for agt in sorted(curr_day.Staff, key=lambda x: x.Busy_Until, reverse=False)]:
            if active_call.Priority in agent.Lb:
                if agent.Busy_Until <= active_call.Presented_Time:
                    agent.assign_call(active_call, active_call.Presented_Time)
                    break
                else:
                    agent.assign_call(active_call, agent.Busy_Until)
                    break
To anyone that can provide even a little bit of guidance, I thank you.
I was able to find a solution that satisfies the problem. Because each alarm is part of a bucket and each agent is assigned to a bucket, I just created a dictionary of lists. At dict['Alpha'][0] resides the most important call for alphas, dict['Beta'][0], the most important call for betas, etc.
I created a class for the Buffer. Upon initialization, an empty list is created for each priority. New calls are appended to the list for their priority. Here is the class:
from collections import defaultdict
class Buffer:
    def __init__(self):
        self.Alarms = self.__create()
    def __create(self):
        # one empty list per priority; defaultdict also covers unseen keys
        queue = defaultdict(list)
        for pri in range(30):
            queue[pri] = []
        return queue
    def add(self, alarm_obj):
        self.Alarms[alarm_obj.Priority].append(alarm_obj)
    def get_next_for_agent(self, agent_obj):
        # return the oldest alarm of the first non-empty priority the agent handles
        for pri in agent_obj.Lb:
            if len(self.Alarms[pri]) != 0:
                return self.Alarms[pri].pop(0)
Here is the main code:
curr_day = Day()
alarm_buffer = Buffer()
for interval in curr_day.Intervals:
    for alarm in interval.Alarms:
        alarm_buffer.add(alarm)
        active = interval.get_active_staff(alarm.Presented_Time)
        for agent in active:
            a = alarm_buffer.get_next_for_agent(agent)
            if a is not None:
                if a.Presented_Time <= agent.Busy_Until:
                    obtained = agent.Busy_Until
                else:
                    obtained = a.Presented_Time
                agent.assign_alarm(a, obtained)
                break
curr_day.export_results()
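Re-sorting the staff list on every iteration, as in the question, can also be avoided by keeping the agents in a min-heap keyed on the time they become free. The following is a simplified sketch rather than the poster's simulation: it ignores priorities and the Lb filter, assumes one fixed handle time for every call, and the names assign_calls, agent_count and handle_time are hypothetical.

```python
import heapq


def assign_calls(presented_times, agent_count, handle_time):
    """Give each call to the agent who frees up first.

    presented_times must be sorted ascending.
    Returns a list of (presented_time, agent_id, start_time) tuples.
    """
    # heap of (busy_until, agent_id); the earliest-free agent is always at index 0
    free_at = [(0, agent_id) for agent_id in range(agent_count)]
    heapq.heapify(free_at)
    schedule = []
    for presented in presented_times:
        busy_until, agent_id = heapq.heappop(free_at)
        # start immediately if the agent is idle, otherwise when they finish
        start = max(busy_until, presented)
        schedule.append((presented, agent_id, start))
        heapq.heappush(free_at, (start + handle_time, agent_id))
    return schedule


print(assign_calls([0, 1, 2, 10], agent_count=2, handle_time=5))
# prints [(0, 0, 0), (1, 1, 1), (2, 0, 5), (10, 1, 10)]
```

Each pop and push costs O(log k) for k agents, compared with O(k log k) for sorting the whole staff list once per call.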

Constrained-random assigning tasks to people for a week, non-consecutively

I would like to randomly assign tasks from a list of 8 tasks to 4 people every day of a week, under these conditions:
everyone gets exactly 2 tasks per day (order doesn't matter) AND
the same task can't be assigned to the same person on 2 or more consecutive days (a person can't get the same task the next day) AND
the same task can't be assigned to more than one person on the same day AND
a person can't do the same task more than 2 times in a week
Here is my code for a single day. How can I extend it to the 7 days of a week while enforcing the above conditions?
import random
tasks = ['task1','task2','task3','task4','task5','task6','task7','task8']
people = ['person1', 'person2', 'person3', 'person4']
random.shuffle(tasks)
# group the shuffled tasks into pairs; list() is needed on Python 3, where zip is lazy
tasks = list(zip(*[iter(tasks)]*2))
for n, person in enumerate(people):
    print(person, tasks[n])
There are lots of ways to approach this, but one would be to just allocate them at random, check if they meet your rules, and if not then reallocate them.
I would probably do this by defining a couple of functions that you can use to check if any given allocation matches your rules.
For example:
import random
def no_consecutives(allocation):
    """Check that there are no consecutive list items"""
    for i in range(1, len(allocation)):
        if allocation[i] == allocation[i-1]:
            return False
    return True
def no_more_than_twice(allocation):
    """Check that no list item appears more than twice"""
    for i in allocation:
        if allocation.count(i) > 2:
            return False
    return True
tasks = ['task1','task2','task3','task4','task5','task6','task7','task8']
people = ['person1', 'person2', 'person3', 'person4']
answer = {}
i = 0
while i < 4:
    allocations = random.choices(tasks, k=7)
    if no_consecutives(allocations) and no_more_than_twice(allocations):
        answer[people[i]] = allocations
        i += 1
print(answer)
Edit: Now that I've shown you how to do it, and you've edited your question to change the conditions, I'll let you take it from here.
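One way to extend the allocate-check-reallocate idea to all four conditions in the question is to build the week one day at a time, reshuffling a day until it is valid and starting the whole week over if a day appears to be a dead end. This is only a sketch under those assumptions; the names try_week and make_week and the retry limit tries_per_day are hypothetical.

```python
import random
from collections import Counter


def try_week(people, tasks, days=7, per_day=2, max_per_week=2, tries_per_day=500):
    """Return a list of {person: (task, task)} dicts, or None if a day got stuck."""
    counts = {person: Counter() for person in people}
    week = []
    for _day in range(days):
        previous = week[-1] if week else {}
        for _attempt in range(tries_per_day):
            shuffled = random.sample(tasks, len(tasks))
            # slicing a permutation gives every task to exactly one person
            day = {person: tuple(shuffled[i * per_day:(i + 1) * per_day])
                   for i, person in enumerate(people)}
            valid = all(
                task not in previous.get(person, ())          # not the same task as yesterday
                and counts[person][task] < max_per_week       # at most twice per week
                for person, pair in day.items() for task in pair
            )
            if valid:
                break
        else:
            return None  # no valid shuffle found for this day
        for person, pair in day.items():
            counts[person].update(pair)
        week.append(day)
    return week


def make_week(people, tasks):
    """Retry whole weeks until one satisfies every condition."""
    while True:
        week = try_week(people, tasks)
        if week is not None:
            return week


people = ['person1', 'person2', 'person3', 'person4']
tasks = ['task%d' % i for i in range(1, 9)]
for day_number, day in enumerate(make_week(people, tasks), start=1):
    print(day_number, day)
```

The restart matters because early random choices can leave the last days of the week with no valid allocation at all, in which case retrying that day alone would never succeed.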

Create a list of actions based on probability of each action

I have 3 actions and their probabilities:
walk:5
talk:1
run:2
I need to do ALL of them.
But the most important action should be (but need not be) executed first, and ONLY ONCE.
So walk is 5 times more likely to be executed before talk.
Also, I can't run if walk has not been executed before.
My current solution is expensive, but it works. I build a list in the order the actions are rolled, so the first action rolled ends up at the beginning of the list:
import random
actions_poll = ['walk']*5 + ['run']*2 + ['talk']*1
flow_control = []
while len(flow_control) != 3:
    # roll one action from the action pool
    action = random.choice(actions_poll)
    if action not in flow_control:
        # run is only allowed once walk is already in flow_control
        if action == 'run' and 'walk' not in flow_control:
            continue
        flow_control.append(action)
I suspect that building actions_poll by repeating each action according to its weight is not the best way. Also, the retry loop can run for a long time when walk is 5000 and talk is 1.
Suggestions?
You could use np.random.choice to sample the actions according to their probabilities (using the parameter p=) and a set to check that all actions have been sampled, like this:
import numpy as np
actions = ['walk', 'talk','run']
weights = np.array([5,1,2])
flows_control = set()
flows_decision = []
while len(flows_control) < len(actions):
    action = np.random.choice(actions, p=weights/weights.sum(), size=1)[0]
    flows_control.add(action)
    flows_decision.append(action)
If you want your flows_decision to be a list of unique decisions, simply do:
np.random.choice(actions, p=weights/weights.sum(), size=len(actions), replace=False)
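Neither snippet in this answer enforces the question's rule that run cannot happen before walk. A minimal sketch that handles it with the standard library draws one action at a time, without replacement, from the actions that are currently allowed. The names weighted_order and requires are hypothetical, and a single prerequisite per action is assumed.

```python
import random


def weighted_order(weights, requires=None):
    """Order all actions by weighted draws without replacement.

    weights:  {action: weight}
    requires: {action: prerequisite action that must come earlier}
    """
    requires = requires or {}
    remaining = dict(weights)
    order = []
    while remaining:
        # an action is eligible if it has no prerequisite or the prerequisite already ran
        eligible = [action for action in remaining
                    if action not in requires or requires[action] in order]
        if not eligible:
            raise ValueError("prerequisites can never be satisfied")
        picked = random.choices(
            eligible, weights=[remaining[action] for action in eligible], k=1
        )[0]
        order.append(picked)
        del remaining[picked]
    return order


print(weighted_order({'walk': 5, 'talk': 1, 'run': 2}, requires={'run': 'walk'}))
```

The loop runs exactly once per action, so the running time does not depend on how lopsided the weights are (for example walk at 5000 and talk at 1).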

Building a greedy task scheduler - Python algorithm

Working on the following Leetcode problem: https://leetcode.com/problems/task-scheduler/
Given a char array representing tasks CPU need to do. It contains
capital letters A to Z where different letters represent different
tasks. Tasks could be done without original order. Each task could be
done in one interval. For each interval, CPU could finish one task or
just be idle.
However, there is a non-negative cooling interval n that means between
two same tasks, there must be at least n intervals that CPU are doing
different tasks or just be idle.
You need to return the least number of intervals the CPU will take to
finish all the given tasks.
Example:
Input: tasks = ["A","A","A","B","B","B"], n = 2
Output: 8
Explanation: A -> B -> idle -> A -> B -> idle -> A -> B.
I've written code that passes the majority of the Leetcode tests cases, but is failing on a very large input. Here's my code:
import heapq
from collections import Counter
class Solution(object):
    def leastInterval(self, tasks, n):
        CLOCK = 0
        if not tasks:
            return len(tasks)
        counts = Counter(tasks)
        unvisited_tasks = counts.most_common()[::-1]
        starting_task, _ = unvisited_tasks.pop()
        queue = [[0, starting_task]]
        while queue or unvisited_tasks:
            while queue and CLOCK >= queue[0][0]:
                _, task = heapq.heappop(queue)
                counts[task] -= 1
                if counts[task] > 0:
                    heapq.heappush(queue, [CLOCK + 1 + n, task])
                CLOCK += 1
            if unvisited_tasks:
                t, _ = unvisited_tasks.pop()
                heapq.heappush(queue, [0, t])
            else:
                # must go idle
                if queue:
                    CLOCK += 1
        return CLOCK
Here's the (large) input case:
tasks = ["G","C","A","H","A","G","G","F","G","J","H","C","A","G","E","A","H","E","F","D","B","D","H","H","E","G","F","B","C","G","F","H","J","F","A","C","G","D","I","J","A","G","D","F","B","F","H","I","G","J","G","H","F","E","H","J","C","E","H","F","C","E","F","H","H","I","G","A","G","D","C","B","I","D","B","C","J","I","B","G","C","H","D","I","A","B","A","J","C","E","B","F","B","J","J","D","D","H","I","I","B","A","E","H","J","J","A","J","E","H","G","B","F","C","H","C","B","J","B","A","H","B","D","I","F","A","E","J","H","C","E","G","F","G","B","G","C","G","A","H","E","F","H","F","C","G","B","I","E","B","J","D","B","B","G","C","A","J","B","J","J","F","J","C","A","G","J","E","G","J","C","D","D","A","I","A","J","F","H","J","D","D","D","C","E","D","D","F","B","A","J","D","I","H","B","A","F","E","B","J","A","H","D","E","I","B","H","C","C","C","G","C","B","E","A","G","H","H","A","I","A","B","A","D","A","I","E","C","C","D","A","B","H","D","E","C","A","H","B","I","A","B","E","H","C","B","A","D","H","E","J","B","J","A","B","G","J","J","F","F","H","I","A","H","F","C","H","D","H","C","C","E","I","G","J","H","D","E","I","J","C","C","H","J","C","G","I","E","D","E","H","J","A","H","D","A","B","F","I","F","J","J","H","D","I","C","G","J","C","C","D","B","E","B","E","B","G","B","A","C","F","E","H","B","D","C","H","F","A","I","A","E","J","F","A","E","B","I","G","H","D","B","F","D","B","I","B","E","D","I","D","F","A","E","H","B","I","G","F","D","E","B","E","C","C","C","J","J","C","H","I","B","H","F","H","F","D","J","D","D","H","H","C","D","A","J","D","F","D","G","B","I","F","J","J","C","C","I","F","G","F","C","E","G","E","F","D","A","I","I","H","G","H","H","A","J","D","J","G","F","G","E","E","A","H","B","G","A","J","J","E","I","H","A","G","E","C","D","I","B","E","A","G","A","C","E","B","J","C","B","A","D","J","E","J","I","F","F","C","B","I","H","C","F","B","C","G","D","A","A","B","F","C","D","B","I","I","H","H","J","A","F","J","F","J","F","H","G","F","D","J","G","I","E","B","C","G","I"
,"F","F","J","H","H","G","A","A","J","C","G","F","B","A","A","E","E","A","E","I","G","F","D","B","I","F","A","B","J","F","F","J","B","F","J","F","J","F","I","E","J","H","D","G","G","D","F","G","B","J","F","J","A","J","E","G","H","I","E","G","D","I","B","D","J","A","A","G","A","I","I","A","A","I","I","H","E","C","A","G","I","F","F","C","D","J","J","I","A","A","F","C","J","G","C","C","H","E","A","H","F","B","J","G","I","A","A","H","G","B","E","G","D","I","C","G","J","C","C","I","H","B","D","J","H","B","J","H","B","F","J","E","J","A","G","H","B","E","H","B","F","F","H","E","B","E","G","H","J","G","J","B","H","C","H","A","A","B","E","I","H","B","I","D","J","J","C","D","G","I","J","G","J","D","F","J","E","F","D","E","B","D","B","C","B","B","C","C","I","F","D","E","I","G","G","I","B","H","G","J","A","A","H","I","I","H","A","I","F","C","D","A","C","G","E","G","E","E","H","D","C","G","D","I","A","G","G","D","A","H","H","I","F","E","I","A","D","H","B","B","G","I","C","G","B","I","I","D","F","F","C","C","A","I","E","A","E","J","A","H","C","D","A","C","B","G","H","G","J","G","I","H","B","A","C","H","I","D","D","C","F","G","B","H","E","B","B","H","C","B","G","G","C","F","B","E","J","B","B","I","D","H","D","I","I","A","A","H","G","F","B","J","F","D","E","G","F","A","G","G","D","A","B","B","B","J","A","F","H","H","D","C","J","I","A","H","G","C","J","I","F","J","C","A","E","C","H","J","H","H","F","G","E","A","C","F","J","H","D","G","G","D","D","C","B","H","B","C","E","F","B","D","J","H","J","J","J","A","F","F","D","E","F","C","I","B","H","H","D","E","A","I","A","B","F","G","F","F","I","E","E","G","A","I","D","F","C","H","E","C","G","H","F","F","H","J","H","G","A","E","H","B","G","G","D","D","D","F","I","A","F","F","D","E","H","J","E","D","D","A","J","F","E","E","E","F","I","D","A","F","F","J","E","I","J","D","D","G","A","C","G","G","I","E","G","E","H","E","D","E","J","B","G","I","J","C","H","C","C","A","A","B","C","G","B","D","I","D","E","H","J","J","B","F","E","J","H","H","I","G"
,"B","D"]
n = 1
My code is outputting an interval count of 1002, and the correct answer is 1000. Because the input size is so large, I'm having trouble debugging by hand on where this is going wrong.
My algorithm essentially does the following:
Build a mapping of character to number of occurrences
Start with the task that occurs the largest number of times.
When you visit a task, enqueue the next task to be CLOCK + interval iterations later, because my premise is that you want to visit a task as soon as you're able to do so.
If you can't visit an already-visited task, enqueue a new one, and do so without incrementing the clock.
If you have elements in the queue, but not enough time has passed, increment the clock.
At the end, the CLOCK variable describes how long (in other words, how many "intervals") passed before you're able to run all tasks.
Can someone spot the bug in my logic?
Consider a case where the delay n=1, and you have a task distribution like so, for which the least number of cycles is just the length of the list (the tasks could be run like "ABCABC...D"):
{"A": 100, "B": 100, "C": 99, "D": 1 } # { "task": <# of occurrences>, ...
Using your algorithm, you would process all the cases of "A" and "B" first, since you want to move onto the next task in the same type as soon as possible, without considering other task types. After processing those two, you're left with:
{"C": 99, "D": 1}
which results in at least 96 idle cycles.
To fix this, the ideal task configuration would be something like a round robin of sorts.
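One common way to realize the round-robin idea is to work in rounds of n + 1 slots and, within each round, run the task types that have the most occurrences left. The following is a sketch of that approach, not the poster's code; the function name least_interval is hypothetical.

```python
import heapq
from collections import Counter


def least_interval(tasks, n):
    """Count intervals by filling rounds of n + 1 slots with the most frequent tasks."""
    # heapq is a min-heap, so store negated counts to pop the largest first
    heap = [-count for count in Counter(tasks).values()]
    heapq.heapify(heap)
    clock = 0
    while heap:
        pending = []  # task types used this round that still have occurrences left
        for _ in range(n + 1):
            if heap:
                count = heapq.heappop(heap) + 1  # one occurrence done
                if count < 0:
                    pending.append(count)
            clock += 1  # the slot is either a task or an idle
            if not heap and not pending:
                break  # everything is finished, no trailing idles
        for count in pending:
            heapq.heappush(heap, count)
    return clock


print(least_interval(["A", "A", "A", "B", "B", "B"], 2))  # prints 8
print(least_interval(["A"] * 100 + ["B"] * 100 + ["C"] * 99 + ["D"], 1))  # prints 300
```

Because a task type used in a round is held in pending until the round ends, two occurrences of the same task are always at least n slots apart, and picking the largest counts first keeps the remaining counts balanced.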

How to find top 10 elements in MapReduce

I am trying to write a Python MapReduce job on some datasets I have to find certain statistics. This is an example of the input data and the form it comes in:
exchange, stock_symbol, date, stock_price_open,stock_price_high,stock_price_low, stock_price_close, stock_volume,stock_price_adj_close.
I need to find the top 10 days on which the most stock was traded, which is calculated as stock_price_close * stock_volume.
Here is the code I have right now:
from mrjob.job import MRJob
class MapReduce(MRJob):
    def mapper(self, _, line):
        values = line.split(',')
        amount = int(float(values[6]) * float(values[7]))
        code = values[1]
        date = values[2]
        # renamed from "list" to avoid shadowing the built-in
        record = (code, date, amount)
        yield (None, record)
if __name__ == '__main__':
    MapReduce.run()
However, I'm having trouble implementing a Reducer method for this job, and I'm not sure how the Reducer would find only the top 10 elements. Can anyone help me out here?
Make this a multi-step job. The first step produces the total amount traded per day. The second step takes those totals, sorts them, and returns the top 10.
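The logic of those two steps can be sketched in plain Python, independent of mrjob, to show what each reducer would need to compute. The function names daily_totals and top_days are hypothetical, and the sketch assumes every input line is a data row in the column order given in the question, with no header line.

```python
import heapq
from collections import defaultdict


def daily_totals(lines):
    """Step 1: sum stock_price_close * stock_volume per date."""
    totals = defaultdict(float)
    for line in lines:
        values = [value.strip() for value in line.split(',')]
        date = values[2]
        totals[date] += float(values[6]) * float(values[7])
    return totals


def top_days(totals, k=10):
    """Step 2: keep the k dates with the largest totals, largest first."""
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


sample = [
    "NYSE,AAA,2010-01-01,1,1,1,2.0,100,2.0",
    "NYSE,BBB,2010-01-01,1,1,1,3.0,10,3.0",
    "NYSE,AAA,2010-01-02,1,1,1,1.0,50,1.0",
]
print(top_days(daily_totals(sample), k=1))  # prints [('2010-01-01', 230.0)]
```

In a MapReduce setting, the first function corresponds to a mapper keyed on date plus a summing reducer, and the second to a final reducer that receives all the per-day totals under a single key.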
