Persons Matching program or algorithm - python

At a networking event there is a pool of people. Each person should meet every other person in a 5-person setting, for 10 minutes at a time.
For example, if the pool has 60 people, each person meets 4 others at the same table for 10 minutes. After enough rounds, each person should have met all 59 others.
Is there a ready-made algorithm or program in Python or Excel that takes the pool as input and outputs a list of 5-person groups satisfying the condition: each person has met the whole pool with minimal repetitions?
Thanks

There is no general algorithm for this.
As https://www.dmgordon.org/cover/ explains, this is called a covering design. Optimal covering designs for the problem you are interested in, 5-element sets covering all 2-element sets, are known for many values of the number of vertices, v. See https://ljcr.dmgordon.org/cover.php?vopt=%3C%3D&v=100&kopt=%3D&k=5&topt=%3D&t=2&sizeopt=%3D&size=&creator=&method=&time=A&submit=search for a list of them and when they were discovered. The variety of sources for that list, including papers from within the last 20 years, should demonstrate that this is a hard problem in general.
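If an optimal design is not required, a greedy heuristic can still produce a valid (but usually larger than optimal) set of groups. The following is a minimal sketch under that assumption; the function name `greedy_cover` is made up for illustration, and it only lists groups, it does not arrange them into simultaneous rounds of tables.

```python
from itertools import combinations

def greedy_cover(n, k=5):
    """Greedy, non-optimal heuristic: add k-person groups until every pair has met.

    People are numbered 0..n-1; requires n >= k.
    """
    uncovered = {frozenset(p) for p in combinations(range(n), 2)}
    groups = []
    while uncovered:
        # Seed the group with any pair that has not met yet.
        group = list(next(iter(uncovered)))
        while len(group) < k:
            # Add the person who would meet the most new people in this group.
            best = max(
                (p for p in range(n) if p not in group),
                key=lambda p: sum(frozenset((p, q)) in uncovered for q in group),
            )
            group.append(best)
        groups.append(group)
        uncovered -= {frozenset(p) for p in combinations(group, 2)}
    return groups
```

Calling `greedy_cover(60)` returns a list of 5-person groups in which every pair appears at least once; comparing its length with the sizes listed on the covering-design site shows how far the heuristic is from the best known designs.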

Related

Recursive function with growing number of calls

So the idea is that I'm using the Steam API to get the friend list of a given user, to gather some IDs for data analysis. Each time I get the friend list of a user, I want to get 5 friends of each of his 5 friends. So first I get 5 friends of the first user, and then 5 friends of each of those 5 friends, so it goes 5 -> 25 -> 125 and so on up to some point, for example 6 levels, to get 15,625 IDs. The question is how to do it, because I don't really know how to make this work. I'm not so good at recursion.
Basically, you can imagine a person as a node that has n neighboring nodes (= friends). You start at one node (= yourself) and move on to your neighbor nodes (= friends), then you move on to their neighboring nodes, and so on, while always keeping track of which nodes you have already visited. This way you gradually move away from your start node, until the whole network is explored (you don't want that in your case) or until a certain distance (= nodes between you and your friends) is reached, for example up to the 6th level as you've described in your post.
The network of friends forms a graph data structure, and what you want to do is a well-known graph algorithm called breadth-first search. In the Wikipedia article you will find some pseudocode, and if you google for breadth-first search you will find many, many resources and implementations in any language you need.
By the way, no need for recursion here, so don't use it.
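A minimal sketch of such a depth-limited breadth-first search follows. The `get_friends` argument is a hypothetical callable standing in for whatever code wraps the Steam API call; it is assumed to take a user ID and return a list of friend IDs.

```python
from collections import deque

def collect_ids(start_id, get_friends, max_depth=6, per_user=5):
    """Breadth-first walk of the friend graph, up to max_depth levels from start_id."""
    visited = {start_id}
    queue = deque([(start_id, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested level
        # Only the first per_user friends are followed (5 in the question).
        for friend in get_friends(user)[:per_user]:
            if friend not in visited:
                visited.add(friend)
                queue.append((friend, depth + 1))
    return visited
```

Because already visited IDs are skipped, the result can contain fewer than 5 + 25 + ... IDs when friend lists overlap.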

python create random list of numbers along with a fixed increment

I have a long-running (several hours) script that periodically sends queries to a server. The server is very sensitive to load, so the queries are sparse (not more than 1 every 3 minutes).
The server will always take exactly 10 minutes to process the query. So I can check the result of query 1 any time after 10 minutes of sending it.
So there are two types of operations, "sending query" and "checking result of query". I want all operations to happen at random intervals (subject to the constraint that there are at least 3 minutes between adjacent operations).
Following the advice in this answer (https://stackoverflow.com/a/51918697/10690958), I can generate a time series of integers such that there is a gap of at least 3 between them. Let's call this series 1.
I can also generate a similar time series of status-checking queries (3 minutes between them). Let's call this series 2.
Now series 1 is randomly spaced. Series 2 is also randomly spaced. But there is a correlation between series 1 and 2, i.e. "response time" = "query time" + 10 minutes.
Thus the union of series 1 and 2 won't be random. Furthermore, there is a (very small) possibility of collision. For example, query 2 might be going out exactly when one is checking the result of query 1.
Is there a way to make the union of the two sequences perfectly random as well, and to avoid the possibility of collisions? Ideally all traffic to the server (whether query or status check) should be at perfectly random intervals.
I realize that the title is not very descriptive, but could not figure out a better way to describe the situation. Please edit if you think you have a better description.
For example:
query_sequence=set([3,8,12,21,37])
check_result_sequence=set([13,18,22,31,47])
server_traffic=query_sequence.union(check_result_sequence)
But their union (server_traffic) is not random, since
check_result_sequence={t+10 for t in query_sequence}
P.S.:
Generating time points with finer granularity might help reduce the probability of collisions (as mentioned in the comment). As regards the randomness of the union of the two sequences, I don't see any satisfactory solution. What I finally decided to do was
check_result_sequence={t+10+(5*random.random()) for t in query_sequence}
This adds a random "jitter" to the response sequence, and so should help reduce the correlation between the two sequences.
1) I hardly see the necessity to randomize the interval between the requests
2) You could use a single list that represents the available moments to submit a request:
server_traffic=set([3,8,12,15,19,23,26,30,34,40])
for x in range(4):
    send_query(server_traffic)
while(True):
    send_result_request(server_traffic)
    send_query(server_traffic)
Then every time you decide if you want to send a query, or to check the result, with your own policy. This should make everything easier
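One way to read the single-list idea is to generate the random time slots first and only afterwards decide what each slot is used for. The sketch below assumes that reading; `random_slots`, `assign_operations`, the exponential extra wait and the coin-flip policy are all illustrative choices, not part of the original answer.

```python
import random

def random_slots(n, min_gap=3.0, mean_extra=2.0, seed=None):
    """Return n increasing times (minutes) with at least min_gap between neighbours."""
    rng = random.Random(seed)
    t, slots = 0.0, []
    for _ in range(n):
        # Fixed minimum gap plus a random, exponentially distributed extra wait.
        t += min_gap + rng.expovariate(1.0 / mean_extra)
        slots.append(t)
    return slots

def assign_operations(slots, n_queries, processing=10.0, seed=None):
    """Label each slot 'query', 'check' or 'idle' without moving any slot."""
    rng = random.Random(seed)
    pending = []  # send times of queries whose result has not been checked yet
    sent = 0
    schedule = []
    for t in slots:
        ready = [q for q in pending if t - q >= processing]
        can_query = sent < n_queries
        if ready and (not can_query or rng.random() < 0.5):
            pending.remove(ready[0])  # check the oldest finished query
            schedule.append((t, "check"))
        elif can_query:
            pending.append(t)
            sent += 1
            schedule.append((t, "query"))
        else:
            schedule.append((t, "idle"))  # nothing to send, nothing ready yet
    return schedule
```

Since the slot times are drawn before any operation is assigned, the spacing of the server traffic does not depend on the 10-minute processing time, and two operations can never share a slot.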

Create balanced tennis rounds

We are a group of 20 people and we like to go play 2 vs 2 tennis matches. Each of us plays one match each round and we do 5 rounds in total, so everyone plays 5 matches. Matches have two restrictions:
Everyone has a different level (from 1 to 5), so the matches must be balanced: two players with levels 5 and 5 shouldn't be matched against two players of level 1. So the difference in level between the two teams must be lower than or equal to 1.5.
E.g.: level 1.5 and level 2 vs level 2 and level 2.5. The difference in level between the teams is 1, so the match is accepted.
If two players play together in one match, they must not play together again in the following rounds.
I managed to create a python script that does the specified above, but it takes about 20 minutes to finish depending on the level of the people :/. What I do is shuffle the list with every one in it, break it into 5 lists of 4 people, check if conditions are satisfied and repeat for every round.
I tried modeling the problem to solve it with linear programming (LP) but I don't know which is my function to optimize to begin with... Any ideas on how to do this with or without LP?
Thanks in advance!
You could use a dummy objective or even try to minimize the max of the difference in levels.
My MIP model is not completely trivial, but it solves quite fast (about a second or so using a commercial solver).
The results look ok at first sight:
I assumed that two players cannot be on the same team more than once; the restriction applies to teammates, not to everyone in the same game. That is, in my model you can play against another player more than once.
A more complex example can be found here.
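Whichever solver or search is used, the two restrictions can be written down as a small validity check. This is only a sketch: the function names are made up, the team level is taken as the sum of the two players' levels (as in the question's example), and the partner rule follows the same-team interpretation from the answer above.

```python
def match_is_balanced(team_a, team_b, levels, max_diff=1.5):
    """Teams are balanced when their summed levels differ by at most max_diff."""
    diff = sum(levels[p] for p in team_a) - sum(levels[p] for p in team_b)
    return abs(diff) <= max_diff

def round_is_valid(matches, levels, past_partners, max_diff=1.5):
    """matches: list of (team_a, team_b) pairs, each team a tuple of two players.

    past_partners: set of frozensets holding every pair that already shared a team.
    """
    for team_a, team_b in matches:
        if not match_is_balanced(team_a, team_b, levels, max_diff):
            return False
        if frozenset(team_a) in past_partners or frozenset(team_b) in past_partners:
            return False
    return True
```

After a round is accepted, adding `frozenset(team)` for each of its teams to `past_partners` carries the partner restriction over to the next round.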

Many to one relationship on a junction table (many to many) or a custom field type?

Currently, I have two tables, Exercise and WorkoutPlan. These will have a many to many relationship.
Exercise
- ID
- Name
...
WorkoutPlan
- ID
- Name
- Exercises (Many to Many with Exercise through WorkoutPlanExercise)
...
In this many-to-many relationship table, I need to store information about a number of sets, such as their min_rest, max_rest, min_repetitions and max_repetitions.
Where I'm stuck is figuring out the best way to do this. My first solution is to have another table (WorkoutPlanExerciseSet) that has a many-to-one relationship with the many-to-many table (WorkoutPlanExercise), as shown below.
WorkoutPlanExercise
- ID
- ExerciseID
- WorkoutPlanID
- Sets (One to Many with WorkoutPlanExerciseSet)
WorkoutPlanExerciseSet
- WorkoutPlanExerciseID
- MinRepititions
- MaxRepititions
- MinRest
- MaxRest
My second solution is to store all the information about the exercise set, in a single row in the many to many relationship table (WorkoutPlanExercise). For example:
WorkoutPlanExercise
ID  ExerciseID  WorkoutPlanID  Sets  Repititions          Rest
1   1           1              3     10-12, 10-10, 12-12  90-120, 60-90, 30-30
To note, both the rest time and number of repetitions, can be a range or a single number. For the second solution, I think I would create a custom Django Form Field.
Which is better? Is the former bad database design? Is the latter bad application design?
If it makes any difference, I wish to be able to easily display the information in a user friendly manner, such as:
Example Workout Plan
Exercise   Sets   Repetitions   Rest
Pull Ups   3      10 - 12       90 - 120
                  8 - 10        30
                  6 - 8         30
I guess second. Read about custom through field here.
UPDATE: see comments.
UPDATE 2, 3:
Actually, both are very nice. It depends on how you want to process the data stored in the Repetitions and Rest fields. If you want to do heavy manipulations and calculations with the data, e.g. calculate the total rest time for a WorkoutPlan or the total number of repetitions for an Exercise, then using the former approach will be slightly easier.
UPDATE 4:
Storing data of the same kind as CSV in one field is a bad idea. You will have a lot of fun if you have to change the schema in the future. Use the first approach. Also link1, link2.
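To illustrate why one row per set is easier to work with, here is a plain-Python stand-in for the first approach (not Django model code). The class names are illustrative, and treating the rest values as seconds is an assumption; the question does not state a unit.

```python
from dataclasses import dataclass, field

@dataclass
class ExerciseSet:
    """Stand-in for one WorkoutPlanExerciseSet row."""
    min_repetitions: int
    max_repetitions: int
    min_rest: int  # assumed to be seconds
    max_rest: int  # assumed to be seconds

@dataclass
class PlanExercise:
    """Stand-in for one WorkoutPlanExercise row with its related sets."""
    exercise: str
    sets: list = field(default_factory=list)

    def total_rest(self):
        """Return (minimum, maximum) total rest over all sets."""
        return (
            sum(s.min_rest for s in self.sets),
            sum(s.max_rest for s in self.sets),
        )

pull_ups = PlanExercise("Pull Ups", [
    ExerciseSet(10, 12, 90, 120),
    ExerciseSet(8, 10, 30, 30),   # a single number is stored as min == max
    ExerciseSet(6, 8, 30, 30),
])
```

With the values from the display example in the question, `pull_ups.total_rest()` adds up plain integers; the single-row variant would first have to split and parse strings such as "90-120, 60-90, 30-30".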

Algorithm in Python to store and search daily occurrence for thousands of numbered events?

I'm investigating solutions of storing and querying a historical record of event occurrences for a large number of items.
This is the simplified scenario: I'm getting a daily log of 200 000 streetlamps (labeled sl1 to sl200000) which shows if the lamp was operational on the day or not. It does not matter for how long the lamp was in service only that it was on a given calendar day.
Other bits of information are stored for each lamp as well and the beginning of the Python class looks something like this:
class Streetlamp(object):
    """Class for streetlamp record"""
    def __init__(self, **args):
        self.location = args['location']
        self.power = args['power']
        self.inservice = ???
My py-foo is not too great and I would like to avoid a solution which is too greedy on disk/memory storage. So a solution with a dict of (year, month, day) tuples could be one solution, but I'm hoping to get pointers for a more efficient solution.
A record could be stored as a bit stream with each bit representing a day of a year starting with Jan 1. Hence, if a lamp was operational the first three days of 2010, then the record could be:
sl1000_up = {'2010': '11100000000000...', '2011': '11111100100...'}
Searching across year boundaries would need a merge, leap years are a special case, and I'd need to encode/decode a fair bit with this home-grown solution. It doesn't seem quite right. speed-up-bitstring-bit-operations, how-do-i-find-missing-dates-in-a-list and finding-data-gaps-with-bit-masking were interesting postings I came across. I also investigated python-bitstring and did some googling, but nothing seems to really fit.
Additionally I'd like search for 'gaps' to be possible, e.g. 'three or more days out of action' and it is essential that a flagged day can be converted into a real calendar date.
I would appreciate ideas or pointers to possible solutions. To add further detail, it might be of interest that the back-end DB used is ZODB and pure Python objects which can be pickled are preferred.
Create a 2D-array in Numpy:
import numpy as np
nbLamps = 200000
nbDays = 365
# one row per lamp, one column per day, all False initially
arr = np.zeros((nbLamps, nbDays), dtype=bool)
It will be very memory-efficient and you can aggregate easily the days and lamps.
In order to manipulate the days even better, have a look at scikits.timeseries. They will allow you to access the dates with datetime objects.
I'd probably keep the lamps in a dictionary and have each entry hold a list of state changes, where the first element of each item is the time of the change and the second is the value that is valid from that time on.
This way, when you get the next sample you do nothing unless the state changed compared to the last item.
Searching is quick and efficient, as you can use binary search on the times.
Persisting it is also easy, and you can append data to an existing, running system without any problems. You can also keep the state lists themselves in a dictionary, so that lamps with identical histories share one list, to further reduce resource usage.
If you want to search for a gap, you just go over all the items and compare the next and previous times. If you decided to share the state lists through a dictionary, you can do this once per distinct list rather than once per lamp, and get all the lamps that had the same "offline" states in a single iteration, which may sometimes help.
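The state-change list could be sketched as follows. `LampHistory` and its method names are hypothetical, and the sketch assumes one sample per lamp per day, recorded in date order; it uses only plain lists and `datetime.date` objects, which can be pickled.

```python
from bisect import bisect_right
from datetime import date, timedelta

class LampHistory:
    """State-change list: an entry is added only when the state flips."""

    def __init__(self):
        self.days = []    # dates on which the state changed, ascending
        self.states = []  # state valid from the matching date onward

    def record(self, day, in_service):
        """Store the daily sample only if it differs from the last known state."""
        if not self.states or self.states[-1] != in_service:
            self.days.append(day)
            self.states.append(in_service)

    def state_on(self, day):
        """Binary search for the state valid on day (None before the first sample)."""
        i = bisect_right(self.days, day) - 1
        return self.states[i] if i >= 0 else None

    def outages(self, min_days, until):
        """Yield (first_day, last_day) of outages lasting at least min_days.

        until is the last logged day; it closes an outage that is still ongoing.
        """
        for i, (start, state) in enumerate(zip(self.days, self.states)):
            if state:
                continue
            if i + 1 < len(self.days):
                end = self.days[i + 1]
            else:
                end = until + timedelta(days=1)
            if (end - start).days >= min_days:
                yield start, end - timedelta(days=1)
```

A query such as 'three or more days out of action' then becomes `list(history.outages(3, until=last_logged_day))`, and every result is already a real calendar date.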
