I am trying to write a script that pairs up men and women for a secret Santa type event. So I have 2 lists of boys and girls, and want to carry out 2 way matching, but at the moment I can only seem to figure out how to do 1 way matching.
Furthermore, the problem I have is this: in the example below, if Kedrick gets Annabel, then Annabel can't get Kedrick. She has to get someone else from the list.
My current implementation is as follows. How can I extend it to meet the requirements above?
from random import randint

boys = ['Kedrick','Jonathan','Tim','Philip','John','Quincy']
girls = ['Annabel','Janet','Jocelyn','Pamela','Priscilla','Viviana']
matches = []
for i in boys:
    # pick a random girl who has not been matched yet
    rand = randint(0, len(girls) - 1)
    fullname = "{} matched with {}".format(i, girls[rand])
    # remove her so she cannot be picked twice
    del girls[rand]
    matches.append(fullname)
print(matches)
This could probably be done with fewer loops and a lot less code, but here is my solution! I created 2 dicts to store names and their targets. (The dicts could be combined or filled at the same time to cut down on memory use, but with a program this size I don't think you would ever run into that issue.)
import random

boys = ['Kedrick','Jonathan','Tim','Philip','John','Quincy']
girls = ['Annabel','Janet','Jocelyn','Pamela','Priscilla','Viviana']
# one dict per group, mapping each name to the person they give to
matchesBoys = {i: {'to': ''} for i in boys}
matchesGirls = {i: {'to': ''} for i in girls}
# every boy draws a random girl who has not been drawn yet
for name in boys:
    giveTo = girls[random.randint(0, len(girls)-1)]
    girls.remove(giveTo)
    matchesBoys[name]['to'] = giveTo
# every girl draws a random boy who has not been drawn yet
for name in matchesGirls:
    giveTo = boys[random.randint(0, len(boys)-1)]
    boys.remove(giveTo)
    matchesGirls[name]['to'] = giveTo
del boys, girls
for i in matchesBoys:
    print("%s matched with %s" % (i, matchesBoys[i]['to']))
for i in matchesGirls:
    print('%s matched with %s' % (i, matchesGirls[i]['to']))
Shuffle both lists and put them in a ring, with every other element being from the first or second list. Each person gives a gift to the one on their right. Something similar to a list like this:
[Girl, Boy, Girl, Boy, ..., Boy]
The last element gives a gift to the first.
It works under the assumption that both lists have the same number of elements and that there are at least four elements in total; otherwise the problem is unsolvable.
This gives one solution that fulfills your constraints. The general solution to the problem is to find a directed bipartite graph between the sets where each vertex has exactly two edges, one incoming and one outgoing. Perhaps the solution to that problem also always creates a ring?
This is an implementation that creates a circle of alternating boys and girls. See @Emil Vickstom's answer for an explanation of the idea.
from random import shuffle
boys = ['Kedrick','Jonathan','Tim','Philip','John','Quincy'];
girls = ['Annabel','Janet','Jocelyn','Pamela','Priscilla','Viviana'];
shuffle(boys)
shuffle(girls)
circle = [person for pair in zip(boys, girls) for person in pair]
print(' -> '.join(circle + circle[:1]))
Output:
Tim -> Priscilla -> Quincy -> Annabel -> John -> Janet -> Kedrick ->
Jocelyn -> Philip -> Pamela -> Jonathan -> Viviana -> Tim
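If you need explicit giver/receiver pairs rather than a printed chain, the same ring can be turned into a dictionary. The sketch below is only an illustration of the idea above: the helper name secret_santa_ring is made up for this example, and it assumes all names are unique and both lists have the same length.

```python
from random import shuffle

def secret_santa_ring(boys, girls):
    """Return {giver: receiver} built from a shuffled boy/girl ring (hypothetical helper)."""
    if len(boys) != len(girls) or len(boys) < 2:
        raise ValueError("need equally sized lists with at least two names each")
    boys, girls = boys[:], girls[:]   # copy so the caller's lists stay untouched
    shuffle(boys)
    shuffle(girls)
    # interleave: boy, girl, boy, girl, ...
    circle = [person for pair in zip(boys, girls) for person in pair]
    # each person gives to the next one; the last wraps around to the first
    return {giver: circle[(i + 1) % len(circle)] for i, giver in enumerate(circle)}

boys = ['Kedrick', 'Jonathan', 'Tim', 'Philip', 'John', 'Quincy']
girls = ['Annabel', 'Janet', 'Jocelyn', 'Pamela', 'Priscilla', 'Viviana']
pairs = secret_santa_ring(boys, girls)
for giver, receiver in pairs.items():
    print("{} gives to {}".format(giver, receiver))
```

Because the ring has at least four people, nobody's receiver gives back to them, and every boy gives to a girl and vice versa.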
Problem:
Once upon a day, Mary bought a one-way ticket from somewhere to somewhere with some flight transfers.
For example: SFO->DFW DFW->JFK JFK->MIA MIA->ORD.
Obviously, transfer flights at a city twice or more doesn't make any sense. So Mary will not do that.
Unfortunately, after she received the tickets, she messed up the tickets and she forgot the order of the ticket.
Help Mary rearrange the tickets to make the tickets in correct order.
Input:
The first line contains the number of test cases T, after which T cases follow.
Each case starts with an integer N, followed by N flight tickets.
Each ticket takes 2 lines, containing the source and the destination of that flight.
Output:
For each test case, output one line containing "Case #x: itinerary", where x is the test case number (starting from 1) and the itinerary is a sorted list of flight tickets that represent the actual itinerary.
Each flight segment in the itinerary should be outputted as pair of source-destination airport codes.
Sample Input:
2
1
SFO
DFW
4
MIA
ORD
DFW
JFK
SFO
DFW
JFK
MIA
Sample Output:
Case #1: SFO-DFW
Case #2: SFO-DFW DFW-JFK JFK-MIA MIA-ORD
My question:
I am a beginner in the field of competitive programming. My question is how to interpret the given input in this case. How did Googlers program this input? When I write a function with a Python array as its argument, will this argument be in a ready-to-use array format or will I need to deal with the above mentioned T and N numbers in the input and then arrange airport strings in an array format to make it ready to be passed in the function's argument?
I have looked at the following official Google Kickstart Python solution to this problem and was confused by how they simply pass the ticket_list argument to the function. Don't they need to strip the numbers T and N from the input and then arrange the airport strings into an array, as I have explained above?
Also, I could not understand how the attributes first and second can simply appear when no class has been defined. But I think this should be another question...
def print_itinerary(ticket_list):
    arrival_map = {}
    destination_map = {}
    for ticket in ticket_list:
        arrival_map[ticket.second] += 1
        destination_map[ticket.first] += 1
    current = FindStart(arrival_map)
    while current in destination_map:
        next = destination_map[current]
        print current + "-" + next
        current = next
You need to read the data from standard input and write the results to standard output yourself.
Sample code for reading from standard input and writing to standard output can be found in the coding section of the FAQ on the KickStart Web site.
If you write the solution to this problem in python, you can get T and N as follows.
T = int(input())
for t in range(1, T + 1):
    N = int(input())
    ...
Then if you want to get the source and destination of the flight ticket as a list, you can use the same input method to get them in the list.
ticket_list = [[input(), input()] for _ in range(N)]
# [['MIA', 'ORD'], ['DFW', 'JFK'], ['SFO', 'DFW'], ['JFK', 'MIA']]
If you want to use first and second, try a namedtuple.
from collections import namedtuple
Pair = namedtuple('Pair', ['first', 'second'])
ticket_list = [Pair(input(), input()) for _ in range(N)]
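Putting the pieces together, one possible end-to-end sketch looks like this. It is not the official solution: order_tickets and solve are names invented for this example, and the code assumes the tickets form a single path with no city visited twice, as the problem statement promises. The input is passed in as a string so the parsing can be tried without a terminal.

```python
from collections import namedtuple

Pair = namedtuple('Pair', ['first', 'second'])

def order_tickets(ticket_list):
    """Return the tickets in travel order (assumes one simple path)."""
    next_stop = {t.first: t.second for t in ticket_list}
    arrivals = set(t.second for t in ticket_list)
    # the starting city is the only source that is never a destination
    current = next(t.first for t in ticket_list if t.first not in arrivals)
    itinerary = []
    while current in next_stop:
        itinerary.append(Pair(current, next_stop[current]))
        current = next_stop[current]
    return itinerary

def solve(text):
    """Parse the whole input text and return the list of output lines."""
    tokens = iter(text.split())
    T = int(next(tokens))
    lines = []
    for case in range(1, T + 1):
        N = int(next(tokens))
        tickets = [Pair(next(tokens), next(tokens)) for _ in range(N)]
        legs = " ".join("{}-{}".format(t.first, t.second)
                        for t in order_tickets(tickets))
        lines.append("Case #{}: {}".format(case, legs))
    return lines

sample = """2
1
SFO
DFW
4
MIA
ORD
DFW
JFK
SFO
DFW
JFK
MIA
"""
print("\n".join(solve(sample)))
```

In an actual submission you would pass the real input instead of the sample string, for example solve(sys.stdin.read()) after importing sys.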
Dr. Smith was killed in the studio with a knife by one of his heirs! Create a script to find the murderer! Make sure to show your answer.
The following people are Smith's heirs: Aiden, Tori, Lucas, Isabelle.
The following people were in the studio: Lucas, Natalie, Tori.
The following people own a knife: Isabelle, Tori, Natalie.
My code:
heirs = ["Aiden", "Tori", "Lucas", "Isabelle"]
ppleinstudio = ["Lucas", "Natalie", "Tori"]
knife = ["Isabelle", "Tori", "Natalie"]
# killer is the one who exists in three of the lists
# merge the lists
merged = [*heirs,*ppleinstudio,*knife]
L1=[]
for i in merged:
    if i not in L1:
        L1.append(i)
    else:
        print(i,end=' ')
output:
Lucas Tori Isabelle Tori Natalie
What am I missing to get it to look for the repeating name?
I am not sure that the code you implemented is doing what you wanted it to do. Try checking the contents of the merged list and watch what happens as you iterate through the for loop: every name that has already been seen once gets printed, not only the name that appears in all three lists.
Nevertheless, for the sake of providing a solution to your problem, if you are allowed to use sets you could easily solve this by doing the following:
heirs = ["Aiden", "Tori", "Lucas", "Isabelle"]
ppleinstudio = ["Lucas", "Natalie", "Tori"]
knife = ["Isabelle", "Tori", "Natalie"]
h_set = set(heirs)
s_set = set(ppleinstudio)
k_set = set(knife)
culprit = h_set.intersection(s_set.intersection(k_set)).pop()
print(culprit)
Output:
Tori
But if this is some kind of homework you should probably try to work your way to a solution on paper/whiteboard first, and figure out why your approach is not working.
You could do something like this: cycle through each entry in the merged list and break the three requirements into three boolean checks:
heirs = ["Aiden", "Tori", "Lucas", "Isabelle"]
ppleinstudio = ["Lucas", "Natalie", "Tori"]
knife = ["Isabelle", "Tori", "Natalie"]
# killer is the one who exists in three of the lists
# merge the lists
merged = [*heirs,*ppleinstudio,*knife]
for person in merged:
    is_heir = person in heirs
    is_in_studio = person in ppleinstudio
    has_knife = person in knife
    if is_heir and is_in_studio and has_knife:
        print(person)
        break
Output:
Tori
This is a little inefficient because merged contains duplicate names (print it to see), but since your question doesn't mention efficiency, it gets the job done just fine.
If you are concerned about this inefficiency, you can convert the merged list to a set and iterate over that instead:
merged = set(merged)
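If you prefer to stay close to the original idea of counting repeats in the merged list, collections.Counter from the standard library can do the counting. This is only a sketch, and it assumes that no name appears twice within a single list, so a count of 3 means "present in all three lists".

```python
from collections import Counter

heirs = ["Aiden", "Tori", "Lucas", "Isabelle"]
ppleinstudio = ["Lucas", "Natalie", "Tori"]
knife = ["Isabelle", "Tori", "Natalie"]

# count how many of the three lists each name appears in
counts = Counter(heirs + ppleinstudio + knife)
# keep only the names that show up in all three lists
suspects = [name for name, n in counts.items() if n == 3]
print(suspects)
```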
Let's say I have a file containing data on users and their favourite movies.
Ace: FANTASTIC FOUR, IRONMAN
Jane: EXOTIC WILDLIFE, TRANSFORMERS, NARNIA
Jack: IRONMAN, FANTASTIC FOUR
Based on that, the program I'm about to write should return the names of the users who like the same movies.
Since Ace and Jack like the same movies, they will be partners, so the program would output:
Movies: FANTASTIC FOUR, IRONMAN
Partners: Ace, Jack
Jane would be exempted since she doesn't have anyone who shares the same interest in movies as her.
The problem I'm having now is figuring out how radix sort would help me achieve this; I've been thinking about it all day. I don't know much about radix sort, but I know that it compares elements one by one, and I'm terribly confused by cases such as FANTASTIC FOUR being listed first in Ace's data and second in Jack's data.
Would anyone kindly explain an algorithm that I could understand to achieve this output?
Can you show us how you sort your lists? The quick and dirty code below gives me the same output for the sorted Ace and Jack lists.
Ace = ["FANTASTIC FOUR", "IRONMAN"]
Jane = ["EXOTIC WILDLIFE", "TRANSFORMERS", "NARNIA"]
Jack = ["IRONMAN", "FANTASTIC FOUR"]
sorted_Ace = sorted(Ace)
print (sorted_Ace)
sorted_Jack = sorted(Jack)
print (sorted_Jack)
You could start comparing elements one by one from here.
I made you a quick solution. It shows how you can proceed, but it is not optimized at all and not generalized.
Ace = ["FANTASTIC FOUR", "IRONMAN"]
Jane = ["EXOTIC WILDLIFE", "TRANSFORMERS", "NARNIA"]
Jack = ["IRONMAN", "FANTASTIC FOUR"]
Movies = []
Partners = []
sorted_Ace = sorted(Ace)
sorted_Jane = sorted(Jane)
sorted_Jack = sorted(Jack)
for i in range(len(sorted_Ace)):
    if sorted_Ace[i] == sorted_Jack[i]:
        Movies.append(sorted_Ace[i])
if len(Movies) == len(sorted_Ace):
    Partners.append("Ace")
    Partners.append("Jack")
print(Movies)
print(Partners)
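To generalize this to any number of users, one option is to skip sorting altogether and group users by the set of movies they like. Note that this swaps in hashing (a dict keyed by frozenset) for radix sort. The sketch below assumes each line of the file looks like "Name: MOVIE, MOVIE"; find_partners is a name made up for this example.

```python
from collections import defaultdict

def find_partners(lines):
    """Group users whose movie sets are identical (listing order is ignored)."""
    groups = defaultdict(list)
    for line in lines:
        name, _, movies = line.partition(":")
        # a frozenset is hashable and ignores the order the movies were listed in
        key = frozenset(m.strip() for m in movies.split(","))
        groups[key].append(name.strip())
    # keep only movie sets shared by at least two users
    return {key: names for key, names in groups.items() if len(names) > 1}

data = [
    "Ace: FANTASTIC FOUR, IRONMAN",
    "Jane: EXOTIC WILDLIFE, TRANSFORMERS, NARNIA",
    "Jack: IRONMAN, FANTASTIC FOUR",
]
partners = find_partners(data)
for movies, names in partners.items():
    print("Movies: " + ", ".join(sorted(movies)))
    print("Partners: " + ", ".join(names))
```

With a real file you could pass the open file object instead of the data list, since iterating over it yields one line at a time.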
Edit: just found out that I'm using py 2.6.2 (work installed so I can't do much about that)
So I'm trying to find the best way to sort a list based on 2 different class attributes.
This list is basically some info for moving people from room to room in a company, where some people might be part of a chain move
(i.e. Joe Blow has to move before we can move Jane Doe into Joe's spot, and Jane has to move before John Wick can move into Jane's spot, etc.)
I get all the info in something like the format below, but there can also be people who aren't part of the chain move, like Dan Man in the example below.
John Wick 303.10 -> 415.09
Dan Man 409.08 -> 221.02
Joe Blow 225.06 -> 512.01
Jane Doe 415.09 -> 225.06
I have all the relevant info split into a class with
startRoom
endRoom
originalString
So that part isn't an issue, but I run into trouble when I try to "brute force" sort it as below. (Note: I call list(chains) because chains was previously a set, to make sure I don't get duplicates in there.)
def sortChains():
    global chains
    #convert the set of chains to a list for list functions
    chains = list(chains)
    for x, move1 in enumerate(chains):
        for y, move2 in enumerate(chains):
            if move1.startRoom == move2.endRoom:
                temp = chains[y]
                chains.remove(move2)
                chains.insert(x,temp)
                continue
My problem is the sorting. One part of the problem is finding the person that is at the start of the chain and then sorting correctly after that.
Any ideas/help is totally appreciated. And yes I know a double loop while moving stuff in the loop isn't the best but it's been the best I could think of at the time.
First, you have to create a dependency graph and determine (a) which person has to move before some other person can move, and (b) which persons can move right now. We can use a 1:1 mapping here, but in the more general case, you might have to use a 1:n, n:1, or n:m mapping.
moves = {"John Wick": ("303.10", "415.09"),
"Dan Man": ("409.08", "221.02"),
"Joe Blow": ("225.06", "512.01"),
"Jane Doe": ("415.09", "225.06")}
# or dict((move.originalString, (move.startRoom, move.endRoom)) for move in list_of_moves)
# mapping {initial room -> name}
rooms = {start: name for (name, (start, end)) in moves.items()}
# Python 2.6: dict((start, name) for (name, (start, end)) in moves.items())
# mapping {moves_first: moves_after}
before = {rooms[end]: name for name, (start, end) in moves.items() if end in rooms}
# Python 2.6: dict((rooms[end], name) for name, (start, end) in moves.items() if end in rooms)
# persons that can move now
can_move = set(moves) - set(before.values())
Now, we can see who can move, move that person, and then update the persons who can move based on what person had to wait for that person to move, if any.
result = []
while can_move:
    # get person that can move, add to result
    name = can_move.pop()
    result.append(name)
    # add next to can_move set
    if name in before:
        can_move.add(before.pop(name))
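Afterwards, result is, for example, ['Joe Blow', 'Jane Doe', 'John Wick', 'Dan Man']. Because can_move is a set, the exact position of an independent mover such as Dan Man can vary between runs, but the order within the chain is always preserved.
Complexity should be O(n), but of course, this will fail if there are cyclic dependencies: people who are part of a cycle never enter can_move and are therefore missing from result.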
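The two steps above can be wrapped into one function that also reports anyone left out because of a cycle. This is a sketch under the same 1:1 assumption (each room has at most one occupant and one newcomer); order_moves is a made-up name, and the dict(...) form is used instead of dict comprehensions so that it should also run on Python 2.6.

```python
from collections import deque

def order_moves(moves):
    """Return (order, stuck): a valid move order and anyone blocked by a cycle."""
    # mapping {initial room -> name}
    rooms = dict((start, name) for name, (start, end) in moves.items())
    # waiting_on[blocker] = person who can move once blocker has left
    waiting_on = dict((rooms[end], name)
                      for name, (start, end) in moves.items() if end in rooms)
    blocked = set(waiting_on.values())
    # sorted() only makes the output reproducible; it is not required
    queue = deque(sorted(name for name in moves if name not in blocked))
    order = []
    while queue:
        name = queue.popleft()
        order.append(name)
        if name in waiting_on:
            queue.append(waiting_on[name])
    # whoever never became free is part of (or behind) a cycle
    stuck = sorted(set(moves) - set(order))
    return order, stuck

moves = {"John Wick": ("303.10", "415.09"),
         "Dan Man": ("409.08", "221.02"),
         "Joe Blow": ("225.06", "512.01"),
         "Jane Doe": ("415.09", "225.06")}
order, stuck = order_moves(moves)
print(order)
print(stuck)
```

A non-empty second value tells you that those moves need a temporary room (or some other manual step) to break the cycle.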
from collections import defaultdict

def do(moves):
    """RETURNS: [0] Sequence of persons to move.
                [1] Remainder
    """
    # (following line copied from 'tobias_k', replaced 'rooms' with 'current_db')
    # map: current position -> person who occupies it
    current_db = { start: name for (name, (start, end)) in moves.items() }
    # maintain set of persons who are free to move to their target location
    liberated_set = set()
    # map occupier of a location -> set of people who would take his place.
    liberation_db = defaultdict(set)
    # whosoever wants to move to a free place -> liberated.
    # else                                    -> liberation_db
    for name, (start, end) in moves.items():
        # who currently sits in this person's target room?
        occupier = current_db.get(end)
        if occupier is None: liberated_set.add(name)
        else:                liberation_db[occupier].add(name)
    sequence = []
    while liberated_set:
        # add people to the sequence who are free to move
        sequence.extend(liberated_set)
        # get new set of people who are free to move to their target
        # because their target position is no longer occupied.
        new_liberated_set = set()
        for occupier in liberated_set:
            if occupier not in liberation_db: continue
            # sets are merged with update()
            new_liberated_set.update(liberation_db[occupier])
            del liberation_db[occupier]
        liberated_set = new_liberated_set
    # remainder: everyone still waiting, e.g. people stuck in a cycle
    return sequence, set().union(*liberation_db.values())
The title for this one was quite tricky.
I'm trying to solve a scenario,
Imagine a survey was sent out to XXXXX amount of people, asking them what their favourite football club was.
From the responses, it's obvious that while many people share the same favourite club, they all "expressed" it in different ways.
For example,
For Manchester United, some variations include...
Man U
Man Utd.
Man Utd.
Manchester U
Manchester Utd
All are obviously the same club; however, with a simple technique such as looking for an exact string match, each would be a separate result.
Now, to further complicate the scenario, let's say that because of the sheer volume of different clubs (e.g. Man City as M. City, Manchester City, etc.), all plagued with this problem, it's impossible to manually "enter" these variants and use them to create a custom filter that converts all Man U -> Manchester United, Man Utd. -> Manchester United, etc. Instead, we want to automate this filter so that it looks for the most likely match and converts the data accordingly.
I'm trying to do this in Python (from a .csv file), but I welcome any pseudocode answers that outline a good approach to solving this.
Edit: Some additional information
This isn't working off a set list of clubs, the idea is to "cluster" the ones we have together.
The assumption is there are no spelling mistakes.
There is no assumed length of how many clubs
And the survey list is long. Long enough that it doesn't warrant doing this manually (1000s of entries)
Google Refine does just this, but I'll assume you want to roll your own.
Note, difflib is built into Python, and has lots of features (including eliminating junk elements). I'd start with that.
You probably don't want to do it in a completely automated fashion. I'd do something like this:
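# load corrections file, mapping user input -> output
# load survey
import difflib
# list() so that new names can be appended (dict.values() is a view in Python 3)
possible_values = list(corrections.values())
for answer in survey:
    output = corrections.get(answer, None)
    if output is None:
        likely_outputs = difflib.get_close_matches(answer, possible_values)
        # placeholder: ask a human to pick a suggestion or enter a new name
        output = get_user_to_select_output_or_add_new(likely_outputs)
        corrections[answer] = output
        possible_values.append(output)
# placeholder: write the updated corrections mapping back to disk
save_corrections_as_csv(corrections)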
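To get a feel for what difflib.get_close_matches does before wiring it into a loop like the one above, here is a small self-contained sketch. The helper best_match and the three-club list are made up for illustration; the light normalisation (dots to spaces, lower-casing, collapsing whitespace) is an assumption, not something difflib does for you.

```python
import difflib

def best_match(answer, known_names, cutoff=0.6):
    """Return the known name closest to `answer`, or None if nothing clears `cutoff`."""
    # compare in lower case, but hand back the original spelling
    by_normalised = dict((name.lower(), name) for name in known_names)
    cleaned = " ".join(answer.replace(".", " ").lower().split())
    matches = difflib.get_close_matches(cleaned, list(by_normalised),
                                        n=1, cutoff=cutoff)
    return by_normalised[matches[0]] if matches else None

clubs = ["Manchester United", "Manchester City", "Arsenal"]  # illustrative list
print(best_match("Manchester Utd", clubs))
print(best_match("Manchester Utd.", clubs))
```

Very short abbreviations such as "Man U" may score below the default cutoff of 0.6, which is one reason to keep a human in the loop, or to lower the cutoff and review the suggestions.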
Please edit your question with answers to the following:
You say "we want to automate this filter, to look for the most likely match" -- match to what?? Do you have a list of the standard names of all of the possible football clubs, or do the many variations of each name need to be clustered to create such a list?
How many clubs?
How many survey responses?
After doing very light normalisation (replace . by space, strip leading/trailing whitespace, replace runs of whitespace by a single space, convert to lower case [in that order]) and counting, how many unique responses do you have?
Your focus seems to be on abbreviations of the standard name. Do you need to cope with nicknames e.g. Gunners -> Arsenal, Spurs -> Tottenham Hotspur? Acronyms (WBA -> West Bromwich Albion)? What about spelling mistakes, keyboard mistakes, SMS-dialect, ...? In general, what studies of your data have you done and what were the results?
You say """its impossible to manually "enter" these variances""" -- is it possible/permissible to "enter" some "variances" e.g. to cope with nicknames as above?
What are your criteria for success in this exercise, and how will you measure it?
It seems to me that you could convert many of these into a standard form by taking the string, lower-casing it, removing all punctuation, then comparing the start of each word.
If you had a list of all the actual club names, you could compare directly against that as well; and for strings which don't match first-n-letters to any actual team, you could try lexicographical comparison against any of the returned strings which actually do match.
It's not perfect, but it should get you most of the way there.
import string

def words(s):
    s = s.lower().strip(string.punctuation)
    return s.split()

def bestMatchingWord(word, matchWords):
    score,best = 0., ''
    for matchWord in matchWords:
        matchScore = sum(w==m for w,m in zip(word,matchWord)) / (len(word) + 0.01)
        if matchScore > score:
            score,best = matchScore,matchWord
    return score,best

def bestMatchingSentence(wordList, matchSentences):
    score,best = 0., []
    for matchSentence in matchSentences:
        total,words = 0., []
        for word in wordList:
            s,w = bestMatchingWord(word,matchSentence)
            total += s
            words.append(w)
        if total > score:
            score,best = total,words
    return score,best

def main():
    data = (
        "Man U",
        "Man. Utd.",
        "Manch Utd",
        "Manchester U",
        "Manchester Utd"
    )
    teamList = (
        ('arsenal',),
        ('aston', 'villa'),
        ('birmingham', 'city', 'bham'),
        ('blackburn', 'rovers', 'bburn'),
        ('blackpool', 'bpool'),
        ('bolton', 'wanderers'),
        ('chelsea',),
        ('everton',),
        ('fulham',),
        ('liverpool',),
        ('manchester', 'city', 'cty'),
        ('manchester', 'united', 'utd'),
        ('newcastle', 'united', 'utd'),
        ('stoke', 'city'),
        ('sunderland',),
        ('tottenham', 'hotspur'),
        ('west', 'bromwich', 'albion'),
        ('west', 'ham', 'united', 'utd'),
        ('wigan', 'athletic'),
        ('wolverhampton', 'wanderers')
    )
    for d in data:
        print("{0:20} {1}".format(d, bestMatchingSentence(words(d), teamList)))

if __name__=="__main__":
    main()
Running this on the sample data gives:
Man U (1.9867767507647776, ['manchester', 'united'])
Man. Utd. (1.7448074166742613, ['manchester', 'utd'])
Manch Utd (1.9946817328797555, ['manchester', 'utd'])
Manchester U (1.989100008901989, ['manchester', 'united'])
Manchester Utd (1.9956787398647866, ['manchester', 'utd'])
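The lower score for "Man. Utd." comes from the fact that str.strip only removes characters at the two ends of the whole string, so the first word is still compared as "man." with its dot. If you want every punctuation character removed, a possible replacement for words() is sketched below. It uses str.maketrans, so it assumes Python 3.

```python
import string

def words(s):
    """Lower-case `s`, turn every punctuation character into a space, and split."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return s.lower().translate(table).split()

print(words("Man. Utd."))
```

Replacing punctuation with spaces rather than deleting it also splits inputs like "M.City" into two words.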