Return possible permutations for given array - python

For a given list with string digits, I want to return the different string numbers that can be generated using all the elements in the list (so if there are 5 elements, the number should consist of 5 digits).
The task is to return the possible permutations, the smallest permutation and the maximum permutation in a list.
The answers should be converted to integers.
If '0' is present in the input, some of the generated strings will have leading zeroes; these are simply dropped when the strings are converted to integers.
This is my code now:
from itertools import permutations

def proc_arr(arr):
    lst = []  # define new list
    list_of_tuples = list(permutations(arr, len(arr)))  # now they are tuples in a list
    # convert to integers in list
    separator = [map(str, x) for x in list_of_tuples]
    together = [int(''.join(s)) for s in separator]
    # append to new list and return the len of possible combinations, min and max value
    lst.append(len(together))
    lst.append(min(together))
    lst.append(max(together))
    #print(lst)
    return lst

proc_arr(['1','2','2','3','2','3'])
However, I don't understand why I don't get the right number of permutations.
Input: proc_arr(['1', '2', '2', '3', '2', '3']), expected output: [60, 122233, 332221]
but I get [720, 122233, 332221].
Another example of input and expected output:
Input: proc_arr(['1','2','3','0','5','1','1','3']), expected output: [3360, 1112335, 53321110]

You appear to be counting the same permutation multiple times because you're treating digits that appear multiple times as distinct. For example, 122233 is counted once with the "first" 3 ahead of the "second" 3, and once more with the two 3s swapped.
One solution would be to compute how many duplicates you will have and eliminate them from your count. In your example, there are three 2s, so there are 1*2*3=6 ways they can be arranged whilst leaving everything else the same; thus, your answer is 6 times too big due to the 2s. Similarly for the two 3s: 1*2=2 ways, so divide your answer by 2. This gets you to 720/6/2 = 60, the correct answer.
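The division described above can be sketched as follows. This is only one possible way to write it, not the asker's code: the name proc_arr_counted is made up for illustration, and it assumes the minimum and maximum can be read off by sorting the digits instead of generating every permutation.

```python
from collections import Counter
from math import factorial

def proc_arr_counted(arr):
    # Start from n! (all orderings, duplicates included).
    total = factorial(len(arr))
    # Divide by k! for every digit that occurs k times.
    for repeats in Counter(arr).values():
        total //= factorial(repeats)
    # Ascending digits give the smallest number, descending the largest;
    # int() drops any leading zeroes.
    smallest = int(''.join(sorted(arr)))
    largest = int(''.join(sorted(arr, reverse=True)))
    return [total, smallest, largest]

print(proc_arr_counted(['1', '2', '2', '3', '2', '3']))            # [60, 122233, 332221]
print(proc_arr_counted(['1', '2', '3', '0', '5', '1', '1', '3']))  # [3360, 1112335, 53321110]
```

Because nothing is enumerated, this stays fast even for inputs where 720 or 40320 tuples would already be wasteful. If you do need the permutations themselves, wrapping the generated tuples in set() removes the duplicates as well.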

Related

How to make two lists out of three on Python3

The goal is: turn 3 lists of the same length into 2 lists whose lengths differ by no more than 1. Elements from different source lists must alternate, and their order must not be violated. For example: if 'A' came before 'B' in the source list, then 'B' cannot come before 'A' in the final list.
So, I decided to write a function:
def list_3_2(list1,list2,list3):
    #split all lists in one list
    a=sum((list(i) for i in zip(list1,list2,list3)),[])
    #I want to separate list "a" to two new lists: l1 and l2
    l1=[]
    l2=[]
    #//////
    return(l1,l2)

list_3_2(['1','2','3'],['4','5','6'],['7','8','9'])
Then I faced the problem of separation. I found some similar questions, but their lists were structured differently, like main_list=['3 5', '1 2', '1 7']. After merging, I get a different result: a=['1', '4', '7', '2', '5', '8', '3', '6', '9']. How can I split this list?
Maybe I'm misunderstanding the question.
from itertools import chain

def list_3_2(l1, l2, l3):
    x = list(chain.from_iterable(zip(l1, l2, l3)))
    m = len(x) // 2
    return (x[:m], x[m:])
Zip the lists, concatenate the resulting tuples into a single list, then split that in half.
Well, based on the information you shared, I believe you were very close to the solution; you just needed to split the final list. What do you think about this?
def list_3_2(list1,list2,list3):
    a=sum((list(i) for i in zip(list1,list2,list3)),[])
    return a[:len(a)//2], a[len(a)//2:]

print(list_3_2(['1','2','3'],['4','5','6'],['7','8','9']))
Basically, splitting the list into two parts, where one is at most 1 element larger than the other, corresponds to splitting an arbitrary list at the index of the rounded-down midpoint. The parts will be of different length iff the original array (chain of inputs) has odd length.
The midpoint index of an array is len(array)//2: len(array) is the full length and //2 is floor division by two ("division in half, rounded down").
Lastly, to split the array using this midpoint index, we use the slicing mechanism in Python. The syntax is as follows:
a[:m] = all elements of a from index 0 (= the beginning) to index m-1.
a[m:] = all elements of a from index m to index len(a) - 1 (= the end).
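To make the slicing rules concrete, here is a small sketch applied to the merged list from the question. The helper name split_in_half is made up for this illustration.

```python
def split_in_half(items):
    # Floor division: for 9 elements the midpoint index is 4.
    middle = len(items) // 2
    # items[:middle] covers indices 0..middle-1, items[middle:] covers the rest.
    return items[:middle], items[middle:]

merged = ['1', '4', '7', '2', '5', '8', '3', '6', '9']
first, second = split_in_half(merged)
print(first)   # ['1', '4', '7', '2']
print(second)  # ['5', '8', '3', '6', '9']
```

With an odd number of elements, the second half ends up one element longer, which still satisfies the "differ by no more than 1" requirement.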
Hey, try this; I hope it is what you want:
def list_3_2(list1,list2,list3):
    #split all lists in one list
    #I want to separate list "a" to two new lists: l1 and l2
    a=[]
    [a.extend(j) for j in (list(list(i) for i in zip(list1,list2,list3)))]
    #//////
    div=int(len(a)/2)
    l1=a[0:div]
    l2=a[div:]
    return l1,l2

print(list_3_2(['1','2','3'],['4','5','6'],['7','8','9']))

getting common numbers in each string in list

So I have the list below. I am writing an IP calculator and am currently working on supernetting. I currently have the IPs as binary strings in a list.
I want to get the leading digits that all the strings in the list have in common and put them in a new string; it should stop adding digits as soon as the characters at the same index of each binary number stop being the same. I can't figure out how to compare them, though. I tried for loops and everything, and it doesn't behave as I want it to.
['11001101011001000000000000000000',
'11001101011001000000000100000000',
'11001101011001000000001000000000',
'11001101011001000000001100000000']
My output for this should be 11001101 . 01100100 . 000000
If I understand you correctly, you are looking for the longest common prefix of all the strings. There are probably more elegant and/or faster ways, but you could e.g. just zip the different strings and takewhile they are all the same, i.e. have only one element as a set.
>>> from itertools import takewhile
>>> lst = ['11001101011001000000000000000000',
... '11001101011001000000000100000000',
... '11001101011001000000001000000000',
... '11001101011001000000001100000000']
...
>>> ''.join(t[0] for t in takewhile(lambda t: len(set(t)) == 1, zip(*lst)))
'1100110101100100000000'
>>> '.'.join(_[i:i+8] for i in range(0, len(_), 8)) # just adding dots...
'11001101.01100100.000000'
Breaking this down a bit:
zip(*lst) iterates the "slices" through all the strings in the list, e.g. ('1', '1', '1', '1') for the first position
takewhile takes that sequence and -- as the name suggests -- takes elements as long as the given condition is true
lambda t: len(set(t)) == 1 is that condition, converting the slice through the strings to a set and checking whether that set has just one element; for ('0', '0', '1', '1'), the set will be {'0', '1'} and thus takewhile stops
''.join(t[0] for ...) joins the same elements back to a string; here, t[0] is just the first element of the tuple of same elements
the last line is just to add the . after 8 digits; here, _ is the result of the previous line
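The same steps can also be written as an explicit loop, which may be easier to follow than the one-liner. This is a sketch only; the function names common_prefix and dotted are invented for the example.

```python
def common_prefix(strings):
    prefix = []
    # zip(*strings) yields one tuple per position, e.g. ('1', '1', '1', '1').
    for column in zip(*strings):
        # More than one distinct character means the strings diverge here.
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return ''.join(prefix)

def dotted(bits, size=8):
    # Insert a dot after every `size` characters.
    return '.'.join(bits[i:i + size] for i in range(0, len(bits), size))

lst = ['11001101011001000000000000000000',
       '11001101011001000000000100000000',
       '11001101011001000000001000000000',
       '11001101011001000000001100000000']

prefix = common_prefix(lst)
print(prefix)          # 1100110101100100000000
print(dotted(prefix))  # 11001101.01100100.000000
```

The length of the prefix (22 here) is the number of network bits the addresses share.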

Need help speeding up this function

Input: A list of lists of various positions.
[['61097', '12204947'],
['61097', '239293'],
['61794', '37020977'],
['61794', '63243'],
['63243', '5380636']]
Output: A sorted list that contains the count of unique numbers in each merged list.
[4, 3, 3, 3, 3]
The idea is fairly simple: I have a list of lists where each list contains a variable number of positions (in our example there are only 2 in each list, but lists of up to 10 exist). I want to loop through each list, and if there exists ANY other list that contains the same number, then that list gets appended to the original list.
Example: Taking the input data from above and using the following code:
import itertools
from IPython.display import clear_output, display  # assumes a Jupyter/IPython session

def gen_haplotype_blocks(df):
    counts = []
    for i in range(len(df)):
        # every list that shares at least one position with df[i]
        my_list = [item for item in df if any(x in item for x in df[i])]
        my_list = list(itertools.chain.from_iterable(my_list))
        uniq_counts = len(set(my_list))
        counts.append(uniq_counts)
        clear_output()
        display('Currently Running ' + str(i))
    return sorted(counts, reverse=True)
I get the output that is expected. In this case, when I loop through the first list ['61097', '12204947'], I find that it and my second list ['61097', '239293'] both contain '61097', so these two lists get concatenated together and form ['61097', '12204947', '61097', '239293']. This is done for every single list, outputting the following:
['61097', '12204947', '61097', '239293']
['61097', '12204947', '61097', '239293']
['61794', '37020977', '61794', '63243']
['61794', '37020977', '61794', '63243', '63243', '5380636']
['61794', '63243', '63243', '5380636']
Once this list is complete, I then count the number of unique values in each list, append that to another list, then sort the final list and return that.
So in the case of ['61097', '12204947', '61097', '239293'], we have two '61097', one '12204947' and one '239293', which gives 3 unique numbers.
While my code works, it is VERY slow. Running for nearly two hours and still only on line ~44k.
I am looking for a way to speed up this function considerably. Preferably without changing the original data structure. I am very new to python.
Thanks in advance!
To considerably improve the speed of your program, especially for larger data sets, the key is to use a hash table (a dictionary, in Python's terms) that stores each distinct number as a key and the indices of the lines where that number occurs as the value. Then, in a second pass, merge the lists for each line based on the dictionary and count the unique elements.
def gen_haplotype_blocks(input):
    unique_numbers = {}
    for i, numbers in enumerate(input):
        for number in numbers:
            if number in unique_numbers:
                unique_numbers[number].append(i)
            else:
                unique_numbers[number] = [i]
    output = [[] for _ in range(len(input))]
    for i, numbers in enumerate(input):
        for number in numbers:
            for line in unique_numbers[number]:
                output[i] += input[line]
    counts = [len(set(x)) for x in output]
    return sorted(counts, reverse=True)
In theory, the time complexity of your algorithm is O(N*N), where N is the size of the input list, because you need to compare each list with all the other lists. With this approach the complexity is roughly O(N), provided each number appears in only a few lists, which should be considerably faster for a larger data set. The trade-off is extra space.
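A variant of the same dictionary idea, sketched with sets so that nothing is concatenated more than once. The function name count_linked_positions and the variable names are made up for this illustration; it leaves the input structure untouched.

```python
from collections import defaultdict

def count_linked_positions(rows):
    # First pass: map each position to the indices of the rows containing it.
    rows_by_position = defaultdict(set)
    for index, row in enumerate(rows):
        for position in row:
            rows_by_position[position].add(index)

    counts = []
    for row in rows:
        # Second pass: every row that shares at least one position with this row.
        linked = set()
        for position in row:
            linked |= rows_by_position[position]
        # Count the distinct positions across all linked rows.
        merged = set()
        for index in linked:
            merged.update(rows[index])
        counts.append(len(merged))
    return sorted(counts, reverse=True)

sample = [['61097', '12204947'],
          ['61097', '239293'],
          ['61794', '37020977'],
          ['61794', '63243'],
          ['63243', '5380636']]
print(count_linked_positions(sample))  # [4, 3, 3, 3, 3]
```

Each row is only compared with the rows it actually shares a position with, so the work grows with the number of shared positions, not with the square of the number of rows.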
Not sure how much you expect by saying "considerably", but converting your inner lists to sets from the beginning should speed up things. The following works approximately 2.5x faster in my testing:
def gen_haplotype_blocks_improved(df):
    df_set = [set(d) for d in df]
    counts = []
    for d1 in df_set:
        row = d1
        for d2 in df_set:
            if d1.intersection(d2) and d1 != d2:
                row = row.union(d2)
        counts.append(len(row))
    return sorted(counts, reverse=True)

creating and sorting a dynamic list of lists

This is a more challenging problem I'm trying to solve. I know that I can create a list of empty lists sortedList =[[],[],[]] and in this case sortedList at index 0 has an empty list, same at index 1 and 2.
I am tasked with gathering input from a user and creating a list of words and stopping when the user types stop. I managed this well enough by doing:
def wordList():
    unsortedList=[]
    promptUser=""
    while promptUser !="stop":
        promptUser = input("Type words, one at a time. When you are done, type stop: ")
        unsortedList.append(promptUser)
        if promptUser =="stop":
            unsortedList.pop(-1)
    #print(wordList)

wordList()
I had to use sloppy code to not include the word stop using the pop method. Not pretty but it works. My real issue is this. I need a while or for loop to go through the unsortedList and look at every word and evaluate it for the count of each item in the list.
Conceptually I'm okay here, but the challenge is that based on the assessment of each item in the unsortedList, I should create a sortedList that takes all user input grouped by length and creates a new list for each length group so a list of lists dynamically created based on user input, and grouped based on the number of characters.
So I know that a list of lists will follow the index order, the first list will be index 0 and so on. I also understand that it is possible to go through the unsortedList and get a character count of each item in the list. In theory with that information, I could take all words of length n and insert them into a sublist, then find all the words with a different length n and put them in a sublist.
High level my unsortedList will contain various words that can be sorted based on character length. I can assume no word will exceed 10 characters, but empty strings are possible.
Go through this unsortedList and create a sortedList that itself contains sublists holding groupings based on all of the items from the unsortedList so perhaps the returned sortedList might have:
[[],[a,n,t],[red,eye],[toad,roar,flap],[closer,smarter,faster]]
I understand the individual logical steps, but the actual iteration through the unsortedlist, using the evaluations to then create a sortedList with the individual grouped sublists is just beyond me. I learn well by looking at complete code examples, but I just can't find anything here that does this to study so any help is appreciated. Sorry for the lengthy post.
If I understood the question right, you want to turn unsorted_list into a list of lists, where each list contains only values of equal length. This can be achieved like this:
from collections import defaultdict

unsorted_list = ['a', 'n', 't', 'red', 'eye', 'toad', 'roar', 'flap']

def group_strings_by_length(lst):
    grouped = defaultdict(list)
    for s in lst:
        grouped[len(s)].append(s)
    return list(grouped.values())

grouped_strings = group_strings_by_length(unsorted_list)
print(grouped_strings) #[['a', 'n', 't'], ['red', 'eye'], ['toad', 'roar', 'flap']]
This code will find the longest word in your list, create another list with one bucket per length up to that maximum, and sort every word into its bucket. I also fixed your input loop a bit, so there is no need for the extra check and pop.
def wordList():
    unsortedList=[]
    promptUser=input("Type words, one at a time. When you are done, type stop: ")
    while promptUser !="stop":
        unsortedList.append(promptUser)
        promptUser = input("Type words, one at a time. When you are done, type stop: ")
    sortedList = []
    #will go from 0 up to and including the length of the longest word in your list
    for x in range(0,max([len(x) for x in unsortedList])+1):
        #Creates an empty entry
        sortedList.append([])
        #Goes through the unsorted list
        for s in unsortedList:
            #If the length of a word is equal to x, it adds it to its bucket
            if len(s) == x:
                sortedList[x].append(s)
    print(sortedList)
Input: ['a', 'eye', 'flap', 'n', 'red', 'roar', 't', 'toad']
Output: [[], ['a', 'n', 't'], [], ['eye', 'red'], ['flap', 'roar', 'toad']]
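The bucket idea can also be done in a single pass over the words, using each word's length directly as the index of its bucket. This is a sketch with a made-up function name, group_by_length; it takes the word list as an argument so it can be tried without typing input, and it assumes an empty word list should yield a single empty bucket.

```python
def group_by_length(words):
    # One bucket per possible length, from 0 up to the longest word.
    # default=0 avoids an error when no words were entered.
    longest = max((len(word) for word in words), default=0)
    buckets = [[] for _ in range(longest + 1)]
    for word in words:
        # The word's length is the index of its bucket.
        buckets[len(word)].append(word)
    return buckets

words = ['a', 'eye', 'flap', 'n', 'red', 'roar', 't', 'toad']
print(group_by_length(words))
# [[], ['a', 'n', 't'], [], ['eye', 'red'], ['flap', 'roar', 'toad']]
```

Empty strings land in bucket 0, and lengths nobody typed simply leave an empty bucket behind.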

Check if there are 2 or 3 elements with same value in a list/tuple/etc.

I have a 5-element list and I want to know if there are 2 or 3 equal elements (or two equal and three equal). This "check" would be a part of if condition. Let's say I'm too lazy or stupid to write:
if (a==b and c==d and c==e) or .......... or .........
i know it might be written like this, but not exactly how:
if (a==b and (c==b and (c==e or ....
How do I do it? I also know that you can write something similar to this:
if (x,y for x in [5element list] for y in [5element list] x==y, x not y:
If you just want to check for multiple occurrences and the objects are of a hashable type, the following solution could work for you.
You could create a list of your objects
>>> l = [a, b, c, d, e]
Then, you could create a set from the same list and compare the length of the list and the length of the set. If the set has fewer elements than the list, you know you must have multiple occurrences.
>>> if len(set(l)) < len(l):
...
Use count. You just want [i for i in myList if myList.count(i) > 1]. This list contains the repeated elements; if it's non-empty, you have repeated elements.
Edit: SQL != python, removed 'where', also this'll get slow for bigger lists, but for 5 elements it'll be fine.
You can use collections.Counter, which counts the occurrence of every element in your list.
Once you have the count, just check that your wanted values (2 or 3) are present among the counts.
from collections import Counter

my_data = ['a', 'b', 'a', 'c', 'd']
c = Counter(my_data)
counts = set(c.values())
print(2 in counts or 3 in counts)
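If the if condition needs to tell the cases apart (a pair, a triple, or both at once), the same Counter idea can return two flags. This is a sketch; the function name pair_and_triple is invented for the example.

```python
from collections import Counter

def pair_and_triple(items):
    # How often each distinct value occurs, e.g. {2, 1} for one pair.
    group_sizes = set(Counter(items).values())
    has_pair = 2 in group_sizes
    has_triple = 3 in group_sizes
    return has_pair, has_triple

print(pair_and_triple(['a', 'b', 'a', 'c', 'd']))  # (True, False)
print(pair_and_triple(['a', 'a', 'a', 'b', 'b']))  # (True, True)
print(pair_and_triple(['a', 'b', 'c', 'd', 'e']))  # (False, False)
```

A condition such as "two equal and three equal" then becomes `if has_pair and has_triple:` after unpacking the result.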
