I want to create a function that takes a list as an argument, for example:
list = ['a','b','a','d','e','f','a','b','g','b']
and returns a specific number of list elements (I choose the number) such that no element occurs twice. For example, if I choose 3:
new_list = ['a','b','d']
I tried the following:
def func(j, list):
    new_list=[]
    for i in list:
        while(len(new_list)<j):
            for k in new_list:
                if i != k:
                    new_list.append(i)
    return new_list
But the function ran into an infinite loop.
def func(j, mylist):
    # dedup, preserving order (dict is insertion-ordered as a language guarantee as of 3.7):
    deduped = list(dict.fromkeys(mylist))
    # Slice off all but the part you care about:
    return deduped[:j]
If performance for large inputs is a concern, that's suboptimal: it processes the whole input even when j unique elements appear within the first few indices of a much larger input. In that case a more complicated solution avoids the wasted work. First, copy the itertools unique_everseen recipe:
from itertools import filterfalse, islice  # At top of file, filterfalse for recipe, islice for your function
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
Now wrap it with islice to pull off only as many elements as required, exiting immediately once you have them (without processing the rest of the input at all):
def func(j, mylist):  # Note: Renamed list argument to mylist to avoid shadowing built-in
    return list(islice(unique_everseen(mylist), j))
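To see the early exit in action, here is a minimal sketch that feeds the islice version an endless iterator. The simplified unique_everseen below drops the key parameter of the recipe, and the cycling input is made up for illustration. The point is only that the call returns at all, which it could not do if the whole input had to be consumed first.

```python
from itertools import cycle, islice

def unique_everseen(iterable):
    """Yield elements in first-seen order, skipping repeats (no key support)."""
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element

def func(j, mylist):
    # islice stops pulling from unique_everseen once j items have been produced
    return list(islice(unique_everseen(mylist), j))

endless = cycle(['a', 'b', 'a', 'd', 'e'])  # hypothetical never-ending input
first_three = func(3, endless)
print(first_three)  # ['a', 'b', 'd']
```

If the input holds fewer than j unique elements, the result is simply shorter; with an endless input that never yields j unique elements, the call would not terminate.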
Try this.
lst = ['a','b','a','d','e','f','a','b','g','b']
j = 3
def func(j,list_):
    new_lst = []
    for a in list_:
        if a not in new_lst:
            new_lst.append(a)
    return new_lst[:j]
print(func(j,lst)) # ['a', 'b', 'd']
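I don't know why nobody has posted a numpy.unique solution.
Here is a memory-efficient way (I think 😉). Note that np.unique returns the unique elements in sorted order, not in order of first appearance, so it matches the expected output here only because the first occurrences in this input happen to be sorted.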
import numpy as np
lst = ['a','b','a','d','e','f','a','b','g','b']
def func(j,list_):
    return np.unique(list_).tolist()[:j]
print(func(3,lst)) # ['a', 'b', 'd']
list is a built-in name in Python (not a reserved word), so using it as a variable name shadows the built-in.
If the order of the elements is not a concern, then:
def func(j, user_list):
    return list(set(user_list))[:j]
It's bad practice to use "list" as a variable name.
You can solve the problem using the Counter class from Python's collections module:
from collections import Counter
a=['a','b','a','d','e','f','a','b','g','b']
b = list(Counter(a))
print(b[:3])
So your function will be something like this:
def unique_slice(list_in, elements):
    new_list = list(Counter(list_in))
    print("New list: {}".format(new_list))
    if int(elements) <= len(new_list):
        return new_list[:elements]
    return new_list
Hope it answers your question.
As others have said, you should not shadow the built-in name 'list', because that can lead to many issues. This is a simple problem where you should add to a new list and check whether the element was already added.
Slice notation ([:]) in Python lets you split a list at an index.
>>> l = [1, 2, 3, 4]
>>> l[:1]
[1]
>>> l[1:]
[2, 3, 4]
lst = ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']
def func(number, _list):
    out = []
    for a in _list:
        if a not in out:
            out.append(a)
    return out[:number]
print(func(4, lst)) # ['a', 'b', 'd', 'e']
Related
The code below should take the unique characters in msg and the unique characters in code and make a list containing two sublists. An example would be
crack_the_code('hello there', 'abccd eabfb')
should return
[['h', 'e', 'l', 'o', 't', 'r'], ['a', 'b', 'c', 'd', 'e', 'f']].
What I have tried to do below is create three lists and then run a for loop that checks whether i is in the new list (unique) and, if not, adds it to the list; the same is done for unique_code.
Finally I put the two lists together and return the result, but when I print it I get None. Any help would be appreciated.
def crack_the_code(msg, code):
    unique = []
    unique_code = []
    cracked = []
    for i in msg:
        if i not in unique:
            unique.extend(i)
    for item in code:
        if item not in unique_code:
            unique_code.extend(item)
    cracked = unique.append(unique_code)
    return cracked
print(crack_the_code('hello there', 'abcd eabfb'))
You get None, because unique.append(unique_code) mutates unique and does not return a modified list, but None (as all functions mutating the input should). You can do return [unique, unique_code] instead.
After having fixed your return, you should use a better algorithm. Whenever you check if i not in unique, this linearly checks the list unique for the value i, making it O(n^2) in total.
This is using the itertools recipe unique_everseen, which keeps the original order and is O(n), because it uses a set to keep track of already seen letters:
from itertools import filterfalse
def unique_everseen(iterable):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    seen = set()
    seen_add = seen.add
    for element in filterfalse(seen.__contains__, iterable):
        seen_add(element)
        yield element
def crack_the_code(msg, code):
    return [list(unique_everseen(msg)), list(unique_everseen(code))]
If you cannot use itertools, you can also write it yourself (probably slightly slower):
def unique_everseen(iterable):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    seen = set()
    seen_add = seen.add
    for element in iterable:
        if element not in seen:
            seen_add(element)
            yield element
And if you don't care about the order, just use set:
def crack_the_code(msg, code):
    return [list(set(msg)), list(set(code))]
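If order does matter and you would rather not carry the recipe around, another option is to lean on dict keeping insertion order (a language guarantee from Python 3.7). This is a sketch of that alternative, not part of the itertools recipe. Note that the space counts as a unique character too.

```python
def crack_the_code(msg, code):
    # dict.fromkeys keeps only the first occurrence of each character, in order
    return [list(dict.fromkeys(msg)), list(dict.fromkeys(code))]

result = crack_the_code('hello there', 'abcd eabfb')
print(result)
# [['h', 'e', 'l', 'o', ' ', 't', 'r'], ['a', 'b', 'c', 'd', ' ', 'e', 'f']]
```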
Swap your extend with append and your append with extend. I think you got them confused in terms of functionality.
You append a single element to a list.
You extend a list with the elements of another iterable.
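A small illustration of the difference, using a made-up list called letters. It also shows why assigning the result of append gives None.

```python
letters = ['a']

letters.append('bc')        # the string is added as one element
print(letters)              # ['a', 'bc']

letters.extend('de')        # each item of the iterable is added separately
print(letters)              # ['a', 'bc', 'd', 'e']

letters.append(['f', 'g'])  # appending a list nests it
print(letters)              # ['a', 'bc', 'd', 'e', ['f', 'g']]

returned = letters.append('h')  # append mutates in place and returns None
print(returned)                 # None
```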
Also, you used item as the loop variable in the second for loop but were adding i to the list. Change that to item and the code below works:
def crack_the_code(msg, code):
    unique = []
    unique_code = []
    cracked = []
    for i in msg:
        if i not in unique:
            unique.append(i)
    for item in code:
        if item not in unique_code:
            unique_code.append(item)
    # build a list of two sublists, as the question expects
    cracked = [unique, unique_code]
    return cracked
print(crack_the_code('hello there', 'abcd eabfb'))
I have a bit of code that runs many thousands of times in my project:
def resample(freq, data):
    output = []
    for i, elem in enumerate(freq):
        for _ in range(elem):
            output.append(data[i])
    return output
e.g. resample([1,2,3], ['a', 'b', 'c']) => ['a', 'b', 'b', 'c', 'c', 'c']
I want to speed this up as much as possible. It seems like a list comprehension could be faster. I have tried:
def resample(freq, data):
    return [item for sublist in [[data[i]]*elem for i, elem in enumerate(freq)] for item in sublist]
Which is hideous and also slow because it builds the list and then flattens it. Is there a way to do this with one line list comprehension that is fast? Or maybe something with numpy?
Thanks in advance!
edit: Answer does not necessarily need to eliminate the nested loops, fastest code is the best
I highly suggest using generators like so:
from itertools import repeat, chain
def resample(freq, data):
    return chain.from_iterable(map(repeat, data, freq))
This will probably be among the fastest methods: in CPython, map(), repeat() and chain.from_iterable() are all implemented in C, so the whole pipeline runs without a Python-level loop.
As for a small explanation:
repeat(i, n) returns an iterator that repeats an item i, n times.
map(repeat, data, freq) returns an iterator that calls repeat every time on an element of data and an element of freq. Basically an iterator that returns repeat() iterators.
chain.from_iterable() flattens the iterator of iterators to return the end items.
No list is created on the way, so there is no overhead and as an added benefit - you can use any type of data and not just one char strings.
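The three stages described above can be inspected one at a time. This sketch materialises each stage with list() purely so it can be printed; the variable names are illustrative.

```python
from itertools import chain, repeat

data = ['a', 'b', 'c']
freq = [1, 2, 3]

# Stage 1: repeat(item, n) yields item n times
single = list(repeat('b', 2))
print(single)    # ['b', 'b']

# Stage 2: map pairs each element of data with its frequency
per_item = [list(r) for r in map(repeat, data, freq)]
print(per_item)  # [['a'], ['b', 'b'], ['c', 'c', 'c']]

# Stage 3: chain.from_iterable flattens the iterator of iterators
flat = list(chain.from_iterable(map(repeat, data, freq)))
print(flat)      # ['a', 'b', 'b', 'c', 'c', 'c']
```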
While I don't suggest it, you are able to convert it into a list() like so:
result = list(resample([1,2,3], ['a','b','c']))
import itertools
def resample(freq, data):
    return itertools.chain.from_iterable([el]*n for el, n in zip(data, freq))
Besides being faster, this also has the advantage of being lazy: it returns an iterator, and the elements are generated step by step.
No need to create lists at all, just use a nested loop:
[e for i, e in enumerate(data) for j in range(freq[i])]
# ['a', 'b', 'b', 'c', 'c', 'c']
You can just as easily make this lazy by removing the brackets:
(e for i, e in enumerate(data) for j in range(freq[i]))
The function satisfiesF() takes a list L of strings as a parameter. The function f takes a string as a parameter and returns True or False. satisfiesF() modifies L to contain only those strings s for which f(s) returns True.
I have two different programs aimed to produce the same output. But I am getting different outputs.
First program:
def f(s):
    return 'a' in s
def satisfiesF(L):
    k=[]
    for i in L:
        if f(i)==True:
            k.append(i)
    L=k
    print L
    print
    return len(L)
L = ['a', 'b', 'a']
print satisfiesF(L)
print L
Output:
['a', 'a']
2
['a', 'b', 'a']
Second program:
def f(s):
    return 'a' in s
def satisfiesF(L):
    for i in L:
        if f(i)==False:
            L.remove(i)
    print L
    print
    return len(L)
L = ['a', 'b', 'a']
print satisfiesF(L)
print L
output:
['a', 'a']
2
['a', 'a']
Please explain why these are giving different outputs.
In your first function you see 2 as the length but all the original elements in L outside the function, because L=k only makes the local variable L refer to k; the L created outside the function is not affected. To see the change outside, you would need to use L[:] = k. Printing L after the call then gives ['a', 'a'], because you are changing the original list object that was passed in to the function.
In the second you are directly modifying L, so you see the changes in L outside the function.
Also, never iterate over a list you are removing elements from. If you make
L = ['a', 'b', 'a','a','d','e','a'], you will get behaviour you won't expect. Either iterate over a copy with for i in L[:]: or iterate in reverse with for i in reversed(L):
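A minimal sketch of that surprise and the two suggested fixes, written for Python 3. The helper names are made up for illustration, and the filter is the same 'a' in s test as in the question.

```python
def keep_a_buggy(L):
    for i in L:                 # iterating over L while removing from it
        if 'a' not in i:
            L.remove(i)
    return L

def keep_a_copy(L):
    for i in L[:]:              # iterate over a shallow copy instead
        if 'a' not in i:
            L.remove(i)
    return L

def keep_a_reversed(L):
    for i in reversed(L):       # walk from the end, so removals do not shift unseen items
        if 'a' not in i:
            L.remove(i)
    return L

sample = ['a', 'b', 'a', 'a', 'd', 'e', 'a']
buggy = keep_a_buggy(list(sample))
print(buggy)                          # ['a', 'a', 'a', 'e', 'a']  ('e' was skipped)
print(keep_a_copy(list(sample)))      # ['a', 'a', 'a', 'a']
print(keep_a_reversed(list(sample)))  # ['a', 'a', 'a', 'a']
```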
In the first function, you assign over L in satisfiesF(), but you never modify the original list. When you write L=k, that makes the reference L now refer to the same list as k. It doesn't assign to the original list.
In contrast, in the second function you modify L without reassigning to it.
Also, as a side note, you shouldn't modify a list while you iterate over it.
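As a second side note, you can rewrite the body of satisfiesF as a one-liner (using slice assignment so that the original list is modified rather than a local name rebound):
L[:] = [item for item in L if f(item)]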
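The difference between rebinding the parameter and mutating the list it refers to can be seen side by side in this Python 3 sketch. The two function names are hypothetical variants of satisfiesF.

```python
def f(s):
    return 'a' in s

def satisfies_rebind(L):
    L = [s for s in L if f(s)]     # the local name L now refers to a new list
    return len(L)

def satisfies_in_place(L):
    L[:] = [s for s in L if f(s)]  # replaces the contents of the caller's list
    return len(L)

first = ['a', 'b', 'a']
second = ['a', 'b', 'a']
n_first = satisfies_rebind(first)
n_second = satisfies_in_place(second)
print(n_first, first)    # 2 ['a', 'b', 'a']
print(n_second, second)  # 2 ['a', 'a']
```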
This was downvoted mistakenly. The question was changed, so the answer became outdated. The following is the answer to the changed question:
L=k
The above means that we lose the reference to the original L.
So, try this:
In the first program, comment out the above assignment and do the following instead, to retain the reference to L:
# L=k
del L[:]
L[:] = k
Now both programs will produce the same output:
['a', 'a']
2
['a', 'a']
Best of luck.
In the question, there are two Ls: a global one and a local one. The
print L
statement at the end prints the global L, which the first program never mutates.
Therefore, to let the program know that you want to change the global L instead of the local one, you can add the line
globals()['L'] = L
to your first program. Note that this rebinds the global name to the new list rather than mutating the original list object. I hope this helps!
In the first program, if you want to mutate the original list L and see the change made by your function, you should replace L = k in your code with L[:] = k:
def satisfiesF(L):
    k=[]
    for i in L:
        if f(i)==True:
            k.append(i)
    # new code --------
    L[:] = k # before: L = k
    # -----------------
    print L
    print
    return len(L)
This will give you ['a', 'a'] outside the function.
About mutating a list within a loop in the second program:
Keep in mind that during a "for" loop, Python keeps track of where it is in the list using an internal counter that is incremented at the end of each iteration.
When the value of the counter reaches the current length of the list, the loop terminates. This means that mutating the list within the loop can have surprising consequences.
For example, look at the for loop below:
>>> my_list = ['a', 'b', 'c', 'd']
>>> print "my_list - before loop: ", my_list
my_list - before loop: ['a', 'b', 'c', 'd']
>>> for char in my_list:
...     if char == 'a' or char == 'b':
...         my_list.remove(char)
...
>>> print "my_list - after loop: ", my_list
my_list - after loop: ['b', 'c', 'd']
Here, the hidden counter starts out at index 0, finds that "a" (at my_list[0]) matches the condition, and removes it, reducing the length of my_list to 3 (from the initial 4). The counter is then incremented to 1, and the code proceeds to check the "if" condition at position my_list[1] of the mutated list (now of len == 3). This means that the loop skips the "b" character (now at index 0) even though it should have been removed.
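The hidden counter can be imitated with an explicit index to make the skipped element visible. This is only an emulation under the description above, written for Python 3, with a visited list added for illustration; it is not how the interpreter is literally implemented.

```python
my_list = ['a', 'b', 'c', 'd']
visited = []   # records which elements the loop actually looks at
index = 0      # stands in for the loop's internal counter

while index < len(my_list):   # length is re-checked on every iteration
    char = my_list[index]
    visited.append(char)
    if char == 'a' or char == 'b':
        my_list.remove(char)  # shifts every later element one place left
    index += 1

print(visited)  # ['a', 'c', 'd']  ('b' was never examined)
print(my_list)  # ['b', 'c', 'd']
```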
One solution for this is to use slicing to clone the list, then iterate over the clone while removing items from the original:
cloneList = my_list[:]
I'm writing a program in which I have two lists of unequal length. One of the lists has other lists nested inside it, so I flattened it, and now I'm trying to compare the values in the two lists to find matching pairs and then append them back to the original unflattened list. The program still doesn't produce the result I expect. I've approached this problem in two different ways, but I arrive at the same answer both times. Here's what I've tried so far:
import itertools
List1 = [['A'],['b']]
List2 = ['a','A','c','A','b','b','A','b' ]
flattened = list(itertools.chain(*List1))
''' a counter I created to keep List1 from going out of range and crashing
the program
'''
coun = len(flattened)
coun-=1
x = 0
for idx, i in enumerate(List2):
    if i in flattened:
        List1[x].append(List2[idx])
        if x < coun:
            x +=1
print(List1)
And this is the second approach I tried, using itertools to zip the two unequal lists:
import itertools
List1 = [['A'],['b']]
List2 = ['a','A','c','A','b','b','A','b' ]
flattened = list(itertools.chain(*List1))
''' a counter i created to keep List1 going out of range and crashing
the program
'''
coun = len(flattened)
coun-=1
x = 0
for idx,zipped in enumerate(itertools.zip_longest(flattened,List2)):
result = filter(None, zipped)
for i in result:
if flattened[x] == List2[idx]:
List1[x].append(List2[idx])
if x < coun:
x +=1
print(List1)
Both programs produce the output
[['A', 'A'], ['b', 'A', 'b', 'b', 'A', 'b']]
But I'm trying to arrive at
[['A', 'A', 'A', 'A'], ['b', 'b', 'b', 'b']]
I don't even know if I'm approaching this in the right way, but I know the problem comes from the flattened list not being the same length as List2, and I can't seem to find any way around it. By the way, I'm still a newbie in Python, so please try to explain your answers so I can learn from you. Thanks.
EDIT: This is how I get and set the properties of the objects using the values entered by the user. It lacks type checking for now, but that can be added later.
class criticalPath:
    def __init__(self):
        '''
        Initialize all the variables we're going to use to calculate the critical path
        '''
        self.id = None
        self.pred = tuple()
        self.dur = None
        self.est = None
        self.lst = None
        #list to store all the objects
        self.all_objects = list()
    def create_objects(self):
        return criticalPath()
    def get_properties(self):
        '''
        This functions gets all the input from the user and stores the
        activity name a string, the predecessor in a tuple and the duration
        in a string
        '''
        r = criticalPath()
        Object_list = list()
        num_act = int(input('How many activities are in the project:\n'))
        for i in range(num_act):
            name = input('what is the name of the activity {}:\n'.format(i+1))
            activity_object = r.create_objects()
            pred = input('what is the predecessor(s) of the activity:\n')
            pred = tuple(pred.replace(',', ''))
            dur = input('what is the duration of the activity:\n')
            #sets the properties of the objects from what was gotten from the user
            activity_object.set_properties(name, pred, dur)
            #****
            Object_list.append(activity_object)
            self.all_objects.append(activity_object)
        return Object_list
    def set_properties(self, name, predecessor, duration):
        self.id = name
        self.pred = predecessor
        self.dur = duration
So all_objects and Object_list are both lists of all the objects created.
If your values are immutable, use a collections.Counter dict to count the occurrences of the elements in List2 and multiply each sublist by that count (plus one for the element already there):
List1 = [['A'],['b']]
List2 = ['a','A','c','A','b','b','A','b' ]
from collections import Counter
# gets the frequency count of each element in List2
c = Counter(List2)
# create frequency + 1 objects using the value from our Counter dict
# which is how many times it appears in List2
print([sub * (c[sub[0]] + 1) for sub in List1])
[['A', 'A', 'A', 'A'], ['b', 'b', 'b', 'b']]
You can change the original object using [:]:
List1[:] = [sub * c[sub[0]]+sub for sub in List1]
To do it using enumerate and updating List1:
from collections import Counter
c = Counter(List2)
from copy import deepcopy
# iterate over a copy of List1
for ind, ele in enumerate(deepcopy(List1)):
    # iterate over the sub elements
    for sub_ele in ele:
        # keep original object and add objects * frequency new objects
        List1[ind].extend([sub_ele] * c[sub_ele])
print(List1)
If you have mutable values you will need to make copies or create new objects in the generator expression depending on how they are created:
from copy import deepcopy
for ind, ele in enumerate(deepcopy(List1)):
    for sub_ele in ele:
        List1[ind].extend(deepcopy(sub_ele) for _ in range(c[sub_ele]))
print(List1)
There is no need to check for objects as objects not in List2 will have a value of 0 so 0 * object == no object added.
Based on the edit you can either check every node against every node or use a dict grouping common nodes:
Checking every node:
from copy import deepcopy
for ind, st_nodes in enumerate(starting_nodes):
    for node in object_list:
        if st_nodes[0].id == node.pred:
            starting_nodes[ind].append(deepcopy(node))
print(starting_nodes)
Using a dict grouping all nodes by the pred attribute:
from copy import deepcopy
from collections import defaultdict
nodes = defaultdict(list)
for node in object_list:
    nodes[node.pred].append(node)
for ind, st_nodes in enumerate(starting_nodes):
    starting_nodes[ind].extend(deepcopy(nodes.get(st_nodes[0].id,[])))
For larger input the dict option should be more efficient.
Try this:
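# List1 is nested and lists are unhashable, so flatten it while building the set
matches = {x for sub in List1 for x in sub} & set(List2)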
I have a nested list:
nested_list = [['a', 3], ['a', 1], ['a', 5]]
How do I iterate over this list, select the sublist with the max integer value?
holder = []
for entry in nested_list:
tmp = sublist with max entry[2] value
holder.append(tmp)
I am stuck on coding the second line.
Try:
max(nested_list, key=lambda x: x[1])
or
import operator
max(nested_list, key=operator.itemgetter(1))
If the first item will always be 'a', you can just do
max(nested_list)
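A short sketch of how these variants behave. The second list, mixed, is invented here to show why the plain max(nested_list) shortcut depends on the first items being equal: lists compare element by element, so the first item decides unless it ties.

```python
from operator import itemgetter

nested_list = [['a', 3], ['a', 1], ['a', 5]]

by_lambda = max(nested_list, key=lambda x: x[1])
by_getter = max(nested_list, key=itemgetter(1))
plain = max(nested_list)   # ties on 'a', so the integer decides
print(by_lambda, by_getter, plain)  # ['a', 5] ['a', 5] ['a', 5]

mixed = [['b', 1], ['a', 5]]        # hypothetical input with differing first items
plain_mixed = max(mixed)            # 'b' > 'a', so the integer is never compared
keyed_mixed = max(mixed, key=itemgetter(1))
print(plain_mixed, keyed_mixed)     # ['b', 1] ['a', 5]
```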
If you're willing to dive into some type checking and you want to do this for arbitrary sublists (of one level only, something like [12, 'a', 12, 42, 'b']), you can do something like:
import numbers
max(nested_list, key=lambda x: max(i for i in x
                                   if isinstance(i, numbers.Integral)))
In any case, if you're not sure that the elements of nested_list are in fact lists, you can do
import collections.abc
max((s for s in nested_list
     if isinstance(s, collections.abc.Sequence)),
    key=some_key_function)
and just pass it a key function of your own devising or one of the other ones in this answer.
In terms of the lambda x: x[1] vs. operator.itemgetter(1) question, I would profile. In principle, itemgetter should be the one right way, but I've seen operator solutions get outperformed by lambda functions due to 'bugs' (I use the term loosely; the code still works) in operator. My preference would be for itemgetter if performance doesn't matter (and probably if it does), but some people like to avoid the extra import.
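One way to do that profiling is with the standard timeit module. This is a rough sketch on made-up data; the list size and repeat count are arbitrary choices, and the timings will vary between machines and Python versions, so no expected numbers are shown.

```python
import timeit
from operator import itemgetter

# Hypothetical test data: 1000 two-item sublists
nested_list = [['a', i % 97] for i in range(1000)]

lambda_time = timeit.timeit(
    lambda: max(nested_list, key=lambda x: x[1]), number=200)
getter_time = timeit.timeit(
    lambda: max(nested_list, key=itemgetter(1)), number=200)

print("lambda:     {:.4f}s".format(lambda_time))
print("itemgetter: {:.4f}s".format(getter_time))
```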
Does this do what you want?
biggest = nested_list[0]
for entry in nested_list:
    if entry[1] > biggest[1]:
        biggest = entry
If the list is as simple as you suggest:
>>> nested_list = [['a', 3], ['a', 1], ['a', 5], ['a',2]]
>>> k = sorted(nested_list)
>>> k[-1]
['a', 5]
>>>