How would you 1) insert 1's between any two adjacent 5's? and then 2) insert a value into the second list at the same index as the 1 that was inserted into the first list?
For example,
list1 = [ 5, 1, 5, 5, 5, 1, etc.]
would become
list1 = [ 5, 1, 5, 1, 5, 1, 5, 1, etc.]
And,
list2 = [ val, val, val, val, val, val, etc.]
would become
list2 = [ val, val, val, Nval, val, Nval, etc.]
(Nval up above = the added value)
I'm a beginner so help is greatly appreciated :O)
You'll want to look at pairs of consecutive values. To do that, pair the list with its last item cut off (list1[:-1]) with the same list with its first item cut off (list1[1:]). (The slice notation used here is introduced in the official Python tutorial and explained in this answer.)
zip(list1[:-1], list1[1:])
(The zip function turns a pair of sequences into a sequence of pairs and is introduced here in the tutorial and documented here.)
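To make the pairing step concrete, here is a minimal sketch using the example list from the question. The name `pairs` is introduced only for this illustration.

```python
list1 = [5, 1, 5, 5, 5, 1]

# Pair each element with its right-hand neighbour.
pairs = list(zip(list1[:-1], list1[1:]))
print(pairs)  # [(5, 1), (1, 5), (5, 5), (5, 5), (5, 1)]
```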
Let's see which of these pairs are (5, 5):
[pair == (5, 5) for pair in zip(list1[:-1], list1[1:])]
The feature used here is a list comprehension, a way of writing a (new) list by giving a rule to construct it from an existing iterable.
What are the indices of these pairs? Let's number them with enumerate:
[n for (n, pair) in enumerate(zip(list1[:-1], list1[1:])) if pair == (5, 5)]
This is another list comprehension, this time with a condition for the elements (a "predicate"). Note that enumerate returns pairs of numbers and values (which are our original pairs), and that we use implicit unpacking to get them into the loop variables n and pair respectively.
list.insert(i, new_value) inserts the new value before the item currently at index i. Thus to find the positions where to insert into the original list (and into list2), we need to add 1 to the pair indices:
idxs = [n + 1 for (n, pair) in enumerate(zip(list1[:-1], list1[1:])) if pair == (5, 5)]
for i in reversed(idxs):
    list1.insert(i, 1)
    list2.insert(i, 'Nval')
(We insert in reverse order, so as to not move the pairs between which we have yet to insert.)
You can recover the insertion indices with a single list comprehension. You are looking for indices i such that 5 == list1[i-1] == list1[i].
You then need to insert in decreasing order of indices.
list1 = [5, 1, 5, 5, 5, 1]
list2 = [val, val, val, val, val, val]
indices = [i for i in range(1, len(list1)) if 5 == list1[i-1] == list1[i]]
for i in reversed(indices):
    list1.insert(i, 1)
    list2.insert(i, Nval)
print(list1) # [5, 1, 5, 1, 5, 1, 5, 1]
print(list2) # [val, val, val, Nval, val, Nval, val, val]
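As an alternative to repeated list.insert calls, the two new lists can also be built in a single pass. This is only a sketch of a different technique, not the method used in the answers above. The function name `insert_between_fives` and the parameters `filler` and `marker` are hypothetical names chosen for the example.

```python
def insert_between_fives(list1, list2, filler=1, marker="Nval"):
    """Build new lists instead of mutating the originals (illustrative helper)."""
    new1, new2 = [], []
    for i, (a, b) in enumerate(zip(list1, list2)):
        # A filler goes in front of a 5 whose left neighbour is also a 5.
        if i > 0 and a == 5 and list1[i - 1] == 5:
            new1.append(filler)
            new2.append(marker)
        new1.append(a)
        new2.append(b)
    return new1, new2


new1, new2 = insert_between_fives([5, 1, 5, 5, 5, 1], ["val"] * 6)
print(new1)  # [5, 1, 5, 1, 5, 1, 5, 1]
print(new2)  # ['val', 'val', 'val', 'Nval', 'val', 'Nval', 'val', 'val']
```

Because nothing is inserted into an existing list, no index shifts and no reversed iteration is needed.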
You can use itertools.groupby:
import itertools
list1 = [5, 1, 5, 5, 5, 1]
copied = iter(['val' for _ in list1])
grouped = [[a, list(b)] for a, b in itertools.groupby(list1)]
new_result = [list(b) if a != 5 else [[5, 1] if c < len(b) - 1 else [5] for c, _ in enumerate(b)] for a, b in grouped]
final_result = [[i] if not isinstance(i, list) else i for b in new_result for i in b]
new_copied = [[next(copied)] if len(i) == 1 else [next(copied), 'Nval'] for i in final_result]
new_list1, new_list2 = list(itertools.chain(*final_result)), list(itertools.chain(*new_copied))
Output:
[5, 1, 5, 1, 5, 1, 5, 1]
['val', 'val', 'val', 'Nval', 'val', 'Nval', 'val', 'val']
list1 = [5,1,5,5,5,1,5,5,1,5,5,5,5]
list2 = []
templist = []
for idx, val in enumerate(list1):
    templist.append(val)
    list2.append(val)
    # insert after val only when a next element exists and both are 5
    if idx + 1 < len(list1) and val == 5 and list1[idx + 1] == 5:
        templist.append(1)
        list2.append(42)
Gives me as output:
templist [5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5]
list2 [5, 1, 5, 42, 5, 42, 5, 1, 5, 42, 5, 1, 5, 42, 5, 42, 5, 42, 5]
And to finish off:
list1 = templist
One-line solution based on zip and reduce:
from functools import reduce
new_val = 10 # value to use for list2 (Nval)
new_list1 = []
new_list2 = []
reduce(lambda x, y: ((y[0] == 5 and x == 5) and (new_list1.append(1) or new_list2.append(new_val))) or
new_list1.append(y[0]) or new_list2.append(y[1]) or y[0],
zip(list1, list2), (None, new_list1, new_list2))
I want to solve the following problem in one line. I looked at the itertools docs to find a specific function for this but no luck. Say
a = [1, 2, 3,4]
b = [5, 6, 7, 8]
I want to insert the elements of b into a, but each element into a specific index. So
insert_function(a, b, keys=[0,1, 2, 3])
should return
[5, 1, 6, 2, 7, 3, 8, 4]
One approach:
def insert_function(la, lb, keys=None):
    ii = [i + k for i, k in enumerate(keys)]
    i, j = 0, 0
    ret = []
    for r in range(len(la) + len(lb)):
        if r in ii:
            ret.append(lb[j])
            j += 1
        else:
            ret.append(la[i])
            i += 1
    return ret
res = insert_function(a, b, keys=[0, 1, 2, 3])
print(res)
Output
[5, 1, 6, 2, 7, 3, 8, 4]
Or as an alternative use this one-liner list comprehension with O(n + m) time complexity:
def insert_function(la, lb, keys=None):
    ii = set(i + k for i, k in enumerate(keys))
    # iterate over the arguments, not the global lists a and b
    it_a, it_b = iter(la), iter(lb)
    return [next(it_a) if r not in ii else next(it_b) for r in range(len(la) + len(lb))]
res = insert_function(a, b, keys=[0, 1, 2, 3])
print(res)
The problem is that when you insert elements you shift the subsequent indices. That can be prevented by inserting from right to left:
def insert_function(a, b, keys):
    # copy so original list is left intact
    a = a.copy()
    # sort the keys in reverse
    for k,v in sorted(zip(keys, b), key=lambda x:x[0], reverse=True):
        a.insert(k, v)
    return a
# returns [5, 1, 6, 2, 7, 3, 8, 4]
print(insert_function(a, b, [0, 1, 2, 3]))
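This should work, provided keys is sorted in ascending order. Each earlier insertion shifts the later positions by one, so the loop index is added to the key:
def insert_function(a, b, keys):
    assert len(b) == len(keys)
    for i in range(len(b)):
        a.insert(keys[i] + i, b[i])
    return a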
Say I have a list like this:
l = [1, 2, 3, 4, 5, 3]
how do I get the indexes of those 3s that have been repeated?
First you need to figure out which elements are repeated and where. I do it by indexing it in a dictionary.
Then you need to extract all repeated values.
from collections import defaultdict
l = [1, 2, 3, 4, 5, 3]
_indices = defaultdict(list)
for index, item in enumerate(l):
    _indices[item].append(index)
for key, value in _indices.items():
    if len(value) > 1:
        # Do something with them
        print(key, value)
Output:
3 [2, 5]
Another option would be to filter them out like so:
duplicates_dict = {key: indices for key, indices in _indices.items() if len(indices) > 1}
You could use a dictionary comprehension to get all the repeated numbers and their indexes in one go:
L = [1, 2, 3, 4, 5, 3, 8, 9, 9, 8, 9]
R = { n:rep[n] for rep in [{}] for i,n in enumerate(L)
if rep.setdefault(n,[]).append(i) or len(rep[n])==2 }
print(R)
{3: [2, 5],
9: [7, 8, 10],
8: [6, 9]}
The equivalent using a for loop would be:
R = dict()
for i, n in enumerate(L):
    R.setdefault(n, []).append(i)
R = {n: rep for n, rep in R.items() if len(rep) > 1}
Counter from collections could be used to avoid the unnecessary creation of single item lists:
from collections import Counter
counts = Counter(L)
R = dict()
for i, n in enumerate(L):
    if counts[n] > 1:
        R.setdefault(n, []).append(i)
Find the duplicates, then loop through the list to find the corresponding index locations. Not the most efficient, but it works:
input_list = [1,4,5,7,1,2,4]
duplicates = input_list.copy()
for x in set(duplicates):
    duplicates.remove(x)
duplicates = list(set(duplicates))
dict_duplicates = {}
for d in duplicates:
    l_ind = []
    dict_duplicates[d] = l_ind
    for i in range(len(input_list)):
        if d == input_list[i]:
            l_ind.append(i)
dict_duplicates
If you want to access and use all of them, you can iterate over the position on the list, and specify this position in 'index' function.
l = [1, 2, 3, 4, 5, 3]
repeated_indexes = []
pos = 0
for item in l:
    if item == 3:
        index = l.index(item, pos)
        repeated_indexes.append(index)
    pos += 1
See the documentation of the list index method here: https://docs.python.org/3/tutorial/datastructures.html#more-on-lists
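The same idea can be generalised to any value by letting list.index raise ValueError when no further match exists. A minimal sketch follows; the helper name `all_indexes` is hypothetical and not part of the answer above.

```python
def all_indexes(seq, value):
    """Collect every index at which value occurs in seq (illustrative helper)."""
    found = []
    pos = 0
    while True:
        try:
            # Search from pos onwards; raises ValueError when nothing is left.
            pos = seq.index(value, pos)
        except ValueError:
            return found
        found.append(pos)
        pos += 1


print(all_indexes([1, 2, 3, 4, 5, 3], 3))  # [2, 5]
```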
I have the following list of numbers:
List = [1, 2, 3, 4, 5, 6, 15]
I want the indexes of those numbers which are multiple of n, so I do:
def indexes(List, n):
    # to enumerate the numbers
    E = enumerate(List)
    # filtering tuples
    F = list(filter(lambda x: x[1] % n == 0, E))
    return [ i[0] for i in F]
indexes(List, 2)
[1, 3, 5]
That's ok, but what happens when I add the variable m?
def Index(L, n, m):
    # enumeration
    E = enumerate(L)
    # filtering tuples
    F_n = list(filter(lambda x: x[1]%n == 0, E))
    F_m = list(filter(lambda x: x[1]%m == 0, E))
    L_n = [ l[0] for l in F_n]
    L_m = [ J[0] for J in F_m]
    return L_n + L_m
>>> Index(List, 2, 5)
[1, 3, 5]
Why doesn't that code return [1, 3, 5, 4, 6]?
What is the mistake?
And how to create the function that returns that list?
You can use a list comprehension in combination with the enumerate function.
You can also use the extended iterable unpacking operator (*) to pass as many parameters as you need.
List = [1, 2, 3, 4, 5, 6, 15]
def indexes(List, *vars):
    # loop over the divisors first, so the indices come out grouped per divisor
    return [index for i in vars for index, item in enumerate(List) if item % i == 0]
print(indexes(List, 2, 5))
Output
[1, 3, 5, 4, 6]
A more general and Pythonic approach that works for any number of variables is to use the any() or all() function, which checks the truth value of the condition across all the arguments. If you want the indices of items that are divisible by all the arguments, you need all(); otherwise you can use any(), which returns True as soon as it encounters a match.
def indexes(lst, *args):
    return [i for i, j in enumerate(lst) if any(j % arg == 0 for arg in args)]
Demo:
>>> lst = [1, 2, 3, 4, 5, 6, 15, 99, 200, 13, 17, 400]
>>> indexes(lst, 99, 5, 2, 100)
[1, 3, 4, 5, 6, 7, 8, 11]
>>>
And with all():
>>> indexes(lst, 5, 2, 100)
[8, 11]
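The function body for the all() variant is not shown above, so here is a minimal sketch of what it might look like. The name `indexes_all` is an assumption, used to keep it apart from the any() version.

```python
def indexes_all(lst, *args):
    # Keep an index only if its item is divisible by every argument.
    return [i for i, j in enumerate(lst) if all(j % arg == 0 for arg in args)]


lst = [1, 2, 3, 4, 5, 6, 15, 99, 200, 13, 17, 400]
print(indexes_all(lst, 5, 2, 100))  # [8, 11]
```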
The issue is enumerate returns an iterator from an iterable. Once it is exhausted, you may not use it again. Therefore, you can simply define a new enumerate iterator:
lst = [1, 2, 3, 4, 5, 6, 15]
def Index(L, n, m):
    # enumeration - notice we define 2 objects
    E, F = enumerate(L), enumerate(L)
    F_n = list(filter(lambda x: x[1]%n == 0, E))
    F_m = list(filter(lambda x: x[1]%m == 0, F))
    L_n = [ l[0] for l in F_n]
    L_m = [ J[0] for J in F_m]
    return L_n + L_m
res = Index(lst, 2, 5)
print(res)
[1, 3, 5, 4, 6]
Note there are better ways you can implement your algorithm.
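To see the exhaustion directly, and one possible simpler implementation, consider the following sketch. The function name `index_multiples` is hypothetical; it materialises the enumeration once so that it can be reused.

```python
lst = [1, 2, 3, 4, 5, 6, 15]

e = enumerate(lst)
first_pass = list(e)   # consumes the iterator
second_pass = list(e)  # nothing is left to yield
print(len(first_pass))  # 7
print(second_pass)      # []


def index_multiples(values, n, m):
    pairs = list(enumerate(values))  # a list can be traversed repeatedly
    return ([i for i, v in pairs if v % n == 0] +
            [i for i, v in pairs if v % m == 0])


print(index_multiples(lst, 2, 5))  # [1, 3, 5, 4, 6]
```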
Title is definitely confusing, so here's an example: Say I have a list of values [1,2,3,2,1,4,5,6,7,8]. I want to remove between the two 1s in the list, and by pythonic ways it will also end up removing the first 1 and output [1,4,5,6,7,8]. Unfortunately, due to my lack of pythonic ability, I have only been able to produce something that removes the first set:
a = [1,2,3,2,1,4,5,6,7]
uniques = []
junks = []
for value in a:
    junks.append(value)
    if value not in uniques:
        uniques.append(value)
for value in uniques:
    junks.remove(value)
for value in junks:
    a.remove(value)
    a.remove(value)
a[0] = 1
print(a)
[1,4,5,6,7]
Works with the first double occurrence and will not work with the next occurrence in a larger list. I have an idea which is to remove between the index of the first occurrence and the second occurrence which will preserve the second and not have me do some dumb thing like a[0] = 1 but I'm really not sure how to implement it.
Would this do what you asked:
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8]
def f(l):
    x = l.copy()
    for i in l:
        if x.count(i) > 1:
            first_index = x.index(i)
            second_index = x.index(i, first_index + 1)
            x = x[:first_index] + x[second_index:]
    return x
So the output of f(a) would be [1, 4, 5, 6, 7, 8] and the output of f([1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]) would be [1, 4, 5, 15, 16].
If you want to find the unique elements, you can use set and list:
mylist = list(set(mylist))
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]
dup = [x for x in a if a.count(x) > 1] # list of duplicates
while dup:
    pos1 = a.index(dup[0])
    pos2 = a.index(dup[0], pos1+1)
    a = a[:pos1]+a[pos2:]
    dup = [x for x in a if a.count(x) > 1]
print(a) #[1, 4, 5, 15, 16]
A more efficient solution would be
a = [1, 2, 3, 2, 1, 4, 5, 6, 7, 8, 7, 6, 5, 15, 16]
pos1 = 0
while pos1 < len(a):
    if a[pos1] in a[pos1+1:]:
        pos2 = a.index(a[pos1], pos1+1)
        a = a[:pos1]+a[pos2:]
    pos1 += 1
print(a) #[1, 4, 5, 15, 16]
(This probably isn't the most efficient way, but hopefully it helps)
Couldn't you just check if something appears twice, if it does you have firstIndex, secondIndex, then:
a = [1, 2, 3, 4, 5, 1, 7, 8, 9]
b = []
# get the first and second index of the repeated number (here: 1)
firstIndex = a.index(1)
secondIndex = a.index(1, firstIndex + 1)
for index in range(0, len(a)):
    print(index)
    if firstIndex < index < secondIndex:
        print("We removed: " + str(a[index]))
    else:
        b.append(a[index])
print(b)
The output is [1,1,7,8,9] which seems to be what you want.
To do the job you need:
the first and the last position of duplicated values
all indexes between, to remove them
Funny thing is, you can simply tell python to do this:
# we can use a 'smart' dictionary, that can construct default value:
from collections import defaultdict
# and 'chain' to flatten lists (ranges)
from itertools import chain
a = [1, 2, 3, 2, 1, 4, 5, 6, 7]
# build dictionary where each number is key, and value is list of positions:
index = defaultdict(list)
for i, item in enumerate(a):
    index[item].append(i)
# let's take first only and last index for non-single values
edges = ((pos[0], pos[-1]) for pos in index.values() if len(pos) > 1)
# we can use range() to get us all index positions in-between
# ...use chain.from_iterable to flatten our list
# ...and make set of it for faster lookup:
to_remove = set(chain.from_iterable(range(start, end)
                                    for start, end in edges))
result = [item for i, item in enumerate(a) if i not in to_remove]
# expected: [1, 4, 5, 6, 7]
print(result)
Of course you can make it shorter:
index = defaultdict(list)
for i, item in enumerate([1, 2, 3, 2, 1, 4, 5, 6, 7]):
    index[item].append(i)
to_remove = set(chain.from_iterable(range(pos[0], pos[-1])
                                    for pos in index.values() if len(pos) > 1))
print([item for i, item in enumerate(a) if i not in to_remove])
This solution has linear complexity and should be pretty fast. The cost is additional memory for the dictionary and set, so you should be careful with huge data sets. But if you have a lot of data, other solutions that use lst.index will choke anyway, because they are O(n^2) with a lot of dereferencing and function calls.
I have a sorted list and would like to identify consecutive multiple numbers in that list. The list can contain consecutive multiples of different order, which makes it more difficult.
Some test cases:
[1,3,4,5] -> [[1], [3,4,5]]
[1,3,5,6,7] -> [[1], [3], [5,6,7]]
# consecutive multiples of 1 and 2 (or n)
[1,2,3,7,9,11] -> [[1,2,3], [7,9,11]]
[1,2,3,7,10,12,14,25] -> [[1,2,3], [7], [10,12,14], [25]]
# overlapping consecutives !!!
[1,2,3,4,6,8,10] -> [[1,2,3,4], [6,8,10]]
Now, I have no idea what I'm doing. What I have done is to group pairwise by the distance between numbers, which was a good start, but then I am having a lot of issues identifying which element in each pair goes where, i.e.
# initial list
[1,3,4,5]
# pairs of same distance
[[[1,3]], [[3,4], [4,5]]]
# algo to get the final result ?
[[1], [3,4,5]]
Any help is greatly appreciated.
EDIT: Maybe mentioning what I want this for would make it more clear.
I want to transform something like:
[1,5,10,11,12,13,14,15,17,20,22,24,26,28,30]
into
1, 5, 10 to 15 by 1, 17, 20 to 30 by 2
Here is a version that incorporates @Bakuriu's optimization:
MINIMAL_MATCH = 3
def find_some_sort_of_weird_consecutiveness(data):
    """
    >>> find_some_sort_of_weird_consecutiveness([1,3,4,5])
    [[1], [3, 4, 5]]
    >>> find_some_sort_of_weird_consecutiveness([1,3,5,6,7])
    [[1, 3, 5], [6], [7]]
    >>> find_some_sort_of_weird_consecutiveness([1,2,3,7,9,11])
    [[1, 2, 3], [7, 9, 11]]
    >>> find_some_sort_of_weird_consecutiveness([1,2,3,7,10,12,14,25])
    [[1, 2, 3], [7], [10, 12, 14], [25]]
    >>> find_some_sort_of_weird_consecutiveness([1,2,3,4,6,8,10])
    [[1, 2, 3, 4], [6, 8, 10]]
    >>> find_some_sort_of_weird_consecutiveness([1,5,10,11,12,13,14,15,17,20,22,24,26,28,30])
    [[1], [5], [10, 11, 12, 13, 14, 15], [17], [20, 22, 24, 26, 28, 30]]
    """
    def pair_iter(series):
        from itertools import tee
        _first, _next = tee(series)
        next(_next, None)
        for i, (f, n) in enumerate(zip(_first, _next), start=MINIMAL_MATCH - 1):
            yield i, f, n
    result = []
    while len(data) >= MINIMAL_MATCH:
        test = data[1] - data[0]
        if (data[2] - data[1]) == test:
            for i, f, n in pair_iter(data):
                if (n - f) != test:
                    i -= 1
                    break
        else:
            i = 1
        data, match = data[i:], data[:i]
        result.append(match)
    for d in data:
        result.append([d])
    return result
if __name__ == '__main__':
    from doctest import testmod
    testmod()
It handles all your current test cases. Give me new failing test cases if you have any.
As mentioned in comments below, I am assuming that the shortest sequence is now three elements since a sequence of two is trivial.
See http://docs.python.org/2/library/itertools.html for an explanation of the pairwise iterator.
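The question's EDIT asks for a textual summary such as "10 to 15 by 1". Assuming the grouped result has the nested-list shape returned above, a small formatting helper could look like the following sketch. The name `describe` is hypothetical and not part of the original answer.

```python
def describe(groups):
    """Render grouped runs as text, e.g. '10 to 15 by 1' (illustrative helper)."""
    parts = []
    for g in groups:
        if len(g) == 1:
            parts.append(str(g[0]))
        else:
            # The step is taken from the first two elements of the run.
            parts.append(f"{g[0]} to {g[-1]} by {g[1] - g[0]}")
    return ", ".join(parts)


groups = [[1], [5], [10, 11, 12, 13, 14, 15], [17], [20, 22, 24, 26, 28, 30]]
print(describe(groups))  # 1, 5, 10 to 15 by 1, 17, 20 to 30 by 2
```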
I'd start out with a difference list.
length_a = len(list1)
diff_v = [list1[j+1] - list1[j] for j in range(length_a-1)]
So [1,2,3,7,11,13,15,17] becomes [1,1,4,4,2,2,2].
Now it is easy: every run of equal differences marks a group of consecutive multiples.
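One way to turn the difference list into groups is sketched below. It assumes, as another answer in this thread does, that a group needs at least three elements, and it scans greedily from left to right, so overlapping candidates are resolved in favour of the earlier run. The name `group_by_step` and the `min_len` parameter are hypothetical.

```python
def group_by_step(values, min_len=3):
    """Split values into runs with a constant step (greedy, illustrative)."""
    # diffs[k] is the step from values[k] to values[k + 1]
    diffs = [b - a for a, b in zip(values, values[1:])]
    groups, i, n = [], 0, len(values)
    while i < n:
        end = i
        # Extend the run while the step stays the same as at its start.
        while end < n - 1 and diffs[end] == diffs[i]:
            end += 1
        if end - i + 1 >= min_len:
            groups.append(values[i:end + 1])
            i = end + 1
        else:
            # Too short to count as a run: emit a single element.
            groups.append([values[i]])
            i += 1
    return groups


print(group_by_step([1, 2, 3, 7, 11, 13, 15, 17]))
# [[1, 2, 3], [7], [11, 13, 15, 17]]
```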
You can just keep track of your last output value as you go along:
in_ = [1, 2, 3, 4, 5]
out = [[in_[0]]]
for item in in_[1:]:
    if out[-1][-1] != item - 1:
        out.append([])
    out[-1].append(item)
I would group the list by its difference between index and value:
from itertools import groupby
lst = [1,3,4,5]
result = []
for key, group in groupby(enumerate(lst), key=lambda pair: pair[1] - pair[0]):
    result.append([value for i, value in group])
print(result)
[[1], [3, 4, 5]]
What did I do?
# at first I enumerate every item of list:
print(list(enumerate(lst)))
[(0, 1), (1, 3), (2, 4), (3, 5)]
# Then I subtract the index of each item from the item itself:
print([value - i for i, value in enumerate(lst)])
[1, 2, 2, 2]
# As you see, consecutive numbers turn out to have the same difference between index and value
# We can use this feature and group the list by the difference of value minus index
print(list(groupby(enumerate(lst), key=lambda pair: pair[1] - pair[0])))
[(1, <itertools._grouper object at 0x104bff050>), (2, <itertools._grouper object at 0x104bff410>)]
# Now you can see how it works. Now I just want to add how to write this in one logical line:
result = [[value for i, value in group]
          for key, group in groupby(enumerate(lst), key=lambda pair: pair[1] - pair[0])]
print(result)
[[1], [3, 4, 5]]
Approach for identifying consecutive multiples of n
Let's have a look at this list,
lst = [1,5,10,11,12,13,14,15,17,21,24,26,28,30]
especially at the differences between neighbor elements and the differences of differences of three consecutive elements:
1, 5, 10, 11, 12, 13, 14, 15, 17, 21, 24, 26, 28, 30
4, 5, 1, 1, 1, 1, 1, 2, 4, 3, 2, 2, 2
1, -4, 0, 0, 0, 0, 1, 2, -1, -1, 0, 0
We see that there are zeros in the third row whenever there are consecutive multiples in the first row. Mathematically, the 2nd derivative of a function's linear sections is also zero. So let's use this property...
The "2nd derivative" of a list lst can be calculated like this
lst[i+2]-2*lst[i+1]+lst[i]
Note that this definition of the second order difference "looks" two indexes ahead.
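The two difference rows shown above can be reproduced with a short sketch. The names `first_diff` and `second_diff` are introduced only for this illustration.

```python
lst = [1, 5, 10, 11, 12, 13, 14, 15, 17, 21, 24, 26, 28, 30]

# First-order differences between neighbouring elements.
first_diff = [b - a for a, b in zip(lst, lst[1:])]
# Second-order differences, looking two indexes ahead.
second_diff = [lst[i + 2] - 2 * lst[i + 1] + lst[i] for i in range(len(lst) - 2)]

print(first_diff)   # [4, 5, 1, 1, 1, 1, 1, 2, 4, 3, 2, 2, 2]
print(second_diff)  # [1, -4, 0, 0, 0, 0, 1, 2, -1, -1, 0, 0]
```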
Now here is the code detecting the consecutive multiples:
from itertools import groupby
# We have to keep track of the indexes in the list, that have already been used
available_indexes = set(range(len(lst)))
for second_order_diff, grouper in groupby(range(len(lst)-2), key = lambda i: lst[i+2]-2*lst[i+1]+lst[i]):
    # store all not-consumed indexes in a list
    grp_indexes = [i for i in grouper if i in available_indexes]
    if grp_indexes and second_order_diff == 0:
        # There are consecutive multiples
        min_index, max_index = grp_indexes[0], grp_indexes[-1] + 2
        print("Group from", lst[min_index], "to", lst[max_index], "by", lst[min_index+1]-lst[min_index])
        available_indexes -= set(range(min_index, max_index+1))
    else:
        # The not "consumed" indexes in this group are not consecutive
        for i in grp_indexes:
            print(lst[i])
            available_indexes.discard(i)
# The last two elements could be lost without the following two lines
for i in sorted(available_indexes):
    print(lst[i])
Output:
1
5
Group from 10 to 15 by 1
17
21
Group from 24 to 30 by 2