How can I use enumerate to count backwards? - python

letters = ['a', 'b', 'c']
Assume this is my list. Where for i, letter in enumerate(letters) would be:
0, a
1, b
2, c
How can I instead make it enumerate backwards, as:
2, a
1, b
0, c

This is a great solution and works perfectly:
items = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
for idx, item in enumerate(items, start=-len(items)):
    print(f"reverse index for {item}: {abs(idx)}")
Here is the OUTPUT of the above snippet:
reverse index for a: 7
reverse index for b: 6
reverse index for c: 5
reverse index for d: 4
reverse index for e: 3
reverse index for f: 2
reverse index for g: 1
Here is what is happening in the above snippet:
enumerate's start argument is given a negative value.
enumerate always steps forward by 1, so the index counts up from -len(items) to -1.
Finally, abs(idx) gives the absolute value, which is never negative.
If you want the reverse index to end at zero, use start=-len(items) + 1 to fix the off-by-one error.
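As a minimal sketch of the zero-based variant mentioned in the last line, applied to the three-letter list from the question (the name `pairs` is only for illustration):

```python
items = ['a', 'b', 'c']

# start=-len(items) + 1 makes idx run -2, -1, 0, so abs(idx) runs 2, 1, 0.
pairs = [(abs(idx), item)
         for idx, item in enumerate(items, start=-len(items) + 1)]

for reverse_idx, item in pairs:
    print(reverse_idx, item)
```

This should print `2 a`, `1 b`, `0 c`, which is the order the question asks for.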

Try this:
letters = ['a', 'b', 'c']
for i, letter in reversed(list(enumerate(reversed(letters)))):
    print(i, letter)
Output:
2 a
1 b
0 c

Try this:
letters = ['a', 'b', 'c']
l = len(letters)
for i, letter in enumerate(letters):
    print(l - i - 1, letter)

I would reverse the list first, then use enumerate(). Note that this reverses the order of the items as well, so it prints 0 c, 1 b, 2 a:
letters = ['a', 'b', 'c']
letters.reverse()
for i, letter in enumerate(letters):
    print(i, letter)

The zip function pairs up the elements of two iterables position by position, so you can zip a reversed range of indices with the list:
list(zip(range(len(letters))[::-1], letters))

letters = ['a', 'b', 'c']
for i, letter in zip(range(len(letters)-1, -1, -1), letters):
    print(i, letter)
prints
2 a
1 b
0 c
Taken from answer in a similar question: Traverse a list in reverse order in Python

tl;dr: size - index - 1
I'll assume the question you are asking is whether or not you can have the index be reversed while the item is the same, for example, the a has the ordering number of 2 when it actually has an index of 0.
To calculate this, consider that each element in your array or list wants to have the index of the item with the same "distance" (index wise) from the end of the collection. Calculating this gives you size - index.
However, many programming languages start arrays with an index of 0. Due to this, we would need to subtract 1 in order to make the indices correspond properly. Consider our last element, with an index of size - 1. In our original equation, we would get size - (size - 1), which is equal to size - size + 1, which is equal to 1. Therefore, we need to subtract 1.
Final equation (for each element): size - index - 1
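The formula can be checked with a short sketch. The helper name `reverse_index` is hypothetical and only wraps the arithmetic derived above:

```python
def reverse_index(index, size):
    """Return the position of `index` counted from the end of a sequence of `size` items."""
    return size - index - 1

letters = ['a', 'b', 'c']
size = len(letters)

# Pair each letter with its index counted from the end.
result = [(reverse_index(i, size), letter) for i, letter in enumerate(letters)]
print(result)
```

The first element gets size - 1 and the last element gets 0, as in the question.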

We can define a utility function (Python 3.3+):
from itertools import count
def enumerate_ext(iterable, start=0, step=1):
    indices = count(start, step)
    yield from zip(indices, iterable)
and use it directly like
letters = ['a', 'b', 'c']
for index, letter in enumerate_ext(letters,
                                   start=len(letters) - 1,
                                   step=-1):
    print(index, letter)
or write a helper
def reverse_enumerate(sequence):
    yield from enumerate_ext(sequence,
                             start=len(sequence) - 1,
                             step=-1)
and use it like
for index, letter in reverse_enumerate(letters):
    print(index, letter)

Related

How to get count of non-repeating values in list

I know I can do something like below to get number of occurrences of elements in the list:
from collections import Counter
words = ['a', 'b', 'c', 'a']
Counter(words).keys() # equals to list(set(words))
Counter(words).values() # counts the elements' frequency
Outputs:
['a', 'c', 'b']
[2, 1, 1]
But I want to get the count 2 for b and c as b and c occur exactly once in the list.
Is there any way to do this in concise / pythonic way without using Counter or even using above output from Counter?
You could just write a short algorithm for that. Here is a one-liner (thanks @d.b):
sum(x for x in Counter(words).values() if x == 1)
Or, in more than one line and without Counter:
count = 0
for word in words:
    # count only the words that occur exactly once
    if words.count(word) == 1:
        count += 1

How to use enumerate in a list comprehension with two lists?

I just started using list comprehensions and I'm struggling with them. In this case, I need the position n in each list (sequence_0 and sequence_1) that the iteration is at each time. How can I do that?
The idea is to get the longest sequence of equal nucleotides (a motif) between the two sequences. Once a pair is found, the program should continue with the next nucleotides of the sequences, checking if they are also equal and then elongating the motif with them. The final output should be a list of all the motifs found.
The problem is that, to continue with the next nucleotides once a pair is found, I need the position of the pair in both sequences. The index function does not work in this case, and that's why I need enumerate.
Also, I don't understand exactly the reason for the x and y between (), it would be good to understand that too :)
Just to explain, the content of the lists is DNA sequences, so it's basically something like:
sequence_1 = ['A', 'T', 'C', 'A', 'C']
def find_shared_motif(arq):
data = fastaread(arq)
seqs = [list(sequence) for sequence in data.values()]
motifs = [[]]
i = 0
sequence_0, sequence_1 = seqs[0], seqs[1] # just to simplify
for x, y in [(x, y) for x in zip(sequence_0[::], sequence_0[1::]) for y in zip(sequence_1[::], sequence_1[1::])]:
print(f'Pairs {"".join(x)} and {"".join(y)} being analyzed...')
if x == y:
print(f'Pairs {"".join(x)} and {"".join(y)} match!')
motifs[i].append(x[0]), motifs[i].append(x[1])
k = sequence_0.index(x[0]) + 2 # NOT RETURNING THE RIGHT NUMBER
u = sequence_1.index(y[0]) + 2
print(k, u)
# Determines if the rest of the sequence is compatible
print(f'Starting to elongate the motif {x}...')
for j, m in enumerate(sequence_1[u::]):
try:
# Checks if the nucleotide is equal for both of the sequences
print(f'Analyzing the pair {sequence_0[k + j]}, {m}')
if m == sequence_0[k + j]:
motifs[i].append(m)
print(f'The pair {sequence_0[k + j]}, {m} is equal!')
# Stop in the first nonequal residue
else:
print(f'The pair {sequence_0[k + j]}, {m} is not equal.')
break
except IndexError:
print('IndexError, end of the string')
else:
i += 1
motifs.append([])
return motifs
...
One way to go with it is to start zipping both lists:
a = ['A', 'T', 'C', 'A', 'C']
b = ['A', 'T', 'C', 'C', 'T']
c = list(zip(a,b))
In that case, c will have the list of tuples below
c = [('A','A'), ('T','T'), ('C','C'), ('A','C'), ('C','T')]
Then, you can go with list comprehension and enumerate:
d = [(i, t) for i, t in enumerate(c)]
This will bring something like this to you:
d = [(0, ('A','A')), (1, ('T','T')), (2, ('C','C')), ...]
Of course you can go for a one-liner, if you want:
d = [(i, t) for i, t in enumerate(zip(a,b))]
>>> [(0, ('A','A')), (1, ('T','T')), (2, ('C','C')), ...]
Now, you have to deal with the nested tuples. Focus on the internal ones. It is obvious that what you want is to compare the first element of the tuples with the second ones. But, also, you will need the position where the difference resides (that lies outside). So, let's build a function for it. Inside the function, i will capture the positions, and t will capture the inner tuples:
def compare(a, b):
    d = [(i, t) for i, t in enumerate(zip(a,b))]
    for i, t in d:
        if t[0] != t[1]:
            return i
    return -1
In that way, if you get -1 at the end, it means that all elements in both lists are equal, side by side. Otherwise, you will get the position of the first difference between them.
It is important to notice that, in the case of two lists with different sizes, the zip function will bring a list of tuples with the size matching the smaller of the lists. The extra elements of the other list will be ignored.
Ex.
list(zip([1,2], [3,4,5]))
>>> [(1,3), (2,4)]
You can use the function compare with your code to get the positions where the lists differ, and use that to build your motifs.
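On the question about the parentheses: enumerate yields (index, item) pairs, and each item coming out of zip is itself a pair, so the loop target can unpack both levels at once. A minimal sketch, where `first_mismatch` is a hypothetical name for a variant of compare that skips the intermediate list:

```python
def first_mismatch(a, b):
    # enumerate yields (i, (x, y)); the inner parentheses unpack the zip pair
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # no difference found within the length of the shorter list
    return -1

a = ['A', 'T', 'C', 'A', 'C']
b = ['A', 'T', 'C', 'C', 'T']
print(first_mismatch(a, b))
```

For the two example lists this should report position 3, the first place where they differ.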

Creating a Python list with given indexes for each repeating element

First list : contains the list indexes of corresponding category name
Second list : contains the category names as string
Intervals=[[Indexes_Cat1],[Indexes_Cat2],[Indexes_Cat3], ...]
Category_Names=["cat1","cat2","cat3",...]
Desired Output:
list=["cat1", "cat1","cat2","cat3","cat3"]
where indexes of any element in output list is placed using Intervals list.
Ex1:
Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]
Ex: Output1
["a","c","b","b","a","c"]
Edit: More Run Cases
Ex2:
Intervals=[[0,1], [2,3] , [4,5]]
Category_Names=["a","b","c"]
Ex: Output2
["a","a","b","b","c","c"]
Ex3:
Intervals=[[3,4], [1,5] , [0,2]]
Category_Names=["a","b","c"]
Ex: Output3
["c","b","c","a","a","b"]
My solution:
Create an empty array of size n.
Run a for loop for each category.
output=[""]*n
for i in range(len(Category_Names)):
    for index in Intervals[i]:
        output[index]=Category_Names[i]
Is there a better solution, or a more pythonic way? Thanks
def categorise(Intervals=[[0,4], [2,3] , [1,5]],
               Category_Names=["a","b","c"]):
    flattened = sum(Intervals, [])
    answer = [None] * (max(flattened) + 1)
    for indices, name in zip(Intervals, Category_Names):
        for i in indices:
            answer[i] = name
    return answer
assert categorise() == ['a', 'c', 'b', 'b', 'a', 'c']
assert categorise([[3,4], [1,5] , [0,2]],
                  ["a","b","c"]) == ['c', 'b', 'c', 'a', 'a', 'b']
Note that in this code you will get None values in the answer if the "intervals" don't cover all integers from zero to the max interval number. It is assumed that the input is compatible.
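If silently returning None for uncovered positions is not acceptable, one option is to check for gaps before returning. This is only a sketch: `categorise_checked` is a hypothetical name, and raising ValueError is an assumption about how you would want to handle bad input.

```python
def categorise_checked(intervals, category_names):
    # size of the output is one more than the largest index mentioned
    size = max(i for indices in intervals for i in indices) + 1
    answer = [None] * size
    for indices, name in zip(intervals, category_names):
        for i in indices:
            answer[i] = name
    # any slot still holding None was not covered by an interval
    missing = [i for i, name in enumerate(answer) if name is None]
    if missing:
        raise ValueError(f"indices not covered by any interval: {missing}")
    return answer

print(categorise_checked([[0, 4], [2, 3], [1, 5]], ["a", "b", "c"]))
```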
I am not sure if there is a way to avoid the nested loop (I can't think of any right now) so it seems your solution is good.
A way you could do it a bit better is to construct the output array with one of the categories:
output = [Category_Names[0]]*n
and then start the iteration skipping that category:
for i in range(1, len(Category_Names)):
If you know there is a category that appears more than the others then you should use that as the one initializing the array.
I hope this helps!
You can reduce the amount of strings created and use enumerate to avoid range(len(..)) for indexing.
Intervals=[[0,4], [2,3] , [1,5]]
Category_Names=["a","b","c"]
n = max(x for a in Intervals for x in a) + 1
# do not construct strings that get replaced anyhow
output=[None] * n
for i,name in enumerate(Category_Names):
    for index in Intervals[i]:
        output[index]=name
print(output)
Output:
['a', 'c', 'b', 'b', 'a', 'c']

all combination of a complicated list

I want to find all possible combination of the following list:
data = ['a','b','c','d']
I know it looks a straightforward task and it can be achieved by something like the following code:
comb = [c for i in range(1, len(data)+1) for c in combinations(data, i)]
but what I want is actually a way to give each element of the list data two possibilities ('a' or '-a').
An example of the combinations can be ['a','b'] , ['-a','b'], ['a','b','-c'], etc.
without something like the following case of course ['-a','a'].
You could write a generator function that takes a sequence and yields each possible combination of negations. Like this:
import itertools
def negations(seq):
    for prefixes in itertools.product(["", "-"], repeat=len(seq)):
        yield [prefix + value for prefix, value in zip(prefixes, seq)]
print(list(negations(["a", "b", "c"])))
Result (whitespace modified for clarity):
[
[ 'a', 'b', 'c'],
[ 'a', 'b', '-c'],
[ 'a', '-b', 'c'],
[ 'a', '-b', '-c'],
['-a', 'b', 'c'],
['-a', 'b', '-c'],
['-a', '-b', 'c'],
['-a', '-b', '-c']
]
You can integrate this into your existing code with something like
comb = [x for i in range(1, len(data)+1) for c in combinations(data, i) for x in negations(c)]
Once you have the regular combinations generated, you can do a second pass to generate the ones with "negation." I'd think of it like a binary number, with the number of elements in your list being the number of bits. Count from 0b0000 to 0b1111 via 0b0001, 0b0010, etc., and wherever a bit is set, negate that element in the result. This will produce 2^n combinations for each input combination of length n.
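The binary-number idea can be sketched as follows. The helper name `negate_by_bits` is hypothetical, and the choice that bit i controls element i is an assumption made for this example:

```python
from itertools import combinations

def negate_by_bits(combo):
    """Yield every sign variant of combo, one per integer in [0, 2**n)."""
    n = len(combo)
    for mask in range(2 ** n):
        # if bit i of mask is set, negate element i
        yield ["-" + x if (mask >> i) & 1 else x for i, x in enumerate(combo)]

data = ['a', 'b']
comb = [variant
        for r in range(1, len(data) + 1)
        for c in combinations(data, r)
        for variant in negate_by_bits(c)]
print(comb)
```

For a two-element input this gives 2 + 2 variants for the single-element combinations and 4 for the pair, so 8 lists in total.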
Here is a one-liner, but it can be hard to follow:
from itertools import product
comb = [sum(t, []) for t in product(*[([x], ['-' + x], []) for x in data])]
First, map each element of data to the lists it can become in a result: [x], ['-' + x], or [] (left out). Then take the product of those options to get all possibilities. Finally, flatten each combination with sum. Note that the result also includes the empty combination [].
My solution basically has the same idea as John Zwinck's answer. After you have produced the list of all combinations
comb = [c for i in range(1, len(data)+1) for c in combinations(data, i)]
you generate all possible positive/negative combinations for each element of comb. I do this by iterating through the total number of sign combinations, 2**N, and treating each number as a binary number, where each binary digit stands for the sign of one element. (E.g. a two-element list would have 4 possible combinations, 0 to 3, represented by 0b00 => (+,+), 0b01 => (-,+), 0b10 => (+,-) and 0b11 => (-,-).)
def twocombinations(it):
    sign = lambda c, i: "-" if c & 2**i else ""
    l = list(it)
    if len(l) < 1:
        return
    # for each possible combination, make a tuple with the appropriate
    # sign before each element
    for c in range(2**len(l)):
        yield tuple(sign(c, i) + el for i, el in enumerate(l))
Now we apply this function to every element of comb and flatten the resulting nested iterator:
l = itertools.chain.from_iterable(map(twocombinations, comb))

Python: Append double items to new array

Let's say I have an array "array_1" with these items:
A b A c
I want to get a new array "array_2" which looks like this:
b A c A
I tried this:
array_1 = ['A','b','A','c' ]
array_2 = []
for item in array_1:
    if array_1[array_1.index(item)] == array_1[array_1.index(item)].upper():
        array_2.append(array_1[array_1.index(item)+1]+array_1[array_1.index(item)])
The problem: The result looks like this:
b A b A
Does anyone know how to fix this? This would be really great!
Thanks, Nico.
It's because you have two 'A's in your array. In both cases, for the 'A',
array_1[array_1.index(item)+1]
will equal 'b', because the index method returns the first index of 'A'.
To correct this behavior, I suggest using an integer that you increment for each item. In that case you'll retrieve the n-th item of the array, and your program won't return the same 'A' twice.
Responding to your comment, let's take back your code and add the integer:
array_1 = ['A','b','A','c' ]
array_2 = []
i = 0
for item in array_1:
    if array_1[i] == array_1[i].upper():
        array_2.append(array_1[i+1]+array_1[i])
    i = i + 1
In that case it works, but be careful: you need to add an if statement for the case where the last item of your array is an 'A', for example, because then array_1[i+1] won't exist.
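A possible way to add that guard is sketched below. It uses enumerate in place of the manual counter, and the trailing 'A' in the sample data is added only to exercise the edge case:

```python
array_1 = ['A', 'b', 'A', 'c', 'A']  # note the trailing 'A' with nothing after it
array_2 = []

for i, item in enumerate(array_1):
    # only pair an upper-case item with its successor if a successor exists
    if item == item.upper() and i + 1 < len(array_1):
        array_2.append(array_1[i + 1] + item)

print(array_2)
```

The last 'A' is skipped without raising an IndexError.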
I think that a simple flat list is the wrong data structure for the job if each lower case letter is paired with the upper case letter before it. I would turn it into a list of two-tuples, i.e.:
['A', 'b', 'A', 'c'] becomes [('A', 'b'), ('A', 'c')]
Then if you are looping through the items in the list:
for item in output_list:
    print(item[0]) # prints 'A'
    print(item[1]) # prints 'b' (for first item)
To do this:
input_list = ['A', 'b', 'A', 'c']
output_list = []
i = 0
while i < len(input_list):
    output_list.append((input_list[i], input_list[i+1]))
    i = i + 2
Then you can swap the order of the upper case letters and the lower case letters really easily using a list comprehension:
swapped = [(item[1], item[0]) for item in output_list]
Edit:
As you might have more than one lower case letter for each upper case letter, you could use a list for each group, and then have a list of these groups.
def group_items(input_list):
    output_list = []
    current_group = []
    for current_item in input_list:
        if current_item == current_item.upper() and current_group:
            # Upper case letter, so start a new group
            output_list.append(current_group)
            current_group = []
        current_group.append(current_item)
    if current_group:
        # don't forget the last group
        output_list.append(current_group)
    return output_list
Then you can reverse each of the internal lists really easily:
[group[::-1] for group in group_items(input_list)]
According to your last comment, you can get what you want using this
array_1 = "SMITH Mike SMITH Judy".split()
first_names = array_1[1::2]
surnames = array_1[0::2]
print(array_1)
array_1[0::2] = first_names
array_1[1::2] = surnames
print(array_1)
You get:
['SMITH', 'Mike', 'SMITH', 'Judy']
['Mike', 'SMITH', 'Judy', 'SMITH']
If I understood your question correctly, then you can do this:
It will work for any array of even length.
array_1 = ['A','b','A','c' ]
array_2 = []
for index,itm in enumerate(array_1):
    if index % 2 == 0:
        array_2.append(array_1[index+1])
        array_2.append(array_1[index])
print(array_2)
Output:
['b', 'A', 'c', 'A']
