Creating Simultaneous Loops in Python - python

I want to create a loop that behaves like this:
for i in xrange(0,10):
    for k in xrange(0,10):
        z=k+i
        print z
where the output should be
0
2
4
6
8
10
12
14
16
18

You can use zip to turn multiple lists (or iterables) into pairwise* tuples:
>>> for a,b in zip(xrange(10), xrange(10)):
...     print a+b
...
0
2
4
6
8
10
12
14
16
18
But in Python 2, zip builds the whole list of tuples in memory, so it will not scale as well as izip (which sth mentioned) on larger sets. zip's advantage is that it is a built-in, so you don't have to import itertools; whether that is actually an advantage is subjective.
*Not just pairwise, but n-wise. The tuples' length will be the same as the number of iterables you pass in to zip.
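For readers on Python 3, here is a minimal sketch of the same idea. It assumes Python 3, where xrange is gone and the built-in zip is already lazy, so izip is not needed. The three-iterable sample values are made up purely to illustrate the n-wise footnote.

```python
# Python 3 sketch: range replaces xrange, and zip yields tuples lazily.
sums = [a + b for a, b in zip(range(10), range(10))]
print(sums)

# n-wise: passing three iterables gives 3-tuples (sample values are made up).
triples = list(zip(range(3), "abc", [10.0, 20.0, 30.0]))
print(triples)
```

zip stops at the shortest input, so iterables of unequal length are silently truncated.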

The itertools module contains an izip function that combines iterators in the desired way:
from itertools import izip
for (i, k) in izip(xrange(0,10), xrange(0,10)):
    print i+k

You can do this in Python; you just have to get the indentation right and use xrange's step argument.
for i in xrange(0, 20, 2):
    print i

What about this?
i = range(0,10)
k = range(0,10)
for x in range(0,10):
    z=k[x]+i[x]
    print z
0
2
4
6
8
10
12
14
16
18

What you want is two arrays and one loop, iterate over each array once, adding the results.

Related

Python : Replace two for loops with the fastest way to sum the elements

I have a list of 5 elements, which could grow to 50000. I want to sum all the pairwise combinations from that list and create a DataFrame from the results, so I wrote the following code:
import pandas as pd
x = list(range(1,5))
t = []
for i in x:
    for j in x:
        t.append((i,j,i+j))
df = pd.DataFrame(t)
The above code generates the correct results but takes very long to execute when the list has more elements. I am looking for the fastest way to do the same thing.
Combinations can be obtained through the pandas.merge() method without using explicit loops (how='cross' requires pandas 1.2 or later):
import numpy as np
import pandas as pd
x = np.arange(1, 5+1)
df = pd.DataFrame(x, columns=['x']).merge(pd.Series(x, name='y'), how='cross')
df['sum'] = df.x.add(df.y)
print(df)
   x  y  sum
0  1  1    2
1  1  2    3
2  1  3    4
3  1  4    5
4  1  5    6
5  2  1    3
6  2  2    4
...
Option 2: with itertools.product()
import itertools
import pandas as pd
num = 5
df = pd.DataFrame(itertools.product(range(1,num+1),range(1,num+1)))
df['sum'] = df[0].add(df[1])
print(df)
A list comprehension can make it faster: use t = [(i,j,i+j) for i in x for j in x] instead of the nested for loop. The comprehension avoids looking up and calling t.append on every iteration, so it is usually somewhat faster than the equivalent explicit loop, although it still performs the same number of iterations. Here is the updated code, replacing the nested loops:
import pandas as pd
x = list(range(1,5))
t = [(i,j,i+j) for i in x for j in x]
df = pd.DataFrame(t)
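As a small sketch that uses only the standard library, itertools.product can generate the same rows as the nested loops, in the same order. The names rows and x are just for illustration, and this has not been benchmarked against the other approaches here.

```python
from itertools import product

x = list(range(1, 5))

# product(x, repeat=2) yields every (i, j) pair, with the last element
# varying fastest, which matches the order of the nested for loops.
rows = [(i, j, i + j) for i, j in product(x, repeat=2)]

print(len(rows))   # number of pairs: len(x) squared
print(rows[:3])    # first few (i, j, i + j) tuples
```

The resulting list of tuples can be passed to pd.DataFrame(rows, columns=["x", "y", "sum"]) if a DataFrame is needed. Note that the number of rows grows quadratically, so 50000 elements would produce 2.5 billion rows no matter which method builds them.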

what's the best way to increment a list of numbers

I have a list like this x=[1,2,2,3,1,2,1,1,2,2] where the number is a positive integer that increments by 0 or 1 and sometimes resets to 1, and need to transform it to [1,2,2,3,4,5,6,7,8,8] in an incremental way, where each 1 should be the previous number plus 1 and whatever follows 1 increment accordingly. Is there a simple way to do this via a numpy array etc? I tried using loops but I guess there's a simpler way.
You can use np.add.accumulate():
import numpy as np
x = np.array([1,2,2,3,1,2,1,1,2,2])
x[1:] += np.add.accumulate(x[:-1]*(x[1:]==1))
print(x)
[1 2 2 3 4 5 6 7 8 8]
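To see why that works, here is a pure-Python sketch of the same logic using itertools.accumulate. The function name make_increasing is made up for illustration. Each time the value resets to 1, the previous original value is added to a running offset, and that offset is applied to every element from that point on.

```python
from itertools import accumulate

def make_increasing(values):
    """Remove the resets to 1 by adding a running offset (hypothetical helper)."""
    # For each position after the first: if the value is a reset (== 1),
    # the bump is the previous original value, otherwise 0.
    bumps = [prev if cur == 1 else 0 for prev, cur in zip(values, values[1:])]
    # Running total of the bumps; the first element never gets an offset.
    offsets = [0] + list(accumulate(bumps))
    return [v + off for v, off in zip(values, offsets)]

x = [1, 2, 2, 3, 1, 2, 1, 1, 2, 2]
print(make_increasing(x))  # [1, 2, 2, 3, 4, 5, 6, 7, 8, 8]
```

The NumPy version computes the same bumps with x[:-1]*(x[1:]==1) and the same running total with np.add.accumulate.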

Slicing large lists based on input

If I have multiple lists such that
hello = [1,3,5,7,9,11,13]
bye = [2,4,6,8,10,12,14]
and the user inputs 3
is there a way to get the output to go back 3 indexes in the list and start there to get:
9 10
11 12
13 14
with tabs \t between each space.
if the user would input 5
the expected output would be
5 6
7 8
9 10
11 12
13 14
I've tried
for i in range(user_input):
    print(hello[-i-1], '\t', bye[-i-1])
Just use negative indices that start at the end minus the user input (-user_input) and run to the end (-1), something like:
for i in range(-user_input, 0):
    print(hello[i], bye[i])
Another zip solution, but one-lined:
for h, b in zip(hello[-user_input:], bye[-user_input:]):
    print(h, b, sep='\t')
Avoids converting the result of zip to a list, so the only temporaries are the slices of hello and bye. While iterating by index can avoid those temporaries, in practice it's almost always cleaner and faster to do the slice and iterate the values, as repeated indexing is both unpythonic and surprisingly slow in CPython.
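One edge case worth guarding against with the slice approach: hello[-0:] is the same as hello[0:], so an input of 0 returns the whole list instead of nothing. Below is a minimal sketch that handles this. The helper name tail_pairs is made up for illustration, and the input is hard-coded rather than read from the user.

```python
def tail_pairs(first, second, count):
    """Return the last `count` pairs from two lists (hypothetical helper)."""
    # Without this guard, a count of 0 would slice with [-0:], i.e. [0:],
    # and return every pair instead of none.
    if count <= 0:
        return []
    return list(zip(first[-count:], second[-count:]))

hello = [1, 3, 5, 7, 9, 11, 13]
bye = [2, 4, 6, 8, 10, 12, 14]

for h, b in tail_pairs(hello, bye, 3):
    print(h, b, sep="\t")
```

A count larger than the list length is harmless here, since slicing simply returns everything available.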
Use negative indexing in the slice.
hello = [1,3,5,7,9,11,13]
print(hello[-3:])
print(hello[-3:-2])
output
[9, 11, 13]
[9]
You can zip the two lists and use itertools.islice to obtain the desired portion of the output:
from itertools import islice
print('\n'.join(map(' '.join, islice(zip(map(str, hello), map(str, bye)), len(hello) - int(input()), len(hello)))))
Given an input of 5, this outputs:
5 6
7 8
9 10
11 12
13 14
You can use zip to build a list of tuples, where the i-th tuple holds the i-th element of each list, then slice it with a negative index to get what you want:
zip_ = list(zip(hello, bye))
for item in zip_[-user_input:]:
    print(item[0], '\t', item[1])
If you want to analyze the data, I think using pandas.DataFrame may be helpful.
import pandas as pd
INPUT_INDEX = int(input('index='))
df = pd.DataFrame([hello, bye])
df = df.iloc[:, len(df.columns)-INPUT_INDEX:]
for col in df.columns:
    h_value, b_value = df[col].values
    print(h_value, b_value)
Console output:
index=3
9 10
11 12
13 14

counting T/F values for several conditions

I am a beginner using pandas.
I'm looking for mutations in several patients, and I have 16 different conditions. I wrote some simple code for it, but how can I do this with a for loop? I look for each change in the MUT column, mark the rows as True or False, and then count the True/False values. So far I have done this for only 4 conditions.
Can you suggest a simpler way, instead of writing the same code 16 times?
s1=df["MUT"]
A_T= s1.str.contains("A:T")
ATnum= A_T.value_counts(sort=True)
s2=df["MUT"]
A_G=s2.str.contains("A:G")
AGnum=A_G.value_counts(sort=True)
s3=df["MUT"]
A_C=s3.str.contains("A:C")
ACnum=A_C.value_counts(sort=True)
s4=df["MUT"]
A__=s4.str.contains("A:-")
A_num=A__.value_counts(sort=True)
I'm not an expert with using Pandas, so don't know if there's a cleaner way of doing this, but perhaps the following might work?
chars = 'TGC-'
nums = {}
for char in chars:
    s = df["MUT"]
    A = s.str.contains("A:" + char)
    num = A.value_counts(sort=True)
    nums[char] = num
ATnum = nums['T']
AGnum = nums['G']
# ...etc
Basically, go through each unique character (T, G, C, -) then pull out the values that you need, then finally stick the numbers in a dictionary. Then, once the loop is finished, you can fetch whatever numbers you need back out of the dictionary.
Just use value_counts, this will give you a count of all unique values in your column, no need to create 16 variables:
In [5]:
df = pd.DataFrame({'MUT':np.random.randint(0,16,100)})
df['MUT'].value_counts()
Out[5]:
6 11
14 10
13 9
12 9
1 8
9 7
15 6
11 6
8 5
5 5
3 5
2 5
10 4
4 4
7 3
0 3
dtype: int64

How can I restart a string iterator endlessly? [duplicate]

This question already has answers here:
Circular list iterator in Python
This question is somewhat related to this, this, and this one. Assume I have two generators/iterators of different lengths:
>>> s = "abcde"
>>> r = range(0, 16)
I now want to repeat iterating over the shorter one until the longer one is exhausted. The standard zip() function terminates once the shorter of the two is exhausted:
>>> for c, i in zip(s, r) :
...     print(c, i)
...
a 0
b 1
c 2
d 3
e 4
The best I can come up with is wrapping the string into a generator like so:
>>> def endless_s(s) :
...     while True :
...         for c in s :
...             yield c
which gives me the desired result of
>>> _s = endless_s(s)
>>> for c, i in zip(_s, r) :
...     print(c, i)
...
a 0
b 1
c 2
d 3
e 4
a 5
b 6
c 7
d 8
e 9
a 10
b 11
c 12
d 13
e 14
a 15
Now I wonder: is there a better and more compact way of doing this? Like an endless string join, or some such?
You could do this with itertools.cycle, which the documentation describes as:
"Make an iterator returning elements from the iterable and saving a copy of each. When the iterable is exhausted, return elements from the saved copy. Repeats indefinitely."
It can replace your function entirely:
from itertools import cycle as endless_s
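For completeness, a short sketch of how cycle slots into the zip loop from the question. It reuses s and r from the question, and pairs is just an illustrative name. Because zip stops when the finite range is exhausted, the endless iterator is safe to use here.

```python
from itertools import cycle

s = "abcde"
r = range(0, 16)

# cycle(s) repeats "abcde" forever; zip ends as soon as r runs out.
pairs = list(zip(cycle(s), r))
for c, i in pairs:
    print(c, i)
```

Make sure at least one argument to zip is finite, otherwise the loop never terminates.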
