How can I group the numbers by 5? - python

def num():
    num = int(input("Enter a number: "))
    while num in range(num >= 0, 100):
        num += 1
        print(num, end=" ")
num()
My problem is that I don't know how to group the output into fives (e.g. 1 2 3 4 5 on one line and 6 7 8 9 10 on the next), with 5 numbers on each line. When the user inputs a number, the program should count up from that number.

Here is a variation:
num = int(input("Enter a number: "))
l = list(range(num, 100))
for i in range(0, len(l), 5):
    print(" ".join(map(str, l[i:i+5])))
We take sublists of size 5 (or fewer for the last one, if necessary) and use join to create a string with spaces. Since join needs strings, I use map to convert each number with str.
Example: (input 83)
83 84 85 86 87
88 89 90 91 92
93 94 95 96 97
98 99
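The same slicing idea can be wrapped in a small function so it can be checked without calling input(). This is only a minimal sketch; the name group_lines and its parameters are made up for illustration and are not part of the answer above.

```python
def group_lines(start, stop=100, size=5):
    """Return the numbers start..stop-1 as lines of at most `size` numbers."""
    numbers = list(range(start, stop))
    # Step through the list `size` items at a time; the last slice may be shorter.
    return [
        " ".join(map(str, numbers[i:i + size]))
        for i in range(0, len(numbers), size)
    ]


for line in group_lines(83):
    print(line)
```

Returning the lines instead of printing them directly keeps the grouping logic separate from the input and output handling.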

You probably want something like this. Note that this handles the case where the user input is 0.
def num():
    num = int(input("Enter a number: "))
    count = 0
    if num == 0:
        foo = range(0, 100)
        for num in foo:
            count += 1
            print(num, end=" ")
            if count == 5:
                count = 0
                print()
    else:
        while num in range(num >= 0, 100):
            num += 1
            count += 1
            print(num, end=" ")
            if count == 5:
                count = 0
                print()
num()
Enter a number: 13
14 15 16 17 18
19 20 21 22 23
24 25 26 27 28
29 30 31 32 33
34 35 36 37 38
39 40 41 42 43
44 45 46 47 48
49 50 51 52 53
54 55 56 57 58
59 60 61 62 63
64 65 66 67 68
69 70 71 72 73
74 75 76 77 78
79 80 81 82 83
84 85 86 87 88
89 90 91 92 93
94 95 96 97 98
99 100

from functools import partial
from itertools import islice
num = int(input("Enter a number: "))
get_sublist = lambda iterable,length: list(islice(iterable, length))
print(list(iter(partial(get_sublist, iter(range(num,100)), 5), [])))
Enter a number: 12
[[12, 13, 14, 15, 16], [17, 18, 19, 20, 21], [22, 23, 24, 25, 26], [27, 28, 29, 30, 31],
[32, 33, 34, 35, 36], [37, 38, 39, 40, 41], [42, 43, 44, 45, 46], [47, 48, 49, 50, 51],
[52, 53, 54, 55, 56], [57, 58, 59, 60, 61], [62, 63, 64, 65, 66], [67, 68, 69, 70, 71],
[72, 73, 74, 75, 76], [77, 78, 79, 80, 81], [82, 83, 84, 85, 86], [87, 88, 89, 90, 91],
[92, 93, 94, 95, 96], [97, 98, 99]]
ref : more_itertools
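The one-liner above relies on the two-argument form of iter, which keeps calling a function until it returns a sentinel value (here, an empty list). A hedged sketch of the same idea as a reusable generator follows; the name chunks is hypothetical and not taken from more_itertools.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    # iter(callable, sentinel) calls the callable repeatedly and stops as soon
    # as it returns the sentinel; an empty list means the iterator is exhausted.
    yield from iter(lambda: list(islice(iterator, size)), [])


for group in chunks(range(92, 100), 5):
    print(*group)
```

Because it consumes an iterator lazily, this version also works for inputs that are not lists, such as generators.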

Check this code:
num = int(input("Enter a number: "))
i = 100
while num in range(num >= 1, 100):
    num += 1
    i += 1
    print(num, end=" ")
    if i % 5 == 0:
        print()
Output:
Enter a number: 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
....

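This could probably be condensed, but:
num = int(input("Enter a number: "))
i = num  # start counting from the number the user entered
while i <= 100:
    # print five consecutive numbers per line (the last line may run past 100)
    print(i, i+1, i+2, i+3, i+4)
    i = i + 5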

Related

Select columns and create new dataframe

I have a dataframe with more than 5000 columns but here is an example what it looks like:
data = {'AST_0-1': [1, 2, 3],
'AST_0-45': [4, 5, 6],
'AST_0-135': [7, 8, 20],
'AST_10-1': [10, 20, 32],
'AST_10-45': [47, 56, 67],
'AST_10-135': [48, 57, 64],
'AST_110-1': [100, 85, 93],
'AST_110-45': [100, 25, 37],
'AST_110-135': [44, 55, 67]}
I want to create multiple new dataframes based on the numbers after the "-" in the column names. For example, one dataframe with all the columns that end with "1" [df1=(AST_0-1;AST_10-1;AST_110-1)], another for those that end with "45", and another for those that end with "135". I know I will need a loop to do that, but I am having trouble selecting the columns to then create the dataframes.
You can use str.extract on the column names to get the wanted ID, then groupby on axis=1.
Here, this creates a dictionary of dataframes.
group = df.columns.str.extract(r'(\d+)$', expand=False)
out = dict(list(df.groupby(group, axis=1)))
Output:
{'1': AST_0-1 AST_10-1 AST_110-1
0 1 10 100
1 2 20 85
2 3 32 93,
'135': AST_0-135 AST_10-135 AST_110-135
0 7 48 44
1 8 57 55
2 20 64 67,
'45': AST_0-45 AST_10-45 AST_110-45
0 4 47 100
1 5 56 25
2 6 67 37}
Accessing ID 135:
out['135']
AST_0-135 AST_10-135 AST_110-135
0 7 48 44
1 8 57 55
2 20 64 67
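Recent pandas releases emit a deprecation warning for axis=1 in groupby, so it may be worth grouping the column names themselves and selecting columns afterwards. The following is a minimal sketch in plain Python; group_columns_by_suffix is a made-up helper name, and it assumes every column name contains the separator.

```python
from collections import defaultdict


def group_columns_by_suffix(columns, sep="-"):
    """Map each suffix (the text after the last `sep`) to its column names."""
    groups = defaultdict(list)
    for name in columns:
        suffix = name.rsplit(sep, 1)[-1]
        groups[suffix].append(name)
    return dict(groups)


columns = ['AST_0-1', 'AST_0-45', 'AST_0-135',
           'AST_10-1', 'AST_10-45', 'AST_10-135',
           'AST_110-1', 'AST_110-45', 'AST_110-135']
groups = group_columns_by_suffix(columns)
print(groups['135'])
```

With a dataframe df whose columns are these names, something like {suffix: df[cols] for suffix, cols in groups.items()} should give the same dictionary of dataframes as above.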
Use:
df = pd.DataFrame(data)
dfs = dict(list(df.groupby(df.columns.str.rsplit('-', n=1).str[1], axis=1)))
Output:
>>> dfs
{'1': AST_0-1 AST_10-1 AST_110-1
0 1 10 100
1 2 20 85
2 3 32 93,
'135': AST_0-135 AST_10-135 AST_110-135
0 7 48 44
1 8 57 55
2 20 64 67,
'45': AST_0-45 AST_10-45 AST_110-45
0 4 47 100
1 5 56 25
2 6 67 37}
I know it's strongly discouraged but maybe you want to create dataframes like df1, df135, df45. In this case, you can use:
for name, df in dfs.items():
    locals()[f'df{name}'] = df
>>> df1
AST_0-1 AST_10-1 AST_110-1
0 1 10 100
1 2 20 85
2 3 32 93
>>> df135
AST_0-135 AST_10-135 AST_110-135
0 7 48 44
1 8 57 55
2 20 64 67
>>> df45
AST_0-45 AST_10-45 AST_110-45
0 4 47 100
1 5 56 25
2 6 67 37
data = {'AST_0-1': [1, 2, 3],
'AST_0-45': [4, 5, 6],
'AST_0-135': [7, 8, 20],
'AST_10-1': [10, 20, 32],
'AST_10-45': [47, 56, 67],
'AST_10-135': [48, 57, 64],
'AST_110-1': [100, 85, 93],
'AST_110-45': [100, 25, 37],
'AST_110-135': [44, 55, 67]}
import pandas as pd
df = pd.DataFrame(data)
value_list = ["1", "45", "135"]
for value in value_list:
    interest_columns = [col for col in df.columns if col.split("-")[1] == value]
    df_filtered = df[interest_columns]
    print(df_filtered)
Output:
AST_0-1 AST_10-1 AST_110-1
0 1 10 100
1 2 20 85
2 3 32 93
AST_0-45 AST_10-45 AST_110-45
0 4 47 100
1 5 56 25
2 6 67 37
AST_0-135 AST_10-135 AST_110-135
0 7 48 44
1 8 57 55
2 20 64 67
I assume your problem is with the keys of the dictionary. You can get the list of keys with data.keys() and then iterate over it.
For example:
df1 = pd.DataFrame()
df45 = pd.DataFrame()
df135 = pd.DataFrame()
for i in list(data.keys()):
    the_key = i.split('-')
    if the_key[1] == '1':
        df1[i] = data[i]
    elif the_key[1] == '45':
        df45[i] = data[i]
    elif the_key[1] == '135':
        df135[i] = data[i]

Pythonic way to count values meeting a condition across multiple columns

I'm trying to write an ordinary loop under specific conditions.
I want to iterate over rows, checking a condition, and then iterate over columns, counting how many times the condition was met.
This count should generate a new column in my dataframe indicating the total count for each row.
I tried to use apply and applymap with no success.
I successfully wrote the following code to reach my goal.
But I bet there are more efficient ways, or even built-in pandas functions, to do it.
Does anyone know how?
Sample code:
import pandas as pd
df = pd.DataFrame({'1column': [11, 22, 33, 44],
'2column': [32, 42, 15, 35],
'3column': [33, 77, 26, 64],
'4column': [99, 11, 110, 22],
'5column': [20, 64, 55, 33],
'6column': [10, 77, 77, 10]})
check_columns = ['3column','5column', '6column' ]
df1 = df.copy()
df1['bignum_count'] = 0
for column in check_columns:
    inner_loop_count = []
    bigseries = df[column]>=50
    for big in bigseries:
        if big:
            inner_loop_count.append(1)
        else:
            inner_loop_count.append(0)
    df1['bignum_count'] += inner_loop_count
# View the dataframe
df1
results:
1column 2column 3column 4column 5column 6column bignum_count
0 11 32 33 99 20 10 0
1 22 42 77 11 64 77 3
2 33 15 26 110 55 77 2
3 44 35 64 22 33 10 1
Select the columns of interest, check which values are greater than or equal to (ge) a threshold, and sum the resulting booleans along each row:
df['bignum_count'] = df[check_columns].ge(50).sum(1)
print(df)
1column 2column 3column 4column 5column 6column bignum_count
0 11 32 33 99 20 10 0
1 22 42 77 11 64 77 3
2 33 15 26 110 55 77 2
3 44 35 64 22 33 10 1
check_columns
df1 = df.copy()
Use DataFrame.ge for >= and count the True values with sum:
df['bignum_count'] = df[check_columns].ge(50).sum(axis=1)
#alternative
#df['bignum_count'] = (df[check_columns]>=50).sum(axis=1)
print(df)
1column 2column 3column 4column 5column 6column bignum_count
0 11 32 33 99 20 10 0
1 22 42 77 11 64 77 3
2 33 15 26 110 55 77 2
3 44 35 64 22 33 10 1
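To make explicit what ge(50).sum(axis=1) computes, here is a plain-Python sketch of the same row-wise count on the three checked columns. The helper name count_at_least is an assumption made for illustration; it is not a pandas function.

```python
checked = {
    '3column': [33, 77, 26, 64],
    '5column': [20, 64, 55, 33],
    '6column': [10, 77, 77, 10],
}


def count_at_least(columns, threshold):
    """For each row, count how many column values are >= threshold."""
    # zip(*...) walks the columns in parallel, yielding one tuple per row;
    # summing booleans counts the True values, just like sum(axis=1) does.
    return [
        sum(value >= threshold for value in row)
        for row in zip(*columns.values())
    ]


bignum_count = count_at_least(checked, 50)
print(bignum_count)  # [0, 3, 2, 1]
```

The pandas expression does the same comparison and sum, only vectorized over the whole dataframe.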

How to get the common values among lists in dataframe column?

So I have a tsv file in the following format:
Gene version start end
ADK 1 23,45,67,89 30,51,79,96
ADK 2 23,67,89 30,79,96
ADK 3 23,89 30,96
I want to create a dictionary with only the starts and ends that are common across all 3 versions of a particular gene. The dictionary should be in the following format:
{'ADK': {'start': [23, 89], 'end': [30, 96]}}
The code that I am trying till now is:
def get_strong_ranges(file):
    for entry in utils.records_iterator(file):
        if entry['gene'] not in gene_exons:
            gene_exons[entry['gene']] = {'start': list(), 'end': list()}
        gene_exons[entry['gene']]['start'].append(entry['start'])
        gene_exons[entry['gene']]['end'].append(entry['end'])
However, I have yet to work out how to keep only the common ones. Any suggestions on how to do that?
df = pd.DataFrame({'gene': ['ADK', 'ADK', 'ADK'], 'version': [1,2,3], 'start': [[23,45,67,89], [23,67,89], [23,89]], 'end': [[30,51,79,96], [30,79,96], [30,96]]})
df
Out[14]:
gene version start end
0 ADK 1 [23, 45, 67, 89] [30, 51, 79, 96]
1 ADK 2 [23, 67, 89] [30, 79, 96]
2 ADK 3 [23, 89] [30, 96]
Convert the start column from lists to numbers:
start_df = df.explode('start')
start_df
Out[16]:
gene version start end
0 ADK 1 23 [30, 51, 79, 96]
0 ADK 1 45 [30, 51, 79, 96]
0 ADK 1 67 [30, 51, 79, 96]
0 ADK 1 89 [30, 51, 79, 96]
1 ADK 2 23 [30, 79, 96]
1 ADK 2 67 [30, 79, 96]
1 ADK 2 89 [30, 79, 96]
2 ADK 3 23 [30, 96]
2 ADK 3 89 [30, 96]
Count the number of versions per value of "start":
start_df_counts = start_df.groupby(['gene', 'start'])['version'].count()
Out[19]:
gene start
ADK 23 3
45 1
67 2
89 3
Name: version, dtype: int64
Compare it to the number of unique versions:
start_df_counts == len(set(start_df['version']))
Out[20]:
gene start
ADK 23 True
45 False
67 False
89 True
Name: version, dtype: bool
Take only those values:
start_df_counts[start_df_counts == len(set(start_df['version']))]
Out[30]:
gene start
ADK 23 3
89 3
Name: version, dtype: int64
Now, group by the gene and convert to list:
start_df_common = start_df_counts[start_df_counts == len(set(start_df['version']))]
start_df_common = start_df_common.reset_index()
start_df_common.groupby('gene')['start'].apply(list)
Out[35]:
gene
ADK [23, 89]
And finally, we can convert it to a dict:
final_start_dict = start_df_common.groupby('gene')['start'].apply(list).to_dict()
final_start_dict
Out[38]: {'ADK': [23, 89]}
Now you can apply the same for the end column.
Hope it helps :)
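As a different technique from the explode/groupby steps above, the common values can also be found with set intersection, which handles the start and end columns in one pass. This is a rough sketch; the function name common_ranges and the (gene, starts, ends) tuple layout are assumptions about how the TSV rows have been read.

```python
def common_ranges(records):
    """Return, per gene, the start and end values shared by all its versions."""
    per_gene = {}
    for gene, starts, ends in records:
        entry = per_gene.setdefault(gene, {"start": [], "end": []})
        # Each comma-separated field becomes a set of integers.
        entry["start"].append({int(x) for x in starts.split(",")})
        entry["end"].append({int(x) for x in ends.split(",")})
    # Intersect the sets of every version and sort for a stable result.
    return {
        gene: {key: sorted(set.intersection(*sets)) for key, sets in entry.items()}
        for gene, entry in per_gene.items()
    }


records = [
    ("ADK", "23,45,67,89", "30,51,79,96"),
    ("ADK", "23,67,89", "30,79,96"),
    ("ADK", "23,89", "30,96"),
]
print(common_ranges(records))
```

Note that this treats starts and ends independently, exactly as the answer above does, so it does not check that a common start still pairs with a common end.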

Printing numbers in a 4 by 4 grid in the most efficient way

I have this code that builds an array of 4 arrays, each holding 4 values, and prints it as 4 rows and 4 columns. Looking at the code, I thought there must be a more concise way of doing it. If anyone can think of one, please tell me.
import random
n=[[random.randint(0,100) for i in range(4)] for i in range(4)]
for i in range(len(n)):
    o=""
    for j in range(len(n[i])):
        o+=str(n[i][j])+" "
    print(o)
This is what I came up with:
import random
n=[[str(random.randint(0,100)) for i in range(4)] for i in range(4)]
for row in n:
    print('\t'.join(row))
OUTPUT:
53 34 62 45
21 45 39 94
52 75 53 94
88 16 97 80
I converted the numbers to string and applied the join method, joining each number with a tab ('\t').
The str.join() method is generally more efficient than repeatedly concatenating strings. You can do it like this:
import random
n=[[random.randint(0,100) for i in range(4)] for i in range(4)]
## n = [[64, 77, 76, 72], [43, 41, 30, 50], [59, 0, 34, 20], [41, 73, 81, 42]]
print( "\n".join( " ".join(str(value) for value in row) for row in n ) )
# Output:
# 64 77 76 72
# 43 41 30 50
# 59 0 34 20
# 41 73 81 42
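If the columns should also line up when the numbers have different lengths, a format width can be combined with the join approach. This is a small sketch; format_grid and its width parameter are names chosen for illustration.

```python
def format_grid(rows, width=3):
    """Join each row with spaces, right-aligning every value to `width` characters."""
    return "\n".join(
        " ".join(f"{value:>{width}}" for value in row) for row in rows
    )


n = [[64, 77, 76, 72], [43, 41, 30, 50], [59, 0, 34, 20], [41, 73, 81, 42]]
print(format_grid(n))
```

A width of 3 fits values from 0 to 100, which matches the range used by random.randint(0,100) in the question.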

Comparing between one value and a range of values in an array

I have a 2D array that looks like this:
a = [[ 0 0 0 0 0 25 30 35 40 45 50 55 60 65 70 75]
[ 4 5 6 7 8 29 34 39 44 49 54 59 64 69 74 250]]
and I also have another 1D array that looks like this:
age_array = [45,46,3,7]
Is there a way to verify that the values in age_array are within the range of the 2 values in the first column of a and, if not, then move on to the next column? For example,
if a[0: , :] <= age_array[i] <= a[1:, :]
    return True
else: return False
If you want to know, for each entry in the age array, which column's range (from a[0][i] to a[1][i]) it falls into:
a = [[0, 0, 0, 0, 0, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75],
[4, 5, 6, 7, 8, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74, 250]]
age_array = [45,46,3,7]
for age in age_array:
    # Scan the columns and stop at the first range that contains this age
    for i in range(len(a[0])):
        if a[0][i] <= age and age <= a[1][i]:
            print(str(age) + ' is between ' + str(a[0][i]) + ' and ' + str(a[1][i]))
            break
This outputs:
45 is between 45 and 49
46 is between 45 and 49
3 is between 0 and 4
7 is between 0 and 7
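The same scan can be written as a function that returns the index of the matching column, which is easier to reuse than printing. This is a minimal sketch under the assumption that the first matching column is the one wanted; find_range_index is a made-up name.

```python
def find_range_index(lows, highs, value):
    """Return the index of the first column with lows[i] <= value <= highs[i], else None."""
    for i, (low, high) in enumerate(zip(lows, highs)):
        if low <= value <= high:
            return i
    return None


a = [[0, 0, 0, 0, 0, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75],
     [4, 5, 6, 7, 8, 29, 34, 39, 44, 49, 54, 59, 64, 69, 74, 250]]
age_array = [45, 46, 3, 7]
print([find_range_index(a[0], a[1], age) for age in age_array])
```

Returning None for a value outside every range lets the caller decide how to handle ages that do not fit any column.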
You can convert both arrays into sets and then check whether the age_array set is a subset of the set built from a.
Unfortunately, I cannot post a full answer because your first array is not properly formatted.
Very simple to understand, but it might look quite ugly.
# b is presumably the 1D array (age_array in the question)
value=[]
for x in range(len(a)):
    for xx in range(len(a[x])):
        for xxx in range(len(b)):
            if a[x][xx]==b[xxx]:
                value.append("true")
            else:
                value.append("false")
for v in value:
    if v=="true":
        pass  # it falls in the category
