Python DataFrame Loop through an element and assign a value - python

Here, in my code, the correlation matrix is a dataframe and diag is a list.
When I run the following code (CholDC part at the bottom), it returns numpy.float64 object is not iterable.
What do I need to do to make this code work?
def CholDC (correl, diag):
    for column in correl:
        j = 0
        for j in correl[str(column)][j]:
            Sum = correl[str(column)][j]
            k = int(column)-1
            if k >= 1:
                Sum = Sum - correl[str(column)][k]*correl[str(j)][k]
            else:
                Sum = Sum
            if int(column) == j:
                if Sum <= 0:
                    print ("Should be PSD")
                else:
                    diag[int(column)] = np.sqrt(Sum)
            else:
                correl[str(j)][int(column)] = Sum / diag[int(column)]
diag = []
df_correl = pd.DataFrame(df_correlation)
CholDC(df_correl, diag)

To loop through the rows of a DataFrame, you can use iterrows(). See the example below:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100, size=(10, 4)), columns=list('ABCD'))
print(df)
for index, row in df.iterrows():
    print(row['B'], row['C'])
#dataframe output
A B C D
0 53 60 63 44
1 17 12 20 55
2 85 28 76 99
3 39 75 69 30
4 2 85 21 3
5 22 5 45 33
6 78 65 22 38
7 14 99 0 67
8 18 70 53 19
9 54 25 96 7
#output from loop
60 63
12 20
28 76
75 69
85 21
5 45
65 22
99 0
70 53
25 96
So, to go through the data row by row, use iterrows() in your code instead of for column in correl, which iterates over the column labels.
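As a small illustration of the difference: iterating a DataFrame directly yields its column labels, while iterrows() yields (index, row) pairs. The sketch below also reproduces the reported error message, on the assumption that it comes from the inner loop `for j in correl[str(column)][j]`, which tries to iterate over a single cell. The 2x2 frame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 correlation matrix with string column labels.
correl = pd.DataFrame([[1.0, 0.5], [0.5, 1.0]], columns=["0", "1"])

# Iterating the DataFrame itself yields column labels.
column_labels = [column for column in correl]

# iterrows() yields (index, Series) pairs, one per row.
row_indexes = [index for index, row in correl.iterrows()]

# A single cell is a scalar, so looping over it fails.
cell = correl["0"][0]
try:
    for _ in cell:
        pass
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(column_labels)
print(row_indexes)
print(error_message)
```

If that assumption holds, the inner loop probably needs to iterate over a range of row positions rather than over a cell value.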

Related

Create a column in pandas which increments by 1 for every 10 rows

import pandas as pd
import numpy as np
rng = np.random.default_rng()
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list('ABCD'))
This is my dataframe. I want to create a new column which starts from 1 and increases by 1 for every 10 rows. So, the column will have a value of 1 for the first ten rows, two for rows 11-20, 3 for 21-30... and so on.
You can use numpy's arange with floor division by your step and addition of the start:
start = 1
step = 10
df['new'] = np.arange(len(df))//step+start
output:
A B C D new
0 6 80 51 21 1
1 74 52 18 24 1
2 14 25 19 89 1
3 21 89 2 69 1
4 46 32 77 98 1
.. .. .. .. .. ...
95 62 87 89 65 10
96 88 70 44 68 10
97 71 14 2 10 10
98 45 62 89 65 10
99 62 40 45 93 10
[100 rows x 5 columns]
You can also use numpy's repeat:
df['new'] = np.repeat(np.arange(1, 11), 10)
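The two approaches can be checked against each other. The sketch below assumes a 100-row frame like the one in the question (the seed is arbitrary). Note that the repeat call as written only fits a frame of exactly 10 blocks of 10 rows, whereas the floor-division form works for any length.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))

start, step = 1, 10

# Floor division maps row positions 0-9 to 0, 10-19 to 1, and so on.
by_floor_division = np.arange(len(df)) // step + start

# repeat() builds the same labels, but only for exactly 100 rows.
by_repeat = np.repeat(np.arange(1, 11), 10)

df["new"] = by_floor_division
print(df["new"].iloc[[0, 9, 10, 99]].tolist())
```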

How to swap array rows without numpy

Hello everyone here is my code:
n =[[34,2,55,24,22],[31,22,4,7,333],[87,74,44,12,48]]
for r in n:
    for c in r:
        print(c,end = " ")
    print()
sums=[]
for i in n:
    sum=0
    for num in i:
        sum+=int(num)
    sums.append(sum)
print(*sums)
print(*(min(row) for row in n))
And here is what it prints out:
34 2 55 24 22
31 22 4 7 333
87 74 44 12 48
137 397 265
2 4 12
I need to swap the row with the smallest number and the row with the biggest number, which here means rows 1 and 2, like this:
31 22 4 7 333
34 2 55 24 22
87 74 44 12 48
#end result needs to look like this:
34 2 55 24 22
31 22 4 7 333
87 74 44 12 48
137 397 265
2 4 12
31 22 4 7 333
34 2 55 24 22
87 74 44 12 48
Please help me. I can't use numpy because it doesn't work; I tried using it, but all it gives me are errors.
I assume you want to swap the row that contains the largest number with the row that contains the smallest number:
maxs = [max(i) for i in n]
mins = [min(i) for i in n]
max_idx = maxs.index(max(maxs))
min_idx = mins.index(min(mins))
n[max_idx], n[min_idx] = n[min_idx], n[max_idx]
# you need to think about when min_idx = max_idx
# or when there's more than one max/min
If you don't mind numpy, you can use:
max_idx = np.argmax(np.max(n, axis=1))
min_idx = np.argmin(np.min(n, axis=1))
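A pure-Python sketch that also covers the edge cases flagged in the comments above. The helper name swap_extreme_rows is hypothetical, and resolving ties by taking the first matching row is an assumption rather than something the question specifies.

```python
def swap_extreme_rows(rows):
    """Swap the row holding the overall maximum with the row holding the overall minimum."""
    max_idx = max(range(len(rows)), key=lambda i: max(rows[i]))
    min_idx = min(range(len(rows)), key=lambda i: min(rows[i]))
    # If both extremes are in the same row, there is nothing to swap.
    if max_idx != min_idx:
        rows[max_idx], rows[min_idx] = rows[min_idx], rows[max_idx]
    return rows


n = [[34, 2, 55, 24, 22], [31, 22, 4, 7, 333], [87, 74, 44, 12, 48]]
for row in swap_extreme_rows(n):
    print(*row)
```

The function modifies the list in place and returns it, so it can be used directly in a loop as shown.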

Print numbers serially in columns

I am struggling with a pattern-printing problem in Python.
When input = 3, below is the expected output (input value is the number of columns it should print)
Expected output:
1
2 6
3 7 9
4 8
5
I am somehow moving in a wrong direction, hence would need some help in it.
This is the code I have tried so far:
def display():
    n = 5
    i = 1
    # Outer loop for how many lines we want to print
    while(i<=n):
        k = i
        j = 1
        # Inner loop for printing natural number
        while(j <= i):
            print (k,end=" ")
            # Logic to print natural value column-wise
            k = k + n - j
            j = j + 1
        print("\r")
        i = i + 1
#Driver code
display()
But it is giving me output as this:
1
2 6
3 7 10
4 8 11 13
5 9 12 14 15
Anybody who can help me with this?
n=10
for i in range(1,2*n):
    k=i
    for j in range(2*n-i if i>n else i):
        print(k,end=' ')
        k = k + 2*n - 2*j - 2
    print()
Result
1
2 20
3 21 37
4 22 38 52
5 23 39 53 65
6 24 40 54 66 76
7 25 41 55 67 77 85
8 26 42 56 68 78 86 92
9 27 43 57 69 79 87 93 97
10 28 44 58 70 80 88 94 98 100
11 29 45 59 71 81 89 95 99
12 30 46 60 72 82 90 96
13 31 47 61 73 83 91
14 32 48 62 74 84
15 33 49 63 75
16 34 50 64
17 35 51
18 36
19
Here's one way. I started from scratch rather than from your code, which was much easier for me:
def build(nb_cols):
    values = list(range(1, nb_cols ** 2 + 1))
    res = []
    for idx in range(nb_cols):
        row_values, values = values[-(idx * 2 + 1):], values[:-(idx * 2 + 1)]
        res.append([' '] * (nb_cols - idx - 1) + row_values + [' '] * (nb_cols - idx - 1))
    for r in zip(*reversed(res)):
        print(" ".join(map(str, r)))
Here's a recursive solution:
def col_counter(start, end):
    yield start
    if start < end:
        yield from col_counter(start+1, end)
        yield start
def row_generator(start, col, N, i=1):
    if i < col:
        start = start + 2*(N - i)
        yield start
        yield from row_generator(start, col, N, i+1)
def display(N):
    for i, col_num in enumerate(col_counter(1, N), 1):
        print(i, *row_generator(i, col_num, N))
Output:
>>> display(3)
1
2 6
3 7 9
4 8
5
>>> display(4)
1
2 8
3 9 13
4 10 14 16
5 11 15
6 12
7
>>> display(10)
1
2 20
3 21 37
4 22 38 52
5 23 39 53 65
6 24 40 54 66 76
7 25 41 55 67 77 85
8 26 42 56 68 78 86 92
9 27 43 57 69 79 87 93 97
10 28 44 58 70 80 88 94 98 100
11 29 45 59 71 81 89 95 99
12 30 46 60 72 82 90 96
13 31 47 61 73 83 91
14 32 48 62 74 84
15 33 49 63 75
16 34 50 64
17 35 51
18 36
19
Here is the solution using simple loops
def display(n):
    nrow = 2*n -1 #Number of rows
    i = 1
    noofcols = 1 #Number of columns in each row
    t = 1
    while (i <= nrow):
        print(i,end=' ')
        if i <= n:
            noofcols = i
        else:
            noofcols = 2*n - i
        m =i
        if t < noofcols:
            for x in range(1,noofcols):
                m = nrow + m -(2*x-1)
                print(m, end=' ')
        i = i+1
        print()
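The answers above all follow the same rule: row i of the 2n - 1 rows starts at i, holds min(i, 2n - i) numbers, and each step to the right adds 2(n - 1), then 2(n - 2), and so on. A compact sketch of that rule, using a hypothetical helper pattern_rows that returns the rows as lists so they can be inspected before printing:

```python
def pattern_rows(n):
    """Return the rows of the column-wise number pattern for n columns."""
    rows = []
    for i in range(1, 2 * n):
        width = min(i, 2 * n - i)  # rows grow up to n entries, then shrink
        value = i
        row = [value]
        for j in range(1, width):
            value += 2 * (n - j)  # gap between neighbouring columns
            row.append(value)
        rows.append(row)
    return rows


for row in pattern_rows(3):
    print(*row)
```

Returning lists instead of printing inside the loops keeps the arithmetic separate from the formatting, which makes the rule easier to check for other values of n.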

Looping a function with Pandas DataFrames

I have some function that takes a DataFrame and an integer as arguments:
func(df, int)
The function returns a new DataFrame, e.g.:
df2 = func(df,2)
I'd like to write a loop for integers 2-10, resulting in 9 DataFrames. If I do this manually it would look like this:
df2 = func(df,2)
df3 = func(df2,3)
df4 = func(df3,4)
df5 = func(df4,5)
df6 = func(df5,6)
df7 = func(df6,7)
df8 = func(df7,8)
df9 = func(df8,9)
df10 = func(df9,10)
Is there a way to write a loop that does this?
This type of thing is what lists are for.
data_frames = [df]
for i in range(2, 11):
    data_frames.append(func(data_frames[-1], i))
It's a sign of brittle code when you see variable names like df1, df2, df3, etc. Use lists when you have a sequence of related objects to build.
To clarify, this data_frames is a list of DataFrames that can be concatenated with data_frames = pd.concat(data_frames, sort=False), resulting in one DataFrame that combines the original df with everything that results from the loop, correct?
Yup, that's right. If your goal is one final data frame, you can concatenate the entire list at the end to combine the information into a single frame.
Do you mind explaining why data_frames[-1], which takes the last item of the list, returns a DataFrame? Not clear on this.
Because as you're building the list, at all times each entry is a data frame. data_frames[-1] evaluates to the last element in the list, which in this case, is the data frame you most recently appended.
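A runnable sketch of the list-based loop. The real func is not shown in the question, so the stand-in below (adding the integer to every cell) and the sample values are assumptions made purely for illustration.

```python
import pandas as pd


def func(frame, k):
    # Hypothetical stand-in for the question's func: add k to every cell.
    return frame + k


df = pd.DataFrame({"a": [75, 48], "b": [18, 56], "c": [17, 3]})

data_frames = [df]
for i in range(2, 11):
    # Each call receives the most recently produced frame.
    data_frames.append(func(data_frames[-1], i))

# Optionally combine the original frame and the nine results.
combined = pd.concat(data_frames, sort=False)
print(len(data_frames), combined.shape)
```

The list holds ten entries because it starts with the original df; data_frames[1:] gives the nine results that correspond to df2 through df10.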
You may try using itertools.accumulate as follows:
sample data
df:
a b c
0 75 18 17
1 48 56 3
import itertools
def func(x, y):
    return x + y
dfs = list(itertools.accumulate([df] + list(range(2, 11)), func))
[ a b c
0 75 18 17
1 48 56 3, a b c
0 77 20 19
1 50 58 5, a b c
0 80 23 22
1 53 61 8, a b c
0 84 27 26
1 57 65 12, a b c
0 89 32 31
1 62 70 17, a b c
0 95 38 37
1 68 76 23, a b c
0 102 45 44
1 75 83 30, a b c
0 110 53 52
1 83 91 38, a b c
0 119 62 61
1 92 100 47, a b c
0 129 72 71
1 102 110 57]
dfs is the list of resulting DataFrames: the first is the original df, and each subsequent one adds the next integer from 2 to 10 to the previous result.
If you want to concatenate them all into one DataFrame, use pd.concat:
pd.concat(dfs)
Out[29]:
a b c
0 75 18 17
1 48 56 3
0 77 20 19
1 50 58 5
0 80 23 22
1 53 61 8
0 84 27 26
1 57 65 12
0 89 32 31
1 62 70 17
0 95 38 37
1 68 76 23
0 102 45 44
1 75 83 30
0 110 53 52
1 83 91 38
0 119 62 61
1 92 100 47
0 129 72 71
1 102 110 57
You can also use exec with a formatted string, though a list as shown above is usually easier to maintain:
for i in range(2, 11):
    exec("df{0} = func(df{1}, {0})".format(i, i - 1 if i > 2 else ''))

Pandas, subtract DataFrame columns in a loop

I am new to pandas. I have a DataFrame that consists of 6 columns and I would like to write a for loop that does this:
- create a new column (nc1)
- nc1 = column 1 - column 2
and I want to repeat this for all columns, so the last one would be:
ncx = column 5 - column 6
I can subtract columns like this:
df['nc'] = df.Column1 - df.Column2
but this is not useful in a loop, since I always have to type the column names.
Can someone tell me how I can refer to columns by number?
Thank you!
In [26]: import numpy as np
...: import random
...: import pandas as pd
...:
...: A = pd.DataFrame(np.random.randint(100, size=(5, 6)))
In [27]: A
Out[27]:
0 1 2 3 4 5
0 82 13 17 58 68 67
1 81 45 15 11 20 63
2 0 84 34 60 90 34
3 59 28 46 96 86 53
4 45 74 14 10 5 12
In [28]: for i in range(0, 5):
...:     A[(i + 6)] = A[i] - A[(i + 1)]
...:
...:
...: A
...:
Out[28]:
0 1 2 3 4 5 6 7 8 9 10
0 82 13 17 58 68 67 69 -4 -41 -10 1
1 81 45 15 11 20 63 36 30 4 -9 -43
2 0 84 34 60 90 34 -84 50 -26 -30 56
3 59 28 46 96 86 53 31 -18 -50 10 33
4 45 74 14 10 5 12 -29 60 4 5 -7
In [29]: nc = 1 #The first new column
...: A[(nc + 5)] #outputs the first new column
Out[29]:
0 69
1 36
2 -84
3 31
4 -29
Here you don't need to call it by name, just by the column number, and you can just write a simple function that calls the column + 5
Something like this:
In [31]: def call_new_column(n):
...:     return(A[(n + 5)])
...:
...:
...: call_new_column(2)
Out[31]:
0 -4
1 30
2 50
3 -18
4 60
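Columns can also be addressed by position with iloc, which avoids relying on the column names at all. A minimal sketch, assuming six numeric columns as in the question; the new column names nc1 to nc5 and the sample values are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[82, 13, 17, 58, 68, 67], [81, 45, 15, 11, 20, 63]],
    columns=["Column1", "Column2", "Column3", "Column4", "Column5", "Column6"],
)

n_original = df.shape[1]
for i in range(n_original - 1):
    # iloc[:, i] selects the i-th column by position, whatever its name is.
    df[f"nc{i + 1}"] = df.iloc[:, i] - df.iloc[:, i + 1]

print(df[["nc1", "nc5"]])
```

Because the new columns are appended at the end, positions 0 to 5 keep pointing at the original columns while the loop runs.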
