Related
I want to create a column 'List' for column 'Numbers' such that it gives a list leaving the element of the corresponding row in pandas.
Table:
| Numbers | List |
| -------- | -------------- |
| 1 | [2,3,4,1] |
| 2 | [3,4,1,1] |
| 3 | [4,1,1,2] |
| 4 | [1,1,2,3] |
| 1 | [1,2,3,4] |
Can anyone help with this, please?
For general solution working with duplicated values first repeat values by numpy.tile and then remove values of diagonal for delete value of row:
df = pd.DataFrame({'Numbers':[1,2,3,4,1]})
A = np.tile(df['Numbers'], len(df)).reshape(-1, len(df))
#https://stackoverflow.com/a/46736275/2901002
df['new'] = A[~np.eye(A.shape[0],dtype=bool)].reshape(A.shape[0],-1).tolist()
print (df)
0 1 [2, 3, 4, 1]
1 2 [1, 3, 4, 1]
2 3 [1, 2, 4, 1]
3 4 [1, 2, 3, 1]
4 1 [1, 2, 3, 4]
Try this:
df = pd.DataFrame({'numbers':range(1, 5)})
df['list'] = df['numbers'].apply(lambda x: [i for i in df['numbers'] if i != x])
df
import pandas as pd
df = pd.DataFrame({'Numbers':[1,2,3,4,5]})
df['List'] = df['Numbers'].apply(
# for every cell with element x in Numbers return Numbers without the element
lambda x: [y for y in df['Numbers'] if not y==x])
which results in:
df
Numbers List
0 1 [2, 3, 4, 5]
1 2 [1, 3, 4, 5]
2 3 [1, 2, 4, 5]
3 4 [1, 2, 3, 5]
4 5 [1, 2, 3, 4]
Abhinar Khandelwal. If the task is to get any other number besides the current row value, than my answer can be fixed to the following:
import numpy as np
rng = np.random.default_rng()
numbers = rng.integers(5, size=7)
df = pd.DataFrame({'numbers':numbers})
df['list'] = df.reset_index()['index'].apply(lambda x: df[df.index != x].numbers.values)
df
But this way is much faster https://stackoverflow.com/a/73275614/18965699 :)
a matrix is given:
1 2 3 4 5 6 7 8,
8 7 6 5 4 3 2 1,
2 3 4 5 6 7 8 9,
9 8 7 6 5 4 3 2,
1 3 5 7 9 7 5 3,
3 1 5 3 2 6 5 7,
1 7 5 9 7 3 1 5,
2 6 3 5 1 7 3 2.
Define a structure for storing the matrix.
Write code that swaps the first and last rows of the matrix.
Write the code for creating a matrix of any size, filled with zeros (the size is set via the console).
Write a code that will count how many times the number 3 occurs in the matrix.
I tried solving this but My teacher says the following code is wrong. Where is my mistake??
matr = [[1, 2, 3, 4, 5, 6, 7, 8],
[8, 7, 6, 5, 4, 3, 2, 1],
[2, 3, 4, 5, 6, 7, 8, 9],
[9, 8, 7, 6, 5, 4, 3, 2],
[1, 3, 5, 7, 9, 7, 5, 3],
[3, 1, 5, 3, 2, 6, 5, 7],
[1, 7, 5, 9, 7, 3, 1, 5],
[2, 6, 3, 5, 1, 7, 3, 2]]
def will_swap_first_and_last_rows(matr):
matr[len(matr) - 1], matr[0] = matr[0], matr[len(matr) - 1]
return matr
def will_craete_matrix_of_any_size_filled_with_zeros():
m = int(input('Enter the number of rows of the matrix '))
n = int(input('enter the number of columns of the matrix '))
return [[0] * m for i in range(n)]
def will_count_how_many_times_the_number_3_occurs_in_the_matrix(matr):
s = 0
for row in matr:
for elem in row:
if elem == 3:
s += 1
return s
print(*will_swap_first_and_last_rows(matr), sep='\n')
print(will_craete_matrix_of_any_size_filled_with_zeros())
print(will_count_how_many_times_the_number_3_occurs_in_the_matrix(matr))
Your code has rows (m) and columns (n) swapped. Do it like this:
return [[0] * n for i in range(m)]
Since the groups of 6 represent dice, the order of the integers in these groups does not matter. The order of the groups within each set does matter, but to check each one it would be simple to find the permutations of each set, which I can do.
Now, I've come up with a way to achieve all this, the problem is efficiency. I want to check each set but so far I've only come up with a way using:
for i in itertools.combinations(num, 6):
W.append([i])
print(W)
print("Done building list of ",len(W)," possible dice.")
which gives me all possible dice of 6 numbers chosen from 18 integers where the order of the numbers doesn't matter. This is 18,564 groups of 6. Then, I find the combination of these dice using:
for j in itertools.combinations(W, 3):
check = list(itertools.chain(*list(itertools.chain(*j))))
check = sorted(check)
#print(check)
if check == num:
The issue arises from the fact that the second combination will iterate through 10^15 possibilities where only a small fraction of that is what I want to generate.
Another question I have:
I'm pretty sure that the total ways to assign 18 things to 3 distinct groups of 6 where the order within the groups doesn't matter is given by: 18!/(6!)^3 ~ 17 million ways, which can be iterated through relatively quickly. Is this right? Is there a way to generate these ways iteratively without filtering through a much larger set of possibilities?
One last note:
What I'm doing is trying to generate all sets of non-transitive six sided dice where each face of each dice has a distinct number. I already have the code that can check a set and decide if it fits this condition and store the best result, i.e. the highest probability of a dice beating the next dice in the line.
for k in itertools.permutations(j):
B = k[0]
G = k[1]
R = k[2]
b = 0
g = 0
r = 0
for m in B:
for n in G:
if m > n:
b = b + 1
else:
continue
for m in G:
for n in R:
if m > n:
g = g + 1
else:
continue
for m in R:
for n in B:
if m > n:
r = r + 1
else:
continue
w = w + 1
print(w)
if b <= 18 or g <= 18 or r <= 18:
continue
else:
if b >= blue and g >= green and r >= red:
Blue = B
blue = b
Green = G
green = g
Red = R
red = r
print(B)
print(b/36)
print(G)
print(g/36)
print(R)
print(r/36)
continue
else:
continue
else:
continue
As mentioned in the comments, it appears that the OP is looking for Partitions of Groups of Equal Size.
The code below is converted from the algorithm implemented in C++ found here: https://github.com/jwood000/RcppAlgos/blob/master/src/ComboGroupsUtils.cpp *. I put the comment in a block of its own below, followed by the pythonized code (N.B. it is a direct translation and there is probably room for improvement).
******* Overview of the Crucial Part of the Algorithm *******
last1 is one plus the upper bound in the previous section, so to obtain the current current upper bound, we must first add the size of a section (i.e. grpSize) and subtract one. We can now compute the length we need to reset v by subtracting idx1. E.g.
Given a portion of v with grpSize = 4, idx1 = 9, 6 groups (24 subjects) and base 0:
prev sections bound (index = 8)
/ \ |
............. 8 | 9 12 23 24 | 10 20 21 22 | 11 ...
|
idx1 (equal to last1, in this case)
Sort v past idx1:
... 8 | 9 12 10 11 | 13 14 15 16 | 17...
Determine the index, idx3, such that v[idx3] > v[idx1]
... 8 | 9 12 10 11 | 13 14 15 16 | 17 ...
| |
idx1 idx3
Swap idx1 and idx3:
... 8 | 9 13 10 11 | 12 14 15 16 | 17...
Move enough indices after idx1 to fill that specific group:
... 8 | 9 13 __ __ | 10 11 12 14 | 15 16 ...
Identify and move indices that are successively incrementing values of v past idx1:
... 8 | 9 13 14 15 | 10 11 12 16 | 17 ...
The last two steps are accomplished with std::rotate. This completes the algorithm.
Here are some helpers functions:
def rotate(l, n):
return l[n:] + l[:n]
def numGroupCombs(n, nGrps, grpSize):
result = 1
for i in range(n, nGrps, -1):
result *= i
result = int(result)
myDiv = 1
for i in range(2, grpSize + 1):
myDiv *= i
result /= myDiv**nGrps
return int(result)
And here is the main generator (The rotate function was obtained from this answer Python list rotation):
def ComboGroups(v, nGrps, grpSize):
if not isinstance(v, list):
z = list(v)
else:
z = v.copy()
for i in range(numGroupCombs(len(z), nGrps, grpSize)):
yield z.copy()
idx1 = (nGrps - 1) * grpSize - 1
idx2 = len(z) - 1;
last1 = (nGrps - 2) * grpSize + 1
while (idx2 > idx1 and z[idx2] > z[idx1]):
idx2 -= 1
if (idx2 + 1) < len(z):
if z[idx2 + 1] > z[idx1]:
z[idx1], z[idx2 + 1] = z[idx2 + 1], z[idx1]
else:
while idx1 > 0:
tipPnt = z[idx2]
while (idx1 > last1 and tipPnt < z[idx1]):
idx1 -= 1
if tipPnt > z[idx1]:
idx3 = idx1 + 1
z[idx3:] = sorted(z[idx3:])
xtr = last1 + grpSize - idx3
while z[idx3] < z[idx1]:
idx3 += 1
z[idx3], z[idx1] = z[idx1], z[idx3]
z[(idx1 + 1):(idx3 + xtr)] = rotate(z[(idx1 + 1):(idx3 + xtr)], idx3 - idx1)
break
else:
idx1 -= 2
idx2 -= grpSize
last1 -= grpSize
And here is an example usage (https://ideone.com/kygF03):
import time
def example(z, nGrps, verbose = True):
grpSize = int(len(z) / nGrps)
if verbose:
for a in ComboGroups(z, nGrps, grpSize):
print([a[i:i + grpSize] for i in range(0, len(a), grpSize)])
else:
start = time.time()
for a in ComboGroups(z, nGrps, grpSize):
b = a
end = time.time()
print(end - start)
example(list(range(1, 9)), 2)
[[1, 2, 3, 4], [5, 6, 7, 8]]
[[1, 2, 3, 5], [4, 6, 7, 8]]
[[1, 2, 3, 6], [4, 5, 7, 8]]
[[1, 2, 3, 7], [4, 5, 6, 8]]
[[1, 2, 3, 8], [4, 5, 6, 7]]
[[1, 2, 4, 5], [3, 6, 7, 8]]
[[1, 2, 4, 6], [3, 5, 7, 8]]
[[1, 2, 4, 7], [3, 5, 6, 8]]
[[1, 2, 4, 8], [3, 5, 6, 7]]
[[1, 2, 5, 6], [3, 4, 7, 8]]
[[1, 2, 5, 7], [3, 4, 6, 8]]
[[1, 2, 5, 8], [3, 4, 6, 7]]
[[1, 2, 6, 7], [3, 4, 5, 8]]
[[1, 2, 6, 8], [3, 4, 5, 7]]
[[1, 2, 7, 8], [3, 4, 5, 6]]
[[1, 3, 4, 5], [2, 6, 7, 8]]
[[1, 3, 4, 6], [2, 5, 7, 8]]
[[1, 3, 4, 7], [2, 5, 6, 8]]
[[1, 3, 4, 8], [2, 5, 6, 7]]
[[1, 3, 5, 6], [2, 4, 7, 8]]
[[1, 3, 5, 7], [2, 4, 6, 8]]
[[1, 3, 5, 8], [2, 4, 6, 7]]
[[1, 3, 6, 7], [2, 4, 5, 8]]
[[1, 3, 6, 8], [2, 4, 5, 7]]
[[1, 3, 7, 8], [2, 4, 5, 6]]
[[1, 4, 5, 6], [2, 3, 7, 8]]
[[1, 4, 5, 7], [2, 3, 6, 8]]
[[1, 4, 5, 8], [2, 3, 6, 7]]
[[1, 4, 6, 7], [2, 3, 5, 8]]
[[1, 4, 6, 8], [2, 3, 5, 7]]
[[1, 4, 7, 8], [2, 3, 5, 6]]
[[1, 5, 6, 7], [2, 3, 4, 8]]
[[1, 5, 6, 8], [2, 3, 4, 7]]
[[1, 5, 7, 8], [2, 3, 4, 6]]
[[1, 6, 7, 8], [2, 3, 4, 5]]
And for your example, the unoptimized implementation above can iterate through all 2,858,856 results in just over 4 seconds.
example(list(range(1, 19)), 3, False)
4.4308202266693115
For reference, the same algorithm in C++ runs in about 0.12 seconds.
* I am the author of RcppAlgos
First, take all the combinations of choosing 6 things from 18. These are your first dice. There are 18564 possibilities.
Then, take all the combinations of choosing 6 things from 12 remaining (924), similarly. This will give you a second die, given your first one.
Finally, the third die is just all the remaining numbers. That's 18564x924x1=17,153,136 possibilities.
But actually, wait. We don't care about the order of the three dice, either. So we can assume that "1" is on the first die. And then that the lowest number not on the first die, is on the second die. That's 6188x462x1=2,858,856 possibilities.
Here's some python code. It's not as fast as if you did the combinations yourself, but it should run fine.
And if you run this on real transitive dice, try removing the "unique" constraint! I'm curious if there are interesting dice with repeats.
import itertools
def iter_dice():
nums = list(range(1,19))
for first_die in itertools.combinations(nums[1:], 5):
first_die = (1,) + first_die
remaining = sorted(set(nums) - set(first_die))
for second_die in itertools.combinations(remaining[1:], 5):
second_die = (remaining[0],) + second_die
third_die = sorted(set(remaining) - set(second_die))
yield first_die, second_die, third_die
print(len(list(iter_dice())))
print(next(iter_dice()))
I have dataframe something like this
| ID | M001 | M002 | M003 | M004 |
|------|------|------|------|------|
| E001 | 3 | 4 | 3 | 2 |
| E002 | 4 | 5 | 5 | 3 |
| E003 | 4 | 3 | 5 | 4 |
And I want output in list but something like this for each unique ID such E001, E002 i want list of their response in each M001, M002 and so on
My required output is different variable for different id lets say
E001_response = [["M001",3],["M002",4],["M003",3],["M004",2]]
You can create Series with lists with custom lambda function:
s = df.set_index('ID').apply(lambda x: list(map(list,zip(df.columns[1:], x))), 1)
print (s)
ID
E001 [[M001, 3], [M002, 4], [M003, 3], [M004, 2]]
E002 [[M001, 4], [M002, 5], [M003, 5], [M004, 3]]
E003 [[M001, 4], [M002, 3], [M003, 5], [M004, 4]]
dtype: object
And then is possible use globals, but better is not create varables by names
for k, v in s.items():
globals()[f'{k}_response'] = v
print (E001_response)
[['M001', 3], ['M002', 4], ['M003', 3], ['M004', 2]]
Better is create dictionary:
d = s.to_dict()
print (d)
{'E001': [['M001', 3], ['M002', 4], ['M003', 3], ['M004', 2]],
'E002': [['M001', 4], ['M002', 5], ['M003', 5], ['M004', 3]],
'E003': [['M001', 4], ['M002', 3], ['M003', 5], ['M004', 4]]}
print (d['E001'])
[['M001', 3], ['M002', 4], ['M003', 3], ['M004', 2]]
Something like that:
new_df = df.apply(lambda x: list(zip(df.columns, x)), axis=1)
Output:
ID
E001 [(M001, 3), (M001, 4), (M001, 3), (M001, 2)]
E002 [(M002, 4), (M002, 5), (M002, 5), (M002, 3)]
E003 [(M003, 4), (M003, 3), (M003, 5), (M003, 4)]
dtype: object
I have Pandas DataFrame that looks like this:
| Index | Value |
|-------|--------------|
| 1 | [1, 12, 123] |
| 2 | [12, 123, 1] |
| 3 | [123, 12, 1] |
and I want to append third column with list of array elements lengths:
| Index | Value | Expected_value |
|-------|--------------|----------------|
| 1 | [1, 12, 123] | [1, 2, 3] |
| 2 | [12, 123, 1] | [2, 3, 1] |
| 3 | [123, 12, 1] | [3, 2, 1] |
I've tried to use python lambda function and mapping little bit like this:
dataframe["Expected_value"] = dataframe.value.map(lambda x: len(str(x)))
but instead of list I got sum of those lengths:
| Index | Value | Expected_value |
|-------|--------------|----------------|
| 1 | [1, 12, 123] | 6 |
| 2 | [12, 123, 1] | 6 |
| 3 | [123, 12, 1] | 6 |
You can use list comprehension with map:
dataframe["Expected_value"] = dataframe.Value.map(lambda x: [len(str(y)) for y in x])
Or nested list comprehension:
dataframe["Expected_value"] = [[len(str(y)) for y in x] for x in dataframe.Value]
There is also possible use alternative for get lengths of integers:
import math
dataframe["Expected_value"] = [[int(math.log10(y))+1 for y in x] for x in dataframe.Value]
print (dataframe)
Index Value Expected_value
0 1 [1, 12, 123] [1, 2, 3]
1 2 [12, 123, 1] [2, 3, 1]
2 3 [123, 12, 1] [3, 2, 1]
Use a list comprehension:
[[len(str(y)) for y in x] for x in df['Value'].tolist()]
# [[1, 2, 3], [2, 3, 1], [3, 2, 1]]
df['Expected_value'] = [[len(str(y)) for y in x] for x in df['Value'].tolist()]
df
Index Value Expected_value
0 1 [1, 12, 123] [1, 2, 3]
1 2 [12, 123, 1] [2, 3, 1]
2 3 [123, 12, 1] [3, 2, 1]
If you need to handle missing data,
def foo(x):
try:
return [len(str(y)) for y in x]
except TypeError:
return np.nan
df['Expected_value'] = [foo(x) for x in df['Value'].tolist()]
df
Index Value Expected_value
0 1 [1, 12, 123] [1, 2, 3]
1 2 [12, 123, 1] [2, 3, 1]
2 3 [123, 12, 1] [3, 2, 1]
It is probably the best in terms of performance when dealing with object type data. More reading at For loops with pandas - When should I care?.
Another solution with pd.DataFrame, applymap and agg:
pd.DataFrame(df['Value'].tolist()).astype(str).applymap(len).agg(list, axis=1)
0 [1, 2, 3]
1 [2, 3, 1]
2 [3, 2, 1]
dtype: object