Below is the input data
Type Cat Var Dist Count
#joy A1 + B1 x + y + z 0:25:75 4
.cet C1 + D1 p + q 50:50 2
sam E1 + F1 g 100:3:2 10
Below is the intended output
Type Cat Var Dist Count Output
#joy A1 + B1 x + y + z 0:25:75 4 #joyA1 + B1x + y + z
.cet C1 + D1 p + q 50:50 2 .cetC1 + D1p + q
sam E1 + F1 g 100:3:2 10 samE1 + F1g
Below is my attempt:
df.iloc[:,0:3].dot(['Type','Cat','Var'])
You can do that using
df['output'] = df['Type'].map(str) + df['Cat'].map(str) + df['Var'].map(str)
You can simply use the following (it places a space between the parts; drop the ' ' terms if no separator is wanted):
df['Output']=df['Type']+' '+df['Cat']+' '+df['Var']
output:
Type Cat Var Dist Count output
0 #joy A1 + B1 x + y + z 0.018229167 4 #joy A1 + B1 x + y + z
1 .cet C1 + D1 p + q 50:50:00 2 .cet C1 + D1 p + q
2 sam E1 + F1 g 100:03:02 10 sam E1 + F1 g
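The version above inserts a space between the three parts, whereas the intended output in the question has none. Below is a minimal sketch of the no-separator variant. It assumes the columns may hold non-string values (hence the astype(str)), and the DataFrame is rebuilt by hand from the question's sample data:

```python
import pandas as pd

# Sample frame reconstructed from the question's input table.
df = pd.DataFrame({
    "Type": ["#joy", ".cet", "sam"],
    "Cat": ["A1 + B1", "C1 + D1", "E1 + F1"],
    "Var": ["x + y + z", "p + q", "g"],
    "Dist": ["0:25:75", "50:50", "100:3:2"],
    "Count": [4, 2, 10],
})

cols = ["Type", "Cat", "Var"]
# Cast to str, then join each row's values with an empty separator.
df["Output"] = df[cols].astype(str).agg("".join, axis=1)
print(df["Output"].tolist())
```

Using a list of column names keeps the call the same if more columns need to be combined later.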
Base R: Using paste0
df$Output <- paste0(df$Type, df$Cat, df$Var)
Type Cat Var Dist Count Output
1 #joy A1 + B1 x + y + z 0:25:75 4 #joyA1 + B1x + y + z
2 .cet C1 + D1 p + q 50:50 2 .cetC1 + D1p + q
3 sam E1 + F1 g 100:3:2 10 samE1 + F1g
OR
library(dplyr)
df %>%
mutate(Output = paste(Type, Cat, Var, sep = ""))
Type Cat Var Dist Count Output
1 #joy A1 + B1 x + y + z 0:25:75 4 #joyA1 + B1x + y + z
2 .cet C1 + D1 p + q 50:50 2 .cetC1 + D1p + q
3 sam E1 + F1 g 100:3:2 10 samE1 + F1g
OR:
library(tidyr)
df %>%
unite(Output, c(Type, Cat, Var), remove=FALSE)
Output Type Cat Var Dist Count
1 #joy_A1 + B1_x + y + z #joy A1 + B1 x + y + z 0:25:75 4
2 .cet_C1 + D1_p + q .cet C1 + D1 p + q 50:50 2
3 sam_E1 + F1_g sam E1 + F1 g 100:3:2 10
I am in the middle of programming a chess AI and have run into a problem when calculating all possible diagonal moves for the bishop. I think the problem lies in the function reverse_bits(): I don't think my program handles negative binary numbers correctly, but I might be wrong.
# ranks
rank1 = int("0000000000000000000000000000000000000000000000000000000011111111", 2)
rank2 = int("0000000000000000000000000000000000000000000000001111111100000000", 2)
rank3 = int("0000000000000000000000000000000000000000111111110000000000000000", 2)
rank4 = int("0000000000000000000000000000000011111111000000000000000000000000", 2)
rank5 = int("0000000000000000000000001111111100000000000000000000000000000000", 2)
rank6 = int("0000000000000000111111110000000000000000000000000000000000000000", 2)
rank7 = int("0000000011111111000000000000000000000000000000000000000000000000", 2)
rank8 = int("1111111100000000000000000000000000000000000000000000000000000000", 2)
# files
filea = int("1000000010000000100000001000000010000000100000001000000010000000", 2)
fileb = int("0100000001000000010000000100000001000000010000000100000001000000", 2)
filec = int("0010000000100000001000000010000000100000001000000010000000100000", 2)
filed = int("0001000000010000000100000001000000010000000100000001000000010000", 2)
filee = int("0000100000001000000010000000100000001000000010000000100000001000", 2)
filef = int("0000010000000100000001000000010000000100000001000000010000000100", 2)
fileg = int("0000001000000010000000100000001000000010000000100000001000000010", 2)
fileh = int("0000000100000001000000010000000100000001000000010000000100000001", 2)
# diagonals
d0 = int("0000000100000000000000000000000000000000000000000000000000000000", 2)
d1 = int("0000001000000001000000000000000000000000000000000000000000000000", 2)
d2 = int("0000010000000010000000010000000000000000000000000000000000000000", 2)
d3 = int("0000100000000100000000100000000100000000000000000000000000000000", 2)
d4 = int("0001000000001000000001000000001000000001000000000000000000000000", 2)
d5 = int("0010000000010000000010000000010000000010000000010000000000000000", 2)
d6 = int("0100000000100000000100000000100000000100000000100000000100000000", 2)
d7 = int("1000000001000000001000000001000000001000000001000000001000000001", 2)
d8 = int("0000000010000000010000000010000000010000000010000000010000000010", 2)
d9 = int("0000000000000000100000000100000000100000000100000000100000000100", 2)
d10 = int("0000000000000000000000001000000001000000001000000001000000001000", 2)
d11 = int("0000000000000000000000000000000010000000010000000010000000010000", 2)
d12 = int("0000000000000000000000000000000000000000100000000100000000100000", 2)
d13 = int("0000000000000000000000000000000000000000000000001000000001000000", 2)
d14 = int("0000000000000000000000000000000000000000000000000000000010000000", 2)
# anti-diagonal
ad0 = int("1000000000000000000000000000000000000000000000000000000000000000", 2)
ad1 = int("0100000010000000000000000000000000000000000000000000000000000000", 2)
ad2 = int("0010000001000000100000000000000000000000000000000000000000000000", 2)
ad3 = int("0001000000100000010000001000000000000000000000000000000000000000", 2)
ad4 = int("0000100000010000001000000100000010000000000000000000000000000000", 2)
ad5 = int("0000010000001000000100000010000001000000100000000000000000000000", 2)
ad6 = int("0000001000000100000010000001000000100000010000001000000000000000", 2)
ad7 = int("0000000100000010000001000000100000010000001000000100000010000000", 2)
ad8 = int("0000000000000001000000100000010000001000000100000010000001000000", 2)
ad9 = int("0000000000000000000000010000001000000100000010000001000000100000", 2)
ad10 = int("0000000000000000000000000000000100000010000001000000100000010000", 2)
ad11 = int("0000000000000000000000000000000000000001000000100000010000001000", 2)
ad12 = int("0000000000000000000000000000000000000000000000010000001000000100", 2)
ad13 = int("0000000000000000000000000000000000000000000000000000000100000010", 2)
ad14 = int("0000000000000000000000000000000000000000000000000000000000000001", 2)
# masks
rankmask = [rank1, rank2, rank3, rank4, rank5, rank6, rank7, rank8]
filemask = [filea, fileb, filec, filed, filee, filef, fileg, fileh]
diagonal = [d14, d13, d12, d11, d10, d9, d8, d7, d6, d5, d4, d3, d2, d1, d0]
antidiagonal = [ad14, ad13, ad12, ad11, ad10, ad9, ad8, ad7, ad6, ad5, ad4, ad3, ad2, ad1, ad0]
last_black_pm = [53, 45]
# bitboards
wp = 0
wr = 0
wn = 0
wb = 0
wq = 0
wk = 0
bp = 0
br = 0
bn = 0
bb = 0
bq = 0
bk = 0

def print_bitboard(bitboard):
    # Print a 64-bit integer as an 8x8 grid of bits
    board = '{:064b}'.format(bitboard)
    for i in range(8):
        print(" ".join(board[8*i:8*i+8]))

def print_chess_board(bitboard):
    # Print a 64-character board string as an 8x8 grid
    board = bitboard
    for i in range(8):
        print(" ".join(board[8*i:8*i+8]))

def integer_to_bitboard(integer):
    bitboard = '{:064b}'.format(integer)
    return bitboard

def create_starting_bitboards():
    global last_black_pm, wp, wr, wn, wb, wq, wk, bp, bn, bb, bq, bk, br
    bitboard_all_pieces = "rnbqkbnrpppppppp0000000000B000000000000000000000PPPPPPPPRNBQKBNR"
    print_chess_board(bitboard_all_pieces)
    for i in range(64):
        if bitboard_all_pieces[i] == "P":
            wp += 2**(63-i)
        if bitboard_all_pieces[i] == "R":
            wr += 2**(63-i)
        if bitboard_all_pieces[i] == "N":
            wn += 2**(63-i)
        if bitboard_all_pieces[i] == "B":
            wb += 2**(63-i)
        if bitboard_all_pieces[i] == "Q":
            wq += 2**(63-i)
        if bitboard_all_pieces[i] == "K":
            wk += 2**(63-i)
        if bitboard_all_pieces[i] == "p":
            bp += 2**(63-i)
        if bitboard_all_pieces[i] == "r":
            br += 2**(63-i)
        if bitboard_all_pieces[i] == "n":
            bn += 2**(63-i)
        if bitboard_all_pieces[i] == "b":
            bb += 2**(63-i)
        if bitboard_all_pieces[i] == "q":
            bq += 2**(63-i)
        if bitboard_all_pieces[i] == "k":
            bk += 2**(63-i)
    occupied = wp | wr | wn | wb | wq | wk | bp | br | bn | bb | bq | bk
    # g_white_pawn_moves(wp, wr, wn, wb, wq, wk, bp, br, bn, bb, bq, bk)
    g_white_bishop_moves(wp, wr, wn, wb, wq, wk, occupied)

def reverse_bits(num):
    num = '{:064b}'.format(num)[::-1]
    if num[-1] == "-":
        num = num[:-1]
    return int(num, 2)

def vertical_horizontal_moves(s, occupied):
    global rankmask, filemask
    ranknum = int(s/8)
    filenum = 7 - int(s % 8)
    slider = 1 << s
    horizontal = ((occupied - 2*slider) ^ reverse_bits(reverse_bits(occupied)-2*reverse_bits(slider))) & rankmask[ranknum]
    vertical = (((occupied & filemask[filenum]) - 2 * slider) ^ reverse_bits(reverse_bits(occupied & filemask[filenum]) - 2 * reverse_bits(slider))) & filemask[filenum]
    print_bitboard(vertical ^ horizontal)
    return vertical ^ horizontal

def diagonal_antidiagonal_moves(s, occupied):
    global diagonal, antidiagonal
    diagonalnum = 7 - int(s % 8) + int(s/8)
    antidiagonalnum = int(s / 8) + int(s % 8)
    slider = 1 << s
    diag1 = (((occupied & diagonal[diagonalnum]) - 2 * slider) ^ reverse_bits(reverse_bits(occupied & diagonal[diagonalnum]) - 2 * reverse_bits(slider))) & diagonal[diagonalnum]
    diag2 = (((occupied & antidiagonal[antidiagonalnum]) - 2 * slider) ^ reverse_bits(reverse_bits(occupied & antidiagonal[antidiagonalnum]) - 2 * reverse_bits(slider))) & antidiagonal[antidiagonalnum]
    return diag1 ^ diag2

def g_white_bishop_moves(wp, wr, wn, wb, wq, wk, occupied):
    white_pieces = wp | wr | wn | wb | wq | wk
    moves_list = []
    for i in range(64):
        if (wb >> i) & 1 == 1:
            moves = diagonal_antidiagonal_moves(i, occupied) & ~white_pieces
            for j in range(64):
                if (moves >> j) & 1 == 1:
                    moves_list.extend((i, j))
            print("")
            print_bitboard(moves)

def g_white_pawn_moves(wp, wr, wn, wb, wq, wk, bp, br, bn, bb, bq, bk):
    global rank8, rank4, rank5, fileh, filea, filemask
    empty = ~(wp | wr | wn | wb| wq | wk | bp | br | bn | bb | bq | bk)
    black = bp | br | bn | bb | bq
    moves_list = []
    # pawn 1 forward
    moves = (wp << 8) & empty & ~ rank8
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i-8, i, ""))
    # pawn 2 forward
    moves = (wp << 16) & empty & (empty << 8) & rank4
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i-16, i, ""))
    # pawn left capture
    moves = (wp << 9) & black & ~ rank8 & ~ fileh
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i - 9, i, ""))
    # pawn right capture
    moves = (wp << 7) & black & ~ rank8 & ~ filea
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i - 7, i, ""))  # shifted by 7, so the origin is i - 7
    # en passant
    if last_black_pm[0] - last_black_pm[1] == 16:
        filenum = 7 - int(last_black_pm[1] % 8)
        # en passant left
        moves = (wp << 1) & black & rank5 & ~fileh & filemask[filenum] # pawn_capture_right
        for i in range(64):
            if (moves >> i) & 1 == 1:
                moves_list.extend((i - 1, i + 8, "E")) # store piece field/ and move field 0-63
        # en passant right
        moves = (wp >> 1) & black & rank5 & ~filea & filemask[filenum] # pawn_capture_left
        for i in range(64):
            if (moves >> i) & 1 == 1:
                moves_list.extend((i + 1, i + 8, "E")) # store piece field/ and move field 0-63
    # pawn promotion
    # pawn 1 forward
    moves = (wp << 8) & empty & rank8
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i - 8, i, "P"))
    # pawn left capture
    moves = (wp << 9) & black & rank8 & ~ fileh
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i - 9, i, "P"))
    # pawn right capture
    moves = (wp << 7) & black & rank8 & ~ filea
    for i in range(64):
        if (moves >> i) & 1 == 1:
            moves_list.extend((i - 7, i, "P"))  # shifted by 7, so the origin is i - 7
    print(moves_list)

create_starting_bitboards()
For example in this situation, it calculates all the possible bishop moves correctly:
r n b q k b n r
p p p p p p p p
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 B 0 0 0 0 0
0 0 0 0 0 0 0 0
P P P P P P P P
R N B Q K B N R
0 0 0 0 0 0 0 0
0 0 0 0 0 1 0 0
1 0 0 0 1 0 0 0
0 1 0 1 0 0 0 0
0 0 0 0 0 0 0 0
0 1 0 1 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
But when I move the bishop to another square, for example, this happens:
r n b q k b n r
p p p p p p p p
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 B 0 0 0 0 0 0
0 0 0 0 0 0 0 0
P P P P P P P P
R N B Q K B N R
0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
When I examined diagonal_antidiagonal_moves(), which finds all diagonal/antidiagonal moves, I started printing out different bitboards and noticed that some of them contained a "-" sign. For example, I took reverse_bits(occupied & antidiagonal[antidiagonalnum]) - 2 * reverse_bits(slider) from
diag2 = (((occupied & antidiagonal[antidiagonalnum]) - 2 * slider) ^ reverse_bits(reverse_bits(occupied & antidiagonal[antidiagonalnum]) - 2 * reverse_bits(slider))) & antidiagonal[antidiagonalnum]
and printed out the bitboard. This was the result:
- 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 0 1 1 1 1
1 1 1 0 0 0 0 0
This is why I think something must be going wrong when I reverse negative integers in the reverse_bits function.
Funnily enough, the function vertical_horizontal_moves(), which is used for example to find all possible rook moves, seems to work just fine.
I hope someone can give me an idea of what exactly is going wrong in my code.
reverse_bits is indeed wrong, as you suspect. This is easy to prove with an example: reverse_bits(-1) returns the value 0x4000000000000000.
The current implementation of reverse_bits already works for non-negative numbers, so it can be repaired by masking the input to turn it non-negative while retaining all the bits relevant in this context (the lowest 64):
def reverse_bits(num):
    num = num & 0xffffffffffffffff
    num = '{:064b}'.format(num)[::-1]
    return int(num, 2)
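To see why the mask helps: Python integers have arbitrary precision, so a negative number has no fixed 64-bit pattern, and '{:064b}'.format() renders it as a minus sign followed by the magnitude. Masking with 0xffffffffffffffff yields the 64-bit two's-complement pattern instead. Below is a small comparison of the two versions; the names reverse_bits_old and MASK64 are introduced only for this check:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def reverse_bits_old(num):
    # Version from the question: reverses the sign-and-magnitude text.
    s = '{:064b}'.format(num)[::-1]
    if s[-1] == "-":
        s = s[:-1]
    return int(s, 2)

def reverse_bits(num):
    # Repaired version: keep only the low 64 bits before reversing.
    num &= MASK64
    return int('{:064b}'.format(num)[::-1], 2)

print(hex(reverse_bits_old(-1)))  # 0x4000000000000000
print(hex(reverse_bits(-1)))      # 0xffffffffffffffff
print(hex(reverse_bits(1)))       # 0x8000000000000000
```

Reversing twice with the repaired function gives back the original value reduced to 64 bits, which is a convenient sanity check when debugging the sliding-piece formulas.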
I have the following code:
import math
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import linregress
c1_high = 98
c1_low = 75
c2_high = 15
c2_low = 6
c3_high = 8
c3_low = 2

def mix_gen(number):
    flag = 0
    container = []
    y_array = [1,2,3,4,5,6,7,8,9,10,11]
    while flag < number:
        c1 = np.random.uniform(c1_low, c1_high)
        c2 = np.random.uniform(c2_low, c2_high)
        c3 = np.random.uniform(c3_low, c3_high)
        tot = c1+c2+c3
        if 99.99 <= tot <= 100.01:
            flag += 1
            container.append([c1,c2,c3])
    return container

def average(x):
    assert len(x) > 0
    return float(sum(x)) / len(x)

def pearson_def(x, y):
    assert len(x) == len(y)
    n = len(x)
    assert n > 0
    avg_x = average(x)
    avg_y = average(y)
    diffprod = 0
    xdiff2 = 0
    ydiff2 = 0
    for idx in range(n):
        xdiff = x[idx] - avg_x
        ydiff = y[idx] - avg_y
        diffprod += xdiff * ydiff
        xdiff2 += xdiff * xdiff
        ydiff2 += ydiff * ydiff
    return diffprod / math.sqrt(xdiff2 * ydiff2)

def corr_check():
    while True:
        mixes = mix_gen(5)
        mixes_C1 =[item[0] for item in mixes]
        mixes_C2 =[item[1] for item in mixes]
        mixes_C3 =[item[2] for item in mixes]
        mylen = [1,2,3,4,5]
        c1_r = pearson_def(mixes_C1, mylen)
        c2_r = pearson_def(mixes_C2, mylen)
        c3_r = pearson_def(mixes_C3, mylen)
        if c1_r >0.99 and c2_r >0.99 and c3_r>0.99:
            print(mixes)
            print (c1_r)
        else:
            continue

corr = corr_check()
print(corr)
This code provides me with effectively (when converted to a dataframe) the following output:
C1 C2 C3 sum range
1 70 20 10 100 ^
2 .. |
3 .. |
4 .. |
5 .. |
6 .. |
7 .. |
8 .. |
9 .. |
10 .. |
11 90 _
I require each row to sum to 100 and each column to have an r^2 value (Pearson correlation) greater than 0.99.
However, the complexity and the number of iterations required render the problem almost impossible to solve this way. Is there a better way of achieving this goal than relying on random number generation for all three components C1, C2 and C3?
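One possible direction, sketched under assumptions not stated in the question: if the criterion really is r^2 (so a strongly negative r also qualifies), the columns can be constructed instead of sampled. Two components are laid out on straight lines inside their bounds and the third is defined as the remainder, so every row sums to 100 by construction and every column is exactly linear in the row index. Note that with an exactly constant row sum the three fitted slopes add up to zero, so the three columns cannot all have r > 0.99 with the same sign. The function name linear_mixes and the choice of C1 as the remainder are illustrative only.

```python
import numpy as np

def linear_mixes(n=11, c2_bounds=(6.0, 15.0), c3_bounds=(2.0, 8.0)):
    # C2 and C3 rise linearly across their allowed ranges.
    c2 = np.linspace(c2_bounds[0], c2_bounds[1], n)
    c3 = np.linspace(c3_bounds[0], c3_bounds[1], n)
    # C1 is whatever is left, so each row sums to 100.
    c1 = 100.0 - c2 - c3
    return np.column_stack([c1, c2, c3])

mixes = linear_mixes()
rows = np.arange(1, len(mixes) + 1)
# Squared Pearson correlation of each column with the row index.
r_squared = [np.corrcoef(mixes[:, k], rows)[0, 1] ** 2 for k in range(3)]
print(mixes.sum(axis=1))
print(r_squared)
```

With the default bounds C1 runs from 92 down to 77, which stays inside the 75 to 98 range from the question. Small random perturbations could be added afterwards if perfectly straight lines are not wanted, at the cost of re-checking the correlations.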
I am trying to print a combination of np.array values, a string, and some values I get from an iterator.
The code looks like this:
import numpy as np
site = np.genfromtxt('.....\Plot_1.txt', dtype=None, delimiter='\t')
c1 = np.array([148, 108])
c2 = np.array([181, 147])
c3 = np.array([173, 153])
c4 = np.array([98, 221])
c5 = np.array([43, 153])
trees_list = [c1, c2, c3, c4, c5]
def trees_pixel(rc_list, matrix):
    t_row = rc_list[0]
    t_col = rc_list[1]
    tree = matrix[t_row, t_col]
    for i in range(1, 6, 1):
        print "C",i,"=",tree
    return tree

for i in trees_list:
    trees_pixel(i, site)
Site is a np.array of 400x370 row/columns, that I need to read the values from. C1...C5 are the locations (row/column) from the 'site' array.
My code prints the following:
C 1 = 8.266602
C 2 = 8.266602
C 3 = 8.266602
C 4 = 8.266602
C 5 = 8.266602
C 1 = 17.89282
C 2 = 17.89282
C 3 = 17.89282
C 4 = 17.89282
C 5 = 17.89282
C 1 = 18.31433
C 2 = 18.31433
C 3 = 18.31433
C 4 = 18.31433
C 5 = 18.31433
etc...
But what I expected was:
C 1 = 8.266602
C 2 = 17.89282
C 3 = 18.31433
C 4 = 20.47229
C 5 = 13.5907
How can I do this, so I will avoid the repeating pattern? Thanks!
You're iterating twice, once inside trees_pixel and once outside of it. If I understand what you mean, you want something that looks like the following:
import numpy as np
site = np.random.random((400, 370)) # Used in place of your data
c1 = np.array([148, 108])
c2 = np.array([181, 147])
c3 = np.array([173, 153])
c4 = np.array([98, 221])
c5 = np.array([43, 153])
trees_list = [c1, c2, c3, c4, c5]
def trees_pixel(rc_list, listIdx, matrix):
    t_row = rc_list[0]
    t_col = rc_list[1]
    tree = matrix[t_row, t_col]
    print "C",listIdx,"=",tree
    return tree

for i in xrange(len(trees_list)):
    trees_pixel(trees_list[i], i+1, site)
C 1 = 0.820317259854
C 2 = 0.960883528796
C 3 = 0.363985436225
C 4 = 0.189575015844
C 5 = 0.667578060856
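The snippets above use Python 2 syntax (the print statement and xrange). Below is a sketch of the same idea for Python 3, using enumerate to number the trees. Random data again stands in for the real file, and the helper name tree_value is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
site = rng.random((400, 370))  # stand-in for the real 400x370 data

trees_list = [
    np.array([148, 108]),
    np.array([181, 147]),
    np.array([173, 153]),
    np.array([98, 221]),
    np.array([43, 153]),
]

def tree_value(rc, matrix):
    # Look up one (row, column) location in the matrix.
    row, col = rc
    return matrix[row, col]

values = [tree_value(rc, site) for rc in trees_list]
# Loop once, pairing each value with its 1-based label.
for idx, value in enumerate(values, start=1):
    print("C", idx, "=", value)
```

Keeping the lookup separate from the printing means there is only one loop, which is what removes the repeated lines.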
I recently made the switch from R to Python and have been having some trouble getting used to data frames again, as opposed to R's data.table. I'd like to take a column of strings, check each one for a value, and count the matches, broken down by user. So I would like to take this data:
A_id B C
1: a1 "up" 100
2: a2 "down" 102
3: a3 "up" 100
3: a3 "up" 250
4: a4 "left" 100
5: a5 "right" 102
And return:
A_id_grouped sum_up sum_down ... over_200_up
1: a1 1 0 ... 0
2: a2 0 1 0
3: a3 2 0 ... 1
4: a4 0 0 0
5: a5 0 0 ... 0
Before I did it with the R code (using data.table)
DT[, list(sum_up = sum(B == "up"),
          sum_down = sum(B == "down"),
          ...,
          over_200_up = sum(B == "up" & C > 200)),
   by = list(A_id_grouped = A_id)]
However all of my recent attempts with Python have failed me:
DT.agg({"D": [np.sum(DT[DT["B"]=="up"]),np.sum(DT[DT["B"]=="up"])], ...
"C": np.sum(DT[(DT["B"]=="up") & (DT["C"]>200)])
})
Thank you in advance! It seems like a simple question, but I couldn't find it anywhere.
To complement unutbu's answer, here's an approach using apply on the groupby object.
>>> df.groupby('A_id').apply(lambda x: pd.Series(dict(
...     sum_up=(x.B == 'up').sum(),
...     sum_down=(x.B == 'down').sum(),
...     over_200_up=((x.B == 'up') & (x.C > 200)).sum()
... )))
over_200_up sum_down sum_up
A_id
a1 0 0 1
a2 0 1 0
a3 1 0 2
a4 0 0 0
a5 0 0 0
There might be a better way; I'm pretty new to pandas, but this works:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A_id':'a1 a2 a3 a3 a4 a5'.split(),
'B': 'up down up up left right'.split(),
'C': [100, 102, 100, 250, 100, 102]})
df['D'] = (df['B']=='up') & (df['C'] > 200)
grouped = df.groupby(['A_id'])
def sum_up(grp):
    return np.sum(grp=='up')

def sum_down(grp):
    return np.sum(grp=='down')

def over_200_up(grp):
    return np.sum(grp)

result = grouped.agg({'B': [sum_up, sum_down],
                      'D': [over_200_up]})
result.columns = [col[1] for col in result.columns]
print(result)
yields
sum_up sum_down over_200_up
A_id
a1 1 0 0
a2 0 1 0
a3 2 0 1
a4 0 0 0
a5 0 0 0
An old question, but I feel a better way, one that avoids apply, is to build a new dataframe of boolean columns before grouping and aggregating:
df = df.set_index('A_id')
outcome = {'sum_up' : df.B.eq('up'),
'sum_down': df.B.eq('down'),
'over_200_up' : df.B.eq('up') & df.C.gt(200)}
outcome = pd.DataFrame(outcome).groupby(level=0).sum()
outcome
sum_up sum_down over_200_up
A_id
a1 1 0 0
a2 0 1 0
a3 2 0 1
a4 0 0 0
a5 0 0 0
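For reference, here is a self-contained sketch of the same boolean-columns idea, with the sample frame rebuilt from the question's data so the result can be checked against the table above. The intermediate name flags is illustrative, and grouping by the A_id column directly is a small variation on setting it as the index:

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "A_id": ["a1", "a2", "a3", "a3", "a4", "a5"],
    "B": ["up", "down", "up", "up", "left", "right"],
    "C": [100, 102, 100, 250, 100, 102],
})

# One boolean column per condition; summing booleans counts the True rows.
flags = pd.DataFrame({
    "sum_up": df["B"].eq("up"),
    "sum_down": df["B"].eq("down"),
    "over_200_up": df["B"].eq("up") & df["C"].gt(200),
})
outcome = flags.groupby(df["A_id"]).sum()
print(outcome)
```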
Another option would be to unstack before grouping; however, I feel it is a longer, unnecessary process:
(df
 .set_index(['A_id', 'B'], append = True)
 .C
 .unstack('B')
 .assign(gt_200 = lambda df: df.up.gt(200))
 .groupby(level='A_id')
 .agg(sum_up=('up', 'count'),
      sum_down =('down', 'count'),
      over_200_up = ('gt_200', 'sum')
      )
)
sum_up sum_down over_200_up
A_id
a1 1 0 0
a2 0 1 0
a3 2 0 1
a4 0 0 0
a5 0 0 0
Here is what I have recently learned, using DataFrame.assign and NumPy's where:
df3=
A_id B C
1: a1 "up" 100
2: a2 "down" 102
3: a3 "up" 100
3: a3 "up" 250
4: a4 "left" 100
5: a5 "right" 102
(df3
 .assign(sum_up=np.where(df3['B'] == 'up', 1, 0),
         sum_down=np.where(df3['B'] == 'down', 1, 0),
         over_200_up=np.where((df3['B'] == 'up') & (df3['C'] > 200), 1, 0))
 .groupby('A_id', as_index=False)
 .agg({'sum_up': sum, 'sum_down': sum, 'over_200_up': sum}))
outcome=
A_id sum_up sum_down over_200_up
0 a1 1 0 0
1 a2 0 1 0
2 a3 2 0 1
3 a4 0 0 0
4 a5 0 0 0
This resembles SQL's CASE expression, if you are familiar with it and want to apply the same logic in pandas:
select a,
sum(case when B='up' then 1 else 0 end) as sum_up
....
from table
group by a