program to calculate days of the week - python
This may be tricky to explain.
I have to "translate" an old BASIC program into Python.
The program is called WEEKDAY:
10 PRINT TAB(32);"WEEKDAY"
20 PRINT TAB(15);"CREATIVE COMPUTING MORRISTOWN, NEW JERSEY"
30 PRINT:PRINT:PRINT
100 PRINT "WEEKDAY IS A COMPUTER DEMONSTRATION THAT"
110 PRINT"GIVES FACTS ABOUT A DATE OF INTEREST TO YOU."
120 PRINT
130 PRINT "ENTER TODAY'S DATE IN THE FORM: 3,24,1979 ";
140 INPUT M1,D1,Y1
150 REM THIS PROGRAM DETERMINES THE DAY OF THE WEEK
160 REM FOR A DATE AFTER 1582
170 DEF FNA(A)=INT(A/4)
180 DIM T(12)
190 DEF FNB(A)=INT(A/7)
200 REM SPACE OUTPUT AND READ IN INITIAL VALUES FOR MONTHS.
210 FOR I= 1 TO 12
220 READ T(I)
230 NEXT I
240 PRINT"ENTER DAY OF BIRTH (OR OTHER DAY OF INTEREST)";
250 INPUT M,D,Y
260 PRINT
270 LET I1 = INT((Y-1500)/100)
280 REM TEST FOR DATE BEFORE CURRENT CALENDAR.
290 IF Y-1582 <0 THEN 1300
300 LET A = I1*5+(I1+3)/4
310 LET I2=INT(A-FNB(A)*7)
320 LET Y2=INT(Y/100)
330 LET Y3 =INT(Y-Y2*100)
340 LET A =Y3/4+Y3+D+T(M)+I2
350 LET B=INT(A-FNB(A)*7)+1
360 IF M > 2 THEN 470
370 IF Y3 = 0 THEN 440
380 LET T1=INT(Y-FNA(Y)*4)
390 IF T1 <> 0 THEN 470
400 IF B<>0 THEN 420
410 LET B=6
420 LET B = B-1
430 GOTO 470
440 LET A = I1-1
450 LET T1=INT(A-FNA(A)*4)
460 IF T1 = 0 THEN 400
470 IF B <>0 THEN 490
480 LET B = 7
490 IF (Y1*12+M1)*31+D1<(Y*12+M)*31+D THEN 550
500 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 530
510 PRINT M;"/";D;"/";Y;" WAS A ";
520 GOTO 570
530 PRINT M;"/";D;"/";Y;" IS A ";
540 GOTO 570
550 PRINT M;"/";D;"/";Y;" WILL BE A ";
560 REM PRINT THE DAY OF THE WEEK THE DATE FALLS ON.
570 IF B <>1 THEN 590
580 PRINT "SUNDAY."
590 IF B<>2 THEN 610
600 PRINT "MONDAY."
610 IF B<>3 THEN 630
620 PRINT "TUESDAY."
630 IF B<>4 THEN 650
640 PRINT "WEDNESDAY."
650 IF B<>5 THEN 670
660 PRINT "THURSDAY."
670 IF B<>6 THEN 690
680 GOTO 1250
690 IF B<>7 THEN 710
700 PRINT "SATURDAY."
710 IF (Y1*12+M1)*31+D1=(Y*12+M)*31+D THEN 1120
720 LET I5=Y1-Y
730 PRINT
740 LET I6=M1-M
750 LET I7=D1-D
760 IF I7>=0 THEN 790
770 LET I6= I6-1
780 LET I7=I7+30
790 IF I6>=0 THEN 820
800 LET I5=I5-1
810 LET I6=I6+12
820 IF I5<0 THEN 1310
830 IF I7 <> 0 THEN 850
835 IF I6 <> 0 THEN 850
840 PRINT"***HAPPY BIRTHDAY***"
850 PRINT " "," ","YEARS","MONTHS","DAYS"
855 PRINT " "," ","-----","------","----"
860 PRINT "YOUR AGE (IF BIRTHDATE) ",I5,I6,I7
870 LET A8 = (I5*365)+(I6*30)+I7+INT(I6/2)
880 LET K5 = I5
890 LET K6 = I6
900 LET K7 = I7
910 REM CALCULATE RETIREMENT DATE.
920 LET E = Y+65
930 REM CALCULATE TIME SPENT IN THE FOLLOWING FUNCTIONS.
940 LET F = .35
950 PRINT "YOU HAVE SLEPT ",
960 GOSUB 1370
970 LET F = .17
980 PRINT "YOU HAVE EATEN ",
990 GOSUB 1370
1000 LET F = .23
1010 IF K5 > 3 THEN 1040
1020 PRINT "YOU HAVE PLAYED",
1030 GOTO 1080
1040 IF K5 > 9 THEN 1070
1050 PRINT "YOU HAVE PLAYED/STUDIED",
1060 GOTO 1080
1070 PRINT "YOU HAVE WORKED/PLAYED",
1080 GOSUB 1370
1085 GOTO 1530
1090 PRINT "YOU HAVE RELAXED ",K5,K6,K7
1100 PRINT
1110 PRINT TAB(16);"*** YOU MAY RETIRE IN";E;" ***"
1120 PRINT
1140 PRINT
1200 PRINT
1210 PRINT
1220 PRINT
1230 PRINT
1240 END
1250 IF D=13 THEN 1280
1260 PRINT "FRIDAY."
1270 GOTO 710
1280 PRINT "FRIDAY THE THIRTEENTH---BEWARE!"
1290 GOTO 710
1300 PRINT "NOT PREPARED TO GIVE DAY OF WEEK PRIOR TO MDLXXXII. "
1310 GOTO 1140
1320 REM TABLE OF VALUES FOR THE MONTHS TO BE USED IN CALCULATIONS.
1330 DATA 0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5
1340 REM THIS IS THE CURRENT DATE USED IN THE CALCULATIONS.
1350 REM THIS IS THE DATE TO BE CALCULATED ON.
1360 REM CALCULATE TIME IN YEARS, MONTHS, AND DAYS
1370 LET K1=INT(F*A8)
1380 LET I5 = INT(K1/365)
1390 LET K1 = K1- (I5*365)
1400 LET I6 = INT(K1/30)
1410 LET I7 = K1 -(I6*30)
1420 LET K5 = K5-I5
1430 LET K6 =K6-I6
1440 LET K7 = K7-I7
1450 IF K7>=0 THEN 1480
1460 LET K7=K7+30
1470 LET K6=K6-1
1480 IF K6>0 THEN 1510
1490 LET K6=K6+12
1500 LET K5=K5-1
1510 PRINT I5,I6,I7
1520 RETURN
1530 IF K6=12 THEN 1550
1540 GOTO 1090
1550 LET K5=K5+1
1560 LET K6=0
1570 GOTO 1090
1580 REM
1590 END
This program takes the current date and a date of birth and returns some statistics, e.g. how long you have lived and how many days you have slept.
For part of the assignment, I have to explain what each variable means in the old BASIC program. In the old days, variable names could only be things like A1, B3, etc.
In this program, there is an array. Call it
DATA = [0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5]
There are 12 numbers in this array. I realized that the program reads each number and matches it to a month from Jan to Dec, and I also found out that it is used to calculate which day of the week a date falls on, e.g. Monday or Tuesday.
I have found that much so far, but can anybody explain to me what the numbers in the DATA array mean exactly?
Thanks.
Without pulling all the code apart, it looks like it's the offset of the weekday on which each month starts, relative to the weekday of January 1st...
Assume Jan 1st is a Tuesday (like 2013)...
Jan 0 Tuesday
Feb 3 Friday (Tuesday + 3)
Mar 3 Friday (Tuesday + 3)
Apr 6 Monday (Tuesday + 6)
etc...
This seems to assume it's not a leap year; otherwise the numbers from March onwards would need to be increased by 1 to allow for the extra day.
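One way to check this reading is to recompute the table from the month lengths. The sketch below assumes a non-leap year; the names `DAYS_IN_MONTH` and `month_offsets` are made up for illustration and do not appear in the BASIC program.

```python
from itertools import accumulate

# Month lengths for a non-leap year (assumption: February has 28 days).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def month_offsets(days_in_month=DAYS_IN_MONTH):
    """Weekday offset of the 1st of each month relative to January 1st."""
    # Days elapsed before the 1st of each month: 0, 31, 59, 90, ...
    days_before = [0] + list(accumulate(days_in_month))[:-1]
    # Only the remainder modulo 7 matters for the day of the week.
    return [d % 7 for d in days_before]


print(month_offsets())  # [0, 3, 3, 6, 1, 4, 6, 2, 5, 0, 3, 5]
```

The result is the same list as the DATA statement on line 1330, which supports the "offset of the first day of each month" interpretation.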
Related
How to find an optimal solution for 2 teams playing against each other?
I am given a table for teams A and B where, for each pair of players, there is a number. The rows represent players of team A and the columns players of team B. If a number is positive, the player from team A is better than the player from team B, and vice versa if it is negative. For example:

    -710 415 527 -641 175 48
    -447 -799 253 626 304 895
    509 -523 -758 -678 -689 92
    24 -318 -61 -9 174 255
    487 408 696 861 -394 -67

Both teams know this table. Team A reports 5 players; team B can then look at them and choose its own best 5 players against them. To compare the teams, we sum up the numbers at the given positions in the table, knowing that each team has a captain who is counted twice (as if a team had 6 players and the captain is there twice). If the sum is positive, team A is better.

The input is the numbers a (the number of rows/players of A) and b (columns/players of B), followed by the table, like this:

    6 6
    -54 -927 428 -510 911 93
    -710 415 527 -641 175 48
    -447 -799 253 626 304 895
    509 -523 -758 -678 -689 92
    24 -318 -61 -9 174 255
    487 408 696 861 -394 -67

The output should be 1282.

So, what I did was put the numbers into a matrix like this:

    a, b = int(input()), int(input())
    matrix = [list(map(int,input().split())) for _ in range(a)]

I used a MinHeap and a MaxHeap for this. I put the rows into the MaxHeap because team A wants the biggest values, then I get the 5 best A players from it as follows:

    for player, values in enumerate(matrix):
        maxheap.enqueue(sum(values), player)
    playersA = []
    overallA = 0
    for i in range(5):
        ov, pl = maxheap.remove_max()
        if i == 0:  # it is a captain
            playersA.append(pl)
            overallA += ov
        playersA.append(pl)
        overallA += ov

Team B, knowing the A players, then uses the MinHeap to find its best 5 players:

    for i in range(b):
        player = []
        ov = 0
        for j in range(a):  # take out a column of a matrix
            player.append(matrix[j][i])
        for rival in playersA:  # counting only players already chosen by A
            ov += player[rival]
        minheap.enqueue(ov, i)
    playersB = []
    overallB = 0
    for i in range(5):
        ov, pl = minheap.remove_min()
        if i == 0:
            playersB.append(pl)
            overallB += ov
        playersB.append(pl)
        overallB += ov

Having the players, I then compute the sum from the matrix:

    out = 0
    for a in playersA:
        for b in playersB:
            out += matrix[a][b]
    print(out)

However, this solution doesn't always give the right answer. For example, it does for the input:

    10 10
    -802 -781 826 997 -403 243 -533 -694 195 182
    103 182 -14 130 953 -900 43 334 -724 716
    -350 506 184 691 -785 742 -303 -682 186 -520
    25 -815 475 -407 -78 509 -512 714 898 243
    758 -743 -504 -160 855 -792 -177 747 188 -190
    333 -439 529 795 -500 112 625 -2 -994 282
    824 498 -899 158 453 644 117 598 432 310
    -799 594 933 -15 47 -687 68 480 -933 -631
    741 400 979 -52 -78 -744 -573 -170 882 -610
    -376 -928 -324 658 -538 811 -724 848 344 -308

But it doesn't for:

    11 11
    279 475 -894 -641 -716 687 253 -451 580 -727 -509
    880 -778 -867 -527 816 -458 -136 -517 217 58 740
    360 -841 492 -3 940 754 -584 715 -389 438 -887
    -739 664 972 838 -974 -802 799 258 628 3 815
    952 -404 -273 -323 -948 674 687 233 62 -339 352
    285 -535 -812 -452 -335 -452 -799 -902 691 195 -837
    -78 56 459 -178 631 -348 481 608 -131 -575 732
    -212 -826 -547 440 -399 -994 486 -382 -509 483 -786
    -94 -983 785 -8 445 -462 -138 804 749 890 -890
    -184 872 -341 776 447 -573 405 462 -76 -69 906
    -617 704 292 287 464 -711 354 428 444 -42 45

So the question is: can it be done like this, or is there another fast algorithm (O(n ** 2) / O(n ** 3) etc.), or do I just have to try all the possible combinations using brute force in O(n!) time complexity?
There is a way to do that with polynomial complexity.

To show you why your solution doesn't work, let's consider another, simpler problem. Let's say each team only chooses 2 players and there is no captain. Let's also take a simple score matrix:

    1 1 1 2 1
    1 1 1 1 1
    0 3 0 2 0
    0 0 0 0 4
    0 0 0 0 4

Here you can see that team A has no chance to win (as there are no negative numbers), but they are still going to try their best. Who should they pick?

Using your algorithm, team A should pick their best players, and their ranking would be: pa0 < pa1 = pa2 < pa3 = pa4. If they choose pa3 and pa4, who both have a score of 4 (which is bad, but not as bad as pa0's score of 6), team B will win by 8 (they will choose pb4 and another player who doesn't matter). On the other hand, if team A chose pa0 and pa1 (who are worse than pa3 and pa4 by your metric), the best team B can get is winning by 5 (if they choose pb3 and any other player).

Basically, your approximation fails to take into consideration that team B can only choose two players and thus can't take advantage of the pa0+pa1 weakness, while it can easily exploit that of pa3+pa4.

A better solution would be for team A to evaluate each player's score by taking into account only their 2 worst scores (or 5 if 5 players are to be selected). This would make the ranking: pa2 < pa3 = pa4 < pa0 < pa1. Still, it would be an approximation: some combinations like pa2+pa3 are actually not as bad as they sound because, once again, the weaknesses are spread out enough that team B can't exploit them all (although for this very example the approximation yields the best result).

What we really need to pick is not the two best players, but the best combination of two players, and sadly there is no way I know of other than trying all the $s!/(k!(s-k)!)$ combinations of k players among s (the size of the team). It is not so bad, though: for k=2 that's only $s(s-1)/2$ and for k=5 that's $s(s-1)(s-2)(s-3)(s-4)/5!$, which is still polynomial in complexity despite being in $O(s^5)$. Adding a captain to the mix only multiplies the number of combinations by k. It also requires a twist on how to calculate the score, but you should be able to find that.

Now that team A have selected their players, team B have the easy job of selecting theirs. This is way simpler, as here each player can be chosen individually.

Here is an example of how this last algorithm should work with the score matrix given at the beginning. Team A has 10 possible combinations: pa0+pa1, pa0+pa2, pa0+pa3, pa0+pa4, pa1+pa2, pa1+pa3, pa1+pa4, pa2+pa3, pa2+pa4, pa3+pa4. Their respective scores are: 5, 8, 7, 7, 7, 6, 6, 7, 7, 8. The best combination is pa0+pa1, so that's what they send to team B. Team B calculates each of its players' scores against pa0+pa1: pb0:2, pb1:2, pb2:2, pb3:3, pb4:2. pb3 is the best and all the others are equal, thus team B sends pb3+pb4 (for example), and the "answer" is 5.
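The exhaustive search over team A's combinations can be sketched as follows for the simplified version (k players per team, no captain). It uses this answer's convention that larger numbers favour team B. The function name `best_team_a` and its return shape are illustrative assumptions, and the captain rule is deliberately left out.

```python
from itertools import combinations


def best_team_a(matrix, k):
    """Return (score, combo): the k rows that minimise team B's best reply.

    Assumes larger numbers favour team B and that there is no captain.
    """
    n_cols = len(matrix[0])
    best_score, best_combo = None, None
    for combo in combinations(range(len(matrix)), k):
        # Score of each B player against the chosen A players.
        col_totals = [sum(matrix[r][c] for r in combo) for c in range(n_cols)]
        # Team B picks its k best players independently of each other.
        b_score = sum(sorted(col_totals, reverse=True)[:k])
        if best_score is None or b_score < best_score:
            best_score, best_combo = b_score, combo
    return best_score, best_combo


example = [
    [1, 1, 1, 2, 1],
    [1, 1, 1, 1, 1],
    [0, 3, 0, 2, 0],
    [0, 0, 0, 0, 4],
    [0, 0, 0, 0, 4],
]
print(best_team_a(example, 2))  # (5, (0, 1))
```

On the 5x5 example matrix this returns a score of 5 for rows 0 and 1, matching the pa0+pa1 result worked out by hand.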
Find overlapping rows and keep longest
I want to edit a table based on overlapping values. In column 2 I have a group name, in column 3 a start position, and in column 4 the end position. I want to keep only rows whose position range (start to end) is not contained within the range of another row of the same group (e.g. CE170_HUMAN).

For example, for CE170_HUMAN I have 6 rows, and some of them have overlapping values: the 165-523 range (358 positions) is contained within the 1-523 range, so I want to keep only the row with 1-523, as it covers a longer range (523 positions). Then do the same for the next group, PURA2, and so on.

Input:

    RAEG_00037367-RA CE170_HUMAN 557 1584
    RAEG_00037368-RB CE170_HUMAN 165 523
    RAEG_00037368-RA CE170_HUMAN 326 523
    RAEG_00037368-RD CE170_HUMAN 165 370
    RAEG_00037368-RC CE170_HUMAN 1 523
    RAEG_00037368-RE CE170_HUMAN 1 370
    RAEG_00037388-RB PURA2_PIG 61 456
    RAEG_00037388-RC PURA2_PIG 61 357
    RAEG_00037388-RA PURA2_PIG 181 456
    RAEG_00037400-RA KI26B_HUMAN 454 545
    RAEG_00037401-RA KI26B_HUMAN 753 2108
    RAEG_00037415-RA CNST_HUMAN 137 613
    RAEG_00037416-RA CNST_HUMAN 637 725
    RAEG_00037420-RE ELYS_HUMAN 1 2266
    RAEG_00037420-RG ELYS_HUMAN 1080 2266
    RAEG_00037420-RF ELYS_HUMAN 1 2266
    RAEG_00037420-RD ELYS_HUMAN 1080 2266
    RAEG_00037420-RC ELYS_HUMAN 205 2266
    RAEG_00037420-RB ELYS_HUMAN 1080 2266

Desired output:

    RAEG_00037367-RA CE170_HUMAN 557 1584
    RAEG_00037368-RB CE170_HUMAN 1 523
    RAEG_00037388-RC PURA2_PIG 61 357
    RAEG_00037400-RA KI26B_HUMAN 454 545
    RAEG_00037401-RA KI26B_HUMAN 753 2108
    RAEG_00037415-RA CNST_HUMAN 137 613
    RAEG_00037416-RA CNST_HUMAN 637 725
    RAEG_00037420-RE ELYS_HUMAN 1 2266

I am looking for a solution in either bash, Perl or Python. I appreciate your help!
I don't understand your format, but I am sure you can adapt this:

    rows = [ "Hello", "World", "Hello World" ]
    solution = []
    found = False
    for i in range(len(rows)):
        for j in range(len(rows)):
            if i == j:
                # Comparing equal things (will result in false positive)
                continue
            if str(rows[i]) in str(rows[j]):
                # Not a solution
                found = True
                break
        if not found:
            # We have found a solution!
            solution.append(rows[i])
        else:
            # Not a solution. Resetting
            found = False
    for i in solution:
        print(i)
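Adapting the same "drop anything contained in something else" idea to numeric ranges could look like the sketch below. It assumes rows have already been parsed into (id, group, start, end) tuples; the function name `drop_contained` is made up for illustration. It follows the rule as stated in the question (a row is dropped when its range lies inside another row's range in the same group, and exact duplicates keep only the first), so for PURA2_PIG it keeps 61-456 even though the question's desired output lists 61-357.

```python
from collections import defaultdict


def drop_contained(rows):
    """Keep only rows whose [start, end] is not inside another row's range
    within the same group. Rows are (id, group, start, end) tuples."""
    groups = defaultdict(list)  # insertion-ordered, so group order is kept
    for row in rows:
        groups[row[1]].append(row)

    kept = []
    for members in groups.values():
        # Sort by start ascending, then by end descending, so any row that
        # could contain the current one has already been seen.
        members.sort(key=lambda r: (r[2], -r[3]))
        max_end = None
        for row in members:
            if max_end is None or row[3] > max_end:
                kept.append(row)
                max_end = row[3]
    return kept


sample = [
    ("RAEG_00037367-RA", "CE170_HUMAN", 557, 1584),
    ("RAEG_00037368-RB", "CE170_HUMAN", 165, 523),
    ("RAEG_00037368-RC", "CE170_HUMAN", 1, 523),
    ("RAEG_00037388-RB", "PURA2_PIG", 61, 456),
    ("RAEG_00037388-RC", "PURA2_PIG", 61, 357),
]
for row in drop_contained(sample):
    print(*row)
```

Sorting first means each group is handled in one pass instead of comparing every pair of rows.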
Python print string alignment
I am printing some values in a loop in Python. My current output is as follows:

    0 Data Count: 249 7348 249 4469 2768 261 20 126
    1 Data Count: 288 11 288 48 2284 598 137 408
    2 Data Count: 808 999 808 2896 32739 138 202 678
    3 Data Count: 140 26 140 2688 8054 884 433 987

What I'd like is for all values in each column to align, despite differing character/number counts in some, to make it easier to read. The pseudo code behind this is as follows:

    for i in range(0,3):
        print i, " Data Count: ", Count_A, " ", Count_B, " ", Count_C, " ", Count_D, " ", Count_E, " ", Count_F, " ", Count_G, " ", Count_H

Thanks in advance everyone!
You could use format string justification:

    from random import randint

    for i in range(5):
        data = [randint(0, 1000) for j in range(5)]
        print("{:5} {:5} {:5} {:5}".format(*data))

output:

       92   460    72   630
      837   214   118   677
      906   328   102   320
      895   998   177   922
      651   742   215   938

This follows the format specification from the Python docs.
With the % string formatting operator, the minimum width of output is specified in a placeholder as a number before the data type (the full format of a placeholder is %[key][flags][width][.precision][length type]conversion type). If the result is shorter, it will be left-padded to the specified length:

    from random import randint

    for i in range(5):
        data = [randint(0, 1000) for j in range(5)]
        print("%5d %5d %5d %5d %5d" % tuple(data))

gives:

      946   937   544   636   871
      232   860   704   877   716
      868   849   851   488   739
      419   381   695   909   518
      570   756   467   351   537

(code adapted from @andreihondrari's answer)
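Applied to the rows from the question, the same width idea might look like the sketch below. The helper name `format_row` is made up for illustration, and the column width is taken from the widest value in the data rather than hard-coded, which is a design choice of this sketch and not something either answer prescribes.

```python
rows = [
    [249, 7348, 249, 4469, 2768, 261, 20, 126],
    [288, 11, 288, 48, 2284, 598, 137, 408],
    [808, 999, 808, 2896, 32739, 138, 202, 678],
    [140, 26, 140, 2688, 8054, 884, 433, 987],
]


def format_row(index, values, width):
    """Right-justify every value in a field of the given width."""
    cells = " ".join("{:>{w}}".format(v, w=width) for v in values)
    return "{} Data Count: {}".format(index, cells)


# Widest value across all rows decides the column width.
width = max(len(str(v)) for row in rows for v in row)
for i, row in enumerate(rows):
    print(format_row(i, row, width))
```

Computing the width from the data keeps the columns aligned even if a larger number such as 32739 shows up later.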
How to display a sequence of numbers in column-major order?
Program description: Find all the prime numbers between 1 and 4,027 and print them in a table which "reads down", using as few rows as possible, and using as few sheets of paper as possible. (This is because I have to print them out on paper to turn it in.) All numbers should be right-justified in their column. The height of the columns should all be the same, except for perhaps the last column, which might have a few blank entries towards its bottom row.

The plan for my first function is to find all prime numbers in the range above and put them in a list. Then I want my second function to display the list in a table that reads top to bottom:

     2 23 59
     3 29 61
     5 31 67
     7 37 71
    11 41 73
    13 43 79
    17 47 83
    19 53 89
    etc...

This is all I've been able to come up with myself:

    def findPrimes(n):
        """ Adds calculated prime numbers to a list. """
        prime_list = list()
        for number in range(1, n + 1):
            prime = True
            for i in range(2, number):
                if(number % i == 0):
                    prime = False
            if prime:
                prime_list.append(number)
        return prime_list

    def displayPrimes():
        pass

    print(findPrimes(4027))

I'm not sure how to make a row/column display in Python. I remember using Java in my previous class, and we had to use a for loop inside a for loop, I believe. Do I have to do something similar to that?
Although I frequently don't answer questions where the original poster hasn't even made an attempt to solve the problem themselves, I decided to make an exception of yours, mostly because I found it an interesting (and surprisingly challenging) problem that required solving a number of somewhat tricky sub-problems.

I also optimized your find_primes() function slightly by taking advantage of some relatively well-known computational shortcuts for calculating them.

For testing and demo purposes, I made the tables only 15 rows high to force more than one page to be generated, as shown in the output at the end.

    from itertools import zip_longest
    import locale
    import math

    locale.setlocale(locale.LC_ALL, '')  # enable locale-specific formatting

    def zip_discard(*iterables, _NULL=object()):
        """ Like zip_longest() but doesn't fill out all rows to equal length.
            https://stackoverflow.com/questions/38054593/zip-longest-without-fillvalue
        """
        return [[entry for entry in iterable if entry is not _NULL]
                for iterable in zip_longest(*iterables, fillvalue=_NULL)]

    def grouper(n, seq):
        """ Group elements in sequence into groups of "n" items. """
        for i in range(0, len(seq), n):
            yield seq[i:i+n]

    def tabularize(width, height, numbers):
        """ Print list of numbers in column-major tabular form given the dimensions
            of the table in characters (rows and columns). Will create multiple
            tables if required to display all numbers.
        """
        # Determine number of chars needed to hold longest formatted numeric value
        gap = 2  # including space between numbers
        col_width = len('{:n}'.format(max(numbers))) + gap

        # Determine number of columns that will fit within the table's width.
        num_cols = width // col_width
        chunk_size = num_cols * height  # maximum numbers in each table

        for i, chunk in enumerate(grouper(chunk_size, numbers), start=1):
            print('---- Page {} ----'.format(i))
            num_rows = int(math.ceil(len(chunk) / num_cols))  # rounded up
            table = zip_discard(*grouper(num_rows, chunk))
            for row in table:
                print(''.join(('{:{width}n}'.format(num, width=col_width)
                               for num in row)))

    def find_primes(n):
        """ Create list of prime numbers from 1 to n. """
        prime_list = []
        for number in range(1, n+1):
            for i in range(2, int(math.sqrt(number)) + 1):
                if not number % i:  # Evenly divisible?
                    break  # Not prime.
            else:
                prime_list.append(number)
        return prime_list

    primes = find_primes(4027)
    tabularize(80, 15, primes)

Output:

    ---- Page 1 ----
          1     47    113    197    281    379    463    571    659    761    863
          2     53    127    199    283    383    467    577    661    769    877
          3     59    131    211    293    389    479    587    673    773    881
          5     61    137    223    307    397    487    593    677    787    883
          7     67    139    227    311    401    491    599    683    797    887
         11     71    149    229    313    409    499    601    691    809    907
         13     73    151    233    317    419    503    607    701    811    911
         17     79    157    239    331    421    509    613    709    821    919
         19     83    163    241    337    431    521    617    719    823    929
         23     89    167    251    347    433    523    619    727    827    937
         29     97    173    257    349    439    541    631    733    829    941
         31    101    179    263    353    443    547    641    739    839    947
         37    103    181    269    359    449    557    643    743    853    953
         41    107    191    271    367    457    563    647    751    857    967
         43    109    193    277    373    461    569    653    757    859    971
    ---- Page 2 ----
        977  1,069  1,187  1,291  1,427  1,511  1,613  1,733  1,867  1,987  2,087
        983  1,087  1,193  1,297  1,429  1,523  1,619  1,741  1,871  1,993  2,089
        991  1,091  1,201  1,301  1,433  1,531  1,621  1,747  1,873  1,997  2,099
        997  1,093  1,213  1,303  1,439  1,543  1,627  1,753  1,877  1,999  2,111
      1,009  1,097  1,217  1,307  1,447  1,549  1,637  1,759  1,879  2,003  2,113
      1,013  1,103  1,223  1,319  1,451  1,553  1,657  1,777  1,889  2,011  2,129
      1,019  1,109  1,229  1,321  1,453  1,559  1,663  1,783  1,901  2,017  2,131
      1,021  1,117  1,231  1,327  1,459  1,567  1,667  1,787  1,907  2,027  2,137
      1,031  1,123  1,237  1,361  1,471  1,571  1,669  1,789  1,913  2,029  2,141
      1,033  1,129  1,249  1,367  1,481  1,579  1,693  1,801  1,931  2,039  2,143
      1,039  1,151  1,259  1,373  1,483  1,583  1,697  1,811  1,933  2,053  2,153
      1,049  1,153  1,277  1,381  1,487  1,597  1,699  1,823  1,949  2,063  2,161
      1,051  1,163  1,279  1,399  1,489  1,601  1,709  1,831  1,951  2,069  2,179
      1,061  1,171  1,283  1,409  1,493  1,607  1,721  1,847  1,973  2,081  2,203
      1,063  1,181  1,289  1,423  1,499  1,609  1,723  1,861  1,979  2,083  2,207
    ---- Page 3 ----
      2,213  2,333  2,423  2,557  2,687  2,789  2,903  3,037  3,181  3,307  3,413
      2,221  2,339  2,437  2,579  2,689  2,791  2,909  3,041  3,187  3,313  3,433
      2,237  2,341  2,441  2,591  2,693  2,797  2,917  3,049  3,191  3,319  3,449
      2,239  2,347  2,447  2,593  2,699  2,801  2,927  3,061  3,203  3,323  3,457
      2,243  2,351  2,459  2,609  2,707  2,803  2,939  3,067  3,209  3,329  3,461
      2,251  2,357  2,467  2,617  2,711  2,819  2,953  3,079  3,217  3,331  3,463
      2,267  2,371  2,473  2,621  2,713  2,833  2,957  3,083  3,221  3,343  3,467
      2,269  2,377  2,477  2,633  2,719  2,837  2,963  3,089  3,229  3,347  3,469
      2,273  2,381  2,503  2,647  2,729  2,843  2,969  3,109  3,251  3,359  3,491
      2,281  2,383  2,521  2,657  2,731  2,851  2,971  3,119  3,253  3,361  3,499
      2,287  2,389  2,531  2,659  2,741  2,857  2,999  3,121  3,257  3,371  3,511
      2,293  2,393  2,539  2,663  2,749  2,861  3,001  3,137  3,259  3,373  3,517
      2,297  2,399  2,543  2,671  2,753  2,879  3,011  3,163  3,271  3,389  3,527
      2,309  2,411  2,549  2,677  2,767  2,887  3,019  3,167  3,299  3,391  3,529
      2,311  2,417  2,551  2,683  2,777  2,897  3,023  3,169  3,301  3,407  3,533
    ---- Page 4 ----
      3,539  3,581  3,623  3,673  3,719  3,769  3,823  3,877  3,919  3,967  4,019
      3,541  3,583  3,631  3,677  3,727  3,779  3,833  3,881  3,923  3,989  4,021
      3,547  3,593  3,637  3,691  3,733  3,793  3,847  3,889  3,929  4,001  4,027
      3,557  3,607  3,643  3,697  3,739  3,797  3,851  3,907  3,931  4,003
      3,559  3,613  3,659  3,701  3,761  3,803  3,853  3,911  3,943  4,007
      3,571  3,617  3,671  3,709  3,767  3,821  3,863  3,917  3,947  4,013
Pandas: select by bigger than a value
My dataframe has a column called dir with several distinct values, and I want to know how many of those values occur more than a certain number of times. For example:

    df['dir'].value_counts().sort_index()

returns a Series:

    0       855
    20      881
    40     2786
    70     3777
    90     3964
    100       4
    110    2115
    130    3040
    140       1
    160    1697
    180    1734
    190       3
    200     618
    210       3
    220    1451
    250     895
    270    2167
    280       1
    290    1643
    300       1
    310    1894
    330       1
    340     965
    350       1
    Name: dir, dtype: int64

Here, I want to know the number of values whose count passes 500. In this case, it's all except 100, 140, 190, 210, 280, 300, 330, 350. How can I do that? I can get as far as

    df['dir'].value_counts()[df['dir'].value_counts() > 500]
    (df['dir'].value_counts() > 500).sum()

This gets the value counts and compares each one with 500, which returns a Series of truth values. The parentheses make sure .sum() is applied to that whole boolean Series. .sum() counts each True as 1 and each False as 0, so the result is the number of values that occur more than 500 times.
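A small, self-contained way to try this out, assuming pandas is installed. The toy data and the threshold of 1 are made up for illustration; only `value_counts`, the comparison, and `sum` come from the answer above.

```python
import pandas as pd

# Toy column: 0 appears three times, 20 once, 40 twice (made-up data).
df = pd.DataFrame({"dir": [0, 0, 0, 20, 40, 40]})

counts = df["dir"].value_counts()     # occurrences of each distinct value
mask = counts > 1                     # boolean Series: True where count > 1
n_frequent = int(mask.sum())          # True counts as 1, False as 0
frequent = counts[mask].sort_index()  # the values themselves, if needed

print(n_frequent)            # 2
print(list(frequent.index))  # [0, 40]
```

Keeping `mask` in a variable gives both the count (via `sum`) and the matching values (via indexing) without calling `value_counts()` twice.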