Pandas Slicing Between Dates Then Replace Values With Zero - python

I have the following DataFrame:
Channel Column 1 Column 2 Column 3
Date
12/30/2018 638 4472 487
12/31/2018 868 6985 540
1/1/2019 755 4401 829
1/2/2019 1655 9484 1145
1/3/2019 2002 14212 1158
1/4/2019 1633 9575 1098
1/5/2019 1026 5575 941
1/6/2019 1025 4963 1007
1/7/2019 1944 10685 1246
1/8/2019 2140 9932 1151
1/9/2019 2067 1031 1087
1/10/2019 2168 1005 1074
1/11/2019 2052 9371 909
1/12/2019 1223 5953 895
1/13/2019 1268 4809 827
I would like to return the following result if possible [essentially reduce values between certain dates in a specific column to zero]
Channel Column 1 Column 2 Column 3
Date
12/30/2018 638 4472 487
12/31/2018 868 6985 540
1/1/2019 755 4401 829
1/2/2019 1655 9484 1145
1/3/2019 2002 14212 1158
1/4/2019 1633 9575 1098
1/5/2019 1026 5575 941
1/6/2019 0 4963 1007
1/7/2019 0 10685 1246
1/8/2019 0 9932 1151
1/9/2019 0 1031 1087
1/10/2019 2168 1005 1074
1/11/2019 2052 9371 909
1/12/2019 1223 5953 895
1/13/2019 1268 4809 827
I am trying to filter by a specific column at specific dates, but I can't get it to work properly.
I have tried the following approaches, but I haven't had much luck
df[df['Channel'] == 'Branded Paid Search'].loc['1/6/2019':'1/9/2019']['Sessions'].apply(lambda x: 0 if x < 4000 else 0).to_frame()
This works, but not sure how to get the values back into the original dataframe.
I tried this:
def zero(df):
    if df[df['Column 1'] > 0].loc['1/6/2019':'1/9/2019']:
        return 0
    else:
        return 1
df.apply(zero, axis=1)
ValueError: ('The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().')
I tried this:
sessions_df[sessions_df['Column 1'] > 0].loc['1/6/2019':'1/9/2019'] = 0
Nothing changes.
Any help would be appreciated

First convert the index to a DatetimeIndex with to_datetime, then set the values with DataFrame.loc:
df.index = pd.to_datetime(df.index)
df.loc['1/6/2019':'1/9/2019', 'Column 1'] = 0
print(df)
Column 1 Column 2 Column 3
Channel
2018-12-30 638 4472 487
2018-12-31 868 6985 540
2019-01-01 755 4401 829
2019-01-02 1655 9484 1145
2019-01-03 2002 14212 1158
2019-01-04 1633 9575 1098
2019-01-05 1026 5575 941
2019-01-06 0 4963 1007
2019-01-07 0 10685 1246
2019-01-08 0 9932 1151
2019-01-09 0 1031 1087
2019-01-10 2168 1005 1074
2019-01-11 2052 9371 909
2019-01-12 1223 5953 895
2019-01-13 1268 4809 827
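The last attempt in the question left the frame unchanged because indexing with a boolean condition returns a new object, so the assignment lands on that temporary copy. Below is a minimal sketch of doing the selection and the assignment in one .loc call. It uses a made-up four-row subset of the question's data, and the frame contents and the name mask are illustrative rather than taken from the original post:

```python
import pandas as pd

# Small frame built from a few rows of the question's data (illustrative subset).
df = pd.DataFrame(
    {"Column 1": [1026, 1025, 1944, 2168], "Column 2": [5575, 4963, 10685, 1005]},
    index=pd.to_datetime(["1/5/2019", "1/6/2019", "1/7/2019", "1/10/2019"]),
)

# One boolean mask combining the value condition and the date window.
mask = (df["Column 1"] > 0) & (df.index >= "2019-01-06") & (df.index <= "2019-01-09")

# Assign through a single .loc call so the original frame is modified in place.
df.loc[mask, "Column 1"] = 0
```

Only the rows inside the window are zeroed, and the other columns are left alone.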

Related

How can I get separate plots for each columns(A,B,C,D) vs date side by side using python

I have similar data as below in my pandas dataframe.
Date A B C D
01-01-2022 10000 1700 1457 327
02-01-2022 17000 3000 1245 526
03-01-2022 16000 2624 1478 632
04-01-2022 10138 1745 1325 800
05-01-2022 4761 1789 1475 952
06-01-2022 5000 1874 1423 1105
07-01-2022 3000 1965 1421 895
08-01-2022 4000 1847 1420 1410
09-01-2022 3001 1654 1418 564
10-01-2022 3002 1754 1417 1715
11-01-2022 3003 1598 1415 564
12-01-2022 3004 1515 1414 2020
13-01-2022 3005 1433 1412 564
14-01-2022 3006 1350 1411 2325
15-01-2022 3007 1268 1409 456
How can I get separate plots side by side as date vs A, Date vs B, Date Vs C and so on, using python?
I am still learning, new to python and data visualization.
Try this: use the pandas plot method with subplots=True, and pass layout as a (rows, columns) tuple:
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')
df.set_index('Date').plot(subplots=True, layout=(1,4), figsize=(15,7))

python read and rewrite values per row

I am changing an old question of mine.
I have a file with this format; 4 values per line:
2623 831 6892 0
2353 1803 3425 0
1910 1823 3810 0
1637 1287 2811 0
2803 546 6609 0
1591 2157 2367 0
2167 1906 2665 0
3192 2168 8362 0
3903 1465 2011 0
2355 1801 2004 0
2390 796 5055 0
1703 1044 3441 0
1886 1328 2731 0
1496 1277 3074 0
1827 460 5992 0
1945 1785 2065 0
1983 1963 2818 0
1532 2229 6936 0
2449 5972 1918 0
2699 2007 1581 0
and I want to get this one; 10 values per line:
2623 831 6892 0 2353 1803 3425 0 1910 1823
3810 0 1637 1287 2811 0 2803 546 6609 0
1591 2157 2367 0 2167 1906 2665 0 3192 2168
8362 0 3903 1465 2011 0 2355 1801 2004 0
2390 796 5055 0 1703 1044 3441 0 1886 1328
2731 0 1496 1277 3074 0 1827 460 5992 0
1945 1785 2065 0 1983 1963 2818 0 1532 2229
6936 0 2449 5972 1918 0 2699 2007 1581 0
This is what I have so far:
import itertools
with open("Read_file") as f1:
    with open("Write_file", "w") as f2:
        f2.writelines(itertools.islice(f1, 4, None))
Any tip is appreciated.
Try this:
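with open('data.txt') as fp, open('output.txt', 'w') as fw:
    # split() with no arguments splits on any whitespace, including newlines
    data = fp.read().split()
    # step through the values 10 at a time; a shorter final line is kept
    for i in range(0, len(data), 10):
        fw.write(' '.join(data[i:i + 10]) + '\n')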
Output:
2623 831 6892 0 2353 1803 3425 0 1910 1823
3810 0 1637 1287 2811 0 2803 546 6609 0
1591 2157 2367 0 2167 1906 2665 0 3192 2168
8362 0 3903 1465 2011 0 2355 1801 2004 0
2390 796 5055 0 1703 1044 3441 0 1886 1328
2731 0 1496 1277 3074 0 1827 460 5992 0
1945 1785 2065 0 1983 1963 2818 0 1532 2229
6936 0 2449 5972 1918 0 2699 2007 1581 0
A version that does not rely on reading the whole file into memory:
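from itertools import islice

def get_words(f):
    for line in f:
        for word in line.split():
            yield word

def chunk_values(iterator, num):
    while True:
        # islice stops quietly at the end of the input, so no StopIteration
        # escapes from inside the generator (a RuntimeError on Python 3.7+)
        chunk = list(islice(iterator, num))
        if not chunk:
            return
        yield chunk

with open('input.txt') as fin, open('output.txt', 'w') as fout:
    for chunk in chunk_values(get_words(fin), 10):
        fout.write(' '.join(chunk) + '\n')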
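To see how the streaming approach treats a final line with fewer than 10 values, here is a small sketch that runs the same idea on in-memory data. The helper name chunked_lines and the sample input are hypothetical, and io.StringIO merely stands in for the real input file:

```python
import io
from itertools import islice

def chunked_lines(source, size):
    """Yield space-joined lines of at most `size` whitespace-separated values."""
    words = (word for line in source for word in line.split())
    while True:
        chunk = list(islice(words, size))
        if not chunk:  # the word iterator is exhausted
            return
        yield " ".join(chunk)

# In-memory stand-in for the input file: three lines of four values each.
sample = io.StringIO("1 2 3 4\n5 6 7 8\n9 10 11 12\n")
result = list(chunked_lines(sample, 10))
```

With 12 input values, result holds one full line of 10 values followed by a shorter line with the remaining 2.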

Why is Pandas resample sampling out of sample?

I've got an issue with the pandas resample function when trying to resample a time series. My program fetches daily traffic data going two years back from today and writes it to a .csv file. Resampling initially worked well, but recently it has started acting up. When I try to resample the daily data to weekly, monthly or quarterly frequency, pandas seems to randomly produce out-of-sample (non-existent) data on both sides of the actual range.
I first create a Pandas data frame from the csv file:
data = pd.read_csv('Trucks.csv')
data['Date'] = pd.to_datetime(data['Date'], infer_datetime_format=True)
data.set_index('Date',inplace=True)
data['Modified Total Trucks'] = data['Modified Total Trucks'].astype(int)
Here's a sample of the data:
Date Total Trucks Modified Total Trucks Solo Trucks Semi Trucks Full Trucks
2020-07-04 3898 2535 805 2281 812
2020-06-04 4125 2740 927 2378 820
2020-05-04 730 569 234 431 65
2020-04-04 465 354 145 270 50
2020-03-04 3501 2377 812 2051 638
2020-02-04 3594 2334 754 2081 759
...
2018-04-13 3243 2333 819 1978 446
2018-12-04 3402 2394 767 2144 491
2018-11-04 3559 2543 859 2209 491
2018-10-04 3492 2473 813 2182 497
2018-09-04 3733 2672 902 2321 510
I then try to resample the data:
DataWeekly = data.resample('1W').sum()
DataMonthly = data.resample('1M').sum()
DataQuarterly = data.resample('1Q').sum()
However, the resampled data frames have the wrong range and sometimes incorrect values. Here's an example of the monthly set:
Date Total Trucks Modified Total Trucks Solo Trucks Semi Trucks Full Trucks
2018-01-31 15553 11119 3842 9531 2180
2018-02-28 18488 13113 4497 11291 2700
2018-03-31 21355 15177 5134 13176 3045
2018-04-30 67785 48478 16524 41893 9368
2018-05-31 72390 51690 17666 44594 10130
2018-06-30 63877 45356 14938 40000 8939
2018-07-31 64846 46437 16108 39703 9035
2018-08-31 68352 49036 16905 42081 9366
2018-09-30 64629 46379 15963 39842 8824
2018-10-31 68093 48609 16806 41643 9644
2018-11-30 74643 53052 18581 45073 10989
2018-12-31 60270 43042 15030 36649 8591
2019-01-31 76866 55463 18994 47789 10083
2019-02-28 74705 53744 18170 46674 9861
2019-03-31 78664 56562 19108 49144 10412
2019-04-30 77760 56175 19356 48224 10180
2019-05-31 88033 63219 22049 53859 12125
2019-06-30 70370 50626 17448 43454 9468
2019-07-31 76014 54531 18698 46947 10369
2019-08-31 83509 60418 21600 50653 11256
2019-09-30 77289 55375 19097 47517 10675
2019-10-31 83514 60021 20761 51397 11356
2019-11-30 81383 58460 20550 49551 11282
2019-12-31 68307 49172 17092 41990 9225
2020-01-31 59448 42384 14547 36472 8429
2020-02-29 53862 38544 13687 32457 7718
2020-03-31 62950 43478 14930 37403 10617
2020-04-30 7796 5645 1968 4811 1017
2020-05-31 7983 5840 2053 4951 979
2020-06-30 11200 7918 2785 6710 1705
2020-07-31 10998 7673 2576 6691 1731
2020-08-31 4602 3323 1155 2838 609
2020-09-30 7980 5794 1991 4981 1008
2020-10-31 9759 7060 2464 6012 1283
2020-11-30 7762 5595 1906 4836 1020
2020-12-31 7642 5412 1790 4760 1092
I would expect the resample to be:
2018-04-30 67785 48478 16524 41893 9368
2018-05-31 72390 51690 17666 44594 10130
2018-06-30 63877 45356 14938 40000 8939
2018-07-31 64846 46437 16108 39703 9035
2018-08-31 68352 49036 16905 42081 9366
2018-09-30 64629 46379 15963 39842 8824
2018-10-31 68093 48609 16806 41643 9644
2018-11-30 74643 53052 18581 45073 10989
2018-12-31 60270 43042 15030 36649 8591
2019-01-31 76866 55463 18994 47789 10083
2019-02-28 74705 53744 18170 46674 9861
2019-03-31 78664 56562 19108 49144 10412
2019-04-30 77760 56175 19356 48224 10180
2019-05-31 88033 63219 22049 53859 12125
2019-06-30 70370 50626 17448 43454 9468
2019-07-31 76014 54531 18698 46947 10369
2019-08-31 83509 60418 21600 50653 11256
2019-09-30 77289 55375 19097 47517 10675
2019-10-31 83514 60021 20761 51397 11356
2019-11-30 81383 58460 20550 49551 11282
2019-12-31 68307 49172 17092 41990 9225
2020-01-31 59448 42384 14547 36472 8429
2020-02-29 53862 38544 13687 32457 7718
2020-03-31 62950 43478 14930 37403 10617
2020-04-30 7796 5645 1968 4811 1017
What am I missing? Many thanks in advance!
I think this is a day/month ordering problem (day-first vs. month-first dates, i.e. YYYY-DD-MM vs YYYY-MM-DD). A date such as 1 April 2018 appears to be read as 2018-01-04, i.e. 4 January, so it lands in the 2018-01-31 bin (January 2018).
Set the option dayfirst=True in your pd.to_datetime call; see the pandas documentation for more details.
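A small sketch of the effect, assuming the raw CSV stores dates day-first with slashes. The exact raw format is not shown in the question, so the string below is a hypothetical value:

```python
import pandas as pd

raw = "07/04/2020"  # hypothetical raw value, intended as 7 April 2020

month_first = pd.to_datetime(raw)               # default: the first number is the month
day_first = pd.to_datetime(raw, dayfirst=True)  # the first number is the day

# Spelling out the format removes the guesswork entirely.
explicit = pd.to_datetime(raw, format="%d/%m/%Y")
```

Here month_first is 4 July 2020, while day_first and explicit are both 7 April 2020. If the file's layout is known, passing format is the stricter choice, since values that do not match it raise an error instead of being reinterpreted.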

Find max values for each 5 rows in pd.DataFrame

I have some marketing data at 1-minute intervals.
Here is a sample of the CSV table; each row holds the max values for one minute:
time ch1 ch2 ch3 ch4
20:03 1754 539 149 1337
20:04 2073 576 160 1448
20:05 2246 599 176 1515
20:06 2246 637 176 1531
20:07 2457 651 183 1549
20:08 2564 677 184 1655
20:09 2624 712 191 1699
20:10 2742 717 194 1672
20:11 2788 714 199 1675
20:12 2792 693 186 1680
20:13 2914 708 188 1672
20:14 3067 715 194 1685
20:15 3067 725 196 1682
Additionally, I need to find the max values for each 5-minute interval. So I need the max of every 5 rows (or fewer, if fewer rows remain) of each column, inserted into a new 5-minute row.
What I am looking to receive (as an example), where each new row represents the max values for 5 minutes:
time ch1 ch2 ch3 ch4
20:03 2564 677 184 1655
20:08 2914 717 199 1699
20:13 3067 725 196 1685
I have honestly searched, but found nothing.
Is there an elegant solution for my task in Python?
Thanks for your help!
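import numpy as np
# integer-divide the row positions by 5 so every 5 consecutive rows share a group key
g = df.groupby(np.arange(len(df)) // 5)
g.max().assign(time=g.time.first())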
time ch1 ch2 ch3 ch4 ch5
0 20:03 2457 651 183 1549 4840
1 20:08 2792 717 199 1699 5376
2 20:13 3067 725 196 1685 5670
Using your input:
df['group'] = df.index // 5  # assumes the default 0..n-1 integer index
target = df.groupby('group').agg('max')
target['time'] = df.groupby('group').time.agg('min')
Out[511]:
time ch1 ch2 ch3 ch4 ch5
group
0 20:03 2457 651 183 1549 4840
1 20:08 2792 717 199 1699 5376
2 20:13 3067 725 196 1685 5670
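Both answers above rely on the same trick: integer-dividing the row position by 5 gives every run of five consecutive rows the same group key, and a shorter final run simply forms its own group. A minimal sketch on the first seven rows of the question's table, restricted to two columns for brevity (the variable names block and result are illustrative):

```python
import numpy as np
import pandas as pd

# First seven rows of the question's table, two channels only.
df = pd.DataFrame({
    "time": ["20:03", "20:04", "20:05", "20:06", "20:07", "20:08", "20:09"],
    "ch1": [1754, 2073, 2246, 2246, 2457, 2564, 2624],
    "ch2": [539, 576, 599, 637, 651, 677, 712],
})

# Row positions 0-4 map to block 0, positions 5-9 to block 1, and so on.
block = np.arange(len(df)) // 5

result = df.groupby(block).agg(
    time=("time", "first"),  # label each block by its first timestamp
    ch1=("ch1", "max"),
    ch2=("ch2", "max"),
)
```

The second block here contains only two rows (20:08 and 20:09) and is still aggregated, which covers the "or fewer" case from the question.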
I'm going to assume that you have not converted your values to datetime, since you said this is a CSV table, so I will convert the index to datetime.
df.index = pd.to_datetime(df.time, format='%H:%M')
Now that the index is a DatetimeIndex, we can use resample to group by 5-minute intervals. Note: I shift the bins by 3 minutes here since that is how you wanted the output formatted, but in the long run you may be better off leaving the shift at 0. (On pandas older than 1.1, pass base=3 instead of offset='3min'; newer releases no longer accept base.) To group the data, run
df.resample('5min', offset='3min').max().drop(columns='time')
To start the bins at the first timestamp instead of hard-coding the shift, use
df.resample('5min', origin='start').max().drop(columns='time')
Yields
ch1 ch2 ch3 ch4
time
2017-09-20 20:03:00 2457 651 183 1549
2017-09-20 20:08:00 2792 717 199 1699
2017-09-20 20:13:00 3067 725 196 1685
If you don't want the date in the index, just run
df.index = df.index.time
However, you need the date included in order to resample.
ch1 ch2 ch3 ch4
20:03:00 2457 651 183 1549
20:08:00 2792 717 199 1699
20:13:00 3067 725 196 1685

file output in python giving me garbage

When I run the following code I get garbage for output. It is just a simple program to find prime numbers. It works when the first for loop's range only goes up to 1000, but once the range becomes large the program fails to output meaningful data.
output = open("output.dat", 'w')
for i in range(2, 10000):
prime = 1
for j in range(2, i-1):
if i%j == 0:
prime = 0
j = i-1
if prime == 1:
output.write(str(i) + " " )
output.close()
print "writing finished"
This is a known Notepad bug. Check out
http://blogs.msdn.com/oldnewthing/archive/2007/04/17/2158334.aspx
The classic way to trigger this bug is to put "Bush hid the facts" in a file, save it, reopen it, and scream about conspiracy theories, but I guess "2 3 5 7 11 13 17" works too, except that you don't get to scream about conspiracy theories.
You're setting a single variable named prime ten thousand times to 1, then 9998 times possibly setting it to 0, and finally (if it's not been set to 0) outputting one incomplete line (no line-end). I suspect that's not what you want to do! Maybe something like...:
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            break
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
Note the very different indentation from what you had posted. I also used break to break out of an inner loop, which I think was what you meant where you wrote j = i - 1 (which would in fact have absolutely no effect since j would just be set to its next natural value in the very next leg of that inner loop, which would still run to the end).
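The break pattern described above pairs naturally with Python's for/else, which avoids the separate flag variable. The following is a sketch in Python 3 rather than the Python 2 of the original post. The function name primes_below is hypothetical, and stopping at the square root is an added shortcut that the original code does not use:

```python
def primes_below(limit):
    """Return all primes smaller than `limit` using trial division."""
    found = []
    for i in range(2, limit):
        # A composite number always has a divisor no larger than its square root.
        for j in range(2, int(i ** 0.5) + 1):
            if i % j == 0:
                break  # a divisor was found, stop testing this i
        else:
            found.append(i)  # inner loop ended without break, so i is prime
    return found

# One complete line of text, ready to be written to a file in a single call.
line = " ".join(map(str, primes_below(10000))) + "\n"
```

Writing line to output.dat in one call also ends the file with a newline, which the original loop never writes.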
With fixed indentation (which I'll have to assume is a bad paste job, otherwise I don't think it would run) your code outputs fine for me :
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039 1049 1051 1061 1063 1069 1087 1091 1093 1097 1103 1109 1117 1123 1129 1151 1153 1163 1171 1181 1187 1193 1201 1213 1217 1223 1229 1231 1237 1249 1259 1277 1279 1283 1289 1291 1297 1301 1303 1307 1319 1321 1327 1361 1367 1373 1381 1399 1409 1423 1427 1429 1433 1439 1447 1451 1453 1459 1471 1481 1483 1487 1489 1493 1499 1511 1523 1531 1543 1549 1553 1559 1567 1571 1579 1583 1597 1601 1607 1609 1613 1619 1621 1627 1637 1657 1663 1667 1669 1693 1697 1699 1709 1721 1723 1733 1741 1747 1753 1759 1777 1783 1787 1789 1801 1811 1823 1831 1847 1861 1867 1871 1873 1877 1879 1889 1901 1907 1913 1931 1933 1949 1951 1973 1979 1987 1993 1997 1999 2003 2011 2017 2027 2029 2039 2053 2063 2069 2081 2083 2087 2089 2099 2111 2113 2129 2131 2137 2141 2143 2153 2161 2179 2203 2207 2213 2221 2237 2239 2243 2251 2267 2269 2273 2281 2287 2293 2297 2309 2311 2333 2339 2341 2347 2351 2357 2371 2377 2381 2383 2389 2393 2399 2411 2417 2423 2437 2441 2447 2459 2467 2473 2477 2503 2521 2531 2539 2543 2549 2551 2557 2579 2591 2593 2609 2617 2621 2633 2647 2657 2659 2663 2671 2677 2683 2687 2689 2693 2699 2707 2711 2713 2719 2729 2731 2741 2749 2753 2767 2777 2789 2791 2797 2801 2803 2819 2833 2837 2843 2851 2857 2861 2879 2887 2897 2903 2909 2917 2927 2939 2953 2957 2963 2969 2971 2999 3001 3011 3019 3023 3037 3041 3049 3061 3067 
3079 3083 3089 3109 3119 3121 3137 3163 3167 3169 3181 3187 3191 3203 3209 3217 3221 3229 3251 3253 3257 3259 3271 3299 3301 3307 3313 3319 3323 3329 3331 3343 3347 3359 3361 3371 3373 3389 3391 3407 3413 3433 3449 3457 3461 3463 3467 3469 3491 3499 3511 3517 3527 3529 3533 3539 3541 3547 3557 3559 3571 3581 3583 3593 3607 3613 3617 3623 3631 3637 3643 3659 3671 3673 3677 3691 3697 3701 3709 3719 3727 3733 3739 3761 3767 3769 3779 3793 3797 3803 3821 3823 3833 3847 3851 3853 3863 3877 3881 3889 3907 3911 3917 3919 3923 3929 3931 3943 3947 3967 3989 4001 4003 4007 4013 4019 4021 4027 4049 4051 4057 4073 4079 4091 4093 4099 4111 4127 4129 4133 4139 4153 4157 4159 4177 4201 4211 4217 4219 4229 4231 4241 4243 4253 4259 4261 4271 4273 4283 4289 4297 4327 4337 4339 4349 4357 4363 4373 4391 4397 4409 4421 4423 4441 4447 4451 4457 4463 4481 4483 4493 4507 4513 4517 4519 4523 4547 4549 4561 4567 4583 4591 4597 4603 4621 4637 4639 4643 4649 4651 4657 4663 4673 4679 4691 4703 4721 4723 4729 4733 4751 4759 4783 4787 4789 4793 4799 4801 4813 4817 4831 4861 4871 4877 4889 4903 4909 4919 4931 4933 4937 4943 4951 4957 4967 4969 4973 4987 4993 4999 5003 5009 5011 5021 5023 5039 5051 5059 5077 5081 5087 5099 5101 5107 5113 5119 5147 5153 5167 5171 5179 5189 5197 5209 5227 5231 5233 5237 5261 5273 5279 5281 5297 5303 5309 5323 5333 5347 5351 5381 5387 5393 5399 5407 5413 5417 5419 5431 5437 5441 5443 5449 5471 5477 5479 5483 5501 5503 5507 5519 5521 5527 5531 5557 5563 5569 5573 5581 5591 5623 5639 5641 5647 5651 5653 5657 5659 5669 5683 5689 5693 5701 5711 5717 5737 5741 5743 5749 5779 5783 5791 5801 5807 5813 5821 5827 5839 5843 5849 5851 5857 5861 5867 5869 5879 5881 5897 5903 5923 5927 5939 5953 5981 5987 6007 6011 6029 6037 6043 6047 6053 6067 6073 6079 6089 6091 6101 6113 6121 6131 6133 6143 6151 6163 6173 6197 6199 6203 6211 6217 6221 6229 6247 6257 6263 6269 6271 6277 6287 6299 6301 6311 6317 6323 6329 6337 6343 6353 6359 6361 6367 6373 6379 6389 6397 6421 6427 6449 6451 6469 
6473 6481 6491 6521 6529 6547 6551 6553 6563 6569 6571 6577 6581 6599 6607 6619 6637 6653 6659 6661 6673 6679 6689 6691 6701 6703 6709 6719 6733 6737 6761 6763 6779 6781 6791 6793 6803 6823 6827 6829 6833 6841 6857 6863 6869 6871 6883 6899 6907 6911 6917 6947 6949 6959 6961 6967 6971 6977 6983 6991 6997 7001 7013 7019 7027 7039 7043 7057 7069 7079 7103 7109 7121 7127 7129 7151 7159 7177 7187 7193 7207 7211 7213 7219 7229 7237 7243 7247 7253 7283 7297 7307 7309 7321 7331 7333 7349 7351 7369 7393 7411 7417 7433 7451 7457 7459 7477 7481 7487 7489 7499 7507 7517 7523 7529 7537 7541 7547 7549 7559 7561 7573 7577 7583 7589 7591 7603 7607 7621 7639 7643 7649 7669 7673 7681 7687 7691 7699 7703 7717 7723 7727 7741 7753 7757 7759 7789 7793 7817 7823 7829 7841 7853 7867 7873 7877 7879 7883 7901 7907 7919 7927 7933 7937 7949 7951 7963 7993 8009 8011 8017 8039 8053 8059 8069 8081 8087 8089 8093 8101 8111 8117 8123 8147 8161 8167 8171 8179 8191 8209 8219 8221 8231 8233 8237 8243 8263 8269 8273 8287 8291 8293 8297 8311 8317 8329 8353 8363 8369 8377 8387 8389 8419 8423 8429 8431 8443 8447 8461 8467 8501 8513 8521 8527 8537 8539 8543 8563 8573 8581 8597 8599 8609 8623 8627 8629 8641 8647 8663 8669 8677 8681 8689 8693 8699 8707 8713 8719 8731 8737 8741 8747 8753 8761 8779 8783 8803 8807 8819 8821 8831 8837 8839 8849 8861 8863 8867 8887 8893 8923 8929 8933 8941 8951 8963 8969 8971 8999 9001 9007 9011 9013 9029 9041 9043 9049 9059 9067 9091 9103 9109 9127 9133 9137 9151 9157 9161 9173 9181 9187 9199 9203 9209 9221 9227 9239 9241 9257 9277 9281 9283 9293 9311 9319 9323 9337 9341 9343 9349 9371 9377 9391 9397 9403 9413 9419 9421 9431 9433 9437 9439 9461 9463 9467 9473 9479 9491 9497 9511 9521 9533 9539 9547 9551 9587 9601 9613 9619 9623 9629 9631 9643 9649 9661 9677 9679 9689 9697 9719 9721 9733 9739 9743 9749 9767 9769 9781 9787 9791 9803 9811 9817 9829 9833 9839 9851 9857 9859 9871 9883 9887 9901 9907 9923 9929 9931 9941 9949 9967 9973
EDIT the version of indentation I ran:
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            j = i-1
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
Your second for should be nested in the first for.
Also, this looks like a homework question. It is not clear how your output is garbage - does it not compute what you want? Or is the output scrambled? Post a copy of the output so we can see!
Don't you want your loops to be nested?
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            j = i-1
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
Without the nesting, you set prime to 1 9998 times,
then you use the final value of i (9999) as the end value
....
To summarize, you have serious indentation problems.
