Find max values for each 5 rows in pd.DataFrame - python

I have some marketing data at 1-minute intervals.
Here is a sample of the CSV table; each row holds the max values for one minute:
time ch1 ch2 ch3 ch4
20:03 1754 539 149 1337
20:04 2073 576 160 1448
20:05 2246 599 176 1515
20:06 2246 637 176 1531
20:07 2457 651 183 1549
20:08 2564 677 184 1655
20:09 2624 712 191 1699
20:10 2742 717 194 1672
20:11 2788 714 199 1675
20:12 2792 693 186 1680
20:13 2914 708 188 1672
20:14 3067 715 194 1685
20:15 3067 725 196 1682
Additionally, I need to find the max values for each 5-minute window. So I need to find the max of every 5 rows (or fewer, if fewer rows remain) for each column and insert it into a new 5-minute row.
What I'm looking to receive (as an example):
Each new row has to represent the max values for 5 minutes:
time ch1 ch2 ch3 ch4
20:03 2564 677 184 1655
20:08 2914 717 199 1699
20:13 3067 725 196 1685
I have honestly searched but found nothing.
Is there an elegant solution in Python for my task?
Thanks for your help!

g = df.groupby(np.arange(len(df)) // 5)
g.max().assign(time=g.time.first())
time ch1 ch2 ch3 ch4 ch5
0 20:03 2457 651 183 1549 4840
1 20:08 2792 717 199 1699 5376
2 20:13 3067 725 196 1685 5670
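For reference, here is a self-contained sketch of the same positional grouping, assuming only the time and ch1 columns from the question (the frame is rebuilt by hand here, and the use of agg with a per-column dict is my own choice rather than part of the answer above):

```python
import numpy as np
import pandas as pd

# Sample data from the question, reduced to the time and ch1 columns.
df = pd.DataFrame({
    "time": [f"20:{m:02d}" for m in range(3, 16)],
    "ch1": [1754, 2073, 2246, 2246, 2457, 2564, 2624,
            2742, 2788, 2792, 2914, 3067, 3067],
})

# Integer-divide the row position by 5: rows 0-4 -> 0, rows 5-9 -> 1, ...
groups = np.arange(len(df)) // 5

# Keep the first timestamp of each block and the max of the channel.
result = df.groupby(groups).agg({"time": "first", "ch1": "max"})
print(result)
```

The last block simply contains fewer than five rows when the row count is not a multiple of 5.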

By using your input :
df['group']=df.index//5
target=df.groupby('group').agg(max)
target['time']=df.groupby('group').time.agg(min)
Out[511]:
time ch1 ch2 ch3 ch4 ch5
group
0 20:03 2457 651 183 1549 4840
1 20:08 2792 717 199 1699 5376
2 20:13 3067 725 196 1685 5670

I'm going to assume that you did not convert your values to datetime, since you specified this is a CSV table of data, so I will convert the index to datetime.
df.index = pd.to_datetime(df.time,format='%H:%M')
Now that the index is in datetime format, we can use resample to group by 5-minute intervals. Note: I will set the base to 3 here since that is how you wanted it formatted; however, I think in the long run you may be better off leaving it at 0. To group the data, just run
df.resample('5T',base=3).max().drop('time',1)
To dynamically set the base to the first minute value use
df.resample('5T',base=int(df.time.values[0][-1:])).max().drop('time',1)
Yields
ch1 ch2 ch3 ch4
time
2017-09-20 20:03:00 2457 651 183 1549
2017-09-20 20:08:00 2792 717 199 1699
2017-09-20 20:13:00 3067 725 196 1685
If you don't want the date in the index, just run
df.index = df.index.time
Note, however, that the index must still include the date when you call resample, so do this only after resampling.
ch1 ch2 ch3 ch4
20:03:00 2457 651 183 1549
20:08:00 2792 717 199 1699
20:13:00 3067 725 196 1685
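In newer pandas releases the base argument of resample is no longer available (it was deprecated in 1.1 and removed in 2.0), and offset is its replacement. A minimal sketch of the same grouping, assuming pandas 1.1 or later and the sample data from the question (the variable names here are my own):

```python
import io

import pandas as pd

# Sample data from the question.
raw = """time ch1 ch2 ch3 ch4
20:03 1754 539 149 1337
20:04 2073 576 160 1448
20:05 2246 599 176 1515
20:06 2246 637 176 1531
20:07 2457 651 183 1549
20:08 2564 677 184 1655
20:09 2624 712 191 1699
20:10 2742 717 194 1672
20:11 2788 714 199 1675
20:12 2792 693 186 1680
20:13 2914 708 188 1672
20:14 3067 715 194 1685
20:15 3067 725 196 1682
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Parse the HH:MM strings; pandas fills in a default date.
df.index = pd.to_datetime(df["time"], format="%H:%M")

# Shift the 5-minute bins so that the first bin starts at the first row's minute.
first_minute = df.index[0].minute % 5
result = (
    df.drop(columns="time")
      .resample("5min", offset=f"{first_minute}min")
      .max()
)
print(result)
```

Leaving offset out gives bins aligned to :00, :05, :10 and so on, which corresponds to the "base 0" behaviour mentioned above.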

Related

Seaborn Lineplot Confidence Interval not visible for all values [duplicate]

I am using sns.lineplot to show the confidence intervals in a plot.
sns.lineplot(x = threshold, y = mrl_array, err_style = 'band', ci=95)
plt.show()
I'm getting the following plot, which doesn't show the confidence interval:
What's the problem?
There is probably only a single observation per x value.
If there is only one observation per x value, then there is no confidence interval to plot.
Bootstrapping is performed per x value, but there needs to be more than one observation per x value for this to take effect.
ci: Size of the confidence interval to draw when aggregating with an estimator. 'sd' means to draw the standard deviation of the data. Setting to None will skip bootstrapping.
Note the following examples from seaborn.lineplot.
This is also the case for sns.relplot with kind='line'.
The question specifies sns.lineplot, but this answer applies to any seaborn plot that displays a confidence interval, such as seaborn.barplot.
Data
import seaborn as sns
# load data
flights = sns.load_dataset("flights")
year month passengers
0 1949 Jan 112
1 1949 Feb 118
2 1949 Mar 132
3 1949 Apr 129
4 1949 May 121
# only May flights
may_flights = flights.query("month == 'May'")
year month passengers
4 1949 May 121
16 1950 May 125
28 1951 May 172
40 1952 May 183
52 1953 May 229
64 1954 May 234
76 1955 May 270
88 1956 May 318
100 1957 May 355
112 1958 May 363
124 1959 May 420
136 1960 May 472
# standard deviation for each year of May data
may_flights.set_index('year')[['passengers']].std(axis=1)
year
1949 NaN
1950 NaN
1951 NaN
1952 NaN
1953 NaN
1954 NaN
1955 NaN
1956 NaN
1957 NaN
1958 NaN
1959 NaN
1960 NaN
dtype: float64
# flight in wide format
flights_wide = flights.pivot(index="year", columns="month", values="passengers")
month Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
year
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
1957 315 301 356 348 355 422 465 467 404 347 305 336
1958 340 318 362 348 363 435 491 505 404 359 310 337
1959 360 342 406 396 420 472 548 559 463 407 362 405
1960 417 391 419 461 472 535 622 606 508 461 390 432
# standard deviation for each year
flights_wide.std(axis=1)
year
1949 13.720147
1950 19.070841
1951 18.438267
1952 22.966379
1953 28.466887
1954 34.924486
1955 42.140458
1956 47.861780
1957 57.890898
1958 64.530472
1959 69.830097
1960 77.737125
dtype: float64
Plots
may_flights has one observation per year, so no CI is shown.
sns.lineplot(data=may_flights, x="year", y="passengers")
sns.barplot(data=may_flights, x='year', y='passengers')
flights_wide shows there are twelve observations for each year, so the CI shows when all of flights is plotted.
sns.lineplot(data=flights, x="year", y="passengers")
sns.barplot(data=flights, x='year', y='passengers')
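To check up front whether a CI can be drawn at all, one option is to count the observations per x value before plotting. This is only a sketch: the helper name x_values_with_single_observation is hypothetical, and the two small frames are hand-built stand-ins using a few values from the flights data shown above.

```python
import pandas as pd


def x_values_with_single_observation(df, x):
    """Return the x values that have fewer than two rows (no CI possible there)."""
    counts = df.groupby(x).size()
    return counts[counts < 2].index.tolist()


# One row per year, like may_flights: no CI can be drawn for any year.
may_like = pd.DataFrame({
    "year": [1949, 1950, 1951],
    "passengers": [121, 125, 172],
})

# Several rows per year, like the full flights data: a CI is possible.
full_like = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "passengers": [112, 118, 115, 126],
})

print(x_values_with_single_observation(may_like, "year"))
print(x_values_with_single_observation(full_like, "year"))
```

If the first list is non-empty, those x positions will be drawn without a band.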

How can I get separate plots for each column (A, B, C, D) vs. date, side by side, using Python

I have similar data as below in my pandas dataframe.
Date A B C D
01-01-2022 10000 1700 1457 327
02-01-2022 17000 3000 1245 526
03-01-2022 16000 2624 1478 632
04-01-2022 10138 1745 1325 800
05-01-2022 4761 1789 1475 952
06-01-2022 5000 1874 1423 1105
07-01-2022 3000 1965 1421 895
08-01-2022 4000 1847 1420 1410
09-01-2022 3001 1654 1418 564
10-01-2022 3002 1754 1417 1715
11-01-2022 3003 1598 1415 564
12-01-2022 3004 1515 1414 2020
13-01-2022 3005 1433 1412 564
14-01-2022 3006 1350 1411 2325
15-01-2022 3007 1268 1409 456
How can I get separate plots side by side (Date vs. A, Date vs. B, Date vs. C, and so on) using Python?
I am still learning and am new to Python and data visualization.
Try this, using pandas plot with subplots equal to True, and layout with (row, column) tuple:
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')
df.set_index('Date').plot(subplots=True, layout=(1,4), figsize=(15,7))
Output:

input data from slider to change marker colour

Appreciate any help on this task!
I'm trying to have this Dash app take the slider's input value, pass it through a function, and then change only the marker colour variable.
The code is written in Python and uses Plotly Dash and Plotly, as well as pandas, NumPy, and Mapbox.
The top part of the code gets the data into the right format. It has traffic data which is processed to produce a heatmap that shows congestion over time on a map. The dataframe DF is for volume of traffic, and the dataframe HF was created so the slider would work (I added a column numbered from 0 up to the number of columns, to use with the slider). The function datatime should choose the volumes based on the time and the detector ID.
I originally created this functionality with javascript as shown here - https://jsbin.com/detejef/edit?html,js,output
I've been working at this code for a while. I'm very close to finally getting a prototype but have this one snag: the time variable doesn't update properly and re-render the map with the detector changes.
I just need the colour entry of the marker dictionary to change as the slider value changes, in conjunction with the functions I've created. The function works by itself.
This is an update to the code.
# data wrangling
xls = pd.ExcelFile('FreewayFDSData.xlsx') # this loads the data only once saving memory
df = pd.read_excel(xls, 'Volume', parse_dates=True, index_col="Time")
df = df.T
df2 = pd.read_excel(xls, 'Occupancy', parse_dates=True, index_col="Time")
df2 = df2.T
df3 = pd.read_excel(xls, 'Speed', parse_dates=True, index_col="Time")
df3 = df3.T
Detectors = list(df.columns)
mf = pd.read_excel('FreewayFDSData.xlsx', 'Coordinates', index_col="Short Name")
# return df, df2, df3, Detectors, mf
# input slider value then output into data frame filter for slider time volume value
# timeslider arrangement
def heatmap(SVO):
    # creates heatmap data for map
    SVO['Period'] = np.arange(len(SVO))
    mintime = SVO['Period'].min()
    maxtime = SVO['Period'].max()
    return mintime, maxtime
mintime, maxtime = heatmap(df)
hf = df.reset_index().set_index('Period')
df2['Period'] = np.arange(len(df2))
hf2 = df2.reset_index().set_index('Period')
df3['Period'] = np.arange(len(df3))
hf3 = df3.reset_index().set_index('Period')
# Marker
def datatime(t,hf):
    heat = hf.filter(items=[t], axis=0).T.drop("index")
    return heat[t]
This is the app section with only the useful parts included.
.....
html.Div([
dcc.RadioItems(
id='tdatam',
options=[{'label': i, 'value': i} for i in ['Volume', 'Speed', 'Occupancy']],
value='Volume',
labelStyle={'display': 'inline-block'}
),
],
style={'width': '48%', 'display': 'inline-block'}),
html.Div([
....
],
style={'width': '50%', 'display': 'inline-block'}),
dcc.Graph(id='graph'),
html.P("", id="popupAnnotation", className="popupAnnotation"),
dcc.Slider(
id="Slider",
marks={i: 'Hour {}'.format(i) for i in range(0, 24)},
min=mintime / 4,
max=maxtime / 4,
step=.01,
value=9,
)
], style={"padding-bottom": '50px', "padding-right": '50px', "padding-left": '50px', "padding-top": '50px'}),
....
App functions/ callbacks
@app.callback(
Output('graph', 'figure'),
[Input('Slider', 'value'),
Input('tdatam', 'value')]
)
def update_map(time, tdata):
    #use state
    zoom = 10.0
    latInitial = -37.8136
    lonInitial = 144.9631
    bearing = 0
    #when time function is updated from slider it is failing
    #Trying to create either a new time variable to create a test for time slider or alternatively a new function for updating time
    if tdata == "Volume":
        return go.Figure(
            data=Data([
                Scattermapbox(
                    lat=mf.Y,
                    lon=mf.X,
                    mode='markers',
                    hoverinfo="text",
                    text=["Monash Freeway", "Western Link",
                          "Eastern Link",
                          "Melbourne CBD", "Swan Street"],
                    # opacity=0.5,
                    marker=Marker(size=15,
                                  color=datatime(time,hf),
                                  colorscale='Viridis',
                                  opacity=.8,
                                  showscale=True,
                                  cmax=2500,
                                  cmin=700
                                  ),
                ),
            ]),
            layout=Layout(
                autosize=True,
                height=750,
                margin=Margin(l=0, r=0, t=0, b=0),
                showlegend=False,
                mapbox=dict(
                    accesstoken=mapbox_access_token,
                    center=dict(
                        lat=latInitial, # -37.8136
                        lon=lonInitial # 144.9631
                    ),
                    style='dark',
                    bearing=bearing,
                    zoom=zoom
                ),........
)
]
)
)
Example data (anonymized)
Lat/Long/Name
Short Name Y X
A -37.883416 145.090084
B -37.883378 145.090038
C -37.882968 145.089531
D -37.882931 145.089484
Data input
Row Labels 00:00 - 00:15 00:15 - 00:30 00:30 - 00:45 00:45 - 01:00 01:00 - 01:15 01:15 - 01:30 01:30 - 01:45 01:45 - 02:00 02:00 - 02:15 02:15 - 02:30 02:30 - 02:45 02:45 - 03:00 03:00 - 03:15 03:15 - 03:30 03:30 - 03:45 03:45 - 04:00 04:00 - 04:15 04:15 - 04:30 04:30 - 04:45 04:45 - 05:00 05:00 - 05:15 05:15 - 05:30 05:30 - 05:45 05:45 - 06:00 06:00 - 06:15 06:15 - 06:30 06:30 - 06:45 06:45 - 07:00 07:00 - 07:15 07:15 - 07:30 07:30 - 07:45 07:45 - 08:00 08:00 - 08:15 08:15 - 08:30 08:30 - 08:45 08:45 - 09:00 09:00 - 09:15 09:15 - 09:30 09:30 - 09:45 09:45 - 10:00 10:00 - 10:15 10:15 - 10:30 10:30 - 10:45 10:45 - 11:00 11:00 - 11:15 11:15 - 11:30 11:30 - 11:45 11:45 - 12:00 12:00 - 12:15 12:15 - 12:30 12:30 - 12:45 12:45 - 13:00 13:00 - 13:15 13:15 - 13:30 13:30 - 13:45 13:45 - 14:00 14:00 - 14:15 14:15 - 14:30 14:30 - 14:45 14:45 - 15:00 15:00 - 15:15 15:15 - 15:30 15:30 - 15:45 15:45 - 16:00 16:00 - 16:15 16:15 - 16:30 16:30 - 16:45 16:45 - 17:00 17:00 - 17:15 17:15 - 17:30 17:30 - 17:45 17:45 - 18:00 18:00 - 18:15 18:15 - 18:30 18:30 - 18:45 18:45 - 19:00 19:00 - 19:15 19:15 - 19:30 19:30 - 19:45 19:45 - 20:00 20:00 - 20:15 20:15 - 20:30 20:30 - 20:45 20:45 - 21:00 21:00 - 21:15 21:15 - 21:30 21:30 - 21:45 21:45 - 22:00 22:00 - 22:15 22:15 - 22:30 22:30 - 22:45 22:45 - 23:00 23:00 - 23:15 23:15 - 23:30 23:30 - 23:45 23:45 - 24:00
A 88 116 84 68 76 56 56 48 72 48 76 40 76 44 36 76 76 116 124 176 236 352 440 624 1016 1172 1260 1280 1304 1312 1252 1344 1324 1336 1212 1148 1132 1120 1084 996 924 1040 952 900 900 1116 1136 1044 1144 1152 1224 1088 1132 1184 1208 1120 1240 1196 1116 1264 1196 1240 1308 1192 1164 1096 1080 1160 1112 1244 1244 1184 1232 996 1108 876 864 776 644 520 684 724 632 620 680 724 516 504 432 396 264 252 272 256 100 144
B 88 116 76 68 76 56 56 48 68 48 76 48 80 44 32 76 76 108 120 180 240 340 456 624 1088 1268 1352 1384 1412 1376 1356 1372 1400 1436 1296 1240 1200 1256 1120 1028 1008 1072 980 944 932 1148 1192 1040 1188 1220 1292 1140 1116 1268 1292 1172 1272 1236 1216 1280 1248 1280 1388 1244 1224 1076 1096 1148 1108 1256 1356 1308 1236 992 1100 880 872 768 640 520 680 720 636 620 660 716 512 504 428 396 260 244 272 252 100 136
C 84 108 68 68 72 56 56 36 60 48 76 44 72 48 32 68 76 108 124 176 240 340 436 604 1036 1168 1280 1372 1204 1304 1268 1228 1280 1312 1164 1076 1156 1108 924 960 864 944 896 840 840 1068 1052 1036 1128 1164 1136 1084 1052 1136 1072 1056 1136 1160 1088 1224 1180 1228 1264 1204 1044 1008 1076 1128 1112 1252 1188 1180 1156 1000 1096 860 868 736 600 520 680 704 624 616 684 720 500 504 408 392 252 236 264 240 96 144
D 92 108 68 68 72 56 56 40 64 48 76 44 72 48 32 72 76 112 132 184 240 340 436 608 1040 1156 1280 1336 1196 1336 1316 1272 1344 1332 1144 1140 1176 1128 924 948 888 956 892 848 868 1036 1064 1036 1108 1192 1120 1080 1044 1152 1068 1040 1140 1180 1104 1232 1164 1280 1256 1196 1052 1016 1084 1128 1116 1252 1192 1168 1160 1000 1076 868 872 744 620 524 680 716 628 628 680 716 500 500 412 388 256 244 260 244 96 144
The key issue I have determined is that HF is not being pulled into the function after the initial call. I am not sure why - it should work just as the time value on the slider changes. The function itself clearly works though - it is definitely that HF is not being brought into def update_map.
The issue here was that the slider was inputting values like 9.19, for which there is no matching entry to filter on.
The way I solved this was to apply a floor with NumPy inside the datatime function, so that it only uses whole-number integer values.
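A rough sketch of that floor step, under some loud assumptions: the small hf frame below is a hypothetical stand-in for the real one (two detectors, three periods, values taken from the sample data), the slider value is assumed to map directly onto a Period label, and the lookup is simplified to .loc instead of the original filter/transpose chain.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for hf: one row per integer Period, one column per detector.
hf = pd.DataFrame(
    {"A": [88, 116, 84], "B": [88, 116, 76]},
    index=pd.Index([0, 1, 2], name="Period"),
)


def datatime(t, hf):
    """Return the detector values for the period containing slider value t."""
    # Floor the slider value (e.g. 1.19 -> 1) so it matches an existing Period label.
    period = int(np.floor(t))
    return hf.loc[period]


print(datatime(1.19, hf))
```

An alternative would be to give the slider an integer step so that fractional values never reach the callback in the first place.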

Pandas Slicing Between Dates Then Replace Values With Zero

I have the following DataFrame:
Channel Column 1 Column 2 Column 3
Date
12/30/2018 638 4472 487
12/31/2018 868 6985 540
1/1/2019 755 4401 829
1/2/2019 1655 9484 1145
1/3/2019 2002 14212 1158
1/4/2019 1633 9575 1098
1/5/2019 1026 5575 941
1/6/2019 1025 4963 1007
1/7/2019 1944 10685 1246
1/8/2019 2140 9932 1151
1/9/2019 2067 1031 1087
1/10/2019 2168 1005 1074
1/11/2019 2052 9371 909
1/12/2019 1223 5953 895
1/13/2019 1268 4809 827
I would like to return the following result if possible [essentially reduce values between certain dates in a specific column to zero]
Channel Column 1 Column 2 Column 3
Date
12/30/2018 638 4472 487
12/31/2018 868 6985 540
1/1/2019 755 4401 829
1/2/2019 1655 9484 1145
1/3/2019 2002 14212 1158
1/4/2019 1633 9575 1098
1/5/2019 1026 5575 941
1/6/2019 0 4963 1007
1/7/2019 0 10685 1246
1/8/2019 0 9932 1151
1/9/2019 0 1031 1087
1/10/2019 2168 1005 1074
1/11/2019 2052 9371 909
1/12/2019 1223 5953 895
1/13/2019 1268 4809 827
I am trying to filter by a specific column at specific dates, but I can't get it to work properly.
I have tried the following approaches, but I haven't had much luck
df[df['Channel'] == 'Branded Paid Search'].loc['1/6/2019':'1/9/2019']['Sessions'].apply(lambda x: 0 if x < 4000 else 0).to_frame()
This works, but not sure how to get the values back into the original dataframe.
I tried this:
def zero(df):
    if df[df['Column 1'] > 0].loc['1/6/2019':'1/9/2019']:
        return 0
    else:
        return 1
df.apply(zero, axis=1)
ValueError: ('The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().')
I tried this:
sessions_df[sessions_df['Column 1'] > 0].loc['1/6/2019':'1/9/2019'] = 0
Nothing changes.
Any help would be appreciated
First create a DatetimeIndex with to_datetime, and then set the values with DataFrame.loc:
df.index = pd.to_datetime(df.index)
df.loc['1/6/2019':'1/9/2019', 'Column 1'] = 0
print (df)
Column 1 Column 2 Column 3
Channel
2018-12-30 638 4472 487
2018-12-31 868 6985 540
2019-01-01 755 4401 829
2019-01-02 1655 9484 1145
2019-01-03 2002 14212 1158
2019-01-04 1633 9575 1098
2019-01-05 1026 5575 941
2019-01-06 0 4963 1007
2019-01-07 0 10685 1246
2019-01-08 0 9932 1151
2019-01-09 0 1031 1087
2019-01-10 2168 1005 1074
2019-01-11 2052 9371 909
2019-01-12 1223 5953 895
2019-01-13 1268 4809 827
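The same idea as a self-contained sketch, assuming a cut-down frame with only Column 1 and six of the dates from the question (the explicit format string and the ISO-style slice bounds are my own choices, not part of the answer above):

```python
import pandas as pd

# A few rows from the question, with the dates still stored as strings.
df = pd.DataFrame(
    {"Column 1": [1026, 1025, 1944, 2140, 2067, 2168]},
    index=["1/5/2019", "1/6/2019", "1/7/2019", "1/8/2019", "1/9/2019", "1/10/2019"],
)

# Convert the index to a DatetimeIndex so that date slicing works.
df.index = pd.to_datetime(df.index, format="%m/%d/%Y")

# Label-based slicing with .loc includes both end points and assigns
# directly into the original frame.
df.loc["2019-01-06":"2019-01-09", "Column 1"] = 0
print(df)
```

Selecting the rows and the column in one .loc call is what makes the assignment write into df itself; the earlier attempt filtered first and then assigned into the filtered result, which is a separate object.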

file output in python giving me garbage

When I run the following code I get garbage as output. It is just a simple program to find prime numbers. It works when the first for loop's range only goes up to 1000, but once the range becomes large the program fails to output meaningful data.
output = open("output.dat", 'w')
for i in range(2, 10000):
prime = 1
for j in range(2, i-1):
if i%j == 0:
prime = 0
j = i-1
if prime == 1:
output.write(str(i) + " " )
output.close()
print "writing finished"
This is a known Notepad bug. Check out
http://blogs.msdn.com/oldnewthing/archive/2007/04/17/2158334.aspx
The classic way to trigger this bug is to put "Bush hid the facts" in a file, save it, reopen it, and scream about conspiracy theories, but I guess "2 3 5 7 11 13 17" works too, except that you don't get to scream about conspiracy theories.
You're setting a single variable named prime ten thousand times to 1, then 9998 times possibly setting it to 0, and finally (if it's not been set to 0) outputting one incomplete line (no line-end). I suspect that's not what you want to do! Maybe something like...:
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            break
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
Note the very different indentation from what you had posted. I also used break to break out of an inner loop, which I think was what you meant where you wrote j = i - 1 (which would in fact have absolutely no effect since j would just be set to its next natural value in the very next leg of that inner loop, which would still run to the end).
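For readers on Python 3, the nested-loop structure can be sketched roughly as follows. The function names primes_below and write_primes are hypothetical, and the for/else construct and the with block are my own substitutions for the prime flag and the explicit close() in the code above.

```python
def primes_below(limit):
    """Return all primes p with 2 <= p < limit, found by trial division."""
    primes = []
    for i in range(2, limit):
        for j in range(2, i):
            if i % j == 0:
                break  # i has a divisor, so stop checking
        else:
            # The inner loop finished without break: no divisor was found.
            primes.append(i)
    return primes


def write_primes(path, limit):
    """Write the primes below limit to path, separated by spaces."""
    with open(path, "w") as output:
        output.write(" ".join(str(p) for p in primes_below(limit)))


print(primes_below(30))
```

Calling write_primes("output.dat", 10000) would then produce a file comparable to the one in the question, with the with block closing the file even if an error occurs.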
With fixed indentation (which I'll have to assume is a bad paste job, otherwise I don't think it would run) your code outputs fine for me :
2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039 1049 1051 1061 1063 1069 1087 1091 1093 1097 1103 1109 1117 1123 1129 1151 1153 1163 1171 1181 1187 1193 1201 1213 1217 1223 1229 1231 1237 1249 1259 1277 1279 1283 1289 1291 1297 1301 1303 1307 1319 1321 1327 1361 1367 1373 1381 1399 1409 1423 1427 1429 1433 1439 1447 1451 1453 1459 1471 1481 1483 1487 1489 1493 1499 1511 1523 1531 1543 1549 1553 1559 1567 1571 1579 1583 1597 1601 1607 1609 1613 1619 1621 1627 1637 1657 1663 1667 1669 1693 1697 1699 1709 1721 1723 1733 1741 1747 1753 1759 1777 1783 1787 1789 1801 1811 1823 1831 1847 1861 1867 1871 1873 1877 1879 1889 1901 1907 1913 1931 1933 1949 1951 1973 1979 1987 1993 1997 1999 2003 2011 2017 2027 2029 2039 2053 2063 2069 2081 2083 2087 2089 2099 2111 2113 2129 2131 2137 2141 2143 2153 2161 2179 2203 2207 2213 2221 2237 2239 2243 2251 2267 2269 2273 2281 2287 2293 2297 2309 2311 2333 2339 2341 2347 2351 2357 2371 2377 2381 2383 2389 2393 2399 2411 2417 2423 2437 2441 2447 2459 2467 2473 2477 2503 2521 2531 2539 2543 2549 2551 2557 2579 2591 2593 2609 2617 2621 2633 2647 2657 2659 2663 2671 2677 2683 2687 2689 2693 2699 2707 2711 2713 2719 2729 2731 2741 2749 2753 2767 2777 2789 2791 2797 2801 2803 2819 2833 2837 2843 2851 2857 2861 2879 2887 2897 2903 2909 2917 2927 2939 2953 2957 2963 2969 2971 2999 3001 3011 3019 3023 3037 3041 3049 3061 3067 
3079 3083 3089 3109 3119 3121 3137 3163 3167 3169 3181 3187 3191 3203 3209 3217 3221 3229 3251 3253 3257 3259 3271 3299 3301 3307 3313 3319 3323 3329 3331 3343 3347 3359 3361 3371 3373 3389 3391 3407 3413 3433 3449 3457 3461 3463 3467 3469 3491 3499 3511 3517 3527 3529 3533 3539 3541 3547 3557 3559 3571 3581 3583 3593 3607 3613 3617 3623 3631 3637 3643 3659 3671 3673 3677 3691 3697 3701 3709 3719 3727 3733 3739 3761 3767 3769 3779 3793 3797 3803 3821 3823 3833 3847 3851 3853 3863 3877 3881 3889 3907 3911 3917 3919 3923 3929 3931 3943 3947 3967 3989 4001 4003 4007 4013 4019 4021 4027 4049 4051 4057 4073 4079 4091 4093 4099 4111 4127 4129 4133 4139 4153 4157 4159 4177 4201 4211 4217 4219 4229 4231 4241 4243 4253 4259 4261 4271 4273 4283 4289 4297 4327 4337 4339 4349 4357 4363 4373 4391 4397 4409 4421 4423 4441 4447 4451 4457 4463 4481 4483 4493 4507 4513 4517 4519 4523 4547 4549 4561 4567 4583 4591 4597 4603 4621 4637 4639 4643 4649 4651 4657 4663 4673 4679 4691 4703 4721 4723 4729 4733 4751 4759 4783 4787 4789 4793 4799 4801 4813 4817 4831 4861 4871 4877 4889 4903 4909 4919 4931 4933 4937 4943 4951 4957 4967 4969 4973 4987 4993 4999 5003 5009 5011 5021 5023 5039 5051 5059 5077 5081 5087 5099 5101 5107 5113 5119 5147 5153 5167 5171 5179 5189 5197 5209 5227 5231 5233 5237 5261 5273 5279 5281 5297 5303 5309 5323 5333 5347 5351 5381 5387 5393 5399 5407 5413 5417 5419 5431 5437 5441 5443 5449 5471 5477 5479 5483 5501 5503 5507 5519 5521 5527 5531 5557 5563 5569 5573 5581 5591 5623 5639 5641 5647 5651 5653 5657 5659 5669 5683 5689 5693 5701 5711 5717 5737 5741 5743 5749 5779 5783 5791 5801 5807 5813 5821 5827 5839 5843 5849 5851 5857 5861 5867 5869 5879 5881 5897 5903 5923 5927 5939 5953 5981 5987 6007 6011 6029 6037 6043 6047 6053 6067 6073 6079 6089 6091 6101 6113 6121 6131 6133 6143 6151 6163 6173 6197 6199 6203 6211 6217 6221 6229 6247 6257 6263 6269 6271 6277 6287 6299 6301 6311 6317 6323 6329 6337 6343 6353 6359 6361 6367 6373 6379 6389 6397 6421 6427 6449 6451 6469 
6473 6481 6491 6521 6529 6547 6551 6553 6563 6569 6571 6577 6581 6599 6607 6619 6637 6653 6659 6661 6673 6679 6689 6691 6701 6703 6709 6719 6733 6737 6761 6763 6779 6781 6791 6793 6803 6823 6827 6829 6833 6841 6857 6863 6869 6871 6883 6899 6907 6911 6917 6947 6949 6959 6961 6967 6971 6977 6983 6991 6997 7001 7013 7019 7027 7039 7043 7057 7069 7079 7103 7109 7121 7127 7129 7151 7159 7177 7187 7193 7207 7211 7213 7219 7229 7237 7243 7247 7253 7283 7297 7307 7309 7321 7331 7333 7349 7351 7369 7393 7411 7417 7433 7451 7457 7459 7477 7481 7487 7489 7499 7507 7517 7523 7529 7537 7541 7547 7549 7559 7561 7573 7577 7583 7589 7591 7603 7607 7621 7639 7643 7649 7669 7673 7681 7687 7691 7699 7703 7717 7723 7727 7741 7753 7757 7759 7789 7793 7817 7823 7829 7841 7853 7867 7873 7877 7879 7883 7901 7907 7919 7927 7933 7937 7949 7951 7963 7993 8009 8011 8017 8039 8053 8059 8069 8081 8087 8089 8093 8101 8111 8117 8123 8147 8161 8167 8171 8179 8191 8209 8219 8221 8231 8233 8237 8243 8263 8269 8273 8287 8291 8293 8297 8311 8317 8329 8353 8363 8369 8377 8387 8389 8419 8423 8429 8431 8443 8447 8461 8467 8501 8513 8521 8527 8537 8539 8543 8563 8573 8581 8597 8599 8609 8623 8627 8629 8641 8647 8663 8669 8677 8681 8689 8693 8699 8707 8713 8719 8731 8737 8741 8747 8753 8761 8779 8783 8803 8807 8819 8821 8831 8837 8839 8849 8861 8863 8867 8887 8893 8923 8929 8933 8941 8951 8963 8969 8971 8999 9001 9007 9011 9013 9029 9041 9043 9049 9059 9067 9091 9103 9109 9127 9133 9137 9151 9157 9161 9173 9181 9187 9199 9203 9209 9221 9227 9239 9241 9257 9277 9281 9283 9293 9311 9319 9323 9337 9341 9343 9349 9371 9377 9391 9397 9403 9413 9419 9421 9431 9433 9437 9439 9461 9463 9467 9473 9479 9491 9497 9511 9521 9533 9539 9547 9551 9587 9601 9613 9619 9623 9629 9631 9643 9649 9661 9677 9679 9689 9697 9719 9721 9733 9739 9743 9749 9767 9769 9781 9787 9791 9803 9811 9817 9829 9833 9839 9851 9857 9859 9871 9883 9887 9901 9907 9923 9929 9931 9941 9949 9967 9973
EDIT the version of indentation I ran:
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            j = i-1
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
Your second for should be nested in the first for.
Also, this looks like a homework question. It is not clear how your output is garbage - does it not compute what you want? Or is the output scrambled? Post a copy of the output so we can see!
Don't you want your loops to be nested?
output = open("output.dat", 'w')
for i in range(2, 10000):
    prime = 1
    for j in range(2, i-1):
        if i%j == 0:
            prime = 0
            j = i-1
    if prime == 1:
        output.write(str(i) + " " )
output.close()
print "writing finished"
So, you set prime to 1, 9998 times,
then you use the final value of i (10000?, 10001?) as an end value
....
To summarize, you have serious indentation problems.
