flask ValueError: I/O operation on closed file - python

I have code that scrapes a web page, saves the scraped data to an HTML file, and displays it as a separate page. Here is the code:
from flask import Flask, render_template, request
from bs4 import BeautifulSoup
import urllib.request
import sys, os
app = Flask(__name__)
@app.route('/')
def index():
    return render_template('index.html')
@app.route('/result', methods=['POST'])
def result():
    if request.method == 'POST':
        result = request.form.get("search")
        link = "https://xyz.comindex?search="
        url = (link+result)
        print(url)
        try:
            page = urllib.request.urlopen(url)
            soup = BeautifulSoup(page, 'html.parser')
            test = soup.findAll('div', attrs={"class": "search-inner-wrapper"})
            sys.stdout = open("tests.html", "w")
            print(test)
            sys.stdout.close()
            return render_template("SearchResults.html", test=test)
        except:
            print("An error occured.")
            return render_template("test.html", test=test)
if __name__ == '__main__':
    app.run(use_reloader = True, debug = True)
My problem is that this code works perfectly, but only once. When I reload the index page and perform another search query, I get
ValueError: I/O operation on closed file.
I can't figure out a workaround, since I have to use a single file every time and do not want the new results appended to the existing content.

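You are rebinding sys.stdout to the file object you opened and then closing it. From then on, every print() in the process, including the print(url) at the start of the next request, writes to a closed file, which raises the ValueError. Use another name for your file object instead of overwriting sys.stdout, and don't close sys.stdout. Closing the file object you create yourself is fine.
Example of opening a file and reading it, then opening a file and writing it: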
bjb@blueeyes:~$ cat /tmp/infile.html
<html>
<head>
</head>
<body>
<div class="search-inner-wrapper">fleeble flobble</div>
</body>
</html>
bjb@blueeyes:~$ cat /tmp/file2.py
#!/usr/bin/env python3
with open('/tmp/infile.html', 'r') as infile:
    page = infile.readlines()
with open('/tmp/outfile.html', 'w') as ofd:
    ofd.write(''.join(page))
bjb@blueeyes:~$ /tmp/file2.py
bjb@blueeyes:~$ cat /tmp/outfile.html
<html>
<head>
</head>
<body>
<div class="search-inner-wrapper">fleeble flobble</div>
</body>
</html>
The first line of /tmp/file2.py is the shebang, which tells the shell to run the file with Python 3.
The next two lines open /tmp/infile.html for reading and bind the open file object to the variable "infile". Then all the lines in /tmp/infile.html are read into a list of strings.
When we leave that "with" block, the file is closed for us.
In the next two lines, we open /tmp/outfile.html for writing and use the variable ofd ("output file descriptor") to hold the file object. We use ofd to write the lines in the list "page" to that file. Once we leave that second "with" block, the output file is closed for us. Then the program exits. My last command dumps the contents of /tmp/outfile.html, which you can see are the same as infile.html.
If you want to open and close files without using those with blocks, you can:
infile = open('/tmp/infile.html', 'r')
page = infile.readlines()
infile.close()
ofd = open('/tmp/outfile.html', 'w')
ofd.write(''.join(page))
ofd.close()
Hopefully that will work in a flask script ...
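Applied to the code in the question, the same idea might look like the sketch below. It is only an illustration: save_results is a hypothetical helper name (not part of Flask), and the default file name is taken from the question. Opening with mode "w" truncates the file each time, so results are replaced rather than appended, and sys.stdout is never touched.

```python
import sys


def save_results(results, path="tests.html"):
    """Write scraped results to `path`, replacing any previous content.

    Hypothetical helper for illustration; call it from the Flask view
    instead of reassigning sys.stdout.
    """
    # Mode "w" truncates the file, so each call starts from an empty file.
    with open(path, "w", encoding="utf-8") as out:
        # print() accepts a file= argument, so sys.stdout stays untouched.
        print(results, file=out)
    # Leaving the "with" block closes only our own file object.
```

Inside the view, the three lines that reassign, print to, and close sys.stdout would then collapse into a single call such as save_results(test).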

Related

Why does my Python program close after running the first loop?

I'm new to Python and scraping. I'm trying to run two loops. One goes and scrapes ids from one page. Then, using those ids, I call another API to get more info/properties.
But when I run this program, it just runs the first bit fine (gets the IDs), but then it closes and doesn't run the 2nd part. I feel I'm missing something really basic about control flow in Python here. Why does Python close after the first loop when I run it in Terminal?
import requests
import csv
import time
import json
from bs4 import BeautifulSoup, Tag
file = open('parcelids.csv','w')
writer = csv.writer(file)
writer.writerow(['parcelId'])
for x in range(1,10):
    time.sleep(1) # slowing it down
    url = 'http://apixyz/Parcel.aspx?Pid=' + str(x)
    source = requests.get(url)
    response = source.content
    soup = BeautifulSoup(response, 'html.parser')
    parcelId = soup.find("span", id="MainContent_lblMblu").text.strip()
    writer.writerow([parcelId])
out = open('mapdata.csv','w')
with open('parcelIds.csv', 'r') as in1:
    reader = csv.reader(in1)
    writer = csv.writer(out)
    next(reader, None) # skip header
    for row in reader:
        row = ''.join(row[0].split())[:-2].upper().replace('/','-') #formatting
        url="https://api.io/api/properties/"
        url1=url+row
        time.sleep(1) # slowing it down
        response = requests.get(url1)
        resp_json_payload = response.json()
        address = resp_json_payload['property']['address']
        writer.writerow([address])
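If you are running on Windows (where filenames are not case sensitive), then the file you have open for writing (parcelids.csv) is still open when you reopen it as parcelIds.csv to read from it. Rows that are still sitting in the write buffer have not reached the disk yet, so the reader may see an empty file and the second loop has nothing to iterate over.
Try closing the file before opening it to read from it.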
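A minimal sketch of that advice, with the network calls left out: the write phase uses a "with" block so the file is flushed and closed before the read phase opens it again. The function names write_ids and read_ids are made up for this illustration.

```python
import csv


def write_ids(path, ids):
    """Write a header row and one id per row, then close the file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["parcelId"])
        for parcel_id in ids:
            writer.writerow([parcel_id])
    # The file is flushed and closed here, before anyone reads it.


def read_ids(path):
    """Read the ids back, skipping the header row."""
    with open(path, "r", newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip header
        return [row[0] for row in reader]
```

Using exactly the same spelling of the file name in both calls also avoids depending on a case-insensitive file system.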

How to download a file in Python (Jinja2) on-click Export button?

I have an Export button:
<button class="aptButton" formaction="/export/" type="submit">export</button>
and I have this in /export/index.cgi:
#! /apollo/sbin/envroot $ENVROOT/bin/python
# -*- coding: utf-8 -*-
import cgitb
cgitb.enable()
import cgi
def main():
    print "Content-Type: text/html"
    print
    form = cgi.FieldStorage()
    results = helpers.getResults()
    environment = helpers.get_environment()
    print environment.get_template('export.html').render(
        results = results)
main()
and I have this in my export.html
<!doctype html>
{% for id in results %}
{{ write_results_to_file(id) }}
{% endfor %}
I am trying to download the results as a tab-separated file, so I thought of writing to a local file and then sending (downloading) that file, but I am not sure how to do the download part. I can't use Flask or Django, which have some good libraries for this. Is there any other library I can use to download the results to a tab-delimited file on the user's desktop?
export.py
def write_results_to_file(result):
    local_filename = "/home/testing.txt"
    # NOTE the stream=True parameter
    with open(local_filename, 'w') as f:
        f.write('\t'.join(result) + '\n')
If you're using good old-fashioned CGI to produce a tab-separated file,
all you need to do is print an appropriate header and then print the content on stdout, something like this:
def main():
    form = cgi.FieldStorage()
    results = helpers.getResults()
    print "Content-Type: text/plain"
    print "Content-Disposition: attachment; filename=testing.txt"
    print
    for result in results:
        # print already appends a newline, so none is added here
        print '\t'.join(result)
main()
The essential parts are the 2 lines that print the header,
followed by a blank line to separate from the content,
followed by the plain text content.
If you want to make this happen on the click of an Export button,
then you can, for example:
Make the Export button a link to another URL endpoint that will use the example script I put above
Or, use the same script, with a conditional statement on form parameters to decide to print the front page, or to print the content using the example script above
Let me know if you need further help.
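For readers on Python 3, the same header, blank line, body layout can be sketched as a pure function that builds the whole response text. This is an illustration only: build_tsv_response is a hypothetical name, and the rows are assumed to be sequences of values that can be converted with str().

```python
def build_tsv_response(results, filename="testing.txt"):
    """Return CGI response text: headers, a blank line, then tab-separated rows.

    Hypothetical helper for illustration; `results` is assumed to be an
    iterable of rows, each row an iterable of values.
    """
    headers = [
        "Content-Type: text/plain",
        # "attachment" asks the browser to download instead of display.
        'Content-Disposition: attachment; filename="{}"'.format(filename),
    ]
    body = "".join("\t".join(str(v) for v in row) + "\n" for row in results)
    # One empty line separates the headers from the content.
    return "\n".join(headers) + "\n\n" + body
```

A CGI script would write the returned string to standard output, for example with sys.stdout.write(build_tsv_response(results)).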

Write the output of a script in a html page using flask

I wrote a function that runs a script and collects the output I want. Now I am trying to write that output to an HTML page. How can I do that? This is my script:
def execute(cmd):
    os.system(cmd)
    to_return = dict()
    for filename in files:
        with open(filename, 'r') as f:
            data = f.read()
            to_return[filename] = data
    return to_return
output = execute('./script')
print output
Any idea how I can generate an HTML page that shows the result of running this script?
In your views.py, under the corresponding route, do
@app.route('/route_name')
def script_output():
    output = execute('./script')
    return render_template('template_name.html', output=output)
And in your template,
<p>{{ output }}</p>
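Because execute() returns a dict, {{ output }} will show the dict's text representation in one paragraph. If one block per file is preferred, the idea can be sketched without any template engine at all, using only the standard library. Note that this swaps in plain string building in place of the Jinja2 template above, and render_outputs is a made-up name for the illustration.

```python
from html import escape


def render_outputs(outputs):
    """Turn a {filename: text} dict into HTML, one heading and <pre> per file.

    Hypothetical helper; values are escaped so that file contents such as
    "<b>" are shown literally instead of being interpreted as markup.
    """
    parts = []
    for filename, data in outputs.items():
        parts.append(
            "<h2>{}</h2>\n<pre>{}</pre>".format(escape(filename), escape(data))
        )
    return "\n".join(parts)
```

In a Jinja2 template the equivalent would be a loop over the dict's items, with escaping handled by the template engine when autoescaping is enabled.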

How to implement the 'tempfile' module to this code

This is a small widget I am designing to 'browse' while circumventing proxy settings. I was told on Code Review that the tempfile module would be beneficial here, but I am struggling to fit it into my program's current logic. Here is the code:
import urllib.request
import webbrowser
import os
import tempfile
location = os.path.dirname(os.path.abspath(__file__))
proxy_handler = urllib.request.ProxyHandler(proxies=None)
opener = urllib.request.build_opener(proxy_handler)
def navigate(query):
    response = opener.open(query)
    html = response.read()
    return html
def parse(data):
    start = str(data)[2:-1]
    lines = start.split('\\n')
    return lines
while True:
    url = input("Path: ")
    raw_data = navigate(url)
    content = parse(raw_data)
    with open('cache.html', 'w') as f:
        f.writelines(content)
    webbrowser.open_new_tab(os.path.join(location, 'cache.html'))
Hopefully someone who has worked with these modules before can help me. The reason that I want to use tempfile is that my program gets raw html, parses it and stores it in a file. This file is overwritten every time a new input comes in, and would ideally be deleted when the program stops running. Also, the file doesn't have to exist when the program initializes so it seems logical from that view also.
Since you are passing the name of the file to webbrowser.open_new_tab(), you should use a NamedTemporaryFile:
cache = tempfile.NamedTemporaryFile()
...
cache.seek(0)
cache.writelines(bytes(line, 'UTF-8') for line in content)
cache.seek(0)
webbrowser.open_new_tab('file://' + cache.name)
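A slightly fuller sketch of the same approach follows. It adds two things the snippet above leaves out: truncating the file, so a shorter page does not leave the tail of a longer one behind, and flushing, so the browser sees the bytes. write_cache is a hypothetical helper name, and the cache file is assumed to be opened in binary mode, which is the default for NamedTemporaryFile.

```python
import tempfile
from pathlib import Path


def write_cache(cache, lines):
    """Replace the contents of an open binary temp file; return its file:// URI."""
    cache.seek(0)
    cache.truncate()  # drop leftovers from a longer previous page
    cache.writelines(line.encode("utf-8") for line in lines)
    cache.flush()     # push buffered bytes out so other processes can read them
    return Path(cache.name).as_uri()
```

One way to use it is to create the file once before the loop, for example with tempfile.NamedTemporaryFile(suffix=".html"), and pass the returned URI to webbrowser.open_new_tab() on each iteration. With the default settings the file is deleted when it is closed; on Windows, another program may not be able to open the file while this one still holds it open.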

Loading mako templates from files

I'm new to python and currently trying to use mako templating.
I want to be able to take an html file and add a template to it from another html file.
Let's say I got this index.html file:
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Hello, ${name}!</p>
</body>
</html>
and this name.html file:
world
(yes, it just has the word world inside).
I want the ${name} in index.html to be replaced with the content of the name.html file.
I've been able to do this without the name.html file, by stating in the render method what name is, using the following code:
@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['html'])
    mytemplate = mylookup.get_template('hello/index.html')
    return mytemplate.render(name='world')
This is obviously not useful for larger pieces of text. Now all I want is to simply load the text from name.html, but haven't yet found a way to do this. What should I try?
return mytemplate.render(name=open(<path-to-file>).read())
Thanks for the replies.
The idea is to use the mako framework since it does things like cache and check if the file has been updated...
This code seems to work in the end:
@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['.'])
    mytemplate = mylookup.get_template('index.html')
    temp = mylookup.get_template('name.html').render()
    return mytemplate.render(name=temp)
Thanks again.
Did I understand you correctly that all you want is to read the content from a file? If you want to read the complete content, use something like this (Python >= 2.5):
from __future__ import with_statement
with open(my_file_name, 'r') as fp:
    content = fp.read()
Note: The from __future__ line has to be the first line in your .py file (or right after the content encoding specification that can be placed in the first line)
Or the old approach:
fp = open(my_file_name, 'r')
try:
    content = fp.read()
finally:
    fp.close()
If your file contains non-ascii characters, you should also take a look at the codecs page :-)
Then, based on your example, the last section could look like this:
from __future__ import with_statement
@route(':filename')
def static_file(filename):
    mylookup = TemplateLookup(directories=['html'])
    mytemplate = mylookup.get_template('hello/index.html')
    content = ''
    with open('name.html', 'r') as fp:
        content = fp.read()
    return mytemplate.render(name=content)
You can find more details about the file object in the official documentation :-)
There is also a shortcut version:
content = open('name.html').read()
But I personally prefer the long version with the explicit closing :-)
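On Python 3, the remark about non-ASCII characters can be handled directly by open(), which takes an encoding argument. The sketch below is only an illustration: read_fragment is a made-up helper name, and UTF-8 is an assumption about how name.html was saved.

```python
def read_fragment(path, encoding="utf-8"):
    """Return the full text of `path`, decoded with an explicit encoding.

    Hypothetical helper; the "with" block closes the file even if
    reading raises an exception.
    """
    with open(path, "r", encoding=encoding) as fp:
        return fp.read()
```

The result could then be passed to the template as in the examples above, for instance mytemplate.render(name=read_fragment('name.html')). If the file ends with a newline that should not appear in the page, calling .strip() on the result removes it.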
