import cgi
form = cgi.FieldStorage()
test = form['name'].value
if test is None:
    print('empty')
else:
    print('Hello ' + test)
... and that doesn't seem to display anything when my URL is something like .../1.py
If I set it to .../1.py?name=asd it will display Hello asd.
Also, how do I get everything after the question mark, and after the domain name? For example, if I access http://localhost/thisis/test I want to get /thisis/test.
Edit: I tried to use try: and I couldn't get it working.
To answer the first part of my question: I found what the problem is, and this is the correct code:
import cgi
form = cgi.FieldStorage()
try:
    test = form['name'].value
except KeyError:
    print('not found')
else:
    print(test)
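As an aside, cgi.FieldStorage also has a getvalue() method that returns a default instead of raising KeyError. The underlying query-string lookup can be sketched with the stdlib urllib.parse (the parameter name name is just the one from the example above):

```python
from urllib.parse import parse_qs

# parse_qs() maps each parameter name to a list of values
query = parse_qs('name=asd')                 # QUERY_STRING with the parameter present
print(query.get('name', ['not found'])[0])   # asd

query = parse_qs('')                         # QUERY_STRING missing entirely
print(query.get('name', ['not found'])[0])   # not found
```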
For my second question:
import os
print(os.environ["REQUEST_URI"])
I know a similar error has been reported, but I checked all the online solutions and none of them work, so I decided to open a new post.
I was running the following code from an online course. It is supposed to work, however, it always reports the following error when running on my machine:
----> 7 input_str = input('Enter location: ')
TypeError: 'str' object is not callable
Below is the whole block of code:
import urllib.request, urllib.parse, urllib.error
import xml.etree.ElementTree as ET
serviceurl = 'http://maps.googleapis.com/maps/api/geocode/xml?'
while True:
    input_str = input('Enter location: ')
    if len(input_str) < 1: break

    url = serviceurl + urllib.parse.urlencode({'address': input_str})

    print('Retrieving', url)
    uh = urllib.request.urlopen(url)
    data = uh.read()
    print('Retrieved', len(data), 'characters')
    print(data.decode())
    tree = ET.fromstring(data)

    results = tree.findall('result')
    lat = results[0].find('geometry').find('location').find('lat').text
    lng = results[0].find('geometry').find('location').find('lng').text
    location = results[0].find('formatted_address').text

    print('lat', lat, 'lng', lng)
    print(location)
Thanks in advance!
You redefined the built-in function input to a string somewhere in your code (not necessarily in the posted code fragment) by executing something like this:
input = ....
You can fix this by restarting the Python interpreter, or by executing del input in the same session to remove your binding, so that the name resolves to the built-in again. Then make sure that no code you execute assigns to input or to any other identifier that refers to a built-in function.
I see you're trying to put a string in input_str. Have you already tried:
input_str = raw_input('Enter location: ')
Note that raw_input() exists only in Python 2, where it reads the entry as a string (Python 2's input() evaluates the entry as an expression instead). In Python 3, raw_input() was renamed to input().
Regards!
Have you tried renaming the variable? Instead of input_str, try something like user_input. ('str' is a built-in type name rather than a reserved word, so input_str is technically legal, but shadowing built-in names is worth avoiding in general.)
I am trying to download books from "http://www.gutenberg.org/". I want to know why my code gets nothing.
import requests
import re
import os
import urllib
def get_response(url):
    response = requests.get(url).text
    return response

def get_content(html):
    reg = re.compile(r'(<span class="mw-headline".*?</span></h2><ul><li>.*</a></li></ul>)', re.S)
    return re.findall(reg, html)

def get_book_url(response):
    reg = r'a href="(.*?)"'
    return re.findall(reg, response)

def get_book_name(response):
    reg = re.compile('>.*</a>')
    return re.findall(reg, response)

def download_book(book_url, path):
    path = ''.join(path.split())
    path = 'F:\\books\\{}.html'.format(path)  # my local file path
    if not os.path.exists(path):
        urllib.request.urlretrieve(book_url, path)
        print('ok!!!')
    else:
        print('no!!!')

def get_url_name(start_url):
    content = get_content(get_response(start_url))
    for i in content:
        book_url = get_book_url(i)
        if book_url:
            book_name = get_book_name(i)
            try:
                download_book(book_url[0], book_name[0])
            except:
                continue

def main():
    get_url_name(start_url)

if __name__ == '__main__':
    start_url = 'http://www.gutenberg.org/wiki/Category:Classics_Bookshelf'
    main()
I have run the code and get nothing, no tracebacks. How can I download the books automatically from the website?
I have run the code and get nothing, no tracebacks.
Well, there's no chance you get a traceback in the case of an exception in download_book(), since you explicitly silence them:
try:
    download_book(book_url[0], book_name[0])
except:
    continue
So the very first thing you want to do is to at least print out errors:
try:
    download_book(book_url[0], book_name[0])
except Exception as e:
    print("while downloading book {}: got error {}".format(book_url[0], e))
    continue
or just don't catch the exception at all (at least until you know what to expect and how to handle it).
I don't even know how to fix it
Learning how to debug is actually even more important than learning how to write code. For a general introduction, you want to read this first.
For something more python-specific, here are a couple ways to trace your program execution:
1/ add print() calls at the important places to inspect what you really get
2/ import your module in the interactive python shell and test your functions in isolation (this is easier when none of them depend on global variables)
3/ use the builtin step debugger
Now there are a few obvious issues with your code:
1/ you don't test the result of requests.get() - an HTTP request can fail for quite a few reasons, and the fact that you get a response doesn't mean you got the expected response (you could have a 400+ or 500+ response as well).
2/ you use regexps to parse html. DON'T - regexps cannot reliably parse html; you want a proper HTML parser instead (BeautifulSoup is the canonical solution for web scraping, as it's very tolerant). Also, some of your regexps look quite wrong (greedy match-alls etc.).
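On point 1/, requests provides raise_for_status() for exactly this check. A small sketch, with the failure path demonstrated on a hand-built Response so it runs without the network:

```python
import requests

def get_response(url):
    response = requests.get(url)
    response.raise_for_status()   # raises requests.HTTPError on 4xx/5xx status codes
    return response.text

# Demonstrate the failure path on a hand-built Response (no network needed)
r = requests.models.Response()
r.status_code = 404
try:
    r.raise_for_status()
except requests.HTTPError as e:
    print('bad response:', e)
```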
start_url is not defined in main()
You need to use a global variable; otherwise, a better (cleaner) approach is to pass in the variable you are using. In any case, I would expect an error here: start_url is not defined.
def main(start_url):
    get_url_name(start_url)

if __name__ == '__main__':
    start_url = 'http://www.gutenberg.org/wiki/Category:Classics_Bookshelf'
    main(start_url)
EDIT:
Never mind, the problem is in this line: content = get_content(get_response(start_url))
The regex in get_content() does not seem to match anything. My suggestion would be to use BeautifulSoup, from bs4 import BeautifulSoup. For any information regarding why you shouldn't parse html with regex, see this answer RegEx match open tags except XHTML self-contained tags
Asking regexes to parse arbitrary HTML is like asking a beginner to write an operating system
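To show what the BeautifulSoup approach looks like on the kind of markup the regexp was aimed at, here is a minimal sketch over a literal HTML snippet (the snippet itself is made up for illustration):

```python
from bs4 import BeautifulSoup

# A made-up fragment shaped like the bookshelf page's section markup,
# including the newline that broke the regexp
html = '''<h2><span class="mw-headline">Classics</span></h2>
<ul><li><a href="/ebooks/1342">Pride and Prejudice</a></li></ul>'''

soup = BeautifulSoup(html, 'html.parser')
for a in soup.find_all('a', href=True):       # every link, newlines or not
    print(a['href'], '->', a.get_text())      # /ebooks/1342 -> Pride and Prejudice
```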
As others have said, you get no output because your regex doesn't match anything. The text returned by the initial url has got a newline between </h2> and <ul>, try this instead:
r'(<span class="mw-headline".*?</span></h2>\n<ul><li>.*</a></li></ul>)'
When you fix that one, you will face another error, I suggest some debug printouts like this:
def get_url_name(start_url):
    content = get_content(get_response(start_url))
    for i in content:
        print('[DEBUG] Handling:', i)
        book_url = get_book_url(i)
        print('[DEBUG] book_url:', book_url)
        if book_url:
            book_name = get_book_name(i)
            try:
                print('[DEBUG] book_url[0]:', book_url[0])
                print('[DEBUG] book_name[0]:', book_name[0])
                download_book(book_url[0], book_name[0])
            except:
                continue
Hello everyone. I'm searching for a specific sentence or word in an HTML page after making a web request.
The sentence = Couldn't resolve host 'http:'
I tried to code a script using pycurl with b.getvalue(), but it doesn't seem to work.
Website to try: http://www.moorelandpartners.com/plugins/system/plugin_googlemap2_proxy.php
Code:
http://pastebin.com/qAUjv1ux
I would like to search for the whole sentence, or maybe just the word "http" or "Couldn't".
Thanks for your help.
This seems to work (it uses the 'in' operator, as I suggested in my comment):
import pycurl
import StringIO
import sys
import time

ip = "http://www.moorelandpartners.com/plugins/system/plugin_googlemap2_proxy.php"

c = pycurl.Curl()
b = StringIO.StringIO()
c.setopt(pycurl.WRITEFUNCTION, b.write)
c.setopt(pycurl.TIMEOUT, 10)         # Note 1
c.setopt(pycurl.CONNECTTIMEOUT, 10)  # Note 1
c.setopt(c.URL, ip)

try:
    c.perform()
except Exception:
    print "No ", ip
else:
    html = b.getvalue()
    if "Couldn't resolve host" in html:  # Note 2
        print "{0} FOUND ".format(ip)    # Note 3
    else:
        print "not found"
What I did:
Note 1: Increased the timeouts - for some reason the setting of "1" didn't work for me.
Note 2: Used the 'in' operator to test that the returned page contains the words we are looking for.
Note 3: Removed the references to bcolors.OKGREEN and bcolors.ENDC, as your bcolors was not defined.
When I tested this on my PC it "worked" - i.e. it stated that it found the web page, and it found the relevant text.
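The membership test itself is plain Python and works on any string, however it was fetched:

```python
html = "<body>Couldn't resolve host 'http:'</body>"   # stand-in for b.getvalue()
print("Couldn't resolve host" in html)                # True
print("some other text" in html)                      # False
```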
Does anyone know how I could take the URL as an argument in Python (as page)? I just want to read the user's input from the shell and pass it through as an argument, to make the script more portable.
import sys, re
import webpage_get
def print_links(page):
    ''' find all hyperlinks on a webpage passed in as input and
    print them '''
    print '\n[*] print_links()'
    links = re.findall(r'(\http://\w+\.\w+[-_]*\.*\w+\.*?\w+\.*?\w+\.*[//]*\.*?\w+ [//]*?\w+[//]*?\w+)', page)
    # sort and print the links
    links.sort()
    print '[+]', str(len(links)), 'HyperLinks Found:'
    for link in links:
        print link

def main():
    # temp testing url argument
    sys.argv.append('http://www.4chan.org')
    # Check args
    if len(sys.argv) != 2:
        print '[-] Usage: webpage_getlinks URL'
        return
    # Get the web page
    page = webpage_get.wget(sys.argv[1])
    # Get the links
    print_links(page)

if __name__ == '__main__':
    main()
It looks like you've kind of gotten started with command line arguments, but just to give you an example for this specific situation, you could do something like this:
def main(url):
    page = webpage_get.wget(url)
    print_links(page)

if __name__ == '__main__':
    url = ""
    if len(sys.argv) >= 2:
        url = sys.argv[1]  # sys.argv[0] is the script name itself
    main(url)
Then run it from the shell like this:
python test.py http://www.4chan.org
Here is a tutorial on command line arguments which may help your understanding more than this snippet http://www.tutorialspoint.com/python/python_command_line_arguments.htm
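For anything beyond a single positional argument, the stdlib argparse module handles usage messages and validation for you. A minimal sketch (the URL is just the test value from above):

```python
import argparse

parser = argparse.ArgumentParser(description='Print hyperlinks found on a web page')
parser.add_argument('url', help='URL of the page to fetch')

# parse_args() normally reads sys.argv[1:]; pass a list here to demonstrate
args = parser.parse_args(['http://www.4chan.org'])
print(args.url)   # http://www.4chan.org
```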
Can you let me know if I misunderstood your question? I didn't feel too confident about its meaning after I read it.
I am trying to search through http://www.wegottickets.com/ with the keywords "Live music", but the returned result is still the main page, not the search result page with all the live music information. Could anyone show me what the problem is?
from urllib2 import urlopen
from ClientForm import ParseResponse

response = urlopen("http://www.wegottickets.com/")
forms = ParseResponse(response, backwards_compat=False)
form = forms[0]

form.set_value("Live music", name="unified_query")
form.set_all_readonly(False)

control = form.find_control(type="submit")
print control.disabled
print control.readonly

#print form

request2 = form.click()
try:
    response2 = urlopen(request2)
except:
    print "Unsuccessful query"

print response2.geturl()
print response2.info()
print response.read()
response2.close()
Thank you very much!
I've never used ClientForm myself, but I've had success with the Python mechanize module, if it turns out to be a fault in ClientForm.
However, as a first step, I'd suggest removing your try...except wrapper. What you're basically doing is saying "catch any error, then ignore the actual error and print 'Unsuccessful Query' instead". Not helpful for debugging. The exception will stop the program and print a useful error message, if you don't get in its way.
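A quick sketch of why the bare except hurts: binding the exception (or not catching it at all) keeps the actual error visible. The failing expression below is just a stand-in for the urlopen() call:

```python
try:
    response2 = 1 / 0                 # stand-in for urlopen(request2) failing
except Exception as e:                # bind the exception instead of discarding it
    print('Unsuccessful query:', e)   # Unsuccessful query: division by zero
```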