I'm probably overlooking something spectacularly obvious, but I can't find why the following is happening.
I'm trying to POST a search query to http://www.arcade-museum.com using the requests lib and whenever the query contains spaces, the resulting page contains no results. Compare the result of these snippets:
import requests
url = "http://www.arcade-museum.com/results.php"
payload = {'type': 'Videogame', 'q': '1942'}
r = requests.post(url, payload)
with open("search_results.html", mode="wb") as f:
    f.write(r.content)
and
import requests
url = "http://www.arcade-museum.com/results.php"
payload = {'type': 'Videogame', 'q': 'Wonder Boy'}
r = requests.post(url, payload)
with open("search_results.html", mode="wb") as f:
    f.write(r.content)
If you try the same query on the website, the latter returns a list of about 10 games. The same happens when posting the form data using the Postman REST client Chrome extension.
Again, it's probably something very obvious I'm overlooking, but I can't find what's causing this issue.
I want to know how to retrieve data using a QID URL. I have some names, and I use the Falcon 2.0 entity linker (its curl command, converted into a Python script) to get the QID for each name. Now I want to use that QID to look up information about the person, such as gender (male or female), aliases, or other properties. Can someone suggest how to approach this? The code that gets the QID URL is given below. Falcon 2.0 is available at https://github.com/SDM-TIB/Falcon2.0.
import requests
import json
response_list=[]
person_names=[]
if __name__ == '__main__':
    limit = 100
    with open(filename, 'r') as in_file:
        in_reader = in_file.readlines()
        for data in in_reader:
            if limit > 0:
                person_names.append(data.rstrip())
                limit -= 1
            else:
                break
    """
    Url of post request and header of type json create linking against each line of text.
    """
    url = "https://labs.tib.eu/falcon/falcon2/api?mode=long"
    headers = {'Content-type': 'application/json'}
    for name in person_names:
        data = {"text": name}
        data_json = json.dumps(data)
        response = requests.post(url, data=data_json, headers=headers)
        print(response.content)
It gives output as http://www.wikidata.org/entity/Q42493 for the entity.
You can convert URLs of the form http://www.wikidata.org/entity/Q42493 to https://www.wikidata.org/wiki/Special:EntityData/Q42493.json to get a JSON payload with the information that you seek, but first you should make sure that the entity resolution algorithm is giving you accurate results so that you have the correct QID to start with.
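As a rough sketch of that conversion, the snippet below rewrites the entity URL and then pulls the aliases and the gender QID out of the JSON document. The helper names (entity_data_url, extract_person_info) are made up for illustration. The code assumes the usual Wikidata JSON layout, in which P21 is the "sex or gender" property, and it leaves the actual download to the caller.

```python
DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def entity_data_url(entity_url):
    """Turn .../entity/Q42493 into the Special:EntityData JSON URL."""
    qid = entity_url.rstrip("/").rsplit("/", 1)[-1]
    return DATA_URL.format(qid=qid)


def extract_person_info(payload, qid, lang="en"):
    """Return aliases and gender QIDs from an EntityData payload (a dict)."""
    entity = payload["entities"][qid]
    aliases = [a["value"] for a in entity.get("aliases", {}).get(lang, [])]
    genders = []
    for claim in entity.get("claims", {}).get("P21", []):
        # "novalue"/"somevalue" snaks carry no datavalue, so skip them
        value = claim["mainsnak"].get("datavalue", {}).get("value", {})
        if "id" in value:
            genders.append(value["id"])
    return {"aliases": aliases, "gender_qids": genders}
```

With requests, the payload would come from something like requests.get(entity_data_url(url)).json(). The gender comes back as another QID: Q6581097 is the Wikidata item for male and Q6581072 the one for female.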
I'm trying to decode a QR image with Python using this website: https://zxing.org/w/decode.jspx
I don't know why my POST requests fail and I don't get any response.
import requests
url ="https://zxing.org/w/decode.jspx"
session = requests.Session()
f = {'f':open("new.png","rb")}
response = session.post(url,files = f)
f = open("page.html","w")
f.write(response.text)
f.close()
session.close()
Even when I do it with a GET request, it still fails:
url ="https://zxing.org/w/decode.jspx"
session = requests.Session()
data = {'u':'https://www.qrstuff.com/images/default_qrcode.png'}
response = session.post(url,data = data)
f = open("page.html","w")
f.write(response.text)
f.close()
session.close()
Maybe it is because the website contains two forms?
Thanks for helping
You can do this:
import urllib.request
url ="https://zxing.org/w/decode?u=https://www.qrstuff.com/images/default_qrcode.png"
response = urllib.request.urlopen(url)
# response.read() returns bytes in Python 3, so write the file in binary mode
with open("page.html","wb") as f:
    f.write(response.read())
If you want to send the image URL, the form action is GET; if you want to post the image data as a file, the action is POST.
You can check this with the HackBar add-on for Firefox.
Well, I just saw my mistake.
The web page is https://zxing.org/w/decode.jspx
but once you submit a POST or a GET, the target is
https://zxing.org/w/decode without ".jspx", so I removed the extension and everything worked.
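To illustrate the fix, here is a small sketch that builds the GET URL against the /w/decode endpoint with the image address percent-encoded. The function name build_decode_url is hypothetical, and the u parameter is taken from the snippets above.

```python
from urllib.parse import urlencode

DECODE_ENDPOINT = "https://zxing.org/w/decode"  # note: no ".jspx"


def build_decode_url(image_url):
    """Return the GET URL that asks the decoder to fetch image_url."""
    return DECODE_ENDPOINT + "?" + urlencode({"u": image_url})


decode_url = build_decode_url("https://www.qrstuff.com/images/default_qrcode.png")
print(decode_url)
```

The resulting URL can be handed to requests.get or urllib.request.urlopen. For a file upload, the same extension-less endpoint would presumably be the target of session.post(url, files=f).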
I'm trying to retrieve the JSON response data from a website that I call.
The site is this:
WebSite DriveNow
The page shows some data on a map. With the browser debugger I can see the endpoint
end point
that sends the JSON response data.
I used this Python code to try to scrape the JSON response data:
import requests
import json
headers = {
    'Host': 'api2.drive-now.com',
    'X-Api-Key': 'adf51226795afbc4e7575ccc124face7'
}
r = requests.get('https://api2.drive-now.com/cities/42756?expand=full', headers=headers)
json_obj = json.loads(r.content)
but I get this error:
hostname doesn't match either of 'activityharvester.com'
How can I retrieve this data?
Thanks
I have tried calling the endpoint that returns the JSON response using Postman, passing only Host and Api-Key in the headers. The result is the JSON that I want. But if I try the same call in Python, I receive the error hostname doesn't match either of 'activityharvester.com'
I don't understand your script, nor your question. Why two requests and three headers? Did you mean something like this?
import requests
import json
headers = {
    'User-Agent': 'Mozilla/5.0',
    'X-Api-Key':'adf51226795afbc4e7575ccc124face7',
}
res = requests.get('https://api2.drive-now.com/cities/4604?expand=full', headers=headers, allow_redirects=False)
print(res.status_code, res.reason)
json_obj = json.loads(res.content)
print(json_obj)
I'm trying to get all users' information from the GitHub API using the Python Requests library. Here is my code:
import requests
import json
url = 'https://api.github.com/users'
token = "my_token"
headers = {'Authorization': 'token %s' % token}
r = requests.get(url, headers=headers)
users = r.json()
with open('users.json', 'w') as outfile:
    json.dump(users, outfile)
So far I can dump the first page of users into a JSON file. I can also find the 'next' page's URL:
next_url = r.links['next'].get('url')
r2 = requests.get(next_url, headers=headers)
users2 = r2.json()
Since I don't know how many pages there are yet, how can I append the 2nd, 3rd... pages to 'users.json' sequentially in a while loop as fast as possible?
Thanks!
First, you need to open the file in 'a' mode; otherwise each subsequent write will overwrite everything.
import requests
import json
url = 'https://api.github.com/users'
token = "my_token"
headers = {'Authorization': 'token %s' % token}
outfile = open('users.json', 'a')
while True:
    r = requests.get(url, headers=headers)
    users = r.json()
    json.dump(users, outfile)
    # r.links has no 'next' entry when the response carries no rel="next" Link header.
    # I don't know what GitHub returns when there are no more users, so double-check this yourself.
    url = r.links.get('next', {}).get('url')
    if not url:
        break
outfile.close()
Append the data you get from the requests query to a list and move on to the next query.
Once you have all of the data you want, then proceed to try to concatenate the data into a file or into an object. You can also use threading to do multiple queries in parallel, but most likely there is going to be rate limiting on the api.
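A minimal sketch of that collect-then-write approach follows. The names fetch_all_pages, save_json and max_pages are illustrative, not part of any library. The getter is passed in so that requests.get (or a session's get) can be supplied by the caller, and the code assumes each page's JSON body is a list.

```python
import json


def fetch_all_pages(url, get, max_pages=None, **kwargs):
    """Follow rel="next" links and return the items of every page in one list.

    `get` is any callable with the shape of requests.get: it must return an
    object exposing .json() and .links.
    """
    items = []
    pages = 0
    while url and (max_pages is None or pages < max_pages):
        response = get(url, **kwargs)
        items.extend(response.json())
        pages += 1
        # .links has no "next" entry once the server stops sending one
        url = response.links.get("next", {}).get("url")
    return items


def save_json(items, path):
    """Write the combined list once, so the file holds a single JSON array."""
    with open(path, "w") as outfile:
        json.dump(items, outfile)
```

For the GitHub case this could be called as fetch_all_pages('https://api.github.com/users', requests.get, headers=headers, max_pages=5), followed by save_json(users, 'users.json'). The max_pages cap is there because of the rate limiting mentioned above.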
I used Postman to send a raw request to the Jetstar website to get flight details. I wanted to do the same thing from a Python script using the requests library, but I cannot get the correct response back.
Here is what I have done in Postman:
And here is the simple script I used to send the POST request:
import requests
files = {'file': open('PostContent.txt', 'rb')}
if __name__ == "__name__"):
    url = "http://www.jetstar.com/"
    r = requests.post(url, files = files)
    print(r.text)
When I run the Python script, I always get the welcome page, not the flight details. I am not sure what the error is.
Note: The PostContent.txt contains the form-data in raw text when I search for flights.
I used Chrome DevTools to capture the POST request when I searched for a specific flight date. The content is the Form Data shown under Headers.
Try using a dictionary instead of a file. The files argument is meant for uploading a file, not for a form-encoded POST, which is probably what the site expects.
payload = {
    'DropDownListCurrency': 'SGD'
}
r = requests.post("http://httpbin.org/post", data=payload)
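If the captured request body is already saved as text, it can be turned into such a dictionary instead of being retyped. The sketch below assumes PostContent.txt holds the URL-encoded body (key=value pairs joined by &), which is how a form-encoded POST is normally serialized. form_text_to_payload is a hypothetical helper name, and the sample string is made up for illustration.

```python
from urllib.parse import parse_qsl


def form_text_to_payload(raw_body):
    """Decode a URL-encoded form body into a dict suitable for data=."""
    # keep_blank_values keeps empty fields such as "__EVENTTARGET="
    return dict(parse_qsl(raw_body.strip(), keep_blank_values=True))


# Illustrative body only; the real one would be read from PostContent.txt
sample = "__EVENTTARGET=&DropDownListCurrency=SGD&Origin=Nadi+%28NAN%29"
payload = form_text_to_payload(sample)
print(payload)
```

The resulting payload can then be passed as requests.post(url, data=payload). If the file was copied from the parsed DevTools view (one "key: value" pair per line) rather than the encoded source view, it would need a different parser.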
You pass the form data under the key file, which is wrong for this type of request. Also, your sample code doesn't run! Please paste working code here...
import requests
import logging
logging.basicConfig(level=logging.DEBUG)
payload = {"__EVENTTARGET":"",
"__EVENTARGUMENT":"",
"__VIEWSTATE":"/wEPDwUBMGQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFJ01lbWJlckxvZ2luU2VhcmNoVmlldyRtZW1iZXJfUmVtZW1iZXJtZSDCMtVG/1lYc7dy4fVekQjBMvD5",
"pageToken":"",
"total_price":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$RadioButtonMarketStructure":"RoundTrip",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin1":"Nadi (NAN)",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination1":"Melbourne (Tullamarine) (MEL)",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate1":"14/01/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate1":"16/02/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListCurrency":"AUD",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin2":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination2":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate2":"16/02/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate2":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin3":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination3":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate3":"27/12/2014",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate3":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin4":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination4":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate4":"03/01/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate4":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin5":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination5":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate5":"10/01/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate5":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketOrigin6":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMarketDestination6":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureDate6":"17/01/2015",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDestinationDate6":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListPassengerType_ADT":1,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListPassengerType_CHD":0,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListPassengerType_INFANT":0,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$RadioButtonSearchBy":"SearchStandard",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMultiCityOrigin1":"Origin",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMultiCityDestination1":"Destination",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureMultiDate1":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMultiCityOrigin2":"Origin",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextBoxMultiCityDestination2":"Destination",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$TextboxDepartureMultiDate2":"",
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListMultiPassengerType_ADT":1,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListMultiPassengerType_CHD":0,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$DropDownListMultiPassengerType_INFANT":0,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$numberTrips":2,
"ControlGroupSearchView$AvailabilitySearchInputSearchView$ButtonSubmit":""}
if __name__ == "__main__":
    url = "http://booknow.jetstar.com/Search.aspx"
    r = requests.post(url, data=payload)
    print(r.text)