Get current URL in Python

How would I get the current URL with Python?
I need to grab the current URL so I can check it for query strings, e.g.:
requested_url = "URL_HERE"
url = urlparse(requested_url)
if url[4]:
    params = dict([part.split('=') for part in url[4].split('&')])
Also, this is running in Google App Engine.

Try this:
self.request.url
Also, if you just need the querystring, this will work:
self.request.query_string
And, lastly, if you know the querystring variable that you're looking for, you can do this:
self.request.get("name-of-querystring-variable")
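Outside a request handler, the same check can be sketched with the standard library alone. This is a minimal example, assuming Python 3; `query_params` is a hypothetical helper name and the URL is made up for illustration, neither is part of App Engine.

```python
from urllib.parse import urlsplit, parse_qs

def query_params(url):
    """Return the query string of `url` as a dict mapping names to lists of values."""
    # urlsplit separates scheme, host, path, query and fragment.
    query = urlsplit(url).query
    # parse_qs decodes percent-escapes and groups repeated names into lists.
    return parse_qs(query)

# Example URL invented for illustration.
params = query_params("http://example.com/page?name=alice&tag=a&tag=b")
print(params)
```

Each value is a list because a name may appear more than once in a query string.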

For anybody finding this via Google: I figured it out.
You can get the query strings on your current request using:
url_get = self.request.GET
which is a UnicodeMultiDict of your query strings!

I couldn't get the other answers to work, but here is what worked for me:
url = os.environ['HTTP_HOST']
uri = os.environ['REQUEST_URI']
return url + uri
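The two variables above can be combined into a full URL. The sketch below assumes a CGI-style environment. REQUEST_URI is not part of the CGI specification (many servers set it, some do not), so the function falls back to SCRIPT_NAME, PATH_INFO and QUERY_STRING. The helper name `current_url` and the sample values are made up, and detecting the scheme through an HTTPS variable is server dependent.

```python
import os

def current_url(environ=os.environ):
    """Rebuild the requested URL from CGI-style environment variables."""
    # Many servers set HTTPS=on for TLS requests; treat anything else as http.
    scheme = "https" if environ.get("HTTPS", "").lower() == "on" else "http"
    host = environ.get("HTTP_HOST") or environ.get("SERVER_NAME", "localhost")
    uri = environ.get("REQUEST_URI")
    if uri is None:
        # Fallback: assemble the path and query from the standard CGI variables.
        uri = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        query = environ.get("QUERY_STRING", "")
        if query:
            uri += "?" + query
    return "{}://{}{}".format(scheme, host, uri)

# Sample values invented for illustration.
sample = {"HTTP_HOST": "example.com", "REQUEST_URI": "/app/page?x=1"}
print(current_url(sample))
```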

Try this
import os
url = os.environ['HTTP_HOST']

This is how I capture, in Python 3 from CGI, (A) the URL, (B) GET parameters and (C) POST data:
=======================================================
import sys, os, io
# CAPTURE URL
myDomainSelf = os.environ.get('SERVER_NAME')
myPathSelf = os.environ.get('PATH_INFO')
myURLSelf = myDomainSelf + myPathSelf
# CAPTURE GET DATA
myQuerySelf = os.environ.get('QUERY_STRING')
# CAPTURE POST DATA
myTotalBytesStr = os.environ.get('HTTP_CONTENT_LENGTH')
if myTotalBytesStr is None:
    myJSONStr = '{"error": {"value": true, "message": "No (post) data received"}}'
    myPostData = ""  # nothing was posted, so keep the name defined for the log below
else:
    myTotalBytes = int(myTotalBytesStr)
    myPostDataRaw = io.open(sys.stdin.fileno(), "rb").read(myTotalBytes)
    myPostData = myPostDataRaw.decode("utf-8")
# Write RAW to FILE
mySpy = "myURLSelf: [" + str(myURLSelf) + "]\n"
mySpy = mySpy + "myQuerySelf: [" + str(myQuerySelf) + "]\n"
mySpy = mySpy + "myPostData: [" + str(myPostData) + "]\n"
# You need to define your own myPath here
myFilename = "spy.txt"
myFilePath = myPath + "\\" + myFilename  # the backslash must be escaped inside the string
myFile = open(myFilePath, "w")
myFile.write(mySpy)
myFile.close()
=======================================================
Here are some other useful CGI environment vars:
AUTH_TYPE
CONTENT_LENGTH
CONTENT_TYPE
GATEWAY_INTERFACE
PATH_INFO
PATH_TRANSLATED
QUERY_STRING
REMOTE_ADDR
REMOTE_HOST
REMOTE_IDENT
REMOTE_USER
REQUEST_METHOD
SCRIPT_NAME
SERVER_NAME
SERVER_PORT
SERVER_PROTOCOL
SERVER_SOFTWARE
============================================
I am using these methods running Python 3 on Windows Server with CGI via MIIS.
Hope this can help you.

The response object returned by the requests module has a 'url' attribute, which holds the final URL after any redirects.
Just try this:
import requests
current_url=requests.get("some url").url
print(current_url)

If your Python script is server-side:
You can use os:
import os
url = os.environ
print(url)
With that, you will see all the data os.environ gives you. It looks like you need 'QUERY_STRING'. os.environ behaves like a dictionary, so you can obtain the value like this:
import os
url = os.environ['QUERY_STRING']
print(url)
And if you want a reusable solution, you can save the variables into a dictionary (named vars here) like so:
vars = {}
splits = os.environ['QUERY_STRING'].split('&')
for x in splits:
    # split on the first '=' only, so values containing '=' stay intact
    name, value = x.split('=', 1)
    vars[name] = value
print(vars)
If you are client-side, then any of the other responses involving the GET request will work.

Related

How do I get the args from a post or get with Python without using cgi.FieldStorage

I just read that cgi is deprecated and so cgi.FieldStorage will stop working.
I'm struggling to find the replacement for this functionality. All the searches I've tried refer to urllib or requests, both of which (AFAIK) are designed to create requests, not to respond to them.
Thanks in advance

The reference to urllib is actually a bit misleading. The following might give some insight into the CGI interface from a Python programmer's point of view:
#!/usr/bin/python3
'''
preflight_cgi.py
check the preflight option call
'''
import sys
import os
if __name__ == "__main__":
    print("Content-Type: text/html")  # HTML is following
    print()
    # command-line arguments passed to the script
    for i, arg in enumerate(sys.argv):
        print("argv{}: {}\n".format(i, arg))
    # the request body, if any, arrives on stdin
    for i, line in enumerate(sys.stdin):
        print("line {}: {}\n".format(i, line))
    print("<TITLE>CGI script output</TITLE>")
    print("<H1>This is the environment</H1>")
    for it in os.environ.items():
        print("<p>{} = {}</p>".format(it[0], it[1]))
Put that where your current cgi.FieldStorage-based app is and call it via the address bar of the browser.
You will see something like
[...]
CONTENT_LENGTH = 0
QUERY_STRING = par=meter&var=able
REQUEST_URI = /cgi-bin/preflight_cgi.py?par=meter&var=able
REDIRECT_STATUS = 200
SCRIPT_NAME = /cgi-bin/preflight_cgi.py
REQUEST_METHOD = GET
SERVER_PROTOCOL = HTTP/1.1
SERVER_SOFTWARE = lighttpd/1.4.53
GATEWAY_INTERFACE = CGI/1.1
REQUEST_SCHEME = http
SERVER_PORT = 80
[...]
The environment variables already do most of the work.
As an alternative, you can also use one of the http.server classes to build the server completely in Python.
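For simple cases, one possible replacement for cgi.FieldStorage is to parse the values from the environment and standard input with urllib.parse. The sketch below is a minimum under stated assumptions: it handles only GET query strings and application/x-www-form-urlencoded POST bodies (no multipart or file uploads), and `read_params` is a hypothetical helper name.

```python
import io
import os
import sys
from urllib.parse import parse_qs

def read_params(environ=None, stdin=None):
    """Collect GET and url-encoded POST parameters as a dict of lists."""
    environ = os.environ if environ is None else environ
    stdin = sys.stdin.buffer if stdin is None else stdin
    # Query-string parameters are passed in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)
    content_type = environ.get("CONTENT_TYPE", "")
    if (environ.get("REQUEST_METHOD") == "POST"
            and content_type.startswith("application/x-www-form-urlencoded")):
        # CGI passes the body on stdin; read exactly CONTENT_LENGTH bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = stdin.read(length).decode("utf-8")
        for name, values in parse_qs(body, keep_blank_values=True).items():
            params.setdefault(name, []).extend(values)
    return params

# Simulated request, so the sketch runs outside a web server.
fake_env = {
    "REQUEST_METHOD": "POST",
    "QUERY_STRING": "par=meter",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "CONTENT_LENGTH": "8",
}
print(read_params(fake_env, io.BytesIO(b"var=able")))
```

Passing the environment and the input stream as arguments keeps the function testable; in a real CGI script it would be called with no arguments.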

Check the domain availability via Python and Godaddy API

I have a list of domain names in a txt file ('links.txt') and I want to check their availability and write each one to a different txt file ('available_domains.txt') if it is available. I wrote the code like this:
import requests
import time
import json
api_key = "3mM44UaguNL6GH_Kc3bKzig25G1mZtnA87nwS"
secret_key = "37ZnMbQkQrYJ5pF57ZhrEi"
headers = {"Authorization" : "sso-key {}:{}".format(api_key, secret_key)}
url = "https://api.godaddy.com/v1/domains/available"
appraisal = "https://api.godaddy.com/v1/appraisal/{}"
do_appraise = True
with open("links.txt") as f:
    for domains in f:
        availability_res = requests.post(url, json=domains, headers=headers)
        for domain in json.loads(availability_res.text)['domains']:
            if domain['available']:
                with open("available_domains.txt", 'w', newline="", encoding="UTF-8") as f:
                    f.write(domain)
            else:
                print("Not Available")
But I'm getting an error like:
for domain in json.loads(availability_res.text)["domains"]:
KeyError: 'domains'
I'm new to this, and I don't think my code is correct. If you have any idea, can you help me with it?

I have found your issue.
You need to pass the domain using the "params" argument of the requests get function.
Here is a little snippet that worked for me (you will need to incorporate it into your for loop):
import requests
api_key = "YOURKEYHERE"
secret_key = "YOURKEYHERE"
headers = {"Authorization": "sso-key {}:{}".format(api_key, secret_key)}
url = "https://api.godaddy.com/v1/domains/available"
test = requests.get(url, params={'domain': 'google.co.uk'}, headers=headers).text
print(test)

I have been tasked to do the same thing in Python.
Thanks to this blog post, "Godaddy domain name API in Python", I was able to complete this task.
Try it out.

This error occurs because of the limit on the number of requests you can send per minute. Add time.sleep(48) after every 20 requests and you will not get this error.
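Reading the file line by line and pausing between batches might be sketched as follows. The network call is passed in as `fetch` so the logic can be exercised without credentials. The figures of 20 requests and 48 seconds come from the answer above and are not verified against GoDaddy's current limits, and the `available` field in the response is an assumption about the API's JSON. `find_available` and `fake_fetch` are hypothetical names.

```python
import time

def find_available(domains, fetch, batch_size=20, pause=48, sleep=time.sleep):
    """Return the domains for which fetch(domain) reports availability.

    fetch is any callable that takes a domain name and returns the decoded
    JSON response as a dict, for example a wrapper around
    requests.get(url, params={"domain": name}, headers=headers).json().
    """
    available = []
    sent = 0
    for raw in domains:
        name = raw.strip()  # lines read from a file end with a newline
        if not name:
            continue
        result = fetch(name)
        sent += 1
        # .get() avoids a KeyError when the API returns an error object.
        if result.get("available"):
            available.append(name)
        if sent % batch_size == 0:
            sleep(pause)  # pause after each full batch of requests
    return available

# Stand-in for the real API call, for demonstration only.
def fake_fetch(name):
    return {"domain": name, "available": name.startswith("free")}

print(find_available(["free-one.com\n", "taken.com\n"], fake_fetch, sleep=lambda s: None))
```

With the real API, the returned list could then be written to available_domains.txt in a single pass, which avoids reopening the file in 'w' mode for every domain.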

OpenShift REST API for scaling, invalid character 's' looking for beginning of value

I'm trying to scale out my deployments using openshift rest api, but I'm encountering the error "invalid character 's' looking for beginning of value".
I can successfully get the deployment config details but it's the patch request which is troubling me.
Following the documentation, I have tried the three Content-Type values below, but none of them works:
application/json-patch+json
application/merge-patch+json
application/strategic-merge-patch+json
Here's my code:
data = {'spec':{'replicas':2}}
headers = {"Authorization": token, "Content-Type": "application/json-patch+json"}
def updateReplicas():
    url = root + "namespaces" + namespace + "deploymentconfigs" + dc + "scale"
    resp = requests.patch(url, headers=headers, data=data, verify=False)
    print(resp.content)
Thank you.

OK, I found the issue. Silly thing first: data should be a JSON string inside single quotes, e.g. data = '{"spec":{"replicas":2}}'.
Then, we need some more info in our data, which finally looks like:
data = '{"kind":"Scale","apiVersion":"extensions/v1beta1","metadata":{"name":"deployment_name","namespace":"namespace_name"},"spec":{"replicas":1}}'
Thank you for your time.
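The likely cause is that requests form-encodes a dict passed as data= instead of serializing it as JSON, so the server receives a body that is not JSON. A sketch of building the request pieces with json.dumps follows. It assumes the merge-patch content type is accepted by the scale endpoint and uses the apps.openshift.io/v1 URL layout as an assumption; `build_scale_patch` and all sample values are hypothetical.

```python
import json

def build_scale_patch(root, namespace, dc, replicas, token):
    """Return (url, headers, body) for a replica-count PATCH request."""
    # Join the path segments with explicit slashes.
    url = "{}/namespaces/{}/deploymentconfigs/{}/scale".format(
        root.rstrip("/"), namespace, dc)
    headers = {
        "Authorization": "Bearer {}".format(token),
        "Content-Type": "application/merge-patch+json",
    }
    # Serialize to a JSON string; a plain dict would be form-encoded.
    body = json.dumps({"spec": {"replicas": replicas}})
    return url, headers, body

# Sample values invented for illustration.
url, headers, body = build_scale_patch(
    "https://openshift.example.com/apis/apps.openshift.io/v1",
    "my-project", "my-app", 2, "TOKEN")
print(url)
print(body)
# The request itself would then be sent with something like:
# requests.patch(url, headers=headers, data=body)
```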

I had the same use case, and the hint from @GrahamDumpleton to run oc with --loglevel 9 was very helpful.
This is what oc scale does:
It makes a GET request to the resource, receiving a JSON object.
Then it makes a PUT request to the resource, with a modified JSON object (the number of replicas changed) as payload.
If you're doing this, you don't have to worry about setting the apiVersion; you just reuse what you get in the first place.
Here is a small Python script that follows this approach:
"""
Login into your project first `oc login` and `oc project <your-project>` before running this script.
Usage:
pip install requests
python scale_pods.py --deployment-name <your-deployment> --nof-replicas <number>
"""
import argparse
import requests
from subprocess import check_output
import warnings
warnings.filterwarnings("ignore")  # ignore insecure request warnings
def byte_to_str(bs):
    return bs.decode("utf-8").strip()
def get_endpoint():
    byte_str = check_output("echo $(oc config current-context | cut -d/ -f2 | tr - .)", shell=True)
    return byte_to_str(byte_str)
def get_namespace():
    byte_str = check_output("echo $(oc config current-context | cut -d/ -f1)", shell=True)
    return byte_to_str(byte_str)
def get_token():
    byte_str = check_output("echo $(oc whoami -t)", shell=True)
    return byte_to_str(byte_str)
def scale_pods(deployment_name, nof_replicas):
    url = "https://{endpoint}/apis/apps.openshift.io/v1/namespaces/{namespace}/deploymentconfigs/{deployment_name}/scale".format(
        endpoint=get_endpoint(),
        namespace=get_namespace(),
        deployment_name=deployment_name
    )
    headers = {
        "Authorization": "Bearer %s" % get_token()
    }
    get_response = requests.get(url, headers=headers, verify=False)
    data = get_response.json()
    data["spec"]["replicas"] = nof_replicas
    print(data)
    response_put = requests.put(url, headers=headers, json=data, verify=False)
    print(response_put.status_code)
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--deployment-name", type=str, required=True, help="deployment name")
    parser.add_argument("--nof-replicas", type=int, required=True, help="nof replicas")
    args = parser.parse_args()
    scale_pods(args.deployment_name, args.nof_replicas)
if __name__ == "__main__":
    main()

MITMProxy: smart URL replacement

We use a custom scraper that has to take a separate website for each language (this is an architecture limitation), like site1.co.uk, site1.es, site1.de etc.
But we need to parse a website with many languages separated by URL path, like site2.com/en, site2.com/de, site2.com/es and so on.
I thought about MITMProxy: I could redirect all requests this way:
en.site2.com/* --> site2.com/en
de.site2.com/* --> site2.com/de
...
I have written a small script which simply takes URLs and rewrites them:
class MyMaster(flow.FlowMaster):
    def handle_request(self, r):
        url = r.get_url()
        # replace URLs
        if 'blabla' in url:
            r.set_url(url.replace('something', 'another'))
But the target host generates 301 redirect with the response from the webserver - 'the page has been moved here' and the link to the site2.com/en
It worked when I played with URL rewriting, i.e. site2.com/en --> site2.com/de.
But for different hosts (subdomain and the root domain, to be precise), it does not work.
I tried to replace the Host header in the handle_request method from above:
for key in r.headers.keys():
    if key.lower() == 'host':
        r.headers[key] = ['site2.com']
I also tried to replace the Referer header. None of that helped.
How can I finally spoof that request from the subdomain to the main domain? If it generates a HTTP(s) client warning it's ok since we need that for the scraper (and the warnings there can be turned off), not the real browser.
Thanks!

You need to replace the content of the response and craft the headers with just a few fields.
Open a new connection to the redirected URL and craft your response:
def handle_request(self, flow):
    newUrl = <new-url>
    retryCount = 3
    newResponse = None
    while True:
        try:
            newResponse = requests.get(newUrl)  # import requests
        except:
            if retryCount == 0:
                print('Cannot reach new url ' + newUrl)
                traceback.print_exc()  # import traceback
                return
            retryCount -= 1
            continue
        break
    responseHeaders = Headers()  # from netlib.http import Headers
    if 'Date' in newResponse.headers:
        responseHeaders['Date'] = str(newResponse.headers['Date'])
    if 'Connection' in newResponse.headers:
        responseHeaders['Connection'] = str(newResponse.headers['Connection'])
    if 'Content-Type' in newResponse.headers:
        responseHeaders['Content-Type'] = str(newResponse.headers['Content-Type'])
    if 'Content-Length' in newResponse.headers:
        responseHeaders['Content-Length'] = str(newResponse.headers['Content-Length'])
    if 'Content-Encoding' in newResponse.headers:
        responseHeaders['Content-Encoding'] = str(newResponse.headers['Content-Encoding'])
    response = HTTPResponse(  # from libmproxy.models import HTTPResponse
        http_version='HTTP/1.1',
        status_code=200,
        reason='OK',
        headers=responseHeaders,
        content=newResponse.content)
    flow.reply(response)
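The subdomain-to-path mapping from the question (en.site2.com/* to site2.com/en/*) can be computed independently of the proxy hook. The sketch below uses only the standard library. `rewrite_url` is a hypothetical helper, site2.com and the language set are placeholders, and wiring it into a mitmproxy handler is left out because the hook API differs between versions. Note that it drops any explicit port from the original host.

```python
from urllib.parse import urlsplit, urlunsplit

BASE_HOST = "site2.com"          # placeholder target domain
LANGUAGES = {"en", "de", "es"}   # placeholder language subdomains

def rewrite_url(url):
    """Map <lang>.site2.com/<path> to site2.com/<lang>/<path>."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Split "de.site2.com" into "de" and "site2.com".
    lang, _, rest = host.partition(".")
    if rest != BASE_HOST or lang not in LANGUAGES:
        return url  # not a language subdomain; leave untouched
    new_path = "/" + lang + (parts.path or "/")
    return urlunsplit((parts.scheme, BASE_HOST, new_path, parts.query, parts.fragment))

print(rewrite_url("http://de.site2.com/products?page=2"))
```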

How to get current URL in python web page?

I am a noob in Python. I just installed it and spent 2 hours googling how to get at a simple parameter sent in the URL to a Python script.
Found this
Very helpful, except I cannot for anything in the world figure out what to replace in:
import urlparse
url = 'http://foo.appspot.com/abc?def=ghi'
parsed = urlparse.urlparse(url)
print urlparse.parse_qs(parsed.query)['def']
With what do I replace url = 'string' to make it work?
I just want to access http://site.com/test/test.py?param=abc and see abc printed.
Final code after Alex's answer:
import os
import urlparse
url = os.environ["REQUEST_URI"]
parsed = urlparse.urlparse(url)
print urlparse.parse_qs(parsed.query)['param']

If you don't have any libraries to do this for you, you can construct your current URL from the HTTP request that gets sent to your script via the browser.
The headers that interest you are Host and whatever's after the HTTP method (probably GET, in your case). Here are some more explanations (first link that seemed ok, you're free to Google some more :).
This answer shows you how to get the headers in your CGI script:
If you are running as a CGI, you can't read the HTTP header directly,
but the web server put much of that information into environment
variables for you. You can just pick it out of os.environ[].
If you're doing this as an exercise, then it's fine because you'll get to understand what's behind the scenes. If you're building anything reusable, I recommend you use libraries or a framework so you don't reinvent the wheel every time you need something.
