I have two HTML files, mypage.html and page.html. I want to pass some values from one HTML file to the other, and in the other HTML file fetch those values. Here is the code I am using:
with open('page.html', 'w') as myFile:
myFile.write('<html>')
myFile.write('<body>')
myFile.write('<table>')
myFile.write('<tr>')
myFile.write('<td> Interface </td>')
myFile.write('<td> Global </td>')
myFile.write('</tr>')
myFile.write('<tr>')
myFile.write('<td> Interface</td>')
myFile.write('<td> Global</td>')
myFile.write('</% print c /%>')
#myFile.write('<var>params</var>')
myFile.write('</tr>')
myFile.write('</table>')
myFile.write('</body>')
myFile.write('</html>')
#render HTML page
with open('mypage.html', 'w') as myFile:
myFile.write('<html>')
myFile.write('<body>')
myFile.write('<table>')
myFile.write('<tr>')
myFile.write('<td> Host Name </td>')
myFile.write('<td> Result </td>')
myFile.write('</tr>')
myFile.write('<tr>')
c = Cookie.SimpleCookie()
c['mycookie'] = 'cookie_value'
myFile.write('<td> <a href="page.html?params=' + final_dict.keys()[0] + '">' + final_dict.keys()[0] + '</a></td>')
myFile.write('<td> Fail</td>')
myFile.write('</tr>')
myFile.write('</table>')
myFile.write('</body>')
myFile.write('</html>')
You have a Python script which generates HTML. That is fine.
You are saving that HTML to a file, and then what? Serving the file? I guess you generate the files once and then serve them as static files. No Python code is executed when a user actually requests a URL. You should change that concept completely.
What you want to happen is:
1. The user types a URL in the browser.
2. The browser sends a request to the server.
3. The server executes some Python which generates HTML, but does not save it to a file; it just sends it back to the user.
4. The user sees the generated HTML.
With that approach, in step 3, the Python script is executed on every request, and you have a chance to read the request parameters.
Now, to do that, you need a server.
You could go with the absolute low-level approach: configuring Apache, IIS or whatever to serve .py files by executing Python as a CGI script. That would be very similar to what you are currently doing. If you do that, you will learn a lot about how HTTP works, but it is the hard way and not optimal performance-wise.
A better option
You can run a Python web server which calls Python functions to serve each configured URL. Take a look at CherryPy and Flask. Flask is very good, but CherryPy may be better suited for beginners.
I suggest you start with CherryPy. Follow the documentation and look for tutorials...
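Whichever framework you pick, the underlying mechanism is the same: the server hands your code the requested URL, and you pull the values out of its query string. A minimal sketch using only the standard library (Python 3 shown; the parameter name `params` is taken from the question's link):

```python
from urllib.parse import urlparse, parse_qs

# The URL a browser would request, e.g. built from the question's link
requested = '/page.html?params=Interface'

# parse_qs decodes the query string into a dict mapping names to value lists
query = parse_qs(urlparse(requested).query)
print(query['params'][0])  # -> Interface
```

A framework like CherryPy or Flask does this parsing for you and hands the values to your handler function.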
You can get URL parameters using the built-in array $_GET in PHP.
To retrieve the parameters from the URL, you can insert this PHP code into the HTML source of the webpage (the one you are opening with the data in the URL):
<?php
echo $_GET['key']; // Will print out the data of the given key
?>
Of course, you can do what you want with this data using PHP. This will only work if you are running the webpage on a web server with PHP enabled.
Hey, I'm trying to create something like Fiddler's AutoResponder.
I want to replace the content of a specific URL with other content, for example: replace the content served at https://randomdomain.com with https://cliqqi.ml.
I've researched everywhere but couldn't find anything. I tried creating a proxy server, a Node.js script, a Python script. Nothing worked.
P.S. I'm doing this because I want to intercept an Electron app fetching its main game file from a site, and redirect that game-file site to my own site.
If you're looking for a way to do this programmatically in Node.js, I've written a library you can use to do exactly that, called Mockttp.
You can use Mockttp to create an HTTPS-intercepting & rewriting proxy, which will allow you to send mock responses directly, redirect traffic from one address to another, rewrite anything including the headers & body of existing traffic, or just log everything that's sent & received. There's a full guide here: https://httptoolkit.tech/blog/javascript-mitm-proxy-mockttp/
I have about 8 reports that I need to pull from a system every week, which takes quite a bit of time, so I am working on automating this process. I am using requests to log in to the site and download the files. However, when I download the file using my Python script, the file comes back blank. When I use the same link to download from the browser, it's not blank. Below is my code:
import requests

payload = {
    'txtUsername': 'uid',
    'txtPassword': 'pass'
}

domain = 'https://example.com/login.aspx?ReturnUrl=%2fiweb%2f'
path = 'C:\\Users\\workspace\\data-in\\'

with requests.Session() as s:
    p = s.post(domain, data=payload)
    r = s.get('https://example.com/forms/MSWordFromSql.aspx?ContentType=excel&object=Organization&FormKey=f326228c-3c49-4531-b80d-d59600485557')
    with open(path + 'report1.xls', 'wb') as f:
        f.write(r.content)
A little about the URL: when I was looking for it, I found that it's wrapped in some JS, behind a link labeled:
Export Raw Data to Excel
However, when I take a look at the path from which the files was downloaded the true location for the report is this:
https://example.com/forms/MSWordFromSql.aspx?ContentType=excel&object=Organization&FormKey=f326228c-3c49-4531-b80d-d59600485557
This is the URL I am using in my code to download a report. After I run the script, the file is created, named, and saved to the correct directory, but it's empty. As I mentioned at the top of the thread, if I simply copy the URL above into the browser, it downloads the report with no problem.
I was also thinking about using Selenium to get this done, but the issue is that I cannot rename the files while they are being downloaded. I need each file to have a specific name, because all of the downloaded reports are then used in another automation script.
As @Lucas mentioned, your Python code likely sends a different request than your browser does, and thus receives a different response.
I'd use the browser dev tools to inspect the request the browser makes to initiate the download. Use "Copy as curl" and try to reproduce the correct behavior from the command line.
Then reduce the differences between the curl request and the one your python code makes by removing unnecessary parts from the curl invocations and adding the necessary headers to your python code. https://curl.trillworks.com/ can help with the latter.
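For example, whatever extra headers turn out to matter can be attached to the session so every subsequent request carries them (the header values below are placeholders, not the ones your browser actually sends):

```python
import requests

# Placeholder headers -- replace with the ones from your own browser's
# "Copy as cURL" output
browser_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'Referer': 'https://example.com/iweb/',
}

s = requests.Session()
# Merge the browser headers into the session defaults; every s.get()
# and s.post() from here on will send them automatically
s.headers.update(browser_headers)
print(s.headers['Referer'])
```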
I'm trying to export a CSV from this page via a python script. The complicated part is that the page opens after clicking the export button on this page, begins the download, and closes again, rather than just hosting the file somewhere static. I've tried using the Requests library, among other things, but the file it returns is empty.
Here's what I've done:
from requests import get

url = 'http://aws.state.ak.us/ApocReports/CampaignDisclosure/CDExpenditures.aspx?exportAll=True&%3bexportFormat=CSV&%3bisExport=True%22+id%3d%22M_C_sCDTransactions_csfFilter_ExportDialog_hlAllCSV?exportAll=True&exportFormat=CSV&isExport=True'

with open('CD_Transactions_02-27-2017.CSV', "wb") as file:
    # get request
    response = get(url)
    # write to file
    file.write(response.content)
I'm sure I'm missing something obvious, but I'm pulling my hair out.
It looks like the file is being generated on demand, and the URL stays valid only as long as the session lasts.
There are multiple requests from the browser to the webserver (including POST requests).
So to get those files via code, you would have to simulate the browser, possibly including session state etc. (and in this case also __VIEWSTATE).
To see the whole communication, you can use developer tools in the browser (usually F12, then select NET to see the traffic), or use something like WireShark.
In other words, this won't be an easy task.
If this is open government data, it might be better to just ask that government for the data or ask for possible direct links to the (unfiltered) files (sometimes there is a public ftp server for example) - or sometimes there is an API available.
The file is created on demand, but you can download it anyway. Essentially you have to:
1. Establish a session to save cookies and viewstate.
2. Submit a form in order to "click" the export button.
3. Grab the link which lies behind the popped-up CSV button.
4. Follow that link and download the file.
You can find working code here (if you don't mind that it's written in R): Save response from web-scraping as csv file
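In Python, step 2 usually means echoing back the page's hidden ASP.NET form fields along with the export button's name. A sketch of pulling those fields out of the page source (the HTML snippet and field values here are made up for illustration; a real page's __VIEWSTATE is far longer):

```python
import re

# Made-up sample of the hidden fields an ASP.NET page embeds in its form
page_source = '''
<input type="hidden" name="__VIEWSTATE" value="dDwtMTA3MQ==" />
<input type="hidden" name="__EVENTVALIDATION" value="/wEWAgJqqg==" />
'''

def hidden_field(name, html):
    # Pull a hidden input's value attribute out of the page source
    m = re.search(r'name="%s" value="([^"]*)"' % name, html)
    return m.group(1) if m else None

# These fields must be re-submitted in the POST that "clicks" the button
form_data = {
    '__VIEWSTATE': hidden_field('__VIEWSTATE', page_source),
    '__EVENTVALIDATION': hidden_field('__EVENTVALIDATION', page_source),
}
print(form_data['__VIEWSTATE'])  # -> dDwtMTA3MQ==
```

A production version would fetch the real page inside a `requests.Session` first and POST `form_data` back to the same URL; a proper HTML parser is also more robust than a regex here.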
I'll start by saying I'm not very familiar with AS3 coding at all, which I'm pretty sure SWF files are coded with (someone can correct me if I'm wrong).
I have a SWF file which accepts an ID parameter. Within the code it takes the ID, performs some hash routines on it, eventually produces a new 'token', and loads a new URL using this token.
I found this by taking the SWF file to showmycode and decompiling it.
My code is in Python, and the SWF file is online; I could download and save it locally.
Is it possible to somehow execute the SWF in Python, or use urllib to grab this new URL?
It doesn't seem to act the same as a redirect URL, as when I do:
request = urllib2.Request(url)
response = urllib2.urlopen(request)
print response.geturl()
it just returns the URL that I am requesting, so I'm not sure how, or even if, I can grab what is being spit out.
Edit - This is the MD5 that is being used - https://code.google.com/p/as3corelib/source/browse/trunk/src/com/adobe/crypto/MD5.as?r=51
I'm trying to find a Python equivalent.
Execute the SWF in Python? As far as I understand, you want to have the same token-transformation functionality developed in Python, right?
If so, you just need to read the code and translate it into your own app. You cannot run a SWF from Python, nor will you get any response (or "spit out," as you call it). Flash is an executable file run by a plugin (a virtual machine). You won't be able to grab anything from it, nor will you be able to execute it on your own.
Looks like I was making things too complicated
I was able to just use Python's hashlib.md5 to produce the same results as the AS3 code:
import hashlib

m = hashlib.md5()
m.update('test')
print m.hexdigest()
I am trying to save a file (audio/mp3 in this case) to the App Engine blobstore, but with mixed success. Everything seems to work: a file is saved in the blobstore, of the right type, but it is essentially empty (1.5 kB vs. the expected 6.5 kB) and so won't play. The URL in question is http://translate.google.com/translate_tts?ie=UTF-8&tl=en&q=revenues+in+new+york+were+56+million
The app engine logs do not show anything unusual - all parts are executing as expected... Any pointers would be appreciated!
import hashlib
import logging
import urllib
import webapp2
from google.appengine.api import files

class Dictation(webapp2.RequestHandler):
    def post(self):
        sentence = self.request.get('words')
        # Google Translate API cannot handle strings > 100 characters
        sentence = sentence[:100]
        # URL-encode the sentence; spaces are replaced with the plus symbol
        sentence = urllib.urlencode({'q': sentence})
        # Name of the MP3 file generated using the MD5 hash
        mp3_file = hashlib.md5(sentence).hexdigest()
        # Save the MP3 file with the .mp3 extension
        mp3_file = mp3_file + ".mp3"
        # Create the full URL
        url = 'http://translate.google.com/translate_tts?ie=UTF-8&tl=en&' + sentence
        # Upload to blobstore
        mp3_file = files.blobstore.create(mime_type='audio/mp3', _blobinfo_uploaded_filename=mp3_file)
        mp3 = urllib.urlopen(url).read()
        with files.open(mp3_file, 'a') as f:
            f.write(mp3)
        files.finalize(mp3_file)
        blob_key = files.blobstore.get_blob_key(mp3_file)
        logging.info('blob_key identified as %s', blob_key)
The problem has nothing to do with your code; it is correctly retrieving the data from the URL you gave.
For example, if I try this at the command line:
$ curl -O 'http://translate.google.com/translate_tts?ie=UTF-8&tl=en&q=revenues+in+new+york+were+56+million'
I get a 1.5kB 403 error page, whose contents say:
403. That's an error.
Your client does not have permission to get URL /translate_tts?ie=UTF-8&tl=en&q=revenues+in+new+york+were+56+million from this server. (Client IP address: 1.2.3.4)
That’s all we know.
And your code does the exact same thing, whether run in GAE or directly in the interactive interpreter.
Most likely, the reason it works in your browser is that you do have permissions. So, what does that mean? It could mean that you have a valid SID cookie from google.com in your browser, but not your script. Or it could mean that your browser's user agent is recognized as something that can play HTML5 audio, but your script's isn't. Or…
Well, you can try to reverse-engineer what's different in the cookies, headers, etc. between your browser and your script, and narrow it down to the relevant difference, and use explicit headers or cookies or whatever you need to work around the problem.
But it will just break the next time Google changes anything.
And Google will probably not be happy with you if you try this. They offer a Google Translate API service that they want you to use, and they got rid of all of the free options for that API because of "substantial economic burden caused by extensive abuse." Trying to publish a Google App Engine web service that evades Google's API pricing by scraping their pages is probably not the kind of thing they enjoy their customers doing.