How can I make a bot which will automatically go to a file upload form on my webapp at a public link, click the upload a file button, select a file and submit it?
For example, I have a file called "stored.csv" on my desktop, and I have a webapp with an upload form that looks like this:
All I'm trying to do is have a script which can grab that stored.csv file, go to the public link (http://website.com/upload/) that takes you to this page and then submit the file so that it all happens automatically when the script is run.
It would be much easier to send the POST request that the button triggers directly.
All you need to do is:
On that page, open the developer tools in your browser (most likely F12).
In the window that appears, click the "Network" tab.
Leaving this window open, choose any file and click "submit".
A new record will appear at the end of the "Network" tab, containing information about the request that was made.
Once you know the request you need to make, you can easily implement it in Python:
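import requests as req
url = "Url that you will acquire"
# A file upload is usually sent as multipart/form-data, so pass the file via files=.
# "smth" stands for the form field name shown in the recorded request.
with open("path/to/file", "rb") as f:
    res = req.post(url=url, files={"smth": f})
print(res.status_code)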
And that's it.
There are some details you'll need to figure out on your own, but now you have a map.
Hope this will help!
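A slightly fuller sketch of the request-based approach, assuming the recorded request turns out to be a standard multipart/form-data upload. The field name "file" and the helper name upload_csv are placeholders that do not come from the actual form; use whatever the Network tab shows. The session argument is expected to be a requests.Session, or anything else with a compatible post method.

```python
from pathlib import Path


def upload_csv(session, url, csv_path, field_name="file"):
    """POST csv_path to url as a multipart/form-data file field.

    field_name is a placeholder: use the field name from the recorded request.
    """
    path = Path(csv_path)
    with path.open("rb") as handle:
        # requests builds the multipart body from the files= mapping:
        # field name -> (filename, file object, content type)
        files = {field_name: (path.name, handle, "text/csv")}
        response = session.post(url, files=files, timeout=30)
    # Raise an exception on 4xx/5xx responses instead of failing silently
    response.raise_for_status()
    return response.status_code
```

With requests installed, upload_csv(requests.Session(), "http://website.com/upload/", "stored.csv") would send the file. If the form also requires a login or a hidden token, those would have to be copied from the recorded request as well.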
Related
I am automating the download of the CSV file from Google Trends after running a search. It can be done by clicking the download button with Selenium, but I would like to do it with requests instead. Clicking the button clearly sends a request to some URL; is there any way I can fetch that URL?
The url looks like this:
https://trends.google.com/trends/api/widgetdata/multiline/csv?req=%7B%22time%22%3A%222021-03-09%202022-03-09%22%2C%22resolution%22%3A%22WEEK%22%2C%22locale%22%3A%22en-US%22%2C%22comparisonItem%22%3A%5B%7B%22geo%22%3A%7B%22country%22%3A%22PK%22%7D%2C%22complexKeywordsRestriction%22%3A%7B%22keyword%22%3A%5B%7B%22type%22%3A%22BROAD%22%2C%22value%22%3A%22bahria%20town%20karachi%22%7D%5D%7D%7D%5D%2C%22requestOptions%22%3A%7B%22property%22%3A%22%22%2C%22backend%22%3A%22IZG%22%2C%22category%22%3A29%7D%7D&token=APP6_UEAAAAAYioYw6KoUJcW4_Xv6d4fc4sJn3swSoON&tz=-300
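One thing that can be checked without a browser is what such a URL carries. The req parameter is URL-encoded JSON, and decoding it shows the search settings. The sketch below uses a shortened version of the URL above (only the time and resolution fields, with the token left out) so that it stays readable; the function name is made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlsplit


def decode_trends_url(url):
    """Split a Trends CSV URL into its decoded req payload and other parameters."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs percent-decodes the values and returns a list per parameter
    payload = json.loads(query["req"][0])
    others = {key: values[0] for key, values in query.items() if key != "req"}
    return payload, others


# Shortened stand-in for the URL above (not the full original)
url = (
    "https://trends.google.com/trends/api/widgetdata/multiline/csv"
    "?req=%7B%22time%22%3A%222021-03-09%202022-03-09%22%2C"
    "%22resolution%22%3A%22WEEK%22%7D&tz=-300"
)
payload, others = decode_trends_url(url)
print(payload)
print(others)
```

The full URL itself can be copied from the browser's Network tab when the download button is clicked.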
I am trying to download a few hundred HTML pages in order to parse them and calculate some measures.
I tried it with Linux wget, and with a loop of the following code in Python:
import urllib.request
url = "https://www.camoni.co.il/411788/168022"
html = urllib.request.urlopen(url).read()
But the HTML file I get doesn't contain all the content I see in the browser on the same page. For example, text I see on the screen is not found in the HTML file. Only when I right-click the page in the browser and choose "Save As" do I get the full page.
The problem: I need a large number of pages and cannot do it by hand.
URL example: https://www.camoni.co.il/411788/168022 (the last number changes)
Thank you
That's because the site is not static. It uses JavaScript (in this case the jQuery library) to fetch additional data from the server and insert it into the page.
So instead of trying to GET the raw HTML, you should inspect the requests in the developer tools. There's a POST request to https://www.camoni.co.il/ajax/tabberChangeTab with the following data:
tab_name=tab_about
memberAlias=ד-ר-דינה-ראלט-PhD
currentURL=/411788/ד-ר-דינה-ראלט-PhD
The result is HTML that is then inserted into the page.
So instead of just downloading the page, you should inspect the page and its requests to get the data, or use a headless browser such as Google Chrome to emulate the 'Save As' button and save the data.
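The POST request described above could be reproduced roughly as follows, using only the standard library (the question already uses urllib). This is a sketch under assumptions: that the endpoint accepts the three form fields without extra cookies or headers, and that the response is UTF-8 text. The function names are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

AJAX_URL = "https://www.camoni.co.il/ajax/tabberChangeTab"


def build_tab_request(member_alias, current_url, tab_name="tab_about"):
    """Build the POST request the page sends to load one tab."""
    fields = {
        "tab_name": tab_name,
        "memberAlias": member_alias,
        "currentURL": current_url,
    }
    # urlencode percent-encodes the Hebrew characters as UTF-8
    body = urlencode(fields).encode("ascii")
    # Supplying data makes urllib send a POST instead of a GET
    return Request(AJAX_URL, data=body)


def fetch_tab_html(member_alias, current_url):
    """Send the request and return the HTML fragment as text."""
    request = build_tab_request(member_alias, current_url)
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")
```

Looping fetch_tab_html over the member aliases and URLs you need would then replace the manual "Save As" step, provided the server does not require anything beyond these fields.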
I'm trying to download a pre-generated HTML file, but nothing I've tried works.
Searching Stack Overflow, I found that return FileResponse(open(file_path, 'rb')) will download a file, but instead of being downloaded, the HTML is just rendered in the tab. I think the problem is that the browser receives the HTML and, instead of displaying the "Save as" dialog, just renders it in the current tab.
In my main template I have a form with the target="_blank" attribute and a submit button that opens (without AJAX) a new tab, which is supposed to download the file automatically.
What I want: after I submit the form, a new tab appears, the view for that URL runs some code (not related to the download) and, after that process (which is working fine), downloads an HTML file to the device. The HTML exists and has no problems; the only problem is that I want to DOWNLOAD the file but the browser displays it instead.
Note 1: Yes, I know that with right click -> download I can download the HTML that I see, but this system is for non-IT people, so I need to do it the easiest way possible.
Note 2: I mention "without AJAX" because I found in another post that FileResponse doesn't work from AJAX.
You should set the Content-Disposition header on your response, so that the browser saves the file instead of rendering it:
Content-Disposition: attachment; filename="cool.html"
response = FileResponse(open(file_path, 'rb'))
response['Content-Disposition'] = 'attachment; filename="cool.html"'
return response
So I am using geckodriver.exe (for Firefox), and I use the following code to access WhatsApp Web:
from selenium import webdriver
browser = None
def init():
    browser = webdriver.Firefox(executable_path=r"C:/Users/Pascal/Desktop/geckodriver.exe")
    browser.get("https://web.whatsapp.com/")
init()
But every time I rerun the code, the QR code from WhatsApp Web has to be scanned again, and I don't want that. In my normal Chrome browser I don't have to scan the QR code every time. How can I fix this?
Every time you close your Selenium driver/browser, the cookies attached to the session are deleted. To keep them, you can retrieve the cookies at the end of one session and restore them at the beginning of the next.
To get the cookies:
# Go to the correct domain, i.e. your Whatsapp web
browser.get("https://www.example.com")
# get all the cookies from this domain
cookies = browser.get_cookies()
# store it somewhere, maybe a text file
To restore the cookies:
# Go to the correct domain, i.e. your Whatsapp web
browser.get("https://www.example.com")
# get back the cookies (add_cookie takes one cookie dict per call)
cookie = {'name': 'foo', 'value': 'bar'}
browser.add_cookie(cookie)
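The "store it somewhere" step could look like the sketch below, which keeps the cookies in a JSON file. The helper names and the file format are choices made for illustration. The browser argument is assumed to be a Selenium WebDriver, whose get_cookies() and add_cookie() methods are used. Whether cookies alone are enough to keep a WhatsApp Web login is not something this sketch guarantees.

```python
import json
from pathlib import Path


def save_cookies(browser, path):
    """Write the cookies of the currently loaded domain to a JSON file."""
    Path(path).write_text(json.dumps(browser.get_cookies()), encoding="utf-8")


def load_cookies(browser, path):
    """Add previously saved cookies to the browser; return how many were added."""
    cookies = json.loads(Path(path).read_text(encoding="utf-8"))
    for cookie in cookies:
        # add_cookie takes one cookie dict at a time and only works after
        # browser.get() has loaded a page on the matching domain
        browser.add_cookie(cookie)
    return len(cookies)
```

A run would call browser.get(...) first, then load_cookies(browser, "cookies.json") if the file exists, and save_cookies(browser, "cookies.json") just before browser.quit().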
What you could do is define a profile in Firefox. Then open Firefox with that profile and go to web.whatsapp.com. You will be prompted with the QR code. Link that instance. From then on you can use the newly created profile in Python.
A new profile can be created by typing about:profiles in the URL bar of Firefox:
Then open the browser by clicking 'Launch profile in new browser':
In your Python code, create a reference to this profile:
from selenium import webdriver
options = webdriver.FirefoxOptions()
options.add_argument('-profile')
options.add_argument('/home/odroid/Documents/PythonProfile')
A step by step guide can also be found here.
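Put together, the profile approach might look like the sketch below. It assumes Selenium 4 with geckodriver available on the PATH; the function names are illustrative, and the profile path in the usage note is the one from the answer. Selenium is imported inside the function only so that the helper can be loaded and checked on a machine without Selenium installed.

```python
def add_profile(options, profile_dir):
    """Point a Firefox options object at an existing profile directory."""
    # Firefox reads the profile path from the argument that follows -profile
    options.add_argument("-profile")
    options.add_argument(str(profile_dir))
    return options


def open_whatsapp(profile_dir, url="https://web.whatsapp.com/"):
    """Start Firefox with the given profile and open WhatsApp Web."""
    from selenium import webdriver  # third-party: pip install selenium

    options = add_profile(webdriver.FirefoxOptions(), profile_dir)
    browser = webdriver.Firefox(options=options)
    browser.get(url)
    return browser
```

Calling open_whatsapp("/home/odroid/Documents/PythonProfile") should then reuse the session that was linked in that profile, as long as no other Firefox instance is using the profile at the same time.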
I am using mechanize to automatically download some pdf documents from webpages. When there is a pdf icon on the page, I can do this to get the file:
b.find_link(text="PDF download")
req = b.click_link(text="PDF download")
b.open(req)
Then I just write it to a new file.
However, for some of the documents I need, there is no direct 'PDF download' link on the page. Instead I have to click a 'submit' button to make a "delivery request" for the document: after clicking this button, the download starts happening while I am taken to another page which says "delivery request in progress" and then, once the download has finished, " Your delivery request is complete".
I have tried using mechanize to click the submit button, and then save the file that downloads by doing this:
b.select_form(nr=0)
b.submit()
downloaded_file = b.response().read()
but this stores the html of the page I am redirected to, not the file that downloads.
How do I get the file that downloads after I click 'submit'?
For anyone with a similar problem, I found a workaround: mechanize emulates a browser that doesn't have JavaScript, so I turned JavaScript off in my browser too. Then, when I went to the download page, I could see a link that said 'if the download hasn't already started, click here to download'. I could then get mechanize to find that link and follow it in the normal way, and write the response to a new file.
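The workaround could be sketched as follows. The browser argument is assumed to be a mechanize.Browser that already has the request page open; the pattern "click here" is a guess at the fallback link's text based on the description above, and the function name is made up for illustration.

```python
import re
from pathlib import Path


def save_fallback_download(browser, out_path, link_pattern=r"click here"):
    """Submit the first form, follow the fallback download link, save the file."""
    browser.select_form(nr=0)
    browser.submit()
    # Without JavaScript the resulting page offers a plain link to the file;
    # match it by its visible text, ignoring case
    link = browser.find_link(text_regex=re.compile(link_pattern, re.IGNORECASE))
    response = browser.follow_link(link)
    # Write the raw bytes so binary files such as PDFs are not corrupted
    Path(out_path).write_bytes(response.read())
    return out_path
```

If the page contains several links matching the pattern, find_link returns the first one, so a more specific pattern may be needed.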