I need to upload a file(txt) in a form like this:
the structure of the add button is as follows:
after upload I have to send the file with the send button:
the structure of the send button is as follows:
I searched and found only examples like this:
browser.select_form(name = 'formForm')
browser.form.add_file(open(directory))
response = browser.submit()
but I didn’t succeed, if anyone can help me thank you very much
this is what you're looking for:
br.form.add_file(open(filename), 'text/plain', filename)
Related
I'm writing a Python script that creates a COVID-19 dashboard for my country and state and updates it daily.
However, I am struggling to download one of the necessary files.
Basically to download the file I have to access the website (https://covid.saude.gov.br/) and click on a button (class="btn-white md button button-solid button-has-icon-only ion-activatable ion-focusable hydrated ion-activated").
I tried to download via the download link but the site creates a different link every time you click the button and it still has a blob URL before HTTP.
I am very grateful to anyone who tries to help, because the data will be used to monitor the progress of the disease here where I live.
You can use their API to get the file name:
import requests
headers = {
'authority':'xx9p7hp1p7.execute-api.us-east-1.amazonaws.com',
'x-parse-application-id':'unAFkcaNDeXajurGB7LChj8SgQYS2ptm',
}
with requests.Session() as session:
session.headers.update(headers)
resp = session.get('https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeral').json()
path = resp['results'][0]['arquivo']['url']
The x-parse-application-id doesn't seem to change. If it does, you can get the correct one by querying https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeralApi and extract it from ['planilha']['arquivo'][url].
I am trying to upload image file into the browser using mechanize.
Although there is no error, the uploaded file does not reflect when I check manually in the browser (post submit/saving).
I am using the following code to upload the files
import mechanize as mc
br = mc.Browser()
br.set_handle_robots(False)
br.select_form(nr=0)
br.form.add_file(open("test.png"), content_type="image/png",
filename='before',name="ctl00$ContentPlaceHolder1$fileuploadBeforeimages")
br.submit("ctl00$ContentPlaceHolder1$cmdSave")
# this is supposed to save the form on the webpage. It saves the texts in the other fields, whereas the image does not show up.
The add file command seems to work. I can confirm this because when I print br.forms()[0] the file details show up (<FileControl(ctl00$ContentPlaceHolder1$fileuploadBeforeimages=before)>).
But there is no sign of the image file post this code snippet. I have checked several examples which include br.submit() without any specific button control, when I do this no page is saved on the website.
What am I missing?
Thanks in advance.
EDIT
When I manually try to upload the file, I see a pop-up asking for confirmation. Under inspect, this is present as
onchange="if (confirm('Upload ' + this.value + '?')) this.form.submit();"
I am not sure if this is a JavaScript element and mechanize cannot pass through this part for upload function. Can someone confirm this.?
you can just put 'rb' in front of image name like this:
br.form.add_file(open("test.png",'rb'),'images/png',filename,name='file')
I'm trying to get a text file to TreeTagger Online to get it analyzed and get the link to the resulting file to download.
import mechanicalsoup
browser = mechanicalsoup.Browser()
homePage = browser.get("http://cental.fltr.ucl.ac.be/treetagger/")
formPart = homePage.soup.select("form[name=treetagger_form]")[0]
formPart.select("[name=file_to_tag]")[0]["name"]=open('test.txt', 'rb')
result = browser.post(formPart, homePage.url)
This gives me the following error:
: (, UnicodeEncodeError('ascii', u'No connection adapters were found for \'\n\n\n\n\n Texte \xe0 \xe9tiqueter : \n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\'', 216, 217, 'ordinal not in range(128)'))
How should I proceed to get my file on site (using MechanicalSoup or another module)?
01/04/19 Edit
Even though I did not manage to get #Rolando Urquiza's answer to work on my machine, I was able to get the thing done from his suggestions.
import mechanicalsoup
browser = mechanicalsoup.Browser()
homePage = browser.get("http://cental.fltr.ucl.ac.be/treetagger/")
formPart = homePage.soup.select("form[name=treetagger_form]")[0]
form=mechanicalsoup.Form(formPart)
form.set('file_to_tag', 'test.txt')
upload=browser.submit(form,url="http://cental.fltr.ucl.ac.be/treetagger/")
Thanks #Rolando Urquiza
According to the documentation of MechanicalSoup, you can upload a file using the set function on a mechanicalsoup.Form instance, see here. For example, this is how you can use it:
import mechanicalsoup
browser = mechanicalsoup.StatefulBrowser()
browser.get("http://cental.fltr.ucl.ac.be/treetagger/")
form = browser.select_form()
form.set('file_to_tag', 'test.txt')
result = browser.submit_selected()
can anyone help me with "extracting" stuff from site using Python? Here is the info :
I have folder name with set of numbers (they are ID of item) and i have to use that ID for entering page and then "scrap" info from page to my notepad... It's like this : http://www.somesite.com/pic.mhtml?id=[ID]... I need to exctract picture link (picture link always have ID.jpg at the end of the file)from it and write it in notepad and then replace that txt name with name of the picture... Picture is always in title tags... Thanks in advance...
What you need is a data scraper - http://www.crummy.com/software/BeautifulSoup/ will help you pull data off of websites. You can then load that data into a variable, write it to a file, or do anything you normally do with data.
You could try parsing the html source for images.
Try something similar:
class Parser(object):
__rx = r'(url|src)="(http://www\.page\.com/path/?ID=\d*\.(jpeg|jpg|gif|png)'
def __crawl(self, url):
images = []
code = urllib.urlopen(url).read()
for line in code.split('\n'):
imagesearch = re.search(self.__rx, line)
if imagesearch:
image = '%s.%s' % (imagesearch.group(2), imagesearch.group(4))
images.append(image)
return images
it's untestet, you may want to check the regex
Sorry if this question is dumb. I converted a jqplot image to data-url and send it with a form. In the new page, I used cgi.FieldStorage() to get the submitted data-url, but got nothing. So could anyone give me some suggestions on my approach? Thanks!
I am using Python on Google App Engine
Input page Javascript
var imgData = $('#chart1').jqplotToImageStr({});
$('<tr style="display:none"><td><input type="hidden" name="extract1"></td></tr>')
.appendTo('.getpdf')
.find('input')
.data(imgData);
Output page:
def post(self):
form = cgi.FieldStorage()
extract1 = form.getvalue('extract1') #extract1 is empty
I tried to print the form and it looked as:MiniFieldStorage('extract1', '"
If I assigned the data-url (...), it worked.
I have (working) code that does something similar.
def post(self):
img_data = self.request.get('extract1')
# then strip off the prefix and convert from base64
is what I'm doing. Poking into cgi.FieldStorage isn't something I normally see done in Python App Engine apps.