File upload through python mechanize - python

I am trying to upload an image file through a web form using mechanize.
No error is raised, but the uploaded file does not appear when I check the page manually in the browser after submitting/saving.
I am using the following code to upload the file:
import mechanize as mc
br = mc.Browser()
br.set_handle_robots(False)
br.select_form(nr=0)
br.form.add_file(open("test.png"), content_type="image/png",
                 filename='before', name="ctl00$ContentPlaceHolder1$fileuploadBeforeimages")
br.submit("ctl00$ContentPlaceHolder1$cmdSave")
# This is supposed to save the form on the web page. It saves the text in the other fields, but the image does not show up.
The add_file call seems to work. I can confirm this because when I print br.forms()[0] the file details show up (<FileControl(ctl00$ContentPlaceHolder1$fileuploadBeforeimages=before)>).
But there is no sign of the image file after this snippet runs. I have looked at several examples that call br.submit() without a specific button control, but when I do that nothing is saved on the website at all.
What am I missing?
Thanks in advance.
EDIT
When I upload the file manually, a pop-up asks for confirmation. In the inspector, the file input carries this attribute:
onchange="if (confirm('Upload ' + this.value + '?')) this.form.submit();"
I am not sure whether this is JavaScript that mechanize cannot get past for the upload. Can someone confirm?

Open the file in binary mode by passing 'rb' as the second argument to open(), like this:
br.form.add_file(open("test.png", 'rb'), 'image/png', filename, name='file')
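As a quick sanity check before uploading, the sketch below confirms that a file read in binary mode starts with the 8-byte PNG signature. The helper name looks_like_png is hypothetical, and the demo writes its own small throwaway file rather than using a real image.

```python
import os
import tempfile

# First eight bytes of every PNG file, per the PNG specification.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(path):
    """Return True if the file at `path` starts with the PNG signature."""
    # 'rb' returns the raw bytes; text mode may decode or translate them.
    with open(path, "rb") as fh:
        return fh.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


# Demo with a throwaway file (made-up content, not a valid image).
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(PNG_SIGNATURE + b"not real image data")
    demo_path = tmp.name

is_png = looks_like_png(demo_path)
os.remove(demo_path)
```

If a check like this fails for a file you know is a PNG, the bytes were probably altered before they reached the form.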

Related

How to upload file using mechanize python?

I need to upload a file(txt) in a form like this:
the structure of the add button is as follows:
after upload I have to send the file with the send button:
the structure of the send button is as follows:
I searched and found only examples like this:
browser.select_form(name = 'formForm')
browser.form.add_file(open(directory))
response = browser.submit()
but I didn't succeed. If anyone can help, thank you very much.
This is what you're looking for:
br.form.add_file(open(filename), 'text/plain', filename)
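If the content type should follow the file extension instead of being hard-coded, the standard-library mimetypes module can guess it. This is only a sketch: guess_content_type is a hypothetical helper, and the fallback value is my own choice.

```python
import mimetypes


def guess_content_type(filename, default="application/octet-stream"):
    """Guess a MIME type from the file extension, falling back to `default`."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default


# Possible use with mechanize (not executed here):
# br.form.add_file(open(filename, "rb"), guess_content_type(filename), filename)
```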

Download a xlsx file by clicking a website button using Python

I'm writing a Python script that creates a COVID-19 dashboard for my country and state and updates it daily.
However, I am struggling to download one of the necessary files.
Basically to download the file I have to access the website (https://covid.saude.gov.br/) and click on a button (class="btn-white md button button-solid button-has-icon-only ion-activatable ion-focusable hydrated ion-activated").
I tried downloading via the download link, but the site generates a different link every time you click the button, and the link is a blob URL (blob: comes before the http part).
I am very grateful to anyone who tries to help, because the data will be used to monitor the progress of the disease here where I live.
You can use their API to get the file name:
import requests
headers = {
'authority':'xx9p7hp1p7.execute-api.us-east-1.amazonaws.com',
'x-parse-application-id':'unAFkcaNDeXajurGB7LChj8SgQYS2ptm',
}
with requests.Session() as session:
    session.headers.update(headers)
    resp = session.get('https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeral').json()
    path = resp['results'][0]['arquivo']['url']
The x-parse-application-id doesn't seem to change. If it does, you can get the correct one by querying https://xx9p7hp1p7.execute-api.us-east-1.amazonaws.com/prod/PortalGeralApi and extracting it from ['planilha']['arquivo'][url].
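Since the response layout could change, it may be safer to pull the URL out defensively. A minimal sketch, assuming the JSON keeps the results/arquivo/url nesting used above; extract_file_url is a hypothetical helper and the sample payload is made up for illustration.

```python
def extract_file_url(payload):
    """Return payload['results'][0]['arquivo']['url'], or None if the shape differs."""
    try:
        return payload["results"][0]["arquivo"]["url"]
    except (KeyError, IndexError, TypeError):
        return None


# Made-up payload that only mimics the nesting used in the answer.
sample = {"results": [{"arquivo": {"url": "https://example.com/file.xlsx"}}]}
found = extract_file_url(sample)
missing = extract_file_url({"results": []})
```

A None result is then a signal to inspect the raw response rather than let a KeyError surface later.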

Logging in to website to access data using Python

I have a subscription to the site https://www.naturalgasintel.com/ for daily feeds of data that show up on their site directly as .txt files; their user login page being https://www.naturalgasintel.com/user/login/
For example a file for today's feed is given by the link https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/2019/01/20190104td.txt and shows up on the site like the picture below:
What I'd like to do is to log in using my user_email and user_password and scrape this data in the form of an Excel file.
When I use Twill to try and 'point' me to the data by first logging me into the site I use this code:
from email.mime.text import MIMEText
from subprocess import Popen, PIPE
import twill
from twill.commands import *
# NOW is assumed to be a date string such as "2019-01-04" (its definition is not shown)
year = NOW[0:4]
month = NOW[5:7]
day = NOW[8:10]
date = year + month + day
path = "https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/"
end = "td.txt"
go("http://www.naturalgasintel.com/user/login")
fv("2", "user[email]", user_email)
fv("2", "user[password]", user_password)
fv("2", "commit", "Login")
datafilelocation = path + year + "/" + month + "/" + date + end
go(datafilelocation)
However, logging in from the user login page sends me to this referrer link when I go to the data's location.
https://www.naturalgasintel.com/user/login?referer=%2Fext%2Fresources%2FData-Feed%2FDaily-GPI%2F2019%2F01%2F20190104td.txt
Rather than:
https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/2019/01/20190104td.txt
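To make the two URLs above easier to tell apart in a script, here is a small standard-library sketch (Python 3, unlike the Python 2.7 Twill code). It builds the expected feed URL from a date and reports whether a final URL is the login page with a referer parameter. Both function names are hypothetical.

```python
from datetime import date
from urllib.parse import parse_qs, urlsplit

BASE = "https://naturalgasintel.com/ext/resources/Data-Feed/Daily-GPI/"


def build_feed_url(day):
    """Build the daily feed URL, e.g. .../2019/01/20190104td.txt."""
    return BASE + day.strftime("%Y/%m/%Y%m%d") + "td.txt"


def login_redirect_target(url):
    """If `url` is the login page with a referer parameter, return that
    referer path; otherwise return None."""
    parts = urlsplit(url)
    if not parts.path.rstrip("/").endswith("/user/login"):
        return None
    values = parse_qs(parts.query).get("referer")
    return values[0] if values else None


feed_url = build_feed_url(date(2019, 1, 4))
bounced = login_redirect_target(
    "https://www.naturalgasintel.com/user/login"
    "?referer=%2Fext%2Fresources%2FData-Feed%2FDaily-GPI%2F2019%2F01%2F20190104td.txt"
)
```

A non-None result from login_redirect_target means the session was not treated as logged in when the file was requested.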
I've tried using modules like requests as well to log in from the site and then access this data but whatever method I use sends me to the HTML source rather than the .txt data location itself.
I've posted my complete walk-through with the Python 2.7 module Twill which I attached a bounty to here:
Using Twill to grab .txt from login page Python
What would be the best way to access these password-protected files?
If you have a compatible version of Firefox, get the plugin javascript 0.0.1 by Chee and add the following to run on the page:
document.getElementById('user_email').value = "E-What";
document.getElementById('user_password').value = " ABC Password ";
Change the email and password as you like. It will load the page, then after that it will put in your username and password.
There are other ways to do this all by yourself with your own stand-alone process. You do not have to download other people's programs and try to learn them (beyond this little thing) if you change it this way.
I would have up voted this question.

Python Mechanize write to TinyMCE Text Editor

I'm using Python Mechanize for adding an event to WordPress but I can't seem to figure out how to write to the TinyMCE Editor in the 'Add New' Event section.
I've been able to make a draft so far by just setting the Title with some value for testing purposes but I am stuck here. What I've done so far is...
br = mechanize.Browser()
response = br.open(url)
# ... intermediate steps to get to the correct page, which don't need to be listed ...
Once on the correct page I choose the form that I want to work with, select it and set the title. Once I submit I can actually travel to my drafts section in my normal chrome/firefox browser to see a draft has been created.
for f in br.forms():
    if f.name == postForm:
        print(f)
        br.select_form(f.name)
br.form['post_title'] = 'Creating from MECHANIZE'
br.submit(name='save', label='Save Draft')
What would be the intermediary steps to input data into the TinyMCE editor?
I realized that by writing:
br.form['content'] = "some content"
you can write to the editor's underlying textarea. Any HTML content that you put in a triple-double-quoted string will show up as intended once you submit the post.

Upload a text file to a site for analysis using MechanicalSoup

I'm trying to get a text file to TreeTagger Online to get it analyzed and get the link to the resulting file to download.
import mechanicalsoup
browser = mechanicalsoup.Browser()
homePage = browser.get("http://cental.fltr.ucl.ac.be/treetagger/")
formPart = homePage.soup.select("form[name=treetagger_form]")[0]
formPart.select("[name=file_to_tag]")[0]["name"]=open('test.txt', 'rb')
result = browser.post(formPart, homePage.url)
This gives me the following error:
: (, UnicodeEncodeError('ascii', u'No connection adapters were found for \'\n\n\n\n\n Texte \xe0 \xe9tiqueter : \n \n\n\n\n\n\n\n\n\n\n\n\n\n\n\'', 216, 217, 'ordinal not in range(128)'))
How should I proceed to get my file on site (using MechanicalSoup or another module)?
01/04/19 Edit
Even though I did not manage to get @Rolando Urquiza's answer to work on my machine, I was able to get it done based on his suggestions.
import mechanicalsoup
browser = mechanicalsoup.Browser()
homePage = browser.get("http://cental.fltr.ucl.ac.be/treetagger/")
formPart = homePage.soup.select("form[name=treetagger_form]")[0]
form=mechanicalsoup.Form(formPart)
form.set('file_to_tag', 'test.txt')
upload=browser.submit(form,url="http://cental.fltr.ucl.ac.be/treetagger/")
Thanks @Rolando Urquiza
According to the MechanicalSoup documentation, you can upload a file using the set method of a mechanicalsoup.Form instance. For example, this is how you can use it:
import mechanicalsoup
browser = mechanicalsoup.StatefulBrowser()
browser.open("http://cental.fltr.ucl.ac.be/treetagger/")  # open(), not get(), so select_form() has a current page
form = browser.select_form()
form.set('file_to_tag', 'test.txt')
result = browser.submit_selected()
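One caveat, as far as I recall: newer MechanicalSoup releases (1.0 and later) expect an open file object rather than a path string as the value of a file input. The sketch below is a variant under that assumption. The function name upload_for_tagging is hypothetical, the form and field names are taken from the question, and nothing here has been run against the live site.

```python
def upload_for_tagging(path, url="http://cental.fltr.ucl.ac.be/treetagger/"):
    """Open the TreeTagger page, attach the file at `path`, and submit the form."""
    # Third-party package; imported here so the definition loads without it.
    import mechanicalsoup

    browser = mechanicalsoup.StatefulBrowser()
    browser.open(url)
    form = browser.select_form('form[name="treetagger_form"]')
    # Keep the file open until the request has been sent.
    with open(path, "rb") as fh:
        form.set("file_to_tag", fh)
        return browser.submit_selected()
```

Calling upload_for_tagging('test.txt') should return the response from the submission, which can then be searched for the link to the result file.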
