How to set a text in a textarea by using Mechanical Soup? - python

I'm learning to create an Omegle bot, but the Omegle interface is written in HTML and I don't know much about HTML or MechanicalSoup.
In the part where the text is inserted, the code snippet is as follows:
<td class="chatmsgcell">
<div class="chatmsgwrapper">
<textarea class="chatmsg " cols="80" rows="3"></textarea>
</div>
</td>
In the part of the button to send the text, the code snippet is:
<td class="sendbthcell">
<div class="sendbtnwrapper">
<button class="sendbtn">Send<div class="btnkbshortcut">Enter</div></button>
</div>
</td>
I want to set text in the textarea and send it with the button.
Looking at some HTML examples, I guess the correct way to set text in a textarea is as follows:
<textarea>Here's a text.</textarea>
Also, I'm new to MechanicalSoup, but I think I know how to find and set a value in HTML code:
# example in the Twitter interface
login_form = login_page.soup.find("form", {"class": "signin"})
LOGIN = "yourlogin"
login_form.find("input", {"name": "session[username_or_email]"})["value"] = LOGIN
From what I understand, the first argument is the name of the tag and the second argument is a dictionary mapping an attribute name to the attribute value to match.
But the textarea tag doesn't have an attribute for setting its text, like value="Here's a text.". What should I do to set text in a textarea using MechanicalSoup?

I know it's not the answer you expect, but reading the doc would help ;-).
The full documentation is available at:
https://mechanicalsoup.readthedocs.io/
You probably want to start with the tutorial:
https://mechanicalsoup.readthedocs.io/en/stable/tutorial.html
In short, you need to select the form you want to fill-in:
browser.select_form('form[action="/post"]')
Then, filling-in fields is as simple as
browser["custname"] = "Me"
browser["custtel"] = "00 00 0001"
browser["custemail"] = "nobody@example.com"
browser["comments"] = "This pizza looks really good :-)"
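
On the textarea part of the question: a `<textarea>` keeps its text as element content rather than in a `value` attribute, so on the parsed soup the text can be set through Beautiful Soup's `.string`. Here is a minimal sketch, assuming the markup from the question is parsed directly with Beautiful Soup (the `html` variable is illustrative). This only edits the local copy of the page, and a textarea without a `name` attribute is not sent with an ordinary form submission.

```python
from bs4 import BeautifulSoup

# Markup copied from the question; parsed locally for illustration.
html = """
<td class="chatmsgcell">
<div class="chatmsgwrapper">
<textarea class="chatmsg " cols="80" rows="3"></textarea>
</div>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# Beautiful Soup treats class as a list of tokens, so the trailing
# space in "chatmsg " does not affect the match.
textarea = soup.find("textarea", {"class": "chatmsg"})

# A textarea has no value attribute: its text is the element's content.
textarea.string = "Here's a text."

print(textarea)
```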

Related

Adding a variable string to an expression in Airium (Python)

I'm working on a little jig that generates a static gallery page based on a folder full of images. My current hangup is generating the HTML itself.
I used Airium to reverse-translate my existing HTML to Airium's python code, and added the variables I want to modify for each anchor tag in a loop. But I can't for the life of me figure out how to get it to let me add 'thumblink'. I'm not sure why it's treating it so differently from the others, my guess is that Airium expects foo:bar but not foo:bar(xyz) with xyz being the only part I want to pull out and modify.
from airium import Airium
imagelink = "image name here" # after pulling image filename from list
thumblink = "thumb link here" # after resizing image to thumb size
artistname = "artist name here" # after extracting artist name from filename
a = Airium()
with a.a(href='javascript:void(0);', **{'data-image': imagelink}):
    with a.div(klass='imagebox', style='background-image:url(images/2015-12-29kippy.png)'):
        a.div(klass='artistname', _t=artistname)
html = str(a) # cast to string
print(html) # print to console
where "images/2015-12-29kippy.png" is what I'd replace with string variable "thumblink".
image and artist do translate correctly in the output after testing -
<a href="javascript:void(0);" data-image="image name here">
<div class="imagebox" style="background-image:url(images/2015-12-29kippy.png)">
<div class="artistname">artist name here</div>
</div>
</a>
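
One way to get `thumblink` into the style attribute is to build the whole `style` string in Python before handing it to Airium, since `style` is passed as an ordinary string. Here is a minimal sketch using an f-string. The helper name `background_style` is hypothetical, and the Airium call is shown only as a comment.

```python
def background_style(thumblink: str) -> str:
    """Build the CSS declaration for a background image at thumblink."""
    return f"background-image:url({thumblink})"


thumblink = "images/2015-12-29kippy.png"  # example value from the question
style = background_style(thumblink)
print(style)

# Hypothetical use inside the Airium block from the question:
#     with a.div(klass='imagebox', style=style):
```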

Is there a way to find the exact path of an element in the requests module in Python?

Is there a way to select the exact "div" in a source of a Beautiful Soup object? For example, let's say we have soup like this:
<div class="dialog-shadow" id="popupMenu1" onblur="hidePopup();" onmouseout="closePopup = contextMenuInputHasFocus() ? null : setTimeout('hidePopup()',500);" onmouseover="if(closePopup!=null){clearTimeout(closePopup);closePopup=null}"></div>
<div id="popupMenu2" onblur="hidePopup();" onmouseout="closePopup = contextMenuInputHasFocus() ? null : setTimeout('hidePopup()',500);" onmouseover="if(closePopup!=null){clearTimeout(closePopup);closePopup=null}"></div>
<div class="shadow" id="popupMenu3" onblur="hidePopup3();hidePopup();" onmouseout="closePopup = setTimeout('hidePopup();', 500); closePopup3 = setTimeout('hidePopup3()',500);" onmouseover="if(closePopup!=null){clearTimeout(closePopup);closePopup=null};if(closePopup3!=null){clearTimeout(closePopup3);closePopup3=null};"></div>
<div id="container">
<div class="background-menu-dark shadow" id="navHolder">
<span class="customBranding" id="logo" onclick="loadView(V_SUMMARY);" title="Özet Görünümü"><img height="40" src="Branding/SmallBanner.jpg?ts=20140403111116"/></span>
<div id="navigation">
<ul id="navigationLargeWidth">
<li id="mainInboxLink">
And I want to find the third div whose class is "shadow" in this piece of soup. But when I do something like this, it returns None:
soup.find('div',attrs={"class":"shadow"})
I know that it should be something like "ABC-->BC-->C" if I want to find C in the soup, but is there a way to find C just by knowing its unique class or ID?
(soup.select("div:nth-of-type(3)") is not what I'm looking for.)
I see only two divs with that class. Your nth-of-type selector could be failing because it does not include the class, unless there is some reason (which you haven't given) why nth-of-type itself is not acceptable.
div.shadow:nth-of-type(3)
Without proper HTML to test with, I cannot be sure of the index or whether the content is dynamically loaded (if it comes from a webpage).
If you are trying to dynamically construct the path then something like this?
For a div with a unique class
select_one('div.shadow')
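
For the "just by its unique ID" part, Beautiful Soup can look an element up by `id` without walking the path to it. Here is a minimal sketch, assuming a trimmed copy of the markup from the question (event-handler attributes and inner content dropped for brevity):

```python
from bs4 import BeautifulSoup

# Trimmed version of the question's markup (an assumption for illustration).
html = """
<div class="dialog-shadow" id="popupMenu1"></div>
<div id="popupMenu2"></div>
<div class="shadow" id="popupMenu3"></div>
<div id="container">
<div class="background-menu-dark shadow" id="navHolder"></div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Look the element up directly by its id.
by_id = soup.find(id="popupMenu3")

# The same element through a CSS selector combining id and class.
by_css = soup.select_one("div#popupMenu3.shadow")

# Class matching works per token: "dialog-shadow" is not "shadow".
shadow_ids = [div["id"] for div in soup.find_all("div", class_="shadow")]

print(by_id["id"], shadow_ids)
```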

XPath delivering blank text

I am trying to pull the text out of a tag that follows an element I'm starting with. The HTML looks like this, with multiple entries of the same structure:
<h5>
Title
</h5>
<div class="author">
<p>"Author A, Author B"</p>
</div>
<div id="abstract-more#####" class="collapse">
<p>
<strong>Abstract:</strong>
"Text here..."
</p>
<p>...</p>
So once I've isolated a given title element/node (stored as 'paper'), I want to store the author and abstract text. It works when I use this to get the author:
author = paper.find_element_by_xpath("./following::div[contains(@class, 'author')]/p").text
But it returns blank output for 'abstract' when I use this:
abstract = paper.find_element_by_xpath("./following::div[contains(@id, 'abstract-more')]/p").text
Why does it work fine for the author but not for the abstract? I've tried using .// instead of ./ and other slight tweaks but to no avail. I also don't know why it's not giving an error out and saying it can't find the abstract element and is instead just returning a blank...
Try this:
//div[contains(@id, 'abstract-more')]/p[1]
Please use starts-with in the XPath instead of contains.
XPath: .//div[starts-with(@id, 'abstract-more')]/p
abstract = paper.find_element_by_xpath(".//div[starts-with(@id, 'abstract-more')]/p").text
You can try this XPath:
//div[@class="author"]/following-sibling::div[contains(@id,'abstract-more')]/p[1]
In code:
abstract = paper.find_element_by_xpath("//div[@class='author']/following-sibling::div[contains(@id,'abstract-more')]/p[1]")
print(abstract.text)

Python Mechanize login form, sending input to a field with a randomly generated name

I'm trying to automate the login to a site, http://www.tthfanfic.org/login.php.
The problem I am having is that the password field has a randomly generated name. I have tried using its label, type and id, all of which remain static, but to no avail.
Here is the HTML of the form:
<tr>
<th><label for="urealname">User Name</label></th>
<td><input type='text' id='urealname' name='urealname' value=''/> NOTE: Your user name may not be the same as your pen name.</td>
</tr>
<tr>
<th><label for="password">Password</label></th><td><input type='password' id='password' name='e008565a17664e26ac8c0e13af71a6d2'/></td>
</tr>
<tr>
<th>Remember Me</th><td><input type='checkbox' id='remember' name='remember'/>
<label for="remember">Log me in automatically for two weeks on this computer using a cookie. </label> Do not select this option if this is a public computer, or you have an evil sibling.</td>
</tr>
<tr>
<td colspan='2' style="text-align:center">
<input type='submit' value='Login' name='loginsubmit'/>
</td>
</tr>
I've tried to format that for readability but it still looks bad; consider checking the code on the supplied page.
Here is the code I get when printing the form through mechanize:
<POST http://www.tthfanfic.org/login.php application/x-www-form-urlencoded
<HiddenControl(ctkn=a40e5ff08d51a874d0d7b59173bf3d483142d2dde56889d35dd6914de92f2f2a) (readonly)>
<TextControl(urealname=)>
<PasswordControl(986f996e16074151964c247608da4aa6=)>
<CheckboxControl(remember=[on])>
<SubmitControl(loginsubmit=Login) (readonly)>>
The number sequence in the PasswordControl is the part that changes each time I reload the page. In the HTML from the site the field has several other attributes, but none of them work when I try to select by them, or else I'm doing it incorrectly.
Here is the code I am using to try and select the control by label:
fieldTwo = br.form.find_control(label='password')
br[fieldOne] = identifier
br[fieldTwo] = password
I can post the rest of my login code if necessary, but this is the only part that is not working. I have had success with other sites where the password name remains the same.
So, is it possible for me to select the PasswordControl using its label, type or ID, or do I need to scrape its name?
EDIT: Oops, forgot to add the error message:
raise ControlNotFoundError("no control matching "+description)
mechanize._form.ControlNotFoundError: no control matching label 'password'
SOLVED:
Solution given by a guy on reddit, thanks Bliti.
Working code:
br.select_form(nr=2)
list = []
for f in br.form.controls:
    list.append(f.name)
fieldTwo = list[2]
Solution given by a guy on reddit, thanks Bliti.
Working code:
# Select the form you want to use.
br.select_form(nr=2)
list = []
for f in br.form.controls:
    # Add the name of each item in br.form.controls.
    list.append(f.name)
# Select the correct one from the list.
fieldTwo = list[2]
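
If scraping the name turns out to be necessary, the position-based lookup above can be swapped for one keyed on the input's type, which does not depend on control order. Here is a minimal sketch using only the standard library's `html.parser`, not mechanize. The class name `PasswordNameFinder` is hypothetical, and the sample markup is the password row from the question.

```python
from html.parser import HTMLParser


class PasswordNameFinder(HTMLParser):
    """Record the name attribute of the first <input type="password">."""

    def __init__(self):
        super().__init__()
        self.password_name = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input .../> also arrive here, because
        # the default handle_startendtag delegates to handle_starttag.
        attributes = dict(attrs)
        if (
            tag == "input"
            and attributes.get("type") == "password"
            and self.password_name is None
        ):
            self.password_name = attributes.get("name")


# Password row copied from the form in the question.
html = (
    "<tr><th><label for=\"password\">Password</label></th>"
    "<td><input type='password' id='password' "
    "name='e008565a17664e26ac8c0e13af71a6d2'/></td></tr>"
)

finder = PasswordNameFinder()
finder.feed(html)
print(finder.password_name)
```

The scraped name could then serve as the key in the same way `fieldTwo` is used in the question, for example `br[finder.password_name] = password`.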

Selecting an unnamed text field in a mechanize form (python)

So I'm making a program to batch convert street addresses to GPS coordinates using mechanize and Python. This is my first time using mechanize. I can select the form ("Form2") on the page. However, the text box in the form has no name. How do I select the text box so that mechanize can enter my text? I've tried selecting it by its id, but that does not work.
br.select_form("Form2") #works as far as i know
br.form["search"] = ["1 lakewood drive, christchurch"] #this is the field that i cannot select
and here is the source code from the website.
<form name="Form2" >
or Type an <b>Address</b>
<input id="search" size="40" type="text" value="" >
<input type="button" onClick="EnteredAddress();" value="Enter" />
</form>
any help would be much appreciated.
form.find_control(id="search") ?
FWIW I solved this by using the above answer by lazy1, except that I was trying to assign a value directly to the result of the find_control method. That didn't work, of course, because you cannot assign to a function call. I looked deeper into the method and found setattr(), and that worked great for assigning a value to the field.
will not work
br.form.find_control(id="field id here") = "new value here"
will work
br.form.find_control(id="field id here").__setattr__("value", "new value here")
