Read html table correctly - python

Here is an HTML table:
<table width="100%" cellpadding="4" cellspacing="0" style="page-break-before: always">
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<col width="32*"/>
<tr valign="top">
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">A</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">B</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">C</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">D</font></font></font></p>
</td>
</tr>
<tr valign="top">
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">E</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">F</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">G</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">H</font></font></font></p>
</td>
</tr>
<tr valign="top">
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">I</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">J</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">K</font></font></font></p>
</td>
<td colspan="2" width="25%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">L</font></font></font></p>
</td>
</tr>
<tr valign="top">
<td width="12%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">M</font></font></font></p>
</td>
<td width="13%" style="background: transparent" style="border: none; padding: 0cm"><p lang="ru-RU" align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">M2</font></font></font></p>
</td>
<td width="12%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">N</font></font></font></p>
</td>
<td width="13%" style="background: transparent" style="border: none; padding: 0cm"><p lang="ru-RU" align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">N2</font></font></font></p>
</td>
<td width="12%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">O</font></font></font></p>
</td>
<td width="13%" style="background: transparent" style="border: none; padding: 0cm"><p lang="ru-RU" align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">O2</font></font></font></p>
</td>
<td width="12%" style="background: transparent" style="border: none; padding: 0cm"><p align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">P</font></font></font></p>
</td>
<td width="13%" style="background: transparent" style="border: none; padding: 0cm"><p lang="ru-RU" align="left" style="font-variant: normal; font-style: normal; font-weight: normal; text-decoration: none">
<font color="#000000"><font face="Liberation Serif, serif"><font size="3" style="font-size: 12pt">P2</font></font></font></p>
</td>
</tr>
</table>
The last row has twice as many columns as the others. When I read it into a pandas DataFrame, I get this result:
table = pd.read_html('1111.html')
table[0]
   0   1  2   3  4   5  6   7
0  A   A  B   B  C   C  D   D
1  E   E  F   F  G   G  H   H
2  I   I  J   J  K   K  L   L
3  M  M2  N  N2  O  O2  P  P2
How can I read it correctly, without the duplicated values? I don't need the last row.

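You can use BeautifulSoup to parse the table and then convert the results to a DataFrame. pd.read_html repeats a cell's value across every column its colspan covers, which is where the duplicates come from; reading the cells directly avoids that:
import pandas as pd
from bs4 import BeautifulSoup as soup
with open('1111.html') as f:  # the file from the question
    html = f.read()
# [1:-1] strips the newline on each side of every cell's text in this markup
df = pd.DataFrame([[k[1:-1] for i in b.find_all('td') if (k:=i.text) is not None] for b in soup(html, 'html.parser').table.find_all('tr')])
Output:
   0   1  2   3     4     5     6     7
0  A   B  C   D  None  None  None  None
1  E   F  G   H  None  None  None  None
2  I   J  K   L  None  None  None  None
3  M  M2  N  N2     O    O2     P    P2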
Edit: solution without assignment expression:
df = pd.DataFrame([[i.text[1:-1] if i else i for i in b.find_all('td')] for b in soup(html, 'html.parser').table.find_all('tr')])
Output:
   0   1  2   3     4     5     6     7
0  A   B  C   D  None  None  None  None
1  E   F  G   H  None  None  None  None
2  I   J  K   L  None  None  None  None
3  M  M2  N  N2     O    O2     P    P2
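The output above still contains the last row, which the question asks to drop. One possible way to handle that, sketched here under the assumption that unwanted rows can be recognised by having a different number of cells than the first row, is to filter the rows before building the DataFrame. The function name `rows_like_first` and the cut-down `sample` markup are made up for this illustration.

```python
from bs4 import BeautifulSoup


def rows_like_first(html):
    """Return the cell text of every row with as many cells as the first row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = [
        [td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in soup.table.find_all("tr")
    ]
    if not rows:
        return []
    width = len(rows[0])  # assumption: the first row has the "normal" shape
    return [row for row in rows if len(row) == width]


# Hypothetical, cut-down version of the table from the question.
sample = """
<table>
<tr><td colspan="2"><p>A</p></td><td colspan="2"><p>B</p></td></tr>
<tr><td colspan="2"><p>E</p></td><td colspan="2"><p>F</p></td></tr>
<tr><td><p>M</p></td><td><p>M2</p></td><td><p>N</p></td><td><p>N2</p></td></tr>
</table>
"""

kept = rows_like_first(sample)
print(kept)
```

Passing `kept` to `pd.DataFrame` should then give a frame without the extra row and without the `None` padding columns.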

Related

create a dictionary or dataframe out of html that has no class keywords

I get transaction emails from my bank every time I make a transaction. They arrive as HTML. I want to extract certain information like confirmation_number, date, amount, etc. from the HTML content.
I tried regex extraction and also BeautifulSoup, but the results are ugly and unwieldy. For example, the HTML doesn't come with any useful attributes, so it's not easy to do a find() with an attribute filter. See the snippet of HTML below:
<table style="border: 1px solid black; border-collapse: collapse">
<tbody>
<tr>
<td colspan="2" style="border:1px solid black;padding:3px">
<center>
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
<b>
Transfer Money Details
</b>
</font>
</center>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Confirmation Number
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
1594379907846
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Transaction Date and Time
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Friday, Jul 10 2020; 07:18:54 PM (GMT +8)
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Transfer From
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
XXXX-XXX-247 (PESO SAVINGS)
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Transfer To
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
XXXX-XXX-545
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Amount
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
PHP 1,200.00
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Service Fee
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
PHP 0.00
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Total Amount
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
PHP 1,200.00
</font>
</td>
</tr>
<tr>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Notes
</font>
</td>
<td style="border: 1px solid black; padding: 3px">
<font color="#000000" face="arial" style="FONT-SIZE:10pt">
Mask filters
</font>
</td>
</tr>
</tbody>
</table>
I want to be able to have a dataframe or a dictionary that looks like this:
{
'Confirmation Number': '1594379907846',
'Transaction Date and Time': 'Friday, Jul 10 2020; 07:18:54 PM (GMT +8)',
'Transfer From': 'XXXX-XXX-247 (PESO SAVINGS)'
... and so on
}
The code I have:
from bs4 import BeautifulSoup

def get_content(html_content):
    soup = BeautifulSoup(html_content, 'html.parser')
    rows = soup.find_all('tr')
    content_ls = []
    trans_details = {}
    for row in rows:
        cells = row.findChildren('td')
        for cell in cells:
            content_ls.append(cell.getText())
    trans_details['Confirmation Number'] = content_ls[2]
    trans_details['Date_Time'] = content_ls[4]
    trans_details['From'] = content_ls[6]
    trans_details['To'] = content_ls[8]
    trans_details['Amount'] = content_ls[10]
    trans_details['Notes'] = content_ls[12]
    return trans_details
produces this dictionary:
{'Amount': 'PHP 1,200.00',
'Confirmation Number': '1594379907846',
'Date_Time': 'Friday, Jul 10 2020; 07:18:54 PM (GMT +8)',
'From': 'XXXX-XXX-247 (PESO SAVINGS)',
'Notes': 'PHP 0.00',
'To': 'XXXX-XXX-545'}
Is there a more elegant and pythonic way of doing it?
Ultimately, I'd like to produce a DataFrame, with columns 'Confirmation Number', 'Transaction Date and Time', and so on.
Thanks
You can use the lxml library, which lets you find elements with XPath.
Here is a function that extracts the information from the HTML you provided. The title row contains only one font element, so rows without a key/value pair are skipped:
from lxml import etree

def parse(html):
    root = etree.fromstring(html)
    result = dict()
    for tr in root.xpath("//tr"):
        fonts = tr.xpath(".//font")
        # The title row holds a single <font>; skip it.
        if len(fonts) < 2:
            continue
        key = fonts[0].text.strip()
        value = fonts[1].text.strip()
        result[key] = value
    return result
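If lxml is not installed, the same row-by-row idea can be sketched with the standard library's `xml.etree.ElementTree` instead. This is a swapped-in parser, not the one used above, and it only works under the assumption that the email fragment is well-formed XML, as the sample in the question happens to be. The names `parse_details` and `sample` are made up for this illustration.

```python
import xml.etree.ElementTree as ET


def parse_details(html):
    """Map the first cell of each two-cell row to the second cell."""
    root = ET.fromstring(html.strip())
    details = {}
    for tr in root.iter("tr"):
        # Join all text inside each <td> and collapse runs of whitespace.
        cells = [" ".join("".join(td.itertext()).split()) for td in tr.iter("td")]
        if len(cells) == 2:  # skips the single-cell title row
            key, value = cells
            details[key] = value
    return details


# Hypothetical, shortened version of the bank email table.
sample = """
<table style="border: 1px solid black">
<tbody>
<tr><td colspan="2"><center><font><b>Transfer Money Details</b></font></center></td></tr>
<tr><td><font>
Confirmation Number
</font></td><td><font>
1594379907846
</font></td></tr>
<tr><td><font>Amount</font></td><td><font>PHP 1,200.00</font></td></tr>
</tbody>
</table>
"""

details = parse_details(sample)
print(details)
```

Since the question ultimately asks for a DataFrame, wrapping the dictionary in a list, as in `pd.DataFrame([details])`, should give one row per email with the field names as columns.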

Scraping specific td from table python

I have this piece of HTML that I want to scrape from a table:
<tr id="vsViewer1_dgMainView_dgMainView_ctl02" class="GridItem odd">
<td class=" ">
<a class="hlPopup" id="lbdgMainView$ctl02" name="lbdgMainView$ctl02" onclick="wrjl_test(this,'lbdgMainView$ctl02','746402:O9oY58XKE+w=:746402:746402')" onmouseover="this.className='HLPopupOver'" onmouseout="this.className='HLPopup'"></a>
<span class="HLPopup" id="lbldgMainView$ctl02" name="lbldgMainView$ctl02" onclick="wrjl_test(this,'lbldgMainView$ctl02','746402:O9oY58XKE+w=:746402:746402')"> Info </span>
</td>
<td align="center" class=" ">746402</td>
<td align="center" class=" ">Wyndham Orlando Resort International Drive</td>
<td align="center" class=" ">Interiano, Ana</td>
<td align="center" class=" ">Yes</td>
<td align="center" class=" ">7.32</td>
<td align="left" class=" ">
<table width="250" class="TextTableSmall" border="0">
<tbody>
<tr>
<td align="center" style="background-color: rgb(128, 128, 128); text-align: center; font-size: 8pt;">Date</td>
<td align="center" style="background-color: rgb(128, 128, 128); text-align: center; font-size: 8pt;">In</td>
<td align="center" style="background-color: rgb(128, 128, 128); text-align: center; font-size: 8pt;">Out</td>
<td align="center" style="background-color: rgb(128, 128, 128); text-align: center; font-size: 8pt;">Hours</td>
<td style="background-color: rgb(128, 128, 128); text-align: center; font-size: 8pt;">Shift</td>
</tr>
<tr>
<td style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">Thu 10/24/19</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">8:00am</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">1:20pm</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">5.33</td>
<td align="center" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">1
<br>FL ORL Wyndham Resort I Drive 18128 - Housekeeping
<br>Room Attendant
</td>
</tr>
<tr>
<td style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">Thu 10/24/19</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">1:39pm</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">3:38pm</td>
<td align="right" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">1.98</td>
<td align="center" style="background-color: rgb(204, 204, 153); text-align: left; font-size: 8pt;">1
<br>FL ORL Wyndham Resort I Drive 18128 - Housekeeping
<br>Room Attendant
</td>
</tr>
</tbody>
</table>
</td>
<td align="right" class=" ">12.25</td>
<td class=" ">9.0000</td>
<td align="center" class=" ">1</td>
<td align="center" class=" ">Housekeeper</td>
<td align="center" class=" ">HOUSEKEEPER</td>
<td align="center" class=" ">SE-FL-Orlando</td>
<td align="center" class=" ">Wyndham Hotel Group</td>
</tr>
I've done this:
from bs4 import BeautifulSoup
import requests
with open('vsShowViewTWO.html') as html_file:
    soup = BeautifulSoup(html_file,'lxml')
tbody = soup.find('tbody',id='thetbody')
table_rows=tbody.find_all('tr')
for tr in table_rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
and the results are:
[' Info ', '746402', 'Resort International', 'Interiano, Ana', 'Yes', '7.32', 'DateInOutHoursShiftThu 10/24/198:00am1:20pm5.331Resort I Drive 18128 - HousekeepingRoom AttendantThu 10/24/191:39pm3:38pm1.981Resort I Drive 18128 - HousekeepingRoom Attendant', 'Date', 'In', 'Out', 'Hours', 'Shift', 'Thu 10/24/19', '8:00am', '1:20pm', '5.33', '1Resort I Drive 18128 - HousekeepingRoom Attendant', 'Thu 10/24/19', '1:39pm', '3:38pm', '1.98', '1 Resort I Drive 18128 - HousekeepingRoom Attendant', '12.25', '9.0000', '1', 'Housekeeper', 'HOUSEKEEPER', 'SE', 'Hotel Group']
But I don't need the whole row, just the name "Interiano, Ana" and the last "HOUSEKEEPER". I've tried indexing the row variable, with no luck.
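Indexing is awkward here because `find_all('td')` also descends into the nested table, so the cells of the inner table get mixed into the outer row. A minimal sketch of one way around this, assuming the wanted cells always sit at fixed positions among the row's direct children, is to pass `recursive=False`. The helper `pick_cells` and the cut-down `sample` are made up for this illustration; the `thetbody` id is taken from the code in the question.

```python
from bs4 import BeautifulSoup


def pick_cells(html, indexes):
    """For each outer row, return the text of the direct-child cells at `indexes`."""
    soup = BeautifulSoup(html, "html.parser")
    tbody = soup.find("tbody", id="thetbody")
    picked = []
    # recursive=False keeps rows and cells of nested tables out of the result.
    for tr in tbody.find_all("tr", recursive=False):
        cells = tr.find_all("td", recursive=False)
        picked.append([cells[i].get_text(strip=True) for i in indexes])
    return picked


# Hypothetical, shortened version of the row from the question.
sample = """
<table><tbody id="thetbody">
<tr class="GridItem odd">
<td><span> Info </span></td>
<td>746402</td>
<td>Interiano, Ana</td>
<td><table><tbody><tr><td>Date</td><td>In</td></tr></tbody></table></td>
<td>Housekeeper</td>
<td>HOUSEKEEPER</td>
</tr>
</tbody></table>
"""

result = pick_cells(sample, (2, 5))
print(result)
```

In the full row shown in the question, the name and the last "HOUSEKEEPER" are the direct-child cells at positions 3 and 11, so `pick_cells(html, (3, 11))` would be the corresponding call.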

Regular expression to match field name value pairs from html

I'm trying to parse an HTML email from python code to extract various details and would appreciate a regular expression or two to help achieve this as it is too complex for my limited regex understanding. e.g. look for 'Travel Date' and extract 'October 30 2018 (Tue)'.
In all cases there is a field name contained within <td> tags followed by the field value contained within another set of <td> tags. Sometimes the name and value are contained within the same row <tr> tags (Case 1) and other times they are in separate row tags (Case 2). Other items like <span> and <img> need to be skipped over as well.
Case 1
<tr>
<td colspan="2"> </td></tr>
<tr><td style="vertical-align: top; font-size: 13px; font-family: Arial; color: #777777;">Travel Date</td>
<td style="vertical-align: top; font-size: 13px; font-family: Arial; color: #444444;">October 30 2018 (Tue)</td>
</tr>
Case 2
<tr><td style="vertical-align: top;">
<span style="font-size: 10px; font-family: Arial; color: #999999; font-weight: bold; line-height: 19px; text-transform: uppercase;">Drop-off to Address</span>
</td></tr>
<tr><td style="vertical-align: top;">
<span style="font-size: 13px; font-family: Arial; color: #444444;"><img style="vertical-align:text-bottom;" src="https://d1lk4k9zl9klra.cloudfront.net/Email/Common/address_icon.png" alt="" width="14" height="14" /> 200 George St, Sydney NSW 2000, Australia</span>
</td></tr>
Instead of using regex, I would use Beautiful Soup. It makes it easier to go through HTML elements and scrape what you need. If you know the relationship between the key and value, then you could use that to extract information. Here's an example for case 1:
In [8]: from bs4 import BeautifulSoup
In [9]: text = """
...: <tr>
...: <td colspan="2"> </td></tr>
...: <tr><td style="vertical-align: top; font-size: 13px; font-family: Arial; color:
#777777;">Travel Date</td>
...: <td style="vertical-align: top; font-size: 13px; font-family: Arial; color:
#444444;">October 30 2018 (Tue)</td>
...: </tr>"""
In [11]: soup = BeautifulSoup(text, 'lxml')
In [13]: soup.find_all('td')
Out[13]:
[<td colspan="2"> </td>,
<td style="vertical-align: top; font-size: 13px; font-family: Arial; color:
#777777;">Travel Date</td>,
<td style="vertical-align: top; font-size: 13px; font-family: Arial; color:
#444444;">October 30 2018 (Tue)</td>]
In [15]: for tag in soup.find_all('td'):
    ...:     if tag.text == "Travel Date":
    ...:         print(tag.find_next().text)
    ...:
October 30 2018 (Tue)
Beautiful Soup gives a lot of flexibility when scraping HTML from the web.
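The example above covers Case 1 only. A sketch that handles both cases, assuming the field value always sits in the first `<td>` that follows the field name's `<td>` in document order, could compare stripped text and jump to the next cell with `find_next('td')`. The function name `field_value` and the shortened `sample` are made up for this illustration.

```python
from bs4 import BeautifulSoup


def field_value(html, name):
    """Return the text of the <td> that follows the <td> whose text is `name`."""
    soup = BeautifulSoup(html, "html.parser")
    for td in soup.find_all("td"):
        # get_text(strip=True) ignores the newlines and <span>/<img> wrappers.
        if td.get_text(strip=True) == name:
            value_cell = td.find_next("td")
            return value_cell.get_text(strip=True) if value_cell else None
    return None


# Hypothetical, shortened markup combining Case 1 and Case 2.
sample = """
<table>
<tr><td colspan="2"> </td></tr>
<tr><td>Travel Date</td>
<td>October 30 2018 (Tue)</td></tr>
<tr><td>
<span>Drop-off to Address</span>
</td></tr>
<tr><td>
<span><img src="address_icon.png" alt="" /> 200 George St, Sydney NSW 2000, Australia</span>
</td></tr>
</table>
"""

print(field_value(sample, "Travel Date"))
print(field_value(sample, "Drop-off to Address"))
```

Because the lookup walks forward in document order, it does not matter whether the value cell is in the same `<tr>` as the name or in the next one.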

I have a calendar picker. How do I select an available day with Selenium and Python?

The available days have the class .calendarCellOpen:
table.calendario .calendarCellOpen input {
}
Here is the calendar CSS:
#calwrapper
{
min-height:230px;
margin-top:10px;
}
#calendar
{
float:left;
margin-left: 15px; /*Daniele 10-04-2014*/
}
span.calendario
{
display:block;
margin:0;
}
table.fasce
{
margin-left:20px;
}
table.fasce th
{
background-image: url( '../images/tab_body.png' );
background-repeat: repeat-x;
font-size:12px;
}
table.fasce tr
{
border-bottom: #f5f4e7 thin dotted;
}
table.calendario
{
border-top: 0px !important;
}
table.calendario, table.fasce
{
width: 300px;
background-color: White !important;
font-size: 15px;
border-right: #f5f4e7 1px solid !important;
border-left: #f5f4e7 1px solid !important;
border-bottom: #f5f4e7 1px solid !important;
}
table.calendario td, table.fasce td
{
text-align:center;
}
table.calendario .calTitolo
{
background-image: url( '../images/tab_body.png' );
background-repeat: repeat-x;
margin: 0px !important;
padding: 0px !important;
font-size:12px;
}
table.calendario .calTitolo td
{
padding:0px 5px 0px 5px;
width:14.3%;
}
table.calendario .calDayHeader /* RIGA */
{
background-color:#FCFBF7;
font-size:12px;
}
table.calendario .otherMonthDay
{
color: #C0C0C0;
}
table.calendario .cellaSelezionata /* CELLA */
{
background-color:#EDEBD5 !important;
border-collapse:collapse !important;
font-weight:bold;
}
table.calendario .calendarCellOpen input
{
color:#208020 !important; /*High availability (green)*/
font-weight:bold;
}
table.calendario .calendarCellRed
{
color:Red !important; /*noe availability*/
font-weight:bold;
}
table.calendario .calendarCellMed input
{
color:#F09643 !important; /*Disponibilità media*/
font-weight:bold;
}
.pulsanteCalendario
{
border: 0px;
background-color: Transparent;
cursor: pointer;
padding: 0px 0px 0px 0px;
margin: 0px;
height:20px;
width:100%;
overflow:visible;
text-align:center;
font-size:16px;
}
.pulsanteCalendario:hover
{
text-decoration:underline;
}
#legend
{
margin-bottom:8px;
width:100%;
}
#legend ul
{
list-style-type:none;
}
#legend ul li
{
display:inline;
margin-left:20px;
}
I want Selenium to click any day that is shown as available (green); it doesn't matter which one.
Here is the code I wrote:
elementos = driver.find_elements_by_class_name("calendarCellOpen")
while True:
    if elementos:
        driver.find_element_by_class_name("calendarCellOpen").click()
        driver.find_element_by_id("ctl00_ContentPlaceHolder1_acc_Calendario1_repFasce_ctl01_btnConferma").click() #confirm button
    else:
        driver.find_element_by_xpath("//input[@value='<']").click() #back
        if elementos:
            driver.find_element_by_class_name("calendarCellOpen").click()
            driver.find_element_by_id("ctl00_ContentPlaceHolder1_acc_Calendario1_repFasce_ctl01_btnConferma").click()
        driver.find_element_by_xpath("//input[@value='>']").click() #forward
        if elementos:
            driver.find_element_by_class_name("calendarCellOpen").click()
            driver.find_element_by_id("ctl00_ContentPlaceHolder1_acc_Calendario1_repFasce_ctl01_btnConferma").click()
I go back and forward a month because that is the only way to reload the calendar.
This is the HTML of the calendar:
<div id="calwrapper">
<div id="legend" style="padding-left:15px; margin-bottom:20px">
<table style="width:90%; border-collapse:collapse; border: 0px">
<tr style="line-height:15px">
<td style="background-color:Red; width:80px; margin-right:10px">
</td>
<td style="width: 383px; padding-left:5px">
Tutto occupato # all none available
</td>
<td style="background-color:#F09643; width:80px">
</td>
<td style="width: 450px; padding-left:5px">
Media disponibilità #half available
</td>
<td style="background-color:#058d08; width:80px">
</td>
<td style="width: 383px; padding-left:5px">
Posti disponibili #available
</td>
<td style="background-color:#000000; width:80px">
</td>
<td style="width: 383px; padding-left:5px">
Non disponibile # none available
</td>
</tr>
</table>
</div>
<div id="calendar">
<span id="ctl00_ContentPlaceHolder1_acc_Calendario1_myCalendario1"
class="calendario">
<table class="calendario" summary="Summary" cellspacing="0">
<caption>Calendario eventi</caption>
<tr class="calTitolo">
<th>
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl01"
value="<" title="Clicca qui per andare al mese precedente"
class="pulsanteCalendario" />
</th>
<th colspan="5">
<span>agosto, 2017</span>
</th>
<th>
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl03"
value=">" title="Clicca qui per andare al mese successivo"
class="pulsanteCalendario" />
</th>
</tr>
<tr>
<th class="calDayHeader" scope="col">lun</th>
<th class="calDayHeader"
scope="col">mar</th>
<th class="calDayHeader" scope="col">mer</th>
<th class="calDayHeader" scope="col">gio</th>
<th class="calDayHeader" scope="col">ven</th>
<th class="calDayHeader" scope="col">sab</th>
<th class="calDayHeader" scope="col">dom</th>
</tr>
<tr>
<td title="Giorno non disponibile" class="otherMonthDay">31</td>
<td title="Tutto occupato" class="calendarCellRed">1</td>
<td title="Giorno non disponibile" class="noSelectableDay">2</td>
<td title="Tutto occupato" class="calendarCellRed">3</td>
<td title="Tutto occupato" class="calendarCellRed">4</td>
<td title="Giorno non disponibile" class="noSelectableDay">5</td>
<td title="Giorno non disponibile" class="noSelectableDay">6</td>
</tr>
<tr>
<td title="Tutto occupato" class="calendarCellRed">7</td>
<td class="calendarCellOpen">
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl12"
value="8" title="8 agosto 2017, Posti disponibili"
class="pulsanteCalendario" />
</td>
<td class="calendarCellOpen">
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl12"
value="8" title="8 agosto 2017, Posti disponibili"
class="pulsanteCalendario" />
</td>
<td class="calendarCellOpen">
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl12"
value="8" title="8 agosto 2017, Posti disponibili"
class="pulsanteCalendario" />
</td>
<td class="calendarCellOpen">
<input type="submit"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$myCalendario1$ctl12"
value="8" title="8 agosto 2017, Posti disponibili"
class="pulsanteCalendario" />
</td>
<td title="Giorno non disponibile" class="noSelectableDay">9</td>
<td title="Giorno non disponibile" class="noSelectableDay">10</td>
</tr><tr>
<td title="Giorno non disponibile" class="noSelectableDay">14</td>
<td title="Giorno non disponibile" class="noSelectableDay">15</td>
<td title="Giorno non disponibile" class="noSelectableDay">16</td>
<td title="Giorno non disponibile" class="noSelectableDay">17</td>
<td title="Giorno non disponibile" class="noSelectableDay">18</td>
<td title="Giorno non disponibile" class="noSelectableDay">19</td>
<td title="Giorno non disponibile" class="noSelectableDay">20</td>
</tr><tr>
<td title="Giorno non disponibile" class="noSelectableDay">21</td>
<td title="Giorno non disponibile" class="noSelectableDay">22</td>
<td title="Giorno non disponibile" class="noSelectableDay">23</td>
<td title="Giorno non disponibile" class="noSelectableDay">24</td>
<td title="Giorno non disponibile" class="noSelectableDay">25</td>
<td title="Giorno non disponibile" class="noSelectableDay">26</td>
<td title="Giorno non disponibile" class="noSelectableDay">27</td>
</tr><tr>
<td title="Giorno non disponibile" class="noSelectableDay">28</td>
<td title="Giorno non disponibile" class="noSelectableDay">29</td>
<td title="Giorno non disponibile" class="noSelectableDay">30</td>
<td title="Giorno non disponibile" class="noSelectableDay">31</td>
<td title="Giorno non disponibile" class="otherMonthDay">1</td>
<td title="Giorno non disponibile" class="otherMonthDay">2</td>
<td title="Giorno non disponibile" class="otherMonthDay">3</td>
</tr></table></span>
</div>
<div id="orari" >
<input type="hidden"
name="ctl00$ContentPlaceHolder1$acc_Calendario1$HiddenField1"
id="ctl00_ContentPlaceHolder1_acc_Calendario1_HiddenField1" />
</div>
</div>
This is what I have managed to write, but I'm not sure it is going to work:
while True:
    for dates in elementos:
        if dates.is_enabled():
            dates.click()
            driver.find_element_by_id("ctl00_ContentPlaceHolder1_acc_Calendario1_repFasce_ctl01_btnConferma").click()
    #if elementos > 0:
    #driver.find_element_by_class_name("calendarCellOpen").click()
    #else:
    driver.find_element_by_xpath("//input[@value='<']").click()
    driver.find_element_by_xpath("//input[@value='>']").click()

Python selenium webdriver dropdown menu how to select items

<div id="isc_3B" class="scrollingMenu" onscroll="return isc_PickListMenu_0.$lh()" style="position: absolute; left: 403px; top: 63px; width: 450px; height: 298px; z-index: 800684; visibility: inherit; padding: 0px; box-sizing: border-box; overflow: hidden;" role="listbox" eventproxy="isc_PickListMenu_0" aria-hidden="false">
<div id="isc_3C" style="position: relative; display: inline-block; box-sizing: border-box; width: 100%; vertical-align: top; visibility: inherit; z-index: 800684; cursor: default;" eventproxy="isc_PickListMenu_0">
<div id="isc_3D" role="toolbar" tabindex="-1" onblur="if(window.isc)isc.EH.blurFocusCanvas(isc_Toolbar_1,true);" onfocus="if(event.target!=this)return;isc.EH.focusInCanvas(isc_Toolbar_1,true);" onscroll="return isc_Toolbar_1.$lh()" style="position: absolute; left: 0px; top: 0px; width: 434px; height: 22px; z-index: 200936; overflow: hidden; box-sizing: border-box; cursor: default; display: inline-block;" eventproxy="isc_Toolbar_1">
<div id="isc_3A" class="pickListMenuBody" tabindex="1439" onblur="if(window.isc)isc.EH.blurFocusCanvas(isc_PickListMenu_0_body,true);" onfocus="if(event.target!=this)return;isc.EH.focusInCanvas(isc_PickListMenu_0_body,true);" onscroll="return isc_PickListMenu_0_body.$lh()" style="position: absolute; left: 0px; top: 22px; width: 434px; height: 276px; z-index: 201026; overflow: hidden; background-color: white; box-sizing: border-box; cursor: default; display: inline-block; outline-style: none;" eventproxy="isc_PickListMenu_0_body">
<div id="isc_3N" style="position:absolute;overflow:visible;z-index:1000;width:432px">
<div id="isc_PickListMenu_0_body$28s" style="width:1px;height:0px;overflow:hidden;display:none;">
<table id="isc_3Atable" class="listTable" width="432" cellspacing="0" cellpadding="2" border="0" style="table-layout:fixed;overflow:hidden;padding-left:0px;padding-right:0px;" role="presentation">
<tbody></tbody>
<colgroup>
<tbody>
<tr id="isc_PickListMenu_0_row_0" aria-posinset="1" aria-setsize="686" role="option" aria-selected="false">
<td class="pickListCell" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 143px; overflow: hidden;">
<div style="overflow:hidden;text-overflow:ellipsis;white-space:nowrap;WIDTH:139px;" cellclipdiv="true" role="presentation">Pens Stabiliner 808 Ballpoint Fine Black</div>
</td>
<td class="pickListCell" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 144px; overflow: hidden;">
<div style="overflow:hidden;text-overflow:ellipsis;white-space:nowrap;WIDTH:140px;" cellclipdiv="true" role="presentation">Ea</div>
</td>
<td class="pickListCell" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 145px; overflow: hidden;">
</tr>
<tr id="isc_PickListMenu_0_row_1" aria-posinset="2" aria-setsize="686" role="option">
<td class="pickListCellDark" height="16" align="left" style="WIDTH:143px;OVERFLOW:hidden;padding-top:0px;padding-bottom:0px;;white-space: nowrap;">
<td class="pickListCellDark" height="16" align="left" style="WIDTH:144px;OVERFLOW:hidden;padding-top:0px;padding-bottom:0px;;white-space: nowrap;">
<td class="pickListCellDark" height="16" align="left" style="WIDTH:145px;OVERFLOW:hidden;padding-top:0px;padding-bottom:0px;;white-space: nowrap;">
</tr>
<tr id="isc_PickListMenu_0_row_2" aria-posinset="3" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_3" aria-posinset="4" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_4" aria-posinset="5" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_5" aria-posinset="6" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_6" aria-posinset="7" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_7" aria-posinset="8" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_8" aria-posinset="9" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_9" aria-posinset="10" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_10" aria-posinset="11" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_11" aria-posinset="12" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_12" aria-posinset="13" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_13" aria-posinset="14" aria-setsize="686" role="option" aria-selected="true">
<td class="pickListCellSelectedDark" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 143px; overflow: hidden;">
<div style="overflow:hidden;text-overflow:ellipsis;white-space:nowrap;WIDTH:139px;" cellclipdiv="true" role="presentation">Adding Machine Roll 57x57mm Lint Free</div>
</td>
<td class="pickListCellSelectedDark" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 144px; overflow: hidden;">
<td class="pickListCellSelectedDark" height="16" align="left" style="padding-top: 0px; padding-bottom: 0px; width: 145px; overflow: hidden;">
</tr>
<tr id="isc_PickListMenu_0_row_14" aria-posinset="15" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_15" aria-posinset="16" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_16" aria-posinset="17" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_17" aria-posinset="18" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_18" aria-posinset="19" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_19" aria-posinset="20" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_20" aria-posinset="21" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_21" aria-posinset="22" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_22" aria-posinset="23" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_23" aria-posinset="24" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_24" aria-posinset="25" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_25" aria-posinset="26" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_26" aria-posinset="27" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_27" aria-posinset="28" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_28" aria-posinset="29" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_29" aria-posinset="30" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_30" aria-posinset="31" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_31" aria-posinset="32" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_32" aria-posinset="33" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_33" aria-posinset="34" aria-setsize="686" role="option">
<tr id="isc_PickListMenu_0_row_34" aria-posinset="35" aria-setsize="686" role="option">
</tbody>
</table>
<div id="isc_PickListMenu_0_body$284" style="width:1px;height:10416px;overflow:hidden;">
<table style="position:absolute;top:0px;font-size:1px;height:100%;width:100%;z-index:1;overflow:hidden;visibility:hidden;">
</div>
</div>
<div id="isc_3Q" class="scrollbar" onscroll="return isc_PickListMenu_0_body_vscroll.$lh()" style="position: absolute; left: 434px; top: 22px; width: 16px; height: 276px; z-index: 201027; overflow: hidden; box-sizing: border-box; cursor: default; display: inline-block;" dir="ltr" eventproxy="isc_PickListMenu_0_body_vscroll">
<div id="isc_3R" class="vScrollThumb" aria-label=" " onscroll="return isc_PickListMenu_0_body_vscroll_thumb.$lh()" style="position: absolute; left: 434px; top: 38px; width: 15px; height: 20px; z-index: 201033; overflow: hidden; box-sizing: border-box; cursor: default; display: inline-block;" eventproxy="isc_PickListMenu_0_body_vscroll_thumb">
<div id="isc_3O" class="scrollbarDisabled" onscroll="return isc_PickListMenu_0_body_hscroll.$lh()" style="position: absolute; left: 0px; top: 22px; width: 1px; height: 1px; z-index: 201027; overflow: hidden; box-sizing: border-box; cursor: default; display: inline-block; visibility: hidden;" dir="ltr" eventproxy="isc_PickListMenu_0_body_hscroll" aria-hidden="true">
<div id="isc_3P" class="hScrollThumb" aria-label=" " onscroll="return isc_PickListMenu_0_body_hscroll_thumb.$lh()" style="position: absolute; left: 16px; top: 22px; width: 5px; height: 1px; z-index: 201033; overflow: hidden; box-sizing: border-box; cursor: default; display: inline-block; visibility: hidden;" eventproxy="isc_PickListMenu_0_body_hscroll_thumb" aria-hidden="true">
<div id="isc_3T" aria-label="corner menu" role="button" tabindex="1490" onblur="if(window.isc)isc.EH.blurFocusCanvas(isc_PickListMenu_0_sorter,true);" onfocus="if(event.target!=this)return;isc.EH.focusInCanvas(isc_PickListMenu_0_sorter,true);" onscroll="return isc_PickListMenu_0_sorter.$lh()" style="POSITION:absolute;LEFT:434px;TOP:0px;WIDTH:16px;HEIGHT:22px;Z-INDEX:200942;OVERFLOW:hidden;box-sizing:border-box;CURSOR:default;display:inline-block" eventproxy="isc_PickListMenu_0_sorter">
</div>
</div>
</body>
Do you have any idea how to select an option from that menu using Python WebDriver? Selenium IDE is not helpful at all in this case. I tried selecting the option by row id and by its text, and neither works.
Every tr is an option in the dropdown menu, for example:
Adding Machine Roll 57x57mm Lint Free
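Following your example, you can locate the div by its text and click it. Note that Select only wraps real <select> elements, so it cannot be used on this div. Recent Selenium 4 releases removed the find_element_by_* helpers; there, use driver.find_element(By.XPATH, ...) instead.
driver.find_element_by_xpath("//div[contains(text(), 'Adding Machine Roll 57x57mm Lint Free')]").click()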
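A slightly more defensive version of the same idea might look like the sketch below. It assumes Selenium 4 is installed and that the option rows carry role="option", as in the HTML you posted. The helper names xpath_literal, option_xpath and click_option are made up for this example and are not part of Selenium. The Selenium imports sit inside click_option only so that the XPath helpers can be tried without a browser.

```python
def xpath_literal(text):
    """Quote `text` as an XPath 1.0 string literal (XPath has no escape character)."""
    if "'" not in text:
        return f"'{text}'"
    if '"' not in text:
        return f'"{text}"'
    # Both quote types present: join the single-quote-free pieces with concat().
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


def option_xpath(label):
    """XPath for the div whose trimmed text equals `label` inside an option row."""
    return f"//tr[@role='option']//div[normalize-space(.)={xpath_literal(label)}]"


def click_option(driver, label, timeout=10):
    """Wait until the option with the given label is clickable, then click it."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, option_xpath(label))
    option = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable(locator)
    )
    option.click()
```

With an open driver you would call click_option(driver, "Adding Machine Roll 57x57mm Lint Free"). One caveat: the rows report aria-setsize="686" but only rows 0 to 34 appear in the pasted markup, so the list is probably rendered lazily. If that is the case, an option further down will not be found until the list has been scrolled to it.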
