Scraper problems with ASP.NET locating objects - Selenium - python

I'm new to Python, and I'm trying to write a scraper for an ASPX website.
A search on this page gives one of two outcomes: an empty result, or a table with the matching record.
My code can read the empty result, but it cannot read the record when one exists.
I have tried every kind of path I could think of and still cannot get the value.
Can someone help me?
This is my code:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import openpyxl
from openpyxl import load_workbook
planilha = load_workbook('./BASE 05-09.xlsx')
driver = webdriver.Chrome(executable_path=r'C:\Python37\webdriver\chromedriver.exe')
wait = WebDriverWait(driver, 10)
sheet = planilha['Aba1']
driver.get("http://www1.cfc.org.br/sisweb/siscnai/externaConsultaCadastro.aspx")
for Count in range(2, 1101):
    driver.find_element_by_id("ContentPlaceHolder1_tbxCPF").send_keys(sheet.cell(row=Count, column=5).value, Keys.RETURN)
    results = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id*='ContentPlaceHolder1_gvwProfissional'] > tbody > tr")))
    resultado_pesquisa = results[0].text.strip() if "ContentPlaceHolder1_gvwProfissional" in results[0].get_attribute("class") else results[0].find_element_by_xpath("./td[1]").text.strip()
    driver.find_element_by_id("ContentPlaceHolder1_tbxCPF").clear()
    sheet.cell(row=Count, column=7).value = resultado_pesquisa
planilha.save("BASE 05-09.xlsx")
driver.quit()
This is the page source when the search returns a result; I want to get the "5433":
<html>
<head id="Head1"><meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" /><title>
CNAI
</title></head>
<body>
<form method="post" action="externaConsultaCadastro.aspx" id="form1">
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="AM7kGtthPHcCeQeZ3wWqQvzOI0fCr5HN29F2i/xZ5Ix7EkcYSSc9FlCfCcbHtX2Qulw1TLFpz/+RNvGQPU1/OqZpxByvUPSE2gaVonfaQsQn7zvoossHNNUDTiQmHv9XT5KkXiFi4Oa2B2Ix/MNkWIIw86rgaBK3NhQHUE7S+DsAlvsqZ1sy59fb1+/d/FF32dYRXcocqfcP4TL8ZtLhlRKt3rP1C+kS8/CkywxSTqBxQQ3h52z9Fm9dxcfjgHXQzisjVQuYhYWPnV6gcfJU2r9Hed49Zmx/mC4ydsTI7mbNpVYbwi4AqZKQvg0KAa+K+5ZLto2yg61qut6rUG0HyrpY5yOQk5XEH/BfK8qYoHbouJbYY9mbMwspkzg0bkNSFPz1dG45NLdibrvGoO5PSrHOzJpZhTufdzUPu5gVpUhlhrpU98c8ZzHJjS07xBZ72BwPp1eb1e9hPwUuPkD+SQ7w4ekSdaFVqUi2dWVP+uTcgL8pISRKt7viiraxvarsnQBiuyI7I+8gIMb5KMP0rB6R/AIKHNZZJI9fFipjabgtixU/+c5qsCvT1yLxx9XhO+nLBdYtgxOXuhjZ1dQ2DGe5E19ypAYDcqyGJotx4xQwXjMyYAhKLCWwZV9hPFVuQ3I/FRkI9u4+zWB782qmVkRZPl8Hde5wHrOW4V1DfxQz0191Ti+esid2SicQZZReSA1U5l1rv7qtKfWx+5nSJRdP13Z/vZVazAdpq1N6r2WzSOaDaa/1To87twg4kZP8kz/7VHU6fIoGIrrovke0XWvgsKiOUa9xqQ4fiW+Dl7HB1JrnLOPENKOnvmFfaI0DnWbKuWwB0CBao2pzxUtpd5Up195UesvowkUjNq4GgtsYo3I4NRag/M0ALN+0zz+3XVoqKzWHMWcy0yGJtbHcR5B++S66UlJOKdX0mGS6swfHz5twjLIOYxiuhRN6PBX0ZukZajaoRH3/GfN/kaj2GykyeVvhd+ds+qIpWKz+7d9PKqkwZiQLbXgaY3YjxjS9LpHseL5bAJkEMnundiHnjMVpjt0fZARNugggeEbei0xNntXUltc5A8xqQ3O5LXmUsw+i9QpsGcb5rFPO6ybOwAchyvZckeuEWsNC+blZY9iybQzGR7dyI1XhMHnJyEPvodso2tqwzVP/R4W9jMcUhr/V6gOnztsvGUnY6dfEW949ep9x9kkVPNJIpabJF1Cmgl/SVVm1/4TR7FZPx0PNpgyeieHvL0ieRSdlwgcuJm/rrgpNT8ka8u40I3PZB05288oTVagKY2fwdLUiU4gE9E2PSzyi/i224cjSPZ9b+yrnJz+Kn27Q+spsgzo0WW6QkwtxZx2hJ5q2n1WQRICU9oVmCY1BLyUxdIHq2jcb0gQ=" />
</div>
<script src='masks.js'></script>
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATEENCRYPTED" id="__VIEWSTATEENCRYPTED" value="" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="8EDkWLerXUVmMKlcXe/qqVujpBci2zX1ECnhJ4g+15vGRAo3rFrF5XP2X12Kr6nfAFlOpsUn3Sk/WM9LlAM0W+s6+LTZ6pSJUuUQu6ct75AmlJs+cWh7RTtWu1D6Arg8HDLFWhdvHDK/0sPW+VZ2wM58r+zcvQx/1wJmx/xkhtWuh0gkFFHVfq6zEAaL4SEnWvfH0wF5JZtGdnWgKhq0PQPkQPCk4gwWjZf9UJWX/I7BMFZetip0QShBtkQKaYFPyQ1riFre9eizciXNPJYrSU42IhZGnEWK4CCOBKetrpMTHaJiO2/lCpYWtMiMArUqeJz6gicoZc/q4GF6bgWAYIT+ItMiQC6N5eQFhwwGgKr/oRDush9H1IKBmg2kty1juv54o20yTrR19urRTyMut35n55+dHkkbMc2QKouXCKGrxXNE7t8/tOhAbaV+56FJjYFydcxrvWCpOKJzy5By3QR6xl4RPAFZrcAP5qGsSxugndJVM8lbgneoQEqjceeC8b8BFcZOSYIPOLD0CRAOSXD9FljgX8N5yz1RkJkOvYPpi6TIjugrILSgXMJtOx1BKfSL7vmYLVmm8hAHGssGnQXfBWnCqTu7e242s6TUotUbIuiJKFGpGhXnzbleDqXBMxjXLbOHQgsMxDPw9SoZYEVgtA2DZMfDWobpetTeQTc/ykyDmwXyCS9q+VK6seNRtFUIG62lVnzlMloIvGIWZkm7RVpz+FdtVXo75qAotGIhzDMhnbw1tvSW+huEdnBllFEJDedPdiUTM8ONKdkdaKsDbpPDI/K3vXGvc9V8t1MKihxXD42SPHdhzhSUNmsB6uxgOFP4iXBSATzdLBDD5FaaoJI/EaLVzSCpQGAMNwHilXBGMo97h77TLSnQu8x1adkEFUmkF/wmiQcyzEHhmxwI/bY7lKdtELEDO4JOP3g=" />
</div>
<div>
<table border="0" cellpadding="0" cellspacing="0">
<tr>
<td colspan="2" style="height: 68px; width: 801px;">
<img src="Imagens/banner_cnai_externo.jpg" /></td>
</tr>
<tr>
<td colspan="2" style="width: 801px; height: 232px;">
<div align=center>
<br />
<table style="font-weight: bold; font-size: 12pt; width: 800px; color: white; font-family: verdana;
height: 7px; background-color: firebrick">
<tr>
<td>
CONSULTAR CADASTRO CNAI</td>
</tr>
</table>
<br />
<span style="font-size: 10pt; color: red; font-family: Verdana"><strong>Utilize <span
style="text-decoration: underline">qualquer um</span> dos campos abaixo para fazer
a pesquisa:</strong></span><br />
<br />
<table>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Nome:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNome" type="text" maxlength="100" id="ContentPlaceHolder1_tbxNome" style="font-family:Verdana;font-size:10pt;width:295px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Número CNAI:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNumeroCNAI" type="text" maxlength="8" id="ContentPlaceHolder1_tbxNumeroCNAI" style="font-family:Verdana;font-size:10pt;width:100px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">CPF:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxCPF" type="text" value="057.367.539-28" maxlength="14" id="ContentPlaceHolder1_tbxCPF" style="font-family:Verdana;font-size:10pt;width:150px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Registro:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNumeroRegistro" type="text" maxlength="8" id="ContentPlaceHolder1_tbxNumeroRegistro" style="font-family:Verdana;font-size:10pt;width:100px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Habilitação:</span></td>
<td style="text-align: left">
<table id="ContentPlaceHolder1_cbxlCredenciamento" style="font-family:Verdana;font-size:10pt;">
<tr>
<td><input id="ContentPlaceHolder1_cbxlCredenciamento_0" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$0" value="1" /><label for="ContentPlaceHolder1_cbxlCredenciamento_0">QTG</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_1" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$1" value="2" /><label for="ContentPlaceHolder1_cbxlCredenciamento_1">BCB</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_2" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$2" value="3" /><label for="ContentPlaceHolder1_cbxlCredenciamento_2">SUSEP</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_3" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$3" value="4" /><label for="ContentPlaceHolder1_cbxlCredenciamento_3">CVM</label></td>
</tr>
</table></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">UF:</span></td>
<td style="text-align: left">
<select name="ctl00$ContentPlaceHolder1$ddlUF" id="ContentPlaceHolder1_ddlUF" style="font-family:Verdana;font-size:10pt;">
<option selected="selected" value=""></option>
<option value="AC">AC</option>
<option value="AL">AL</option>
<option value="AM">AM</option>
<option value="AP">AP</option>
<option value="BA">BA</option>
<option value="CE">CE</option>
<option value="DF">DF</option>
<option value="ES">ES</option>
<option value="GO">GO</option>
<option value="MA">MA</option>
<option value="MG">MG</option>
<option value="MS">MS</option>
<option value="MT">MT</option>
<option value="PA">PA</option>
<option value="PB">PB</option>
<option value="PE">PE</option>
<option value="PI">PI</option>
<option value="PR">PR</option>
<option value="RJ">RJ</option>
<option value="RN">RN</option>
<option value="RO">RO</option>
<option value="RR">RR</option>
<option value="RS">RS</option>
<option value="SE">SE</option>
<option value="SC">SC</option>
<option value="SP">SP</option>
<option value="TO">TO</option>
</select></td>
</tr>
<tr>
<td colspan="2">
<br />
<input type="submit" name="ctl00$ContentPlaceHolder1$btnConsultar" value="Consultar" id="ContentPlaceHolder1_btnConsultar" style="font-family:Verdana;font-size:8pt;width:100px;" /> <input type="submit" name="ctl00$ContentPlaceHolder1$btnVoltar" value="<<< Voltar" id="ContentPlaceHolder1_btnVoltar" style="font-family:Verdana;font-size:8pt;width:100px;" /></td>
</tr>
</table>
<br />
<span id="ContentPlaceHolder1_lblQtdRegistros" style="color:Firebrick;font-family:Verdana;font-size:10pt;font-weight:bold;">Quantidade de registros encontrados: 1</span><br />
<br />
<div>
<table cellspacing="0" cellpadding="4" id="ContentPlaceHolder1_gvwProfissional" style="color:#333333;font-family:Verdana;font-size:8pt;width:790px;border-collapse:collapse;">
<tr style="color:White;background-color:DimGray;font-weight:bold;">
<th scope="col">Nº CNAI</th><th scope="col">Nome</th><th scope="col">Registro CRC</th><th scope="col">UF</th><th scope="col">Ativo Desde</th><th scope="col">Habilitação</th>
</tr><tr style="color:#333333;background-color:#FFFBD6;">
<td>5433</td><td align="left" valign="middle">ADRIEL PAUL</td><td>SC-038746/O</td><td>SC</td><td>16/10/2017</td><td>QTG</td>
</tr>
</table>
</div>
<br />
<br />
</div>
</td>
</tr>
<tr>
<td colspan="2" style="height: 29px; background-color: #ffff92; text-align: center">
<span style="font-size: 8pt; color: firebrick; font-family: Verdana"><strong>
<hr style="width: 790px" />
<span style="color: firebrick">CFC/DEINF - Departamento de Informática</span></strong></span></td>
</tr>
</table>
</div>
<script>_b0ea08358a064398935a96570c90f08e = new Mask("###.###.###-##");_b0ea08358a064398935a96570c90f08e.attach(document.getElementById('ContentPlaceHolder1_tbxCPF'));</script></form>
</body>
</html>
This is the page source when the result is empty; in this case I want to get "Nenhum registro encontrado.":
<html xmlns="http://www.w3.org/1999/xhtml" >
<head id="Head1"><meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" /><title>
CNAI
</title></head>
<body>
<form method="post" action="externaConsultaCadastro.aspx" id="form1">
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="N4CP+jvK/5b+U+rTB9wr1ebhSvqp5jRbCS2nCn9YQmpBGkOzPBNz77XaZqDmpks4sRpruRnk5/iODtmwHpy/TgS6IoY1opVEWGrstOsGKd9qS12fLEJcrl0C4qMMX6749LvuwRu85AopjkujK6QBv1+IEz18b30UAbvkGt9UELokaKjcjtOSOLK7AsBGf0EQ20q97wEeiJm9TE85TMflKNLDXWm/juP5rpG9cU/THT/piFUCakmhaupUwYKt84cRk2Ax7Cg45MUJXLMlOBqqiBvZYiDachCY4HYWVzt0/HNny5+Ylsw9GS3Ay/VnSVJ3+FFQnhAzpEgQqGubFeW3/fmeOI/vcA/JWB6cFux8rfKD0jnCjJvwWetFPlrtRr+O1xj9jmrzwo6cpV+KsAIQvdkmDN4rPQocbKH8gL7Na3zEUM9eCse8IGFIb4ZTdspkD7LcN9irH3bYyrBZsR1P6RQPWwX//nw99cFO72DDrCAZPUQZ/oyxNt7OPolmL88KEtCvedK/aNdbrjjZLlUeqQk41VwNZ/H8CO6NX2Gv1Kf/F6bQoWfVsUP5UZN53kCaaYitCdsgJp+Pnvyrh2oh49IhYp7VKXCK5a5HcZuWFPB7iabfi2EU8W1xonpvSG2PPsrg0rU4/CdLIKuhHtXV9fNiAREpqkq4g7m6u8heKmCXBrvxwODcpScXuFnSwRgGh3Yfv2EDQWcpV23Gcz/aBSoSw0i+g9tU8RmQgVI3KqlyEPQ29T95wAlS4inUiyXzhf5x4egIgJ8pd9/2XxS2+N29HSlWuuOYetLezzA+SL9CWP7QB9kg73o6vvJNmLAsQju91/H0pF1dDkJYb/Gd1hO3vATKttcvGtyEN/GmI6grXnwgx4bTkhJTEdoEuN8C6kD7x77sTXk1IqTSgBLvWF4KeOJvzgic6BgIFDxJyb0REGmXTgLnB/b6NA7fjLP/" />
</div>
<script src='masks.js'></script>
<div class="aspNetHidden">
<input type="hidden" name="__VIEWSTATEENCRYPTED" id="__VIEWSTATEENCRYPTED" value="" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="KcA31Q+gCa7NxEiVU/OBNKTzdcD+hDJlBKesntp7xzs3YJoMskiWyBNqo48LSDxqEwpoAVhjW6XfPoOB6lAyHWd+/ffQ/UCoXcJLUVk4OsecSTxThSzxv3SdnIx4pE+ytZJVeAG7Ix3UCOgVU6tAOMY0Atbta3Kz9cnNAsQ8C2IHF/vORmd1XwPBYHXCe2FSjU+G1kfQwKV1du386WIfBCbbwl5DBW7qsVdbVaGMR+qgOd6Tjk3IV1IuJU0oCUDUm8CcVhm/R6mFrXfCUXl6LVyPHVPKiKaMsdqnGI/IKjI2TjkwkU4+UJJjjobo6ABr4v+Xc1Gwpj4/QVxMBoF5g6izDSGDO9sk5WWeQqFBQKRhABUHEpHnuNgZwYmDC+UbjJ8pArD4Gg9SJexKzZgXkAgwHp/glsGoa5/dYolKx2Nu03tomY14YXkbNq/ml4LmZ3HSPKGuEniZq5gcmd+oCNtQulHCFijcUW39e7PmrKp4MGPk9/0sjmYPa2UZAwF0/RQ0QikZQmOxLokzN/5U865m8hjp4Gj3ndmZpPHKPBa5iHbTqTHSj1qPVnn/v+9wlU4mG7fISLwaALSQHBtOGXyNHNq2F4JExT7R1QskvwzQMF8kJPnysoLhqVmN04i2rXLTH6xY+iUnAN4NOPoIP+T5YBs5DniT5K4RyjMioWQmv6a2eQES1tRxtkKBaPbztolYIVxKmabkzsEjXdOxHIxj21Z/R5UHa6bVnOPaeHKgSpSqyqhDMRu9e5vLkbA3o953g0TZx9xEfB0lw+j/MhqnI35mwplWucjxm9uA/0zTEDAHZ2ATd//iCKR4SWaxjL+y3BTBEn9Icy+LFh77qfj4yHn4Ye7Y5gyIn8oiFJOiNei51in80ZJyGkDP/MG5bKsC+f8R1LukFlur5JoefSmB6oRj7g9KVOw+FW31suQ=" />
</div>
<div>
<table border="0" cellpadding="0" cellspacing="0">
<tr>
<td colspan="2" style="height: 68px; width: 801px;">
<img src="Imagens/banner_cnai_externo.jpg" /></td>
</tr>
<tr>
<td colspan="2" style="width: 801px; height: 232px;">
<div align=center>
<br />
<table style="font-weight: bold; font-size: 12pt; width: 800px; color: white; font-family: verdana;
height: 7px; background-color: firebrick">
<tr>
<td>
CONSULTAR CADASTRO CNAI</td>
</tr>
</table>
<br />
<span style="font-size: 10pt; color: red; font-family: Verdana"><strong>Utilize <span
style="text-decoration: underline">qualquer um</span> dos campos abaixo para fazer
a pesquisa:</strong></span><br />
<br />
<table>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Nome:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNome" type="text" maxlength="100" id="ContentPlaceHolder1_tbxNome" style="font-family:Verdana;font-size:10pt;width:295px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Número CNAI:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNumeroCNAI" type="text" maxlength="8" id="ContentPlaceHolder1_tbxNumeroCNAI" style="font-family:Verdana;font-size:10pt;width:100px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">CPF:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxCPF" type="text" value="462.929.158-08" maxlength="14" id="ContentPlaceHolder1_tbxCPF" style="font-family:Verdana;font-size:10pt;width:150px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Registro:</span></td>
<td style="text-align: left">
<input name="ctl00$ContentPlaceHolder1$tbxNumeroRegistro" type="text" maxlength="8" id="ContentPlaceHolder1_tbxNumeroRegistro" style="font-family:Verdana;font-size:10pt;width:100px;" /></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">Habilitação:</span></td>
<td style="text-align: left">
<table id="ContentPlaceHolder1_cbxlCredenciamento" style="font-family:Verdana;font-size:10pt;">
<tr>
<td><input id="ContentPlaceHolder1_cbxlCredenciamento_0" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$0" value="1" /><label for="ContentPlaceHolder1_cbxlCredenciamento_0">QTG</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_1" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$1" value="2" /><label for="ContentPlaceHolder1_cbxlCredenciamento_1">BCB</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_2" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$2" value="3" /><label for="ContentPlaceHolder1_cbxlCredenciamento_2">SUSEP</label></td><td><input id="ContentPlaceHolder1_cbxlCredenciamento_3" type="checkbox" name="ctl00$ContentPlaceHolder1$cbxlCredenciamento$3" value="4" /><label for="ContentPlaceHolder1_cbxlCredenciamento_3">CVM</label></td>
</tr>
</table></td>
</tr>
<tr>
<td style="text-align: right; font-weight: bold; color: firebrick; font-family: verdana;">
<span style="font-size: 10pt; font-family: Verdana">UF:</span></td>
<td style="text-align: left">
<select name="ctl00$ContentPlaceHolder1$ddlUF" id="ContentPlaceHolder1_ddlUF" style="font-family:Verdana;font-size:10pt;">
<option selected="selected" value=""></option>
<option value="AC">AC</option>
<option value="AL">AL</option>
<option value="AM">AM</option>
<option value="AP">AP</option>
<option value="BA">BA</option>
<option value="CE">CE</option>
<option value="DF">DF</option>
<option value="ES">ES</option>
<option value="GO">GO</option>
<option value="MA">MA</option>
<option value="MG">MG</option>
<option value="MS">MS</option>
<option value="MT">MT</option>
<option value="PA">PA</option>
<option value="PB">PB</option>
<option value="PE">PE</option>
<option value="PI">PI</option>
<option value="PR">PR</option>
<option value="RJ">RJ</option>
<option value="RN">RN</option>
<option value="RO">RO</option>
<option value="RR">RR</option>
<option value="RS">RS</option>
<option value="SE">SE</option>
<option value="SC">SC</option>
<option value="SP">SP</option>
<option value="TO">TO</option>
</select></td>
</tr>
<tr>
<td colspan="2">
<br />
<input type="submit" name="ctl00$ContentPlaceHolder1$btnConsultar" value="Consultar" id="ContentPlaceHolder1_btnConsultar" style="font-family:Verdana;font-size:8pt;width:100px;" /> <input type="submit" name="ctl00$ContentPlaceHolder1$btnVoltar" value="<<< Voltar" id="ContentPlaceHolder1_btnVoltar" style="font-family:Verdana;font-size:8pt;width:100px;" /></td>
</tr>
</table>
<br />
<span id="ContentPlaceHolder1_lblQtdRegistros" style="color:Firebrick;font-family:Verdana;font-size:10pt;font-weight:bold;">Quantidade de registros encontrados: 0</span><br />
<br />
<div>
<table cellspacing="0" cellpadding="4" id="ContentPlaceHolder1_gvwProfissional" style="color:#333333;font-family:Verdana;font-size:8pt;width:790px;border-collapse:collapse;">
<tr style="color:Red;font-family:verdana;font-size:10pt;">
<td colspan="9">Nenhum registro encontrado.</td>
</tr>
</table>
</div>
<br />
<br />
</div>
</td>
</tr>
<tr>
<td colspan="2" style="height: 29px; background-color: #ffff92; text-align: center">
<span style="font-size: 8pt; color: firebrick; font-family: Verdana"><strong>
<hr style="width: 790px" />
<span style="color: firebrick">CFC/DEINF - Departamento de Informática</span></strong></span></td>
</tr>
</table>
</div>
<script>_20d372f0c34740b2ae81fb5d201835ad = new Mask("###.###.###-##");_20d372f0c34740b2ae81fb5d201835ad.attach(document.getElementById('ContentPlaceHolder1_tbxCPF'));</script></form>
</body>
</html>
I keep receiving this error:
---------------------------------------------------------------------------
NoSuchElementException Traceback (most recent call last)
<ipython-input-48-eb337bf8471d> in <module>
19
20 results = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id*='ContentPlaceHolder1_gvwProfissional'] > tbody > tr")))
---> 21 resultado_pesquisa = results[0].text.strip() if "ContentPlaceHolder1_gvwProfissional" in results[0].get_attribute("class") else results[0].find_element_by_xpath("./td[1]").text.strip()
22
23 driver.find_element_by_id("ContentPlaceHolder1_tbxCPF").clear()
c:\python37\lib\site-packages\selenium\webdriver\remote\webelement.py in find_element_by_xpath(self, xpath)
349 element = element.find_element_by_xpath('//div/td[1]')
350 """
--> 351 return self.find_element(by=By.XPATH, value=xpath)
352
353 def find_elements_by_xpath(self, xpath):
c:\python37\lib\site-packages\selenium\webdriver\remote\webelement.py in find_element(self, by, value)
657
658 return self._execute(Command.FIND_CHILD_ELEMENT,
--> 659 {"using": by, "value": value})['value']
660
661 def find_elements(self, by=By.ID, value=None):
c:\python37\lib\site-packages\selenium\webdriver\remote\webelement.py in _execute(self, command, params)
631 params = {}
632 params['id'] = self._id
--> 633 return self._parent.execute(command, params)
634
635 def find_element(self, by=By.ID, value=None):
c:\python37\lib\site-packages\selenium\webdriver\remote\webdriver.py in execute(self, driver_command, params)
319 response = self.command_executor.execute(driver_command, params)
320 if response:
--> 321 self.error_handler.check_response(response)
322 response['value'] = self._unwrap_value(
323 response.get('value', None))
c:\python37\lib\site-packages\selenium\webdriver\remote\errorhandler.py in check_response(self, response)
240 alert_text = value['alert'].get('text')
241 raise exception_class(message, screen, stacktrace, alert_text)
--> 242 raise exception_class(message, screen, stacktrace)
243
244 def _value_or_default(self, obj, key, default):
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"./td[1]"}
(Session info: chrome=77.0.3865.90)

Change the code so that it checks for the empty result first, as below:
results = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id*='ContentPlaceHolder1_gvwProfissional'] > tbody > tr")))
resultado_pesquisa = "Nenhum registro encontrado." if "Nenhum registro encontrado." in results[0].text else results[1].find_element_by_xpath("./td[1]").text.strip()
To test the non-empty case, please share a CPF value that returns a result.

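Looking at the table, the data you are trying to get is in the second row, not the first: the first row holds the column headers (th cells), which is why "./td[1]" is not found there.
Try this one.
results = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "table[id*='ContentPlaceHolder1_gvwProfissional'] > tbody > tr")))
# The rows carry no class attribute, so the original class check never matches.
# A single row means the grid only holds the "Nenhum registro encontrado." message.
if len(results) == 1:
    resultado_pesquisa = results[0].text.strip()
else:
    # Row 0 is the header, row 1 is the first record.
    resultado_pesquisa = results[1].find_element_by_xpath("./td[1]").text.strip()
print(resultado_pesquisa)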
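
The decision can also be made from the row count of the grid: one row is the "Nenhum registro encontrado." message, and two or more rows are a header followed by records. Below is a minimal sketch of that idea using only the standard library on the page source instead of Selenium element lookups. The names GridRows and grid_result are hypothetical helpers, not part of Selenium, and the two sample strings are trimmed-down copies of the grids posted in the question.

```python
from html.parser import HTMLParser

GRID_ID = "ContentPlaceHolder1_gvwProfissional"  # id taken from the posted page source


class GridRows(HTMLParser):
    """Collect the cell texts of each row of the table with the given id."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # 0 = outside the grid, 1 = inside it, >1 = nested table
        self.rows = []
        self._cell = None   # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth or dict(attrs).get("id") == self.table_id:
                self.depth += 1
        elif self.depth == 1 and tag == "tr":
            self.rows.append([])
        elif self.depth == 1 and tag in ("td", "th") and self.rows:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "table" and self.depth:
            self.depth -= 1
        elif self.depth == 1 and tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def grid_result(page_html, table_id=GRID_ID):
    """Return the first data cell, the 'no records' message, or None."""
    parser = GridRows(table_id)
    parser.feed(page_html)
    rows = [row for row in parser.rows if row]
    if len(rows) >= 2:
        return rows[1][0]   # row 0 is the header, row 1 the first record
    if rows:
        return rows[0][0]   # single row: the message cell
    return None


with_record = (
    '<table id="ContentPlaceHolder1_gvwProfissional">'
    '<tr><th scope="col">Nº CNAI</th><th scope="col">Nome</th></tr>'
    '<tr><td>5433</td><td align="left">ADRIEL PAUL</td></tr></table>'
)
no_record = (
    '<table id="ContentPlaceHolder1_gvwProfissional">'
    '<tr><td colspan="9">Nenhum registro encontrado.</td></tr></table>'
)

print(grid_result(with_record))  # 5433
print(grid_result(no_record))    # Nenhum registro encontrado.
```

Inside the loop this would presumably be called as grid_result(driver.page_source) after the wait for the grid has returned, so that the source being parsed belongs to the current search.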

Related

I need to pass the result of soup.find_all to another soup.find_all function to filter the HTML code for a project

I have this HTML code for example:
<table class="nested4">
<tr>
<td colspan="1"></td>
<td colspan="2">
<h2 class="zeroMargin" id="govtMsg" visible="false"></h2>
</td>
<td colspan="2">
<h2 class="zeroMargin "> Net Metering Conn. </h2>
</td>
<td colspan="2">
<h2 class="zeroMargin" hidden> Life Line Consumer</h2>
</td>
</tr>
<tr>
<td colspan="2">
<p style="margin: 0; text-align: left; padding-left: 5px">
<span>NAME & ADDRESS</span>
<br />
<span>MUHAMMAD AMIN </span>
<br />
<span>S/O MUHAMMAD KHAN </span>
<br />
<span>H-NO.38 MARGALLA ROAD </span>
<br />
<span>F-6/3 ISLAMABAD3 </span>
<br />
<span></span>
</p>
</td>
<td colspan="3" style="text-align: left">
<h2 class="color-red">Say No To Corruption</h2>
<span style="font-size: 8pt; color: #78578e"> MCO Date : 10-Aug-2018</span>
<br />
</td>
<td>
<h3 style="font-size: 14pt;"> </h3>
<h2> <br /> </h2>
</td>
</tr>
<tr>
<td style="margin-top: 0;" class="border-b">
<br />
</td>
<td colspan="1" style="margin-top: 0;" class="border-b">
</td>
<td colspan="1" style="margin-top: 0;" class="border-b">
</td>
</tr>
<tr style="height: 7%;" class="border-tb">
<td style="width: 130px" class="border-r">
<h4>METER NO</h4>
</td>
<td style="width: 90px" class="border-r">
<h4>PREVIOUS READING</h4>
</td>
<td style="width: 90px" class="border-r">
<h4>PRESENT READING</h4>
</td>
<td style="width: 60px" class="border-r">
<h4>MF</h4>
</td>
<td style="width: 60px" class="border-r">
<h4>UNITS</h4>
</td>
<td>
<h4>STATUS</h4>
</td>
</tr>
<tr style="height: 30px" class="content">
<td class="border-r">
3-P I 3301539<br> I 3301539<br> E 3301539<br> E 3301539<br>
</td>
<td class="border-r">
78693<br>16823<br>19740<br>8<br>
</td>
<td class="border-r">
80086<br>17210<br>20139<br>8<br>
</td>
<td class="border-r">
1<br>1<br>1<br>1<br>
</td>
<td class="border-r">
1393<br>387<br>399<br>0<br>
</td>
<td>
</td>
</tr>
<tr id="roshniMsg" style="height: 30px" class="content">
<td colspan="6">
<div style="width: 452pt">
<img style="max-width: 100%; max-height: 35%" src="/images/companies/iesco/roshniMsg.jpg"
alt="Roshni Message" />
</div>
</td>
</tr>
</table>
From this table I want to extract the paragraph, and from that paragraph I want to get all the span tags.
I used soup.find_all() to get the table, but I don't know how to apply this function to its own result, so that I can find the paragraph inside the table and then the span tags inside that paragraph.
This is the Python code I wrote:
soup = BeautifulSoup(string, 'html.parser')
#Getting the table tag
results = soup.find_all('table', attrs={'class':'nested4'})
#Getting the paragraph tag
results = soup.find_all('p', attrs={'style':'margin: 0; text-align: left; padding-left: 5px'})
#Getting all the span tags
results = soup.find_all('span', attrs={})
I just want help on how to get the paragraphs within the table. And then how to get the spans within the paragraph as I am getting the spans in all of the original HTML code. I don't know how to pass the bs4 object list back to the soup object to use soup.find_all iteratively.
from bs4 import BeautifulSoup
html = '''
<table class="nested4">
<tr>
<td colspan="1"></td>
<td colspan="2">
<h2 class="zeroMargin" id="govtMsg" visible="false"></h2>
</td>
<td colspan="2">
<h2 class="zeroMargin "> Net Metering Conn. </h2>
</td>
<td colspan="2">
<h2 class="zeroMargin" hidden> Life Line Consumer</h2>
</td>
</tr>
<tr>
<td colspan="2">
<p style="margin: 0; text-align: left; padding-left: 5px">
<span>NAME & ADDRESS</span>
<br />
<span>MUHAMMAD AMIN </span>
<br />
<span>S/O MUHAMMAD KHAN </span>
<br />
<span>H-NO.38 MARGALLA ROAD </span>
<br />
<span>F-6/3 ISLAMABAD3 </span>
<br />
<span></span>
</p>
</td>
<td colspan="3" style="text-align: left">
<h2 class="color-red">Say No To Corruption</h2>
'''
soup = BeautifulSoup(html, 'html.parser')
spans = soup.select_one('table.nested4').select('span')
for span in spans:
    print(span.text)
This returns:
NAME & ADDRESS
MUHAMMAD AMIN
S/O MUHAMMAD KHAN
H-NO.38 MARGALLA ROAD
F-6/3 ISLAMABAD3
If you have one table:
soup = BeautifulSoup(string, 'html.parser')
table = soup.find('table', attrs={'class': 'nested4'})
p = table.find('p', attrs={'style': 'margin: 0; text-align: left; padding-left: 5px'})
results = p.find_all('span')
for result in results:
    print(result.get_text(strip=True))
If you have a list of tables:
soup = BeautifulSoup(string, 'html.parser')
for table in soup.find_all('table', attrs={'class': 'nested4'}):
    for p in table.find_all('p', attrs={'style': 'margin: 0; text-align: left; padding-left: 5px'}):
        for span in p.find_all('span'):
            print(span.get_text(strip=True))
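
The nesting above (table, then paragraph, then span) can also be written as one CSS selector, since a space between selectors means "descendant of". This is only a sketch on a shortened, made-up copy of the question's HTML; it assumes the paragraph is the only p element inside the table, so the style attribute does not need to be matched.

```python
from bs4 import BeautifulSoup

# Shortened sample based on the question's table (assumption: one <p> per table).
html = """
<table class="nested4">
  <tr><td>
    <p style="margin: 0; text-align: left; padding-left: 5px">
      <span>NAME &amp; ADDRESS</span><br />
      <span>MUHAMMAD AMIN </span><br />
      <span></span>
    </p>
  </td></tr>
  <tr><td><span style="font-size: 8pt"> MCO Date : 10-Aug-2018</span></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Spans that sit inside a <p> that sits inside table.nested4.
spans = soup.select("table.nested4 p span")

# Trim whitespace and drop the empty span.
lines = [s.get_text(strip=True) for s in spans if s.get_text(strip=True)]
print(lines)
```

The span holding the MCO date is skipped because it is not inside the paragraph, which is the filtering the question asks for.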

Problem extracting text of td from table row (tr) with scrapy

I am parsing a data table from the following URL:
https://www.signalstart.com/search-signals
In particular, I am trying to extract the data from the table rows.
The table row has a series of table-data cells:
<table class="table table-striped table-bordered dataTable table-hover" id="searchSignalsTable">
<thead>
<tr>
<th class="sorting sorting_asc">Rank</th>
<th class="sorting ">Name</th>
<th class="sorting ">Gain</th>
<th class="sorting ">Pips</th>
<th class="sorting ">DD</th>
<th class="sorting ">Trades</th>
<th class="sorting ">Type</th>
<th>Monthly</th>
<th>Chart</th>
<th class="sorting ">Price</th>
<th class="sorting " style="width: 40px">Age</th>
<th class="sorting " style="width: 70px">Added</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/joker-1k/110059">Joker 1k</a> </td>
<td><span class="red">-9.99%</span></td>
<td><span class="green">2,092.3</span></td>
<td>15.3%</td>
<td>108</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark110059"><canvas width="12" height="25" style="display: inline-block; vertical-align: top; width: 12px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark110059"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$30</td>
<td>
1m 24d
</td>
<td>
Mar 29, 2020
</td>
<td><a onclick="getMasterPricingData('110059');" data-toggle="modal"><button id="subscribeToMasterBtn110059" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="110059" value="-1.78,-3.68,-4.86">
<input type="hidden" class="dailyGrowthData" oid="110059" value="0.00,-0.03,-1.78,-5.69,-6.75,-5.59,-7.61,-5.31,-6.20,-3.81,-4.40,-8.00,-2.88,-3.78,-4.38,-0.20,-5.40,-10.66,-13.69,-12.51,-13.23,-9.99">
<input type="hidden" class="dailyEquityData" oid="110059" value="0.00,-0.23,-1.41,-5.02,-6.25,-4.29,-6.68,-3.91,-5.37,-4.10,-4.40,-3.59,-1.78,-1.75,-2.65,-0.21,-4.87,-10.76,-13.90,-11.58,-13.23,-10.18">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/fxabakus/56043">FXabakus</a> </td>
<td><span class="red">-19.57%</span></td>
<td><span class="red">-8,615.2</span></td>
<td>42%</td>
<td>1642</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark56043"><canvas width="80" height="25" style="display: inline-block; vertical-align: top; width: 80px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark56043"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$30</td>
<td>
1y 7m
</td>
<td>
May 4, 2019
</td>
<td><a onclick="getMasterPricingData('56043');" data-toggle="modal"><button id="subscribeToMasterBtn56043" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="56043" value="1.22,1.35,3.92,1.35,-1.57,1.77,2.01,1.11,0.38,-14.89,-14.70,-5.21,5.97,7.03,-17.54,2.92,3.11,-8.94,13.38,1.77">
<input type="hidden" class="dailyGrowthData" oid="56043" value="-27.87,-29.29,-29.01,-26.76,-25.76,-25.59,-30.57,-30.13,-29.78,-29.60,-29.25,-28.34,-28.07,-27.89,-25.20,-25.08,-23.66,-23.46,-21.54,-21.02,-21.62,-20.28,-18.31,-26.97,-27.48,-27.00,-28.21,-24.20,-23.46,-30.04,-31.37,-34.62,-33.84,-32.87,-32.20,-30.99,-30.43,-30.30,-29.75,-27.64,-27.45,-24.34,-24.71,-24.09,-24.15,-21.48,-21.08,-20.97,-19.54,-19.57">
<input type="hidden" class="dailyEquityData" oid="56043" value="-27.87,-29.29,-28.89,-26.76,-25.76,-28.10,-34.47,-32.34,-31.54,-40.80,-32.76,-32.90,-33.50,-30.65,-25.37,-25.05,-22.88,-23.29,-21.54,-21.02,-21.54,-20.90,-19.11,-27.76,-35.15,-29.17,-27.79,-24.20,-26.23,-34.32,-35.95,-51.20,-33.84,-32.76,-32.71,-31.62,-30.43,-39.93,-29.75,-27.64,-28.35,-27.62,-28.41,-24.20,-24.51,-22.06,-21.08,-20.97,-18.82,-30.27">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/af-investing-pro-final/122603">AF Investing Pro Final</a> </td>
<td><span class="green">56.69%</span></td>
<td><span class="green">29,812</span></td>
<td>8.6%</td>
<td>476</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark122603"><canvas width="8" height="25" style="display: inline-block; vertical-align: top; width: 8px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark122603"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$250</td>
<td>
17d 12h
</td>
<td>
Apr 30, 2020
</td>
<td><a onclick="getMasterPricingData('122603');" data-toggle="modal"><button id="subscribeToMasterBtn122603" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="122603" value="55.18,0.98">
<input type="hidden" class="dailyGrowthData" oid="122603" value="-0.02,0.04,54.78,55.02,55.18,55.82,55.86,55.99,56.06,56.25,56.69">
<input type="hidden" class="dailyEquityData" oid="122603" value="-8.60,16.85,54.86,54.11,55.44,55.85,54.38,52.15,45.00,51.07,56.25">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/rapid-growth/111340">Rapid growth</a> </td>
<td><span class="green">130.78%</span></td>
<td><span class="green">1,102.9</span></td>
<td>44.3%</td>
<td>126</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark111340"><canvas width="12" height="25" style="display: inline-block; vertical-align: top; width: 12px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark111340"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$31</td>
<td>
2m 8d
</td>
<td>
Apr 1, 2020
</td>
<td><a onclick="getMasterPricingData('111340');" data-toggle="modal"><button id="subscribeToMasterBtn111340" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="111340" value="87.85,18.28,3.87">
<input type="hidden" class="dailyGrowthData" oid="111340" value="0.00,0.64,1.40,1.40,1.90,2.91,7.53,8.21,11.19,11.30,17.60,19.60,23.03,37.74,47.75,54.75,59.91,69.79,73.60,79.36,87.85,93.14,93.40,94.70,95.93,96.01,99.95,100.71,101.85,102.10,102.12,104.36,108.76,110.11,110.14,110.23,112.58,115.10,115.54,117.17,121.24,122.19,123.40,124.18,124.88,124.89,130.09,130.78">
<input type="hidden" class="dailyEquityData" oid="111340" value="-1.80,0.67,0.97,1.91,-0.64,2.58,6.82,6.72,8.65,8.46,16.29,17.71,19.96,34.10,47.24,51.91,59.07,69.79,73.58,79.26,88.01,91.03,93.43,87.85,96.19,95.80,100.29,95.63,98.94,101.71,98.33,104.12,108.26,108.46,86.24,108.42,112.83,114.51,94.42,116.29,120.16,121.93,123.05,115.67,122.81,124.45,130.47,130.14">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/dream-presentation-1/66543">Dream Presentation 1</a> </td>
<td><span class="red">-99.9%</span></td>
<td><span class="red">-2,724.1</span></td>
<td>99.9%</td>
<td>1612</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark66543"><canvas width="28" height="25" style="display: inline-block; vertical-align: top; width: 28px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark66543"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$30</td>
<td>
6m 13d
</td>
<td>
Nov 8, 2019
</td>
<td><a onclick="getMasterPricingData('66543');" data-toggle="modal"><button id="subscribeToMasterBtn66543" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="66543" value="-100.14,-98.54,-98.79,-91.71,-98.23,-100.00,-88.82">
<input type="hidden" class="dailyGrowthData" oid="66543" value="24.18,-99.90,-99.89,-99.88,-99.88,-99.88,-99.87,-99.87,-99.86,-99.84,-99.83,-99.90,-99.89,-99.90,-99.90,-99.81,-99.81,-99.80,-99.90,-99.90,-99.86,-99.83,-99.79,-99.90,-99.90,-99.90,-99.88,-99.89,-99.89,-99.88,-99.82,-99.74,-99.85,-99.37,-99.88,-99.90,-99.90,-99.90,-99.90,-99.87,-99.83,-99.80,-99.75,-99.64,-99.56,-99.90,-99.90">
<input type="hidden" class="dailyEquityData" oid="66543" value="7.87,-99.90,-99.89,-99.88,-99.88,-99.88,-99.88,-99.87,-99.86,-99.84,-99.83,-99.90,-99.89,-99.90,-99.89,-99.83,-99.88,-99.88,-99.90,-99.90,-99.87,-99.83,-99.84,-99.72,-99.90,-99.90,-99.88,-99.89,-99.88,-99.92,-99.86,-99.74,-99.86,-99.39,-99.88,-99.90,-99.90,-99.90,-99.90,-99.87,-99.83,-99.79,-99.76,-99.63,-99.55,-100.16,-99.83">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/limerence-ea-suite-3/93679">Limerence EA Suite 3</a> </td>
<td><span class="green">1,246.66%</span></td>
<td><span class="green">199.8</span></td>
<td>34.2%</td>
<td>8</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark93679"><canvas width="20" height="25" style="display: inline-block; vertical-align: top; width: 20px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark93679"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$75</td>
<td>
7m 11d
</td>
<td>
Feb 11, 2020
</td>
<td><a onclick="getMasterPricingData('93679');" data-toggle="modal"><button id="subscribeToMasterBtn93679" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="93679" value="95.40,82.01,94.38,87.49,3.90">
<input type="hidden" class="dailyGrowthData" oid="93679" value="0.00,95.40,255.64,591.28,552.49,1234.12,1196.10,1246.66">
<input type="hidden" class="dailyEquityData" oid="93679" value="0.00,95.40,255.64,591.28,1034.76,1234.12,1196.10,1246.66">
</div>
</td>
</tr>
<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" target="_blank" href="https://www.signalstart.com/analysis/easy-money/31727">Easy Money</a> </td>
<td><span class="red">-99.9%</span></td>
<td><span class="green">2,430.6</span></td>
<td>100%</td>
<td>1095</td>
<td>Real</td>
<td><span class="monthlySparkline" id="monthlySpark31727"><canvas width="96" height="25" style="display: inline-block; vertical-align: top; width: 96px; height: 25px;"></canvas></span></td>
<td><span class="dayliSparkline" id="dayliSpark31727"><canvas width="100" height="25" style="display: inline-block; vertical-align: top; width: 100px; height: 25px;"></canvas></span></td>
<td>$30</td>
<td>
2y 2m
</td>
<td>
Apr 1, 2018
</td>
<td><a onclick="getMasterPricingData('31727');" data-toggle="modal"><button id="subscribeToMasterBtn31727" class="btn btn-circle btn-sm green" type="button">Copy</button></a>
<div style="display: none;">
<input type="hidden" class="monthlyData" oid="31727" value="6.22,-6.15,22.04,-5.08,0.08,12.08,-69.31,-99.82,245.26,88.44,113.73,52.29,25.38,77.72,-29.07,-24.73,-86.48,-89.27,195.77,-7.65,-99.98,278.89,-69.98,-65.48">
<input type="hidden" class="dailyGrowthData" oid="31727" value="-99.66,-99.69,-99.72,-99.73,-99.77,-99.77,-99.78,-99.81,-99.90,-99.90,-99.89,-99.84,-99.83,-99.82,-99.81,-99.75,-99.78,-99.77,-99.79,-99.78,-99.77,-99.48,-99.46,-99.36,-99.34,-99.33,-99.33,-99.31,-99.33,-99.34,-99.40,-99.45,-99.33,-99.58,-99.65,-99.73,-99.71,-99.70,-99.68,-99.68,-99.69,-99.68,-99.71,-99.68,-99.80,-99.80,-99.77,-99.81,-99.84,-99.90">
<input type="hidden" class="dailyEquityData" oid="31727" value="-99.66,-99.69,-99.73,-99.70,-99.85,-99.89,-99.95,-99.77,-99.85,-99.90,-99.88,-99.84,-99.83,-99.82,-99.79,-99.75,-99.78,-99.77,-99.70,-99.68,-99.59,-99.48,-99.46,-99.36,-99.34,-99.33,-99.32,-99.25,-99.30,-99.34,-99.37,-99.37,-99.35,-99.58,-99.61,-99.73,-99.71,-99.69,-99.68,-99.68,-99.68,-99.68,-99.71,-99.68,-99.80,-99.76,-99.73,-99.79,-99.80,-99.89">
</div>
</td>
</tr>
</tbody>
</table>
My code successfully extracts the data from the first table-data cell (the rank). But it is showing as blank for the second table data cell (the name). What is wrong with this source code:
import scrapy
from behold import Behold
class SignalStartSpider(scrapy.Spider):
    name = 'signalstart'
    start_urls = [
        'https://www.signalstart.com/search-signals',
    ]
    def parse(self, response):
        for provider in response.xpath("//div[@class='row']//tr"):
            yield {
                'rank': provider.xpath('td[1]/text()').get(),
                'name': provider.xpath('td[2]/text()').get(),
            }
UPDATE
I am now iterating over the td cells within tr and getting the td cells, but my final problem is: how to get the text from the td cells that I have?
import scrapy
from behold import Behold
class SignalStartSpider(scrapy.Spider):
    name = 'signalstart'
    start_urls = [
        'https://www.signalstart.com/search-signals',
    ]
    def parse(self, response):
        cols = "rank name gain pips drawdown trades type monthly chart price age added action"
        skip = [9,13]
        td = dict()
        for i, col in enumerate(cols.split()):
            td[i] = col
        Behold().show('td')
        for provider in response.xpath("//div[@class='row']//tr"):
            data_row = dict()
            for i, datum in enumerate(provider.xpath('td')):
                if i in skip:
                    continue
                data_row[td[i]] = datum
                # Behold().show('datum')
            yield data_row
The correct answer was provided by gallaecio_ in the Scrapy IRC channel - here is the code:
import scrapy
from behold import Behold
class SignalStartSpider(scrapy.Spider):
    name = 'signalstart'
    start_urls = [
        'https://www.signalstart.com/search-signals',
    ]
    def parse(self, response):
        cols = "rank name gain pips drawdown trades type monthly chart price age added action"
        skip = [9,13]
        td = dict()
        for i, col in enumerate(cols.split()):
            td[i] = col
        Behold().show('td')
        for provider in response.xpath("//div[@class='row']//tr"):
            data_row = dict()
            for i, datum in enumerate(provider.xpath('td/text()')):
                if i in skip:
                    continue
                data_row[td[i]] = datum.get()
                # Behold().show('datum')
            yield data_row
For more involved cases, you may need https://github.com/TeamHG-Memex/html-text
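Why td[2]/text() came back blank can be seen without Scrapy. The following is a minimal sketch using only the standard library's xml.etree.ElementTree on a shortened copy of one row from the table above; the trimmed markup and the helper name cell_text are assumptions made for illustration. The name sits inside an <a> child, so the cell itself has no direct text node. In Scrapy, the analogous approach would be to select descendant text, for example td[2]//text() or string(td[2]).

```python
import xml.etree.ElementTree as ET

# Shortened copy of one table row from the question (attributes trimmed).
ROW = """<tr>
<td style="text-align: center;"> - </td>
<td><a class="pointer" href="https://www.signalstart.com/analysis/rapid-growth/111340">Rapid growth</a> </td>
<td><span class="green">130.78%</span></td>
</tr>"""

row = ET.fromstring(ROW)


def cell_text(row, position):
    """Return all text inside the position-th <td>, including child elements."""
    td = row.find(f"td[{position}]")
    return "".join(td.itertext()).strip()


# Direct text of the second cell: there is none, because the text
# belongs to the <a> element nested inside the cell.
direct = row.find("td[2]").text

rank = cell_text(row, 1)
name = cell_text(row, 2)
gain = cell_text(row, 3)

print(repr(direct), rank, name, gain)
```

The same idea explains why the rank column worked: its text is a direct child of the td, while name, gain and pips are wrapped in a or span elements.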

Selenium selecting first row in a table even though I'm iterating through the rows

I'm really not sure what's going on here and cannot figure it out so hoping someone can help me.
Essentially, I am selecting a table, then finding all of the rows within that table. From there, I loop through the rows, select the three checkboxes in each row, and check whether they are selected. However, the result I get for every row is always that of the first row. For example:
False
True
False
then...
False
True
False
when it should change with each row.
Python/Selenium code:
table = driver.find_element_by_xpath("//table[@id='tableMscRatesModal']/tbody")
rows = table.find_elements(By.XPATH, "//tr[@class='ng-scope']") # get all of the rows in the tables
for row in rows:
    print(row.text)
    manager_box = row.find_element_by_xpath("//td/input[@type='radio'][contains(@name, 'userRole')][@value='1']")
    standard_box = row.find_element_by_xpath("//td/input[@type='radio'][contains(@name, 'userRole')][@value='2']")
    no_box = row.find_element_by_xpath("//td/input[@type='radio'][contains(@name, 'userRole')][@value='3']")
    print(manager_box.is_selected())
    print(standard_box.is_selected())
    print(no_box.is_selected())
    if not (manager_box.is_selected() or standard_box.is_selected() or no_box.is_selected()):
        standard_box.send_keys(Keys.SPACE)
I have included an example of two rows below. This is repeated exactly the same for x amount of rows.
HTML:
<table id="tableMscRatesModal" ng-table="tableParams" class="table table-striped table-bordered" show-filter="false">
<colgroup>
<col span="1" style="width: 13%;">
<col span="1" style="width: 13%;">
<col span="1" style="width: 8%;">
<col span="1" style="width: 8%;">
<col span="1" style="width: 8%;">
<col span="1" style="width: 25%;">
<col span="1" style="width: 25%;">
</colgroup>
<tbody>
<tr>
<th>First name</th>
<th>Last name</th>
<th>Manager</th>
<th>Standard user</th>
<th>No access</th>
<th>Proposed change</th>
<th ng-show="iAmRequestManagerWithProposals(getUserList())" class="ng-hide">
Approval
<br>
<br>
<div class="row approvals">
<div class="col-xs-12">
<div class="input-group">
<span class="input-group-addon">
<label ng-click="rejectAll()">
<input type="radio" name="rejectapproveall" id="rejectall" value="rejectall">
Reject all
</label>
</span>
<span class="input-group-addon">
<label ng-click="approveAll()">
<input type="radio" name="rejectapproveall" id="acceptall" value="acceptall">
Approve all
</label>
</span>
</div>
</div>
</div>
</th>
</tr>
<!-- ngRepeat: accessUser in manageAccessControl.usersAndRoles | orderBy:['sortOrder','lastName','firstName'] -->
<tr ng-repeat="accessUser in manageAccessControl.usersAndRoles | orderBy:['sortOrder','lastName','firstName']" class="ng-scope">
<td class="ng-binding">First Name</td>
<td class="ng-binding">Last Name</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.manager" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="1">
</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.stdUser" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="2">
</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.noAccess" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="3">
</td>
<td>
<span ng-show="accessUser.proposedRoleId" class="ng-hide">
<img tooltip-placement="left" tooltip-append-to-body="true" uib-tooltip-html="getRoleChangeProposal(accessUser)" src="/img/infoIcon.png" style="width: 12px; height: 12px">
Pending manager approval
</span>
<span ng-show="accessUser.isAmsAdmin" class="ng-hide">
<img tooltip-placement="left" tooltip-append-to-body="true" uib-tooltip-html="amsAdminUserTooltip" src="/img/infoIcon.png" style="width: 12px; height: 12px">
AMS admin user
</span>
</td>
<td ng-show="iAmRequestManagerWithProposals(getUserList())" class="ng-hide">
<div class="row approvals">
<div class="col-xs-12">
<div class="input-group ng-hide" ng-show="accessUser.proposedRoleId && accessUser.proposedRoleId > 0">
<span class="input-group-addon">
<label ng-click="setApprovalForUser(accessUser, hmwAccess.rejectApprove.rejected)">
<input type="radio" name="rejectapprove" id="reject" value="rejected" ng-model="accessUser.rejectApprove" class="ng-pristine ng-untouched ng-valid ng-empty">
Reject
</label>
</span>
<span class="input-group-addon">
<label ng-click="setApprovalForUser(accessUser, hmwAccess.rejectApprove.approved)">
<input type="radio" name="rejectapprove" id="accept" value="approved" ng-model="accessUser.rejectApprove" class="ng-pristine ng-untouched ng-valid ng-empty">
Approve
</label>
</span>
</div>
</div>
</div>
</td>
</tr>
<!-- end ngRepeat: accessUser in manageAccessControl.usersAndRoles | orderBy:['sortOrder','lastName','firstName'] -->
<tr ng-repeat="accessUser in manageAccessControl.usersAndRoles | orderBy:['sortOrder','lastName','firstName']" class="ng-scope">
<td class="ng-binding">First Name</td>
<td class="ng-binding">Last Name</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.manager" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="1">
</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.stdUser" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="2">
</td>
<td>
<input type="radio" name="userRole" ng-value="hmwAccess.roles.noAccess" ng-model="accessUser.selectedRoleId" ng-disabled="accessUser.proposedRoleId || accessUser.isAmsAdmin" class="ng-pristine ng-untouched ng-valid ng-not-empty" value="3">
</td>
<td>
<span ng-show="accessUser.proposedRoleId" class="ng-hide">
<img tooltip-placement="left" tooltip-append-to-body="true" uib-tooltip-html="getRoleChangeProposal(accessUser)" src="/img/infoIcon.png" style="width: 12px; height: 12px">
Pending manager approval
</span>
<span ng-show="accessUser.isAmsAdmin" class="ng-hide">
<img tooltip-placement="left" tooltip-append-to-body="true" uib-tooltip-html="amsAdminUserTooltip" src="/img/infoIcon.png" style="width: 12px; height: 12px">
AMS admin user
</span>
</td>
<td ng-show="iAmRequestManagerWithProposals(getUserList())" class="ng-hide">
<div class="row approvals">
<div class="col-xs-12">
<div class="input-group ng-hide" ng-show="accessUser.proposedRoleId && accessUser.proposedRoleId > 0">
<span class="input-group-addon">
<label ng-click="setApprovalForUser(accessUser, hmwAccess.rejectApprove.rejected)">
<input type="radio" name="rejectapprove" id="reject" value="rejected" ng-model="accessUser.rejectApprove" class="ng-pristine ng-untouched ng-valid ng-empty">
Reject
</label>
</span>
<span class="input-group-addon">
<label ng-click="setApprovalForUser(accessUser, hmwAccess.rejectApprove.approved)">
<input type="radio" name="rejectapprove" id="accept" value="approved" ng-model="accessUser.rejectApprove" class="ng-pristine ng-untouched ng-valid ng-empty">
Approve
</label>
</span>
</div>
</div>
</div>
</td>
</tr>
Thanks in advance!
When locating an element from another element with XPath, you need to use the current context: start the expression with a dot (.//). An expression that starts with // searches the whole document, which is why every row returned the first row's inputs.
for row in rows:
    manager_box = row.find_element_by_xpath(".//td/input[@type='radio'][contains(@name, 'userRole')][@value='1']")
    standard_box = row.find_element_by_xpath(".//td/input[@type='radio'][contains(@name, 'userRole')][@value='2']")
    no_box = row.find_element_by_xpath(".//td/input[@type='radio'][contains(@name, 'userRole')][@value='3']")
You can also simplify your code if you use a list:
checkboxes = row.find_elements_by_xpath(".//td/input[@type='radio'][contains(@name, 'userRole')]")
for checkbox in checkboxes:
    print(checkbox.is_selected())
And even further if you drop the print(row.text) and search from the driver instead of from each row:
checkboxes = driver.find_elements_by_xpath("//table[@id='tableMscRatesModal']/tbody//tr[@class='ng-scope']//td/input[@type='radio']")
for i in range(0, len(checkboxes), 3):
    print(checkboxes[i].is_selected())
    print(checkboxes[i + 1].is_selected())
    print(checkboxes[i + 2].is_selected())
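The difference between a document-wide search and a row-relative search can be reproduced without a browser. This is only a sketch with the standard library's xml.etree.ElementTree, not Selenium itself. The two-row table and its checked attributes are invented for illustration, and searching from the table root stands in for what an XPath starting with // does in Selenium.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened version of the table: a different radio is
# checked in each row so the two search strategies give different answers.
TABLE = """<table id="tableMscRatesModal"><tbody>
<tr class="ng-scope">
  <td><input type="radio" name="userRole" value="1" checked="checked"/></td>
  <td><input type="radio" name="userRole" value="2"/></td>
</tr>
<tr class="ng-scope">
  <td><input type="radio" name="userRole" value="1"/></td>
  <td><input type="radio" name="userRole" value="2" checked="checked"/></td>
</tr>
</tbody></table>"""

table = ET.fromstring(TABLE)
rows = table.findall(".//tr[@class='ng-scope']")


def is_checked(element):
    """Treat the presence of a checked attribute as 'selected'."""
    return element.get("checked") is not None


# Searching from the table root on every iteration always finds the
# first matching input in the document, whatever row we are "in".
from_root = [is_checked(table.find(".//input[@value='1']")) for _ in rows]

# Searching relative to each row finds that row's own input.
from_row = [is_checked(row.find(".//input[@value='1']")) for row in rows]

print("document-wide:", from_root)
print("row-relative: ", from_row)
```

The document-wide list repeats the first row's state for every row, which matches the symptom described in the question.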

BadRequestKeyError: 400 Bad Request Python Web Scraping

I'm getting the following error in Python when I try to do some scraping:
Traceback (most recent call last):
  File "", line 26, in
    signin2.fields["ctl06$txtParam_1"].value = '139210'
  File "C:\Users\Alvaro Pabon\Anaconda3\lib\site-packages\werkzeug\datastructures.py", line 781, in __getitem__
    raise exceptions.BadRequestKeyError(key)
BadRequestKeyError: 400 Bad Request: The browser (or proxy) sent a request that this server could not understand.
I provide the html and the python code, what am I doing wrong?
HTML:
<form method="post" action="Default.aspx?IdControl=SolicitarReporteUC&TipoProceso=G" id="Form1">
<div class="aspNetHidden">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKLTE2MjczMjc4MQ9kFgICAw9kFgICBQ9kFgJmD2QWDgIBDxAPFgYeDkRhdGFWYWx1ZUZpZWxkBQpDb2RSZXBvcnRlHg1EYXRhVGV4dEZpZWxkBQdSZXBvcnRlHgtfIURhdGFCb3VuZGdkEBUBI0NlcnRpZmljYWRvIGRlIGhpc3RvcmlhIGxhYm9yYWwgRlBNFQEFMTAwOTUUKwMBZxYBZmQCAw9kFgJmD2QWAgIBD2QWAgIBDw9kFgIeB29uY2xpY2sFdmphdmFzY3JpcHQ6cmV0dXJuIEJ1c2NhckNvblBvc3RCYWNrKCdFbXBsZWFkb19WSVBQJywnQ29kRW1wbGVhZG8nLCdFbXBsZWFkbycsJycsJ2N0bDA2X3R4dFBhcmFtXzEnLCdjdGwwNl90eHREZXNjXzEnKTtkAgcPDxYCHgRUZXh0ZWRkAgkPEA8WAh4HVmlzaWJsZWdkEBUBA1BERhUBA1BERhQrAwFnZGQCCw8PFgIeB0VuYWJsZWRnZGQCDQ8PFgIfBGVkZAIRDzwrAAsBAA8WCB4IRGF0YUtleXMWAB4LXyFJdGVtQ291bnQCAR4JUGFnZUNvdW50AgEeFV8hRGF0YVNvdXJjZUl0ZW1Db3VudAIBZBYCZg9kFgICAg9kFgxmD2QWAgIDDw8WAh4LTmF2aWdhdGVVcmwFOkRlZmF1bHQuYXNweD9JZENvbnRyb2w9UGV0aWNpb25lc1ZlclVDJkNvZFBldGljaW9uPTk4NDI0NjZkZAIBDw8WAh8EBQc5ODQyNDY2ZGQCAg8PFgIfBAUKMDQvMDcvMjAxN2RkAgMPDxYCHwQFLENlcnRpZmljYWRvIGRlIGhpc3RvcmlhIGxhYm9yYWwgRlBNKDEzOTIxMCwpZGQCBA8PFgIfBAUBVGRkAgUPDxYCHwQFCVRlcm1pbmFkb2RkZG9xWba643oqthJTATkgc95Acvr6oJVDDdMGc4QiUOHQ" />
</div>
<script type="text/javascript">
//<![CDATA[
var theForm = document.forms['Form1'];
if (!theForm) {
theForm = document.Form1;
}
function __doPostBack(eventTarget, eventArgument) {
if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
theForm.__EVENTTARGET.value = eventTarget;
theForm.__EVENTARGUMENT.value = eventArgument;
theForm.submit();
}
}
//]]>
</script>
<script src="/peoploEL/WebResource.axd?d=Vo5dwRm0erdgUaaz932BKtVNZGJOgXKXcR91FZwwFfehyhj6Sl2EkKnl2mAONakSWUxeINyfjibWOjKY8z8OLswtutIQ6CR4NPqhOOhW3-c1&t=635195493660000000" type="text/javascript"></script>
<script type="text/javascript">
//<![CDATA[
var __cultureInfo = {"name":"es-CO","numberFormat":{"CurrencyDecimalDigits":2,"CurrencyDecimalSeparator":",","IsReadOnly":true,"CurrencyGroupSizes":[3],"NumberGroupSizes":[3],"PercentGroupSizes":[3],"CurrencyGroupSeparator":".","CurrencySymbol":"$","NaNSymbol":"NeuN","CurrencyNegativePattern":14,"NumberNegativePattern":1,"PercentPositivePattern":0,"PercentNegativePattern":0,"NegativeInfinitySymbol":"-Infinito","NegativeSign":"-","NumberDecimalDigits":2,"NumberDecimalSeparator":",","NumberGroupSeparator":".","CurrencyPositivePattern":2,"PositiveInfinitySymbol":"Infinito","PositiveSign":"+","PercentDecimalDigits":2,"PercentDecimalSeparator":",","PercentGroupSeparator":".","PercentSymbol":"%","PerMilleSymbol":"‰","NativeDigits":["0","1","2","3","4","5","6","7","8","9"],"DigitSubstitution":1},"dateTimeFormat":{"AMDesignator":"a.m.","Calendar":{"MinSupportedDateTime":"\/Date(-62135578800000)\/","MaxSupportedDateTime":"\/Date(253402300799999)\/","AlgorithmType":1,"CalendarType":1,"Eras":[1],"TwoDigitYearMax":2029,"IsReadOnly":true},"DateSeparator":"/","FirstDayOfWeek":0,"CalendarWeekRule":0,"FullDateTimePattern":"dddd, dd\u0027 de \u0027MMMM\u0027 de \u0027yyyy hh:mm:ss tt","LongDatePattern":"dddd, dd\u0027 de \u0027MMMM\u0027 de \u0027yyyy","LongTimePattern":"hh:mm:ss tt","MonthDayPattern":"dd MMMM","PMDesignator":"p.m.","RFC1123Pattern":"ddd, dd MMM yyyy HH\u0027:\u0027mm\u0027:\u0027ss \u0027GMT\u0027","ShortDatePattern":"dd/MM/yyyy","ShortTimePattern":"hh:mm tt","SortableDateTimePattern":"yyyy\u0027-\u0027MM\u0027-\u0027dd\u0027T\u0027HH\u0027:\u0027mm\u0027:\u0027ss","TimeSeparator":":","UniversalSortableDateTimePattern":"yyyy\u0027-\u0027MM\u0027-\u0027dd HH\u0027:\u0027mm\u0027:\u0027ss\u0027Z\u0027","YearMonthPattern":"MMMM\u0027 de 
\u0027yyyy","AbbreviatedDayNames":["dom","lun","mar","mié","jue","vie","sáb"],"ShortestDayNames":["do","lu","ma","mi","ju","vi","sá"],"DayNames":["domingo","lunes","martes","miércoles","jueves","viernes","sábado"],"AbbreviatedMonthNames":["ene","feb","mar","abr","may","jun","jul","ago","sep","oct","nov","dic",""],"MonthNames":["enero","febrero","marzo","abril","mayo","junio","julio","agosto","septiembre","octubre","noviembre","diciembre",""],"IsReadOnly":true,"NativeCalendarName":"calendario gregoriano","AbbreviatedMonthGenitiveNames":["ene","feb","mar","abr","may","jun","jul","ago","sep","oct","nov","dic",""],"MonthGenitiveNames":["enero","febrero","marzo","abril","mayo","junio","julio","agosto","septiembre","octubre","noviembre","diciembre",""]},"eras":[1,"d.C.",null,0]};//]]>
</script>
<script src="/peoploEL/ScriptResource.axd?d=oxaJQOalmF_Pc9FHyAFTk_k6TF1NEbUrjIYsB44pk6WCbYo_nSIw4yk5tC2xEtvEorNRA5gOfFsIU4ZnWzjKxobYxQm7qlMyDI-yMbMSd2l6ZDbJap8N8TY6mfiS7PCqS0ZD_N1nysIMDoEuJENdCQ2&t=23c9c237" type="text/javascript"></script>
<script type="text/javascript">
//<![CDATA[
if (typeof(Sys) === 'undefined') throw new Error('ASP.NET Ajax client-side framework failed to load.');
//]]>
</script>
<div class="aspNetHidden">
<input type="hidden" name="__SCROLLPOSITIONX" id="__SCROLLPOSITIONX" value="0" />
<input type="hidden" name="__SCROLLPOSITIONY" id="__SCROLLPOSITIONY" value="0" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEdAArkW6hVSYy1X/RA+Sj0CGQLGp+bdMCDYaJlV2GIWm9IvBdcfX0kLMsTvDhzcFP+5BCmu+5iWjvwd5K06ry8EbPN8eAu30BFMFNpn4fF9w5RD0sfx0Rt1Zoo22r6RgHWIEvbk+/Q0viP1b4fioHhV6vuLByhWnJD/fsZOTyD54nbDa+qASD48033XmTIh5CNr4axLA/MabVFryGhaiI+QVUeJtZhbNAXh60wJUXNyENePpp0PUjhju74p8tImEJGpMk=" />
</div>
<TABLE id="Table1" border="0" cellSpacing="0" cellPadding="0" width="80%" align="center"
height="72%">
<TR>
<TD height="25" vAlign="top" width="165" align="center"></TD>
<TD height="25" width="10"></TD>
<TD height="25" vAlign="top"></TD>
</TR>
<TR>
<TD vAlign="top" width="165" align="center">
<LINK rel="stylesheet" type="text/css" href="EstilosWeb.css">
<LINK rel="stylesheet" type="text/css" href="EstilosWeb.css">
<TABLE style="WIDTH: 160px; HEIGHT: 64px" id="tMain" class="main" cellPadding="0" width="160">
<TR vAlign="top">
<TD id="NavTd">
<DIV id="Nav">
<H4 align="center">Menu
<table id="PanelIzquierdoUC1_htbCategorias" cellspacing="0" cellpadding="0" style="border-width:0px;width:160px;border-collapse:collapse;">
<tr>
<td><a id="PanelIzquierdoUC1_ConsultarLiquidacion" title="Consulta de Liquidación" href="Default.aspx?IdControl=ConsultaLiquidacionFltUC">Consultar Liquidación</a></td>
</tr><tr>
<td><a id="PanelIzquierdoUC1_Reportes" title="Certificado Ing. y Ret." href="Default.aspx?IdControl=ReportesUC">Certificado Ing. y Ret.</a></td>
</tr><tr>
<td><a id="PanelIzquierdoUC1_CambiarClave" title="Cambio de Clave" href="Default.aspx?IdControl=CambioClaveUC">Cambio de Clave</a></td>
</tr><tr>
<td><a id="PanelIzquierdoUC1_ReportesGeneral" title="Reportes" href="Default.aspx?IdControl=SolicitarReporteUC&TipoProceso=G">Reportes</a></td>
</tr><tr>
<td><a id="PanelIzquierdoUC1_CerrarSesion" title="Cerrar Sesion" href="Default.aspx?IdControl=CerrarSesionUC">Cerrar Sesion</a></td>
</tr>
</table></H4>
</DIV>
</TD>
</TR>
</TABLE>
</TD>
<td width="10"> </td>
<TD vAlign="top">
<div id="pnlCargaUserControl" style="width:100%;">
<LINK href="EstilosWeb.css" type="text/css" rel="stylesheet">
<style type="text/css">
.style1
{
height: 26px;
width: 36px;
}
</style>
<TABLE class="FormaTabla" id="Table1" cellSpacing="1" cellPadding="1" width="300" border="0">
<TR>
<TD class="FormaEncabezado" colSpan="2">Reportes</TD>
</TR>
<TR>
<TD colSpan="2">
<P align="center"> </P>
</TD>
</TR>
<TR>
<TD colSpan="2"><select size="4" name="ctl06$lstReportes" onchange="javascript:setTimeout('__doPostBack(\'ctl06$lstReportes\',\'\')', 0)" id="ctl06_lstReportes" class="FormaInfo" style="height:215px;width:564px;">
<option selected="selected" value="10095">Certificado de historia laboral FPM</option>
</select></TD>
</TR>
<TR>
<TD colSpan="2">Parametros</TD>
</TR>
<TR>
<TD style="HEIGHT: 45px" colSpan="2"><table id="ctl06_tbParametros" rules="all" border="1">
<tr>
<td>Empleado</td><td><input name="ctl06$txtParam_1" type="text" value="139211" readonly="readonly" onchange="javascript:setTimeout('__doPostBack(\'ctl06$txtParam_1\',\'\')', 0)" onkeypress="if (WebForm_TextBoxKeyHandler(event) == false) return false;" id="ctl06_txtParam_1" Tabla="Empleado_VIPP" CodigoCampo="CodEmpleado" DescripcionCampo="Empleado" Condicion="" TipoDato="N" Parametro="Empleado" /><input type="submit" name="ctl06$btnParam_1" value="..." id="ctl06_btnParam_1" disabled="disabled" class="aspNetDisabled" onclick="javascript:return BuscarConPostBack('Empleado_VIPP','CodEmpleado','Empleado','','ctl06_txtParam_1','ctl06_txtDesc_1');" style="width:25px;" /></td><td><input name="ctl06$txtDesc_1" type="text" value="JUAN DE LOS PALOTES" readonly="readonly" id="ctl06_txtDesc_1" style="width:250px;" /></td>
</tr>
</table></TD>
</TR>
<TR>
<TD class="style1">
</TD>
<td>
<P align="center"><select name="ctl06$ddlFormato" id="ctl06_ddlFormato" style="width:104px;">
<option value="PDF">PDF</option>
</select> <input type="submit" name="ctl06$btnAceptar" value="Aceptar" id="ctl06_btnAceptar" />
</P>
</td>
</TR>
<TR>
<TD colSpan="2">
<P align="left"><span id="ctl06_lblMensaje" style="color:Red;font-family:Arial;"></span></P>
</TD>
</TR>
</TABLE>
<P>
<input type="submit" name="ctl06$ButActualizar" value="Actualizar" id="ctl06_ButActualizar" /></P>
<P><table class="FormaGrid" cellspacing="0" rules="all" border="1" id="ctl06_dtgDatos" style="border-collapse:collapse;">
<tr>
<td> </td><td>CodPeticion</td><td>FechaHora</td><td>Peticion</td><td>Estado</td><td>DetalleEstado</td>
</tr><tr>
<td style="white-space:nowrap;">
<a id="ctl06_dtgDatos_ctl03_cmdVer" href="javascript:__doPostBack('ctl06$dtgDatos$ctl03$cmdVer','')">Ver</a>
</td><td>9842466</td><td>04/07/2017</td><td>Certificado(139211,)</td><td>T</td><td>Terminado</td>
</tr><tr>
<td colspan="6"><span>1</span></td>
</tr>
</table></P>
</div>
</TD>
</TR>
</TABLE>
PYTHON:
form2 = browser.get_form(id='Form1')
form2["ctl06$txtParam_1"].value = '139211'
form2["ctl06$txtDesc_1"].value = 'JUAN DE LOS POTES'
form2["ctl06$ddlFormato"].value = 'PDF'
form2["ctl06$lstReportes"].value = '10095'
form2["__EVENTTARGET"].value = 'ctl06$dtgDatos$ctl03$cmdVer'
form2["__EVENTARGUMENT"].value = ''
browser.submit_form(form2)
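Use the Python requests library for that.
Build a dictionary of the form fields and send it as the body of a POST request (in the code below it is named header, but it is passed as form data, not as HTTP headers), and remember to include __EVENTTARGET and __EVENTARGUMENT. The hidden parameters change every few minutes (depending on the website), so read them fresh from the page before each request.
It is easier if you use the POST method, and it helps to check the request in Postman once before sending it from code.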
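import requests
# url, index and token (the hidden fields read from the page) are defined elsewhere.
# Despite its name, header holds the form fields: requests.post() sends its
# second positional argument as the form-encoded request body.
header = {
    "ctl00$ContentPlaceHolder1$txt_tradename": str(index),
    "ctl00$ContentPlaceHolder1$txtSearchTin": "",
    "ctl00$ContentPlaceHolder1$ddl_dist": 2,
    "ctl00$ContentPlaceHolder1$btnDlrSearch": "Search",
    "__EVENTVALIDATION": token.get("__EVENTVALIDATION", ""),
    "__VIEWSTATEGENERATOR": token.get("__VIEWSTATEGENERATOR"),
    "__VIEWSTATE": token.get("__VIEWSTATE")
}
try:
    req = requests.post(url, header)
    req.raise_for_status()
except requests.RequestException as exc:
    print("Request failed:", exc)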
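The answer relies on a token mapping holding the hidden ASP.NET fields but does not show where it comes from. One possible way to build it, sketched here with only the standard library's html.parser, is to collect every hidden input from the page before posting. The class name HiddenFieldParser, the shortened PAGE markup and the PLACEHOLDER values are assumptions for illustration; on a real site the page would first be downloaded and the values would be the long strings seen in the question.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden">."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "input"
                and attributes.get("type") == "hidden"
                and "name" in attributes):
            self.fields[attributes["name"]] = attributes.get("value") or ""


# Shortened stand-in for the downloaded page; the values are placeholders.
PAGE = """
<form method="post" action="Default.aspx" id="Form1">
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="PLACEHOLDER_VIEWSTATE" />
<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="PLACEHOLDER_VALIDATION" />
<input name="ctl06$txtParam_1" type="text" value="139211" />
</form>
"""

parser = HiddenFieldParser()
parser.feed(PAGE)

# Start from the hidden fields, then add the values the script controls.
payload = dict(parser.fields)
payload["__EVENTTARGET"] = "ctl06$dtgDatos$ctl03$cmdVer"
payload["ctl06$txtParam_1"] = "139211"

print(sorted(payload))
```

The resulting payload dictionary is what would be passed as the form data of the POST request. Because the hidden values expire, this step would be repeated for each new page load.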

Parsing HTML with BeautifulSoup in Python

I am trying to parse HTML with Python using BeautifulSoup, but I can't manage to get what I need.
This is a small module of a personal app I want to build. It consists of a web login step with credentials; once the script is logged in to the website, I need to parse some information in order to process it.
The HTML code after getting logged is:
<div class="widget_title clearfix">
<h2>Account Balance</h2>
</div>
<div class="widget_body">
<div class="widget_content">
<table class="simple">
<tr>
<td>Daily Earnings</td>
<td style="text-align: right; width: 125px; color: #119911; font-weight: bold;">
150
</td>
</tr>
<tr>
<td>Weekly Earnings</td>
<td style="text-align: right; border-bottom: 1px solid #000; color: #119911; font-weight: bold;">
500 </td>
</tr>
<tr>
<td>Monthly Earnings</td>
<td style="text-align: right; color: #119911; font-weight: bold;">
1500 </td>
</tr>
<tr>
<td>Total expended</td>
<td style="text-align: right; border-bottom: 1px solid #000; color: #880000; font-weight: bold;">
430 </td>
</tr>
<tr>
<td>Account Balance</td>
<td style="text-align: right; border-bottom: 3px double #000; color: #119911; font-weight: bold;">
840 </td>
</tr>
<tr>
<td></td>
<td style="padding: 5px;">
<center>
<form id="request_bill" method="POST" action="index.php?page=dashboard">
<input type="hidden" name="secret_token" value="" />
<input type="hidden" name="request_payout" value="1" />
<input type="submit" class="btn blue large" value="Request Payout" />
</form>
</center>
</td>
</tr>
</table>
</div>
</div>
</div>
As you can see, it's not a very well-formatted HTML, but I'd need to extract the elements and their values, I mean, for example: "Daily earnings" and "150" | "Weekly earnings" and "500"...
I think that the "id" attribute may help, but when I try to parse it, it crashes.
The Python code I'm working with is:
def parseo(archivohtml):
    html = archivohtml
    parsed_html = BeautifulSoup(html)
    par = parsed_html.find('td', attrs={'id':'west1'}).string
    print par
Here archivohtml is the HTML file saved after logging in to the website.
When I run the script, I only get errors.
I've also tried doing this:
def parseo(archivohtml):
    soup = BeautifulSoup()
    html = archivohtml
    parsed_html = soup(html)
    par = soup.parsed_html.find('td', attrs={'id':'west1'}).string
    print par
But the result is still the same.
The tag with id="west1" is an <a> tag. You are looking for the <td> tag that comes after this <a> tag:
import BeautifulSoup as bs
content = '''<div class="widget_title clearfix">
<h2>Account Balance</h2>
</div>
<div class="widget_body">
<div class="widget_content">
<table class="simple">
<tr>
<td>Daily Earnings</td>
<td style="text-align: right; width: 125px; color: #119911; font-weight: bold;">
150
</td>
</tr>
<tr>
<td>Weekly Earnings</td>
<td style="text-align: right; border-bottom: 1px solid #000; color: #119911; font-weight: bold;">
500 </td>
</tr>
<tr>
<td>Monthly Earnings</td>
<td style="text-align: right; color: #119911; font-weight: bold;">
1500 </td>
</tr>
<tr>
<td>Total expended</td>
<td style="text-align: right; border-bottom: 1px solid #000; color: #880000; font-weight: bold;">
430 </td>
</tr>
<tr>
<td>Account Balance</td>
<td style="text-align: right; border-bottom: 3px double #000; color: #119911; font-weight: bold;">
840 </td>
</tr>
<tr>
<td></td>
<td style="padding: 5px;">
<center>
<form id="request_bill" method="POST" action="index.php?page=dashboard">
<input type="hidden" name="secret_token" value="" />
<input type="hidden" name="request_payout" value="1" />
<input type="submit" class="btn blue large" value="Request Payout" />
</form>
</center>
</td>
</tr>
</table>
</div>
</div>
</div>'''
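def parseo(archivohtml):
    html = archivohtml
    parsed_html = bs.BeautifulSoup(html)
    # find() returns None if the page has no <a id="west1">, so this
    # assumes the full page contains that tag before the table.
    par = parsed_html.find('a', attrs={'id':'west1'}).findNext('td')
    print par.string.strip()

parseo(content)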
yields
150
I can't tell from your question if this will be applicable to you, but here's another method:
from bs4 import BeautifulSoup  # stripped_strings requires BeautifulSoup 4

def parseo(archivohtml):
    html = archivohtml
    parsed_html = BeautifulSoup(html)
    for line in parsed_html.stripped_strings:
        print line.strip()
which yields:
Account Balance
Daily Earnings
150
Weekly Earnings
500
Monthly Earnings
1500
Total expended
430
Account Balance
840
And if you wanted the data in a list:
data = [line.strip() for line in parsed_html.stripped_strings]
[u'Account Balance', u'Daily Earnings', u'150', u'Weekly Earnings', u'500', u'Monthly Earnings', u'1500', u'Total expended', u'430', u'Account Balance', u'840']
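If you need each label paired with its value rather than a flat list, one option is to group the cell text by table row. The following is only a minimal sketch, and it makes several assumptions that are not in the original code: it targets Python 3, it uses the standard library's html.parser instead of BeautifulSoup, and the names RowPairParser, parse_pairs and sample are hypothetical. The sample string is a shortened copy of the table from the question.

```python
from html.parser import HTMLParser


class RowPairParser(HTMLParser):
    """Collect the text of each <td>, grouped by <tr> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per <tr>
        self._cell = None   # text pieces of the <td> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def parse_pairs(html):
    parser = RowPairParser()
    parser.feed(html)
    parser.close()
    # Keep only rows with exactly two cells, both non-empty
    # (this drops the row that holds the payout form).
    return [(r[0], r[1]) for r in parser.rows if len(r) == 2 and r[0] and r[1]]


# Shortened copy of the table from the question (assumption: same layout).
sample = """<table class="simple">
<tr>
<td>Daily Earnings</td>
<td style="text-align: right;">
150
</td>
</tr>
<tr>
<td>Weekly Earnings</td>
<td style="text-align: right;">
500 </td>
</tr>
<tr>
<td></td>
<td style="padding: 5px;"><form id="request_bill"><input type="submit" value="Request Payout" /></form></td>
</tr>
</table>"""

pairs = parse_pairs(sample)
for label, value in pairs:
    print(label, value)
```

Grouping by row keeps each label next to its own value even if a row is missing a cell, which a flat list of strings cannot guarantee.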
