I am trying to call my method which is returning values. I want to fetch the values and use them in my report.
#api.one
def check_month(self,record,res):
fd = datetime.strptime(str(record.from_date), "%Y-%m-%d")
for rec in record.sales_record_ids:
res.append(rec.jan_month)
#api.one
def get_sales_rec(self):
result=[]
target_records = self.env['sales.target'].search([('sales_team','=', self.sales_team_ids.id)])
for rec in target_records:
self.check_month(rec,result)
return result
like this in xml:
<tbody>
<tr t-foreach="get_sales_rec()" t-as="data">
<tr>
<td>
<span t-esc="data[0]" />
</td>
</tr>
</tr>
</tbody>
Change your xml code to:
<tbody>
<tr t-foreach="o.get_sales_rec()" t-as="data">
<tr>
<td>
<span t-esc="data[0]" />
</td>
</tr>
</tr>
</tbody>
Here o stands for the report model object , so make sure you have added a python method in the same object.
Related
is it possible to capture all EAN numbers in such a construct using XPath, or do I need to use regular expressions?
<table>
<tr>
<td>
EAN Giftbox
</td>
<td>
7350034654483
</td>
</tr>
<tr>
<td>
EAN Export Carton:
</td>
<td>
17350034643958
</td>
</tr>
</table>
I want to get a list of ['7350034654483', '17350034643958']
from lxml import html as lh
html = """<table>
<tr>
<td>
EAN Giftbox
</td>
<td>
7350034654483
</td>
</tr>
<tr>
<td>
EAN Export Carton:
</td>
<td>
17350034643958
</td>
</tr>
</table>
"""
root = lh.fragment_fromstring(html)
tds = root.xpath('//tr[*]/td[2]')
for td in tds:
print(td.text.strip())
Output:
7350034654483
17350034643958
Within each of the main tables respectively, there are two tables nested of which the first one contains the data A_A_A_A that i want to extract to a pandas.dataframe
<table>
<tr valign="top">
<td> </td>
<td>
<br/>
<center>
<h2>asd</h2>
</center>
<h4>asd</h4>
<table>
<tr>
</tr>
</table>
<table border="0" cellpadding="0" cellspacing="0" class="tabcol" width="100%">
<tr>
<td> </td>
</tr>
<tr>
<td width="3%"> </td>
<td>
<table border="0" width="100%">
<tr>
<td width="2%"> </td>
<td> A_A_A_A <br/> A_A_A_A 111-222<br/> </td>
<td width="2%"> </td>
</tr>
</table>
</td>
<td width="3%"> </td>
</tr>
<tr>
<td width="3%"> </td>
<td>
<table border="0" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td width="4%"> </td>
<td class="unique"> asd <br/> asd </td>
<td width="4%"> </td>
</tr>
</table>
</td>
<td width="3%"> </td>
</tr>
<tr>
<td> </td>
</tr>
</table>
<table border="0" cellpadding="0" cellspacing="0" class="tabcol" width="100%">
.
.
.
</table>
<br/>
<table>
</table>
</td>
</tr>
</table>
I figured that because of the limited availiability of attributes the only way to go forward would be an iteration over a td siblings with .next_siblings and if needed .next_elements
data1 = []
for item in soup.find_all('td', attrs={'width': '2%'}):
data = item.find_next_sibling().text
data1.append(data)
returns and empty list []. Now i dont know forward because i cannot identify any other helpful attributes/classes that would help me get to the middle td that contains the information.
.find_next(name=None, attrs={}, text=None, **kwargs)
Returns the first item that matches the given criteria and appears after this Tag in the document. So in your case:
item = soup.find('td', attrs={'width': '2%'})
data = item.find_next('td').text
Note that, I removed for loop since the desired data is coming after first td with width: '2%'. After running this, data will be:
' A_A_A_A A_A_A_A 111-222 '
I took #Wiktor Stribiżew answer from here regex for loop over list in python
and kind of merged it with yours #Rustam Garayev
item = soup.find_all('td', attrs={'width': '2%'})
data = [x.find_next('td').text for x in item]
since i needed not only the first AAAA but from all the following tables as well. The code above gives this output:
['A_A_A_A',
'\xa0',
'A_A_A_A',
'\xa0', ...]
which is good enough for my purpose. I think the '\xa0' comes from it trying to do the find_next on the third td sibling, which does not have a consecutive.
I am using scrapy to extract data.
There are thousands of product which i am scraping
The problem is the data on these pages is not consistent
ie.
<table class="c999 fs12 mt10 f-bold">
<tbody><tr>
<td width="16%">Type</td>
<td class="c222">Kurta</td>
</tr>
<tr>
<td>Fabric</td>
<td class="c222">Cotton</td>
</tr>
<tr>
<td>Sleeves</td>
<td class="c222">3/4th Sleeves</td>
</tr>
<tr>
<td>Neck</td>
<td class="c222">Mandarin Collar</td>
</tr>
<tr>
<td>Wash Care</td>
<td class="c222">Gentle Wash</td>
</tr>
<tr>
<td>Fit</td>
<td class="c222">Regular</td>
</tr>
<tr>
<td>Length</td>
<td class="c222">Knee Length</td>
</tr>
<tr>
<td>Color</td>
<td class="c222">Brown</td>
</tr>
<tr>
<td>Fabric Details</td>
<td class="c222">Cotton</td>
</tr>
<tr>
<td>
Style </td>
<td class="c222"> Printed</td>
</tr>
<tr>
<td>
SKU </td>
<td id="qa-sku" class="c222"> SR227WA70ROJINDFAS</td>
</tr>
<tr>
<td></td>
</tr>
</tbody></table>
So these rows are not consistent .
Sometimes the "Type" is at first position and sometimes it is at second.
I wrote the code to loop through the values and compare the value of 1st td if it is "Type" the get the value of its corresponding td but it is not working
Here is the code.
table_data = response.xpath('//*[#id="productInfo"]/table/tr')
for data in table_data:
name = data.xpath('td/text()').extract()
What should i do??
You can try using the following xpath :
name = data.xpath("td[position()=(count(../../tr/td[.='Type']/preceding-sibling::td)+1)]/text()").extract()
Above xpath filters <td> by position, returning only <td> in position equal to position of <td>Type</td>. Getting position of <td>Type</td> done by counting number of it's preceding sibling <td> plus one.
If you want to get sibling node of td containing string 'Type' no matter what is position of this td you can try following xpath:
//td[contains(text(),'Type')]/following-sibling::td/text()
Try this,
In [29]: response.xpath('//table[#class="c999 fs12 mt10 f-bold"]/tr[contains(td/text(), "Type")]/td[contains(text(), "Type")]/following-sibling::td/text()|//table[#class="c999 fs12 mt10 f-bold"]/tr[contains(td/text(), "Type")]/td[contains(text(), "Type")]/preceding-sibling::td/text()').extract()
Out[29]: [u'Kurta']
no matter whether td is coming after Type or before Type, This will work.
//table/tbody/tr/td[.="Fabric"]/../td[2]/text()
Did it with the above code
I am parsing a HTML file with BeautifulSoup and got stuck with < br> tags.
I want to append < br> tag after inserting a list element, but it didn't work.
What is the easiest way to do this?
soup = BeautifulSoup(open("test.html"))
mylist = [Item_1,Item_2]
for i in range(len(mylist)):
#insert Items to the 4. column
This is the default HTML:
<html>
<body>
<table>
<tr>
<th>
1. Column
</th>
<th>
2. Column
</th>
<th>
3. Column
</th>
<th>
4. Column
</th>
<th>
5. Column
</th>
<th>
6. Column
</th>
<th>
7. Column
</th>
<th>
8. Column
</th>
</tr>
<tr class="a">
<td class="h">
Text in first column
</td>
<td>
<br/>
</td>
<td>
<br/>
</td>
<td>
<!--I want to insert items here-->
</td>
<td>
1
</td>
<td>
37
</td>
<td>
38
</td>
<td>
38
</td>
</tr>
</table>
</body>
</html>
This is the HTML i want to make
<html>
<body>
<table>
<tr>
<th>
1. Column
</th>
<th>
2. Column
</th>
<th>
3. Column
</th>
<th>
4. Column
</th>
<th>
5. Column
</th>
<th>
6. Column
</th>
<th>
7. Column
</th>
<th>
8. Column
</th>
</tr>
<tr class="a">
<td class="h">
Text in first column
</td>
<td>
<br/>
</td>
<td>
<br/>
</td>
<td>
Item_1 <br>
Item_2
</td>
<td>
1
</td>
<td>
37
</td>
<td>
38
</td>
<td>
38
</td>
</tr>
</table>
</body>
</html>
To append a tag, first create it with the new_tag() factory function, like so:
soup.td.append(soup.new_tag('br'))
Consider the following program. For every table cell (that is, every td) in the html, it appends a <br/> tag and some text to the cell.
from bs4 import BeautifulSoup
html_doc = '''
<html>
<body>
<table>
<tr>
<td>
data1
</td>
<td>
data2
</td>
</tr>
</table>
</body>
</html>
'''
soup = BeautifulSoup(html_doc)
mylist = ['addendum 1', 'addendum 2']
for td,item in zip(soup.find_all('td'), mylist):
td.append(soup.new_tag('br'))
td.append(item)
print soup.prettify()
Result:
<html>
<body>
<table>
<tr>
<td>
data1
<br/>
addendum 1
</td>
<td>
data2
<br/>
addendum 2
</td>
</tr>
</table>
</body>
</html>
I am new to selenium (PYTHON) and stuck at one point and need help from experts.
My html is some thing like this:
<div id='d3_tree'>
<svg>
<g transform="translate(20,50)>
<g class='node'>
<foreignobject></foreignobject>
<original_title>
<table>
<tbody>
<tr>
<td>t1key1</td>
<td>t1val1</td>
</tr>
<tr>
<td>t1key2</td>
<td>t1val2</td>
</tr>
<tr>
<td>t1key3</td>
<td>t1val3</td>
</tr>
</tbody>
</table>
</original_title>
</g>
<g class='node pe_node'>
<foreignobject></foreignobject>
<original_title>
<table>
<tbody>
<tr>
<td>t2key1</td>
<td>t2val1</td>
</tr>
<tr>
<td>t2key2</td>
<td>t2val2</td>
</tr>
<tr>
<td>t2key3</td>
<td>t2val3</td>
</tr>
</tbody>
</table>
</original_title>
</g>
<g class='node pe_node'>
<foreignobject></foreignobject>
<original_title>
<table>
<tbody>
<tr>
<td>t3key1</td>
<td>t3val1</td>
</tr>
<tr>
<td>t3key2</td>
<td>t3val2</td>
</tr>
<tr>
<td>t3key3</td>
<td>t3val3</td>
</tr>
</tbody>
</table>
</original_title>
</g>
</g>
</svg>
</div>
what I need is all the elements of having class node.pe_node and inside every node.pe node element I need the text of third row second column of table. (t2val3 t3val3)
I am able to get the elements having class node.pe_node
pe_nodes = self.driver.find_elements_by_css_selector(".node.vm.node_pe")
Now I am iterating over the pe_nodes to get the the value of third column by
for node in pe_nodes:
petext = node.find_element(By.XPATH, "//tr[3]/td[2]").text //not working
petext = node.find_element(By.XPATH, "//tr[3]/td[2]").get_text() //not working
Can any one please guide me as how to get the required text? Is there a way refer the table column inside every node element?
I have found the way
name = node.find_element_by_xpath(".//tr[2]/td[2]")text
import bs4
soup = bs4.BeautifulSoup(html)
node_pe = [s for s in soup.find_all('g')
if 'pe_node' in s.attrs.get('class', [])]
col_texts = [s.find_all('tr')[2].find_all('td')[1].text
for s in node_pe]
print col_texts
It produces:
[u't2val3', u't3val3']