This is the link to the page I want to scrape, followed by the relevant HTML:
https://pk.khaadi.com/unstitched/r20206-red-r20206-red-pk.html
<div class="swatch-attribute-options clearfix">
<div class="swatch-option color selected" option-type="1" option-id="61" option-label="RED" option-tooltip-thumb="" option-tooltip-value="#ee0000" "="" style="background: #ee0000 no-repeat center; background-size: initial;">
</div>
<div class="swatch-option color selected" option-type="1" option-id="73" option-label="YELLOW" option-tooltip-thumb="" option-tooltip-value="#feed00" "="" style="background: #feed00 no-repeat center; background-size: initial;">
</div>
</div>
Color = S_Driver.find_elements_by_xpath('//*[@id="product-options-wrapper"]/div/div/div[1]/div')
The XPath points to the outer div that contains both color divs.
n_Color = []  # collects the label of each color swatch
for c in Color:
    n_Color.append(c.get_attribute('option-label'))
print(n_Color)
This is how I tried to extract the colors through the 'option-label' attribute.
Change the XPath to:
//div[@class='swatch-option color']
This is based on the provided screenshot; hopefully nothing else on the page matches it. If something does, change it to:
//div[@class='swatch-option color' and @option-type='1']
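Note that in the HTML pasted in the question both divs carry class="swatch-option color selected", and an exact @class comparison would not match that value. A minimal sketch that matches on individual class names with a CSS selector instead. The function name option_labels is made up for illustration, and "css selector" is simply the string value of By.CSS_SELECTOR, so the function should work with any Selenium driver that offers find_elements(by, value):

```python
def option_labels(driver):
    """Return the option-label of every colour swatch on the current page."""
    # div.swatch-option.color matches a div that has both class names,
    # regardless of extra ones such as "selected".
    swatches = driver.find_elements("css selector", "div.swatch-option.color")
    return [swatch.get_attribute("option-label") for swatch in swatches]
```

With the driver from the question this would be called as print(option_labels(S_Driver)).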
I'm new to Selenium. I'm trying to insert text into the INSERT HTML HERE section. The HTML:
<iframe src="" frameborder="0" class="cke_wysiwyg_frame cke_reset" style="width: 100%; height: 100%;" title="editor, postContent" aria-describedby="cke_93" tabindex="0" allowtransparency="true">
#document
<html dir="rtl" lang="he">
<head>
<title data-cke-title="editor, postContent">editor, postContent</title>
<style data-cke-temp="1">html{cursor:text;*cursor:auto}
img,input,textarea{cursor:default}
</style>
<link type="text/css" rel="stylesheet" href="[SOME_URL]/Blogs/ckeditor/contents.css?t=F969">
<style data-cke-temp="1">.cke_editable{cursor:text}.cke_editable img,.cke_editable input,.cke_editable textarea{cursor:default}
img.cke_flash{background-image: url([SOME_URL]/Blogs/ckeditor/plugins/flash/images/placeholder.png?t=F969);background-position: center center;background-repeat: no-repeat;border: 1px solid #a9a9a9;width: 80px;height: 80px;}
.cke_editable form{border: 1px dotted #FF0000;padding: 2px;}
img.cke_hidden{background-image: url([SOME_URL]/Blogs/ckeditor/plugins/forms/images/hiddenfield.gif?t=F969);background-position: center center;background-repeat: no-repeat;border: 1px solid #a9a9a9;width: 16px !important;height: 16px !important;}
img.cke_iframe{background-image: url([SOME_URL]/Blogs/ckeditor/plugins/iframe/images/placeholder.png?t=F969);background-position: center center;background-repeat: no-repeat;border: 1px solid #a9a9a9;width: 80px;height: 80px;}
.cke_contents_ltr a.cke_anchor,.cke_contents_ltr a.cke_anchor_empty,.cke_editable.cke_contents_ltr a[name],.cke_editable.cke_contents_ltr a[data-cke-saved-name]{background:url([SOME_URL]/Blogs/ckeditor/plugins/link/images/anchor.png?t=F969) no-repeat left center;border:1px dotted #00f;background-size:16px;padding-left:18px;cursor:auto;}.cke_contents_ltr img.cke_anchor{background:url([SOME_URL]/Blogs/ckeditor/plugins/link/images/anchor.png?t=F969) no-repeat left center;border:1px dotted #00f;background-size:16px;width:16px;min-height:15px;height:1.15em;vertical-align:text-bottom;}.cke_contents_rtl a.cke_anchor,.cke_contents_rtl a.cke_anchor_empty,.cke_editable.cke_contents_rtl a[name],.cke_editable.cke_contents_rtl a[data-cke-saved-name]{background:url([SOME_URL]/Blogs/ckeditor/plugins/link/images/anchor.png?t=F969) no-repeat right center;border:1px dotted #00f;background-size:16px;padding-right:18px;cursor:auto;}.cke_contents_rtl img.cke_anchor{background:url([SOME_URL]/Blogs/ckeditor/plugins/link/images/anchor.png?t=F969) no-repeat right center;border:1px dotted #00f;background-size:16px;width:16px;min-height:15px;height:1.15em;vertical-align:text-bottom;}
div.cke_pagebreak{background:url([SOME_URL]/Blogs/ckeditor/plugins/pagebreak/images/pagebreak.gif?t=F969) no-repeat center center !important;clear:both !important;width:100% !important;border-top:#999 1px dotted !important;border-bottom:#999 1px dotted !important;padding:0 !important;height:7px !important;cursor:default !important;}
.cke_show_blocks h6:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks h5:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks h4:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks h3:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks h2:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks h1:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks blockquote:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks address:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks pre:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks div:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks p:not([contenteditable=false]):not(.cke_show_blocks_off){background-repeat:no-repeat;border:1px dotted gray;padding-top:8px}.cke_show_blocks h6:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h6.png?t=F969)}.cke_show_blocks h5:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h5.png?t=F969)}.cke_show_blocks h4:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h4.png?t=F969)}.cke_show_blocks h3:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h3.png?t=F969)}.cke_show_blocks h2:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h2.png?t=F969)}.cke_show_blocks h1:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_h1.png?t=F969)}.cke_show_blocks 
blockquote:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_blockquote.png?t=F969)}.cke_show_blocks address:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_address.png?t=F969)}.cke_show_blocks pre:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_pre.png?t=F969)}.cke_show_blocks div:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_div.png?t=F969)}.cke_show_blocks p:not([contenteditable=false]):not(.cke_show_blocks_off){background-image:url([SOME_URL]/Blogs/ckeditor/plugins/showblocks/images/block_p.png?t=F969)}.cke_show_blocks.cke_contents_ltr h6:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr h5:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr h4:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr h3:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr h2:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr h1:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr blockquote:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr address:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr pre:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr div:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_ltr p:not([contenteditable=false]):not(.cke_show_blocks_off){background-position:top left;padding-left:8px}.cke_show_blocks.cke_contents_rtl 
h6:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl h5:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl h4:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl h3:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl h2:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl h1:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl blockquote:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl address:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl pre:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl div:not([contenteditable=false]):not(.cke_show_blocks_off),.cke_show_blocks.cke_contents_rtl p:not([contenteditable=false]):not(.cke_show_blocks_off){background-position:top right;padding-right:8px}
.cke_show_borders table.cke_show_border,.cke_show_borders table.cke_show_border > tr > td, .cke_show_borders table.cke_show_border > tr > th,.cke_show_borders table.cke_show_border > tbody > tr > td, .cke_show_borders table.cke_show_border > tbody > tr > th,.cke_show_borders table.cke_show_border > thead > tr > td, .cke_show_borders table.cke_show_border > thead > tr > th,.cke_show_borders table.cke_show_border > tfoot > tr > td, .cke_show_borders table.cke_show_border > tfoot > tr > th{border : #d3d3d3 1px dotted}
.cke_upload_uploading img{opacity: 0.3}
.cke_widget_wrapper{position:relative;outline:none}.cke_widget_inline{display:inline-block}.cke_widget_wrapper:hover>.cke_widget_element{outline:2px solid yellow;cursor:default}.cke_widget_wrapper:hover .cke_widget_editable{outline:2px solid yellow}.cke_widget_wrapper.cke_widget_focused>.cke_widget_element,.cke_widget_wrapper .cke_widget_editable.cke_widget_editable_focused{outline:2px solid #ace}.cke_widget_editable{cursor:text}.cke_widget_drag_handler_container{position:absolute;width:15px;height:0;left:-9999px;opacity:0.75;transition:height 0s 0.2s;line-height:0}.cke_widget_wrapper:hover>.cke_widget_drag_handler_container{height:15px;transition:none}.cke_widget_drag_handler_container:hover{opacity:1}img.cke_widget_drag_handler{cursor:move;width:15px;height:15px;display:inline-block}.cke_widget_mask{position:absolute;top:0;left:0;width:100%;height:100%;display:block}.cke_editable.cke_widget_dragging, .cke_editable.cke_widget_dragging *{cursor:move !important}
</style>
</head>
<body contenteditable="true" class="cke_editable cke_editable_themed cke_contents_rtl cke_show_borders" spellcheck="false">INSERT HTML HERE</body>
</html>
</iframe>
I'm struggling to understand how to use Selenium with an iframe. I have already read quite a few previous topics but could not work out from them how to solve my specific problem. It looks like the HTML should be inserted into the body tag inside the iframe. I want the iframe to render the HTML. I tried:
body = driver.find_element_by_class_name("cke_editable")
But got:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".cke_editable"}
How should I do it?
EDIT: I probably should have explained the issue better. The text box is an editor with a Source button.
If I use the text box without clicking the Source button, any HTML I insert is treated as text. But if I switch to Source mode, insert HTML, and then switch back, the HTML is rendered. The code I showed above is from when I'm not in Source mode. @0buz suggested that I do:
iframe=driver.find_element_by_xpath("//iframe[@title='editor, postContent']")
driver.switch_to.frame(iframe)
body=driver.find_element_by_xpath("//body[@contenteditable='true']")
driver.execute_script("arguments[0].innerText = 'INSERT HTML HERE'", body)
And it worked! But it inserts plain text; it does not work for HTML. When I switch to Source mode I get:
<div id="cke_74_contents" class="cke_contents cke_reset" role="presentation" style="height: 350px;">
<textarea dir="ltr" class="cke_source cke_reset cke_enable_context_menu cke_editable cke_editable_themed cke_contents_rtl" style="width: 100%; height: 100%; resize: none; outline: none; text-align: left; tab-size: 4;" tabindex="0" role="textbox" aria-label="editor, postContent" title="editor, postContent" aria-describedby="cke_165"></textarea>
<span id="cke_165" class="cke_voice_label">Press</span>
</div>
But if I enter text while in Source mode, Chrome's element inspector does not show me where it is being changed. Only when I'm back in non-Source mode does it show that the text is in the body. My goal is to insert HTML, not text. Is there a way to achieve this?
Selenium doesn't support multiple class names in find_element_by_class_name('cke_wysiwyg_frame cke_reset'); use a CSS selector instead.
I would suggest using WebDriverWait() with frame_to_be_available_and_switch_to_it() and the first CSS selector below to switch into the iframe,
and then WebDriverWait() with visibility_of_element_located() and the second CSS selector below to locate the body.
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR,"iframe.cke_wysiwyg_frame.cke_reset")))
element=WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.CSS_SELECTOR,'body.cke_editable.cke_editable_themed.cke_contents_rtl.cke_show_borders')))
element.send_keys("test here")
You need the following imports.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
First you would switch to the iframe:
iframe=driver.find_element_by_xpath("//iframe[@title='editor, postContent']")
driver.switch_to.frame(iframe)
You should now be able to interact with the element. One way to update it is:
body=driver.find_element_by_xpath("//body[@contenteditable='true']")
driver.execute_script("arguments[0].innerText = 'insert text here'", body)
Edit:
An example of adding an HTML node with JavaScript:
script = '''
var new_el = document.createElement('div');
var text_value = document.createTextNode("Text for new element.");
new_el.appendChild(text_value);
var body_tag=document.getElementsByClassName('cke_editable cke_editable_themed cke_contents_rtl cke_show_borders');
body_tag[0].appendChild(new_el);
'''
driver.execute_script(script)
Or - injecting html structure as innerHTML (the node structure below is obviously just an example for you to replace; it is added line by line to the variable called 'html'):
script = '''
var html = '<div id="div1">text</div>';
html += 'bla';
html += '<div id="dvi2"><div id="div3"></div></div>';
html += '>>>more html nodes etc<<<';
var body_tag=document.getElementsByClassName('cke_editable cke_editable_themed cke_contents_rtl cke_show_borders');
body_tag[0].innerHTML = html;
'''
driver.execute_script(script)
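The steps above (switch into the iframe, find the body, set its innerHTML) can also be combined into one helper. This is only a sketch: the name set_editor_html is invented here, the locators are the ones from the question, and "xpath" and "css selector" are the string values of By.XPATH and By.CSS_SELECTOR. The markup is handed to the script as an argument rather than pasted into the JavaScript source, which avoids quoting problems:

```python
def set_editor_html(driver, html):
    """Replace the content of the editor body inside the iframe with html."""
    iframe = driver.find_element("xpath", "//iframe[@title='editor, postContent']")
    driver.switch_to.frame(iframe)
    try:
        body = driver.find_element("css selector", "body.cke_editable")
        # The markup is passed as arguments[1], so quotes inside it need no escaping.
        driver.execute_script("arguments[0].innerHTML = arguments[1];", body, html)
    finally:
        # Go back to the top-level page so later lookups are not confined to the iframe.
        driver.switch_to.default_content()
```

A call could look like set_editor_html(driver, '<div id="div1">text</div>'). Whether the editor keeps the injected markup when the form is submitted depends on the editor itself, so that part should be checked on the real page.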
I'm trying to write an automation script for tests. My issue is that I can't select a value from a drop-down menu. I've tried a lot of things but can't make it work. My goal is for the script to choose a different value from the menu each time. When I click the hidden menu, it creates a ul with about 100 li elements. There is no id, name or class to identify them by. I don't know how to reach an element there and click it.
Things that I've tried...
elem = driver.find_element_by_xpath('/html/body/div[3]/div[3]')
all_li = elem.find_elements_by_tag_name("li")
gg = random.choice(all_li)
gg = driver.find_element_by_css_selector("ul > li:nth-child(15)").click()
This is my code that clicks the menu:
driver.find_element_by_xpath("/html/body/div[1]/div/main/div/div[2]/form/div[2]/div[1]/div/div/div").click()
Simplified HTML that is generated when the menu is clicked:
<div class="MuiPaper-root MuiMenu-paper MuiPaper-elevation8 MuiPopover-paper MuiPaper-rounded" role="document" tabindex="-1" style="opacity: 1; transform: none; min-width: 491px; transition: opacity 381ms cubic-bezier(0.4, 0, 0.2, 1) 0ms, transform 254ms cubic-bezier(0.4, 0, 0.2, 1) 0ms; top: 80px; left: 16px; transform-origin: -1px 478.513px;">
<ul class="MuiList-root MuiMenu-list MuiList-padding" role="listbox" tabindex="-1" style="padding-right: 17px; width: calc(100% + 17px);">
<li class="MuiButtonBase-root MuiListItem-root MuiMenuItem-root MuiMenuItem-gutters MuiListItem-gutters MuiListItem-button" tabindex="-1" role="option" aria-disabled="false" variant="outlined" data-value="testOne">Test One<span class="MuiTouchRipple-root"></span></li>
<li class="MuiButtonBase-root MuiListItem-root MuiMenuItem-root MuiMenuItem-gutters MuiListItem-gutters MuiListItem-button" tabindex="-1" role="option" aria-disabled="false" variant="outlined" data-value="testTwo">Test Two<span class="MuiTouchRipple-root"></span></li>
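One possible direction, given only the markup above: the li elements do carry role="option" and a data-value attribute, and the ul carries role="listbox", so they can be located without an id or name. The sketch below assumes the menu is already open; the function name click_random_option and the rng parameter are made up for illustration, and "css selector" is the string value of By.CSS_SELECTOR:

```python
import random


def click_random_option(driver, rng=random):
    """Click one randomly chosen option of the open menu and return its data-value."""
    options = driver.find_elements(
        "css selector", "ul[role='listbox'] li[role='option']"
    )
    if not options:
        raise RuntimeError("no menu options found; is the menu open?")
    choice = rng.choice(options)
    # Read the value before clicking, because the menu may close on click
    # and the element could then no longer be available.
    value = choice.get_attribute("data-value")
    choice.click()
    return value
```

Calling click_random_option(driver) right after the click that opens the menu would pick a different entry on most runs; a wait may be needed in between if the menu animates in.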
I'm trying to upload an image to https://www.alibaba.com/ through Selenium.
So I located the element which allows me to do that:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
driver = webdriver.Chrome(r'C:\Users\migue\Desktop\WorkerBot\Drivers\chromedriver')
driver.maximize_window()
driver.get('https://www.alibaba.com/')
time.sleep(5)
# Open the menu to upload an image; refresh and retry until the icon is clickable
wait = WebDriverWait(driver, 5)
x = True
while x:
    x = False
    try:
        search_camara = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'i.ui-searchbar-imgsearch-icon')))
        search_camara.click()
    except Exception:
        x = True
        driver.refresh()
time.sleep(5)
searcher1 = driver.find_element_by_xpath('//*[@id="J_SC_header"]/header/div[2]/div[2]/div/div/form/div[2]/div[3]/div[1]/div/div')
print(searcher1.get_attribute('innerHTML'))
When I print out the innerHTML of searcher1 I get:
<div class="upload-btn-wrapper"><div class="upload-btn" style="z-index: 1;">Upload Image</div><div id="html5_1cu85jlnu116m14sle1omtch5s3_container" class="moxie-shim moxie-shim-html5" style="position: absolute; top: 14px; left: 183px; width: 109px; height: 28px; overflow: hidden; z-index: 0;"><input id="html5_1cu85jlnu116m14sle1omtch5s3" type="file" style="font-size: 999px; opacity: 0; position: absolute; top: 0px; left: 0px; width: 100%; height: 100%;" multiple="" accept="image/jpeg,image/png,image/bmp"></div>
Max 2MB per Image
That's the element I need in order to upload the image, but when I try any of the following:
Option 1
searcher1.find_element_by_class_name('.moxie-shim moxie-shim-html5')
Option 2
searcher1.find_element_by_class_name('upload-btn')
Option 3
searcher1.find_element_by_xpath('/div')
I get the following (for option 3, for example):
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/div"}
What's the problem? I'm stuck :(
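An XPath that starts with / is evaluated from the document root, even when it is called on an element. For a relative XPath, you need to put a . in front of it. Try: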
searcher1.find_element_by_xpath('./div')
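Since the printed innerHTML contains an input with type="file", the same relative-XPath idea can be taken one step further to reach that input and hand it a file path. A hedged sketch: the helper name upload_image is invented here, "xpath" is the string value of By.XPATH, and whether the site accepts the file this way has to be checked on the real page:

```python
def upload_image(container, image_path):
    """Send a local file path to the file input found inside container."""
    # The leading "." keeps the search inside container instead of
    # starting again from the document root.
    file_input = container.find_element("xpath", ".//input[@type='file']")
    # Sending a path to an <input type="file"> selects that file for upload.
    file_input.send_keys(image_path)
    return file_input
```

With the element from the question this would be called as upload_image(searcher1, r'C:\path\to\image.jpg'), where the path is a placeholder for an existing local file.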
I am trying to use BeautifulSoup 4 to extract text from specific tags in an HTML Document. I have HTML that has a bunch of div tags like the following:
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:90px; width:195px; height:24px;">
<span style="font-family: FIPXQM+Arial-BoldMT; font-size:12px">
Futures Daily Market Report for Financial Gas
<br/>
21-Jul-2015
<br/>
</span>
</div>
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:54px; top:135px; width:46px; height:10px;">
<span style="font-family: FIPXQM+Arial-BoldMT; font-size:10px">
COMMODITY
<br/>
</span>
</div>
I am trying to get the text from all span tags that are in any div tag that has a style of "left:54px".
I can get a single div if I use:
soup = BeautifulSoup(open(extracted_html_file))
print(soup.find_all('div', attrs={"style": "position:absolute; border: textbox 1px solid; "
                                           "writing-mode:lr-tb; left:42px; top:90px; "
                                           "width:195px; height:24px;"}))
It returns:
[<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:42px; top:90px; width:195px; height:24px;"><span style="font-family: FIPXQM+Arial-BoldMT; font-size:12px">Futures Daily Market Report for Financial Gas
<br/>21-Jul-2015
<br/></span></div>]
But that only gets me the one div that exactly matches that styling. I want all divs that match only the "left:54px" style.
To do this, I've tried a few different ways:
soup = BeautifulSoup(open(extracted_html_file))
print(soup.find_all('div', style='left:54px'))
print(soup.find_all('div', attrs={"style": "left:54px"}))
print(soup.find_all('div', attrs={"left": "54px"}))
But all these print statements return empty lists.
Any ideas?
You can pass in a regular expression instead of a string according to the documentation here: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#the-keyword-arguments
So I would try this:
import re
soup = BeautifulSoup(open(extracted_html_file))
soup.find_all('div', style = re.compile('left:54px'))
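The reason this works where the plain string did not: a string filter has to equal the whole attribute value, while a compiled pattern is searched for anywhere inside it. The sketch below shows that difference using only the re module on the two style values from the question. The slightly stricter pattern is a suggestion, not part of the original answer; it requires left:54px to start a declaration, so that something like margin-left:54px would not match:

```python
import re

# The two style attribute values from the question.
styles = [
    "position:absolute; border: textbox 1px solid; writing-mode:lr-tb; "
    "left:42px; top:90px; width:195px; height:24px;",
    "position:absolute; border: textbox 1px solid; writing-mode:lr-tb; "
    "left:54px; top:135px; width:46px; height:10px;",
]

# left:54px must come at the start of the value or right after a semicolon.
pattern = re.compile(r"(?:^|;)\s*left:54px")

# Searching inside the value finds the second style only.
matching = [s for s in styles if pattern.search(s)]

# Comparing the whole value against "left:54px" finds nothing.
exact = [s for s in styles if s == "left:54px"]

print(len(matching), len(exact))
```

The same compiled pattern can be passed as the style argument of find_all in place of re.compile('left:54px').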