I have a main page where there are links to 5 other pages with the following xpaths from tr[1] to tr[5].
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[1]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[2]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[3]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[4]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[5]/td[3]/div[1]/a
Inside each of those pages I perform the following actions:
driver.find_element_by_name('key').send_keys('test_1')
driver.find_element_by_name('i18n[en_EN][value]').send_keys('Test 1')
# and at the end this takes me back to the main page again
driver.find_element_by_xpath('/html/body/div[3]/div[2]/div/div[3]/div/ul/li[2]/a').click()
How can I iterate so that the script goes through all 5 pages and performs the actions above? I tried a for loop, but I guess I didn't do it right. Any help would be much appreciated.
You can try this:
xpath = '/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[{}]/td[3]/div[1]/a'
for i in range(1, 6):
    driver.find_element_by_xpath(xpath.format(i)).click()
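The loop above only clicks the links. A fuller sketch that also performs the form actions from the question might look like the following. It assumes the Selenium 3 style `find_element_by_*` methods used in this thread (newer Selenium releases use `find_element(By.XPATH, ...)` instead), and `fill_all_pages` is a hypothetical helper name, not part of Selenium.

```python
# XPath template for the link in each table row; {} is the 1-based row number.
ROW_LINK_XPATH = ('/html/body/div[3]/div[2]/div/div[5]/div/div[2]'
                  '/table/tbody/tr[{}]/td[3]/div[1]/a')
# Link that returns to the main page (taken from the question).
BACK_LINK_XPATH = '/html/body/div[3]/div[2]/div/div[3]/div/ul/li[2]/a'


def fill_all_pages(driver, row_count=5):
    """Open each row's page, fill in the form, and go back to the main page."""
    visited = []
    for row in range(1, row_count + 1):
        xpath = ROW_LINK_XPATH.format(row)
        # Look the link up again on every pass: elements found before a
        # navigation are no longer valid once the main page has reloaded.
        driver.find_element_by_xpath(xpath).click()
        driver.find_element_by_name('key').send_keys('test_1')
        driver.find_element_by_name('i18n[en_EN][value]').send_keys('Test 1')
        driver.find_element_by_xpath(BACK_LINK_XPATH).click()
        visited.append(xpath)
    return visited
```

Calling `fill_all_pages(driver)` with an already opened main page would run the four actions once per row. Depending on how fast the pages load, explicit waits may be needed between the steps.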
It seems I figured it out, so here is the answer that works for me now.
wls = ['/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[1]/td[3]/div[1]/a',
'/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[2]/td[3]/div[1]/a',
'/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[3]/td[3]/div[1]/a',
'/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[4]/td[3]/div[1]/a',
'/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[5]/td[3]/div[1]/a']
for i in wls:
    driver.find_element_by_xpath(i).click()
See below:
template = '/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[{}]/td[3]/div[1]/a'
for x in range(1,6):
    a = template.format(x)
    print(a)
    # 'a' is the XPath string; use it as needed, e.g. driver.find_element_by_xpath(a)
output
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[1]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[2]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[3]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[4]/td[3]/div[1]/a
/html/body/div[3]/div[2]/div/div[5]/div/div[2]/table/tbody/tr[5]/td[3]/div[1]/a
Related
This is a Python project.
I want to make the code below repeat using a while or for loop,
because I have 45 list items and need to get the information from all of them.
I think I have to change 'li.dragons:nth-child(1)' to something like 'li.dragons:nth-child(i)'.
What should I edit or add to make the code repeat?
I would appreciate help from any Python experts.
Feel free to edit my code.
browser.find_element_by_css_selector("ul.list_basis li.dragons:nth-child(1) .dragonchild a.link").click()
browser.switch_to.window(browser.window_handles[1])
items = browser.find_elements_by_css_selector('.classname11')
for item in items:
    name = item.find_element_by_css_selector('.classname12 > .classname13').text
    number = item.find_element_by_css_selector('.classname14').text
    print([name,number])
browser.close()
browser.switch_to.window(browser.window_handles[0])
Use a format string! In other words, wrap this whole code block in a for-loop (for i in range(1, 46): since CSS indexes start at 1, not 0), then change browser.find_element_by_css_selector("ul.list_basis li.dragons:nth-child(1) .dragonchild a.link").click() to browser.find_element_by_css_selector(f"ul.list_basis li.dragons:nth-child({i}) .dragonchild a.link").click()
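Put together, the suggestion could be sketched as follows. This is only an illustration under the question's own assumptions (45 list items, each link opening a second window, Selenium 3 style `find_element_by_*` methods); `dragon_link_selector` and `scrape_all` are hypothetical helper names.

```python
def dragon_link_selector(index):
    """Build the CSS selector for the index-th list item (nth-child is 1-based)."""
    return f"ul.list_basis li.dragons:nth-child({index}) .dragonchild a.link"


def scrape_all(browser, count=45):
    """Click each list item's link and collect [name, number] pairs."""
    rows = []
    for i in range(1, count + 1):
        browser.find_element_by_css_selector(dragon_link_selector(i)).click()
        # The link opens a new window; switch to it before reading.
        browser.switch_to.window(browser.window_handles[1])
        for item in browser.find_elements_by_css_selector('.classname11'):
            name = item.find_element_by_css_selector(
                '.classname12 > .classname13').text
            number = item.find_element_by_css_selector('.classname14').text
            rows.append([name, number])
        # Close the new window and return to the list page.
        browser.close()
        browser.switch_to.window(browser.window_handles[0])
    return rows
```

Returning the rows instead of printing them keeps the collected data available after the loop, for example to write it to a file.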
Just like the title says, how do I write the code in python if I want to replace a part of the URL.
For this example replacing a specific part by 1, 2, 3, 4 and so on for this link (https://test.com/page/1), then doing something on said page and going to the next and repeat.
So, "open url > click on button or whatever > replace link by the new link with the next number in order"
(I know my code is a mess. I am still a newbie, but I am trying to learn, and I am adding whatever I've written so far to follow the posting rules.)
PATH = Service("C:\Program Files (x86)\chromedriver.exe")
driver = webdriver.Chrome(service=PATH)
driver.maximize_window()
get = 1
url = "https://test.com/page/{get}"
while get < 5:
    driver.get(url)
    time.sleep(1)
    driver.find_element_by_xpath("/html/body/div/div/div[2]/form/section[3]/input[4]").click()
    get = get + 1
    driver.get(url)
driver.close()
get = 1
url = f"https://test.com/page/{get}"
while get < 5:
    driver.get(url)
    driver.find_element_by_xpath("/html/body/div/div/div[2]/form/section[3]/input[4]").click()
    print(get)
    print(url)
    get+=1
    url = f"https://test.com/page/{get}"
The key change is simply to update url inside the loop.
Outputs
1
https://test.com/page/1
2
https://test.com/page/2
3
https://test.com/page/3
4
https://test.com/page/4
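The reason the URL has to be rebuilt inside the loop can be shown without a browser: an f-string is evaluated once, at the moment it is assigned, and a plain string without the `f` prefix never substitutes `{get}` at all. The snippet below is a small illustration; `page_urls` is a hypothetical helper name.

```python
get = 1
literal = "https://test.com/page/{get}"   # no f prefix: the braces stay as text
url = f"https://test.com/page/{get}"      # evaluated here, once, with get == 1
get += 1
stale_url = url                           # still ends in /page/1


def page_urls(first, last):
    """Build one URL per page number, from first to last inclusive."""
    return [f"https://test.com/page/{n}" for n in range(first, last + 1)]


urls = page_urls(1, 4)
for page_url in urls:
    print(page_url)
```

Each entry of `urls` can then be passed to `driver.get()` in turn.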
Use the range() function and use String interpolation as follows:
for i in range(1,5):
    print(f"https://test.com/page/{i}")
    driver.get(f"https://test.com/page/{i}")
    driver.find_element_by_xpath("/html/body/div/div/div[2]/form/section[3]/input[4]").click()
Console Output:
https://test.com/page/1
https://test.com/page/2
https://test.com/page/3
https://test.com/page/4
I'm trying to work out how many times to run the page loop, depending on the number of listings (totalListNum).
It seems to be returning NoneType when in fact it should be returning either text or an int.
website:https://stamprally.org/?search_keywords=&search_keywords_operator=and&search_cat1=68&search_cat2=0
Code Below
for prefectureValue in prefectureValueStorage:
    driver.get(
        f"https://stamprally.org/?search_keywords&search_keywords_operator=and&search_cat1={prefectureValue}&search_cat2=0")
    # Calculate How Many Times To Run Page Loop
    totalListNum = driver.find_element_by_css_selector(
        'div.page_navi2.clearfix>p').get_attribute('text')
    totalListNum.text.split("件中")
    if totalListNum[0] % 10 != 0:
        pageLoopCount = math.ceil(totalListNum[0])
    else:
        continue
    currentpage = 0
    while currentpage < pageLoopCount:
        currentpage += 1
        print(currentpage)
I don't think you should use get_attribute here. Instead, try this:
totalListNum = driver.find_element_by_css_selector('div.page_navi2.clearfix>p').text
First, your locator is not unique.
Use this for the first element:
div.page_navi2.clearfix:nth-of-type(1)>p
or for the second element:
div.page_navi2.clearfix:nth-of-type(2)>p
Second, as already mentioned, use .text to get the text.
If .text does not work you can use .get_attribute('innerHTML')
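Once the text is available, the page count still has to be computed from it. The sketch below assumes, as the question's `split("件中")` suggests, that the total number of listings is the part of the text before "件中", and that each page shows 10 listings (taken from the `% 10` check). The sample strings and the name `page_loop_count` are hypothetical.

```python
import math


def page_loop_count(nav_text, per_page=10):
    """Return how many result pages are needed for the total in nav_text."""
    # The total is assumed to be everything before "件中".
    total = int(nav_text.split("件中")[0].strip())
    # Round up so that a partly filled last page is still visited.
    return math.ceil(total / per_page)


print(page_loop_count("57件中"))  # hypothetical sample text
```

Converting the split result with `int()` matters here: `split` returns strings, so applying `%` or `math.ceil` to it directly would fail.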
I'm working on scraping a site that has a dropdown menu of hundreds of schools. I am trying to go through and grab tables only for schools from a certain district in the state. So far I have isolated the values for only those schools, but I've been unable to substitute the values stored in my dataframe/list into the xpath.
Here is my code:
ousd_list = ousd['name'].to_list()
for i in range(0,129):
    n = 0
    driver.find_element_by_xpath(('"//option[@value="',ousd_list[n],']"'))
    driver.find_elements_by_name("submit1").click()
    table = driver.find_elements_by_id("ContentPlaceHolder1_grdDisc")
    tdf = pd.read_html(table)
    tdf.to_csv(index=False)
    n += 1
    driver.get('https://dq.cde.ca.gov/dataquest/Expulsion/ExpSearchName.asp?TheYear=2018-19&cTopic=Expulsion&cLevel=School&cName=&cCounty=&cTimeFrame=S')
I suspect the issue is on the find_element_by_xpath line, but I'm not sure how else I would go about resolving this issue. Any advice?
The main mistake is in your loop logic rather than in the scraping itself: because n = 0 is set at the top of the loop body, n is reset on every iteration, so each pass just looks up ousd_list[0].
Try,
ousd_list = ousd['name'].to_list()
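for ousd_name in ousd_list:
    # select the school's <option>, then submit the form
    driver.find_element_by_xpath(f'//option[@value="{ousd_name}"]').click()
    driver.find_element_by_name("submit1").click()
    # find_element (singular) returns one element; read_html needs its HTML, not the element itself
    table = driver.find_element_by_id("ContentPlaceHolder1_grdDisc")
    tdf = pd.read_html(table.get_attribute('outerHTML'))[0]
    tdf.to_csv(index=False)
    # go back to the search page for the next school
    driver.get('https://dq.cde.ca.gov/dataquest/Expulsion/ExpSearchName.asp?TheYear=2018-19&cTopic=Expulsion&cLevel=School&cName=&cCounty=&cTimeFrame=S')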
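A second point worth checking is the locator itself. In the question, the parenthesised, comma-separated expression passed to `find_element_by_xpath` is a tuple of three strings, not one XPath string. The sketch below shows the difference; the school name and the helper name `option_xpath` are made up for illustration.

```python
school = "Example High"  # hypothetical school name

# Commas inside parentheses build a tuple; nothing is concatenated.
as_tuple = ('"//option[@value="', school, ']"')
print(type(as_tuple))


def option_xpath(value):
    """Build an XPath that matches the <option> whose value attribute is value."""
    return f'//option[@value="{value}"]'


print(option_xpath(school))
```

This simple form assumes the value contains no double quotes. XPath attribute tests are written with `@`, as in `@value`.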
Instead of writing 10+ IF statements, I'm trying to create one IF statement using a variable. Unfortunately, I'm not familiar with how to implement string concatenation for xpath using python. Can anyone teach me how to perform string formatting for the following code segments?
I would greatly appreciate it, thanks.
if page_number == 1:
    next_link = browser.find_element_by_xpath('//*[@title="Go to page 2"]')
    next_link.click()
    page_number = page_number + 1
    time.sleep(30)
elif page_number == 2:
    next_link = browser.find_element_by_xpath('//*[@title="Go to page 3"]')
    next_link.click()
    page_number = page_number + 1
    time.sleep(30)
This answer is not about string concatenation, but about a simpler way to solve the problem.
Instead of clicking particular link on pagination you can click "Next" button:
pages_number = 10
for _ in range(pages_number):
    driver.find_element_by_xpath("//a[@title='Go to next page']").click()
    time.sleep(30)
If you need to open specific page you can use below:
required_page = 3
driver.find_element_by_link_text(str(required_page)).click()  # link text must be a string
P.S. I assumed you are talking about this site
You can use a for-loop
Ex:
for i in range(1, 10):
    next_link = browser.find_element_by_xpath('//*[@title="Go to page {0}"]'.format(str(i+1))) #using str.format for page number
    next_link.click()
    time.sleep(30)
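If the surrounding code already tracks `page_number`, the whole if/elif chain from the question can also be collapsed into one small function. This is a sketch only: `click_next_page` is a hypothetical name, the Selenium 3 style `find_element_by_xpath` from the thread is assumed, and the 30 second pause from the question is left to the caller.

```python
def click_next_page(browser, page_number):
    """Click the link titled 'Go to page N+1' and return the new page number."""
    xpath = f'//*[@title="Go to page {page_number + 1}"]'
    browser.find_element_by_xpath(xpath).click()
    return page_number + 1
```

Typical use would be `page_number = click_next_page(browser, page_number)` wherever the if/elif block used to be.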