Python web scraping of a dynamic bus map

The link (in the code below) contains a map showing the current location of the bus. I want to capture the map every few minutes with Python and save it as an image. I tried the following code, but the output shows only the route, not the map. Moreover, running it repeatedly with Selenium opens a new browser each time. Is there another way to do this? Thanks.
Code I tried:
from PIL import Image
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service('./chromedriver'))
driver.maximize_window() # maximize window
driver.get("https://mobi.mit.edu/default/transit/route?feed=nextbus&direction=loop&agency=mit&route=tech&_tab=map")
element = driver.find_element("xpath", "/html/body/div/div/main/div[2]/div/div[2]/div/div[3]/div/div/div/div/div/div") # this is the map xpath
location = element.location
size = element.size
driver.save_screenshot("canvas.png")
# crop box edges: left, top, right, bottom
left = location['x']
top = location['y']
right = location['x'] + size['width']
bottom = location['y'] + size['height']
im = Image.open('canvas.png')
im = im.crop((int(left), int(top), int(right), int(bottom)))
im.save('canvas_el.png') # cropped map element
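One way to approach both problems, offered as a sketch rather than a tested fix: start a single headless Chrome once, reuse it for every capture, and wait explicitly until the map element is visible before calling the element's own screenshot() method. The CSS selector `div.map`, the `bus_*.png` file names, and the helper names below are hypothetical placeholders, not taken from the page; the real selector has to be read from the page's DOM. A visible container may still be drawing its map tiles, so the short extra pause is also an assumption.

```python
import time
from datetime import datetime

URL = ("https://mobi.mit.edu/default/transit/route"
       "?feed=nextbus&direction=loop&agency=mit&route=tech&_tab=map")
MAP_SELECTOR = "div.map"  # hypothetical selector; inspect the page for the real one


def snapshot_name(when, prefix="bus"):
    """Build a sortable file name from a datetime, e.g. bus_20240101_120500.png."""
    return f"{prefix}_{when:%Y%m%d_%H%M%S}.png"


def make_headless_driver():
    """Start one headless Chrome that is reused for every capture."""
    # Imported here so the helpers in this module can be used without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")           # no visible browser window
    options.add_argument("--window-size=1280,1024")  # fixed viewport size
    return webdriver.Chrome(options=options)


def capture_map(driver, path, timeout=30):
    """Reload the page, wait for the map element, and save only that element."""
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(URL)
    element = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, MAP_SELECTOR))
    )
    time.sleep(2)  # assumption: give the map tiles a moment to finish drawing
    element.screenshot(path)


def run(interval_seconds=300, captures=3):
    """Take `captures` screenshots, `interval_seconds` apart, with one browser."""
    driver = make_headless_driver()
    try:
        for i in range(captures):
            capture_map(driver, snapshot_name(datetime.now()))
            if i < captures - 1:
                time.sleep(interval_seconds)
    finally:
        driver.quit()  # always close the browser, even after an error
```

Calling run() would take three captures five minutes apart. Because the driver is created once and closed in the finally block, no extra browser processes are left behind.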

Related

Python how to take screenshot of div

I'm trying to take a screenshot of the product details of an Amazon item. I found that the div with id aplus contains the product description I'm looking for.
So I wrote code using Python and Selenium to take a full screenshot of that div.
However, the result is cropped and shows only the top part of the div.
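import time
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument('--headless') # headless mode; only applies if options is passed to Chrome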
driver = webdriver.Chrome()
URL = "https://www.amazon.co.jp/-/en/Figuarts-Dragon-Saiyan-Approx-Painted/dp/B08S7KVHMP/ref=sr_1_1?crid=3O3TF6V9FJHS5&currency=JPY&keywords=b08s7kvhmp&qid=1668143838&qu=eyJxc2MiOiIwLjAwIiwicXNhIjoiMC4wMCIsInFzcCI6IjAuMDAifQ%3D%3D&sprefix=%2Caps%2C140&sr=8-1"
driver.get(URL)
time.sleep(5)
S = lambda X: driver.execute_script('return document.body.parentNode.scroll' +X)
time.sleep(1)
driver.set_window_size(S('Width'), S('Height'))
image = driver.find_element('id','aplus')
image.screenshot('yes.png')
If I pass options=options to webdriver.Chrome(), then depending on the product it captures the full div, but the screenshot does not contain any images.
I have no idea how to take a full screenshot of the div :S
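One possible cause of the missing images, stated as an assumption rather than a diagnosis: the page may load images lazily, only as they scroll into view, so a headless browser that never scrolls never requests them. A minimal sketch that scrolls through the page in steps before the screenshot is taken; `scroll_through` and its parameters are hypothetical names, and `driver` is any Selenium WebDriver.

```python
import time


def scroll_through(driver, step=600, pause=0.3, max_steps=200):
    """Scroll down the page in increments so lazy-loaded images get requested.

    Returns the number of scroll steps performed.
    """
    steps = 0
    position = 0
    while steps < max_steps:
        # Re-read the height on every pass: it can grow as content loads.
        height = driver.execute_script(
            "return document.documentElement.scrollHeight"
        )
        if position >= height:
            break
        driver.execute_script("window.scrollTo(0, arguments[0]);", position)
        time.sleep(pause)  # give the browser time to start loading images
        position += step
        steps += 1
    driver.execute_script("window.scrollTo(0, 0);")  # return to the top
    return steps
```

It would be called as scroll_through(driver) after driver.get(URL) and before the window is resized and the div is captured.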
This example requires the PIL library, which you can install with:
pip install Pillow
from selenium import webdriver
from PIL import Image
from io import BytesIO
options = webdriver.ChromeOptions()
options.add_argument('--headless') # run Chrome without a visible window
driver = webdriver.Chrome(options=options)
URL = "https://www.amazon.co.jp/-/en/Figuarts-Dragon-Saiyan-Approx-Painted/dp/B08S7KVHMP/ref=sr_1_1?crid=3O3TF6V9FJHS5&currency=JPY&keywords=b08s7kvhmp&qid=1668143838&qu=eyJxc2MiOiIwLjAwIiwicXNhIjoiMC4wMCIsInFzcCI6IjAuMDAifQ%3D%3D&sprefix=%2Caps%2C140&sr=8-1"
driver.get(URL)
# now that we have the preliminary stuff out of the way time to get that image :D
element = driver.find_element('id', 'aplus') # find part of the page you want image of
location = element.location
size = element.size
png = driver.get_screenshot_as_png() # screenshot of the visible window, as PNG bytes
driver.quit()
im = Image.open(BytesIO(png)) # uses PIL library to open image in memory
left = location['x']
top = location['y']
right = location['x'] + size['width']
bottom = location['y'] + size['height']
im = im.crop((left, top, right, bottom)) # defines crop points
im.save('screenshot.png') # saves new cropped image
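The crop above assumes that one CSS pixel equals one screenshot pixel and that the page is scrolled to the top. On a HiDPI display, or after scrolling, the box can be off. A small sketch of a helper that accounts for both; `crop_box`, `scale`, `scroll_x`, and `scroll_y` are hypothetical names, and the scale would typically be read with driver.execute_script('return window.devicePixelRatio').

```python
def crop_box(location, size, scale=1.0, scroll_x=0, scroll_y=0):
    """Convert an element's page position and size into a PIL crop box.

    location, size: dicts as returned by element.location and element.size
    scale:          screenshot pixels per CSS pixel (window.devicePixelRatio)
    scroll_x/y:     current scroll offset of the page, in CSS pixels
    """
    left = (location["x"] - scroll_x) * scale
    top = (location["y"] - scroll_y) * scale
    right = left + size["width"] * scale
    bottom = top + size["height"] * scale
    # PIL expects (left, top, right, bottom); round to whole pixels.
    return tuple(round(v) for v in (left, top, right, bottom))
```

With this helper the crop becomes im.crop(crop_box(location, size)), with scale=2.0 passed on a display that reports a device pixel ratio of 2.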

How to get screen coordinate of a element in selenium? [Python]

I want to know how to get the coordinates of an element relative to the screen rather than to the browser window. I have already tried the code below, but it returns coordinates relative to the browser window, not the screen.
element = driver.find_element("xpath", "//*[@id='search_form_input_homepage']")
print(element.location)
Any alternatives that I can use?
Note: driver.execute_script is not allowed, as the website has a bot blocker :(
You can use .size and .location to get an element's size and position.
Try this:
from selenium import webdriver
url = "some url"
driver = webdriver.Chrome() # named driver so the webdriver module is not shadowed
driver.get(url)
driver.fullscreen_window()
cookies = driver.find_element("xpath", "some xpath")
location = cookies.location
size = cookies.size
w, h = size['width'], size['height']
print(location)
print(size)
print(w, h)
print(cookies.location_once_scrolled_into_view) # position after scrolling the element into view
See if this helps. More attributes and methods, such as size and rect, are documented at:
https://www.selenium.dev/selenium/docs/api/py/webdriver_remote/selenium.webdriver.remote.webelement.html#module-selenium.webdriver.remote.webelement
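Neither .location nor .size is measured from the screen, so the window's own position has to be added. A rough sketch of that arithmetic follows; `to_screen_point` and `chrome_offset` are hypothetical names. The offset stands for the window border and toolbar height, which are not available from the calls used here, so it is an assumption that has to be measured once for a given browser and theme. Display scaling is ignored.

```python
def to_screen_point(window_position, viewport_location, chrome_offset=(0, 0)):
    """Translate a viewport-relative point into screen coordinates.

    window_position:   dict from driver.get_window_position(), e.g. {'x': 10, 'y': 10}
    viewport_location: dict from element.location_once_scrolled_into_view,
                       assumed here to be relative to the visible page area
    chrome_offset:     (left border, toolbar height) in pixels, measured by hand
    """
    x = window_position["x"] + chrome_offset[0] + viewport_location["x"]
    y = window_position["y"] + chrome_offset[1] + viewport_location["y"]
    return x, y
```

For example, to_screen_point(driver.get_window_position(), cookies.location_once_scrolled_into_view, chrome_offset=(0, 85)) would assume a toolbar 85 pixels tall; that number is only an illustration.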

What is Google Image's id or class that corresponds to the box that lets you drag and drop images?

I am using Python Selenium to make a sort of Python console version of Google Images. I already have the part where it opens the page and clicks the camera icon. Unfortunately, I don't know the id or class of the box that lets you drag in images: when I use an id from what appears to be the box, I get "Element not interactable".
The code so far:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time
PATH = r"C:\Program Files (x86)\chromedriver.exe" # raw string keeps the backslashes literal
driver = webdriver.Chrome(service=Service(PATH))
driver.get("https://images.google.com")
print("Googlen't Images")
image_query = input("Enter path where image is: ")
cameraicon = driver.find_element("class name", "BwoPOe")
cameraicon.click()
time.sleep(2)
box = driver.find_element("id", "dRSWfb") #this is the one that gives "element not interactable" error
box.send_keys(image_query)
Can anyone help?
First: the error is raised on the line with send_keys(), not the one with find_..., so the comment in your code is misleading.
The problem is that "dRSWfb" is a <div>, and you can't send keys to a <div>. Inside this <div> is an <input>, which is the element you should find and send keys to.
This <input> has the id Ycyxxc:
box = driver.find_element("id", "Ycyxxc")
box.send_keys(image_query)
I don't know how to drag'n'drop in Selenium (if it is even possible), but DevTools in Firefox shows the events dragover and drop for the <div> with id QDMvGf.
EDIT: to send a local file, instead of drag'n'drop you can use the Browse button on the second tab, which you can access using the id awyMjb:
box = driver.find_element("id", "awyMjb")
box.send_keys(image_query)
Minimal working code
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time
print("Googlen't Images")
image_query = input("Enter path where image is: ")
# raw string keeps the backslashes in the Windows path literal
driver = webdriver.Chrome(service=Service(r"C:\Program Files (x86)\chromedriver.exe"))
#driver = webdriver.Firefox(executable_path='/home/furas/bin/geckodriver')
driver.get("https://images.google.com")
cameraicon = driver.find_element("class name", "BwoPOe")
cameraicon.click()
time.sleep(1)
# send word or url on first tab
#box = driver.find_element("id", "Ycyxxc")
#box.send_keys(image_query)
# send local file on second tab
box = driver.find_element("id", "awyMjb")
box.send_keys(image_query)
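When send_keys is used on a file input, the text is treated as a file path, so a relative path or a pair of quotes pasted into input() can make the upload fail. A small sketch, with the hypothetical helper name `resolve_upload_path`, that normalizes what the user typed before it is sent:

```python
from pathlib import Path


def resolve_upload_path(raw):
    """Turn user input into an absolute path to an existing file.

    Raises FileNotFoundError if the path does not point to a file.
    """
    cleaned = raw.strip().strip("\"'")  # drop whitespace and pasted quotes
    path = Path(cleaned).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    return str(path)
```

The last line of the script would then read box.send_keys(resolve_upload_path(image_query)), so a bad path fails early with a clear error instead of inside the browser.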

Automatically substituting logos in web images

Consider the following task:
1. Open a given URL
2. Find the first image tag in the page
3. Replace it with an image from your local drive
4. Save the resulting webpage as a PNG
I want to automate this task with a Python script, and I am unsure of the best approach.
I have been using Selenium to convert URLs into screenshots, but I am unsure how to add the step that makes the first image tag load a local file.
You can use execute_script to replace the image. It should look something like this:
from selenium import webdriver
browser = webdriver.Firefox()
url = 'https://www.aircanada.com/en/'
browser.get(url)
my_image = browser.find_element("xpath", '//*[@id="pagePromoBanner-wrapper"]/div/a/img')
# or
# my_image = browser.find_element("xpath", 'any XPath')
link_to_new_image = "https://images.pexels.com/photos/67636/rose-blue-flower-rose-blooms-67636.jpeg?auto=compress&cs=tinysrgb&dpr=2&h=750&w=1260"
# if you are using python 3.6 and up:
browser.execute_script(f"arguments[0].src = '{link_to_new_image}'", my_image )
# else:
# browser.execute_script("arguments[0].src = '"+link_to_new_image+"'", my_image )
Hope this helps you!
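The example above points the image at a remote URL. For a file on the local drive, one option is to embed the file as a data: URI, since a page served over HTTPS will generally not be allowed to load a file:// path. A minimal sketch; `to_data_uri` is a hypothetical helper name.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(path):
    """Encode a local image file as a data: URI usable as an <img> src."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None:
        mime = "application/octet-stream"  # fallback when the type is unknown
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The result can be passed to the script as a second argument instead of being formatted into the JavaScript string, for example browser.execute_script("arguments[0].src = arguments[1]", my_image, to_data_uri("logo.png")), where logo.png is a placeholder file name.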

Taking screenshots from part of the web page with python and selenium

I have been able to capture screenshots of some elements as PNGs, such as the one produced by the following code:
from selenium import webdriver
from PIL import Image
from io import BytesIO
from os.path import expanduser
from time import sleep
# Define url and driver
url = 'https://www.formula1.com/'
driver = webdriver.Chrome() # uses the chromedriver found on PATH
# Go to url, scroll down to right point on page and find correct element
driver.get(url)
driver.execute_script('window.scrollTo(0, 4100)')
sleep(4) # Wait a little for page to load
element = driver.find_element('class name', 'race-list')
location = element.location
size = element.size
png = driver.get_screenshot_as_png()
driver.quit()
# Store image as bytes, crop it and save to desktop
im = Image.open(BytesIO(png))
im = im.crop((200, 150, 700, 725))
path = expanduser('~/Desktop/')
im.save(path + 'F1-info.png')
This produces the result I want, but not the way I want it. I had to hard-code the scroll position, and because I couldn't find the element I wanted (class='race step-1 step-2 step-3'), I had to crop the image manually too.
Any better solutions?
In case someone is wondering, this is how I managed it in the end. First I found the element and scrolled to the right part of the page like this:
element = driver.find_element('css selector', '.race.step-1.step-2.step-3')
driver.execute_script('arguments[0].scrollIntoView()', element)
driver.execute_script('window.scrollBy(0, -80)')
and then cropped the image:
im = im.crop((200, 80, 700, 560))
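An alternative that avoids hard-coded crop values, sketched under the assumption that the element can be found by a CSS selector: Selenium's WebElement has a screenshot() method that saves only that element. The helper name `screenshot_element` is hypothetical, and centering the element first is a guess at keeping it clear of any fixed page header.

```python
def screenshot_element(driver, css_selector, path):
    """Find one element, center it in the viewport, and save it as a PNG.

    Returns whatever element.screenshot() returns (True on success).
    """
    element = driver.find_element("css selector", css_selector)
    # Center the element so a fixed header at the top is less likely to cover it.
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center'});", element
    )
    return element.screenshot(path)
```

For the page above this would be called as screenshot_element(driver, '.race.step-1.step-2.step-3', 'F1-info.png') before driver.quit().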
