I am working with Selenium and I want to get the image URLs. The problem is that Selenium works for up to 21 images; after that, it returns base64 placeholder URLs like below.
1 : https://photo.venus.com/im/19230307.jpg?preset=dept
2 : https://photo.venus.com/im/18097354.jpg?preset=dept
3 : https://photo.venus.com/im/19230311.jpg?preset=dept
4 : https://photo.venus.com/im/19234200.jpg?preset=dept
5 : https://photo.venus.com/im/17307902.jpg?preset=dept
6 : https://photo.venus.com/im/19305650.jpg?preset=dept
7 : https://photo.venus.com/im/19060456.jpg?preset=dept
8 : https://photo.venus.com/im/18295767.jpg?preset=dept
9 : https://photo.venus.com/im/19102600.jpg?preset=dept
10 : https://photo.venus.com/im/19230297.jpg?preset=dept
11 : https://photo.venus.com/im/16181113.jpg?preset=dept
12 : https://photo.venus.com/im/19101047.jpg?preset=dept
13 : https://photo.venus.com/im/19150290.jpg?preset=dept
14 : https://photo.venus.com/im/19042244.jpg?preset=dept
15 : https://photo.venus.com/im/19230329.jpg?preset=dept
16 : https://photo.venus.com/im/19101040.jpg?preset=dept
17 : https://photo.venus.com/im/17000870.jpg?preset=dept
18 : https://photo.venus.com/im/19100952.jpg?preset=dept
19 : https://photo.venus.com/im/19183658.jpg?preset=dept
20 : https://photo.venus.com/im/19102243.jpg?preset=dept
21 : https://photo.venus.com/im/18176590.jpg?preset=dept
22 : data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB/AAffA0nNPuCLAAAAAElFTkSuQmCC
23 : data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB/AAffA0nNPuCLAAAAAElFTkSuQmCC
24 : data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB/AAffA0nNPuCLAAAAAElFTkSuQmCC
25 : data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB/AAffA0nNPuCLAAAAAElFTkSuQmCC
26 : ...
I even used time.sleep, but it has not worked. Any ideas would be appreciated.
Here is my code:
import os
import time
import urllib.request

from selenium import webdriver

directory = 'images'          # where the downloaded images go
driver = webdriver.Chrome()

url = 'https://www.venus.com/products.aspx?BRANCH=7~63~'
driver.get(url)

image_id = 1
product_container_ls = driver.find_elements_by_class_name('product-container')
for prd in product_container_ls:
    # Finding the image element by class name
    image_lm = prd.find_element_by_class_name('main')
    # The URL of the image
    image_url = image_lm.get_attribute('src')
    print(image_id, ': ', image_url)
    # Image path
    image_path = os.path.join(directory, f'{image_id}.jpg')
    # Getting and saving the image
    urllib.request.urlretrieve(image_url, image_path)
    image_id += 1
    time.sleep(3)
driver.quit()
Thanks!
Look for the data-original attribute rather than src, since that is how they lazy-load the images. I modified the following variable and got all the images:
image_url = image_lm.get_attribute('data-original')
Here is a sample of my printout for that variable:
https://photo.venus.com/im/18235739.jpg?preset=dept
https://photo.venus.com/im/19034244.jpg?preset=dept
https://photo.venus.com/im/17199949.jpg?preset=dept
https://photo.venus.com/im/19121197.jpg?preset=dept
https://photo.venus.com/im/18235918.jpg?preset=dept
https://photo.venus.com/im/18366410.jpg?preset=dept
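The selection logic can be sketched as a small pure function, no browser needed to follow it: prefer data-original and only accept src when it is a real HTTP(S) URL rather than a base64 placeholder. The function name is my own, not from the question.

```python
def real_image_url(data_original, src):
    """Pick the usable image URL from the two attributes.

    Lazy-loaded products carry the real URL in data-original, while
    src holds a tiny base64 placeholder until the image scrolls into
    view, so a data: URI in src is rejected.
    """
    if data_original:
        return data_original
    if src and not src.startswith('data:'):
        return src
    return None

# With Selenium this would be called per element, e.g.:
# real_image_url(img.get_attribute('data-original'), img.get_attribute('src'))
```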
This is my first question on this site, and my English is not very good, so if there is any misunderstanding, please tell me. Thank you.
I am trying to use gRPC to send a message, which is defined as:
message PrepareMsg {
    message Data {
        string node_id = 1;
        string vote = 2;
    }
    Data data = 1;
    repeated string signature = 2;
}
My client code looks like this:
def pre_prepare(self, block):
    request = grpc_pb2.PrePrepareMsg()
    request.data.node_id = self.node_id
    request.data.block.CopyFrom(block)
    a = request.data.SerializeToString()
    temp_sign = signing(self.signer, a)
    for i in temp_sign:
        request.signature.append(hex(i))
    self_node = set()
    self_node.add(p2p.SELF_IP_PORT)
    print(self_node)
    nodes = set(p2p.Node.get_nodes_list()) - self_node
    print("print nodes in broadcast:")
    print(nodes)
    for i in nodes:
        channel = grpc.insecure_channel(i)
        stub = grpc_pb2_grpc.ConsensusStub(channel)
        try:
            response = stub.PrePrepare(request)
            print(response.Result)
        except Exception as e:
            print("get except: %s" % str(e))
I get this error: "Exception calling application: the JSON object must be str, bytes or bytearray, not RepeatedScalarContainer".
I don't know why it happens.
Using Python, I want to separate some data in a file.
The file is a plain text file with no tabs, just a single space between fields.
Here is an example file:
//test.txt
Class name age room fund.
13 A 25 B101 300
12 B 21 B102 200
9 C 22 B103 200
13 D 25 B102 100
20 E 23 B105 100
13 F 25 B103 300
11 G 25 B104 100
13 H 22 B101 300
I want to take only the lines containing specific data (class: 13, fund: 300) and save them to another text file.
If the code worked, the resulting text file would be:
//new_test.txt
Class name age room fund.
13 A 25 B101 300
13 F 25 B103 300
13 H 22 B101 300
Thanks.
This should do.
with open('new_test.txt', 'w') as new_file:
    with open('test.txt') as file:
        print(file.readline(), end='', file=new_file)
        for line in file:
            arr = line.strip().split()
            if arr[0] == '13' and arr[-1] == '300':
                print(line, end='', file=new_file)
However, you should include your code when asking a question. It ensures that the purpose of this site is served.
If you want to filter your data:
def filter_data(src_file, dest_file, filters):
    data = []
    with open(src_file) as read_file:
        header = [h.lower().strip('.') for h in read_file.readline().split()]
        for line in read_file:
            values = line.split()
            row = dict(zip(header, values))
            data.append(row)
            for k, v in filters.items():
                if data and row.get(k, None) != v:
                    data.pop()
                    break
    with open(dest_file, 'w') as write_file:
        write_file.write(' '.join(header) + '\n')
        for row in data:
            write_file.write(' '.join(row.values()) + '\n')

my_filters = {
    "class": "13",
    "fund": "300"
}

filter_data(src_file='test.txt', dest_file='new_test.txt', filters=my_filters)
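As a quick sanity check, the same keep-class-13-fund-300 filtering can be exercised on the sample data from the question without touching the filesystem (a sketch; the variable names are mine):

```python
sample = """\
Class name age room fund.
13 A 25 B101 300
12 B 21 B102 200
9 C 22 B103 200
13 D 25 B102 100
20 E 23 B105 100
13 F 25 B103 300
11 G 25 B104 100
13 H 22 B101 300
"""

lines = sample.splitlines()
kept = [lines[0]]  # always keep the header line
for line in lines[1:]:
    fields = line.split()
    # first field is the class, last field is the fund
    if fields[0] == '13' and fields[-1] == '300':
        kept.append(line)

print('\n'.join(kept))  # rows A, F and H survive
```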
I wrote a little bit of code that reads a number from a file, assigns it to a variable, then increments the number so that the next time it runs the file will contain number + 1. It looks like it's working, except it seems to increment twice. For example, here is my code:
11 def mcIPNumber():
12     with open('mcIPlatest.txt', 'r+') as file:
13         NameNumber = file.read().replace('\n', '')
14         NameNumber = int(NameNumber)
15         NewNumber = NameNumber + 1
16         print "newnumber = %s" % NewNumber
17         file.seek(0)
18         file.write(str(NewNumber))
19         file.truncate()
20         return NameNumber
21
22 def makeNameMCTag():
23     NameNumber = mcIPNumber()
24     NameTag = "varName" + str(NameNumber)
25     print "Name Tag: %s" % NameTag
26     mcGroup = "varTagmc"
27     #IPNumber = 1
28     mcIP = "172.16.0.%s" % NameNumber
29     print ("Multicast Tag: %s, %s" % (mcGroup, mcIP))
30
31
32 mcIPNumber()
33 makeNameMCTag()
But here is my output. Notice that "newnumber" gets printed out twice for some reason:
newnumber = 2
newnumber = 3
Name Tag: varName2
Multicast Tag: varTagmc, 172.16.0.2
So it correctly made my varName2 and my IP 172.16.0.2 (incremented the initial number in the file by 1), but this means the second time I run it I get this:
newnumber = 4
newnumber = 5
Name Tag: varName
Multicast Tag: varTagmc, 172.16.0.4
My expected result is this:
newnumber = 3
Name Tag: varName3
Multicast Tag: varTagmc, 172.16.0.3
Any idea why it's incrementing twice?
Thanks!
(By the way, if you're curious, I'm trying to write code that will eventually write the .tf file for my Terraform lab.)
Because of this:
def makeNameMCTag():
    NameNumber = mcIPNumber()
You are calling mcIPNumber from inside makeNameMCTag, so you don't need to call that method explicitly on line 32.
Alternatively:
def make_name_mc_tag(name_number):
    NameTag = "varName" + str(name_number)
    print "Name Tag: %s" % NameTag
    ...

make_name_mc_tag(mcIPNumber())
Here you are passing the required data as a parameter.
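For completeness, the read-increment-write step on its own could look like this in Python 3 (a sketch, not the asker's exact code; the filename default is taken from the question):

```python
def mc_ip_number(path='mcIPlatest.txt'):
    """Read the counter from the file, write back counter + 1,
    and return the value that was read (the pre-increment value)."""
    with open(path, 'r+') as f:
        current = int(f.read().strip())
        f.seek(0)
        f.write(str(current + 1))
        f.truncate()
    return current
```

Called exactly once per run, the number in the file advances by exactly one; calling it from two places (as on lines 32 and 23 above) advances it twice.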
I have data that looks like this:
INFO : Reading PDB list file 'model3.list'
INFO : Successfully read 10 / 10 PDBs from list file 'model3.list'
INFO : Successfully read 10 Chain structures
INFO : Processed 40 of 45 MAXSUBs
INFO : CPU time = 0.02 seconds
INFO : ======================================
INFO : 3D-Jury (Threshold: > 10 pairs # > 0.200)
INFO : ======================================
INFO : Rank Model Pairs File
INFO : 1 : 1 151 pdbs2/model.165.pdb
INFO : 2 : 7 145 pdbs2/model.150.pdb
INFO : 3 : 6 144 pdbs2/model.144.pdb
INFO : 4 : 9 142 pdbs2/model.125.pdb
INFO : 5 : 4 137 pdbs2/model.179.pdb
INFO : 6 : 8 137 pdbs2/model.191.pdb
INFO : 7 : 10 137 pdbs2/model.147.pdb
INFO : 8 : 3 135 pdbs2/model.119.pdb
INFO : 9 : 5 131 pdbs2/model.118.pdb
INFO : 10 : 2 129 pdbs2/model.128.pdb
INFO : ======================================
INFO : Pairwise single linkage clustering
INFO : ======================================
INFO : Hierarchical Tree
INFO : ======================================
INFO : Node Item 1 Item 2 Distance
INFO : 0 : 6 1 0.476 pdbs2/model.144.pdb pdbs2/model.165.pdb
INFO : -1 : 7 4 0.484 pdbs2/model.150.pdb pdbs2/model.179.pdb
INFO : -2 : 9 2 0.576 pdbs2/model.125.pdb pdbs2/model.128.pdb
INFO : -3 : -2 0 0.598
INFO : -4 : 10 -3 0.615 pdbs2/model.147.pdb
INFO : -5 : -1 -4 0.618
INFO : -6 : 8 -5 0.620 pdbs2/model.191.pdb
INFO : -7 : 3 -6 0.626 pdbs2/model.119.pdb
INFO : -8 : 5 -7 0.629 pdbs2/model.118.pdb
INFO : ======================================
INFO : 1 Clusters # Threshold 0.800 (0.8)
INFO : ======================================
INFO : Item Cluster
INFO : 1 : 1 pdbs2/model.165.pdb
INFO : 2 : 1 pdbs2/model.128.pdb
INFO : 3 : 1 pdbs2/model.119.pdb
INFO : 4 : 1 pdbs2/model.179.pdb
INFO : 5 : 1 pdbs2/model.118.pdb
INFO : 6 : 1 pdbs2/model.144.pdb
INFO : 7 : 1 pdbs2/model.150.pdb
INFO : 8 : 2 pdbs2/model.191.pdb
INFO : 9 : 2 pdbs2/model.125.pdb
INFO : 10 : 2 pdbs2/model.147.pdb
INFO : ======================================
INFO : Centroids
INFO : ======================================
INFO : Cluster Centroid Size Spread
INFO : 1 : 1 10 0.566 pdbs2/model.165.pdb
INFO : 2 : 10 3 0.777 pdbs2/model.147.pdb
INFO : ======================================
It is one chunk among many more in the data.
Each chunk starts with the line
INFO : Reading PDB list file 'model3.list'
What I want to do is extract this part of each chunk:
INFO : ======================================
INFO : Cluster Centroid Size Spread
INFO : 1 : 1 10 0.566 pdbs2/model.165.pdb
INFO : 2 : 10 3 0.777 pdbs2/model.147.pdb
INFO : ======================================
At the end of the day I want a dictionary that looks like this:
{1:"10 pdbs2/model.165.pdb",
2:"3 pdbs2/model.147.pdb"}
That is, with the cluster number as the key and the cluster size plus the model file name as the value.
What's the way to achieve that in Python?
I'm stuck with this code:
import csv
import json
import os
import argparse
import re

def main():
    """docstring for main"""
    file = "max_cluster_output.txt"
    with open(file, 'r') as tsvfile:
        tabreader = csv.reader(tsvfile, delimiter=' ')
        for line in tabreader:
            linelen = len(line)
            if "Centroids" in line:
                print line
            #if linelen >= 32 and linelen <= 34:
            #    print linelen, line

if __name__ == '__main__':
    main()
I would do this using regexes. I would have an outer loop that:
- reads lines until it finds "INFO : Reading PDB list file"
- reads lines until it finds "INFO : Cluster Centroid Size Spread"
and an inner loop that:
- creates dictionary entries from each subsequent line, until the line no longer matches
INFO : <number> : <number> <number> <number> <string>
It would look something like this (not tested):
import re

FILENAME = "foo.txt"
info = {}
with open(FILENAME) as f:
    while True:
        # skip ahead to the start of the next chunk
        for line in f:
            if re.match(r"^INFO\s+:\s+Reading PDB list file", line):
                break
        else:
            break  # end of file: no more chunks
        # skip ahead to the Centroids table header
        for line in f:
            if re.match(r"^INFO\s+:\s+Cluster\s+Centroid\s+Size\s+Spread", line):
                break
        # We're up to the data
        for line in f:
            # look for INFO : Cluster-number : Centroid-number Size-number Spread-number File-string
            match = re.match(r"^INFO\s+:\s+(?P<Cluster>\d+)\s+:\s+\d+\s+(?P<Size>\d+)\s+\S+\s+(?P<FileName>\S+)$", line)
            if match:
                info[match.group("Cluster")] = "%s %s" % (match.group("Size"), match.group("FileName"))
            else:
                break
print "done"
This code is here just to show the kinds of things to use (looping, the file iterator, breaking, regexes); it's by no means necessarily the most elegant way, and it is not tested.
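The centroid-line regex can be checked in isolation against one of the sample lines from the question (the exact pattern below is my own reconstruction of the idea sketched above, not tested code from the answer):

```python
import re

# Cluster : Centroid Size Spread File, as in the "Centroids" table
CENTROID_RE = re.compile(
    r"^INFO\s+:\s+(?P<Cluster>\d+)\s+:\s+\d+\s+(?P<Size>\d+)\s+\S+\s+(?P<FileName>\S+)$"
)

line = "INFO  :     1 :     1    10  0.566  pdbs2/model.165.pdb"
m = CENTROID_RE.match(line)
if m:
    # build the {cluster: "size filename"} entry the question asks for
    entry = {int(m.group("Cluster")): "%s %s" % (m.group("Size"), m.group("FileName"))}
    print(entry)  # {1: '10 pdbs2/model.165.pdb'}
```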
I am new to Python and I'm learning rapidly, but this is beyond my current level of understanding. I'm trying to pull the output of the Linux command apcaccess into a list in Python.
apcaccess is a Linux command that reports the status of an APC UPS. The output is this:
$ apcaccess
APC : 001,035,0933
DATE : 2014-11-12 13:38:27 -0500
HOSTNAME : doormon
VERSION : 3.14.10 (13 September 2011) debian
UPSNAME : UPS
CABLE : USB Cable
DRIVER : USB UPS Driver
UPSMODE : Stand Alone
STARTTIME: 2014-11-12 12:28:00 -0500
MODEL : Back-UPS ES 550G
STATUS : ONLINE
LINEV : 118.0 Volts
LOADPCT : 15.0 Percent Load Capacity
BCHARGE : 100.0 Percent
TIMELEFT : 46.0 Minutes
MBATTCHG : 5 Percent
MINTIMEL : 3 Minutes
MAXTIME : 0 Seconds
SENSE : Medium
LOTRANS : 092.0 Volts
HITRANS : 139.0 Volts
ALARMDEL : 30 seconds
BATTV : 13.6 Volts
LASTXFER : No transfers since turnon
NUMXFERS : 2
XONBATT : 2014-11-12 12:33:35 -0500
TONBATT : 0 seconds
CUMONBATT: 53 seconds
XOFFBATT : 2014-11-12 12:33:43 -0500
STATFLAG : 0x07000008 Status Flag
SERIALNO : 4B1335P17084
BATTDATE : 2013-08-28
NOMINV : 120 Volts
NOMBATTV : 12.0 Volts
FIRMWARE : 904.W1 .D USB FW:W1
END APC : 2014-11-12 13:38:53 -0500
I've tried different iterations of Popen, such as:
def check_apc_ups():
    output = subprocess.Popen("apcaccess", stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
    x1, x2, x3, x4, x5 = output
I would like to be able to pull each line into a list or tuple containing all 32 entries, and then only display/print what I need, such as TIMELEFT and BCHARGE.
Any help would be greatly appreciated.
There are already answers showing how to get the output of the command into Python.
It is not clear what you are going to do with the output; maybe a dictionary (dict) is better for you than a list:
# stolen from Hackaholic's answer
import subprocess

child = subprocess.Popen('apcaccess', stdout=subprocess.PIPE)
msg, err = child.communicate()

# now create the dict:
myDict = {}
#for i in msg.split("\n"):  # loop over lines
for i in msg.splitlines():  # EDIT: See comments
    # split on the first colon only, so timestamps keep theirs
    splitted = i.split(":", 1)  # list like ["HOSTNAME ", " doormon"]
    # remove leading & trailing spaces, add to dict
    myDict[splitted[0].strip()] = splitted[1].strip()

# Now, you can easily access the items:
print myDict["SERIALNO"]
print myDict["STATUS"]
print myDict["BATTV"]

for k in myDict.keys():
    print k + " = " + myDict[k]
from subprocess import check_output

out = check_output(["apcaccess"])
spl = [ele.split(":", 1) for ele in out.splitlines()]
d = {k.rstrip(): v.lstrip() for k, v in spl}
print(d['BCHARGE'])
print(d["TIMELEFT"])
100.0 Percent
46.0 Minutes
from subprocess import check_output

def get_apa():
    out = check_output(["apcaccess"])
    spl = [ele.split(":", 1) for ele in out.splitlines()]
    d = {k.rstrip(): v.lstrip() for k, v in spl}
    return d

output = get_apa()
print(output['BCHARGE'])
100.0 Percent
To print all key/value pairs:
for k, v in get_apa().items():
    print("{} = {}".format(k, v))
What you need is the subprocess module:
import subprocess

child = subprocess.Popen('apcaccess', stdout=subprocess.PIPE)
msg, err = child.communicate()
print(msg.split())
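As a sanity check, the split-on-first-colon parsing used in the answers above can be exercised directly against a few lines of the sample output from the question, with no UPS attached (the sample text is embedded as a string):

```python
# A few lines of apcaccess output, taken from the question.
sample = """\
APC      : 001,035,0933
DATE     : 2014-11-12 13:38:27 -0500
HOSTNAME : doormon
STATUS   : ONLINE
TIMELEFT : 46.0 Minutes
BCHARGE  : 100.0 Percent
"""

# Split each line on the FIRST colon only, so timestamps keep their colons.
d = {k.rstrip(): v.lstrip() for k, v in
     (line.split(":", 1) for line in sample.splitlines())}

print(d["TIMELEFT"])  # 46.0 Minutes
print(d["DATE"])      # 2014-11-12 13:38:27 -0500
```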