I have a Selenium bot that performs actions on a social network. I would like it to stop after it performs a certain number of actions (10 in this example). I initialize the variables this way:
def __init__(self):
    self.browser = webdriver.Firefox()
    self.counter_var = 0
    self.max_var = 10
This is the part performing and counting actions:
def action(self, accounts):
    for account in accounts[9:]:
        try:
            self.browser.get(account)
            time.sleep(5)
            like_button = self.browser.find_element_by_xpath(
                u'//button[contains(@class, "Heart")]').click()
            self.count_actions()
            print(self.counter_var)
        except selenium.common.exceptions.NoSuchElementException:
            break

def count_actions(self):
    self.counter_var += 1
And this is the loop I tried to write in main:
while self.counter_var < self.max_var:
    searched_category = random.choice(pool_categories)
    accounts = self.load_category(searched_category)
    self.action(accounts)
However the bot never stops, even when counter_var reaches 10.
Do you know how to correct it?
It's hard to give a definitive answer without seeing more of the code, but the problem appears to be that the action() method never checks self.counter_var while its for loop is executing, so the while condition in main is only re-evaluated after a whole batch of accounts has been processed.
Something like the following might work. Adding yield to the action() method turns it into a generator function, so calling it returns an iterator. Execution then effectively "pauses" at the yield on each iteration of its for loop, which lets the caller inspect the current value of self.counter_var (or do anything else it wants) between actions.
Here's what I'm suggesting with a few explanatory comments:
class Class:
    def action(self, accounts):
        for account in accounts[9:]:
            try:
                self.browser.get(account)
                time.sleep(5)
                like_button = self.browser.find_element_by_xpath(
                    u'//button[contains(@class, "Heart")]').click()
                self.count_actions()
                print(self.counter_var)
                yield  # Added.
            except selenium.common.exceptions.NoSuchElementException:
                break

    def count_actions(self):
        self.counter_var += 1

    def main(self):
        while True:
            searched_category = random.choice(pool_categories)
            accounts = self.load_category(searched_category)
            for _ in self.action(accounts):  # Iterate through account checks.
                if self.counter_var >= self.max_var:  # Reached the limit?
                    return  # Leave both loops.
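To see the generator approach stop at the limit without a browser, here is a minimal sketch. The Bot class, its batches argument, and the account names are hypothetical stand-ins for the Selenium code above; only the yield and the limit check mirror the suggestion.

```python
class Bot:
    """Stand-in for the Selenium bot; no browser is involved."""

    def __init__(self, max_actions):
        self.counter_var = 0
        self.max_var = max_actions

    def action(self, accounts):
        """Perform one action per account, pausing after each via yield."""
        for account in accounts:
            # Placeholder for the real browser work (load page, click button).
            self.counter_var += 1
            yield account

    def main(self, batches):
        """Process batches of accounts until max_var actions have been done."""
        for accounts in batches:
            for _ in self.action(accounts):
                if self.counter_var >= self.max_var:
                    return  # Leaves both loops at once.


bot = Bot(max_actions=10)
# Five batches of seven accounts each: far more than the limit allows.
bot.main([["account%d" % i for i in range(7)] for _ in range(5)])
print(bot.counter_var)  # 10
```

Because the check runs after every single action, the counter stops exactly at the limit instead of overshooting to the end of the current batch.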
I'm teaching myself Python and don't know how to describe this clearly, so here's the simplest example I can come up with, in pseudocode:
# where r() is a random number function
objCount = 0

def mainfunc():
    while playgame == True and objCount < 100:
        create(r(time))
        time.sleep(1)
    return None

def create(tmptime):
    global objCount
    objCount = objCount + 1
    newobj = plotSomething(r(x), r(y))
    time.sleep(tmptime)
    selfDelete..
    return None

mainfunc()  # run it
Instead of creating a randomly "lived" object every second, it creates one and then waits for its "life" to expire before creating the next. I'm trying to fire each object off to the side so it can time out on its own while the main loop keeps making new ones.
All the documentation gets very involved, with asyncio, multithreading, etc.
Is there an easy way to kick this thing out of the main loop and not hold up traffic?
The laziest method, for simplicity, is:
import concurrent.futures as delayobj
# where r() is a random number function
objCount = 0

def mainfunc():
    global objCount
    # One executor for the whole loop; leaving the with block waits for pending tasks.
    with delayobj.ThreadPoolExecutor() as executor:
        while objCount < 100:
            executor.submit(create, tmptime=r(time))
            time.sleep(1)
    return None

def create(tmptime):
    global objCount
    objCount = objCount + 1
    newobj = plotSomething(r(x), r(y))
    time.sleep(tmptime)
    selfDelete..
    return None

mainfunc()  # run it
thanks again guys
I would like to write a class with the following interface.
class Automaton:
    """ A simple automaton class """

    def iterate(self, something):
        """ yield something and expect some result in return """
        print("Yielding", something)
        result = yield something
        print("Got \"" + result + "\" in return")
        return result

    def start(self, somefunction):
        """ start the iteration process """
        yield from somefunction(self.iterate)
        # Returning a value ends the generator with StopIteration("I'm done!").
        # Raising StopIteration directly is a RuntimeError since Python 3.7 (PEP 479).
        return "I'm done!"

def first(iterate):
    while iterate("what do I do?") != "over":
        continue

def second(iterate):
    value = yield from iterate("what do I do?")
    while value != "over":
        value = yield from iterate("what do I do?")

# A simple driving process
automaton = Automaton()
#generator = automaton.start(first) # This one hangs
generator = automaton.start(second) # This one runs smoothly
next_yield = generator.__next__()
for step in range(4):
    next_yield = generator.send("Continue...({})".format(step))
try:
    end = generator.send("over")
except StopIteration as excp:
    print(excp)
The idea is that Automaton will regularly yield values to the caller which will in turn send results/commands back to the Automaton.
The catch is that the decision process "somefunction" will be some user-defined function I have no control over, which means that I can't really expect it to call the iterate method with a yield from in front. Worse, the user might want to plug some third-party function they have no control over into this Automaton class, meaning that the user might not be able to rewrite somefunction to put yield from in front of the iterate calls.
To be clear: I completely understand why using the first function hangs the automaton. I am just wondering if there is a way to alter the definition of iterate or start that would make the first function work.
I have an assignment where I need to create a stopwatch, but only for IDLE. Here's what I have so far; I'm not sure how to convert the times to a normal time format.
import time

start = 0

def stopwatch():
    global start  # Without this, the assignment below creates a local variable.
    while True:
        command = input("Type: start, stop, reset, or quit: \n")
        if command == "quit":
            break
        elif command == "start":
            start = time.time()
            print(start)
            stopwatch2()
        elif command == "stop":
            stopwatch()
        elif command == "reset":
            stopwatch()
        else:
            break

def stopwatch2():
    while True:
        command = input("Type: stop, reset, or quit: \n")
        if command == "quit":
            break
        elif command == "stop":
            total = time.time() - start
            print(total)
            stopwatch()
        elif command == "reset":
            stopwatch()
        else:
            break

stopwatch()
Thanks for your help!
You can use datetime.timedelta():
import datetime
print(datetime.timedelta(seconds=total))
For example:
In [10]: print(datetime.timedelta(seconds=10000000))
115 days, 17:46:40
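Applied to a stopwatch reading, this could look like the following sketch. The helper name format_elapsed is made up for illustration; rounding first drops the fractional seconds that time.time() differences carry.

```python
import datetime
import time


def format_elapsed(seconds):
    """Render a duration in seconds as H:MM:SS (rounded to whole seconds)."""
    return str(datetime.timedelta(seconds=round(seconds)))


start = time.time()
time.sleep(0.01)  # Stand-in for the time between "start" and "stop".
total = time.time() - start

print(format_elapsed(total))
print(format_elapsed(3725.4))  # 1:02:05
```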
Think of it like this: IDLE is really no different from coding in the interactive Python interpreter, which I do all the time (well, I use IPython).
Think of your stopwatch like an object. What functionality does it have? Things like start, stop, reset.
This may not be the most efficient way to solve the problem but here is what I would do.
>>> import time
>>> class StopwatchException(Exception):
        pass

>>> class IsRunningException(StopwatchException):
        pass

>>> class NotRunningException(StopwatchException):
        pass

>>> class Stopwatch():
        def __init__(self):
            self._times = []
            self._is_running = False

        def start(self):
            if self._is_running:
                raise IsRunningException
            self._is_running = True
            tracker = {
                'start': time.time(),
                'stop': None,
            }
            self._times.append(tracker)

        def stop(self):
            if not self._is_running:
                raise NotRunningException
            tracker = self._times[-1]
            # tracker refers to the same dict stored in self._times (not a copy),
            # so this updates the stored entry
            tracker['stop'] = time.time()
            #print(self._times[-1])
            self._is_running = False

        def reset(self):
            if self._is_running:
                raise IsRunningException
            self._times = []

        def total(self):
            if self._is_running:
                raise IsRunningException
            total = 0.0
            for t in self._times:
                total += t['stop'] - t['start']
            return total

>>> s = Stopwatch()
>>> s.start()
>>> s.stop()
>>> s.total()
6.499619960784912
>>> s.reset()
>>> s.total()
0.0
To me, any time you want to model a real-world object or "thing", OOP makes the most sense. Here's a short explanation of each element of the program:
Classes
StopwatchException: Base exception class for the stopwatch class.
IsRunningException: Raised if the stopwatch is running when it should be stopped.
NotRunningException: Raised if the stopwatch is not running when it should be.
Stopwatch: This represents the actual stopwatch.
Stopwatch Class
init
A basic stopwatch class really only needs two instance variables: one that stores each start/stop time (so the durations can be computed later) and one that stores the "state" of the stopwatch (on/off or running/stopped).
start
First we need to make sure the stopwatch isn't already running.
Then we need to set its state to running and store the time in self._times.
I chose to use a local variable and store each time pair as a dictionary with the keys 'start' and 'stop'. I chose a dictionary because it is mutable. You could also have a list with index 0 being the start time and index 1 being the stop time. You cannot use a tuple for this since tuples are immutable.
Also, the "temporary" variable is not necessary but I used it for readability.
stop
First we need to make sure the stopwatch is actually running.
Then we set the state as 'stopped' (using our boolean self._is_running) and store our stop time, similar to what we did with start. I think it doesn't matter whether you set the boolean at the beginning or the end, although I chose to set it at the beginning of the start function and the end of the stop function so that the times would not include the time needed to update a boolean variable (even though it's a trivial task, it could be much more complex in more complex programs).
reset
Make sure the stopwatch isn't running
Set self._times to be an empty list.
total
Make sure the stopwatch isn't running.
Optional: You can stop the stopwatch here if it's running, but I prefer to raise an exception.
Iterate through each list item in self._times and calculate the difference between stop and start.
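As a sketch of the two alternatives mentioned above (a list instead of a dictionary for each time pair, and a total that tolerates a running stopwatch instead of raising), consider the variant below. LapStopwatch and its injectable clock parameter are hypothetical and not part of the class shown earlier; the running-state checks are left out to keep it short.

```python
import time


class LapStopwatch:
    """Hypothetical variant: total() also counts the lap still in progress."""

    def __init__(self, clock=time.time):
        self._clock = clock  # Injectable so the example can use a fake clock.
        self._times = []     # Each entry is a [start, stop] list; stop may be None.

    def start(self):
        self._times.append([self._clock(), None])

    def stop(self):
        self._times[-1][1] = self._clock()  # Lists are mutable, like the dict.

    def total(self):
        now = self._clock()
        # An unfinished lap (stop is None) is measured up to the current time.
        return sum((stop if stop is not None else now) - start
                   for start, stop in self._times)


ticks = iter([0.0, 2.5, 10.0, 11.0])  # Fake clock readings, in seconds.
watch = LapStopwatch(clock=lambda: next(ticks))
watch.start()            # reads 0.0
watch.stop()             # reads 2.5
watch.start()            # reads 10.0
elapsed = watch.total()  # reads 11.0
print(elapsed)           # 3.5
```

Whether a running stopwatch should report a partial total or raise is a design choice; raising, as in the original class, makes misuse more visible.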
I have this class called DecayingSet, which is a deque with expiration:
from time import time

class DecayingSet:
    def __init__(self, timeout): # timeout in seconds
        from collections import deque
        self.timeout = timeout
        self.d = deque()
        self.present = set()

    def add(self, thing):
        # Return True if `thing` not already in set,
        # else return False.
        result = thing not in self.present
        if result:
            self.present.add(thing)
            self.d.append((time(), thing))
        self.clean()
        return result

    def clean(self):
        # forget stuff added >= `timeout` seconds ago
        now = time()
        d = self.d
        while d and now - d[0][0] >= self.timeout:
            _, thing = d.popleft()
            self.present.remove(thing)
I'm trying to use it inside a running script, that connects to a streaming api.
The streaming api is returning urls that I am trying to put inside the deque to limit them from entering the next step of the program.
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status, include_entities=True):
        longUrl = status.entities['urls'][0]['expanded_url']
        limit = DecayingSet(86400)
        l = limit.add(longUrl)
        print l
        if l == False:
            pass
        else:
            r = requests.get("http://api.some.url/show?url=%s"% longUrl)
When I use this class in an interpreter, everything is good.
But when the script is running and I repeatedly send in the same URL, l is True every time, indicating that the URL is not in the set when it is supposed to be. What gives?
Copying my comment ;-) I think the indentation is screwed up, but it looks like you're creating a brand new limit object every time on_status() is called. Then of course it would always return True: you'd always be starting with an empty limit.
Regardless, change this:
l = limit.add(longUrl)
print l
if l == False:
    pass
else:
    r = requests.get("http://api.some.url/show?url=%s"% longUrl)
to this:
if limit.add(longUrl):
    r = requests.get("http://api.some.url/show?url=%s"% longUrl)
Much easier to follow. It's usually the case that when you're comparing something to a literal True or False, the code can be made more readable.
Edit
I just saw in the interpreter that the variable assignment is the culprit.
How would I use the same object?
You could, for example, create the limit object at the module level. Cut and paste ;-)
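A minimal sketch of the module-level approach, without tweepy or requests: the on_status function here is a simplified stand-in for the listener method and the example URLs are made up; the DecayingSet class is the one from the question.

```python
from collections import deque
from time import time


class DecayingSet:
    def __init__(self, timeout):  # timeout in seconds
        self.timeout = timeout
        self.d = deque()
        self.present = set()

    def add(self, thing):
        # Return True if `thing` is not already in the set, else False.
        result = thing not in self.present
        if result:
            self.present.add(thing)
            self.d.append((time(), thing))
        self.clean()
        return result

    def clean(self):
        # Forget things added >= `timeout` seconds ago.
        now = time()
        while self.d and now - self.d[0][0] >= self.timeout:
            _, thing = self.d.popleft()
            self.present.remove(thing)


limit = DecayingSet(86400)  # Created once, when the module is loaded.


def on_status(long_url):
    """Return True if the URL is new and should go on to the next step."""
    return limit.add(long_url)  # Every call shares the same `limit`.


print(on_status("http://example.com/a"))  # True
print(on_status("http://example.com/a"))  # False
```

Creating the object in the listener's __init__ and storing it on self would work equally well, as long as the same listener instance handles every status.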
For brevity, I'm just showing what can/must occur in states. I haven't run into any oddities in the state machine framework itself.
Here is a specific question:
Do you find it confusing that we have to return StateChange(...) and StateMachineComplete(...), whereas some of the other actions like some_action_1(...) and some_action_2(...) need not be returned because they're just direct method invocations?
I think that StateChange(...) needs to be returned because otherwise code beyond the StateChange(...) call will be executed. This isn't how a state machine should work! For example, see the implementation of event1 in the ExampleState below:
import abc

class State(metaclass=abc.ABCMeta):
    # =====================================================================
    # == events the state optionally or must implement ====================
    # =====================================================================
    # optional: called when the state becomes active.
    def on_state_entry(self): pass

    # optional: called when we're about to transition away from this state.
    def on_state_exit(self): pass

    @abc.abstractmethod
    def event1(self,x,y,z): pass

    @abc.abstractmethod
    def event2(self,a,b): pass

    @abc.abstractmethod
    def event3(self): pass

    # =====================================================================
    # == actions the state may invoke =====================================
    # =====================================================================
    def some_action_1(self,c,d,e):
        # implementation omitted for brevity
        pass

    def some_action_2(self,f):
        # implementation omitted for brevity
        pass

class StateChange:
    def __init__(self,new_state_type):
        # implementation omitted for brevity
        pass

class StateMachineComplete: pass

class ExampleState(State):
    def on_state_entry(self):
        self.some_action_1("foo","bar","baz")

    def event1(self,x,y,z):
        if x == "asdf":
            return StateChange(ExampleState2)
        else:
            return StateChange(ExampleState3)
        print("I think it would be confusing if we ever got here. Therefore the StateChange calls above are return")

    def event2(self,a,b):
        if a == "asdf":
            return StateMachineComplete()
            print("As with the event above, the return above makes it clear that we'll never get here.")

    def event3(self):
        # Notice that we're not leaving the state. Therefore this can just be a method call, nothing need be returned.
        self.some_action_1("x","y","z")
        # In fact we might need to do a few things here. Therefore a return call again doesn't make sense.
        self.some_action_2("z")

    # Notice we don't implement on_state_exit(). This state doesn't care about that.
When I need a state machine in Python, I store it as a dictionary of functions. The indices into the dictionary are the current states, and the functions do what they need to and return the next state (which may be the same state) and outputs. Turning the crank on the machine is simply:
state, outputs = machine_states[state](inputs)
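As an illustration of this dictionary-of-functions layout, here is a small sketch using a coin-operated turnstile. The state names, events, and output strings are invented for the example.

```python
def locked(event):
    """Handler for the 'locked' state: returns (next_state, outputs)."""
    if event == "coin":
        return "unlocked", "unlock the arm"
    return "locked", "stay locked"


def unlocked(event):
    """Handler for the 'unlocked' state: returns (next_state, outputs)."""
    if event == "push":
        return "locked", "lock the arm"
    return "unlocked", "return the coin"


# The keys are the states; each function computes the next state and outputs.
machine_states = {"locked": locked, "unlocked": unlocked}

state = "locked"
history = []
for inputs in ["push", "coin", "coin", "push"]:
    state, outputs = machine_states[state](inputs)  # Turn the crank once.
    history.append((state, outputs))

print(state)  # locked
```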
By putting the outgoing state changes in code you're obfuscating the whole process. A state machine should be driven by a simple set of tables. One axis is the current state, and the other is the possible events. You have two or three tables:
The "next-state" table that determines the exit state
The "action" table that determines what action to take
The "read" table that determines whether you stay on the current input event or move on to the next.
The third table may or may not be needed depending on the complexity of the input "grammar".
There are more esoteric variations, but I've never found a need for more than this.
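A table-driven machine along these lines might be sketched as follows, again with an invented turnstile example. Only the "next-state" and "action" tables are shown; the "read" table is omitted because every event here is consumed exactly once.

```python
def unlock_arm():
    return "arm unlocked"


def lock_arm():
    return "arm locked"


def do_nothing():
    return "nothing"


# Both tables are indexed by (current state, event).
NEXT_STATE = {
    ("locked", "coin"): "unlocked",
    ("locked", "push"): "locked",
    ("unlocked", "coin"): "unlocked",
    ("unlocked", "push"): "locked",
}
ACTION = {
    ("locked", "coin"): unlock_arm,
    ("locked", "push"): do_nothing,
    ("unlocked", "coin"): do_nothing,
    ("unlocked", "push"): lock_arm,
}


def run(events, state="locked"):
    """Feed events through the tables; return the final state and action results."""
    performed = []
    for event in events:
        key = (state, event)
        performed.append(ACTION[key]())  # Look up and run the action.
        state = NEXT_STATE[key]          # Look up the exit state.
    return state, performed


final_state, performed = run(["coin", "push", "push"])
print(final_state)  # locked
print(performed)    # ['arm unlocked', 'arm locked', 'nothing']
```

All transitions are visible in one place, so changing the machine's behavior means editing table entries rather than control flow.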
I also struggled to find a good state machine solution in Python, so I wrote state_machine.
It works like this:
# Import line assumed; it is not part of the original snippet.
from state_machine import acts_as_state_machine, before, after, State, Event

@acts_as_state_machine
class Person():
    name = 'Billy'

    sleeping = State(initial=True)
    running = State()
    cleaning = State()

    run = Event(from_states=sleeping, to_state=running)
    cleanup = Event(from_states=running, to_state=cleaning)
    sleep = Event(from_states=(running, cleaning), to_state=sleeping)

    @before('sleep')
    def do_one_thing(self):
        print("{} is sleepy".format(self.name))

    @before('sleep')
    def do_another_thing(self):
        print("{} is REALLY sleepy".format(self.name))

    @after('sleep')
    def snore(self):
        print("Zzzzzzzzzzzz")

    @after('sleep')
    def big_snore(self):
        print("Zzzzzzzzzzzzzzzzzzzzzz")

person = Person()
print(person.current_state == person.sleeping)  # True
print(person.is_sleeping)  # True
print(person.is_running)  # False
person.run()
print(person.is_running)  # True
person.sleep()
# Billy is sleepy
# Billy is REALLY sleepy
# Zzzzzzzzzzzz
# Zzzzzzzzzzzzzzzzzzzzzz
print(person.is_sleeping)  # True