Using result of a function as search conditional - python

I would like to insert the 1st result of the function def elliptic() into the 2nd function entity_noun(). In the 2nd function, it finds the node which has the attribute with a specific value. I want this value (which is a string in quotes "??????") to be retrieved from the returned value of the 1st function.
from bs4 import BeautifulSoup
def elliptic():
    last_a_tag = soup.find_all("sn", elliptic="yes")
    for item in last_a_tag:
        entity = item.get('entity')
        return(entity)
def entity_noun():
    ent = soup.find(entity="??????")
    noun = ent.find('n')
    return(noun)
Do you have any suggestion how to do this?

You can pass the result of calling the function right in the parameters.
So in this case you would do:
ent = soup.find(entity=elliptic())

You have two functions here. A function has to be called in order to return a result.
If you do something like this:
from bs4 import BeautifulSoup
def elliptic():
    last_a_tag = soup.find_all("sn", elliptic="yes")
    for item in last_a_tag:
        entity = item.get('entity')
        return(entity)
def entity_noun():
    ent = soup.find(entity=elliptic())
    noun = ent.find('n')
    return(noun)
entity_noun()
you call entity_noun(), which in turn calls elliptic().
Another option is to use an argument:
from bs4 import BeautifulSoup
def elliptic():
    last_a_tag = soup.find_all("sn", elliptic="yes")
    for item in last_a_tag:
        entity = item.get('entity')
        return(entity)
def entity_noun(X):
    ent = soup.find(entity=X)
    noun = ent.find('n')
    return(noun)
A = elliptic()
entity_noun(A)
In this case you call the first function, elliptic(), keep the result in A, and then pass A to entity_noun(). With this second method the functions stay independent of each other, so each can be used on its own in a different context.
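The argument-passing approach can be sketched in a runnable form. To keep it free of third-party installs, this sketch swaps BeautifulSoup for the standard library's xml.etree.ElementTree, so the lookup calls differ from the ones above. The sample document and the None checks are assumptions added for illustration; they are not part of the original question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; tag and attribute names mirror the question.
SAMPLE = """<doc>
  <sn entity="e1"><n>cat</n></sn>
  <sn elliptic="yes" entity="e1"/>
</doc>"""

def elliptic(root):
    """Return the entity of the first elliptic <sn>, or None if there is none."""
    for item in root.iter("sn"):
        if item.get("elliptic") == "yes":
            return item.get("entity")
    return None

def entity_noun(root, entity):
    """Return the first <n> inside the first node carrying this entity value."""
    if entity is None:
        return None
    ent = root.find(f".//*[@entity='{entity}']")
    return ent.find(".//n") if ent is not None else None

root = ET.fromstring(SAMPLE)
noun = entity_noun(root, elliptic(root))  # result of one function feeds the other
print(noun.text if noun is not None else "no match")
```

Checking for None before searching avoids an AttributeError when no elliptic node (or no matching entity) exists.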

Related

Mock function use two times/same name

I have a function to test that calls the same function twice, but this function returns two different data. I need to create a mock for the first variable and then for the second, I have a solution but it doesn't work in some cases. I want to be able to mock the api_result_first variable and the api_result_second variable which uses api_call().
Do you have an idea?
My code:
import pandas as pd
import time
import random
def api_call():
    time.sleep(2)
    return random.randint(0,9)
def slow_function():
    api_result_first = api_call()
    api_result_second = api_call()
    result = api_result_first + api_result_second
    return result
My pytest:
from a import *
import pytest
# https://changhsinlee.com/pytest-mock/
def test_aa(mocker):
    mocker.patch("a.api_call", return_value="ok")
    value = slow_function()
    assert isinstance(value, int)
Use side_effect (singular) to provide a sequence of return values; each call to the mock returns the next item.
def test_aa(mocker):
    mocker.patch("a.api_call", side_effect=[3, 5])
    value = slow_function()
    assert value == 8
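The same idea can be tried without pytest-mock, using only the standard library's unittest.mock. This is a minimal sketch that keeps everything in one file, so it patches the module's own globals rather than "a.api_call". The helper name run_with_fakes is hypothetical.

```python
import random
import time
from unittest import mock

def api_call():
    time.sleep(2)
    return random.randint(0, 9)

def slow_function():
    api_result_first = api_call()
    api_result_second = api_call()
    return api_result_first + api_result_second

def run_with_fakes(first, second):
    """Run slow_function with api_call replaced by a mock (hypothetical helper)."""
    # An iterable side_effect makes the mock return one item per call, in order.
    fake = mock.Mock(side_effect=[first, second])
    # Swap the name api_call in this module's globals for the duration of the block.
    with mock.patch.dict(globals(), {"api_call": fake}):
        result = slow_function()
    return result, fake.call_count

print(run_with_fakes(3, 5))  # (8, 2)
```

If the code under test called the mock a third time, the exhausted side_effect list would raise StopIteration, which is a useful signal that the number of calls has changed.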

Call many python functions from a module by looping through a list of function names and making them variables

I have three similar functions in tld_list.py. I am working out of mainBase.py file.
I am trying to create a variable string which will call the appropriate function by looping through the list of all functions. My code reads from a list of function names, iterates through the list and running the function on each iteration. Each function returns 10 pieces of information from separate websites
I have tried 2 variations annotated as Option A and Option B below
# This is mainBase.py
import tld_list # I use this in conjunction with Option A
from tld_list import * # I use this with Option B
functionList = ["functionA", "functionB", "functionC"]
tldIterator = 0
while tldIterator < len(functionList):
    # This will determine which function is called first
    # In the first case, the function is functionA
    currentFunction = str(functionList[tldIterator])
Option A
    currentFunction = "tld_list." + currentFunction
    websiteName = currentFunction(x, y)
    print(websiteName[1])
    print(websiteName[2])
    ...
    print(websiteName[10])
Option B
    websiteName = currentFunction(x, y)
    print(websiteName[1])
    print(websiteName[2])
    ...
    print(websiteName[10])
Even though it is not seen, I continue to loop through the iteration by ending each loop with tldIterator += 1
Both options fail for the same reason stating TypeError: 'str' object is not callable
I am wondering what I am doing wrong, or if it is even possible to call a function in a loop with a variable
You have the function names but what you really want are the function objects bound to those names in tld_list. Since function names are attributes of the module, getattr does the job. Also, it seems like list iteration rather than keeping track of your own tldIterator index would suffice.
import tld_list
function_names = ["functionA", "functionB", "functionC"]
functions = [getattr(tld_list, name) for name in function_names]
for fctn in functions:
    website_name = fctn(x, y)
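Since tld_list is not available here, the getattr lookup can be tried against a standard-library module instead. This is a small sketch that uses math as a stand-in; the chosen function names and the input value are only for illustration.

```python
import math

# Names as strings, just like functionList in the question.
function_names = ["floor", "ceil", "sqrt"]

# getattr turns each name into the function object bound to it in the module.
functions = [getattr(math, name) for name in function_names]

results = [fctn(6.25) for fctn in functions]
print(results)  # [6, 7, 2.5]
```

A misspelled name raises AttributeError at the getattr step, before any function is called.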
You can create a dictionary to provide a name to function conversion:
def funcA(...): pass
def funcB(...): pass
def funcC(...): pass
func_find = {"Huey": funcA, "Dewey": funcB, "Louie": funcC}
Then you can call them, e.g.
result = func_find["Huey"](...)
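A runnable version of the dictionary approach might look like the sketch below. The three function bodies and the call_by_name helper are hypothetical placeholders, since the original elides the real parameters.

```python
def func_a(x, y):
    return x + y

def func_b(x, y):
    return x * y

def func_c(x, y):
    return x - y

# Map each external name to a function object (not to a string).
func_find = {"Huey": func_a, "Dewey": func_b, "Louie": func_c}

def call_by_name(name, x, y):
    """Look up the function registered under name and call it."""
    try:
        func = func_find[name]
    except KeyError:
        raise ValueError(f"unknown function name: {name!r}") from None
    return func(x, y)

print(call_by_name("Dewey", 2, 3))  # 6
```

Unlike getattr on a module, the dictionary exposes only the functions you list, which helps when the names come from user input.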
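You should avoid this type of code; prefer if statements or function references instead. If you really must build the call from a string, use eval rather than exec, because exec always returns None:
websiteName = eval('{}(x, y)'.format(currentFunction))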

How to make a loop in return so that not to repeat scrapy.request?

I am scraping a page. I tried to make loop in return function but it didn't work. It gave me the result of just first link. I want to make a loop so that I could return all three values.
class SiteFetching(scrapy.Spider):
    name = 'Site'
    def start_requests(self):
        links = {'transcription_page': 'https://www.rev.com/freelancers/transcription',
                 'captions_page': 'https://www.rev.com/freelancers/captions',
                 'subtitles_page': 'https://www.rev.com/freelancers/subtitles'}
        call = [self.parse_transcription, self.parse_caption, self.parse_subtitles]
        return [
            scrapy.Request(links['transcription_page'], callback=call[0]),
            scrapy.Request(links['captions_page'], callback=call[1]),
            scrapy.Request(links['subtitles_page'], callback=call[2])
        ]
Yes, you can have a list comprehension do the looping so that there is only one instance of the text scrapy.Request() in the program, but of course being a loop the function will be called once per loop:
class SiteFetching(scrapy.Spider):
    name = 'Site'
    def start_requests(self):
        links = [('https://www.rev.com/freelancers/transcription', self.parse_transcription),
                 ('https://www.rev.com/freelancers/captions', self.parse_caption),
                 ('https://www.rev.com/freelancers/subtitles', self.parse_subtitles)]
        return [scrapy.Request(link[0], callback=link[1]) for link in links]
Another option if you want to avoid making all the requests at once and waiting for them all to return is to use a generator expression:
return (scrapy.Request(link[0], callback=link[1]) for link in links)
btw I know nothing about Spider etc
Now you call start_requests() but it returns a generator and you call next() on it to make each Request():
sf = SiteFetching() # I assume this is how you instantiate SiteFetching
gen = sf.start_requests() # Only returns a generator
req = next(gen) # Only here does the first call to Request() occur with callback to follow.
I only showed one instance of calling next(), but you could have a loop (or iterate over it with for), but any way you do it you get to say when the Request() occurs and what you do before and after each call.
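The difference between the list comprehension and the generator expression can be seen without Scrapy. In this sketch, make_request is a hypothetical stand-in for scrapy.Request that only records when it is called, and the URLs are placeholders.

```python
calls = []

def make_request(url):
    """Hypothetical stand-in for scrapy.Request; records each construction."""
    calls.append(url)
    return f"request for {url}"

links = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]

# List comprehension: every call happens immediately.
eager = [make_request(link) for link in links]
calls_after_list = len(calls)   # all three calls have run

calls.clear()

# Generator expression: nothing runs until a value is asked for.
lazy = (make_request(link) for link in links)
calls_before_next = len(calls)  # no calls yet
first = next(lazy)
calls_after_next = len(calls)   # exactly one call

print(calls_after_list, calls_before_next, calls_after_next)  # 3 0 1
```

Iterating over lazy with a for loop would trigger the remaining two calls, one per iteration.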

How to get a variable from an external function

How, if possible, would I be able to bring in a variable from an external function. Take this code
# HttpThing.py
import requests
def httpThing(someurl):
    x = requests.get(someurl)
In another file I have
from httpThing.py import httpThing
httpThing(www.url.com)
print(x)
How would I be able to get the last print function to print the response to the query.
You return that value from the function, like this:
# HttpThing.py
import requests
def httpThing(someurl):
    x = requests.get(someurl)
    return x
Then use it like this:
from httpThing import httpThing
x = httpThing("https://www.url.com")
print(x)
NOTE: the variable you return doesn't have to have the same name as the variable you assign the result to at the call site. It can be named anything.
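The point about names can be checked without a network call. In this sketch, http_thing is a hypothetical stand-in that builds a string instead of calling requests.get, so only the return mechanics are shown.

```python
def http_thing(some_url):
    # x is local to this function; it is not visible to the caller.
    x = f"response for {some_url}"
    return x

# The caller is free to bind the returned value to any name.
response = http_thing("https://www.url.com")
print(response)  # response for https://www.url.com
```

Without the return statement the call would evaluate to None, which is why the original print(x) had nothing to show.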

Result as new argument

I wish to repeat this function (say 200times), each time taking the output as a new argument:
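import re
import urllib
def coop(url):
    # urllib.urlopen is Python 2; on Python 3 use urllib.request.urlopen and decode the bytes
    num_body = re.search(r'\d+', urllib.urlopen(url).read()).group(0)
    num_head = re.search(r'\d+', url).group(0)
    new = url.replace(num_head, num_body)
    return new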
You could do this with a simple loop:
for _ in range(200):
    url = coop(url)
This works with the function as you have written it and stores the result in url to feed into the next call.
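The feed-the-output-back-in pattern can be wrapped in a small helper and tried on a function that needs no network access. Both repeat and double are hypothetical names used only for this sketch.

```python
def repeat(func, value, times):
    """Apply func to value, then to its result, and so on, times times."""
    for _ in range(times):
        value = func(value)
    return value

def double(n):
    return 2 * n

print(repeat(double, 1, 10))  # 1024
```

With the function from the question, the equivalent call would be repeat(coop, url, 200).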
