I'm trying to add the array values one by one to a URL, since each one is a different ID for an API call I'm trying to make.
The code works if I write the ID directly into the URL variable, but I have hundreds of API calls to make.
How can I add each array element to the URL one by one? Check the final output below and see how it appends the whole array instead of each element one by one.
import requests

ids = ["12ab", "13ab", "14ab"]

for x in ids:
    url = ("https://google.com/{}" + format(ids) + "?extraurlparameters")
    response = requests.request("DELETE", url)
    print(x)
    print(url)
    print(response.text)
Output:

12ab
https://google.com/{}['12ab', '13ab', '14ab']?extraurlparameters
13ab
https://google.com/{}['12ab', '13ab', '14ab']?extraurlparameters
14ab
https://google.com/{}['12ab', '13ab', '14ab']?extraurlparameters
Replace your version with the following and let me know if it works:

ids = ["12ab", "13ab", "14ab"]

for x in ids:
    url = ("https://google.com/{}".format(x) + "?extraurlparameters")
    print(url)
import requests

ids = ["12ab", "13ab", "14ab"]

for x in ids:
    url = ("https://google.com/" + format(x) + "?extraurlparameters")
    response = requests.request("DELETE", url)
    print(x)
    print(url)
    print(response.text)
Change ids to x in the line that builds the url.
Usually, format() is called at the end of the string.
url = "https://google.com/{}?extraurlparameters".format(x)
In Python 3.6+, you could use an f-string (format string), such as:
url = f"https://google.com/{x}?extraurlparameters"
Your original code with that fix applied:
import requests

ids = ["12ab", "13ab", "14ab"]

for x in ids:
    url = "https://google.com/{}?extraurlparameters".format(x)
    response = requests.request("DELETE", url)
    print(x)
    print(url)
    print(response.text)
I think you're misusing the format function:
import requests

ids = ["12ab", "13ab", "14ab"]

for id in ids:
    url = ("https://google.com/{}?extraurlparameters".format(id))
    response = requests.request("DELETE", url)
    print(id)
    print(url)
    print(response.text)
Related
I have a set of brand numbers for a webpage URL. I build the URL as an f-string and insert the brand number where it belongs. Each page has a unique cursor ID that loads the next page. I'm trying to extract this next-page ID while keeping track of which brand number it belongs to.
Here's some sample code:
import requests
import pandas as pd
from io import StringIO
from bs4 import BeautifulSoup

brands = [989, 1344, 474, 1237, 886, 1, 328, 2188]
testid = {}

for b in brands:
    url = f'https://webapi.depop.com/api/v2/search/products/?brands={b}&itemsPerPage=24&country=gb&currency=GBP&sort=relevance'
    payload = {}
    headers = {}
    response = requests.request("GET", url, headers=headers, data=payload)
    test = pd.read_json(StringIO(response.text), lines=True)
    for m in test['meta'].items():
        if m[1]['hasMore'] == True:
            testid[str(b)] = [m[1]['cursor']]
        else:
            continue

for br in testid.keys():
    while True:
        html = f'https://webapi.depop.com/api/v2/search/products/?brands={br}&cursor={testid[str(br)][-1]}&itemsPerPage=24&country=gb&currency=GBP&sort=relevance'
        r = requests.request("GET", html, headers=headers, data=payload)
        read_id = pd.read_json(StringIO(r.text), lines=True)
        for m in read_id['meta'].items():
            try:
                testid[str(br)].append(m[1]['cursor'])
            except:
                continue
Here's the output it produces:
{'989': ['MnwyNHwxNjQwMDMwODcw']}
However, it overwrites the values originally stored under each brand number and only keeps the last one collected. It should keep a list per brand and produce something like this:
{'989': ['MnwyNHwxNjQwMDI4Mzk1', ...],
'1344': ['MnwyNHwxNjQwMDI4Mzk2', ...],
'474': ['MnwyNHwxNjQwMDI4Mzk3', ...],
'1237': ['MnwyNHwxNjQwMDI4Mzk3', ...],
'886': ['MnwyNHwxNjQwMDI4Mzk4', ...],
'1': ['MnwyNHwxNjQwMDI4Mzk4', ...],
'328': ['MnwyNHwxNjQwMDI4Mzk5', ...]}
where the triple dots (...) denote the additional ID values collected from the pages with that brand number. How can I get an output like this?
After setting testid to be a collections.defaultdict(list), the rest falls out in a rather straightforward manner.
Note: I'm only going to fetch the first 3 cursors of any product, but you can fetch them all as you like.
import collections
import requests

brands = [989, 1344, 474, 1237, 886, 1, 328, 2188]
testid = collections.defaultdict(list)

for b in brands:
    headers = {}
    payload = {}
    url = f"https://webapi.depop.com/api/v2/search/products/?brands={b}&itemsPerPage=24&country=gb&currency=GBP&sort=relevance"
    response = requests.request("GET", url, headers=headers, data=payload)
    data = response.json()

    i = 0  # short circuit
    while data.get("meta", {}).get("hasMore") and i < 3:
        cursor = data.get("meta", {}).get("cursor")
        testid[str(b)].append(cursor)
        response = requests.request("GET", f"{url}&cursor={cursor}", headers=headers, data=payload)
        data = response.json()
        i += 1

for key, value in testid.items():
    print(key, value)
This gives us:
989 ['MnwyNHwxNjQwMDMzMjM0']
1344 ['MnwyNHwxNjQwMDMzMjM1', 'M3w0OHwxNjQwMDMzMjM1', 'NHw3MnwxNjQwMDMzMjM1']
474 ['MnwyNHwxNjQwMDMzMjM3', 'M3w0OHwxNjQwMDMzMjM3', 'NHw3MnwxNjQwMDMzMjM3']
1237 ['MnwyNHwxNjQwMDMzMjM5', 'M3w0OHwxNjQwMDMzMjM5', 'NHw3MnwxNjQwMDMzMjM5']
886 ['MnwyNHwxNjQwMDMzMjQz', 'M3w0OHwxNjQwMDMzMjQz', 'NHw3MnwxNjQwMDMzMjQz']
1 ['MnwyNHwxNjQwMDMzMjQ4', 'M3w0OHwxNjQwMDMzMjQ4', 'NHw3MnwxNjQwMDMzMjQ4']
328 ['MnwyNHwxNjQwMDMzMjUz', 'M3w0OHwxNjQwMDMzMjUz', 'NHw3MnwxNjQwMDMzMjUz']
Wait a sec.... What is going on with:
data.get("meta", {}).get("hasMore")
Great question and I should have explained it before.
So, there is a chance that data["meta"] is not defined, and if that were true, the following would fail:
data["meta"].get("hasMore")
as would
data.get("meta").get("hasMore")
So what we did:
data.get("meta", {}).get("hasMore")
was use the second parameter of get() to provide a default value. In this case it is just an empty dict, but that is enough for us to safely chain the follow-up .get("hasMore") onto.
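To make that concrete, here is a tiny standalone demonstration of the pattern on plain dicts (the sample data is made up, not from the API):

data_with_meta = {"meta": {"hasMore": True, "cursor": "abc"}}
data_without_meta = {}

# Safe on both: the {} default gives the second .get() something to chain onto.
print(data_with_meta.get("meta", {}).get("hasMore"))     # True
print(data_without_meta.get("meta", {}).get("hasMore"))  # None

# Unsafe: .get("meta") returns None when the key is missing,
# and None has no .get(), so this line would raise AttributeError.
# data_without_meta.get("meta").get("hasMore")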
I'm trying to create a loop over some response data from a website. The loop should grab the value of the cursor tag, feed it into the URL, and repeat until the last page has no cursor tag. The cursor tag's value points to the next URL page to load.
So far I can only get it to work one step at a time, like so:
import requests
import pandas as pd
from io import StringIO

def ids(url):
    payload = {}
    headers = {}
    response = requests.request("GET", url, headers=headers, data=payload)
    test2 = []
    test = pd.read_json(StringIO(response.text), lines=True)
    for m in test['meta'].items():
        for j in m[1].values():
            test2.append([j])
    id = test2[1][0]

    # split - the next part should repeat
    html = f'https://webapi.depop.com/api/v2/search/products/?brands=1596&cursor={id}&itemsPerPage=24&country=gb&currency=GBP&sort=relevance'
    r = requests.request("GET", html, headers=headers, data=payload)
    list_id = []
    read_id = pd.read_json(StringIO(r.text), lines=True)
    for m in read_id['meta'].items():
        for id in m[1].values():
            list_id.append([id])
    id2 = list_id[1][0]
    return id2
Then I try:
url = 'https://webapi.depop.com/api/v2/search/products/?brands=1645&itemsPerPage=24&country=gb&currency=GBP&sort=relevance'
ids(url)
However, I have to replicate the code below 'split' in such a way that it just loops the new id (id2) into the f-string for the html, keeps extracting the value within cursor, and stores the values in one big list. With my approach I can only grab them one by one, and I would have to keep repeating the code above, which would take an enormous amount of time and code. Is there a more efficient way of doing this?
Expected output:
['MnwyNHwxNjQwMDA4MjI2', 'M3w0OHwxNjQwMDA4MjI2', 'NnwxMjB8MTY0MDAwODIyNg', ...]
Ways it may work:
Run the code within a while loop and have the f-string take the last string in the list, so that each iteration produces a new id and then uses that latest id, and so on, until the last two ids are equal and we stop.
After testing with a while loop it seems to work, but I end up getting an error anyway:
import requests
import pandas as pd
from io import StringIO

def ids(url):
    payload = {}
    headers = {}
    response = requests.request("GET", url, headers=headers, data=payload)
    list_id = []
    test = pd.read_json(StringIO(response.text), lines=True)
    for m in test['meta'].items():
        list_id.append(m[1]['cursor'])
    while True:
        html = f'https://webapi.depop.com/api/v2/search/products/?brands=1596&cursor={list_id[-1]}&itemsPerPage=24&country=gb&currency=GBP&sort=relevance'
        r = requests.request("GET", html, headers=headers, data=payload)
        read_id = pd.read_json(StringIO(r.text), lines=True)
        for m in read_id['meta'].items():
            try:
                list_id.append(m[1]['cursor'])
            except:
                continue
        if list_id[-2] == list_id[-1]:
            break
    return list_id
I get the following error right before the exception:
KeyError: 'meta'
But it still grabs the data when I check the list. How can I make it so that no error appears?
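The KeyError is raised by read_id['meta'] in the for statement itself, which sits outside the try block, so the except never catches it. One possible fix is to test for the column before indexing; a minimal self-contained sketch of that guard (the r_text payload here is a made-up stand-in for a final page with no meta field):

import pandas as pd
from io import StringIO

r_text = '{"products": []}'  # hypothetical last-page response without a meta field
read_id = pd.read_json(StringIO(r_text), lines=True)

if 'meta' not in read_id.columns:
    print("last page reached - break out of the while loop here instead of raising KeyError")
else:
    for m in read_id['meta'].items():
        print(m[1]['cursor'])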
I've got a list of IDs which I want to pass through the URLs to collect the data on the comments. But I'm kind of a newb, and when I try to iterate over the list, I get only one URL and consequently data for only one comment. Can someone please explain what's wrong with my code, and how to get URLs for all IDs in the list and consequently collect the data for all comments?
import requests
import json

comments_from_reddit = ['fkkmga7', 'fkkgxtj', 'fkklfx3', ...]

def getPushshiftData():
    for ID in range(len(comments_from_reddit)):
        url = 'https://api.pushshift.io/reddit/comment/search?ids={}'.format(comments_from_reddit[ID])
        print(url)
        req = requests.get(url)
        data = json.loads(req.text)
        return data['data']

data = getPushshiftData()
Output I'm getting: https://api.pushshift.io/reddit/comment/search?ids=fkkmga7
I will really appreciate any help on my issue. Thanks for your attention.
Your return is inside the for loop, so the function returns after the first iteration. Collect the results into a list and return it after the loop instead:
import requests
import json

comments_from_reddit = ['fkkmga7', 'fkkgxtj', 'fkklfx3', ...]

def getPushshiftData():
    result = list()
    for ID in range(len(comments_from_reddit)):
        url = 'https://api.pushshift.io/reddit/comment/search?ids={}'.format(comments_from_reddit[ID])
        print(url)
        req = requests.get(url)
        data = json.loads(req.text)
        result.append(data['data'])
    return result

data = getPushshiftData()
I want to extract all Wikipedia titles via the API. Each response contains a continue key which is used to get the next logical batch, but after 30 requests the continue key starts to repeat, meaning I am receiving the same pages.
I have tried the following code, along with the Wikipedia documentation:
https://www.mediawiki.org/wiki/API:Allpages
import requests

def get_response(self, url):
    resp = requests.get(url=url)
    return resp.json()

appcontinue = []
url = 'https://en.wikipedia.org/w/api.php?action=query&list=allpages&format=json&aplimit=500'

json_resp = self.get_response(url)
next_batch = json_resp["continue"]["apcontinue"]
url += '&apcontinue=' + next_batch
appcontinue.append(next_batch)

while True:
    json_resp = self.get_response(url)
    url = url.replace(next_batch, json_resp["continue"]["apcontinue"])
    next_batch = json_resp["continue"]["apcontinue"]
    appcontinue.append(next_batch)
I am expecting to receive more than 10,000 unique continue keys, as one response can contain at most 500 titles and Wikipedia has 5,673,237 articles in English.
Actual result: I made more than 600 requests and there are only 30 unique continue keys.
json_resp["continue"] contains two pairs of values: one is apcontinue and the other is continue. You should add them both to your query. See https://www.mediawiki.org/wiki/API:Query#Continuing_queries for more details.
Also, I think it'll be easier to use the params parameter of requests.get instead of manually replacing the continue values. Perhaps something like this:
import requests

def get_response(url, params):
    resp = requests.get(url, params)
    return resp.json()

url = 'https://en.wikipedia.org/w/api.php?action=query&list=allpages&format=json&aplimit=500'
params = {}

while True:
    json_resp = get_response(url, params)
    params = json_resp["continue"]
    ...
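For illustration, here is how the complete loop might look — a minimal sketch that collects page titles and stops when the API no longer returns a continue block (the max_batches cap is just an assumption to keep the example bounded; remove it to fetch everything):

import requests

url = 'https://en.wikipedia.org/w/api.php'
params = {
    'action': 'query',
    'list': 'allpages',
    'format': 'json',
    'aplimit': 500,
}

titles = []
max_batches = 5  # assumption: cap the number of batches for this sketch

for _ in range(max_batches):
    json_resp = requests.get(url, params=params).json()
    titles.extend(page['title'] for page in json_resp['query']['allpages'])
    if 'continue' not in json_resp:
        break  # no continuation block means we have reached the last batch
    # merge both continuation values (apcontinue and continue) into the next request
    params.update(json_resp['continue'])

print(len(titles), titles[:5])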
I'm just a beginner in Python. Please help me with the following problem:
I have API documentation (server allowed methods):
GET, http://view.example.com/candidates/<id>, shows the candidate with id=<id>. Returns 200.
I wrote this code:

import requests

url = 'http://view.example.com/candidates/4'
r = requests.get(url)
print(r)
But I want to know how I can supply the candidate id through the input() built-in function instead of writing it into the URL.
Here are my efforts to do this:

import requests

cand_id = input('Please, type id of askable candidate: ')
url = ('http://view.example.com/candidates' + 'cand_id')
r = requests.get(url)
print(r)
dir(r)
r.content

But it's not working...
You can do this to construct the url:
url = 'http://view.example.com/candidates'
params = { 'cand_id': 4 }
requests.get(url, params=params)
Result: http://view.example.com/candidates?cand_id=4
--
Or if you want to build the same url as you mentioned in your post:
url = 'http://view.example.com/candidates'
cand_id = input("Enter a candidate id: ")
new_url = "{}/{}".format(url, cand_id)
Result: http://view.example.com/candidates/4
You're using the string 'cand_id' instead of the variable cand_id. The string creates a URL of 'http://view.example.com/candidatescand_id'.
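For completeness, a corrected sketch of the original attempt (same example host as in the question; input() returns a string, so it can be concatenated directly):

import requests

cand_id = input('Please, type id of askable candidate: ')

# use the variable cand_id (no quotes) and add the missing slash before it
url = 'http://view.example.com/candidates/' + cand_id
r = requests.get(url)
print(r.content)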