So basically, I have an api from which i have several dictionaries/arrays. (http://dev.c0l.in:5984/income_statements/_all_docs)
When getting the financial information for each company from the api (e.g. sector = technology and statement = income) python is supposed to return 614 technology companies, however i get this error:
Traceback (most recent call last):
File "C:\Users\samuel\Desktop\Python Project\Mastercopy.py", line 83, in <module>
user_input1()
File "C:\Users\samuel\Desktop\Python Project\Mastercopy.py", line 75, in user_input1
income_statement_fn()
File "C:\Users\samuel\Desktop\Python Project\Mastercopy.py", line 51, in income_statement_fn
if is_response ['sector'] == user_input3:
KeyError: 'sector'
on a random company (usually on one of the 550-600th ones)
Here is the function for income statements
def income_statement_fn():
user_input3 = raw_input("Which sector would you like to iterate through in Income Statement?: ")
print 'Starting...'
for item in income_response['rows']:
is_url = "http://dev.c0l.in:5984/income_statements/" + item['id']
is_request = urllib2.urlopen(is_url).read()
is_response = json.loads(is_request)
if is_response ['sector'] == user_input3:
csv.writerow([
is_response['company']['name'],
is_response['company']['sales'],
is_response['company']['opening_stock'],
is_response['company']['purchases'],
is_response['company']['closing_stock'],
is_response['company']['expenses'],
is_response['company']['interest_payable'],
is_response['company']['interest_receivable']])
print 'loading...'
print 'done!'
print end - start
Any idea what could be causing this error?
(I don't believe that it is the api itself)
Cheers
Well, on testing the url you pass in the urlopen call, with a random number, I got this:
{"error":"not_found","reason":"missing"}
In that case, your function will return exactly the error you get. If you want your program to handle the error nicely and add a "missing" line instead of actual data, you could do that for instance:
def income_statement_fn():
user_input3 = raw_input("Which sector would you like to iterate through in Income Statement?: ")
print 'Starting...'
for item in income_response['rows']:
is_url = "http://dev.c0l.in:5984/income_statements/" + item['id']
is_request = urllib2.urlopen(is_url).read()
is_response = json.loads(is_request)
if is_response.get('sector', False) == user_input3:
csv.writerow([
is_response['company']['name'],
is_response['company']['sales'],
is_response['company']['opening_stock'],
is_response['company']['purchases'],
is_response['company']['closing_stock'],
is_response['company']['expenses'],
is_response['company']['interest_payable'],
is_response['company']['interest_receivable']])
print 'loading...'
else:
csv.writerow(['missing data'])
print 'done!'
print end - start
The problem seems to be with the final row of your income_response data
{"id":"_design/auth","key":"_design/auth","value":{"rev":"1-3d8f282ec7c26779194caf1d62114dc7"}}
This does not have a sector value. You need to alter your code to handle this line, for example by ignoring any line where the sector key is not present.
You could easily have debugged this with a few print statements - for example insert
print item['id'], is_response.get('sector', None)
into your code before the part that outputs the CSV.
A KeyError means that the key you tried to use does not exist in the dictionary. When checking for a key, it is much safer to use .get(). So you would replace this line:
if is_response['sector'] == user_input3:
With this:
if is_response.get('sector') == user_input3:
Related
I am trying to create a telegram-bot that will create notes in notion, for this I use:
notion-py
pyTelegramBotAPI
then I connected my notion by adding token_v2, and then receiving data about the note that I want to save in notion, at the end I save a note on notion like this:
def make_notion_row():
collection_view = client.get_collection_view(list_url[temporary_category]) #take collection
print(temporary_category)
print(temporary_name)
print(temporary_link)
print(temporary_subcategory)
print(temporary_tag)
row = collection_view.collection.add_row() #make row
row.ssylka = temporary_link #this is link
row.nazvanie_zametki = temporary_name #this is name
if temporary_category == 0: #this is category, where do I want to save the note
row.stil = temporary_subcategory #this is subcategory
tags = temporary_tag.split(',') #temporary_tags is text that has many tags separated by commas. I want to get these tags as an array
for tag_one in tags:
**add_new_multi_select_value("Теги", tag_one): #"Теги" is "Tag column" in russian. in this situation, tag_one takes on the following values: ['my_hero_academia','midoria']**
else:
row.kategoria = temporary_subcategory
this script works, but the problem is filling in the Tags column which is of type multi-select.
Since in the readme 'notion-py', nothing was said about filling in the 'multi-select', therefore
I used the bkiac function:https://github.com/jamalex/notion-py/issues/51
here is the slightly modified by me function:
art_tags = ['ryuko_matoi', 'kill_la_kill']
def add_new_multi_select_value(prop, value, style=None):
global temporary_prop_schema
if style is None:
style = choice(art_tags)
collection_schema = collection_view.collection.get(["schema"])
prop_schema = next(
(v for k, v in collection_schema.items() if v["name"] == prop), None
)
if not prop_schema:
raise ValueError(
f'"{prop}" property does not exist on the collection!'
)
if prop_schema["type"] != "multi_select":
raise ValueError(f'"{prop}" is not a multi select property!')
dupe = next(
(o for o in prop_schema["options"] if o["value"] == value), None
)
if dupe:
raise ValueError(f'"{value}" already exists in the schema!')
temporary_prop_schema = prop_schema
prop_schema["options"].append(
{"id": str(uuid1()), "value": value, "style": style}
)
collection.set("schema", collection_schema)`
But it turned out that this function does not work, and gives the following error:
add_new_multi_select_value("Теги","my_hero_academia)
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
add_new_multi_select_value("Теги","my_hero_academia)
File "C:\Users\laere\OneDrive\Documents\Programming\Other\notion-bot\program\notionbot\test.py", line 53, in add_new_multi_select_value
collection.set("schema", collection_schema)
File "C:\Users\laere\AppData\Local\Programs\Python\Python39-32\lib\site-packages\notion\records.py", line 115, in set
self._client.submit_transaction(
File "C:\Users\laere\AppData\Local\Programs\Python\Python39-32\lib\site-packages\notion\client.py", line 290, in submit_transaction
self.post("submitTransaction", data)
File "C:\Users\laere\AppData\Local\Programs\Python\Python39-32\lib\site-packages\notion\client.py", line 260, in post
raise HTTPError(
requests.exceptions.HTTPError: Unsaved transactions: Not allowed to edit column: schema
this is my table image: link
this is my telegram chatting to bot: link
Honestly, I don’t know how to solve this problem, the question is how to fill a column of type 'multi-select'?
I solved this problem using this command
row.set_property("Категория", temporary_subcategory)
and do not be afraid if there is an error "options ..." this can be solved by adding settings for the 'multi-select' field.
I’m trying to scrape some data of TripAdvisor.
I'm interested to get the "Price Range/ Cuisine & Meals" of restaurants.
So I use the following xpath to extract each of this 3 lines in the same class :
response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__categoryTitle--14zKt"]/text()').extract()[1]
I'm doing the test directly in scrapy shell and it's working fine :
scrapy shell https://www.tripadvisor.com/Restaurant_Review-g187514-d15364769-Reviews-La_Gaditana_Castellana-Madrid.html
But when I integrate it to my script, I've the following error :
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/usr/lib64/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
for x in result:
File "/usr/lib64/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
return (_set_referer(r) for r in result or ())
File "/usr/lib64/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
return (r for r in result or () if _filter(r))
File "/usr/lib64/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
return (r for r in result or () if _filter(r))
File "/root/Scrapy_TripAdvisor_Restaurant-master/tripadvisor_las_vegas/tripadvisor_las_vegas/spiders/res_las_vegas.py", line 64, in parse_listing
(response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()')[1])
File "/usr/lib/python3.6/site-packages/parsel/selector.py", line 61, in __getitem__
o = super(SelectorList, self).__getitem__(pos)
IndexError: list index out of range
I paste you part of my code and I explain it below :
# extract restaurant cuisine
row_cuisine_overviewcard = \
(response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__categoryTitle--14zKt"]/text()')[1])
row_cuisine_card = \
(response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()')[1])
if (row_cuisine_overviewcard == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__tagText--1XLfi"]/text()')[1]
elif (row_cuisine_card == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-details-card-TagCategories__tagText--2170b"]/text()')[1]
else:
cuisine = None
In tripAdvisor restaurants, there is 2 different type of pages, with 2 different format.
The first with a class overviewcard, an the second, with a class cards
So I want to check if the first is present (overviewcard), if not, execute the second (card), and if not, put "None" value.
:D But looks like Python execute both .... and as the second one don't exist in the page, the script stop.
Could it be an indentation error ?
Thanks for your help
Regards
Your second selector (row_cuisine_card) fails because the element does not exist on the page. When you then try to access [1] in the result it throws an error because the result array is empty.
Assuming you really want item 1, try this
row_cuisine_overviewcard = \
(response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__categoryTitle--14zKt"]/text()')[1])
# Here we get all the values, even if it is empty.
row_cuisine_card = \
(response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()').getall())
if (row_cuisine_overviewcard == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__tagText--1XLfi"]/text()')[1]
# Here we check first if that result has more than 1 item, and then we check the value.
elif (len(row_cuisine_card) > 1 and row_cuisine_card[1] == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-details-card-TagCategories__tagText--2170b"]/text()')[1]
else:
cuisine = None
You should apply the same kind of safety checking whenever you try to get a specific index from a selector. In other words, make sure you have a value before you access it.
Your problem is already in your check in this line_
row_cuisine_card = \
(response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()')[1])
You are trying to extract a value from the website that may not exist. In other words, if
response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()')
returns no or only one element, then you cannot access the second element in the returned list (which you want to access with the appended [1]).
I would recommend storing the values that you extract from the website into a local variable first in order to then check whether or not a value that you want has been found. My guess is that the page it breaks on does not have the information you want.
This could roughly look like the following code:
# extract restaurant cuisine
cuisine = None
cuisine_overviewcard_sections = response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__categoryTitle--14zKt"]/text()'
if len(cuisine_overviewcard_sections) >= 2:
row_cuisine_overviewcard = cuisine_overviewcard_sections[1]
cuisine_card_sections = response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()'
if len(cuisine_card_sections) >= 2:
row_cuisine_card = cuisine_card_sections[1]
if (row_cuisine_overviewcard == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__tagText--1XLfi"]/text()')[1]
elif (row_cuisine_card == "CUISINES"):
cuisine = \
response.xpath('//div[#class="restaurants-details-card-TagCategories__tagText--2170b"]/text()')[1]
Since you only need a part of the information, if the first XPath check already returns the correct answer, the code can be beautified a bit:
# extract restaurant cuisine
cuisine = None
cuisine_overviewcard_sections = response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__categoryTitle--14zKt"]/text()'
if len(cuisine_overviewcard_sections) >= 2 and cuisine_overviewcard_sections[1] == "CUISINES":
cuisine = \
response.xpath('//div[#class="restaurants-detail-overview-cards-DetailsSectionOverviewCard__tagText--1XLfi"]/text()')[1]
else:
cuisine_card_sections = response.xpath('//div[#class="restaurants-details-card-TagCategories__categoryTitle--o3o2I"]/text()'
if len(cuisine_card_sections) >= 2 and cuisine_card_sections[1] == "CUISINES":
cuisine = \
response.xpath('//div[#class="restaurants-details-card-TagCategories__tagText--2170b"]/text()')[1]
This way you only do a (potentially expensive) XPath search when it actually is necessary.
I've got a weird problem. I've made a vba code in excel, that calls a python code that get information from the excel sheet and put it into a database. Yesterday there was no problem. Today I start my computer and tried the vba code and it errors in the python file.
The error:
testchipnr = TC104
Traceback (most recent call last):
testchipID = 108
File "S:/3 - Technical/13 - Reports & Templates/13 - Description/DescriptionToDatabase/DescriptionToDatabase.py", line 40, in <module>
TestchipID = cursorOpenShark.fetchone()[0] # Fetch a single row using fetchone() method and store the result in a variable., the [0] fetches only 1 value
TypeError: 'NoneType' object has no attribute '__getitem__'
The weird thing is that there is a value in the database -> testchipID ...
My code:
#get the testchipID and the testchipname
testchipNr = sheet.cell(7, 0).value # Get the testchipnr
print "testchipnr = ", testchipNr
queryTestchipID = """SELECT testchipid FROM testchip WHERE nr = '%s'""" %(testchipNr)
cursorOpenShark.execute(queryTestchipID)
print "testchipID = ", cursorOpenShark.fetchone()[0]
TestchipID = cursorOpenShark.fetchone()[0] # Fetch a single row using fetchone() method and store the result in a variable., the [0] fetches only 1 value
As you are saying, it was working fine but now it is not, the only reason that I can think of is when your query is returning Zero results.
Looking at your code, if your query is returning only one result, then it will print the result and the next time you are making the call to cursorOpenShark.fetchone() to store in TestchipID it would return None.
Instead, can you try the following code
#get the testchipID and the testchipname
testchipNr = sheet.cell(7, 0).value # Get the testchipnr
print "testchipnr = ", testchipNr
queryTestchipID = """SELECT testchipid FROM testchip WHERE nr = '%s'""" %(testchipNr)
cursorOpenShark.execute(queryTestchipID)
TestchipID = cursorOpenShark.fetchone()[0] # Fetch a single row using fetchone() method and store the result in a variable., the [0] fetches only 1 value
if TestchipID:
print "testchipID = ", TestchipID[0]
else:
print "Returned nothing"
Let me know if this works.
I have the following code that uses 3 strings 'us dollars','euro', '02-11-2014',
and a number to calculate the exchange rate for that given date. I modified the
code to pass those arguments but I get an error when I try to call it with
python currencyManager.py "us dollars" "euro" 100 "02-11-2014"
Traceback (most recent call last):
File "currencyManager.py", line 37. in <module>
currencyManager(currTo,currFrom,currAmount,currDate)
NameError: name 'currTo' is not defined
I'm fairly new to Python so my knowledge is limited. Any help would be greatly appreciated. Thanks.
Also the version of Python I'm using is 3.4.2.
import urllib.request
import re
def currencyManager(currTo,currFrom,currAmount,currDate):
try:
currency_to = currTo #'us dollars'
currency_from = currFrom #'euro'
currency_from_amount = currAmount
on_date = currDate # Day-Month-Year
currency_from = currency_from.replace(' ', '+')
currency_to = currency_to.replace(' ', '+')
url = 'http://www.wolframalpha.com/input/?i=' + str(currency_from_amount) + '+' + str(currency_from) + '+to+' + str(currency_to) + '+on+' + str(on_date)
req = urllib.request.Request(url)
output = ''
urllib.request.urlopen(req)
page_fetch = urllib.request.urlopen(req)
output = page_fetch.read().decode('utf-8')
search = '<area shape="rect.*href="\/input\/\?i=(.*?)\+.*?&lk=1'
result = re.findall(r'' + search, output, re.S)
if len(result) > 0:
amount = float(result[0])
print(str(amount))
else:
print('No match found')
except URLError as e:
print(e)
currencyManager(currTo,currFrom,currAmount,currDate)
The command line
python currencyManager.py "us dollars" "euro" 100 "02-11-2014"
does not automatically assign "us dollars" "euro" 100 "02-11-2014" to currTo,currFrom,currAmount,currDate.
Instead the command line arguments are stored in a list, sys.argv.
You need to parse sys.argv and/or pass its values on to the call to currencyManager:
For example, change
currencyManager(currTo,currFrom,currAmount,currDate)
to
import sys
currencyManager(*sys.argv[1:5])
The first element in sys.argv is the script name. Thus sys.argv[1:5] consists of the next 4 arguments after the script name (assuming 4 arguments were entered on the command line.) You may want to check that the right number of arguments are passed on the command line and that they are of the right type. The argparse module can help you here.
The * in *sys.argv[1:5] unpacks the list sys.argv[1:5] and passes the items in the list as arguments to the function currencyManager.
For my IRC bot when I try to use the `db addcol command I will get that Index error but I got no idea what is wrong with it.
#hook.command(adminonly=True, autohelp=False)
def db(inp,db=None):
split = inp.split(' ')
action = split[0]
if "init" in action:
result = db.execute("create table if not exists users(nick primary key, host, location, greeting, lastfm, fines, battlestation, desktop, horoscope, version)")
db.commit()
return result
elif "addcol" in action:
table = split[1]
col = split[2]
if table is not None and col is not None:
db.execute("ALTER TABLE {} ADD COLUMN {}".format(table,col))
db.commit
return "Added Column"
That is the command I am trying to execute and here is the error:
Unhandled exeption in thread started by <function run at 0xb70d6844
Traceback (most recent call last):
File "core/main.py", line 68, in run
out = func(input.inp, **kw)
File "plugins/core_admin_global.py", :ine 308, in db
col = split[2]
IndexError: list index out of range
You will find the whole code at my GIT repository.
Edit: This bot is just a little thing I have been playing around with while learning python so don't expect me to be too knowledgeable about this.
Yet another edit:
The command I am trying to add, just replace desktop with mom.
#hook.command(autohelp=False)
def desktop(inp, nick=None, conn=None, chan=None,db=None, notice=None):
"desktop http://url.to/desktop | # nick -- Shows a users Desktop."
if inp:
if "http" in inp:
database.set(db,'users','desktop',inp.strip(),'nick',nick)
notice("Saved your desktop.")
return
elif 'del' in inp:
database.set(db,'users','desktop','','nick',nick)
notice("Deleted your desktop.")
return
else:
if '#' in inp: nick = inp.split('#')[1].strip()
else: nick = inp.strip()
result = database.get(db,'users','desktop','nick',nick)
if result:
return '{}: {}'.format(nick,result)
else:
if not '#' in inp: notice(desktop.__doc__)
return 'No desktop saved for {}.'.format(nick)