Sorry, trying to understand and get used to dictionary and list objects.
I'm calling eBay's API through their ebaysdk, and want to store the items from it to a collection as documents in Mongo. Simple.
Here's a sample of the schema that will be returned:
<timestamp>2009-09-04T00:47:12.456Z</timestamp>
<searchResult count="2">
<item>
<itemId>230371938681</itemId>
<title>Harry Potter and the Order of the Phoenix HD-DVD</title>
<globalId>EBAY-US</globalId>
<primaryCategory>
<categoryId>617</categoryId>
<categoryName>DVD, HD DVD & Blu-ray</categoryName>
</primaryCategory>
I've tried 500 iterations of this code. Stripped down to the basics, here's what I have:
from ebaysdk import finding
from pymongo import MongoClient
api = finding(appid="billy-40d0a7e49d87")
api.execute('findItemsByKeywords', {'keywords': 'potter'})
listings = api.response_dict()
client = MongoClient('mongodb://user:pass@billy.mongohq.com:10099/ebaystuff')
db = client['ebaycollection']
ebay_collection = db.ebaysearch
for key in listings:
    print key
    ebay_collection.insert(key)
This produces the following error:
Traceback (most recent call last):
File "ebay_search.py", line 34, in <module>
ebay_collection.insert(key)
File "/Library/Python/2.7/site-packages/pymongo/collection.py", line 408, in insert
self.uuid_subtype, client)
File "/Library/Python/2.7/site-packages/pymongo/collection.py", line 378, in gen
doc['_id'] = ObjectId()
TypeError: 'str' object does not support item assignment
Simple stuff. All I want to do is add each item as a document.
Iterating over a dictionary yields its keys, which are strings. A string cannot be used as a document because it is immutable and does not allow adding fields such as the _id field Mongo requires. You can instead wrap the string in a dictionary:
key_doc = {'key': key}
ebay_collection.insert(key_doc)
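To make the failure concrete, here is a minimal sketch that needs no database. The `listings` dictionary is a hand-written stand-in shaped like the sample response above (the real ebaysdk layout may differ, and the second item is made up), and `add_id` is a hypothetical helper that only mimics the driver adding an `_id` field:

```python
# Hypothetical response shaped like the sample above; the real
# ebaysdk response_dict() layout may differ. The second item is made up.
listings = {
    "timestamp": "2009-09-04T00:47:12.456Z",
    "searchResult": {
        "count": "2",
        "item": [
            {"itemId": "230371938681",
             "title": "Harry Potter and the Order of the Phoenix HD-DVD"},
            {"itemId": "EXAMPLE-2", "title": "Placeholder listing"},
        ],
    },
}


def add_id(doc, new_id):
    """Hypothetical helper: mimic the driver adding an _id field."""
    doc["_id"] = new_id
    return doc


# Iterating a dict yields its keys, which are plain strings.
keys = list(listings)

# A string does not support item assignment, so this raises TypeError.
try:
    add_id(keys[0], 1)
    string_accepted = True
except TypeError:
    string_accepted = False

# Wrapping the key in a dict works.
wrapped = add_id({"key": keys[0]}, 1)

# The item dictionaries are already document-shaped, so they work as-is.
items = [
    add_id(dict(item), n)
    for n, item in enumerate(listings["searchResult"]["item"], start=2)
]
```

If the goal is one document per listing, looping over the list of item dictionaries (rather than over the top-level keys) is probably what is wanted, assuming the response really nests them this way.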
I am a beginner in Python. I'm trying to create a dictionary from a JSON file that I created earlier. This dictionary must contain the geometry type of the elements that I retrieve via an API. I tried the following code, but I get an error:
with open(filename) as json_file:
    data_raw = json.load(json_file)
data_events = dict(type=data_raw['type'], features=[])
The API response looks like this:
...
"geometry":{
"type":"Point",
"coordinates":[
2.900875,
48.550178
]
},
...
The error I get:
Traceback (most recent call last):
File "<string>", line 25, in <module>
KeyError: 'type'
What should I do?
Notice that the object that you're attempting to access is assigned to the geometry field. These objects are in a list assigned to the records field in the outermost object. So you'll need to use data_raw["records"][(integer index of record you want to access)]["geometry"]["type"] to access the desired field.
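As a rough illustration, the sketch below parses a small hand-written payload. The `records` list and its exact layout are assumptions taken from the description above, not from the real API response, so adjust the keys to whatever the file actually contains:

```python
import json

# Hypothetical payload: the "records" wrapper is assumed from the answer.
raw = """
{
  "records": [
    {"geometry": {"type": "Point", "coordinates": [2.900875, 48.550178]}}
  ]
}
"""
data_raw = json.loads(raw)

# There is no top-level "type" key, which is what triggers the KeyError.
has_top_level_type = "type" in data_raw

# Collect the geometry type of every record.
geometry_types = [record["geometry"]["type"] for record in data_raw["records"]]

# Build the dictionary from the first record's geometry type.
data_events = dict(type=geometry_types[0], features=[])
```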
I've been trying to get attachment image data from documents in Cloudant.
I can successfully do it once a document is selected (direct extract with _id, etc).
Now trying to do it in combination with "query" operation using selector, I run into trouble.
Here is my code.
targetName="chibika33"
targetfile="chibitest.png"
#--------------------------------------------------
# get all the documents with the specific nameField
#--------------------------------------------------
myDatabase.create_query_index(fields = ['nameField'])
selector = {'nameField': {'$eq': targetName}}
docs = myDatabase.get_query_result(selector)
#--------------------------------------------------
# get the attachment files to them, save it locally
#--------------------------------------------------
count = 0
for doc in docs:
    count=count+1
    result_filename="result%03d.png"%(count)
    dataContent = doc.get_attachment(targetfile, attachment_type='binary')
    dataContentb =base64.b64decode(dataContent)
    with open(result_filename,'wb') as output:
        output.write(dataContentb)
This causes the following error:
Traceback (most recent call last):
File "view8.py", line 44, in <module>
dataContent = doc.get_attachment(targetfile, attachment_type='binary')
AttributeError: 'dict' object has no attribute 'get_attachment'
So far, I've been unable to find any API for converting a dict to a document object in the python-cloudant documentation: http://python-cloudant.readthedocs.io/en/latest/index.html
Any advice would be highly appreciated.
The returned structure from get_query_result(...) isn't an array of documents.
Try:
resp = myDatabase.get_query_result(selector)
for doc in resp['docs']:
    # your code here
See the docs at:
http://python-cloudant.readthedocs.io/en/latest/database.html#cloudant.database.CloudantDatabase.get_query_result
I am reading a json file with dictionary and values, but I am battling to use a variable as a query item when searching the json file.
x = value_cloud = "%s%s%s" % (["L1_METADATA_FILE"],["IMAGE_ATTRIBUTES"],["CLOUD_COVER"])
for meta in filelist(dir):
    with open (meta) as data_file:
        data = json.load(data_file)
    cloud = str(data[x])
The error I get is:
Traceback (most recent call last):
File "E:\SAMPLE\Sample_Script_AWS\L8_TOA_using_gdal_rasterio.py", line 96, in <module>
cloud = str(data[x])
KeyError: "['L1_METADATA_FILE']['IMAGE_ATTRIBUTES']['CLOUD_COVER']"
What I actually want is to search the json file for the key in the variable...
The keys do exist in the json file because when I run the following I get the correct output.
cloud = str(data["L1_METADATA_FILE"]["IMAGE_ATTRIBUTES"]["CLOUD_COVER"])
print cloud
My knowledge of python is sketchy, and I am passing the variable through as a string and not an expression or object and therefore it gives me that error. What is the correct way to create the variable and call the keys that I want.
Thanks in advance!
Your key ends up including the brackets in the string, which is where the error comes from: data[x] looks for a single key named "['L1_METADATA_FILE']['IMAGE_ATTRIBUTES']['CLOUD_COVER']". If you put each key in its own variable, like this:
x, y, z = "L1_METADATA_FILE", "IMAGE_ATTRIBUTES" , "CLOUD_COVER"
and then:
cloud = str(data[x][y][z])
it should avoid any errors.
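If the whole path should live in one variable, one option is to store the keys as a tuple and walk it one level at a time. This is only a sketch: `get_nested` is a hypothetical helper name, and the cloud cover value is made up for the example.

```python
from functools import reduce
import operator


def get_nested(data, path):
    """Follow a sequence of keys into nested dictionaries."""
    # reduce applies data[key] for each key in turn, going one level deeper.
    return reduce(operator.getitem, path, data)


# Stand-in for the parsed JSON file; the value 12.5 is made up.
data = {"L1_METADATA_FILE": {"IMAGE_ATTRIBUTES": {"CLOUD_COVER": 12.5}}}

cloud_path = ("L1_METADATA_FILE", "IMAGE_ATTRIBUTES", "CLOUD_COVER")
cloud = str(get_nested(data, cloud_path))
```

A missing key anywhere along the path still raises KeyError, the same as the chained lookup would.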
I'm attempting to write a program that utilizes urllib2 to parse HTML, and then utilizes PyRSS2Gen to create the RSS feed, in XML.
I keep getting the error
Traceback (most recent call last):
File "pythonproject.py", line 46, in <module>
get_rss()
File "pythonproject.py", line 43, in get_rss
rss.write_xml(open("cssnews.rss.xml", "w"))
File "build/lib/PyRSS2Gen.py", line 34, in write_xml
self.publish(handler)
File "build/lib/PyRSS2Gen.py", line 380, in publish
item.publish(handler)
File "build/lib/PyRSS2Gen.py", line 427, in publish
_opt_element(handler, "title", self.title)
File "build/lib/PyRSS2Gen.py", line 58, in _opt_element
_element(handler, name, obj)
File "build/lib/PyRSS2Gen.py", line 53, in _element
obj.publish(handler)
AttributeError: 'builtin_function_or_method' object has no attribute 'publish'
upon trying to run it.
From what I could find, other users came across this issue when trying to create a new tag for the XML, but I am trying to use the default tags given with PyRSS2Gen. Inspecting the PyRSS2Gen.py file shows the write_xml() command I am using, so is the error with how I am assigning values to the rss items by popping them from a list?
def get_rss():
    sys.path.append('build/lib')
    from PyRSS2Gen import RSS2, RSSItem
    rss = RSS2(
        title = 'Python RSS Creator',
        link = 'technews.acm.org',
        description = 'Creates RSS out of HTML',
        items = [],
    )
    for x in range(0, len(rssTitles)):
        rss.items.append(RSSItem(
            title = rssTitles.pop,
            link = rssLinks.pop,
            description = rssDesc.pop,
        ))
    rss.write_xml(open("cssnews.rss.xml", "w"))
# 5 - Call function
get_rss()
I ended up just writing out to a file, like so:
news = open("news.rss.xml", "w")
news.write("<?xml version=\"1.0\" ?>")
news.write("\n")
news.write("<rss xmlns:atom=\"http://www.w3.org/2005/Atom\" version=\"2.0\">")
news.write("\n")
news.write("<channel>")
news.write("\n")
etc.
PyRSS2Gen relies on its inputs to either be strings or to have a publish method that does all the necessary conversion.
In this case, you forgot to call the pop method on rssTitles, so you passed the method itself rather than a string. Adding () after each pop (rssTitles.pop(), rssLinks.pop(), rssDesc.pop()) should give you a usable program.
Note that similar errors can also crop up when other non-string items are around (e.g. byte strings); the AttributeError line gives you a hint as to the type of object that went into the RSS item, and the backtrace indicates where in the RSS item it is (the title, in this case).
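The difference between referencing a method and calling it can be seen without PyRSS2Gen at all. In this sketch the lists and their contents are made up, and plain dictionaries stand in for the RSS items:

```python
# Made-up stand-ins for the scraped lists.
rss_titles = ["First headline", "Second headline"]
rss_links = ["http://example.com/1", "http://example.com/2"]

# Without parentheses you get the bound method, and the list is unchanged.
not_called = rss_titles.pop
is_string = isinstance(not_called, str)
length_after_reference = len(rss_titles)

# With parentheses the method runs and returns the last element.
items = []
while rss_titles:
    items.append({"title": rss_titles.pop(), "link": rss_links.pop()})
```

Note that pop() takes elements from the end of the list, so the items come out in reverse order; iterating with zip(rss_titles, rss_links) would keep the original order instead.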
I want to read a BSON format Mongo dump in Python and process the data. I am using the Python bson package (which I'd prefer to use rather than have a pymongo dependency), but it doesn't explain how to read from a file.
This is what I'm trying:
bson_file = open('statistics.bson', 'rb')
b = bson.loads(bson_file)
print b[0]
But I get:
Traceback (most recent call last):
File "test.py", line 11, in <module>
b = bson.loads(bson_file)
File "/Library/Python/2.7/site-packages/bson/__init__.py", line 75, in loads
return decode_document(data, 0)[1]
File "/Library/Python/2.7/site-packages/bson/codec.py", line 235, in decode_document
length = struct.unpack("<i", data[base:base + 4])[0]
TypeError: 'file' object has no attribute '__getitem__'
What am I doing wrong?
I found this worked for me with a mongodb 2.4 BSON file and PyMongo's 'bson' module:
import bson
with open('survey.bson','rb') as f:
    data = bson.decode_all(f.read())
That returned a list of dictionaries matching the JSON documents stored in that mongo collection.
The raw bytes returned by f.read() (stored here as rawdata) look like this:
>>> rawdata[:100]
'\x04\x01\x00\x00\x12_id\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02_type\x00\x07\x00\x00\x00simple\x00\tchanged\x00\xd0\xbb\xb2\x9eI\x01\x00\x00\tcreated\x00\xd0L\xdcfI\x01\x00\x00\x02description\x00\x14\x00\x00\x00testing the bu'
The documentation states :
> help(bson.loads)
Given a BSON string, outputs a dict.
You need to pass a string. For example:
> b = bson.loads(bson_file.read())
loads expects a string (that's what the 's' stands for), not a file. Try reading from the file, and passing the result to loads.
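One more detail matters for dump files: a mongodump .bson file holds many documents back to back, and each document starts with its total length as a little-endian 32-bit integer. A function that decodes a single document will therefore only see the first one. The sketch below splits the raw bytes into one chunk per document using only the standard library; `split_bson_documents` is a hypothetical helper name, and the two sample documents are built by hand for the example.

```python
import struct


def split_bson_documents(data):
    """Split concatenated BSON bytes into one bytes object per document."""
    documents = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated BSON data at offset %d" % offset)
        # Each document begins with its total length (little-endian int32),
        # and that length includes the 4 length bytes themselves.
        (length,) = struct.unpack_from("<i", data, offset)
        if length < 5 or offset + length > len(data):
            raise ValueError("corrupt BSON length at offset %d" % offset)
        documents.append(data[offset:offset + length])
        offset += length
    return documents


# Two hand-built documents: {} and {"a": 1} (with an int32 value).
empty_doc = b"\x05\x00\x00\x00\x00"
one_field = b"\x0c\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x00"

parts = split_bson_documents(empty_doc + one_field)
```

Each chunk could then be passed to a function that decodes one document at a time, after reading the file contents with read() in binary mode as described above.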