Why does CouchDb-python (or do I) confuse strings and dictionaries? - python

I'm trying to use the Python wrapper for CouchDB to update a database. The file is structured as a nested dictionary as follows.
doc = { ...,
'RLSoo': {'RT_freq': 2, 'tweet': "They're going to play monopoly now.
This makes me feel like an excellent mother. #Sandy #NYC"},
'GiltCityNYC': {},
....}
I would like to put each entry of the larger dictionary, for example RLSoo, into its own document. However, I get an error message when I try the following code.
for key in doc:
    db.update(doc[key], all_or_nothing=True)
Error Message
TypeError: expected dict, got <type 'str'>
I don't understand why CouchDB won't accept the dictionary.

According to the implementation and documentation of the Database.update() method, its first argument should be a list of document objects (e.g. a list of dicts). Since what you pass has dict type, direct iteration over it iterates over its keys, which are strings. If I understood your case correctly, your doc contains the nested documents as values. So, just try:
db.update(doc.values(), all_or_nothing=True)
If all first-level values are dicts, it should work!
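To see why the strings show up, here is a minimal sketch in plain Python, without calling CouchDB at all. It reuses the names from the question with a shortened tweet, and assumes only that update() iterates over whatever it is given:

```python
# Shortened version of the nested dictionary from the question.
doc = {
    'RLSoo': {'RT_freq': 2, 'tweet': "They're going to play monopoly now."},
    'GiltCityNYC': {},
}

# Iterating over a single nested dict yields its keys, which are strings.
# This is what a function expecting a list of documents would see.
keys_seen = [item for item in doc['RLSoo']]

# The values of the outer dict are the documents themselves.
documents = list(doc.values())
all_dicts = all(isinstance(d, dict) for d in documents)

print(keys_seen)
print(all_dicts)
```

Passing documents (or doc.values()) therefore hands over a sequence of dicts, while passing doc[key] hands over something that iterates as strings.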

Related

How to see the content of a Gensim-generated dictionary?

I am running topic modeling using Gensim. Before creating the document-term matrix, one needs to create a dictionary of tokens.
dictionary = corpora.Dictionary(tokenized_reviews)
doc_term_matrix = [dictionary.doc2bow(rev) for rev in tokenized_reviews]
But, I don't understand what kind of object "dictionary" is.
So, when I type:
type(dictionary)
I get
gensim.corpora.dictionary.Dictionary
Is this a dictionary (a kind of data structure)? If so, why can't I see the content? (I am just curious.)
When I type
dictionary
I get:
<gensim.corpora.dictionary.Dictionary at 0x1bac985ebe0>
The same issue exists with some of the objects in NLTK.
If this is a dictionary (as a data structure), why am I not able to see the keys and values as with any other Python dictionary?
Thanks,
Navid
This is a specific Dictionary class implemented by the Gensim project.
It will be very similar in interface to the standard Python dict (and other various Dictionary/HashMap/etc types you may have used elsewhere).
However, to see exactly what it can do, you should consult the class-specific documentation:
https://radimrehurek.com/gensim/corpora/dictionary.html
Like a dict, you can do typical operations:
len(dictionary) # gets number of entries
dictionary[key] # gets the value (word) stored at a certain key (integer id)
dictionary.keys() # gets all stored keys
The reason you see a generic <gensim.corpora.dictionary.Dictionary at 0x1bac985ebe0> when you try to display the value of the dictionary itself is that it hasn't defined any convenience display-string with more info, so you're seeing the default for any random Python object. (Such dictionaries are usually far too large to usefully dump their full contents whenever asked, generically, to "show yourself".)
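The display behaviour described above can be reproduced with ordinary classes. The sketch below uses two hypothetical stand-in classes (they are not Gensim's code): one leaves the default display in place, the other defines a short summary string.

```python
class PlainVocabulary:
    """Hypothetical stand-in for a vocabulary class with no custom display."""

    def __init__(self, token2id):
        self.token2id = dict(token2id)


class DescribedVocabulary(PlainVocabulary):
    """Same data, but with a short summary instead of a full dump."""

    def __repr__(self):
        return "DescribedVocabulary(%d unique tokens)" % len(self.token2id)


plain = PlainVocabulary({"good": 0, "phone": 1})
described = DescribedVocabulary({"good": 0, "phone": 1})

# Default display: class name plus memory address, no contents.
plain_text = repr(plain)
# Custom display: whatever __repr__ returns.
described_text = repr(described)

print(plain_text)
print(described_text)
```

Either way the contents stay reachable through the object's own attributes and methods, which is why the class documentation is the place to look.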

Python convert named string fields to tuple

Similar to this question: Tuple declaration in Python
I have this function:
import os
def get_mouse():
    # Get: x:4631 y:506 screen:0 window:63557060
    mouse = os.popen("xdotool getmouselocation").read().splitlines()
    print(mouse)
    return mouse
When I run it it prints:
['x:2403 y:368 screen:0 window:60817757']
I can split the line and create 4 separate fields in a list but from Python code examples I've seen I feel there is a better way of doing it. I'm thinking something like x:= or window:=, etc.
I'm not sure how to properly define these "named tuple fields" nor how to reference them in subsequent commands?
I'd like to read more on the whole subject if there is a reference link handy.
It seems it would be a better option to use a dictionary here. Dictionaries allow you to set a key, and a value associated to that key. This way you can call a key such as dictionary['x'] and get the corresponding value from the dictionary (if it exists!)
data = ['x:2403 y:368 screen:0 window:60817757'] #Your return data seems to be stored as a list
result = dict(d.split(':') for d in data[0].split())
result['x']
#'2403'
result['window']
#'60817757'
You can read more about a few of these topics, such as:
Comprehensions
Dictionaries
Happy learning!
Try:
dict(el.split(':') for el in mouse[0].split())
This should give you a dict rather than tuples (note that dicts are mutable and require their keys to be hashable):
{'x': '2403', 'y': '368', ...}
Also the splitlines is probably not needed, as you are only reading one line. You could do something like:
mouse = [os.popen( "xdotool getmouselocation" ).read()]
Though I don't know what xdotool getmouselocation does or if it could ever return multiple lines.
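Since the question asks about named fields, here is a sketch that builds on the dict approach using collections.namedtuple from the standard library. The names MouseLocation and parse_mouse_location are made up for illustration, and the code assumes the output always contains exactly the four fields shown in the question, each with an integer value:

```python
from collections import namedtuple

# Hypothetical record type; field names mirror the sample output.
MouseLocation = namedtuple("MouseLocation", ["x", "y", "screen", "window"])


def parse_mouse_location(line):
    """Turn 'x:2403 y:368 screen:0 window:60817757' into a MouseLocation."""
    # Split on whitespace, then split each 'name:value' pair on the colon.
    fields = dict(part.split(":") for part in line.split())
    # Convert every value to int and pass them as keyword arguments.
    return MouseLocation(**{name: int(value) for name, value in fields.items()})


location = parse_mouse_location("x:2403 y:368 screen:0 window:60817757")
print(location.x, location.y)
```

The fields are then available as location.x, location.window and so on, and the result still behaves like an ordinary tuple.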

Error with unhashable type while using TweetTokenize

I start by downloading some tweets from Twitter.
tweet_text = DonaldTrump["Tweets"]
tweet_text = tweet_text.str.lower()
In the next step, I tokenize them with TweetTokenizer.
Tweet_tkn = TweetTokenizer()
tokens = [Tweet_tkn.tokenize(t) for t in tweet_text]
tokens[0:3]
Can someone explain this to me and help me solve it?
I have been through similar questions about similar errors, but they provide different solutions.
Lists are mutable and can therefore not be used as dict keys. Otherwise, the program could add a list to a dictionary, change its value, and it is now unclear whether the value in the dictionary should be available under the new or the old list value, or neither.
If you want to use structured data as keys, you need to convert them to immutable types first, such as tuple or frozenset. For non-nested objects, you can simply use tuple(obj). For a simple list of lists, you can use this:
tuple(tuple(elem) for elem in obj)
But for an arbitrary structure, you will have to use recursion.
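One possible shape for that recursion is sketched below. The helper name to_hashable is made up for illustration, and the sample data is invented; it handles lists, tuples, sets and dicts and leaves everything else untouched, assuming the leaves are already hashable:

```python
def to_hashable(obj):
    """Recursively convert nested containers into immutable equivalents."""
    if isinstance(obj, (list, tuple)):
        return tuple(to_hashable(elem) for elem in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(to_hashable(elem) for elem in obj)
    if isinstance(obj, dict):
        # Represent a dict as a frozenset of (key, converted value) pairs.
        return frozenset((key, to_hashable(value)) for key, value in obj.items())
    # Strings, numbers, None and similar are assumed to be hashable already.
    return obj


nested = [["make", "america"], ["great", ["again"]]]
key = to_hashable(nested)

# The converted value can now be used as a dict key.
counts = {key: 1}
print(counts[to_hashable(nested)])
```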

How do i access the sub details in the JSON using python?

The Python program that I am writing calls an API that returns this JSON:
Code Output
How do I access the subdetails? When I run .keys(), it only lists those three top levels. I want to be able to get specific items, e.g. "Utility".
I've tried several solutions but none parse correctly. I have tried calling the list inside the dictionary, to no avail. Originally I thought it was a dictionary inside of a dictionary, but Python thinks it's a list nested in a dictionary.
Any help would be appreciated!
The keys() function only returns the keys of the dictionary, so if you call keys(), it will only return those three results. The "subdetails" you are referring to are the values of those keys. Taking the key "SUMMARY" as an example, its value is a list rather than a dict (note the "[" after the key). However, the list only has a single element. This is quite common in JSON. To retrieve "Utility", all you need to do is data['SUMMARY'][0]['Utility']
To understand the data structure better, call the values() and items() functions and see what they return.
Since it's a dict of lists of dicts, simply use an index of 0 to access the first item of the list if there is always only one item in each list. For example, if your JSON object is stored as variable data, then the value of Utility can be accessed with data['SUMMARY'][0]['Utility'].
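Because the actual API response is not shown, the sketch below uses a made-up payload: only the "SUMMARY" and "Utility" names come from the question, and the remaining keys and values are placeholders chosen for illustration.

```python
import json

# Hypothetical payload with the same shape: a dict whose values are
# one-element lists, each holding a dict.
raw = '{"SUMMARY": [{"Utility": "Example Power Co", "Rate": 0.12}]}'
data = json.loads(raw)

top_level_keys = list(data.keys())       # only the outer keys
summary = data["SUMMARY"]                # a list, not a dict
utility = data["SUMMARY"][0]["Utility"]  # index the list, then the inner dict

print(top_level_keys)
print(type(summary).__name__)
print(utility)
```

If a list could ever hold more than one element, looping over data["SUMMARY"] instead of hard-coding index 0 would be the safer choice.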

json dump with dict as value

This question clearly explains how to send a response using JSON and Python dicts. The example, however, uses a string as the value type in the dict. How would one do this with a dict as the value type? That is, a dict with a dict as a value.
To clarify adityasdarma1's comment: this is not a limitation of Python or Django, but of JSON. In JSON, object keys must always be strings. There is no "tuple" type in JSON or JavaScript anyway; and arrays can't be keys because they are mutable. (In Python, tuples can be dict keys, but lists can't.)
I'm not sure why you would need that, though. You can either concatenate the values in some way to make a string key - eg "bar-baz" - or alternatively you might need a more complex nested structure, with bar as the key of the outer dict and baz as an inner one. Without seeing your full data structure, it's hard to advise further.
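Both workarounds can be sketched with the standard json module. The pairs dictionary below is a made-up example using the bar and baz names from above:

```python
import json

# Hypothetical data keyed by a tuple, which JSON cannot represent.
pairs = {("bar", "baz"): 42}

try:
    json.dumps(pairs)
    tuple_keys_ok = True
except TypeError:
    # json.dumps rejects keys that are not basic scalar types.
    tuple_keys_ok = False

# Option 1: join the parts into a single string key.
joined = json.dumps({"-".join(key): value for key, value in pairs.items()})

# Option 2: nest, with the first part as the outer key and the second as the inner.
nested = json.dumps({outer: {inner: value} for (outer, inner), value in pairs.items()})

print(tuple_keys_ok)
print(joined)
print(nested)
```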
By giving it a dict with a dict as a value type.
>>> json.dumps({'foo': {'bar': 42}})
'{"foo": {"bar": 42}}'
