Pluck in Python - python

I started reading about underscore.js today. It is a JavaScript library that adds some functional programming goodies I'm used to using in Python. One pretty cool shorthand method is pluck.
Indeed in Python I often need to pluck out some specific attribute, and end up doing this:
users = [{
    "name" : "Bemmu",
    "uid" : "297200003"
},
{
    "name" : "Zuck",
    "uid" : "4"
}]
uids = map(lambda x:x["uid"], users)
If the underscore shorthand existed somewhere in Python, this would be possible:
uids = pluck(users, "uid")
It's of course trivial to add, but is that in Python somewhere already?

Just use a list comprehension in whatever function is consuming uids:
instead of
uids = map(operator.itemgetter("uid"), users)
foo(uids)
do
foo([x["uid"] for x in users])
If you just want uids to iterate over, you don't need to make a list -- use a generator instead. (Replace [] with ().)
For example:
def print_all(it):
    """Trivial function."""
    for i in it:
        print(i)

print_all(x["uid"] for x in users)
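For completeness, the pluck helper from the question takes only a few lines to write yourself. The following is a minimal sketch, not a standard-library function: the name pluck and its (items, key) argument order are taken from the question's underscore.js-style call, and it assumes every item is a dict (or other mapping) that contains the key.

```python
from operator import itemgetter


def pluck(items, key):
    """Return a list with the value stored under `key` in each item.

    Hypothetical helper mirroring underscore.js's pluck; raises KeyError
    if any item lacks the key.
    """
    return list(map(itemgetter(key), items))


users = [
    {"name": "Bemmu", "uid": "297200003"},
    {"name": "Zuck", "uid": "4"},
]

uids = pluck(users, "uid")
print(uids)  # ['297200003', '4']
```

Wrapping map in list() keeps the result a list on both Python 2 and Python 3; drop the list() call if a lazy iterator is enough.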

The funcy module (https://github.com/Suor/funcy) provides a pluck function.
In this case, provided that funcy is available on your host, the following code should work as expected:
from funcy import pluck
users = [{
    "name" : "Bemmu",
    "uid" : "297200003"
},
{
    "name" : "Zuck",
    "uid" : "4"
}]
uids = pluck("uid", users)
Note that the order of arguments differs from the one used by underscore.js.

Related

Replacing a variable name in text with the value of that variable

I have a template that uses placeholders for the varying content that will be filled in. Suppose the template has:
"This article was written by AUTHOR, who is solely responsible for its content."
The author's name is stored in the variable author.
So I of course do:
wholeThing = wholeThing.replace('AUTHOR', author)
The problem is that I have 10 of these self-named variables, and it would be more economical if I could do something like this, using only 4 for brevity:
def(self-replace):
...
return
wholeThing = wholeThing.self-replace('AUTHOR', 'ADDR', 'PUBDATE', 'MF_LINK')
With Python 3.6+, you may find formatted string literals (PEP 498) efficient:
# data from #bohrax
d = {"publication": "article", "author": "Me"}
template = f"This {d['publication']} was written by {d['author']}, who is solely responsible for its content."
print(template)
This article was written by Me, who is solely responsible for its content.
Sounds like what you need is string formatting, something like this:
def get_sentence(author, pub_date):
    return "This article was written by {}, who is solely responsible for its content. This article was published on {}.".format(author, pub_date)
Assuming you are parsing the variables that make up the string iteratively, you can call this function with the arguments needed and get the string returned.
That str.format() function can be placed anywhere and can take any number of arguments as long as there is a place for it in the string indicated by the {}. I suggest you play around with this function on the interpreter or ipython notebook to get familiar with it.
If you have control over the templates I would use str.format and a dict containing the variables:
>>> template = "This {publication} was written by {author}, who is solely responsible for its content."
>>> variables = {"publication": "article", "author": "Me"}
>>> template.format(**variables)
'This article was written by Me, who is solely responsible for its content.'
It is easy to extend this to a list of strings:
templates = [
"String with {var1}",
"String with {var2}",
]
variables = {
"var1": "value for var1",
"var2": "value for var2",
}
replaced = [template.format(**variables) for template in templates]
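If the templates cannot be changed and still use bare uppercase placeholders such as AUTHOR, all of them can be replaced in one pass. This is only a sketch: the function name replace_placeholders and the sample values are made up for illustration, and it assumes the placeholder words never occur as ordinary text in the template.

```python
import re


def replace_placeholders(text, values):
    """Replace every placeholder in `values` (placeholder -> replacement) in one pass."""
    if not values:
        return text
    # Try longer placeholders first so that e.g. "PUBDATE" wins over "PUB".
    names = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    return pattern.sub(lambda match: values[match.group(0)], text)


# Illustrative values only.
values = {"AUTHOR": "Me", "PUBDATE": "2020-01-01"}
whole_thing = "This article was written by AUTHOR and published on PUBDATE."
result = replace_placeholders(whole_thing, values)
print(result)
```

Doing the substitution in a single regex pass means a replacement value that happens to contain another placeholder name is not substituted again, which chained str.replace calls would do.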

How do I pull a recurring key from a JSON?

I'm new to Python (and coding in general). I've gotten this far, but I'm having trouble. I'm querying a web service that returns a JSON file with information on every employee, and I would like to pull just a couple of attributes for each employee.
I have this script so far:
import json
import urllib2
req = urllib2.Request('http://server.company.com/api')
response = urllib2.urlopen(req)
the_page = response.read()
j = json.loads(the_page)
print j[1]['name']
The JSON that it returns looks like this...
[
    {
        "name": "bill jones",
        "address": "123 something st",
        "city": "somewhere",
        "state": "somestate",
        "zip": "12345",
        "phone_number": "800-555-1234"
    },
    {
        "name": "jane doe",
        "address": "456 another ave",
        "city": "metropolis",
        "state": "ny",
        "zip": "10001",
        "phone_number": "555-555-5554"
    }
]
You can see that with the script I can return the name of employee in index 1. But I would like to have something more along the lines of: print j[**0 through len(j)**]['name'] so it will print out the name (and preferably the phone number too) of every employee in the json list.
I'm fairly sure I'm approaching something wrong, but I need some feedback and direction.
Your JSON is a list of dict objects. By doing j[1], you are accessing the item in the list at index 1. To get all the records, iterate over all the elements of the list:
for item in j:
    print item['name']
where j is the result of j = json.loads(the_page), as in your question.
Slightly nicer for mass-conversions than repeated dict lookup is using operator.itemgetter:
from future_builtins import map # Only on Py2, to get lazy, generator based map
from operator import itemgetter
for name, phone_number in map(itemgetter('name', 'phone_number'), j):
    print name, phone_number
If you needed to look up individual things as needed (so you didn't always need name or phone_number), then regular dict lookups would make sense, this just optimizes the case where you're always retrieving the same set of items by pushing work to builtin functions (which, on the CPython reference interpreter, are implemented in C, so they run a bit faster than hand-rolled code). Using a generator based map isn't strictly necessary, but it avoids making (potentially large) temporary lists when you're just going to iterate the result anyway.
It's basically just a faster version of:
for emp in j:
    name, phone_number = emp['name'], emp['phone_number']
    print name, phone_number
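On Python 3, map is already lazy and print is a function, so the same idea needs no future_builtins import. A small sketch follows; the inline JSON string stands in for the web service response, with names and phone numbers taken from the question's sample.

```python
import json
from operator import itemgetter

# Stand-in for the response body returned by the web service.
the_page = """
[
    {"name": "bill jones", "phone_number": "800-555-1234"},
    {"name": "jane doe", "phone_number": "555-555-5554"}
]
"""

employees = json.loads(the_page)

# itemgetter with two keys returns a (name, phone_number) tuple per employee.
get_contact = itemgetter("name", "phone_number")
contacts = [get_contact(emp) for emp in employees]

for name, phone_number in contacts:
    print(name, phone_number)
```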

MongoDB PyMongo Listing all keys in a document

I have a question about how to manipulate a document in PyMongo to have it list all of its current keys, and I'm not quite sure how to do it. For example, if I had a document that looked like this:
{
    "_id" : ObjectId("..."),
    "name": "ABCD",
    "info": {
        "description" : "XYZ",
        "type" : "QPR"
    }
}
and I had a variable "document" that had this current document as its value, how could I write code to print the three keys:
"_id"
"name"
"info"
I don't want it to list the values, simply the names. The motivation for this is that the user would type one of the names and my program would do additional things after that.
As mentioned in the documentation:
In PyMongo we use dictionaries to represent documents.
So you can get all keys using .keys():
print(document.keys())
In Python, we can fetch all the documents into a variable mydoc and print the keys of each one (here collections is assumed to be a PyMongo collection object):
mydoc = collections.find()
for x in mydoc:
    l = list(x.keys())
    print(l)
This gives the keys of each document as a list, which can then be used for whatever the user needs.
The document is a Python dictionary, so you can just print its keys, e.g.:
document = db.collection_name.find_one()
for k in document:
    print(k)
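Since the stated goal is to let the user type one of the key names, the keys can be listed and the typed name checked against them. This is a sketch under assumptions: a plain dict stands in for a document returned by PyMongo, and get_field is a hypothetical helper name.

```python
# A plain dict standing in for a document fetched with PyMongo.
document = {
    "_id": "...",
    "name": "ABCD",
    "info": {"description": "XYZ", "type": "QPR"},
}


def get_field(doc, key):
    """Return the value for a user-chosen key, with a helpful error if it is unknown."""
    if key not in doc:
        raise KeyError("unknown key {!r}; choose one of {}".format(key, list(doc)))
    return doc[key]


# Top-level key names only; nested keys such as "description" are not included.
keys = list(document)
print(keys)

print(get_field(document, "info"))
```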

Organizing of Dynamic Lists of Lists

I'm sorry if this has been answered (I looked and did not find anything.) Please let me know and I will delete immediately.
I am writing a program that makes an API call which returns multiple lists of different lengths depending on the call (e.g. a Facebook API call: enter the person's name and a list of pictures is returned, and each picture has a list of who "liked" it. I want to store a list of lists of these "likes").
# Import urllib for the API request
import urllib.request
import urllib.parse

# First I have a function that takes two arguments, first and last name.
# It returns a list of all photos the person has been tagged in on Facebook.
def id_list_generator(first, last):
    # Please note I don't actually know the Facebook API; this part will not be reproducible
    id_list = []
    pic_id_request = urllib.request.urlopen('www.facebook.com/pics/id/term={}+{}[person]'.format(first, last))
    pic_id_list = pic_id_request.read()
    for i in pic_id_list:
        id_list.append(i)
    return id_list

# Now, for each ID of a picture, I will generate a list of people who "liked" that picture.
# This is where I have trouble. I don't know how to store these lists of lists.
for i in id_list:
    pic_list = urllib.request.urlopen('www.facebook.com/pics/id/like/term={}[likes]'.format(i))
    print(pic_list)
This would print multiple lists of "likes" for each picture the person was tagged in:
foo, bar
bar, baz
baz, foo, qux
norf
I don't really know how to store these honestly.
I was thinking of using a list that would look like this after appending:
foo = [["foo", "bar"], ["bar","baz"],["baz","foo","qux"],["norf"]]
But really I'm not sure what type of storage to use in this case. I thought of using a dictionary of a dictionary, but I don't know if the key can be iterable. I feel like there is a simple answer to this that I am missing.
Well, you could have a list of dictionaries:
Here's an example:
facebook_likes = [{
    "first_name": "John",
    "last_name": "Smith",
    "image_link": "link",
    "likes": ["foo"]
}, {
    "first_name": "John",
    "last_name": "Doe",
    "image_link": "link",
    "likes": ["foo", "bar"]
}]
for like in facebook_likes:
    print(like)
    print(like["likes"])
    print(like["likes"][0])
You should also look into JSON objects.
It's one of the standard response formats that you get after making API calls.
Fortunately, it's very simple to transform a Python dict into a JSON object and vice versa.
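One way to keep each list of likes tied to its picture is a dict keyed by picture ID, which then converts to and from JSON directly. This is a sketch: the picture IDs are made up, the names reuse the question's sample output, and fetch_likes is a hypothetical stand-in for the real API call.

```python
import json

# Hypothetical stand-in for the API: maps a picture ID to the people who liked it.
FAKE_API = {
    "pic1": ["foo", "bar"],
    "pic2": ["bar", "baz"],
    "pic3": ["baz", "foo", "qux"],
    "pic4": ["norf"],
}


def fetch_likes(pic_id):
    """Pretend to query the API for the likes of one picture."""
    return FAKE_API[pic_id]


id_list = ["pic1", "pic2", "pic3", "pic4"]

# Dict keys must be hashable (strings are fine); the values can be lists of any length.
likes_by_pic = {pic_id: fetch_likes(pic_id) for pic_id in id_list}

# Round-trip through JSON text.
as_json = json.dumps(likes_by_pic)
restored = json.loads(as_json)
print(restored["pic3"])
```

A list itself cannot be used as a dict key because it is mutable, but it works fine as a value, which is all this layout needs.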
If you just want to sort by the first element in each list, Python does that by default for 2D lists. Refer to this thread: Python sort() first element of list

Altering JSON array using python

This is how I read from a .json file in the Ubuntu terminal:
python -c "import json;print json.loads(open('json_file.json', 'r').read())['foo']['bar']"
What I'd like to do is alter the JSON file, adding new objects and arrays. How can I do this in Python?
json_file.json:
{
    "data1" :
    [
        {
            "unit" : "Unit_1",
            "value" : "20"
        },
        {
            "unit" : "Unit_2",
            "value" : "10"
        }
    ]
}
First of all, create a new python file.
import json
data = json.loads(open('json_file.json', 'r').read())
The data is then just a bunch of nested dictionaries and lists.
You can modify it the same way you would modify any python dictionary and list; it shouldn't be hard to find a resource on this as it is one of the most basic python functionalities. You can find a complete reference at the official python documentation, and if you are familiar with arrays/lists and associative arrays/hashes in any language, this should be enough to get you going. If it's not, you can probably find a tutorial and if that doesn't help, if you are able to create a well-formed specific question then you could ask it here.
Once you are done, you can put everything back into JSON:
print json.dumps(data)
For more information on how to customize the output, and about the json module overall, see the documentation.
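As a concrete sketch of the load, modify, dump cycle, the snippet below appends an object to data1, adds a new array, and writes the result back. The added entries (Unit_3, data2) are invented for illustration, and a temporary file stands in for json_file.json so the example is self-contained.

```python
import json
import os
import tempfile

original = {
    "data1": [
        {"unit": "Unit_1", "value": "20"},
        {"unit": "Unit_2", "value": "10"},
    ]
}

# Write the sample data to a temporary file standing in for json_file.json.
path = os.path.join(tempfile.mkdtemp(), "json_file.json")
with open(path, "w") as f:
    json.dump(original, f)

# Load the file, then modify the result like any dict/list.
with open(path) as f:
    data = json.load(f)

data["data1"].append({"unit": "Unit_3", "value": "30"})  # new object in an existing array
data["data2"] = []  # new, empty array under a new key

# Write the modified structure back to the same file.
with open(path, "w") as f:
    json.dump(data, f, indent=4)

with open(path) as f:
    print(f.read())
```

json.dump writes straight to the file object, while json.dumps (used above in the answer) returns a string; either works for saving the result.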
