Python Array Variable Assign - python

I have the following array in Python in the following format:
Array[('John', '123'), ('Alex','456'),('Nate', '789')]
Is there a way I can assign the array variables by field as below?
Name = ['John', 'Alex', 'Nate']
ID = ['123', '456', '789']

In the spirit of "explicit is better than implicit":
data = [('John', '123'), ('Alex', '456'), ('Nate', '789')]
names = [x[0] for x in data]
ids = [x[1] for x in data]
print(names) # prints ['John', 'Alex', 'Nate']
print(ids) # prints ['123', '456', '789']
Or, to be even more explicit:
data = [('John', '123'), ('Alex', '456'), ('Nate', '789')]
NAME_INDEX = 0
ID_INDEX = 1
names = [x[NAME_INDEX] for x in data]
ids = [x[ID_INDEX] for x in data]

This is a compact way to do this using zip:
lst = [('John', '123'), ('Alex','456'),('Nate', '789')]
name, userid = list(zip(*lst))
print(name) # ('John', 'Alex', 'Nate')
print(userid) # ('123', '456', '789')
Note that the results are stored in (immutable) tuples; if you need (mutable) lists, you need to convert them with list().
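If lists are what you need, one possible way to do the conversion is to map list over the unpacked columns. This is a minimal sketch reusing the lst data from above; the names names and user_ids are only illustrative.

```python
lst = [('John', '123'), ('Alex', '456'), ('Nate', '789')]

# zip(*lst) yields one tuple per column; map(list, ...) turns each into a list
names, user_ids = map(list, zip(*lst))

print(names)     # ['John', 'Alex', 'Nate']
print(user_ids)  # ['123', '456', '789']
```

Note that the two-name unpacking raises ValueError if lst is empty, so guard for that case if it can occur.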

Related

Sort interdependent elements of a dictionary

I have a dictionary like this
dict = {'name':'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
Here the dict['c1'] values and dict['c2'] values are interdependent. That is, 'a' is related to '21', 'd' is related to '212', 'b' is related to '232'...
I have to sort c1, and c2 should be reordered accordingly. The final output should be
dict = {'name':'xyz','c1':['a','b','c','d','r'],'c2':['21','232','34','212','11']}
What is the most efficient way to do this?
This works:
d = {'name': 'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
s = sorted(list(zip(d['c1'], d['c2'])))
d['c1'] = [x[0] for x in s]
d['c2'] = [x[1] for x in s]
Result:
{'c1': ['a', 'b', 'c', 'd', 'r'],
'c2': ['21', '232', '34', '212', '11'],
'name': 'xyz'}
UPDATE
The call to list is not needed. Thanks to tzaman for the hint. It was a relic of putting together the solution from separate steps.
d = {'name': 'xyz','c1':['a','b','r','d','c'],'c2':['21','232','11','212','34']}
s = sorted(zip(d['c1'], d['c2']))
d['c1'] = [x[0] for x in s]
d['c2'] = [x[1] for x in s]
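A variation on the same idea, in case more than two lists depend on each other: compute the sort order of c1 once as a list of indices and apply it to every dependent list. This is only a sketch; the name order is illustrative, and unlike sorting the zipped pairs it never compares the c2 values, so ties in c1 keep their original relative order.

```python
d = {'name': 'xyz',
     'c1': ['a', 'b', 'r', 'd', 'c'],
     'c2': ['21', '232', '11', '212', '34']}

# indices of c1 in sorted order (computed before anything is reordered)
order = sorted(range(len(d['c1'])), key=lambda i: d['c1'][i])

# reorder every dependent list with the same permutation
for key in ('c1', 'c2'):
    d[key] = [d[key][i] for i in order]

print(d['c1'])  # ['a', 'b', 'c', 'd', 'r']
print(d['c2'])  # ['21', '232', '34', '212', '11']
```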
Your data structure does not reflect the real relationship between its elements. I would start by merging c1 and c2 into an OrderedDict. It would take care of both the relationship and the order of elements. Like this:
from collections import OrderedDict

# named d rather than dict, so the built-in dict is not shadowed
d = dict(
    name='xyz',
    c=OrderedDict(sorted(
        zip(
            ['a','b','r','d','c'],
            ['21','232','11','212','34']
        )
    ))
)
Frankly, the most efficient way is to store the data in a form compatible with its use. In this case, you would use a table (e.g. a pandas data frame) or simply a list:
xyz_table = [('a', '21'), ('b', '232'), ...]
Then you simply sort the list on the first element of each tuple:
xyz_table = [('a', '21'), ('b', '232'), ('c', '34'), ('d', '212'), ('r', '11')]
xyz_sort = sorted(xyz_table, key = lambda row: row[0])
print(xyz_sort)
You could also look up a primer on Python sorting, such as this
Here's one way you could do it, using zip for both joining and unjoining. You can re-cast them as lists instead of tuples when you put them back into the dictionary.
a = ['a','b','r','d','c']
b = ['21','232','11','212','34']
mix = list(zip(a, b))
mix.sort()
a_new, b_new = zip(*mix)
>>> a_new
('a', 'b', 'c', 'd', 'r')
>>> b_new
('21', '232', '34', '212', '11')
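Putting the last step back into the dictionary as lists could look like the sketch below. It assumes the dictionary d from the question; the names pairs and col are illustrative.

```python
d = {'name': 'xyz',
     'c1': ['a', 'b', 'r', 'd', 'c'],
     'c2': ['21', '232', '11', '212', '34']}

# join the two lists, sort the pairs, then split them apart again
pairs = sorted(zip(d['c1'], d['c2']))

# zip(*pairs) gives tuples, so convert each column back to a list
d['c1'], d['c2'] = (list(col) for col in zip(*pairs))

print(d['c1'])  # ['a', 'b', 'c', 'd', 'r']
print(d['c2'])  # ['21', '232', '34', '212', '11']
```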

enumerate is iterating over letters of strings in a list instead of elements

I'm trying to use enumerate to iterate over a list and store the elements of the list, as well as use the index to grab the element at the same index of another list of the same size.
Using a silly example:
animal = ['cat', 'dog', 'fish' , 'monkey']
name = ['george', 'steve', 'john', 'james']
x = []
for count, i in enumerate(animal):
    y = zip(name[count], i)
    x = x + y
Instead of producing tuples of each element of both lists, it produces tuples by letter. Is there a way to do this but get the elements of each list rather than each letter? I know there is likely a better, more Pythonic way of accomplishing this same task, but I'm specifically looking to do it this way.
enumerate() is doing no such thing. You are pairing up the letters here:
y = zip(name[count], i)
For example, for the first element in animal, count is 0 and i is set to 'cat'. name[0] is 'george', so you are asking Python to zip() together 'george' and 'cat':
>>> list(zip('george', 'cat'))
[('g', 'c'), ('e', 'a'), ('o', 't')]
The result is capped at the length of the shorter word.
If you wanted a tuple, just use:
y = (name[count], i)
and then append that to your x list:
x.append(y)
You could use zip() instead of enumerate() to create your pairings:
x = list(zip(name, animal))
without any loops required:
>>> animal = ['cat', 'dog', 'fish' , 'monkey']
>>> name = ['george', 'steve', 'john', 'james']
>>> list(zip(name, animal))
[('george', 'cat'), ('steve', 'dog'), ('john', 'fish'), ('james', 'monkey')]
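Since the question asks specifically for the enumerate-based version, the corrected loop assembled from the pieces above might look like this sketch (same variable names as in the question):

```python
animal = ['cat', 'dog', 'fish', 'monkey']
name = ['george', 'steve', 'john', 'james']

x = []
for count, i in enumerate(animal):
    # build one tuple from the two whole elements, not from their characters
    y = (name[count], i)
    x.append(y)

print(x)
# [('george', 'cat'), ('steve', 'dog'), ('john', 'fish'), ('james', 'monkey')]
```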
When you use zip(), it creates a list of tuples of the corresponding elements at each index.
So when you provide strings as the input, it returns a list of tuples pairing up the characters. Example:
>>> list(zip('cat','george'))
[('c', 'g'), ('a', 'e'), ('t', 'o')]
This is what happens when you iterate over each element in the list and call zip on each pair.
Instead, you should use zip directly, without iterating over the elements of the list.
Example:
>>> animal = ['cat', 'dog', 'fish' , 'monkey']
>>> name = ['george', 'steve', 'john', 'james']
>>> list(zip(animal,name))
[('cat', 'george'), ('dog', 'steve'), ('fish', 'john'), ('monkey', 'james')]

Grouping two lists in python

I have two lists which I want to group on the basis of the first element of the lists.
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
Here the first elements in the list inside the list are '1' , '2' and '3'.
I want my final list to look like this:
Final_List = [['1', 'abc', 'zef', 'rofl', 'pole'], ['3', 'lol', 'pop', 'lmao', 'wtf'], ['2', 'qwerty', 'opo', 'sole', 'pop']]
I have tried this using the code below.
#!/usr/bin/python
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
d = {}
for i in list1:
    d[i[0]] = i[1:]
for i in list2:
    d[i[0]].extend(i[1:])
Final_List = []
for key, value in d.items():
    value.insert(0,key)
    Final_List.append(value)
This code works, but I was wondering if there is an easier and cleaner way to do it.
Any help?
I would write it much as you have, with a small modification.
Prepare a dictionary that gathers, under each first element, all the remaining elements of the matching sublists:
d = {}
for items in (list1, list2):
    for item in items:
        d.setdefault(item[0], [item[0]]).extend(item[1:])
And then just get all the values from the dictionary (Thanks @jamylak) :-)
print(list(d.values()))
Output
[['3', 'lol', 'pop', 'lmao', 'wtf'],
['1', 'abc', 'zef', 'rofl', 'pole'],
['2', 'qwerty', 'opo', 'sole', 'pop']]
If the order of the items inside each list of Final_List is not important, then this can be used:
[list(set(sum(itm, []))) for itm in zip(list1, list2)]
Your code seems correct. Just modify the following portion:
Final_List = []
for key in d:
    L = [key] + [x for x in d[key]]
    Final_List.append(L)
Yes, with a list comprehension and enumerate:
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
print([set(v + list2[k]) for k,v in enumerate(list1)])
Each group is a set, so the duplicated key collapses, but the order of the elements within a group is not guaranteed. The groups contain the same elements as:
[['1', 'abc', 'zef', 'rofl', 'pole'], ['2', 'qwerty', 'opo', 'sole', 'pop'], ['3', 'lol', 'pop', 'lmao', 'wtf']]
EDIT
Matching on the first element instead of on position, for when the two lists are not in the same order:
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['3','lmao','wtf'],['2','sole','pop']]
d1 = {a[0]:a for a in list1}
d2 = {a[0]:a for a in list2}
print([set(v + d2[k]) for k, v in d1.items()])
Using defaultdict and a list comprehension, you can shorten your code:
from collections import defaultdict
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
d = defaultdict(list)
for i in list1 + list2:
    d[i[0]].extend(i[1:])
Final_List = [[key] + value for key, value in d.items()]
print(Final_List)
list3 = []
for i in range(min(len(list1), len(list2))):
    list3.append(list(list1[i]))
    list3[i].extend(x for x in list2[i] if x not in list3[i])
With a single range loop, you iterate through the lists only once. Note that this pairs the sublists by position, so both lists must be in the same order.
A bit of functional style:
import operator, itertools
from pprint import pprint
one = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
two = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
A few helpers:
zero = operator.itemgetter(0)
all_but_the_first = operator.itemgetter(slice(1, None))
data = (one, two)
def foo(group):
    # group is (key, iterator) from itertools.groupby
    key = group[0]
    lists = group[1]
    result = [key]  # list(key) would split a multi-character key into characters
    for item in lists:
        result.extend(all_but_the_first(item))
    return result
Function to process the data
def process(data, func = foo):
    # concatenate all the sublists
    new = itertools.chain(*data)
    # group by item zero
    three = sorted(new, key = zero)
    groups = itertools.groupby(three, zero)
    # iterator that builds the new lists (itertools.imap on Python 2)
    return map(func, groups)
Usage
>>> pprint(list(process(data)))
[['1', 'abc', 'zef', 'rofl', 'pole'],
['2', 'qwerty', 'opo', 'sole', 'pop'],
['3', 'lol', 'pop', 'lmao', 'wtf']]
>>>
>>> for thing in process(data):
...     print(thing)
...
['1', 'abc', 'zef', 'rofl', 'pole']
['2', 'qwerty', 'opo', 'sole', 'pop']
['3', 'lol', 'pop', 'lmao', 'wtf']
>>>
list1 = [['1','abc','zef'],['2','qwerty','opo'],['3','lol','pop']]
list2 = [['1','rofl','pole'],['2','sole','pop'],['3','lmao','wtf']]
Final_List = []
for i in range(0, len(list1)):
    Final_List.append(list1[i] + list2[i])
    del Final_List[i][3]  # index 3 is list2's copy of the key (list1 rows have 3 items)
print(Final_List)
Output
[['1', 'abc', 'zef', 'rofl', 'pole'], ['2', 'qwerty', 'opo', 'sole', 'pop'], ['3', 'lol', 'pop', 'lmao', 'wtf']]
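The index-based answers above only work when both lists hold their rows in the same order. The sketch below, with list2 deliberately reordered (an assumption for illustration, not the question's data order), shows the difference between pairing by position and grouping by key. The names by_index, groups and by_key are illustrative.

```python
list1 = [['1', 'abc', 'zef'], ['2', 'qwerty', 'opo'], ['3', 'lol', 'pop']]
# same rows as in the question, but in a different order
list2 = [['3', 'lmao', 'wtf'], ['1', 'rofl', 'pole'], ['2', 'sole', 'pop']]

# pairing by position mixes rows that belong to different keys
by_index = [a + b[1:] for a, b in zip(list1, list2)]

# grouping by key is unaffected; dicts keep insertion order on Python 3.7+
groups = {}
for row in list1 + list2:
    groups.setdefault(row[0], [row[0]]).extend(row[1:])
by_key = list(groups.values())

print(by_index[0])  # ['1', 'abc', 'zef', 'lmao', 'wtf']
print(by_key[0])    # ['1', 'abc', 'zef', 'rofl', 'pole']
```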

Tuple unpacking

I have a tuple that looks like this:
('Elizabeth', 'Peter, Angela, Thomas')
How could I separate the last value in it so it would look like this:
('Elizabeth', 'Peter', 'Angela', 'Thomas')
>>> names = ('Elizabeth', 'Peter, Angela, Thomas')
>>> [y for x in names for y in x.split(', ')]
['Elizabeth', 'Peter', 'Angela', 'Thomas']
There's also this way, although I prefer the first:
>>> ', '.join(names).split(', ')
['Elizabeth', 'Peter', 'Angela', 'Thomas']
Of course you can convert the result to a tuple in the end but it is most likely unnecessary to do so.
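If the tuple form from the question is really wanted, a sketch of the conversion could look like this. Splitting on ',' and stripping each part is an added assumption here, so that inconsistent spacing after the commas is tolerated; the name result is illustrative.

```python
names = ('Elizabeth', 'Peter, Angela, Thomas')

# split every element on commas, strip stray spaces, and collect into a tuple
result = tuple(part.strip() for item in names for part in item.split(','))

print(result)  # ('Elizabeth', 'Peter', 'Angela', 'Thomas')
```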

Finding tuples with a common element

Suppose I have a set of tuples with people's names. I want to find everyone who shares the same last name, excluding people who don't share their last name with anyone else:
# input
names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
# expected output
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']} # or similar
This is what I am using
def find_family(names):
    result = {}
    try:
        while True:
            name = names.pop()
            if name[1] in result:
                result[name[1]].append(name[0])
            else:
                result[name[1]] = [name[0]]
    except KeyError:
        pass
    return dict(filter(lambda x: len(x[1]) > 1, result.items()))
This looks ugly and inefficient. Is there a better way?
defaultdict can be used to simplify the code:
from collections import defaultdict
def find_family(names):
    d = defaultdict(list)
    for fn, ln in names:
        d[ln].append(fn)
    return dict((k,v) for (k,v) in d.items() if len(v)>1)
names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
             ('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
print(find_family(names))
This prints:
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
Instead of using a while loop, use a for loop (or similar construct) over the set contents (and while you're at it, you can destructure the tuples):
for firstname, surname in names:
    # do your stuff
You might want to use a defaultdict or OrderedDict (http://docs.python.org/library/collections.html) to hold your data in the body of the loop.
>>> names = set([('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
... ('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')])
You can get a dictionary of all the people where the keys are their lastnames easily with a for-loop:
>>> families = {}
>>> for name, lastname in names:
...     families[lastname] = families.get(lastname, []) + [name]
...
>>> families
{'Miller': ['Mary'], 'Smith': ['Bob'], 'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
Then, you just need to filter the dictionary with the condition len(names) > 1. This filtering could be done using a "dictionary comprehension":
>>> filtered_families = {lastname: names for lastname, names in families.items() if len(names) > 1}
>>> filtered_families
{'Lee': ['Tina', 'John'], 'Ryan': ['Bob', 'Paul']}
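One side effect worth noting: the version in the question empties the input set through names.pop(), while the for-loop versions leave it intact. The sketch below combines the suggestions above into one function; sorting the first names is an added assumption to make the output reproducible, since set iteration order is arbitrary.

```python
from collections import defaultdict

def find_family(names):
    """Group first names by last name, keeping only shared last names."""
    families = defaultdict(list)
    for first, last in names:  # iterating does not modify the set
        families[last].append(first)
    return {last: sorted(firsts)
            for last, firsts in families.items()
            if len(firsts) > 1}

names = {('John', 'Lee'), ('Mary', 'Miller'), ('Paul', 'Ryan'),
         ('Bob', 'Ryan'), ('Tina', 'Lee'), ('Bob', 'Smith')}

families = find_family(names)
print(families)    # keys may appear in either order
print(len(names))  # 6, the input set is unchanged
```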
