Unpacking a list of tuples [closed] - python

Closed 6 years ago.
How can I unpack the following list
[('1', 'GENERAL', '1'), ('1.1', 'RELATED DOCUMENTS', '1'), ('1.2', 'SUMMARY', '1'), ('1.3', 'DEFINITIONS', '1'), ('1.4', 'INFORMATIONAL SUBMITTALS', '2'), ('1.5', 'GENERAL COORDINATION PROCEDURES', '2'), ('1.6', 'COORDINATION DRAWINGS', '3'), ('1.7', 'REQUESTS FOR INFORMATION (RFIs)', '4'), ('1.8', 'PROJECT MEETINGS', '6')]
[[('2', 'PRODUCTS – NOT APPLICABLE', '10')]]
Following a solution from another post, I tried:
Part, Title, Page = zip(*text_good[0])
But got the error
too many values to unpack (expected 3)
And I also tried
Part1[a].append(Part for Part, Title, Page in text_good[0])
Part2[a].append(Part for Part, Title, Page in text_good[1])
Part3[a].append(Part for Part, Title, Page in text_good[2])
But this seemed to return a spot in memory rather than the values, and I could not open the array because I received an error stating that it is not picklable.
Thanks
Update:
Assignment of text_good
for i in range(0, len(text_between_parts)):
    text_good[i].append(re.findall(r'\s*(\b\d+(?:[.]\d+)?)\W+\s*(.*?)\s*(\b\d+\b)', text_between_parts[i]))
Update 2: When I do text_good[0] I get
[[('1', 'GENERAL', '1'), ('1.1', 'RELATED DOCUMENTS', '1'), ('1.2', 'SUMMARY', '1'), ('1.3', 'DEFINITIONS', '1'), ('1.4', 'INFORMATIONAL SUBMITTALS', '2'), ('1.5', 'GENERAL COORDINATION PROCEDURES', '2'), ('1.6', 'COORDINATION DRAWINGS', '3'), ('1.7', 'REQUESTS FOR INFORMATION (RFIs)', '4'), ('1.8', 'PROJECT MEETINGS', '6')]]
and when I do text_good[0][0] I get
[('1', 'GENERAL', '1'), ('1.1', 'RELATED DOCUMENTS', '1'), ('1.2', 'SUMMARY', '1'), ('1.3', 'DEFINITIONS', '1'), ('1.4', 'INFORMATIONAL SUBMITTALS', '2'), ('1.5', 'GENERAL COORDINATION PROCEDURES', '2'), ('1.6', 'COORDINATION DRAWINGS', '3'), ('1.7', 'REQUESTS FOR INFORMATION (RFIs)', '4'), ('1.8', 'PROJECT MEETINGS', '6')]
Notice the extra bracket when I do text_good[0].

OK, I think we need a little clarification first. I'm a little confused about what exactly the list is, so I will make the following assumptions (if any of them are wrong, please let me know so I can fix them):
text_good = [[('1', 'GENERAL', '1'), ('1.1', 'RELATED DOCUMENTS', '1'), ('1.2', 'SUMMARY', '1'), ('1.3', 'DEFINITIONS', '1'), ('1.4', 'INFORMATIONAL SUBMITTALS', '2'), ('1.5', 'GENERAL COORDINATION PROCEDURES', '2'), ('1.6', 'COORDINATION DRAWINGS', '3'), ('1.7', 'REQUESTS FOR INFORMATION (RFIs)', '4'), ('1.8', 'PROJECT MEETINGS', '6')], [('2', 'PRODUCTS - NOT APPLICABLE', '10')]]
Where now if I do text_good[0] I get:
[('1', 'GENERAL', '1'),
('1.1', 'RELATED DOCUMENTS', '1'),
('1.2', 'SUMMARY', '1'),
('1.3', 'DEFINITIONS', '1'),
('1.4', 'INFORMATIONAL SUBMITTALS', '2'),
('1.5', 'GENERAL COORDINATION PROCEDURES', '2'),
('1.6', 'COORDINATION DRAWINGS', '3'),
('1.7', 'REQUESTS FOR INFORMATION (RFIs)', '4'),
('1.8', 'PROJECT MEETINGS', '6')]
and text_good[1] would be:
[('2', 'PRODUCTS - NOT APPLICABLE', '10')]
To me, this looks like a list of tuples where ('1', 'GENERAL', '1') corresponds to Part, Title, Page, in that order.
If that is the case, you can do something like this:
Parts, Title, Page = zip(*[t for l in text_good for t in l])
Where in this case you get:
print Parts # ('1', '1.1', '1.2', '1.3', '1.4', '1.5', '1.6', '1.7', '1.8', '2')
print Title # ('GENERAL',
# 'RELATED DOCUMENTS',
# 'SUMMARY',
# 'DEFINITIONS',
# 'INFORMATIONAL SUBMITTALS',
# 'GENERAL COORDINATION PROCEDURES',
# 'COORDINATION DRAWINGS',
# 'REQUESTS FOR INFORMATION (RFIs)',
# 'PROJECT MEETINGS',
# 'PRODUCTS - NOT APPLICABLE')
print Page # ('1', '1', '1', '1', '2', '2', '3', '4', '6', '10')
Final Edit:
Because @JStuff has a list of lists of lists of tuples, we technically need 3 for loops to extract the values he wants, and the flattened result still has to go through zip(*...) to be split into three columns.
Parts, Title, Page = zip(*[t for l in text_good for ll in l for t in ll]) # Yay for list comprehension?
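As a rough sketch of the same idea, assuming text_good really is nested three levels deep, itertools.chain.from_iterable can flatten two levels before zip(*...) transposes the rows into columns. The shortened data below is a made-up stand-in for the real list, and the lowercase names are only illustrative.

```python
from itertools import chain

# Hypothetical, shortened stand-in for text_good:
# a list of lists of lists of (part, title, page) tuples.
text_good = [
    [[('1', 'GENERAL', '1'), ('1.1', 'RELATED DOCUMENTS', '1')]],
    [[('2', 'PRODUCTS - NOT APPLICABLE', '10')]],
]

# Flatten two levels of nesting so that only the tuples remain.
rows = list(chain.from_iterable(chain.from_iterable(text_good)))

# zip(*rows) transposes the rows into three column tuples.
parts, titles, pages = zip(*rows)

print(parts)   # ('1', '1.1', '2')
print(titles)  # ('GENERAL', 'RELATED DOCUMENTS', 'PRODUCTS - NOT APPLICABLE')
print(pages)   # ('1', '1', '10')
```

If the nesting depth differs from this assumption, the number of chain.from_iterable calls has to change to match it.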

Related

Delete all tuples from a list, if the first two elements of another tuple matches

I have two lists of tuples, and I want to delete every tuple from list1 whose first two elements match the first two elements of any tuple in list2.
list1 = [('google', 'data', '1'), ('google', 'data', '2'), ('google', 'data', '3'), ('google', 'data', '4'), ('google', 'WORLD', '1')]
list2 = [('google', 'data', '1'), ('google', 'HELLO', '2'), ('google', 'BLA', '3')]
Result:
list1 = [('google', 'WORLD', '1')]
You can use a list comprehension with all to get only elements that do not match with any of the elements from the second list.
res = [x for x in list1 if all(x[:2] != y[:2] for y in list2)]
Take the first two elements of each tuple in list2, use set() to deduplicate the entries, then keep only the elements of list1 whose first two elements are not in this set.
list1 = [('google', 'data', '1'), ('google', 'data', '2'), ('google', 'data', '3'),
         ('google', 'data', '4'), ('google', 'WORLD', '1')]
list2 = [('google', 'data', '1'), ('google', 'HELLO', '2'), ('google', 'BLA', '3')]
list2_pairs = set(map(lambda tpl: tpl[0:2], list2))
print(list2_pairs) # {('google', 'BLA'), ('google', 'HELLO'), ('google', 'data')}
list1_result = list(filter(lambda tpl: tpl[0:2] not in list2_pairs, list1))
print(list1_result) # [('google', 'WORLD', '1')]
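A quick check on the sample data from the question suggests that the all()-based and set-based approaches give the same result. The function names below are made up for illustration. The set-based version avoids rescanning list2 for every element of list1, which should matter mostly when list2 is large.

```python
list1 = [('google', 'data', '1'), ('google', 'data', '2'), ('google', 'data', '3'),
         ('google', 'data', '4'), ('google', 'WORLD', '1')]
list2 = [('google', 'data', '1'), ('google', 'HELLO', '2'), ('google', 'BLA', '3')]


def remove_matching_all(items, blockers):
    """Keep items whose first two elements differ from every blocker's (nested scan)."""
    return [x for x in items if all(x[:2] != y[:2] for y in blockers)]


def remove_matching_set(items, blockers):
    """Same result, but with one set lookup per item instead of a scan."""
    seen = {y[:2] for y in blockers}
    return [x for x in items if x[:2] not in seen]


print(remove_matching_all(list1, list2))  # [('google', 'WORLD', '1')]
print(remove_matching_set(list1, list2))  # [('google', 'WORLD', '1')]
```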

How to sort a list of tuples but with one specific tuple being the first?

I'm building an application to find the best path for a delivery.
The delivery driver sends me their path:
[
('0', '1'),
('1', '2'),
('0', '2'),
('2', '0')
]
... where every pair of numbers is a location and smaller numbers are closer. They also send me their starting point. For example: 2.
I wrote a function to sort from lowest to highest:
def lowToHigh(trajet):
    trajet_opti = trajet
    print(sorted(trajet_opti))

lowToHigh([
    ('0', '1'),
    ('1', '2'),
    ('0', '2'),
    ('2', '0')
])
The output is like this:
[('0', '1'), ('0', '2'), ('1', '2'), ('2', '0')]
I need a function that puts the tuple with the starting number first:
def starting_tuple():
    starting_number = 2
    .
    .
    .
Which returns something like this:
[('2', '0'), ('0', '1'), ('0', '2'), ('1', '2')]
Sort with a key that adds another tuple element representing whether the list item equals the starting point.
>>> path = [
... ('0', '1'),
... ('1', '2'),
... ('0', '2'),
... ('2', '0')
... ]
>>> sorted(path, key=lambda c: (c[0] != '2', c))
[('2', '0'), ('0', '1'), ('0', '2'), ('1', '2')]
The expression c[0] != '2' will be False (0) for the starting point and True (1) for all others, which will force the starting point to come at the start of the list. If there are multiple starting points, they will be sorted normally relative to each other.
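The same key can be wrapped in a small function so that the starting point is a parameter instead of a hard-coded '2'. This is only a sketch: the name start_first is made up, and it assumes the locations are stored as strings, as in the sample data.

```python
def start_first(path, start):
    """Sort path, placing tuples whose first element equals start at the front."""
    start = str(start)  # the sample data stores locations as strings
    # False sorts before True, so matching tuples come first;
    # the tuple itself breaks ties in normal sorted order.
    return sorted(path, key=lambda c: (c[0] != start, c))


path = [('0', '1'), ('1', '2'), ('0', '2'), ('2', '0')]
print(start_first(path, 2))  # [('2', '0'), ('0', '1'), ('0', '2'), ('1', '2')]
print(start_first(path, 0))  # [('0', '1'), ('0', '2'), ('1', '2'), ('2', '0')]
```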

How to create value pairs with lambda in pyspark?

I am trying to convert a pyspark rdd like this:
before:
[
[('169', '5'), ('2471', '6'), ('48516', '10')],
[('58', '7'), ('163', '7')],
[('172', '5'), ('186', '4'), ('236', '6')]
]
after:
[
[('169', '5'), ('2471', '6')],
[('169', '5'),('48516', '10')],
[('2471', '6'), ('48516', '10')],
[('58', '7'), ('163', '7')],
[('172', '5'), ('186', '4')],
[('172', '5'), ('236', '6')],
[('186', '4'), ('236', '6')]
]
The idea is to go through each line and create new lines from every pair of its elements. I tried to work out a solution myself from lambda tutorials, but without success. May I ask for some help? If this duplicates other questions, I apologize. Thanks!
I'd use flatMap with itertools.combinations:
from itertools import combinations
rdd.flatMap(lambda xs: combinations(xs, 2))
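To see what that transformation produces without a Spark cluster, here is a plain-Python sketch of the same map-then-flatten step on the sample data. The rows variable stands in for the RDD's contents, and chain.from_iterable plays the role of flatMap's flattening; this illustrates the logic and is not PySpark code.

```python
from itertools import chain, combinations

# Stand-in for the contents of the RDD from the question.
rows = [
    [('169', '5'), ('2471', '6'), ('48516', '10')],
    [('58', '7'), ('163', '7')],
    [('172', '5'), ('186', '4'), ('236', '6')],
]

# Map each row to all of its 2-element combinations, then flatten the results.
pairs = list(chain.from_iterable(combinations(row, 2) for row in rows))

for pair in pairs:
    print(pair)
```

Each resulting pair is a tuple rather than a list; wrapping it in list(...) would give the exact shape shown in the question.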

Round off some values of a tuple

I have tuples like this (I'm not sure whether this is called a list of tuples or not):
ratings = [('5', 45.58139534883721), ('4', 27.44186046511628), ('3', 20.0), ('2', 5.116279069767442), ('1', 1.8604651162790697)]
I want to round off (or truncate, it doesn't matter to me) the second value to 2 decimal places, like this:
[('5', 45.58), ('4', 27.44), ('3', 20.0), ('2', 5.11), ('1', 1.86)]
I tried something like this:
l = tuple([round(x,2) if isinstance(x, float) else x for x in ratings])
But this does not seem to work. What can I try?
Round the 2nd element of your tuples only:
ratings = [('5', 45.58139534883721), ('4', 27.44186046511628), ('3', 20.0), ('2', 5.116279069767442), ('1', 1.8604651162790697)]
l = [(item[0],round(item[1],2)) for item in ratings]
# [('5', 45.58), ('4', 27.44), ('3', 20.0), ('2', 5.12), ('1', 1.86)]
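Note that round gives 5.12 for the fourth entry, while the desired output in the question shows 5.11, which is what truncation would produce. If truncation is wanted instead, one possible sketch uses math.trunc on the scaled value. Multiplying a float by 100 can be slightly inexact for some inputs, so treat this as an approximation rather than an exact decimal truncation.

```python
import math

ratings = [('5', 45.58139534883721), ('4', 27.44186046511628), ('3', 20.0),
           ('2', 5.116279069767442), ('1', 1.8604651162790697)]

# Scale by 100, drop the fractional part, then scale back down.
truncated = [(label, math.trunc(value * 100) / 100) for label, value in ratings]
print(truncated)
# [('5', 45.58), ('4', 27.44), ('3', 20.0), ('2', 5.11), ('1', 1.86)]
```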

Why am I not getting the result of sorted function in expected order?

print activities
activities = sorted(activities,key = lambda item:item[1])
print activities
Here, activities is a list of tuples of the form (start_number, finish_number). I expected the code above to sort the list in increasing order of finish_number, but when I ran it in the shell I got the following output. I am not sure why the second list is not sorted in increasing order of finish_number. Please help me understand this.
[('1', '4'), ('3', '5'), ('0', '6'), ('5', '7'), ('3', '9'), ('5', '9'), ('6', '10'), ('8', '11'), ('8', '12'), ('2', '14'), ('12', '16')]
[('6', '10'), ('8', '11'), ('8', '12'), ('2', '14'), ('12', '16'), ('1', '4'), ('3', '5'), ('0', '6'), ('5', '7'), ('3', '9'), ('5', '9')]
You are sorting strings instead of integers: in that case, 10 is "smaller" than 4. To sort on integers, convert it to this:
activities = sorted(activities,key = lambda item:int(item[1]))
print activities
Results in:
[('1', '4'), ('3', '5'), ('0', '6'), ('5', '7'), ('3', '9'), ('5', '9'), ('6', '10'), ('8', '11'), ('8', '12'), ('2', '14'), ('12', '16')]
Your items are being compared as strings, not as numbers. Thus, since the 1 character comes before 4 lexicographically, it makes sense that 10 comes before 4.
You need to cast the value to an int first:
activities = sorted(activities,key = lambda item:int(item[1]))
You are sorting strings, not numbers. Strings get sorted character by character.
So, for example '40' is greater than '100' because character 4 is larger than 1.
You can fix this on the fly by simply casting the item as an integer.
activities = sorted(activities,key = lambda item: int(item[1]))
It's because you're not storing the number as a number, but as a string. The string '10' comes before the string '2'. Try:
activities = sorted(activities, key=lambda i: int(i[1]))
Look for a BROADER solution to your problem: convert your data from str to int immediately on input, work with it as int (otherwise you'll continually be bumping into little problems like this), and format your data as str for output.
This principle applies generally, e.g. when working with non-ASCII string data, do UTF-8 -> unicode -> UTF-8; don't try to manipulate undecoded text.
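A minimal sketch of that convert-on-input pattern, using a made-up subset of the data from the question (the raw name and the output format are assumptions for illustration):

```python
# Hypothetical raw input: numbers that arrived as strings.
raw = [('1', '4'), ('6', '10'), ('3', '5'), ('12', '16')]

# Convert to int once, at the boundary where the data comes in.
activities = [(int(start), int(finish)) for start, finish in raw]

# Sorting now compares numbers, so no int() is needed in the key.
activities.sort(key=lambda item: item[1])
print(activities)  # [(1, 4), (3, 5), (6, 10), (12, 16)]

# Format back to str only when producing output.
lines = ["{} -> {}".format(start, finish) for start, finish in activities]
print(lines)  # ['1 -> 4', '3 -> 5', '6 -> 10', '12 -> 16']

# For comparison: sorting the raw strings orders '10' and '16' before '4'.
print(sorted(raw, key=lambda item: item[1]))
# [('6', '10'), ('12', '16'), ('1', '4'), ('3', '5')]
```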
