Compare tuple values record wise

I have an ordered list of tuples (two columns: column 0 holds the complete URLs, and column 1 holds the endings I want to compare). In column 1 I have to compare the first value with the second one; if they are the same, I save the first one to another list and repeat. I want to compare every item with the following one to see whether they are equal.
tuple:
[('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT240/a-XH0124KP', '/a-XH0124KP'), ('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT183/a-XH0124KP', '/a-XH0124KP'), ('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT173/a-XH0124KP', '/a-XH0124KP'), ('https://www.topart-online.com/de/Liguster-Zweig-50cm-mit-Glitter/c-KAT184/a-XM0721', '/a-XM0721'), ('https://www.topart-online.com/de/3D-Stern-schwarz-mit-Glitter%2C-7%2C5-cm---SUPER-DEAL/c-KAT14/a-XM1633ZW', '/a-XM1633ZW'), ('https://www.topart-online.com/de/Christbaumschmuck%2C-Zweige%2C-gold-30-cm----SUPER-DEAL/c-KAT14/a-XP0091', '/a-XP0091')]
I want to compare the product number extracted from the URL, because every product can appear under multiple URLs.
My attempt:
sized = len(complete_links2) - 1
for index, tuple in enumerate(complete_links2):
    index = k
    k = index + 1
    if k < sized:
        while complete_links2[index][1] == complete_links2[k][1]:
            k += 1
        if complete_links2[index][1] == complete_links2[k][1]:
            k -= 1
        not_rep_links.append(complete_links2[index])
complete_links3 = [a_tuple[0] for a_tuple in not_rep_links]
My problem is that some unique links also get filtered out, because my logic is not really good.
I also tried using a set and unpacking the tuples, but I don't know how to continue.

I am still a bit confused, but is this what you want?
list_ = [
('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT240/a-XH0124KP', '/a-XH0124KP'),
('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT183/a-XH0124KP', '/a-XH0124KP'),
('https://www.topart-online.com/de/Rose%2C-Micle%2C-kupfer%2C-52cm%2C-Oe-9cm/c-KAT173/a-XH0124KP', '/a-XH0124KP'),
('https://www.topart-online.com/de/Liguster-Zweig-50cm-mit-Glitter/c-KAT184/a-XM0721', '/a-XM0721'),
('https://www.topart-online.com/de/3D-Stern-schwarz-mit-Glitter%2C-7%2C5-cm---SUPER-DEAL/c-KAT14/a-XM1633ZW', '/a-XM1633ZW'),
('https://www.topart-online.com/de/Christbaumschmuck%2C-Zweige%2C-gold-30-cm----SUPER-DEAL/c-KAT14/a-XP0091', '/a-XP0091')
]
products = []
links = []
for item in list_:
    if item[1] not in products:
        products.append(item[1])
        links.append(item[0])
print(links)
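The test `item[1] not in products` scans a list each time, which can get slow with many links. Here is a minimal sketch of the same idea using a set for the lookups. `unique_links` is a name I made up, and the sample URLs are shortened placeholders rather than the real ones:

```python
def unique_links(pairs):
    """Keep the first URL seen for each product ending, preserving order."""
    seen = set()   # endings already encountered
    links = []
    for url, ending in pairs:
        if ending not in seen:
            seen.add(ending)
            links.append(url)
    return links


# Shortened, hypothetical sample in the same (url, ending) layout.
sample = [
    ("https://example.com/c-KAT240/a-XH0124KP", "/a-XH0124KP"),
    ("https://example.com/c-KAT183/a-XH0124KP", "/a-XH0124KP"),
    ("https://example.com/c-KAT184/a-XM0721", "/a-XM0721"),
]
print(unique_links(sample))
```

Unlike the adjacent-comparison loop in the question, this does not depend on the list being sorted.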

Related

How do I use a while loop to access all the 2nd elements of lists which are the values stored in a dictionary?

If I have a dictionary like this, filled with similar lists, how can I apply a while loop to extract a list of the second elements and print it?
racoona_valence={}
racoona_valence={"rs13283416": ["7:87345874365-839479328749+","BOBB7"],}
I need to print the second element of each list (the part that says "BOBB7") for every entry in a larger dictionary. There are ten key-value pairs in it, so I started like this, but I am unsure how to continue because none of the examples I can find relate to my problem:
n=10
gene_list = []
while n>0:
Any help greatly appreciated.

Well, there's a bunch of ways to do it depending on how well-structured your data is.
racoona_valence={"rs13283416": ["7:87345874365-839479328749+","BOBB7"], "rs13283414": ["7:87345874365-839479328749+","BOBB4"]}
output = []
for key in racoona_valence.keys():
    output.append(racoona_valence[key][1])
print(output)
other_output = []
for key, value in racoona_valence.items():
    other_output.append(value[1])
print(other_output)
list_comprehension = [value[1] for value in racoona_valence.values()]
print(list_comprehension)
n = len(racoona_valence.values())-1
counter = 0
gene_list = []
while counter<=n:
    # index with counter (not n) so that every value is visited once
    gene_list.append(list(racoona_valence.values())[counter][1])
    counter += 1
print(gene_list)
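If the data is less well-structured, for example if some values have fewer than two elements, a guarded version might look like the sketch below. `second_elements` and the third dictionary entry are made up for illustration:

```python
def second_elements(mapping):
    """Collect the second element of each list value, skipping short lists."""
    result = []
    for value in mapping.values():
        if len(value) > 1:          # guard against values with no second element
            result.append(value[1])
    return result


racoona_valence = {
    "rs13283416": ["7:87345874365-839479328749+", "BOBB7"],
    "rs13283414": ["7:87345874365-839479328749+", "BOBB4"],
    "rs00000000": ["incomplete entry"],  # hypothetical malformed value
}
print(second_elements(racoona_valence))
```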

Here is a list comprehension that does what you want:
second_element = [x[1] for x in racoona_valence.values()]
Here is a for loop that does what you want:
second_element = []
for value in racoona_valence.values():
    second_element.append(value[1])
Here is a while loop that does what you want:
# don't use a while loop to loop over iterables, it's a bad idea
i = 0
second_element = []
dict_values = list(racoona_valence.values())
while i < len(dict_values):
    second_element.append(dict_values[i][1])
    i += 1
Regardless of which approach you use, you can see the results by doing the following:
for item in second_element:
    print(item)
For the example that you gave, this is the output:
BOBB7

Python insertion sorting a csv by row

My objective is to use an insertion sort to sort the contents of a CSV file by the numbers in the first column. For example, I want to take this:
[[7831703, Christian, Schmidt]
[2299817, Amber, Cohen]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[9713285, Alexander, Kirk]
[7025528, Janice, Lee]
[6441979, Sarah, Browning]
[8815776, Rick, Wallace]
[2395480, Martin, Weinstein]
[1927432, Stephen, Morrison]]
and sort it to:
[[1927432, Stephen, Morrison]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[2299817, Amber, Cohen]
[2395480, Martin, Weinstein]
[6441979, Sarah, Browning]
[7025528, Janice, Lee]
[7831703, Christian, Schmidt]
[8815776, Rick, Wallace]
[9713285, Alexander, Kirk]]
based on the numbers in the first column. My current Python code looks like this:
import csv
with open('EmployeeList.csv', newline='') as File:
    reader = csv.reader(File)
    readList = list(reader)
    for row in reader:
        print(row)
def insertionSort(readList):
    #Traverse through 1 to the len of the list
    for row in range(len(readList)):
        # Traverse through 1 to len(arr)
        for i in range(1, len(readList[row])):
            key = readList[row][i]
            # Move elements of arr[0..i-1], that are
            # greater than key, to one position ahead
            # of their current position
            j = i-1
            while j >=0 and key < readList[row][j] :
                readList[row] = readList[row]
                j -= 1
            readList[row] = key
insertionSort(readList)
print ("Sorted array is:")
for i in range(len(readList)):
    print ( readList[i])
The code can already sort the contents of a 2D array, but as written it tries to sort everything.
I thought removing the [] would fix it, but in testing it has not given me what I need.
To clarify: I want to reorder the rows based on the numerical value in the first column.

Sorry if I didn't understand your need correctly. But you have a list and you need to sort it? Why don't you just use the sort method of the list object?
>>> data = [[7831703, "Christian", "Schmidt"],
... [2299817, "Amber", "Cohen"],
... [1964394, "Gregory", "Hanson"],
... [1984288, "Aaron", "White"],
... [9713285, "Alexander", "Kirk"],
... [7025528, "Janice", "Lee"],
... [6441979, "Sarah", "Browning"],
... [8815776, "Rick", "Wallace"],
... [2395480, "Martin", "Weinstein"],
... [1927432, "Stephen", "Morrison"]]
>>> data.sort()
>>> from pprint import pprint
>>> pprint(data)
[[1927432, 'Stephen', 'Morrison'],
[1964394, 'Gregory', 'Hanson'],
[1984288, 'Aaron', 'White'],
[2299817, 'Amber', 'Cohen'],
[2395480, 'Martin', 'Weinstein'],
[6441979, 'Sarah', 'Browning'],
[7025528, 'Janice', 'Lee'],
[7831703, 'Christian', 'Schmidt'],
[8815776, 'Rick', 'Wallace'],
[9713285, 'Alexander', 'Kirk']]
>>>
Note that here the first element is parsed as an integer. This is important if you want to sort by numerical value (so that 99 comes before 100).
And don't be confused by the pprint import. You don't need it to sort; I just used it to get nicer output in the console.
Also note that list.sort() is an in-place method. It doesn't return a sorted list but sorts the list itself.
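Rows read with csv.reader come back as lists of strings, so the numeric comparison has to be made explicit. A small sketch, assuming the file has no header row and the ID is in the first column. The inline text stands in for EmployeeList.csv so the example runs on its own:

```python
import csv
import io

# Stand-in for the contents of EmployeeList.csv (assumed layout, no header).
csv_text = (
    "7831703,Christian,Schmidt\n"
    "2299817,Amber,Cohen\n"
    "1964394,Gregory,Hanson\n"
)

rows = list(csv.reader(io.StringIO(csv_text)))
# Every field is a string, so convert the first column to int in the sort key.
rows.sort(key=lambda row: int(row[0]))
for row in rows:
    print(row)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file object from `open('EmployeeList.csv', newline='')`.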
*** EDIT ***
Here are two different approaches to a sort function. Both could be heavily optimized, but I hope they give you some ideas about how this can be done. Both should work, and you can add some print calls in the loops to see what happens there.
First, the recursive version. It orders the list a little more on every run until it is fully ordered.
def recursiveSort(readList):
    # You don't want to mess with the original data, so we work on a copy of it
    data = readList.copy()
    changed = False
    res = []
    while len(data): # while 1 should work here as well because eventually we break the loop
        if len(data) == 1:
            # There is only one element left. Let's add it to the end of our result.
            res.append(data[0])
            break
        if data[0][0] > data[1][0]:
            # We compare the first two elements in the list.
            # If the first one is bigger, we remove the second element from the list and append it to the result.
            # Then we raise the changed flag to record that we changed the order of the list.
            res.append(data.pop(1))
            changed = True
        else:
            # Otherwise we remove the first element from the list and append it to the result.
            res.append(data.pop(0))
    if not changed:
        # if no changes have been made, the list is in order
        return res
    else:
        # if we made changes, we sort the list one more time.
        return recursiveSort(res)
And here is an iterative version, closer to your original function.
def iterativeSort(readList):
    res = []
    for i in range(len(readList)):
        print (res)
        # loop through the original list
        if len(res) == 0:
            # if we don't have any items in our result list, we add the first element here.
            res.append(readList[i])
        else:
            done = False
            for j in range(len(res)):
                # loop through the result list so far
                if res[j][0] > readList[i][0]:
                    # if our item is smaller than this element of the result list, we insert it here
                    res.insert(j, readList[i])
                    done = True
                    break
            if not done:
                # if our item is bigger than all the items in the result list, we put it last.
                res.append(readList[i])
    print(res)
    return res
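For comparison, here is a sketch of the textbook in-place insertion sort applied to whole rows, which is closer to what the question's code was attempting. It assumes the first column holds integer-like strings as they would come out of csv.reader. `insertion_sort_rows` is a made-up name:

```python
def insertion_sort_rows(rows):
    """Sort rows in place by the integer value of their first column."""
    for i in range(1, len(rows)):
        current = rows[i]            # the whole row moves, not a single cell
        key = int(current[0])
        j = i - 1
        # Shift rows with a larger first column one position to the right.
        while j >= 0 and int(rows[j][0]) > key:
            rows[j + 1] = rows[j]
            j -= 1
        rows[j + 1] = current
    return rows


employees = [
    ["7831703", "Christian", "Schmidt"],
    ["2299817", "Amber", "Cohen"],
    ["1964394", "Gregory", "Hanson"],
    ["1984288", "Aaron", "White"],
]
insertion_sort_rows(employees)
for row in employees:
    print(row)
```

The difference from the question's attempt is that the comparison looks only at column 0, while the assignment moves the entire row.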

Dictionary look up by value and return rank based on position?

My base data is a dictionary with keys being cities, and values being a list of names from position 0 to 9.
#base data
data
{'Newyork': ['Don Willis',
'Lewis Hamilton',
'Kimi Raikkonen',
'Daniel Ricciardo',
'Fernando Alonso',
'Max Verstappen',
'Nico Hulkenberg',
'Valtteri Bottas',
'Stoffel Vandoorne',
'Carlos Sainz'],
'Chicago': ['Don Willis',
'Fernando Alonso',...
find_city(name, rank) should, given the name of a contestant and a rank (rank 1 being index position 0), return a list of all the cities where the contestant received that rank.
find_city("Don Willis", 1) == ["Newyork", "Chicago", "Miami"]
find_city("Lewis Hamilton", 6) == [] # empty because Lewis Hamilton never ranked 6th in any city
Here's my attempt so far, but I have not made much progress. Any help?
def find_city(name, rank):
    data.get("Newyork",None)
    b=[]
    for i in enumerate(a):
        b.append(i)
    for (i,v) in b:

You don't have to enumerate here. Since the indexes are only offset by one, we can just subtract 1 from rank and look up with that value:
def find_city(name, rank):
    # the question expects [] when nothing matches, so return the list as is
    return [k for k in data if data[k][rank - 1] == name]

How about this?
def find_city(name, rank):
    city_list = []
    # use data.iteritems() instead on Python 2
    for c, l in data.items():
        if l[rank - 1] == name:
            city_list.append(c)
    return city_list
This will loop through each key:value pair in your data dictionary, and check whether the value in that city's result list at index position rank - 1 (because lists are zero-indexed, F1 drivers are not) matches the name. If yes, the key (the city name) is appended to city_list.
When done, it returns either a list of cities or, if none were found, an empty list.

When solving some of these programming challenges, try to think if you can reframe the question or task at hand in another way.
A more specific or simpler way to frame the task would be:
Create a function that checks, for the list stored at each key of a dictionary, whether a given value exists at a specific index.
data = {...}
def find_city(name, rank):
    keys = []
    for key in data:
        if len(data[key]) < rank:
            raise Exception("You tried to reach an index out of bounds")
        if data[key][rank-1] == name:
            keys.append(key)
    return keys

def find_city(name, rank):
    for i in data:
        try:
            if data[i].index(name) == rank - 1:
                yield i
        except ValueError:
            pass
It iterates over data, and if the list data[i] has the name at index rank - 1, it yields the city.
I used try because if the name is not in the list, index() raises ValueError and the program would crash.
P.S. A function that uses yield returns a generator, so after receiving the output you should convert it to a list with the list() function.
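For completeness, a short sketch of how such a generator is consumed. The data here is a made-up, shortened version of the dictionary in the question (three names per city instead of ten). I compare by position instead of calling index(), which is my own variation and also copes with ranks past the end of a list:

```python
# Hypothetical, shortened data; the real lists hold ten names each.
data = {
    "Newyork": ["Don Willis", "Lewis Hamilton", "Kimi Raikkonen"],
    "Chicago": ["Don Willis", "Fernando Alonso", "Lewis Hamilton"],
    "Miami": ["Don Willis", "Kimi Raikkonen", "Fernando Alonso"],
}


def find_city(name, rank):
    """Yield every city in which `name` sits at 1-based position `rank`."""
    for city, names in data.items():
        # The length check avoids an IndexError for ranks past the end.
        if 1 <= rank <= len(names) and names[rank - 1] == name:
            yield city


# list() drains the generator into a plain list.
print(list(find_city("Don Willis", 1)))
print(list(find_city("Lewis Hamilton", 6)))
```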

How to remove list elements within a loop effectively in python

I have a code as follows.
for item in my_list:
    print(item[0])
    temp = []
    current_index = my_list.index(item)
    garbage_list = creategarbageterms(item[0])
    for ele in my_list:
        if my_list.index(ele) != current_index:
            for garbage_word in garbage_list:
                if garbage_word in ele:
                    print("concepts: ", item, ele)
                    temp.append(ele)
    print(temp)
Now I want to remove ele from mylist when it gets appended to temp (so that it won't get processed in the main loop, as it is a garbage word).
I know it is bad to remove elements directly from a list while looping over it. Is there an efficient way of doing this?
For example, if mylist is as follows:
mylist = [["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["pudding", 298.2],
["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["berry_tim_tam", 171.9],
["tiramusu", 158.4], ["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]]
1st iteration
For the first element, tim_tam, I get garbage words such as yummy_tim_tam and berry_tim_tam, so they get added to my temp list.
Now I want to remove yummy_tim_tam and berry_tim_tam from the list (because they have already been added to temp), so that they are not processed again by the main loop.
2nd iteration
Since yummy_tim_tam is no longer in the list, this will process pudding. For pudding I get a different set of garbage words, such as chocolate_pudding, biscuits, and tiramusu. They get added to temp and removed from the list.
3rd iteration
ice_cream will be selected, and the process goes on.
My final objective is to get three separate lists as follows.
["tim_tam", 879.3000000000001], ["yummy_tim_tam", 315.0], ["berry_tim_tam", 171.9]
["pudding", 298.2], ["chocolate_pudding", 218.4], ["biscuits", 178.20000000000002], ["tiramusu", 158.4]
["ice_cream", 141.6], ["vanilla_ice_cream", 122.39999999999999]

This code produces what you want:
my_list = [['tim_tam', 879.3], ['yummy_tim_tam', 315.0], ['pudding', 298.2],
['chocolate_pudding', 218.4], ['biscuits', 178.2], ['berry_tim_tam', 171.9],
['tiramusu', 158.4], ['ice_cream', 141.6], ['vanilla_ice_cream', 122.39]
]
creategarbageterms = {'tim_tam' : ['tim_tam','yummy_tim_tam', 'berry_tim_tam'],
'pudding': ['pudding', 'chocolate_pudding', 'biscuits', 'tiramusu'],
'ice_cream': ['ice_cream', 'vanilla_ice_cream']}
all_data = {}
temp = []
for idx1, item in enumerate(my_list):
    if item[0] in temp: continue
    all_data[idx1] = [item]
    garbage_list = creategarbageterms[item[0]]
    for idx2, ele in enumerate(my_list):
        if idx1 != idx2:
            for garbage_word in garbage_list:
                if garbage_word in ele:
                    temp.append(ele[0])
                    all_data[idx1].append(ele)
for item in all_data.values():
    print('-', item)
This produces:
- [['tim_tam', 879.3], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]]
- [['pudding', 298.2], ['chocolate_pudding', 218.4], ['biscuits', 178.2], ['tiramusu', 158.4]]
- [['ice_cream', 141.6], ['vanilla_ice_cream', 122.39]]
Note that for the purpose of the example I created a mock creategarbageterms (as a dictionary rather than a function) that produces the term lists as you defined them in your post. Note the use of a dictionary keyed by the index of each group's first item, which allows an unlimited number of iterations, that is, an unlimited number of final lists.

I would propose to do it like this:
mylist = [["tim_tam", 879.3000000000001],
["yummy_tim_tam", 315.0],
["pudding", 298.2],
["chocolate_pudding", 218.4],
["biscuits", 178.20000000000002],
["berry_tim_tam", 171.9],
["tiramusu", 158.4],
["ice_cream", 141.6],
["vanilla_ice_cream", 122.39999999999999]]
d = set() # remembers unique keys, first one in wins
for i in mylist:
    shouldAdd = True
    for key in d:
        if i[0].find(key) != -1: # a key already in the set is part of this name
            shouldAdd = False # do not add it
    if not d or shouldAdd: # empty set or unique: add to set
        d.add(i[0])
myCleanList = [x for x in mylist if x[0] in d] # clean list to use only keys in set
print(myCleanList)
Output:
[['tim_tam', 879.3000000000001],
['pudding', 298.2],
['biscuits', 178.20000000000002],
['tiramusu', 158.4],
['ice_cream', 141.6]]
If the order of things in the list is not important, you could use a dictionary directly - and create a list from the dict.
If you need sublists, create them:
similarThings = [ [x for x in mylist if x[0].find(y) != -1] for y in d]
print(similarThings)
Output:
[
[['tim_tam', 879.3000000000001], ['yummy_tim_tam', 315.0], ['berry_tim_tam', 171.9]],
[['tiramusu', 158.4]],
[['ice_cream', 141.6], ['vanilla_ice_cream', 122.39999999999999]],
[['pudding', 298.2], ['chocolate_pudding', 218.4]],
[['biscuits', 178.20000000000002]]
]
As @joaquin pointed out in the comments, I am missing the creategarbageterms() function that groups tiramusu and biscuits with pudding, so this does not fit the question 100%. My answer is advocating: do not modify lists while iterating over them; use an appropriate set or dict to filter them into groups. Unique keys here are names that do not contain any earlier key as a substring.

You want to have an outer loop that's looping through a list, and an inner loop that can modify that same list.
I saw you got suggestions in the comments to simply not remove entries during the inner loop at all, but instead check if terms already are in temp. This is possible, and may be easier to read, but is not necessarily the best solution with respect to processing time.
I also see you received an answer from Patrick using dictionaries. This is probably the cleanest solution for your specific use-case, but does not address the more general question in your title which is specifically about removing items in a list while looping through it. If for whatever reason this is really necessary, I would propose the following:
idx = 0
while idx < len(my_list):
    item = my_list[idx]
    print(item[0])
    temp = []
    garbage_list = creategarbageterms(item[0])
    ele_idx = 0
    while ele_idx < len(my_list):
        ele = my_list[ele_idx]
        if ele_idx != idx and any(garbage_word in ele for garbage_word in garbage_list):
            print("concepts: ", item, ele)
            temp.append(ele)
            del my_list[ele_idx]
            if ele_idx < idx:
                # an earlier element was removed, so the current item moved one slot left
                idx -= 1
            # do not advance ele_idx: the next element now sits at this index
        else:
            ele_idx += 1
    print(temp)
    idx += 1
The key insight here is that, by using a while loop instead of a for loop, you can take more detailed, "manual" control of the control flow of the program, and more safely do "unconventional" things in your loop. I'd only recommend doing this if you really have to, though. This solution is closer to the literal question you asked, and closer to your own original code, but it may not be the easiest to read or the most Pythonic.
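If deleting in place is not a hard requirement, rebuilding the remaining list on every pass avoids the index bookkeeping altogether. This is a sketch under assumptions: `group_items`, `GARBAGE`, and `related_terms` are made-up names, and the dictionary stands in for the real creategarbageterms(), whose behaviour I can only guess from the question:

```python
# Hypothetical stand-in for creategarbageterms(), based on the question's example.
GARBAGE = {
    "tim_tam": ["yummy_tim_tam", "berry_tim_tam"],
    "pudding": ["chocolate_pudding", "biscuits", "tiramusu"],
    "ice_cream": ["vanilla_ice_cream"],
}


def related_terms(name):
    return GARBAGE.get(name, [])


def group_items(items, terms_for):
    """Group each head item with its related items without mutating `items`."""
    remaining = list(items)          # work on a copy
    groups = []
    while remaining:
        head = remaining.pop(0)
        terms = set(terms_for(head[0]))
        group = [head] + [e for e in remaining if e[0] in terms]
        # Rebuild the list instead of deleting elements while scanning it.
        remaining = [e for e in remaining if e[0] not in terms]
        groups.append(group)
    return groups


my_list = [["tim_tam", 879.3], ["yummy_tim_tam", 315.0], ["pudding", 298.2],
           ["chocolate_pudding", 218.4], ["biscuits", 178.2], ["berry_tim_tam", 171.9],
           ["tiramusu", 158.4], ["ice_cream", 141.6], ["vanilla_ice_cream", 122.4]]
for group in group_items(my_list, related_terms):
    print(group)
```

Each pass creates a new list, so this trades some memory for simpler control flow.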

Filtering a list of images by using a filter and association lists

I've got an assignment and part of it asks to define a process_filter_description. Basically I have a list of images I want to filter:
images = ["1111.jpg", "2222.jpg", "circle.JPG", "square.jpg", "triangle.JPG"]
Now I have an association list that I can use to filter the images:
assc_list = [ ["numbers", ["1111.jpg", "2222.jpg"]] , ["shapes", ["circle.JPG", "square.jpg", "triangle.JPG"]] ]
I can use a filter description to select which association list I want to apply (the filter keyword is enclosed by colons):
f = ':numbers:'
I'm not exactly sure how to start it. In words I can at least think:
Filter is ':numbers:'
Compare each term of images to each term associated with numbers in the association list.
If term matches, then append term to empty list.
Right now I am just trying to get my code to print only the terms from the numbers association list, but it prints out all of them.
def process_filter_description(f, images, ia):
    return_list = []
    f = f[1:-1]
    counter = 0
    if f == ia[counter][0]:
        #print f + ' is equal to ' + ia[counter][0]
        for key in ial:
            for item in key[1]:
                #print item
                #return_list.append(item)
    return return_list

Instead of an "association list", how about using a dictionary?
filter_assoc = {'numbers': ['1111.jpg', '2222.jpg'],
                'shapes': ['circle.JPG', 'square.jpg', 'triangle.JPG']}
Now, just see which images are in each group:
>>> filter_assoc['numbers']
['1111.jpg', '2222.jpg']
>>>
>>> filter_assoc['shapes']
['circle.JPG', 'square.jpg', 'triangle.JPG']
Your processing function would become immensely simpler:
def process_filter_description(filter, association):
    return association[filter[1:-1]]
Thinking aloud: if you have to keep the association list, this is a function that does the job of the dictionary lookup:
def process_filter_description(index, images, association):
    return_list = []
    index = index[1:-1]
    for top_level in association:
        if top_level[0] == index:
            for item in top_level[1]:
                return_list.append(item)
            break
    return return_list
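Neither version above actually looks at images. If the assignment wants only those entries of images that belong to the selected group, a sketch could look like this. Returning an empty list for an unknown keyword is my assumption, and "extra.png" is added only to show the filtering:

```python
def process_filter_description(f, images, ia):
    """Return the images that belong to the group named in f, e.g. ':numbers:'."""
    keyword = f.strip(":")                     # ':numbers:' -> 'numbers'
    for name, members in ia:
        if name == keyword:
            allowed = set(members)
            # Keep the order of `images`, drop anything outside the group.
            return [image for image in images if image in allowed]
    return []  # assumption: an unknown keyword matches nothing


images = ["1111.jpg", "2222.jpg", "circle.JPG", "square.jpg", "triangle.JPG", "extra.png"]
assc_list = [["numbers", ["1111.jpg", "2222.jpg"]],
             ["shapes", ["circle.JPG", "square.jpg", "triangle.JPG"]]]
print(process_filter_description(":numbers:", images, assc_list))
print(process_filter_description(":shapes:", images, assc_list))
```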
