Comparing two lists with specific values to read - python

I have two lists
list1 = ['01:15', 'abc', '01:15', 'def', '01:45', 'ghi' ]
list2 = ['01:15', 'abc', '01:15', 'uvz', '01:45', 'ghi' ]
and when I loop through the list
list_difference = []
for item in list1:
    if item not in list2:
        list_difference.append(item)
I managed to get the difference, but I need the time as well,
because it is a separate item, and a bare value like 'def' or 'uvz' means nothing to me in a list with a few thousand entries.
I tried converting the list to a dictionary, but repeated keys are overwritten, so only the last key:value pair survives, e.g. {'01:15' : 'def'}.

Convert the two lists to sets of tuples, then use the set difference operator.
set1 = set((list1[i], list1[i+1]) for i in range(0, len(list1), 2))
set2 = set((list2[i], list2[i+1]) for i in range(0, len(list2), 2))
list_difference = list(set1 - set2)

Reformat your data into (time, value) pairs, then do whatever you did before:
list1=list(zip(list1[::2],list1[1::2]))
list2=list(zip(list2[::2],list2[1::2]))
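As a minimal sketch of how the pairing and the comparison fit together (assuming both lists always alternate time, value and have an even length), the pairs from list1 can be filtered against a set built from list2. Unlike a plain set difference, this keeps the original order of list1:

```python
list1 = ['01:15', 'abc', '01:15', 'def', '01:45', 'ghi']
list2 = ['01:15', 'abc', '01:15', 'uvz', '01:45', 'ghi']

# Pair each time with the value that follows it.
pairs1 = list(zip(list1[::2], list1[1::2]))
# A set makes each membership test cheap for lists with thousands of entries.
pairs2 = set(zip(list2[::2], list2[1::2]))

# Keep the (time, value) pairs of list1 that do not occur in list2.
list_difference = [pair for pair in pairs1 if pair not in pairs2]
print(list_difference)  # [('01:15', 'def')]
```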

Related

Python - filter list from another list with condition

list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists and I want to find all the elements of list1 that do not have a matching entry in list2 under a filtering condition.
The condition is that both the 'm' directory (1m, 2m, ...) and the name of the geojson file, excluding the 'pre' or 'post' substring, must match.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings, but I am sure this is not the only way and there might be an easier one.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
I believe the first part of your file paths follows the same layout. If so, you can compare just the first 7 characters (e.g. '/mnt/1m'); note that this ignores the file name:
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
# Escape the dot so it matches a literal '.', and allow multi-digit folders such as 10m.
pattern = re.compile(r'(.*?/[0-9]+m/.*?)_pre\.geojson')
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
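A regex-free variant of the same idea, as a sketch only: it assumes every path ends in `<resolution folder>/<name>_pre.geojson` or `<name>_post.geojson`, and the helper name `tile_key` is made up for this illustration. Each path is reduced to a (folder, base name) key, so both parts of the condition are compared:

```python
from pathlib import PurePosixPath

list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']

def tile_key(path, suffix):
    """Return (resolution folder, file name without suffix and extension)."""
    p = PurePosixPath(path)
    stem = p.stem  # e.g. 'a_pre'
    if stem.endswith(suffix):
        stem = stem[:-len(suffix)]  # e.g. 'a'
    return p.parent.name, stem  # e.g. ('1m', 'a')

# Keys of everything that already has a post file.
processed = {tile_key(path, '_post') for path in list2}

# Keep the pre files whose key has no post counterpart.
result = [path for path in list1 if tile_key(path, '_pre') not in processed]
print(result)  # ['/mnt/2m/b_pre.geojson']
```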

Comparing 2 lists and printing the differences

I am trying to compare two lists and find the differences between them. For example, say list 1 consists of cat, dog, whale, hamster and list 2 consists of dog, whale, hamster. How would I compare the two and assign the difference (in this case, cat) to a variable? Order does not matter; however, if there is more than one difference, each should be assigned to an individual variable.
In my actual code I'm comparing HTML that consists of thousands of lines, so I would prefer something as fast as possible, but anything is appreciated :)
str1 = 'cat,dog,whale,hamster'
str2 = 'dog,whale,hamster'
Change strings into python sets:
set1 = set(str1.split(','))
set2 = set(str2.split(','))
Get the difference:
result = set1 - set2
Which prints:
{'cat'}
You can convert it to a list or a string:
result_as_list = list(result)
result_as_string = ','.join(result)
If your lists can contain duplicates or if you need to know the elements that are only in one of the two lists, you can use Counter (from the collections module):
list1 = ['cat','dog','whale','hamster','dog']
list2 = ['dog','whale','hamster','cow','horse']
from collections import Counter
c1,c2 = Counter(list1),Counter(list2)
differences = [*((c1-c2)+(c2-c1)).elements()]
print(differences) # ['cat', 'dog', 'cow', 'horse']
Here is one way to do it. The function below returns the elements that appear in only one of the two lists:
def Diff(list1, list2):
    li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
    return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
output:
['cat']
Here, diff holds the list ['cat'].
Now, if there is more than one difference:
def Diff(list1, list2):
    li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
    return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster','ostrich','yak','sheep','lion','tiger']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
The output will be:
['cat', 'ostrich', 'yak', 'sheep', 'lion', 'tiger']
You asked that, if there is more than one difference, each one be assigned to an individual variable.
For that, take the returned list and name it list3:
list3 = diff
Here, list3 = ['cat', 'ostrich', 'yak', 'sheep', 'lion', 'tiger']
Since there are only 6 items, we can assign a variable to each of them as follows:
v1=list3[0]
v2=list3[1]
v3=list3[2]
v4=list3[3]
v5=list3[4]
v6=list3[5]
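Since the question mentions thousands of lines, here is a hedged sketch of a faster variant: the `in` tests run against sets instead of lists, and the number of differences does not have to be known in advance. The names `first` and `rest` are only illustrative; in practice, keeping the differences in one list is usually easier than creating one variable per item.

```python
list1 = ['cat', 'dog', 'whale', 'hamster', 'ostrich', 'yak', 'sheep', 'lion', 'tiger']
list2 = ['dog', 'whale', 'hamster']

# Build the sets once so each membership test is cheap.
set1, set2 = set(list1), set(list2)

# Items only in list1, followed by items only in list2, in their original order.
diff = [x for x in list1 if x not in set2] + [x for x in list2 if x not in set1]
print(diff)  # ['cat', 'ostrich', 'yak', 'sheep', 'lion', 'tiger']

# Unpacking works for any number of differences (as long as there is at least one).
first, *rest = diff
print(first)  # cat
print(rest)   # ['ostrich', 'yak', 'sheep', 'lion', 'tiger']
```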

combine two strings in python

I have a list1 like this,
list1 = [('my', '1.2.3', 2),('name', '9.8.7', 3)]
I want to get a new list2 like this (joining first element with second element's second entry);
list2 = [('my2', 2),('name8', 3)]
As a first step, I am checking to join the first two elements in the tuple as follow,
for i,j,k in list1:
    #print(i,j,k)
    x = j.split('.')[1]
    y = str(i).join(x)
    print(y)
but I get this
2
8
I was expecting this;
my2
name8
What am I doing wrong? Is there a good, simple way to do this?
Try:
y = str(i) + str(x)
It should work.
str(i).join(x) treats x as an iterable of strings (a string is an iterable of its characters) and builds a new string by inserting i between the elements of x. Since x has a single character here, there is nothing to insert i between.
You probably want print('{}{}'.format(i, x)) instead:
for i,j,k in list1:
    x = j.split('.')[1]
    print('{}{}'.format(i, x))
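To see why the join call printed only 2 and 8, a small illustration of str.join next to plain concatenation may help (the sample strings are made up for the demonstration):

```python
# str.join puts the separator BETWEEN the items of the iterable.
joined_list = '-'.join(['a', 'b', 'c'])   # 'a-b-c'
joined_single = 'my'.join('2')            # '2': one character, nothing to separate
joined_multi = 'my'.join('238')           # '2my3my8'

# Concatenation is what the question actually needs.
concatenated = 'my' + '2'                 # 'my2'

print(joined_list, joined_single, joined_multi, concatenated)
```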
Try this:
for x in list1:
    print(x[0] + x[1][2])
or
for x in list1:
    print(x[0] + x[1].split('.')[1])
output
# my2
# name8
You should be able to achieve this via f strings and list comprehension, though it'll be pretty rigid.
list_1 = [('my', '1.2.3', 2),('name', '9.8.7', 3)]
# for item in list_1
# create tuple of (item[0], item[1].split('.')[1], item[2])
# append to a new list
list_2 = [(f"{item[0]}{item[1].split('.')[1]}", item[2]) for item in list_1]  # item[2] stays an int, as in the desired output
print(list_2)
List comprehensions (and dict comprehensions) are some of my favorite things about python3
https://www.pythonforbeginners.com/basics/list-comprehensions-in-python
https://www.digitalocean.com/community/tutorials/understanding-list-comprehensions-in-python-3
Going with the author's theme,
list1 = [('my', '1.2.3', 2),('name', '9.8.7', 3)]
for i,j,k in list1:
    extracted = j.split(".")
    y = i+extracted[1] # specified the index here instead
    print(y)
my2
name8

Matching and Combining Multiple 2D lists in Python

I am trying to combine two 2D lists based on a common value in both lists.
The values within each list are unique, so there is no need to account for duplicate entries.
The example is:
list1 = [['hdisk37', '00f7e0b88577106a']]
list2 = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
With the desired result of:
list3 = [['hdisk37', '00f7e0b88577106a','2']]
The common value is at list1[0][1] and list2[1][1].
The pythonic way to get the needed result using set objects:
list1 = [['hdisk37', '00f7e0b88577106a']]
list2 = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
set1 = set(list1[0])
list3 = [list(set1 | s) for s in map(set, list2) if set1 & s]
print(list3)
The output (sets are unordered, so the order of the elements may vary):
[['00f7e0b88577106a', '2', 'hdisk37']]
set1 & s is the intersection of the two sets (a new set with the elements common to both).
set1 | s is the union of the two sets.
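Because a set union does not preserve position, the result may not come out in the column order of the desired list3. A minimal sketch that keeps that order, assuming the common value is always the second field of every inner list (the name `id_to_number` is made up for this illustration):

```python
list1 = [['hdisk37', '00f7e0b88577106a']]
list2 = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]

# Index list2 by its second field so each lookup is a single dict access.
id_to_number = {disk_id: number for number, disk_id in list2}

# Append the matching number to every list1 row that has one.
list3 = [
    [name, disk_id, id_to_number[disk_id]]
    for name, disk_id in list1
    if disk_id in id_to_number
]
print(list3)  # [['hdisk37', '00f7e0b88577106a', '2']]
```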
Try this:
result = []
for inner_list1 in list1:
    for inner_list2 in list2:
        set1 = set(inner_list1)
        set2 = set(inner_list2)
        if set1.intersection(set2):
            result.append(list(set1.union(set2)))
For each pair of inner lists, check whether their intersection is non-empty. If it is, the two are merged and added to the final result.
This method returns all the possible "second value" matches as a dict, from the second value to the resulting list. It also takes an arbitrary number of these lists of lists (not just two).
import collections
a = [['hdisk37', '00f7e0b88577106a']]
b = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
def combine(*lols): # list of lists
    ret = collections.defaultdict(set)
    for lol in lols:
        for l in lol:
            ret[l[1]].add(l[1])
            ret[l[1]].add(l[0])
    return {k:list(v) for k,v in ret.items()}
print(combine(a, b))
Output (the order inside each list may vary, because sets are unordered):
$ python test.py
{'00f7e0b8cee02cd6': ['00f7e0b8cee02cd6', '1'], '00f7e0b88577106a': ['hdisk37', '2', '00f7e0b88577106a']}
To get the entry for the common value in your example, you'd do:
combine(a, b).get('00f7e0b88577106a')
If you want to try something different, you could use reduce:
from functools import reduce
merger = lambda x, y: set(x) | set(y) if set(x) & set(y) else x
results = []
for item in list1:
    result = reduce(merger, [item] + list2)
    if isinstance(result, set):
        results.append(result)
print(results)

Sum of lists for each element of list1 with all in list2

I want to make a script that reads lines from a file, then takes slices from each line, combines all slices from line 1 with all slices from line 2, and then combines the results of the previous step with the slices from line 3.
For example, we have
Stackoverflow (4)
python (3)
question (3)
I get first list with slices of (number) letters.
lst = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
Then i need to combine it with second list:
lst = ['pyt', 'yth', 'tho', 'hon']
Desired output:
finallist = ['Stacpyt', 'tackpyt', 'ackopyt', 'ckovpyt', 'kovepyt', 'overpyt', 'verfpyt', 'erflpyt', 'rflopyt', 'flowpyt', 'Stacyth', 'tackyth', 'ackoyth', 'ckovyth', 'koveyth', 'overyth', 'verfyth', 'erflyth', 'rfloyth', 'flowyth', ..... , 'erflhon', 'rflohon', 'flowhon']
then with 3rd list:
lst = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
finallist = ['Stacpytque', 'tackpytque', 'ackopytque', 'ckovpytque', 'kovepytque', 'overpytque', 'verfpytque', 'erflpytque', 'rflopytque', .... 'erflhonion', 'rflohonion', 'flowhonion']
I am stuck at the point where I need to build finallist from the combined results.
I am trying pieces of code like this, but it's wrong:
for i in lst:
    for y in finallist:
        finallist.append(i + y)
So if finallist is empty, it should copy lst in the first loop iteration, and if finallist is not empty, it should combine each of its elements with lst, and so on.
I used re.match() to get the word and the integer value from each line of your file.
Then, I compute all the sliced subwords of each word and add them to a list, which is then appended to a global list.
Finally, I compute all the combinations you are looking for thanks to itertools.product(), which behaves like a nested for-loop.
Then, .join() each tuple obtained and you get the final list you wanted.
from itertools import product
from re import match
the_lists = []
with open("filename.txt", "r") as file:
    for line in file:
        m = match(r'(.*) \((\d+)\)', line)
        word = m.group(1)
        num = int(m.group(2))
        the_list = [word[i:i+num] for i in range(len(word) - num + 1)]
        the_lists.append(the_list)
combinaisons = product(*the_lists)
final_list = ["".join(c) for c in combinaisons]
Use itertools:
import itertools
list1 = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
list2 = ['pyt', 'yth', 'tho', 'hon']
list3 = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
final_list = list(itertools.product(list1, list2, list3))
This gives you all combinations as flat tuples; join each tuple with ''.join() to get your strings.
import itertools
def combine(lst):
    result = list(itertools.product(*lst))
    result = [''.join(item) for item in result]
    return result
list1 = ['Stac', 'tack', 'acko', 'ckov', 'kove', 'over', 'verf', 'erfl', 'rflo', 'flow']
list2 = ['pyt', 'yth', 'tho', 'hon']
list3 = ['que', 'ues', 'est', 'sti', 'tio', 'ion']
lst = [list1, list2, list3] # append more lists to lst, then pass lst to combine()
print(combine(lst))
Append all of the candidate lists to lst; the combine() function then generates every combination and returns the result as a list.
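One detail worth noting: itertools.product varies the last list fastest ('Stacpytque', 'Stacpytues', ...), while the desired output in the question varies the first list fastest ('Stacpytque', 'tackpytque', ...). If that order matters, here is a sketch of the incremental loop the question describes. The helper names `slices` and `combine_in_question_order` are made up for this illustration, and the words are hard-coded instead of being read from a file:

```python
def slices(word, size):
    """All substrings of `word` with length `size`, left to right."""
    return [word[i:i + size] for i in range(len(word) - size + 1)]

def combine_in_question_order(lists):
    # Start with one empty prefix so the first list is copied as-is.
    final_list = [""]
    for lst in lists:
        # Build a NEW list instead of appending to the one being iterated.
        final_list = [prefix + piece for piece in lst for prefix in final_list]
    return final_list

lists = [slices("Stackoverflow", 4), slices("python", 3), slices("question", 3)]
final_list = combine_in_question_order(lists)

print(len(final_list))   # 240
print(final_list[:2])    # ['Stacpytque', 'tackpytque']
print(final_list[-1])    # flowhonion
```

Building a new list on every pass also avoids the problem in the question's loop, which appends to finallist while iterating over it.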
