restoring list from unified_diff output - python

I need to generate the diff between two arrays of strings:
a=['1','2']
b=['1','2','3']
To achieve this I'm using the difflib library in Python (2.6):
c=difflib.unified_diff(a,b)
and I save the content of
d=list(c)
which is something like:
['--- \n', '+++ \n', '@@ -1,2 +1,3 @@\n', ' 1', ' 2', '+3']
How can I build the second array from the first using the output of the unified_diff function?
The behavior that I'm looking for is something like:
>>> merge(a,d)
['1','2','3']
P.S. The array can have duplicate entries, and the order in which each entry appears is important for my application. Moreover, from one iteration to the next there can be changes at the beginning or in the middle of the array, as well as new entries added at the end.

I'm not sure my sample is good style, but you can use something like this:
from collections import Counter
a=['1','2']
b=['1','2','3']
a.extend(b)
[k for k,v in Counter(a).items() if v == 1]
OR, if your lists contain only unique items:
list(set(a) ^ set(b))
OR:
missed_in_a = [x for x in a if x not in b]
missed_in_b = [x for x in b if x not in a]
OR:
a=['1','2']
b=['1','2','3']
c = [x for x in a]
c.extend(b)
diff = [x for x in c if a.count(x)+b.count(x) == 1]
The last one (I hope I understand you correctly now; sorry if not):
a = ['1','2','3','4']
b = ['2','2','3','6','5']
from difflib import unified_diff
def merge(a,b):
    output = []
    # skip the three header lines ('---', '+++' and '@@ ... @@')
    for line in list(unified_diff(a,b))[3:]:
        if line.startswith('+'):
            # added line: drop only the leading '+' marker
            output.append(line[1:])
        elif not line.startswith('-'):
            # context line: drop the leading space marker
            output.append(line[1:])
    return output
print merge(a,b)
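The function above recomputes the diff from both lists and only works while everything fits in a single hunk. A minimal sketch of the merge(a, d) the question asks for, which rebuilds the second list from the first list plus the saved diff, could look like this. It is written for Python 3 and assumes the hunk header format that Python 3's difflib emits (where a count of 0 means the start number refers to the line before the hunk); HUNK_RE and the variable names are my own, not part of difflib.

```python
import difflib
import re

# Matches hunk headers such as '@@ -1,2 +1,3 @@' or '@@ -5 +5,2 @@'.
HUNK_RE = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')

def merge(a, diff_lines):
    """Rebuild the second list from `a` and unified_diff(a, b) output."""
    result = []
    pos = 0          # index of the next element of `a` not yet consumed
    in_hunk = False  # False while we are still in the '---'/'+++' header
    for line in diff_lines:
        match = HUNK_RE.match(line)
        if match:
            start = int(match.group(1))
            count = int(match.group(2)) if match.group(2) is not None else 1
            # An empty range (count 0) starts *after* line `start`.
            hunk_start = start - 1 if count else start
            # Copy the unchanged elements that the diff left out.
            result.extend(a[pos:hunk_start])
            pos = hunk_start
            in_hunk = True
        elif not in_hunk:
            continue                   # file header lines
        elif line.startswith('+'):
            result.append(line[1:])    # added element
        elif line.startswith('-'):
            pos += 1                   # removed element: skip it in `a`
        else:
            result.append(a[pos])      # context element, unchanged
            pos += 1
    result.extend(a[pos:])             # unchanged tail after the last hunk
    return result

a = ['1', '2']
b = ['1', '2', '3']
d = list(difflib.unified_diff(a, b))
print(merge(a, d))  # ['1', '2', '3']
```

Because it reads the line numbers from each hunk header, this should also cope with several hunks and with duplicate entries, as long as d was produced from the same a.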

Related

How can I convert a list comprehension to a normal for loop?

I'm trying to learn how I can convert Python list comprehensions to a normal for-loop.
I have been trying to understand it from the pages on the net, however when I'm trying myself, I can't seem to get it to work.
What I am trying to convert is the following:
1:
n, m = [int(i) for i in inp_lst[0].split()]
and this one (which is a little bit harder):
2:
lst = [[int(x) for x in lst] for lst in nested[1:]]
However, I am having no luck with it.
What I have tried:
1:
n = []
for i in inp_lst[0].split():
    n.append(int(i))
print(n)
If I can get some help, I will really appreciate it :D
Generally speaking, a list comprehension like:
a = [b(c) for c in d]
can be written using a for loop as:
a = []
for c in d:
    a.append(b(c))
Something like:
a, b = [c(d) for d in e]
might be generalized to:
temp = []
for d in e:
    temp.append(c(d))
a, b = temp
Something like:
lst = [[int(x) for x in lst] for lst in nested[1:]]
is no different:
lst = []
for inner_lst in nested[1:]:
    lst.append([int(x) for x in inner_lst])
If we expand that inner list comprehension:
lst = []
for inner_lst in nested[1:]:
    temp = []
    for x in inner_lst:
        temp.append(int(x))
    lst.append(temp)
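To check the expansions against the asker's two comprehensions, here is a small runnable sketch. The contents of inp_lst and nested are made up for illustration, since the question does not show them.

```python
# Hypothetical sample data; the question does not say what these contain.
inp_lst = ["3 4", "unused"]
nested = [["header"], ["1", "2"], ["3", "4"]]

# 1: n, m = [int(i) for i in inp_lst[0].split()]
temp = []
for i in inp_lst[0].split():
    temp.append(int(i))
n, m = temp  # unpacking needs exactly two values in temp

# 2: lst = [[int(x) for x in lst] for lst in nested[1:]]
lst = []
for inner_lst in nested[1:]:
    row = []
    for x in inner_lst:
        row.append(int(x))
    lst.append(row)

print(n, m)  # 3 4
print(lst)   # [[1, 2], [3, 4]]
```

Note that the asker's own attempt stops at the list; the missing step is the final unpacking into n and m.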

How to reverse a sublist in python

Given the following list:
a = ['aux iyr','bac oxr','lmn xpn']
c = []
for i in a:
    x = i.split(" ")
    b = x[1][::-1]  # got stuck after this line
Can anyone help me join it back into the original list to produce the expected output?
output = ['aux ryi','bac rxo','lmn npx']
I believe you need two lines of code, first splitting the values:
b = [x.split() for x in a]
Which returns:
[['aux', 'iyr'], ['bac', 'oxr'], ['lmn', 'xpn']]
And then reversing the second word of each pair:
output = [x[0] +' '+ x[1][::-1] for x in b]
Which returns:
['aux ryi', 'bac rxo', 'lmn npx']
You can use the following simple comprehension:
[" ".join((x, y[::-1])) for x, y in map(str.split, a)]
# ['aux ryi', 'bac rxo', 'lmn npx']
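If you would rather keep the loop from the question, one way to finish it is to build the joined string and append it to a result list. This is only a sketch and assumes every entry consists of exactly two words separated by a single space.

```python
a = ['aux iyr', 'bac oxr', 'lmn xpn']
output = []
for item in a:
    # Assumes exactly two space-separated words per entry.
    first, second = item.split(" ")
    # Keep the first word, reverse the second, and join them again.
    output.append(first + " " + second[::-1])

print(output)  # ['aux ryi', 'bac rxo', 'lmn npx']
```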

Python, toggle variables in a script

I've got a script running that I want to toggle between different variables.
Let's say I've got a list of URLs and I want to append one of the variables a, b or c to each. I don't care which one goes with which URL, but the variables should repeat in order while the list is run through only once.
v would end up looking like
url1+string1
url2+string2
url3+string3
url4+string1
url5+string2
etc
def function1():
    list = [url1,url2,url3,url4,url5.......]
    a = 'string1'
    b = 'string2'
    c = 'string3'
    for i in list:
        v = i+(a then b then c then a then b then c)
I was able to get this to work on my own, but I'm new and still learning. Does anyone have a more elegant solution?
a = 'a'
b = 'b'
c = 'c'
list1 = ['string1','string2','string3','string4','string5','string6','string7','string8']
list2 = [a, b, c]
c = 0
for i in list1:
    if c == len(list2):
        c = 0
    vv = i + list2[int(c)]
    c = c + 1
    print vv
It returns what I was looking for, but it's messy:
string1a
string2b
string3c
string4a
string5b
string6c
string7a
string8b
You can utilise itertools.cycle to repeat one of the iterables, eg:
from itertools import cycle, izip
list1 = ['string1','string2','string3','string4','string5','string6','string7','string8']
list2 = ['a', 'b', 'c']
for fst, snd in izip(list1, cycle(list2)):
    print fst + snd # or whatever
#string1a
#string2b
#string3c
#string4a
#string5b
#string6c
#string7a
#string8b
Note that while cycle will repeat its elements indefinitely, izip stops on the shortest iterable (list1).
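On Python 3 there is no izip; the built-in zip is already lazy and stops at the shortest input in the same way. A minimal sketch of the same idea, using the sample data from the question (the name combined is mine):

```python
from itertools import cycle

list1 = ['string1', 'string2', 'string3', 'string4',
         'string5', 'string6', 'string7', 'string8']
list2 = ['a', 'b', 'c']

# zip stops when list1 runs out, even though cycle(list2) never ends.
combined = [fst + snd for fst, snd in zip(list1, cycle(list2))]

for value in combined:
    print(value)
```

Do not call list() on cycle(list2) by itself, since it never terminates; always pair it with a finite iterable as above.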
list = [url1..1]
list2 = [a, b, c]
for i in list:
    for a in list2:
        v = i+a
EDIT
Okay, that makes more sense.
How about this...
set_it = iter(list2)
for i in list:
    try:
        a = set_it.next()
    except StopIteration:
        # list2 is exhausted, so start over from its first element
        set_it = iter(list2)
        a = set_it.next()
    v = i+a
Though I feel the same as you: there is probably a smoother way to accomplish that...

Python to filter a comma-separated list on another csv list

I have two strings:
s1 = "Brendon, Melissa, Jason, , McGuirk" #the guaranteed string in format "x, y, z"
s2 = "brandon,melissa,jxz ,paula,coach" #the messy string
and would like to create a Python (2.7) list that uses the value in l1 if it exists, and otherwise passes through the value in l2. I have working code, but even with the list comprehensions, I feel like there may be a more Pythonic way of doing this. Any ideas what that might be?
l1 = [x.strip() for x in s1.split(',')]
l2 = [x.strip() for x in s2.split(',')]
f = lambda s: s[1] if s[1] else s[0]
final = [f(x) for x in zip(l2, l1)]
The list "final" now contains:
['Brendon', 'Melissa', 'Jason', 'paula', 'McGuirk']
Which is correct.
------- edit
So, looking at Jon's answer below, a or b seems like the simplest, most readable approach. I moved the string cleaning to a small function, and ended up with this. Any further improvements to make?
trim_csv = lambda csv: [s.strip() for s in csv.split(',')]
print [a or b for a, b in zip(trim_csv(s1), trim_csv(s2))]
Works for your example
s1 = "Brendon, Melissa, Jason, , McGuirk"
s2 = "brandon, melissa, jxz, paula, coach"
print [a or b for a, b in zip(s1.split(', '), s2.split(', '))]
Slightly more generic one that can be adapted:
import re
from itertools import izip_longest, ifilter, imap
s1 = "Brendon, Melissa, Jason, , McGuirk"
s2 = "brandon, melissa, jxz, paula, coach"
def take_first_not_empty(*args):
    splitter = re.compile(r'\s*?,\s*').split
    words = imap(splitter, args)
    return [next(ifilter(None, vals), '') for vals in izip_longest(*words, fillvalue='')]
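For readers on Python 3, where izip_longest, ifilter and imap no longer exist, roughly the same idea can be sketched as follows. The helper split_csv is my own name, and I use str.split plus strip instead of the regex so that stray spaces around fields (as in the asker's messy string) are also removed.

```python
from itertools import zip_longest

def split_csv(text):
    # Split on commas and trim whitespace around every field.
    return [field.strip() for field in text.split(',')]

def take_first_not_empty(*csv_strings):
    # Line the fields up position by position, padding short inputs with ''.
    rows = zip_longest(*(split_csv(s) for s in csv_strings), fillvalue='')
    # For each position, keep the first non-empty field ('' if none exists).
    return [next((value for value in values if value), '') for values in rows]

s1 = "Brendon, Melissa, Jason, , McGuirk"
s2 = "brandon,melissa,jxz ,paula,coach"
print(take_first_not_empty(s1, s2))
# ['Brendon', 'Melissa', 'Jason', 'paula', 'McGuirk']
```

Earlier arguments take priority, so passing s1 first gives the behaviour the question asks for.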
Something like this?
>>> s1 = "Brendon, Melissa, Jason, , McGuirk"
>>> s2 = "brandon, melissa, jxz, paula, coach"
>>> [x if x else y for x,y in zip( s1.split(', '),s2.split(', '))]
['Brendon', 'Melissa', 'Jason', 'paula', 'McGuirk']

How to split a list into subsets based on a pattern?

I'm doing this, but it feels like it could be achieved with much less code. It is Python, after all. Starting with a list, I split that list into subsets based on a string prefix.
# Splitting a list into subsets
# expected outcome:
# [['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]
mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']
def func(l, newlist=None, index=0):
    if newlist is None:
        # a mutable default ([]) would be shared between separate calls
        newlist = []
    newlist.append([i for i in l if i.startswith('sub_%s' % index)])
    # create a new list without the items in newlist
    l = [i for i in l if i not in newlist[index]]
    if len(l):
        index += 1
        func(l, newlist, index)
    return newlist
print func(mylist)
You could use itertools.groupby:
>>> import itertools
>>> mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']
>>> for k,v in itertools.groupby(mylist,key=lambda x:x[:5]):
...     print k, list(v)
...
sub_0 ['sub_0_a', 'sub_0_b']
sub_1 ['sub_1_a', 'sub_1_b']
or exactly as you specified it:
>>> [list(v) for k,v in itertools.groupby(mylist,key=lambda x:x[:5])]
[['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]
Of course, the usual caveats apply (make sure your list is sorted with the same key you're using to group), and you might need a slightly more complicated key function for real-world data...
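To illustrate that caveat: groupby only merges neighbouring items with equal keys, so an unsorted list ends up fragmented. A small Python 3 sketch, where the prefix helper is my own and assumes names of the form sub_<n>_<suffix>:

```python
from itertools import groupby

def prefix(name):
    # 'sub_0_a' -> 'sub_0'; assumes the suffix follows the last underscore.
    return name.rsplit('_', 1)[0]

unsorted_list = ['sub_1_a', 'sub_0_a', 'sub_1_b', 'sub_0_b']

# Without sorting, every change of key starts a new group.
fragmented = [list(group) for _, group in groupby(unsorted_list, key=prefix)]
print(fragmented)
# [['sub_1_a'], ['sub_0_a'], ['sub_1_b'], ['sub_0_b']]

# Sorting by the same key first brings equal keys together.
ordered = sorted(unsorted_list, key=prefix)
grouped = [list(group) for _, group in groupby(ordered, key=prefix)]
print(grouped)
# [['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]
```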
In [28]: mylist = ['sub_0_a', 'sub_0_b', 'sub_1_a', 'sub_1_b']
In [29]: lis=[]
In [30]: for x in mylist:
   ....:     i=x.split("_")[1]
   ....:     try:
   ....:         lis[int(i)].append(x)
   ....:     except IndexError:
   ....:         lis.append([])
   ....:         lis[-1].append(x)
   ....:
In [31]: lis
Out[31]: [['sub_0_a', 'sub_0_b'], ['sub_1_a', 'sub_1_b']]
Use itertools' groupby:
from itertools import groupby
def get_field_sub(x): return x.split('_')[1]
mylist = sorted(mylist, key=get_field_sub)
[ (x, list(y)) for x, y in groupby(mylist, get_field_sub)]
