Hi I am new to programming and want to learn python. I am working on a code that should return items that are most redundant in a list. If there are more than 1 then it should return all.
Ex.
List = ['a','b','c','b','d','a'] #then it should return both a and b.
List = ['a','a','b','b','c','c','d'] #then it should return a b and c.
List = ['a','a','a','b','b','b','c','c','d','d','d'] #then it should return a b and d.
Note: We don't know what element is most common in the list so we have to find the most common element and if there are more than one it should return all. If the list has numbers or other strings as elements then also the code has to work
I have no idea how to proceed. I can use a little help.
Here is the whole program:
from collections import Counter
def redundant(List):
c = Counter(List)
maximum = c.most_common()[0][1]
return [k for k, v in c.items()if v == maximum]
def find_kmers(DNA_STRING, k):
length = len(DNA_STRING)
a = 0
List_1 = []
string_1 = ""
while a <= length - k:
string_1 = DNA_STRING[a:a+k]
List_1.append(string_1)
a = a + 1
redundant(List_1)
This program should take DNA string and length of kmer and find what are the kemers of that length that are present in that DNA string.
Sample Input:
ACGTTGCATGTCGCATGATGCATGAGAGCT
4
Sample Output:
CATG GCAT
You can use collections.Counter:
from collections import Counter
def solve(lis):
c = Counter(lis)
mx = c.most_common()[0][1]
#or mx = max(c.values())
return [k for k, v in c.items() if v == mx]
print (solve(['a','b','c','b','d','a']))
print (solve(['a','a','b','b','c','c','d']))
print (solve(['a','a','a','b','b','b','c','c','d','d','d'] ))
Output:
['a', 'b']
['a', 'c', 'b']
['a', 'b', 'd']
A slightly different version of the above code using itertools.takewhile:
from collections import Counter
from itertools import takewhile
def solve(lis):
c = Counter(lis)
mx = max(c.values())
return [k for k, v in takewhile(lambda x: x[1]==mx, c.most_common())]
inputData = [['a','b','c','b','d','a'], ['a','a','b','b','c','c','d'], ['a','a','a','b','b','b','c','c','d','d','d'] ]
from collections import Counter
for myList in inputData:
temp, result = -1, []
for char, count in Counter(myList).most_common():
if temp == -1: temp = count
if temp == count: result.append(char)
else: break
print result
Output
['a', 'b']
['a', 'c', 'b']
['a', 'b', 'd']
>>> def maxs(L):
... counts = collections.Counter(L)
... maxCount = max(counts.values())
... return [k for k,v in counts.items() if v==maxCount]
...
>>> maxs(L)
['a', 'b']
>>> L = ['a','a','b','b','c','c','d']
>>> maxs(L)
['a', 'b', 'c']
>>> L = ['a','a','a','b','b','b','c','c','d','d','d']
>>> maxs(L)
['d', 'a', 'b']
Just for the sake of giving a solution not using collections & using list comprehensions.
given_list = ['a','b','c','b','d','a']
redundant = [(each, given_list.count(each)) for each in set(given_list) if given_list.count(each) > 1]
count_max = max(redundant, key=lambda x: x[1])[1]
final_list = [char for char, count in redundant if count == count_max]
PS - I myself haven't used Counters yet :( Time to learn!
Related
e.g. find substring containing 'a', 'b', 'c' in a string 'abca', answer should be 'abc', 'abca', 'bca'
Below code is what I did, but is there better, pythonic way than doing 2 for loops?
Another e.g. for 'abcabc' count should be 10
def test(x):
counter = 0
for i in range(0, len(x)):
for j in range(i, len(x)+1):
if len((x[i:j]))>2:
print(x[i:j])
counter +=1
print(counter)
test('abca')
You can condense it down with list comprehension:
s = 'abcabc'
substrings = [s[b:e] for b in range(len(s)-2) for e in range(b+3, len(s)+1)]
substrings, len(substrings)
# (['abc', 'abca', 'abcab', 'abcabc', 'bca', 'bcab', 'bcabc', 'cab', 'cabc', 'abc'], 10)
You can use combinations from itertools:
from itertools import combinations
string = "abca"
result = [string[x:y] for x, y in combinations(range(len(string) + 1), r = 2)]
result = [item for item in result if 'a' in item and 'b' in item and 'c' in item]
I'm doing a project for my school and for now I have the following code:
def conjunto_palavras_para_cadeia1(conjunto):
acc = []
conjunto = sorted(conjunto, key=lambda x: (len(x), x))
def by_size(words, size):
result = []
for word in words:
if len(word) == size:
result.append(word)
return result
for i in range(0, len(conjunto)):
if i > 0:
acc.append(("{} ->".format(i)))
acc.append(by_size(conjunto, i))
acc = ('[%s]' % ', '.join(map(str, acc)))
print( acc.replace(",", "") and acc.replace("'", "") )
conjunto_palavras_para_cadeia1(c)
I have this list: c = ['A', 'E', 'LA', 'ELA'] and what I want is to return a string where the words go from the smallest one to the biggest on in terms of length, and in between they are organized alphabetically. I'm not being able to do that...
OUTPUT: [;1 ->, [A, E], ;2 ->, [LA], ;3 ->, [ELA]]
WANTED OUTPUT: ’[1->[A, E];2->[LA];3->[ELA]]’
Taking a look at your program, the only issue appears to be when you are formatting your output for display. Note that you can use str.format to insert lists into strings, something like this:
'{}->{}'.format(i, sublist)
Here's my crack at your problem, using sorted + itertools.groupby.
from itertools import groupby
r = []
for i, g in groupby(sorted(c, key=len), key=len):
r.append('{}->{}'.format(i, sorted(g)).replace("'", ''))
print('[{}]'.format(';'.join(r)))
[1->[A, E];2->[LA];3->[ELA]]
A breakdown of the algorithm stepwise is as follows -
sort elements by length
group consecutive elements by length
for each group, sort sub-lists alphabetically, and then format them as strings
at the end, join each group string and surround with square brackets []
Shortest solution (with using of pure python):
c = ['A', 'E', 'LA', 'ELA']
result = {}
for item in c:
result[len(item)] = [item] if len(item) not in result else result[len(item)] + [item]
str_result = ', '.join(['{0} -> {1}'.format(res, sorted(result[res])) for res in result])
I will explain:
We are getting items one by one in loop. And we adding them to dictionary by generating lists with index of word length.
We have in result:
{1: ['A', 'E'], 2: ['LA'], 3: ['ELA']}
And in str_result:
1 -> ['A', 'E'], 2 -> ['LA'], 3 -> ['ELA']
Should you have questions - ask
My Question is that if we need to find the intersect between two strings?
How could we do that?
For example "address" and "dress" should return "dress".
I used a dict to implement my function, but I can only sort these characters and not output them with the original order? So how should I modify my code?
def IntersectStrings(first,second):
a={}
b={}
for c in first:
if c in a:
a[c] = a[c]+1
else:
a[c] = 1
for c in second:
if c in b:
b[c] = b[c]+1
else:
b[c] = 1
l = []
print a,b
for key in sorted(a):
if key in b:
cnt = min(a[key],b[key])
while(cnt>0):
l.append(key)
cnt = cnt-1
return ''.join(l)
print IntersectStrings('address','dress')
There are lots of intersecting strings. One way you could create a set of all substrings of each string and then intersect. If you want the biggest intersection just find the max from the resulting set, e.g.:
def substrings(s):
for i in range(len(s)):
for j in range(i, len(s)):
yield s[i:j+1]
def intersect(s1, s2):
return set(substrings(s1)) & set(substrings(s2))
Then you can see the intersections:
>>> intersect('address', 'dress')
{'re', 'ss', 'ess', 'es', 'ress', 'dress', 'dres', 'd', 'e', 's', 'res', 'r', 'dre', 'dr'}
>>> max(intersect('address', 'dress'), key=len)
'dress'
>>> max(intersect('sprinting', 'integer'), key=len)
'int'
I am looking to count the items in a list recursively. For example, I have a list few lists:
a = ['b', 'c', 'h']
b = ['d']
c = ['e', 'f']
h = []
I was trying to find a way in which I find out the length of list 'a'. But in list 'a' I have 'b', 'c' and 'h' ... hence my function then goes into list 'b' and counts the number of elements there... Then list 'c' and then finally list 'h'.
b = ['d']
c = ['e', 'f']
h = []
a = [b,c,h]
def recur(l):
if not l: # keep going until list is empty
return 0
else:
return recur(l[1:]) + len(l[0]) # add length of list element 0 and move to next element
In [8]: recur(a)
Out[8]: 3
Added print to help understand the output:
def recur(l,call=1):
if not l:
return 0
else:
print("l = {} and l[0] = {} on recursive call {}".format(l,l[0],call))
call+=1
return recur(l[1:],call) + len(l[0])
If you want to get more deeply nested lists you can flatten and get the len():
b = ['d']
c = ['e', 'f',['x', 'y'],["x"]]
h = []
a = [b,c,h]
from collections import Iterable
def flatten_nest(l):
if not l:
return l
if isinstance(l[0], Iterable) and not isinstance(l[0],basestring): # isinstance(l[0],str) <- python 3
return flatten_nest(l[0]) + flatten_nest(l[1:])
return l[:1] + flatten_nest(l[1:])
In [13]: len(flatten_nest(a))
Out[13]: 6
The solution that worked for me was this:
def recur(arr):
if not arr:
return 0
else:
return 1 + recur(arr[1:])
I have two lists that contain many of the same items, including duplicate items. I want to check which items in the first list are not in the second list. For example, I might have one list like this:
l1 = ['a', 'b', 'c', 'b', 'c']
and one list like this:
l2 = ['a', 'b', 'c', 'b']
Comparing these two lists I would want to return a third list like this:
l3 = ['c']
I am currently using some terrible code that I made a while ago that I'm fairly certain doesn't even work properly shown below.
def list_difference(l1,l2):
for i in range(0, len(l1)):
for j in range(0, len(l2)):
if l1[i] == l1[j]:
l1[i] = 'damn'
l2[j] = 'damn'
l3 = []
for item in l1:
if item!='damn':
l3.append(item)
return l3
How can I better accomplish this task?
You didn't specify if the order matters. If it does not, you can do this in >= Python 2.7:
l1 = ['a', 'b', 'c', 'b', 'c']
l2 = ['a', 'b', 'c', 'b']
from collections import Counter
c1 = Counter(l1)
c2 = Counter(l2)
diff = c1-c2
print list(diff.elements())
Create Counters for both lists, then subtract one from the other.
from collections import Counter
a = [1,2,3,1,2]
b = [1,2,3,1]
c = Counter(a)
c.subtract(Counter(b))
To take into account both duplicates and the order of elements:
from collections import Counter
def list_difference(a, b):
count = Counter(a) # count items in a
count.subtract(b) # subtract items that are in b
diff = []
for x in a:
if count[x] > 0:
count[x] -= 1
diff.append(x)
return diff
Example
print(list_difference("z y z x v x y x u".split(), "x y z w z".split()))
# -> ['y', 'x', 'v', 'x', 'u']
Python 2.5 version:
from collections import defaultdict
def list_difference25(a, b):
# count items in a
count = defaultdict(int) # item -> number of occurrences
for x in a:
count[x] += 1
# subtract items that are in b
for x in b:
count[x] -= 1
diff = []
for x in a:
if count[x] > 0:
count[x] -= 1
diff.append(x)
return diff
Counters are new in Python 2.7.
For a general solution to substract a from b:
def list_difference(b, a):
c = list(b)
for item in a:
try:
c.remove(item)
except ValueError:
pass #or maybe you want to keep a values here
return c
you can try this
list(filter(lambda x:l1.remove(x),li2))
print(l1)
Try this one:
from collections import Counter
from typing import Sequence
def duplicates_difference(a: Sequence, b: Sequence) -> Counter:
"""
>>> duplicates_difference([1,2],[1,2,2,3])
Counter({2: 1, 3: 1})
"""
shorter, longer = sorted([a, b], key=len)
return Counter(longer) - Counter(shorter)