counting occurrence of consecutive elements in a list (python) - python

Given a list
a=['g','d','r','x','s','g','d','r']
I want to count, for example, the occurrences of gdr, so I would call some function such as
a.function('g','d','r') and get back 2 (because gdr occurs 2 times).

If they are only strings in the list, you can join them in a long string and use its count method:
>>> a=['g','d','r','x','s','g','d','r']
>>> ''.join(a).count('gdr')
2
If you have a mix of strings and numbers for example, you may use map(str,a) before joining and the method would still work:
>>> a=[1,13,4,2,1,13,4]
>>> ''.join(map(str,a)).count('1134')
2
If you need all this packed in a function, you can have something like this:
def count_pattern(lst, *pattern):
    # Join the list and the pattern pieces, then count substring matches
    return ''.join(lst).count(''.join(pattern))
And will be able to call it: count_pattern(a, 'g', 'd', 'r') or count_pattern(a, 'gdr') or even count_pattern(a, 'gd', 'r').
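One caveat with joining: once the elements are glued together, the element boundaries are gone, so [11, 3, 4] would also match '1134'. A minimal sketch of an alternative that compares list slices instead is shown here; the name count_consecutive is made up for illustration. Unlike str.count, it also counts overlapping matches.

```python
def count_consecutive(items, pattern):
    """Count positions where `pattern` appears as a contiguous run in `items`."""
    items = list(items)
    pattern = list(pattern)
    n = len(pattern)
    if n == 0:
        return 0
    # Slide a window of length n over the list and compare element by element.
    return sum(1 for i in range(len(items) - n + 1) if items[i:i + n] == pattern)

a = ['g', 'd', 'r', 'x', 's', 'g', 'd', 'r']
print(count_consecutive(a, ['g', 'd', 'r']))      # 2
print(count_consecutive([11, 3, 4], [1, 13, 4]))  # 0, boundaries are respected
```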

You can do that as:
def get_occur(lst, seq):
    return ''.join(lst).count(''.join(seq))
>>> get_occur(['a', 's', 'a', 's'], ['a', 's'])
2
If you are passing in strings, you can do:
return your_string.count(pattern)

What about:
>>> a=['g','d','r','x','s','g','d','r']
>>> print("".join(a).count("gdr"))
2

Related

How to get items from a list that contain all specific characters from another one

I have 2 lists and I would like to create a third one that contains only the items of the first that include all characters from the second one.
I tried some ideas with range(), for, len(), etc., but had no success at all :/
e.g.
all_types = ['T','L','R','B','TL','TR','TB','LR','LB','BR','TLR','TLB','TRB','LRB','TBLR']
chars = ['R', 'B']
To
selected_types = ['BR', 'TRB', 'LRB', 'TBLR']
selected_types = [t for t in all_types if all(char in t for char in chars)]
You could use a set for chars and use its issubset() method to filter elements of your list:
all_types = ['T','L','R','B','TL','TR','TB','LR','LB','BR','TLR','TLB','TRB','LRB','TBLR']
chars = {'R', 'B'}
selected_types = [ t for t in all_types if chars.issubset(t) ]
# ['BR', 'TRB', 'LRB', 'TBLR']
If you can't change the type of the chars variable to a set for some reasons, you could use a filter with a temporary set built on the fly:
selected_types = list(filter(set(chars).issubset, all_types))
all_types = ['T','L','R','B','TL','TR','TB','LR','LB','BR','TLR','TLB','TRB','LRB','TBLR']
chars = ['R', 'B']
selected_types = []
for t in all_types:
    if all([c in t for c in chars]):
        selected_types.append(t)

making the loop to continue to produce for outputs

Here is the code I have:
def generate(x):
    two = {}
    for x in range(1,7315):
        two.update({vowels[random.randint(0,4)] + alpha[random.randint(0,21)]:0})
        return two
generate(x)
This only returns a single value. How could I make it return multiple values?
Return a tuple with your values:
def returnTwoNumbers():
    return (1, 0)
print(returnTwoNumbers()[0])
print(returnTwoNumbers()[1])
#output:
#1
#0
It also looks like you're trying to get a random vowel from your list of vowels. Using random.choice is a much more readable way to get a random item from a list:
import random
vowelList = ['a', 'e', 'i', 'o', 'u']
print (random.choice(vowelList))
You can use a tuple to return multiple values from a function e.g.:
return (one, two, three)
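You have the wrong indentation: a return inside the for loop exits the function on the first iteration. Put it after the loop:
def generate():
    two = {}
    for x in range(1,7315):
        two.update({vowels[random.randint(0,4)] + alpha[random.randint(0,21)]:0})
    return two
twos = generate()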
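For reference, here is a self-contained sketch that combines the indentation fix with random.choice. The question never shows vowels or alpha, so the two lists below are assumed stand-ins (alpha is taken to be the 21 consonants), and the count parameter is added only for illustration.

```python
import random

# Assumed stand-ins: the question does not show how these lists are defined.
vowels = ['a', 'e', 'i', 'o', 'u']
alpha = list('bcdfghjklmnpqrstvwxyz')  # assumption: the 21 consonants

def generate(count=7314):
    """Build a dict of random vowel+consonant keys, each mapped to 0."""
    two = {}
    for _ in range(count):
        key = random.choice(vowels) + random.choice(alpha)
        two[key] = 0
    return two  # after the loop, so every iteration runs before returning

twos = generate()
print(len(twos))  # at most 5 * 21 = 105 distinct keys
```

Since the keys are drawn from a small pool, duplicates overwrite each other and the dict can never hold more than 105 entries, however many iterations run.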

How to separate uppercase and lowercase letters in a string?

I have written code that separates the characters at 'even' and 'odd' indices, and I would like to modify it so that it separates characters by upper/lower case.
I can't figure out how to do this for a string such as "AbBZxYp". I have tried using .lower and .upper but I think I'm using them incorrectly.
def upperLower(string):
    odds=""
    evens=""
    for index in range(len(string)):
        if index % 2 == 0:
            evens = evens + string[index]
        if not (index % 2 == 0):
            odds = odds + string[index]
    print "Odds: ", odds
    print "Evens: ", evens
Are you looking to get two strings, one with all the uppercase letters and another with all the lowercase letters? Below is a function that will return two strings, the upper then the lowercase:
def split_upper_lower(input):
    upper = ''.join([x for x in input if x.isupper()])
    lower = ''.join([x for x in input if x.islower()])
    return upper, lower
You can then call it with the following:
upper, lower = split_upper_lower('AbBZxYp')
which gives you two variables, upper and lower. Use them as necessary.
>>> ''.join(filter(str.isupper, "AbBZxYp"))
'ABZY'
>>> ''.join(filter(str.islower, "AbBZxYp"))
'bxp'
Btw, for odd/even index you could just do this:
>>> "AbBZxYp"[::2]
'ABxp'
>>> "AbBZxYp"[1::2]
'bZY'
There is an itertools recipe called partition that can do this. Here is the implementation, from the itertools recipes:
from itertools import filterfalse, tee
def partition(pred, iterable):
    'Use a predicate to partition entries into false entries and true entries'
    # partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)
Upper and Lowercase Letters
You can manually implement the latter recipe, or install a library that implements it for you, e.g. pip install more_itertools:
import more_itertools as mit
iterable = "AbBZxYp"
pred = lambda x: x.islower()
children = mit.partition(pred, iterable)
[list(c) for c in children]
# [['A', 'B', 'Z', 'Y'], ['b', 'x', 'p']]
Here partition uses a predicate function to determine if each item in an iterable is lowercase. If not, it is filtered into the false group. Otherwise, it is filtered into the group of true items. We iterate to expose these groups.
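If you would rather not install anything, a rough standard-library-only version of the same idea might look like this (Python 3). The split_case helper is a made-up name for illustration, and it filters the "false" group once more so that digits or punctuation do not end up in the uppercase result.

```python
from itertools import filterfalse, tee

def partition(pred, iterable):
    """Split an iterable into (items failing pred, items passing pred)."""
    t1, t2 = tee(iterable)
    return filterfalse(pred, t1), filter(pred, t2)

def split_case(text):
    """Return (uppercase letters, lowercase letters) as two strings."""
    not_lower, lower = partition(str.islower, text)
    # The "false" group holds everything that is not lowercase,
    # so keep only the characters that are actually uppercase.
    upper = "".join(ch for ch in not_lower if ch.isupper())
    return upper, "".join(lower)

print(split_case("AbBZxYp"))  # ('ABZY', 'bxp')
```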
Even and Odd Indices
You can modify this to work for odd and even indices as well:
import itertools as it
import more_itertools as mit
iterable = "AbBZxYp"
pred = lambda x: x[0] % 2 != 0
children = mit.partition(pred, tuple(zip(it.count(), iterable)))
[[i[1] for i in list(c)] for c in children]
# [['A', 'B', 'x', 'p'], ['b', 'Z', 'Y']]
Here we zip an itertools.count() object to enumerate the iterable. Then we iterate the children so that the sub items yield the letters only.
See also more_itertools docs for more tools.

python unique string creation

I've looked at several other SO questions (and google'd tons) that are 'similar'-ish to this, but none of them seem to fit my question right.
I am trying to make a non fixed length, unique text string, only containing characters in a string I specify. E.g. made up of capital and lower case a-zA-Z characters. (for this example I use only a, b, and c lower case)
Something like this (broken code below)
def next(index, validCharacters = 'abc'):
    return uniqueShortAsPossibleString
The index argument would be an index (integer) that relates to a text string, for instance:
next(1) == 'a'
next(2) == 'b'
next(3) == 'c'
next(4) == 'aa'
next(5) == 'ab'
next(6) == 'ac'
next(7) == 'ba'
next(8) == 'bb'
next(9) == 'bc'
next(10) == 'ca'
next(11) == 'cb'
next(12) == 'cc'
And so forth. The string:
Must be unique, I'll be using it as an identifier, and it can only be a-zA-Z chars
As short as possible, with lower index numbers being shortest (see above examples)
Contain only the characters specified in the given argument string validCharacters
In conclusion, how could I write the next() function to relate an integer index value to a unique short string with the characters specified?
P.S. I'm new to SO, this site has helped me tons throughout the years, and while I've never made an account or asked a question (till now), I really hope I've done an okay job explaining what I'm trying to accomplish with this.
What you are trying to do is write the parameter of the next function in another base.
Let's suppose validCharacters contains k characters: then the job of the next function will be to transform parameter p into base k by using the characters in validCharacters.
In your example, you can write the numbers in base 3 and then associate each digit with one letter:
next(1) -> 1 -> 'a'
next(2) -> 2 -> 'b'
next(4) -> 11 -> 'aa'
next(7) -> 21 -> 'ba'
And so forth.
With this method, you can call next(x) without knowing or computing any next(x-i), which you can't do with iterative methods.
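The mapping above (1 -> 'a', 4 -> 'aa', 7 -> 'ba') has no zero digit, which is usually called bijective base-k numeration. A minimal sketch of that conversion follows, assuming a 1-based index as in the question; the function name index_to_identifier is hypothetical, chosen to avoid shadowing the built-in next.

```python
def index_to_identifier(index, valid_characters="abc"):
    """Map a 1-based index to a string using bijective base-k numeration."""
    if index < 1:
        raise ValueError("index must be >= 1")
    base = len(valid_characters)
    digits = []
    while index > 0:
        # Subtracting 1 first shifts the digit range from 1..k to 0..k-1,
        # which is what lets the first character act as a real digit.
        index, remainder = divmod(index - 1, base)
        digits.append(valid_characters[remainder])
    return "".join(reversed(digits))

print([index_to_identifier(i) for i in range(1, 13)])
# ['a', 'b', 'c', 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```

Each call is independent of the others, so any index can be converted directly without generating the earlier strings.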
You're trying to convert a number to a number in another base, but using arbitrary characters for the digits of that base.
import string
chars = string.ascii_lowercase + string.ascii_uppercase
def identifier(x, chars):
    output = []
    base = len(chars)
    while x:
        # Take the least significant digit, then shift right by one digit
        output.append(chars[x % base])
        x //= base
    return ''.join(reversed(output))
print(identifier(1, chars))  # 'b', because 'a' acts as the zero digit
This lets you jump to any position. Because you are simply counting, the identifiers are unique, it is easy to use any character set of any length (of two or more), and lower numbers give shorter identifiers.
itertools can always give you obfuscated one-liner iterators:
from itertools import permutations, chain
chars = 'abc'
a = chain(*(permutations(chars, i) for i in range(1, len(chars) + 1)))
Basically, this code creates an iterator that chains together all permutations of chars of lengths 1, 2, ..., len(chars). Because permutations never reuses a character, strings such as 'aa' are not produced.
The output of for x in a: print(x) is:
('a',)
('b',)
('c',)
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
You can't really associate the index with a string directly, but the following generator yields names in order of increasing length. Note that combinations_with_replacement treats 'ab' and 'ba' as the same combination, so 'ba', 'ca' and 'cb' are skipped; itertools.product(chars, repeat=i) would include them:
from itertools import combinations_with_replacement
def uniquenames(chars):
    for i in range(1, len(chars)):
        for j in combinations_with_replacement(chars, i):
            yield ''.join(j)
print(list(uniquenames('abc')))
# ['a', 'b', 'c', 'aa', 'ab', 'ac', 'bb', 'bc', 'cc']
As far as I understand, we shouldn't specify a maximum length for the output string, so range is not enough:
>>> from itertools import combinations_with_replacement, count
>>> def u(chars):
...     for i in count(1):
...         for k in combinations_with_replacement(chars, i):
...             yield "".join(k)
...
>>> g = u("abc")
>>> next(g)
'a'
>>> next(g)
'b'
>>> next(g)
'c'
>>> next(g)
'aa'
>>> next(g)
'ab'
>>> next(g)
'ac'
>>> next(g)
'bb'
>>> next(g)
'bc'
So it seems like you are trying to enumerate through all the strings generated by the language {'a','b','c'}. This can be done using finite state automata (though you don't want to do that). One simple way to enumerate through the language is to start with a list and append all the strings of length 1 in order (so a then b then c). Then append each letter in the alphabet to each string of length n-1. This will keep it in order as long as you append all the letters in the alphabet to a given string before moving on to the lexicographically next string.
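The append-based enumeration described above can be sketched with a queue. This is only one possible reading of that description, and the name enumerate_strings is made up for illustration. Note that the queue keeps growing, so this suits generating the first few identifiers rather than jumping to a large index.

```python
from collections import deque
from itertools import islice

def enumerate_strings(alphabet):
    """Yield every string over `alphabet`, shortest first, in alphabet order."""
    queue = deque(alphabet)  # start with all strings of length 1
    while queue:
        current = queue.popleft()
        yield current
        # Extend the current string by every letter before moving on,
        # which keeps strings of the same length in order.
        for letter in alphabet:
            queue.append(current + letter)

print(list(islice(enumerate_strings("abc"), 12)))
# ['a', 'b', 'c', 'aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc']
```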

List Comprehension for removing duplicates of characters in a string [duplicate]

The idea is that the program takes a string of characters and returns the same
string with every duplicated character appearing only
once, removing any later copies of a character.
So Iowa stays Iowa, but the word eventually would become evntualy.
Here is an inefficient method:
x = 'eventually'
newx = ''.join([c for i,c in enumerate(x) if c not in x[:i]])
I don't think that there is an efficient way to do it in a list comprehension.
Here it is as an O(n) (average case) generator expression. The others are all roughly O(n^2).
chars = set()
string = "aaaaa"
newstring = ''.join(chars.add(char) or char for char in string if char not in chars)
It works because set.add returns None, so the or will always cause the character to be yielded from the generator expression when the character isn't already in the set.
Edit: Also see refaim's solutions. My solution is like his second one, but it uses the set in the opposite way.
My take on his OrderedDict solution:
''.join(OrderedDict((char, None) for char in word))
Without list comprehensions:
from collections import OrderedDict
word = 'eventually'
print(''.join(OrderedDict(zip(word, range(len(word)))).keys()))
With list comprehensions (quick and dirty solution):
word = 'eventually'
uniq = set(word)
print(''.join(c for c in word if c in uniq and not uniq.discard(c)))
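On newer Pythons (3.7 and later), a plain dict preserves insertion order, so the OrderedDict idea above can be written without the import. This is a small sketch under that version assumption; remove_duplicates is just an illustrative name.

```python
def remove_duplicates(text):
    """Drop repeated characters, keeping the first occurrence of each."""
    # dict.fromkeys keeps keys in first-seen order (guaranteed from Python 3.7).
    return "".join(dict.fromkeys(text))

print(remove_duplicates("eventually"))  # evntualy
print(remove_duplicates("Iowa"))        # Iowa
```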
>>> s='eventually'
>>> "".join([c for i,c in enumerate(s) if i==s.find(c)])
'evntualy'
Note that using a list comprehension with join() is silly when you can just use a generator expression. You should tell your teacher to update their question.
You could make a set from the string, then join it together again. This works since sets can only contain unique values. The order won't be the same, though:
In [1]: myString = "mississippi"
In [2]: set(myString)
Out[2]: set(['i', 'm', 'p', 's'])
In [3]: print "".join(set(myString))
ipsm
In [4]: set("iowa")
Out[4]: set(['a', 'i', 'o', 'w'])
In [5]: set("eventually")
Out[5]: set(['a', 'e', 'l', 'n', 't', 'u', 'v', 'y'])
Edit: Just saw the "List Comprehension" in the title, so this probably isn't what you're looking for.
Create a set from the original string, and then sort by position of character in original string:
>>> s='eventually'
>>> ''.join(sorted(set(s), key=s.index))
'evntualy'
Taken from this question, I think this is the fastest way:
>>> def remove_dupes(str):
...     chars = set()
...     chars_add = chars.add
...     return ''.join(c for c in str if c not in chars and not chars_add(c))
...
>>> remove_dupes('hello')
'helo'
>>> remove_dupes('testing')
'tesing'
word = "eventually"
evntualy = ''.join(
    c
    for d in [dict(zip(word, word))]
    for c in word
    if d.pop(c, None) is not None)
Riffing off of agf's (clever) solution but without making a set outside of the generator expression:
evntualy = ''.join(s.add(c) or c for s in [set()] for c in word if c not in s)
