Split a string and add into `tuple` - python

I know of only two simple ways to split a string and put the pieces into a tuple:
import re
1. tuple(map(lambda i: i, re.findall(r'\d{2}', '012345'))) # ('01', '23', '45')
2. tuple(i for i in re.findall(r'\d{2}', '012345')) # ('01', '23', '45')
Are there other simple ways?

I'd go for
s = "012345"
[s[i:i + 2] for i in range(0, len(s), 2)]
or
tuple(s[i:i + 2] for i in range(0, len(s), 2))
if you really want a tuple.

Usually one uses tuples when the dimensions/length is fixed (with possibly different types) and lists when there is an arbitrary number of values of the same type.
What is the reason to use a tuple instead of a list here?
Samples for tuples:
coordinates in a fixed dimensional space (e.g. 2d: (x, y) )
representation of dict key/value-pairs (e.g. ("John Smith", 38))
things where the number of tuple components is known before evaluating the expression
...
Samples for lists:
split string ("foo|bar|buz" split on "|": ["foo", "bar", "buz"])
command line arguments (["-f", "/etc/fstab"])
things where the number of list elements is (usually) not known before evaluating the expression
...

Another alternative:
s = '012345'
map(''.join, zip(*[iter(s)]*2))  # in Python 3 this is a lazy map object; wrap it in list() to see the values
Or if you need a tuple:
tuple(map(''.join, zip(*[iter(s)]*2)))
This method of grouping items into n-length groups comes straight from the documentation for zip().
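To see why the idiom works: `[iter(s)]*2` is a list holding the same iterator twice, so `zip` pulls its items alternately from one shared iterator. Here is a minimal sketch of that idea. The helper name `chunk_pairs` is hypothetical and not part of the answer above. Note that `zip` silently drops a trailing incomplete group.

```python
def chunk_pairs(s, n=2):
    """Group the characters of s into n-length strings (hypothetical helper)."""
    it = iter(s)                 # one shared iterator over the characters
    groups = zip(*[it] * n)      # zip takes n items at a time from that iterator
    return tuple(''.join(g) for g in groups)

print(chunk_pairs('012345'))   # ('01', '23', '45')
print(chunk_pairs('01234'))    # ('01', '23') - the leftover '4' is dropped
```

If the leftover characters matter, the slicing approach with range(0, len(s), n) keeps them.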

Python 3
Here I make the assumption that the OP defines "simpler" as not using regular expressions.
Given strings in a list, such as ["jim_bob", "slim_jim"], that share a common pattern:
fileNameToTupleByUnderscore = lambda x: tuple(x.split('_'))
print(list(map(fileNameToTupleByUnderscore, ["jim_bob", "slim_jim"])))
Returns
[('jim', 'bob'), ('slim', 'jim')]
Note that you can add a strip('_') before the split('_') if you want to exclude trailing underscores.
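A small sketch of that note. The function name `split_on_underscore` is made up for illustration. Without strip('_'), a trailing underscore produces an empty string as the last element.

```python
def split_on_underscore(name):
    # strip('_') removes leading and trailing underscores before splitting
    return tuple(name.strip('_').split('_'))

print(tuple('jim_bob_'.split('_')))      # ('jim', 'bob', '')
print(split_on_underscore('jim_bob_'))   # ('jim', 'bob')
```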

Related

function that transforms a list of strings using a list of tuples?

I want to write a function that takes a list of strings and a list of tuples as inputs. It reacts to keywords in the list of tuples to modify the list of strings. The call should look like transform(stringlist, tuplelist).
For example, suppose the stringlist is ['cloud', 'terra'].
Here is how the different tuples affect the string list:
('uppercase', n) changes every nth letter of each string to uppercase.
ex. transform(['cloud','terra'], [('uppercase', 2)]) will return ['cLoUd','tErRa']
('first', n) truncates each string so only its first n letters remain.
ex. transform(['cloud','terra'], [('first', 3)]) will return ['clo','ter']
('last', n) truncates each string so only its last n letters remain.
ex. transform(['cloud','terra'], [('last', 3)]) will return ['oud','rra']
('insert', n, x) inserts the string x at index n of the list.
ex. transform(['cloud','terra'], [('insert', 1, 'zidane')])
will return ['cloud','zidane','terra']
('delete', n) deletes the element at index n of the list.
ex. transform(['cloud','terra'], [('delete', 0)])
will return ['terra']
('length', n): if the length of a string is greater than n, then str(len(examplestr)) is placed in the middle of examplestr. If examplestr cannot be split in half evenly, the position len(examplestr)/2 rounded down to the nearest whole number is used.
ex. transform(['cloud','terra'], [('length', 2)]) will return ['cl5oud','te5rra']
If the list contains multiple tuples, the call looks like:
transform(['cloud','terra'],[('last',2),('delete',0)])
which outputs ['ra']
What I'm Trying
for i in range(len(tuplelist)):
    if 'last' in tuplelist[i]:
        output = [j[((tuplelist[i])[1]-1):] for j in stringlist]
output
This excerpt takes:
stringlist = ['cloud','terra']
tuplelist = [('last',3)]
as inputs and outputs ['oud', 'rra']. This is mainly a proof of concept and I wanted to see if it would be possible to modify the string using only if statements and for loops. However, I would like to see if this function could be done using lambda and list comprehension without the use of imports.
I think what you need is something like this. See comments for details.
from functools import reduce
class Transformer:
    # Define some transformer functions.
    # Lambdas are a shortcut, but it is better to write independent functions.
    uppercase = lambda str_list, n: [str.replace(d, d[n], d[n].upper(), 1) for d in str_list]
    first = lambda str_list, n: [d[:n] for d in str_list]
    last = lambda str_list, n: [d[-n:] for d in str_list]
    # list.insert() returns None, so build and return a new list instead
    insert = lambda str_list, n, x: str_list[:n] + [x] + str_list[n:]
    # Add more functions as you need them
    # Apply one transformer (a function and its args) to a list of strings
    @staticmethod
    def _transform_exec(str_list, transformer):
        func, args = transformer[0], transformer[1]
        return func(str_list, *args)
    @staticmethod
    def transform(str_list, *transformers):
        # Collect the function list
        transformer_list = []
        for transformer in transformers:
            # For each transformer name, look up the function and its args
            _transformer = getattr(Transformer, transformer[0])
            transformer_list.append([_transformer, transformer[1:]])
        # Run a pipeline of transformers
        return reduce(Transformer._transform_exec, transformer_list, str_list)
if __name__ == '__main__':
    res = Transformer.transform(['cloud', 'terra'], ('uppercase', 1), ('first', 3), ('last', 1))
    print(res)
This produces an output
['o', 'r']
The approach simply creates a pipeline of functions using reduce. This is only an example to demonstrate a possible approach; you will have to do more work to make it robust.
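Since the question asked for a version without imports, the same pipeline idea can also be sketched with a plain dict and a for loop. This is only a sketch under stated assumptions: the `OPS` table is a hypothetical name, the `transform(stringlist, tuplelist)` signature follows the question, and only four of the keywords are covered ('uppercase' and 'length' are left out).

```python
# Hypothetical dispatch table: keyword -> function(list_of_strings, *args)
OPS = {
    'first':  lambda strs, n: [s[:n] for s in strs],
    'last':   lambda strs, n: [s[-n:] for s in strs],
    'insert': lambda strs, n, x: strs[:n] + [x] + strs[n:],
    'delete': lambda strs, n: strs[:n] + strs[n + 1:],
}

def transform(stringlist, tuplelist):
    result = list(stringlist)          # work on a copy of the input
    for name, *args in tuplelist:      # e.g. ('insert', 1, 'zidane')
        result = OPS[name](result, *args)
    return result

print(transform(['cloud', 'terra'], [('last', 2), ('delete', 0)]))   # ['ra']
print(transform(['cloud', 'terra'], [('insert', 1, 'zidane')]))
# ['cloud', 'zidane', 'terra']
```

Each operation returns a new list, so the steps can be chained in order without any of them mutating the caller's data.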

Adding a prefix to list elements in Python

I have a list of strings:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
and I have a prefix string:
/root
Is there a Pythonic way to add the prefix string to each element in the list,
so the end result will look like this:
lst = ["/root/foo/dir/c-.*.txt","/root/foo/dir2/d-.*.svc","/root/foo/dir3/es-.*.info"]
Ideally it would be done without iterating and creating a new list ...
used:
List Comprehensions
List comprehensions provide a concise way to create lists. Common
applications are to make new lists where each element is the result of
some operations applied to each member of another sequence or
iterable, or to create a subsequence of those elements that satisfy a
certain condition.
F-strings
F-strings provide a way to embed expressions inside string literals,
using a minimal syntax. It should be noted that an f-string is really
an expression evaluated at run time, not a constant value. In Python
source code, an f-string is a literal string, prefixed with 'f', which
contains expressions inside braces. The expressions are replaced with
their values.
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
lst =[ f'{prefix}{path}' for path in lst]
print(lst)
I am not sure it is Pythonic, but this is also one possible way:
list(map(lambda x: '/root' + x, lst))
For a comparison between list comprehensions and map, see "List comprehension vs map".
Also, thanks to @chris-rands, I learnt one more way without lambda:
list(map('/root'.__add__, lst))
Use list comprehensions and string concatenation:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
print(['/root' + p for p in lst])
# ['/root/foo/dir/c-.*.txt', '/root/foo/dir2/d-.*.svc', '/root/foo/dir3/es-.*.info']
Just simply write:
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix="/root"
res = [prefix + x for x in lst]
print(res)
A simple list comprehension -
lst = ["/foo/dir/c-.*.txt","/foo/dir2/d-.*.svc","/foo/dir3/es-.*.info"]
prefix = '/root'
print([prefix + string for string in lst]) # You can give it a variable if you want
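On the "without creating a new list" part of the question: strings are immutable, so every prefixed string is necessarily a new object, and some iteration is unavoidable. What can be avoided is rebinding the name to a different list object. A minimal sketch using slice assignment follows. The `alias` variable exists only to show the effect, and a temporary list is still built on the right-hand side.

```python
lst = ["/foo/dir/c-.*.txt", "/foo/dir2/d-.*.svc", "/foo/dir3/es-.*.info"]
alias = lst                  # a second reference to the same list object
prefix = "/root"

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the change.
lst[:] = [prefix + path for path in lst]

print(alias is lst)   # True
print(alias[0])       # /root/foo/dir/c-.*.txt
```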

A list of a lists into a single string

Suppose I have a list of lists say
A = [[1,1,1,1],[2,2,2,2]]
and I want to create two strings from that to be
'1111'
'2222'
How would we do this in python?
Maybe list comprehension:
>>> A = [[1,1,1,1],[2,2,2,2]]
>>> l=[''.join(map(str,i)) for i in A]
>>> l
['1111', '2222']
>>>
Now you've got it.
This is pretty easily done using join and a list comprehension.
A = [[1,1,1,1],[2,2,2,2]]
a_strings = [''.join(map(str, sub_list)) for sub_list in A]
See, join() takes an iterable of strings and concatenates them into one string, and the list comprehension just loops through all the sub-lists. Above I combined the two.
On second thought
map() is sometimes considered more efficient (when not using lambda, etc.) and, for some, more readable. I'll just add an approach using map instead of a comprehension.
a_strings = list(map(''.join, (map(str, sub_list) for sub_list in A)))
The inner map converts the ints of each sub-list to strs; the outer map then joins the strs of every sub-list together.
Hopefully this makes things a bit more chewable for ya, each method is close to equivalent such that for this case you could consider them style choices.
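The reason for the map(str, ...) step in both versions is that str.join accepts only strings. A short sketch of what happens without it, next to the working form:

```python
A = [[1, 1, 1, 1], [2, 2, 2, 2]]

try:
    ''.join(A[0])                # the items are ints, not strings
except TypeError as exc:
    print('join failed:', exc)

# Convert each int to str first, then join each sub-list
a_strings = [''.join(map(str, sub_list)) for sub_list in A]
print(a_strings)                 # ['1111', '2222']
```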

Split into list of list and convert specific element of it to integer in python

Suppose we have a string:
A = "John\t20\nChris\t30\nAby\t10\n"
I want to turn A into a list of lists, with the first element of each inner list kept as a str and the second element converted to int.
What I have done is:
A = [[lambda k,v: str(k), int(v) for k, v in s.split('\t')] for s in A.split('\n')]
Any suggestions?
You can just get the values without lambda:
[[s.split('\t')[0], int(s.split('\t')[1])] for s in A.strip().split('\n')]
Note: strip() is added to parse out the trailing '\n'.
I think you are overcomplicating this. You don't need an anonymous function here at all.
First, split the string, then iterate over groups of two from the resulting list and convert the second element of the pairs to int.
For the second part, the itertools documentation has a recipe called grouper. You can either copy-paste the function or import it from more_itertools (which needs to be installed).
>>> from itertools import chain
>>> from more_itertools import grouper
>>>
>>> a = "John\t20\nChris\t30\nAby\t10\n"
>>> [(s, int(n)) for s, n in grouper(a.split(), 2)]
[('John', 20), ('Chris', 30), ('Aby', 10)]
Finally, if you want to flatten the result, apply itertools.chain.
>>> list(chain.from_iterable((s, int(n)) for s, n in grouper(a.split(), 2)))
['John', 20, 'Chris', 30, 'Aby', 10]
If you like to use lambda, here's a simpler way;
L = lambda s: [s.split('\t')[0], int(s.split('\t')[1])]
A3 = [L(x) for x in A.strip().split('\n')]
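The answers above call split('\t') twice per line. A variant that splits once and unpacks the two fields might look like this. It is a sketch only: `parse_line` is a hypothetical name, and it assumes every line has exactly one tab.

```python
A = "John\t20\nChris\t30\nAby\t10\n"

def parse_line(line):
    # Split once and unpack, rather than calling split('\t') twice
    name, age = line.split('\t')
    return [name, int(age)]

# splitlines() does not produce an empty entry for the trailing '\n'
rows = [parse_line(line) for line in A.splitlines()]
print(rows)   # [['John', 20], ['Chris', 30], ['Aby', 10]]
```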

python sort strings with digits at the end

What is the easiest way to sort a list of strings with digits at the end, where some have 3 digits and some have 4?
>>> list = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
>>> list.sort()
>>> print list
['asdf111', 'asdf123', 'asdf1234', 'asdf124']
It should put the 1234 one at the end. Is there an easy way to do this?
is there an easy way to do this?
Yes
You can use the natsort module.
>>> from natsort import natsorted
>>> natsorted(['asdf123', 'asdf1234', 'asdf111', 'asdf124'])
['asdf111', 'asdf123', 'asdf124', 'asdf1234']
Full disclosure, I am the package's author.
is there an easy way to do this?
No
It's perfectly unclear what the real rules are. The "some have 3 digits and some have 4" isn't really a very precise or complete specification. All your examples show 4 letters in front of the digits. Is this always true?
import re
key_pat = re.compile(r"^(\D+)(\d+)$")
def key(item):
    m = key_pat.match(item)
    return m.group(1), int(m.group(2))
That key function might do what you want. Or it might be too complex. Or maybe the pattern is really r"^(.*)(\d{3,4})$" or maybe the rules are even more obscure.
>>> data= ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
>>> data.sort( key=key )
>>> data
['asdf111', 'asdf123', 'asdf124', 'asdf1234']
What you're probably describing is called a Natural Sort, or a Human Sort. If you're using Python, you can borrow from Ned's implementation.
The algorithm for a natural sort is approximately as follows:
Split each value into alphabetical "chunks" and numerical "chunks"
Sort by the first chunk of each value
If the chunk is alphabetical, sort it as usual
If the chunk is numerical, sort by the numerical value represented
Take the values that have the same first chunk and sort them by the second chunk
And so on
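The steps above can be sketched as a key function for sorted(). This is an illustrative sketch, not Ned's implementation; the name `natural_key` is an assumption. Because re.split with a capturing group always alternates text chunk, digit chunk, text chunk, the keys have the same type at each position, which keeps the comparison valid in Python 3.

```python
import re

def natural_key(value):
    """Split value into alternating text and digit chunks; digits become ints."""
    chunks = re.split(r'(\d+)', value)   # the capturing group keeps the digit runs
    # Odd positions hold the digit runs, even positions hold the text between them
    return [int(c) if i % 2 else c for i, c in enumerate(chunks)]

data = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
print(sorted(data, key=natural_key))
# ['asdf111', 'asdf123', 'asdf124', 'asdf1234']
```

Lists compare element by element, so the first chunk is compared first, then the second, and so on, which is the ordering the steps describe.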
l = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
l.sort(cmp=lambda x,y:cmp(int(x[4:]), int(y[4:])))  # Python 2 only: the cmp argument was removed in Python 3
You need a key function. You're willing to specify 3 or 4 digits at the end and I have a feeling that you want them to compare numerically.
sorted(list_, key=lambda s: (s[:-4], int(s[-4:])) if s[-4] in '0123456789' else (s[:-3], int(s[-3:])))
Without the lambda and conditional expression that's
def key(s):
    if s[-4] in '0123456789':
        return (s[:-4], int(s[-4:]))
    else:
        return (s[:-3], int(s[-3:]))
sorted(list_, key=key)
This just takes advantage of the fact that tuples sort by the first element, then the second. So because the key function is called to get a value to compare, the elements will now be compared like the tuples returned by the key function. For example, 'asdfbad123' will compare to 'asd7890' as ('asdfbad', 123) compares to ('asd', 7890). If the last 3 characters of a string aren't in fact digits, you'll get a ValueError which is perfectly appropriate given the fact that you passed it data that doesn't fit the specs it was designed for.
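The comparison described above can be checked directly. The sketch below repeats the key function so it runs on its own, then shows the tuples it produces and the ValueError for input that does not end in digits.

```python
def key(s):
    # Assumes s ends in 3 or 4 digits, as in the question
    if s[-4] in '0123456789':
        return (s[:-4], int(s[-4:]))
    return (s[:-3], int(s[-3:]))

print(key('asdfbad123'))                    # ('asdfbad', 123)
print(key('asd7890'))                       # ('asd', 7890)
print(key('asd7890') < key('asdfbad123'))   # True, because 'asd' < 'asdfbad'

try:
    key('asdfxyz')                          # the last 3 characters are not digits
except ValueError as exc:
    print('bad input:', exc)
```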
The issue is that the sorting is alphabetical here, since the items are strings. The strings are compared character by character, and the first differing character decides the order.
>>> 'a1234' < 'a124'  # at the first differing position, '3' is less than '4'
True
>>>
You will need to do numeric sorting to get the desired output.
>>> x = ['asdf123', 'asdf1234', 'asdf111', 'asdf124']
>>> y = [ int(t[4:]) for t in x]
>>> z = sorted(y)
>>> z
[111, 123, 124, 1234]
>>> l = ['asdf'+str(t) for t in z]
>>> l
['asdf111', 'asdf123', 'asdf124', 'asdf1234']
>>>
L.sort(key=lambda s:int(''.join(filter(str.isdigit,s[-4:]))))
Rather than splitting each line myself, I ask Python to do it for me with re.findall():
import re
import sys
def SortKey(line):
    result = []
    for part in re.findall(r'\D+|\d+', line):
        try:
            result.append(int(part, 10))
        except (TypeError, ValueError) as _:
            result.append(part)
    return result
# Python 2 print statement; the trailing comma suppresses the extra newline
print ''.join(sorted(sys.stdin.readlines(), key=SortKey)),
