Get sum of integers from list of strings - python

alist = [["Chanel-1000, Dior-2000, Prada-500"],
["Chloe-200,Givenchy-400,LV-600"], ["Bag-1,Bagg-2,Baggg-3"]]
alist_min = [
min(map(str.strip, x[0].split(',')),
key=lambda i: int(str.strip(i).split('-')[-1])) for x in alist
]
print(alist_min)
Given this script, how can I get the sum of the integers in alist_min? The result is ['Prada-500', 'Chloe-200', 'Bag-1'], so summing the numbers in that list should give:
#total: 701

You can use sum() and list comprehension with split() function:
sum([int(x.split('-')[1]) for x in alist_min])
Full code:
alist = [["Chanel-1000, Dior-2000, Prada-500"],
["Chloe-200,Givenchy-400,LV-600"], ["Bag-1,Bagg-2,Baggg-3"]]
alist_min = [
min(map(str.strip, x[0].split(',')),
key=lambda i: int(str.strip(i).split('-')[-1])) for x in alist
]
print(alist_min)
print(sum([int(x.split('-')[1]) for x in alist_min]))
Output:
['Prada-500', 'Chloe-200', 'Bag-1']
701
Explanation:
Use split() to split each string in alist_min at the - character into two parts; the second part holds the number.
Convert this to an int.
Use above logic in list comprehension to generate list of numbers
Use sum() to take sum of this list
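The steps above can be sketched one at a time as follows. This variant uses rsplit('-', 1) instead of split('-'), which is an assumption beyond the original answer: it splits only at the last hyphen, so a name that itself contains a hyphen would still parse. The helper name price_of is hypothetical.

```python
alist_min = ['Prada-500', 'Chloe-200', 'Bag-1']

def price_of(item):
    """Return the integer after the last '-' in a 'Name-123' string (hypothetical helper)."""
    # rsplit with maxsplit=1 splits once, starting from the right
    name, number = item.rsplit('-', 1)
    return int(number)

prices = [price_of(item) for item in alist_min]
total = sum(prices)
print(prices)  # [500, 200, 1]
print(total)   # 701
```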

You can use a regular expression, along with map and sum:
import re
sum(map(int, (map(lambda x: re.findall(r'\d+', x)[0], alist_min))))
#output: 701

Related

Remove leading zeros in forecast period string

I need to format forecast period columns so that I can later merge with another data frame.
Columns of my data frame are:
current_cols = [
'01+11',
'02+10',
'03+09',
'04+08',
'05+07',
'06+06',
'07+05',
'08+04',
'09+03',
'10+02',
'11+01'
]
desired_out = [
'1+11',
'2+10',
'3+9',
'4+8',
'5+7',
'6+6',
'7+5',
'8+4',
'9+3',
'10+2',
'11+1'
]
Originally, I tried to split each string with split('+') and apply lstrip('0') to each part, then recombine the parts of each tuple with + in between.
Is there a better approach? I'm having trouble combining elements in tuples back together, with + in between. Help would be much appreciated.
You can use the re module for the task:
import re
pat = re.compile(r"\b0+")
out = [pat.sub(r"", s) for s in current_cols]
print(out)
Prints:
[
"1+11",
"2+10",
"3+9",
"4+8",
"5+7",
"6+6",
"7+5",
"8+4",
"9+3",
"10+2",
"11+1",
]
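One edge case worth noting, as a hedged sketch: the pattern \b0+ removes every zero at the start of a number, so a field made up only of zeros would disappear entirely. The label '00+05' below is a made-up example that is not in the original data, and the lookahead (?=\d) is an assumption added here, not part of the answer above.

```python
import re

current_cols = ['01+11', '10+02', '00+05']  # '00+05' is a made-up edge case

plain = re.compile(r"\b0+")          # pattern from the answer above
guarded = re.compile(r"\b0+(?=\d)")  # only strip zeros that are followed by another digit

plain_out = [plain.sub("", s) for s in current_cols]
guarded_out = [guarded.sub("", s) for s in current_cols]
print(plain_out)    # ['1+11', '10+2', '+5']
print(guarded_out)  # ['1+11', '10+2', '0+5']
```

For the column labels in the question, both patterns give the same result.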
current_cols =['01+11','02+10','03+09','04+08','05+07','06+06','07+05','08+04','09+03','10+02','11+01']
desired_out = []
for item in current_cols:
    if item[0] == "0":
        item = item[1:]
    if "+0" in item:
        item = item.replace('+0', '+')
    desired_out.append(item)
You can do it with nested comprehensions, conversion to int(), and formatting using an f-string:
current_cols = [
'01+11',
'02+10',
'03+09',
'04+08',
'05+07',
'06+06',
'07+05',
'08+04',
'09+03',
'10+02',
'11+01'
]
desired_out = [
f'{int(a)}+{int(b)}' for (a, b) in [
e.split('+') for e in current_cols
]
]
The code above will set desired_out with:
['1+11', '2+10', '3+9', '4+8', '5+7', '6+6', '7+5', '8+4', '9+3', '10+2', '11+1']
This method implements your original idea: split each element using the + sign as the separator, strip the leading zeros from each half (done by the int() conversion inside the f-string), and join the halves back together with a + sign in between (also done by the f-string).
The inner comprehension is just walking each element of the list, and splitting them by the + sign. The outer comprehension converts each element of each pair to int() to get rid of the leading zeros.
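Since the goal is to rename data frame columns, the same conversion could be wrapped in a small function and used to build an old-to-new mapping. This is only a sketch: the names normalize_period and mapping are hypothetical, and it assumes every label has the form 'NN+NN'.

```python
def normalize_period(label):
    """Strip leading zeros from both sides of a 'NN+NN' label (hypothetical helper)."""
    left, right = label.split('+')
    # int() discards leading zeros; the f-string glues the parts back together
    return f'{int(left)}+{int(right)}'

current_cols = ['01+11', '02+10', '09+03', '10+02']
mapping = {old: normalize_period(old) for old in current_cols}
print(mapping)
# {'01+11': '1+11', '02+10': '2+10', '09+03': '9+3', '10+02': '10+2'}
```

With pandas, a mapping like this can be passed to DataFrame.rename(columns=mapping).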
We want a bunch of map operations here to do the following:
split each element of current_cols on "+":
map(lambda s: s.split("+"), current_cols)
lstrip the "0" out of each element of the resulting lists:
map(lambda l: (x.lstrip("0") for x in l), ...)
join the resulting values on "+":
map("+".join, ...)
Then, we list out the elements of these map operations:
list(
map("+".join,
map(lambda l: (x.lstrip('0') for x in l),
map(lambda s: s.split('+'), current_cols)
)
)
)
which gives:
['1+11',
'2+10',
'3+9',
'4+8',
'5+7',
'6+6',
'7+5',
'8+4',
'9+3',
'10+2',
'11+1']

How to group all the first characters, all the second characters, and so on, of the strings in a list in Python

a=["cypatlyrm","aolsemone","nueeleuap"]
The output needed is: canyoupleasetellmeyournamep
I have tried:
res = ""
for i in range(len(a)):
    for j in range(len(a)):
        res += a[j][i]
It gives the output: canyouple
How do I get the full output?
You can use itertools.zip_longest with an empty string '' as the fill value, together with itertools.chain, and then join the result to get what you want.
from itertools import zip_longest, chain
seq = ["cypatlyrm", "aolsemone", "nueeleuap"]
res = ''.join(chain.from_iterable(zip_longest(*seq, fillvalue='')))
print(res)
Output
canyoupleasetellmeyournamep
Using zip_longest makes sure that this also works with cases where the element sizes are not equal. If all elements in the list are guaranteed to be the same length then a normal zip would also work.
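To illustrate the difference, here is a small sketch with made-up strings of unequal length (not from the question). Plain zip stops at the shortest string, while zip_longest keeps going and fills the gaps with ''.

```python
from itertools import chain, zip_longest

seq = ["abc", "de", "f"]  # made-up input with unequal lengths

# zip stops at the shortest string, so later characters are dropped
short = ''.join(chain.from_iterable(zip(*seq)))
# zip_longest pads missing positions with '', so nothing is lost
full = ''.join(chain.from_iterable(zip_longest(*seq, fillvalue='')))

print(short)  # adf
print(full)   # adfbec
```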
If all the elements have the same length, you can use this approach, which does not need any imports.
seq = ["cypatlyrm", "aolsemone", "nueeleuap"]
res = ''
for i in range(len(seq[0])):
    for j in seq:
        res += j[i]
print(res)

Sum all numbers in a list of strings

sorry if this is very noob question, but I have tried to solve this on my own for some time, gave it a few searches (used the "map" function, etc.) and I did not find a solution to this. Maybe it's a small mistake somewhere, but I am new to python and seem to have some sort of tunnel vision.
I have some text (see sample) that has numbers in between the words. I want to extract all numbers with regular expressions into a list and then sum them. I seem to be able to do the extraction, but struggle to convert them to integers and then sum them.
import re
df = ["test 4497 test 6702 test 8454 test",
"7449 test"]
numlist = list()
for line in df:
    line = line.rstrip()
    numbers = re.findall("[0-9]+", line) # find numbers
    if len(numbers) < 1: continue # ignore lines with no numbers, none in this sample
    numlist.append(numbers) # create list of numbers
The sum(numlist) returns an error.
You don't need a regex for this. Split the strings in the list, and sum those that are numeric in a comprehension:
sum(sum(int(i) for i in s.split() if i.isnumeric()) for s in df)
# 27102
Or similarly, flatten the resulting lists, and sum once:
from itertools import chain
sum(chain.from_iterable((int(i) for i in s.split() if i.isnumeric()) for s in df))
# 27102
This is the source of your problem:
findall returns a list, which you are appending to numlist, itself a list. So you end up with a list of lists. You should instead do:
numlist.extend(numbers)
So that you end up with a single list of numbers (well, actually string representations of numbers). Then you can convert the strings to integers and sum:
the_sum = sum(int(n) for n in numlist)
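Put together, the fix might look like the sketch below. It reuses the sample data and variable names from the question; the only changes are extend in place of append and the final conversion to int.

```python
import re

df = ["test 4497 test 6702 test 8454 test",
      "7449 test"]

numlist = []
for line in df:
    numbers = re.findall(r"[0-9]+", line)  # list of digit strings found in this line
    numlist.extend(numbers)                # add the items themselves, not the list

the_sum = sum(int(n) for n in numlist)
print(numlist)  # ['4497', '6702', '8454', '7449']
print(the_sum)  # 27102
```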
Use two nested loops over df and append each number to numlist:
numlist = list()
for item in df:
    for word in item.split():
        if word.isnumeric():
            numlist.append(int(word))
print(numlist)
print(sum(numlist))
Out:
[4497, 6702, 8454, 7449]
27102
You could make a one-liner using list comprehension:
print(sum([int(word) for item in df for word in item.split() if word.isnumeric()]))
# 27102
It's as easy as
my_sum = sum(map(int, numbers_list))
Here is an option using map, filter and sum:
It first splits the strings at whitespace, filters out the non-numbers, casts the number strings to int, and finally sums them.
# if you want the sum per string in the list
sums = [sum(map(int, filter(str.isnumeric, s.split()))) for s in df]
# [19653, 7449]
# if you simply want the sum of all numbers of all strings
sum(sum(map(int, filter(str.isnumeric, s.split()))) for s in df)
# 27102

How to search through an array containing strings, and create a new array with only integers

If I have an array that contains only strings, but some of them are numbers, how would I search through the array, determine which strings are actually numbers, and add those numbers to a new array? An example of the array is as follows: [ "Chris" , "90" , "Dave" , "76" ]
I have tried using a for loop to consecutively use isdigit() on each index, and if it is true to add that item to the new array.
scores = []
for i in range(len(name_and_score_split)):
    if name_and_score_split[i].isdigit() == True:
        scores.append(name_and_score_split[i])
When the above code is run, it tells me the list data type does not have the "isdigit" function.
Edit: I've found that my problem is that the list is actually a list of lists.
Use a list comprehension, and iterate directly over the items with Python's for rather than over indices:
lst = ["Chris" , "90" , "Dave" , "76"]
scores = [x for x in lst if x.isdigit()]
# ['90', '76']
Alternately, filter your list:
scores = list(filter(lambda x: x.isdigit(), lst))
Assuming what you're trying is for integers, you can do something like:
# Adapted from a float-based version, with float replaced by int.
def is_number(s):
    try:
        int(s)
        return True
    except ValueError:
        return False
Then you can do
[x for x in name_and_score_split if is_number(x)]
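The int()-based check and str.isdigit() do not accept exactly the same strings, which may matter depending on the data. The sketch below compares the two on made-up sample values that are not from the question.

```python
def is_number(s):
    """Return True if int() can parse s."""
    try:
        int(s)
        return True
    except ValueError:
        return False

samples = ["90", "-5", " 7 ", "Dave", "3.5"]  # made-up values for illustration

by_int = [s for s in samples if is_number(s)]
by_isdigit = [s for s in samples if s.isdigit()]
print(by_int)      # ['90', '-5', ' 7 ']
print(by_isdigit)  # ['90']
```

int() accepts a sign and surrounding whitespace, while isdigit() requires every character to be a digit.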
If you want list of int:
s = ["Chris", "90", "Dave", "76"]
e = [int(i) for i in s if i.isdigit()]
print(e)
# OUTPUT: [90, 76]
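The edit to the question says the real input is a list of lists. Under that assumption, a nested comprehension could walk the inner lists as well. The data below is made up to match that description; the actual shape of name_and_score_split is not shown in the question.

```python
# Made-up data: the question's edit says the real input is a list of lists
name_and_score_split = [["Chris", "90"], ["Dave", "76"]]

# Walk each inner list, then each string inside it
scores = [int(item)
          for pair in name_and_score_split
          for item in pair
          if item.isdigit()]
print(scores)  # [90, 76]
```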

searching a list of strings for integers

Given the following list of strings:
my_list = ['element0 123 321\n', 'element1 223 32221\n', 'element2 19823 328771\n', ... ]
how can I split each entry into a list of tuples:
[ (123, 321), (223, 32221), (19823, 328771), ... ]
In my other poor attempt, I managed to extract the numbers, but I encountered a problem, the element placeholder also contains a number which this method includes! It also doesn't write to a tuple, rather a list.
numbers = list()
for s in my_list:
    for x in s:
        if x.isdigit():
            numbers.append((x))
numbers
We can first build a regex that identifies positive integers:
from re import compile
INTEGER_REGEX = compile(r'\b\d+\b')
Here \d stands for digit (so 0, 1, etc.), + for one or more, and \b are word boundaries.
We can then use INTEGER_REGEX.findall(some_string) to identify all positive integers from the input. Now the only thing left to do is iterate through the elements of the list, and convert the output of INTEGER_REGEX.findall(..) to a tuple. We can do this with:
output = [tuple(INTEGER_REGEX.findall(l)) for l in my_list]
For your given sample data, this will produce:
>>> [tuple(INTEGER_REGEX.findall(l)) for l in my_list]
[('123', '321'), ('223', '32221'), ('19823', '328771')]
Note that digits that are not separate words will not be matched. For instance the 8 in 'see you l8er' will not be matched, since it is not a word.
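The result above holds strings, while the question asks for integers. As a small sketch of one way to close that gap, the matches can be passed through map(int, ...) before building each tuple:

```python
import re

INTEGER_REGEX = re.compile(r'\b\d+\b')
my_list = ['element0 123 321\n', 'element1 223 32221\n', 'element2 19823 328771\n']

# The 0 in 'element0' is not matched: there is no word boundary between 't' and '0'
output = [tuple(map(int, INTEGER_REGEX.findall(line))) for line in my_list]
print(output)  # [(123, 321), (223, 32221), (19823, 328771)]
```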
Your attempt iterates over each character of the string. You have to split the string on whitespace, a task that str.split does flawlessly.
Also, numbers.append((x)) is the same as numbers.append(x). For a tuple of one element, add a comma before the closing parenthesis, although that doesn't solve the problem either.
Now, each string seems to contain an id (to be skipped) followed by two integers as strings, so why not split, drop the first token, and convert the rest to a tuple of integers?
my_list = ['element0 123 321\n', 'element1 223 32221\n', 'element2 19823 328771\n']
result = [tuple(map(int,x.split()[1:])) for x in my_list]
print(result)
gives:
[(123, 321), (223, 32221), (19823, 328771)]
