Split a list of strings by comma - python

I want to convert
['60,78', '70,77', '80,74', '90,75', '100,74', '110,75']
into
['60', '78', '70', '77'.. etc]
I thought I could use
for word in lines:
    word = word.split(",")
    newlist.append(word)
return newlist
but this produces this instead:
[['60', '78'], ['70', '77'], ['80', '74'], ['90', '75'], ['100', '74'], ['110', '75']]
Can anyone please offer a solution?

You need to use list.extend instead of list.append.
newlist = []
for word in lines:
    word = word.split(",")
    newlist.extend(word) # <----
return newlist
Or, using list comprehension:
>>> lst = ['60,78', '70,77', '80,74', '90,75', '100,74', '110,75']
>>> [x for xs in lst for x in xs.split(',')]
['60', '78', '70', '77', '80', '74', '90', '75', '100', '74', '110', '75']
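To make the difference between the two methods concrete, here is a minimal sketch. The names `appended`, `extended` and `parts` are only for illustration and are not from the question.

```python
lines = ['60,78', '70,77']

appended = []
extended = []
for word in lines:
    parts = word.split(",")   # e.g. ['60', '78']
    appended.append(parts)    # adds the whole list as a single element
    extended.extend(parts)    # adds each element of the list separately

print(appended)  # [['60', '78'], ['70', '77']]
print(extended)  # ['60', '78', '70', '77']
```

So append nests each split result, while extend flattens it into the target list.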

str.split actually returns a list.
Return a list of the words in the string, using sep as the delimiter string.
Since you are appending the returned list to newlist, you are getting a list of lists. Instead, use the list.extend method, like this:
for word in lines:
    newlist.extend(word.split(","))
But you can simply use a nested list comprehension, like this:
>>> data = ['60,78', '70,77', '80,74', '90,75', '100,74', '110,75']
>>> [item for items in data for item in items.split(",")]
['60', '78', '70', '77', '80', '74', '90', '75', '100', '74', '110', '75']

Using itertools.chain:
from itertools import chain
print(list(chain.from_iterable(ele.split(",") for ele in l)))
['60', '78', '70', '77', '80', '74', '90', '75', '100', '74', '110', '75']
The more items you have to flatten, the more chain pays off; in this timing it is a bit faster than the list comprehension:
In [1]: l= ["1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20" for _ in range(100000)]
In [2]: from itertools import chain
In [3]: l= ["1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30" for _ in range(10000)]
In [4]: timeit (list(chain.from_iterable(ele.split(",") for ele in l)))
100 loops, best of 3: 17.7 ms per loop
In [5]: timeit [item for items in l for item in items.split(",")]
10 loops, best of 3: 20.9 ms per loop
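If you want to repeat this comparison outside IPython, something like the following should work with the standard timeit module. The test data and the function names here are made up for the sketch, and the numbers you get will depend on your machine and Python version, so treat any difference as indicative only.

```python
from itertools import chain
import timeit

# Hypothetical test data: 1000 strings of five comma-separated values each.
data = ["1,2,3,4,5"] * 1000

def with_chain():
    # Flatten lazily with chain.from_iterable, then build the list once.
    return list(chain.from_iterable(s.split(",") for s in data))

def with_comprehension():
    # Flatten with a nested list comprehension.
    return [item for s in data for item in s.split(",")]

# Both approaches must produce the same result before timing means anything.
assert with_chain() == with_comprehension()

print(timeit.timeit(with_chain, number=100))
print(timeit.timeit(with_comprehension, number=100))
```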

I think this is the easiest way (thanks to a friend who helped with this):
lines = ['60,78', '70,77', '80,74', '90,75', '100,74', '110,75']
for word in lines:
    chapter, number = word.split(',')  # unpack the two comma-separated values
    print(chapter, number)

Related

Get all the csv strings of a list as single elements in a new list with a comprehension list?

I have a list as follows:
listt = ['34','56,67','45,56,67','45']
I would like to get a flat list of single values.
This is my code:
new_list = []
for element in listt:
    if ',' in element:
        subl = element.split(',')
        new_list = new_list + subl
    else:
        new_list.append(element)
Result:
['34', '56', '67', '45', '56', '67', '45']
Is there a way to do this with a list comprehension (i.e. a one-liner)?
It looks like too much code for such a tiny thing.
Thanks.
spam = ['34','56,67','45,56,67','45']
eggs = [num for item in spam for num in item.split(',')]
print(eggs)
Output:
['34', '56', '67', '45', '56', '67', '45']
listt = ['34','56,67','45,56,67','45']
print(','.join(listt).split(','))
Prints:
['34', '56', '67', '45', '56', '67', '45']
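One thing to be aware of with the join/split trick: it gives the same result as the comprehension for the list in the question, but not for an empty list, because ''.split(',') returns [''] rather than []. A small sketch (the two function names are mine, just for comparison):

```python
def flatten_with_comprehension(items):
    # Split every string on commas and collect the pieces in one flat list.
    return [num for item in items for num in item.split(',')]

def flatten_with_join(items):
    # Join everything into one comma-separated string, then split it once.
    return ','.join(items).split(',')

listt = ['34', '56,67', '45,56,67', '45']
print(flatten_with_comprehension(listt) == flatten_with_join(listt))  # True

print(flatten_with_comprehension([]))  # []
print(flatten_with_join([]))           # ['']
```

If the input can be empty, the comprehension is the safer choice.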

Sort list of strings by position

I have a list of strings with the following pattern
my_list = ['/path/to/my/data/S1B_IW_GRDH_1SDV_20190610T030906_20190610T030931_016628_01F4BE_6B99_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190523T030954_20190523T031019_027349_0315A8_999E_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190511T030953_20190511T031018_027174_03102E_402F_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190628T030956_20190628T031021_027874_032595_0B1F_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190604T030955_20190604T031020_027524_031B16_BD33_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190622T030907_20190622T030932_016803_01F9F1_D6E9_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190505T030904_20190505T030929_016103_01E4AD_17B5_VV.tif']
I want to sort my list in chronological order using the time information present in each string (20190610, ...). The problem is that each file name begins with the pattern S1A or S1B, so a simple my_list.sort() does not work directly.
Looking at other posts, I have seen that the solution would be to use the key argument with some kind of pattern.
My question is how to start the sorting at a specific position of each string in my list. In my case I want to start sorting at position 35, right after _1SDV_.
I have seen some options like
from operator import itemgetter
my_list.sort(key = itemgetter(35))
or
my_list.sort(key = lambda x: x[35])
Copying @schwobaseggl's solution from the comments, the following should work.
my_list.sort(key = lambda x: x[35:])
Example:
>>> my_list = ['91', '82', '73', '64', '55', '46', '37', '28', '19']
>>> my_list.sort()
>>> my_list
['19', '28', '37', '46', '55', '64', '73', '82', '91']
>>> my_list.sort(key = lambda x: x[1:]) # sorting after first position
>>> my_list
['91', '82', '73', '64', '55', '46', '37', '28', '19']
Using regex:
import re
my_list = ['/path/to/my/data/S1B_IW_GRDH_1SDV_20190610T030906_20190610T030931_016628_01F4BE_6B99_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190523T030954_20190523T031019_027349_0315A8_999E_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190511T030953_20190511T031018_027174_03102E_402F_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190628T030956_20190628T031021_027874_032595_0B1F_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190604T030955_20190604T031020_027524_031B16_BD33_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190622T030907_20190622T030932_016803_01F9F1_D6E9_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190505T030904_20190505T030929_016103_01E4AD_17B5_VV.tif']
my_list.sort(key=lambda x: re.findall(r"\d{8}", x)[0])
print(my_list)
Output:
['/path/to/my/data/S1B_IW_GRDH_1SDV_20190505T030904_20190505T030929_016103_01E4AD_17B5_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190511T030953_20190511T031018_027174_03102E_402F_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190523T030954_20190523T031019_027349_0315A8_999E_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190604T030955_20190604T031020_027524_031B16_BD33_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190610T030906_20190610T030931_016628_01F4BE_6B99_VV.tif',
'/path/to/my/data/S1B_IW_GRDH_1SDV_20190622T030907_20190622T030932_016803_01F9F1_D6E9_VV.tif',
'/path/to/my/data/S1A_IW_GRDH_1SDV_20190628T030956_20190628T031021_027874_032595_0B1F_VV.tif']
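Both answers above depend on the exact shape of the path: the slice breaks if the directory length changes, and the regex picks the first run of 8 digits anywhere in the string. As an alternative sketch, you could parse the timestamp from the file name itself. This assumes the fifth underscore-separated field of the file name is always the start time in YYYYMMDDTHHMMSS form, as in the example paths; `acquisition_time` is a hypothetical helper name.

```python
import os
from datetime import datetime

def acquisition_time(path):
    # Assumption: the file name looks like S1B_IW_GRDH_1SDV_<start>_<stop>_...
    name = os.path.basename(path)
    stamp = name.split('_')[4]          # e.g. '20190610T030906'
    return datetime.strptime(stamp, '%Y%m%dT%H%M%S')

paths = [
    '/path/to/my/data/S1B_IW_GRDH_1SDV_20190610T030906_20190610T030931_016628_01F4BE_6B99_VV.tif',
    '/path/to/my/data/S1A_IW_GRDH_1SDV_20190523T030954_20190523T031019_027349_0315A8_999E_VV.tif',
    '/path/to/my/data/S1B_IW_GRDH_1SDV_20190505T030904_20190505T030929_016103_01E4AD_17B5_VV.tif',
]

# Sort by the parsed datetime instead of by raw characters.
paths.sort(key=acquisition_time)
for p in paths:
    print(p)
```

Parsing into a datetime also fails loudly (ValueError) if a file name does not match the expected pattern, instead of silently sorting it in the wrong place.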

How to read 1st column from csv and separate into multidimensional array

I am trying to separate a column that I read from a .csv file into a multidimensional array. So, if the first column is read into a single array and looks like this:
t = ['90-0066', '24', '33', '34', '91-0495', '22', '33', '92-6676', '23', '32']
How do I write Python code so that, for every value like '90-0066', the numbers that follow are put into an array until the next value containing a '-'? I would like the array to look like:
t = [['24', '33', '34'], ['22', '33'], ['23', '32']]
Thanks!
You can use itertools.groupby in a list comprehension:
from itertools import groupby
t = [list(g) for k, g in groupby(t, key=str.isdigit) if k]
t becomes:
[['24', '33', '34'], ['22', '33'], ['23', '32']]
If the numbers may be floating-point values, you can use a regex instead:
import re
t = [list(g) for k, g in groupby(t, key=lambda s: bool(re.match(r'\d+(?:\.\d+)?$', s)) if k]
Or use zip_longest with two list comprehensions:
>>> from itertools import zip_longest
>>> l=[i for i,v in enumerate(t) if not v.isdigit()]
>>> [t[x+1:y] for x,y in zip_longest(l,l[1:])]
[['24', '33', '34'], ['22', '33'], ['23', '32']]
>>>
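If the groupby answer looks like magic, it may help to print what groupby actually yields. It groups consecutive items that share the same key, so with str.isdigit as the key you get alternating runs of headers (False) and numbers (True). A small sketch, where `runs` is just an illustrative name:

```python
from itertools import groupby

t = ['90-0066', '24', '33', '34', '91-0495', '22', '33', '92-6676', '23', '32']

# Each run is (key, items): consecutive elements with the same isdigit() result.
runs = [(is_number, list(group)) for is_number, group in groupby(t, key=str.isdigit)]
for is_number, items in runs:
    print(is_number, items)
# False ['90-0066']
# True ['24', '33', '34']
# False ['91-0495']
# True ['22', '33']
# False ['92-6676']
# True ['23', '32']
```

Keeping only the runs whose key is True gives the nested list asked for in the question.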

Python Lists Unsolved [duplicate]

I'm pretty new to Python, although I have learned most of the basics. I need to read from a csv file (which works so far) and append the data from it into lists (which also works). The part I am unsure about is combining two of these lists and then applying / 120 and * 100.
For example, the first score in list1 is 55 and the first in list2 is 51. I want to add these together into a new list, giving 106, and then divide and multiply each total, as there are 7 different numbers in each list.
import csv
list1 = []
list2 = []
with open("scores.csv") as f:
    reader = csv.reader(f)
    for row in reader:
        list1.append(row[1])
        list2.append(row[2])
print(list1)
print(list2)
OUTPUT
['55', '25', '40', '21', '52', '42', '19']
['51', '36', '50', '39', '53', '33', '40']
EXPECTED OUTPUT (WANTED OUTPUT)
['106', '61', '90', '60', '105', '75', '59']
Each value then needs to be divided by 120 and multiplied by 100.
Check out zip.
for a, b in zip(list1, list2):
    # .... do stuff
So for you, maybe:
output = [((int(a)+int(b))/120)*100 for a, b in zip(list1, list2)]
Make a new list that takes your desired calculations into account.
>>> list1 = ['55', '25', '40', '21', '52', '42', '19']
>>> list2 = ['51', '36', '50', '39', '53', '33', '40']
>>> result = [(int(x)+int(y))/1.2 for x,y in zip(list1, list2)]
>>> result
[88.33333333333334, 50.833333333333336, 75.0, 50.0, 87.5, 62.5, 49.16666666666667]
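If you also want to keep the intermediate totals and get tidier numbers, you could split this into two steps. This is only a sketch: it assumes 120 is the maximum combined score (the question does not say so explicitly), and rounding to one decimal place is my choice, not a requirement from the question.

```python
list1 = ['55', '25', '40', '21', '52', '42', '19']
list2 = ['51', '36', '50', '39', '53', '33', '40']

# Step 1: convert the csv strings to int and add them pairwise.
totals = [int(a) + int(b) for a, b in zip(list1, list2)]

# Step 2: express each total as a percentage of 120 (assumed maximum).
percentages = [round(total / 120 * 100, 1) for total in totals]

print(totals)       # [106, 61, 90, 60, 105, 75, 59]
print(percentages)  # [88.3, 50.8, 75.0, 50.0, 87.5, 62.5, 49.2]
```

Note that the values read by csv.reader are strings, which is why the int() conversion is needed before adding.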

removing whitespace and \n from list in python?

list1= ['34 5\n', '67 37\n', '40 33\n', '99 100\n', '55 22']
The above is the list that I have. How can I turn it into
['34','5','67','37','40','33','99','100','55','22']
I want to remove the whitespace and '\n'. I have tried rstrip, strip and remove, but none of them worked:
list1.rstrip('\n')
list1.strip('\n')
list1.remove('\n')
Use a pair of nested list comprehensions, the inner of which splits the strings on whitespace and the outer of which merges the nested lists together.
>>> list1= ['34 5\n', '67 37\n', '40 33\n', '99 100\n', '55 22']
>>> [x for sublist in [s.split() for s in list1] for x in sublist]
['34', '5', '67', '37', '40', '33', '99', '100', '55', '22']
Or, if this isn't to your liking, do it with loops instead. (This is probably clearer than a nested list comprehension, come to think of it.)
>>> result = []
>>> for s in list1:
...     for num in s.split():
...         result.append(num)
...
>>> result
['34', '5', '67', '37', '40', '33', '99', '100', '55', '22']
from itertools import chain
list2 = list(chain.from_iterable(map(str.split,list1)))
You can break this down to be somewhat more readable:
flatten_list = chain.from_iterable
list_of_splits = map(str.split, list1)  # essentially [s.split() for s in list1]
list2 = list(flatten_list(list_of_splits))  # wrap in list(); chain alone is a lazy iterator
list2 = []
for item1 in list1:
    split_list = item1.split(" ")  # split the string on spaces
    for item2 in split_list:
        list2.append(item2.rstrip())  # rstrip removes the trailing '\n'
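As for why the attempts in the question failed: strip and rstrip are string methods, so they do not exist on the list itself, and they would have to be applied to each element. Also, split() with no argument splits on any whitespace, including '\n', which is why the answers above do not need a separate strip step. A short sketch using the list from the question:

```python
list1 = ['34 5\n', '67 37\n', '40 33\n', '99 100\n', '55 22']

# Lists have no strip method; it only exists on the strings inside.
print(hasattr(list1, 'strip'))   # False

# split() with no argument treats spaces and newlines alike.
print(list1[0].split())          # ['34', '5']
# split(' ') only splits on spaces, so the newline stays attached.
print(list1[0].split(' '))       # ['34', '5\n']

result = [num for s in list1 for num in s.split()]
print(result)  # ['34', '5', '67', '37', '40', '33', '99', '100', '55', '22']
```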
