I am reading a text file which contains some numbers and letters in each row.
The first number of each row is a unique ID, and I want to copy all the same IDs into a separate list.
For example, if my list after reading the file is something like this:
[
['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20'],
...
]
The expected output should be the following:
[
['507', 'W', '1000', '1','507', 'M', '4', '12','507', 'M', '8', '20'],
['1', 'M', '6', '2','1', 'M', '8', '8','1', 'T', '101', '10','1', 'W', '1700', '15','1', 'M', '7', '16']
...
]
and so on for all other unique IDs in file.
All the rows starting with "507" should be stored in a different list and the rows starting with "1" stored in another and so forth.
My current code:
import operator
fileName = '/home/salman/Desktop/input.txt'
lineList = []
first_number = []
common_number = []
with open(fileName) as f:
    for line in f:
        lineList = f.readlines()
        lineList.append(line)
lineList = [line.rstrip('\n') for line in open(fileName)]
first_number = [i.split()[0] for i in lineList]
print("Rows in list:" + str(lineList))
print("First number in list : " + str(first_number))
common_number = list(set(first_number))
print("Common Numbers in first number list : "+ str(common_number))
print("Repeated value and their index's are :")
This is my attempt. First, please read the documentation on groupby: https://docs.python.org/3/library/itertools.html#itertools.groupby and note why it is important to sort your sequence first. Here the key is the first element of each list, so I sort by that. For sorted, see: https://docs.python.org/3/howto/sorting.html
Flatten a list of lists: How to make a flat list out of list of lists?
Explanation: Sort the elements so that entries with the same key (the first element) are consecutive. When the key changes, we know that all items with the previous key have been exhausted, so we only need to find where the first element changes between consecutive entries. That is what the groupby object provides. It yields tuples of (key, group), where key is the first element that identifies each group and group is an iterator over all lists with that key (effectively a list of lists once unpacked). We unpack the groups and flatten them.
import itertools
lst = [
['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20']
]
lst = sorted(lst, key=lambda x: x[0])
groups = itertools.groupby(lst, key=lambda x: x[0])
groups = [[*group] for _, group in groups]
# 3rd element
grp_3rd = [[entry[2] for entry in group] for group in groups]
# you could sum it up right here
grp_3rd = [sum(float(entry[2]) for entry in group) for group in groups]
# or you could do to see each key and the corresponding sum i.e. {'1': 3222.0, '507': 1012.0}
grp_3rd = {group[0][0]: sum(float(entry[2]) for entry in group) for group in groups}
# continue on to your output
flatten = lambda list_: [sublist for l in list_ for sublist in l]
groups = [flatten(group) for group in groups]
output:
[['1', 'M', '6', '2', '1', 'W', '1400', '3', '1', 'M', '8', '8', '1', 'T', '101', '10', '1','W', '1700', '15', '1', 'M', '7', '16'],
['507', 'W', '1000', '1', '507', 'M', '4', '12', '507', 'M', '8', '20']]
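To see why the sort step matters, here is a small sketch comparing groupby on unsorted and sorted input. The shortened rows data and the helper name first are illustrative only and not part of the answer above. Note that the IDs are strings, so sorting orders them lexicographically, which is why '1' comes before '507' in the output.

```python
from itertools import groupby

# Illustrative data: the first element of each row is the ID.
rows = [['507', 'W'], ['1', 'M'], ['507', 'M'], ['1', 'T']]


def first(row):
    """Key function: the ID is the first element of the row."""
    return row[0]


# Without sorting, groupby starts a new group every time the key changes,
# so the same ID shows up in several groups.
unsorted_keys = [key for key, _ in groupby(rows, key=first)]

# After sorting by the same key, each ID appears exactly once.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=first), key=first)]

print(unsorted_keys)  # ['507', '1', '507', '1']
print(sorted_keys)    # ['1', '507']
```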
The answer from Cedric below is easier to understand so if you can easily follow that here is how you could change it.
rows = [['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20']]
# get the output and sum directly
merged = {}
for row in rows:
    if row[0] not in merged:
        merged[row[0]] = [[], 0]
    merged[row[0]][0].extend(row[1:])
    merged[row[0]][1] += float(row[2])
# get the output and the list of 3rd elements
merged = {}
for row in rows:
    if row[0] not in merged:
        merged[row[0]] = ([], [])
    merged[row[0]][0].extend(row[1:])
    merged[row[0]][1].append(float(row[2]))
Something like this:
rows = [['507', 'W', '1000', '1'],
['1', 'M', '6', '2'],
['1', 'W', '1400', '3'],
['1', 'M', '8', '8'],
['1', 'T', '101', '10'],
['507', 'M', '4', '12'],
['1', 'W', '1700', '15'],
['1', 'M', '7', '16'],
['507', 'M', '8', '20']]
merged = {}
for row in rows:
    if row[0] in merged:
        merged[row[0]].extend(row[1:])
    else:
        merged[row[0]] = list(row)  # copy, so later extend() calls do not modify the original row
print(merged)
Output:
{
'507': ['507', 'W', '1000', '1', 'M', '4', '12', 'M', '8', '20'],
'1': ['1', 'M', '6', '2', 'W', '1400', '3', 'M', '8', '8', 'T', '101', '10', 'W', '1700', '15', 'M', '7', '16']
}
Or .extend(row) if you really want to repeat the ID
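As a rough sketch of that variant applied straight to the file contents: the snippet below assumes whitespace-separated fields (as in the question's split() call) and uses an in-memory io.StringIO as a stand-in for the real file, so the sample text and variable names are made up for illustration.

```python
import io

# Stand-in for the input file; in practice this would be open(fileName).
sample = io.StringIO(
    "507 W 1000 1\n"
    "1 M 6 2\n"
    "1 W 1400 3\n"
    "507 M 4 12\n"
)

merged = {}
for line in sample:
    fields = line.split()
    if not fields:  # skip blank lines
        continue
    # setdefault creates an empty list the first time an ID is seen;
    # extend(fields) keeps the ID in every chunk, as in the expected output.
    merged.setdefault(fields[0], []).extend(fields)

# dicts keep insertion order (Python 3.7+), so groups follow first appearance.
result = list(merged.values())
print(result)
# [['507', 'W', '1000', '1', '507', 'M', '4', '12'], ['1', 'M', '6', '2', '1', 'W', '1400', '3']]
```

Unlike the groupby approach, this keeps the groups in the order the IDs first appear and needs no sorting.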
I am fairly new to coding, and I need to put columns from a CSV file into a list. I cannot use any libraries like Pandas. This is the current code I have, but it is taking each character individually. What do I need to change so it takes the entire word?
def readfile(f):
    with open(f) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        for i in csv_reader:
            newlist = list(i[1])
            print(newlist)
This is an example of the output created.
['P', 'O', 'P', 'U', 'L', 'A', 'T', 'I', 'O', 'N']
['5', '2', '2', ',', '8', '1', '8']
['1', '5', '5', ',', '6', '5', '6']
['9', '6', '6', ',', '7', '0', '9']
['7', '7', '3', ',', '8', '8', '7']
['8', ',', '4', '4', '7', ',', '6', '0', '9']
['1', '4', ',', '4', '8', '4', ',', '2', '4', '2']
['1', ',', '3', '6', '4', ',', '4', '0', '0']
['1', ',', '1', '7', '1', ',', '0', '2', '7']
['4', ',', '3', '5', '0', ',', '9', '0', '1']
['5', ',', '0', '4', '6', ',', '7', '8', '0']
['4', '0', ',', '6', '0', '1']
['4', '4', ',', '9', '0', '9']
['3', '8', ',', '6', '6', '6']
I need it to all be in one list, like [522,818 , 155,656 , etc]
Assuming you would like to concatenate the rows from a csv containing a list in each row, such that an input csv looking like:
population
1,2
3,4
would print ['1', '2', '3', '4'] (csv.reader returns every field as a string)
You can use the extend method of Python's built-in list.
Here's how it would look:
import csv
with open('example.csv', newline='') as ff:
    reader = csv.reader(ff)
    next(reader)  # skip the header that you aren't using (reader.next() only exists in Python 2)
    concat_output = []
    for row in reader:
        concat_output.extend(row)
    print(concat_output)
Perhaps this is what you are looking for:
>>> ''.join(['5', '2', '2', ',', '8', '1', '8'])
'522,818'
I just found this earlier thread which provides more background/terminology: How to concatenate items in a list to a single string?.
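For the original question, the characters are being split apart by list(i[1]), since calling list() on a string yields its individual characters. A minimal sketch of collecting the whole field instead is shown below. The CSV text, the column names, and the populations variable are invented for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# Stand-in for the CSV file; the column names and values are made up.
csv_file = io.StringIO(
    'STATE,POPULATION\n'
    'A,"522,818"\n'
    'B,"155,656"\n'
)

reader = csv.reader(csv_file)
next(reader)  # skip the header row

populations = []
for row in reader:
    # row[1] is already the whole field; list(row[1]) would split it
    # into single characters.
    populations.append(row[1])

print(populations)  # ['522,818', '155,656']
```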
I have a list that contains a set of strings like this:
list = ['235,ACCESS,19841136,22564960,4291500,20,527434,566876','046,ALLOWED,24737321,27863065,1086500,3,14208500,14254500']
I'm trying to make the elements of the list a sublist but without splitting the string.
I tried new_list = list(map(list, list)). This is the result taking as reference the first element of the list:
print(new_list[0]):
[['2', '3', '5', ',', 'A', 'C', 'C', 'E', 'S',',','1', '9', '8', '4', '1', '1', '3', '6', ',', '2', '2', '5', '6', '4', '9', '6', '0', ',', '4', '2', '9', '1', '5', '0', '0', ',', '2', '0', ',', '5', '2', '7', '4', '3', '4', ',', '5', '6', '6', '8', '7', '6']]
I would like this output:
print(new_list[0]):
[[235,'ACCESS',19841136,22564960,4291500,20,527434,566876]]
Thanks in advance for your help!
You can try split() with ',' as the delimiter, like this:
new_list = [i.split(',') for i in list]
print (new_list[0])
Output:
['235', 'ACCESS', '19841136', '22564960', '4291500', '20', '527434', '566876']
Note that the numbers are still represented as strings here. If you want integers instead, you can use the isdigit() method like this:
new_list = [[int(e) if e.isdigit() else e for e in i.split(',')] for i in list]
print(new_list[0])
Output:
[235, 'ACCESS', 19841136, 22564960, 4291500, 20, 527434, 566876]
Also, please try to avoid naming your list list, because that shadows the built-in list type.
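One limitation of the isdigit() check is that it returns False for negative numbers such as '-20', so those would stay strings. A possible alternative, sketched here with a hypothetical helper to_int_if_possible and made-up records, is to attempt the conversion and fall back to the original string. As with the isdigit() version, leading zeros are lost ('046' becomes 46).

```python
def to_int_if_possible(text):
    """Return int(text) when it parses as an integer, else the original string."""
    try:
        return int(text)
    except ValueError:
        return text


# Shortened, made-up records in the same comma-separated shape as the question.
records = ['235,ACCESS,19841136,-20', '046,ALLOWED,24737321,3']

new_list = [[to_int_if_possible(field) for field in record.split(',')]
            for record in records]

print(new_list[0])  # [235, 'ACCESS', 19841136, -20]
```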
So I have that a is
['2019313251', 'V', '11', '58', 'am']
['2017393939', 'V', '12', '03']
['2020123456', 'V', '13', '24']
['1997031312', 'V', '13', '25']
['2013313990', 'V', '13', '32', 'pm']
['2018423519', 'V', '14', '10', 'pm']
['2019313251', 'E', '2', '58', 'pm']
['2017393939', 'V', '3', '03']
['2017393939', 'E', '3', '04']
['2019313251', 'E', '5', '48', 'pm']
['2017313882', 'E', '17', '54']
and I want to get the values of a[2] for 'V' and make a list with it.
if a[1] == "V":
    b = a[2]
    b = b.strip().split(' ')
I tried this code but the output is
['11']
['12']
['13']
...
and so on. How can I make it horizontally and make a list to get a result like ['11', '12', '13', ...]
a = [['2019313251', 'V', '11', '58', 'am'],
['2017393939', 'V', '12', '03'],
['2020123456', 'V', '13', '24'],
['1997031312', 'V', '13', '25'],
['2013313990', 'V', '13', '32', 'pm'],
['2018423519', 'V', '14', '10', 'pm'],
['2019313251', 'E', '2', '58', 'pm'],
['2017393939', 'V', '3', '03'],
['2017393939', 'E', '3', '04'],
['2019313251', 'E', '5', '48', 'pm'],
['2017313882', 'E', '17', '54']]
output = [x[2] for x in a if x[1] == 'V']
print(output)
output is your result like ['11', '12', '13', ...]
I have 3 nested lists:
STEP = [['S', '1', 'B', '3'], ['S', '3', 'B', '11'], ['S', '5', 'B', '12'], ['S', '4', 'B', '13'], ['S', '2', 'B', '14']]
TRANSITION = [['T', '2', 'B', '4'], ['T', '7', 'B', '4'], ['T', '3', 'S', '4'], ['T', '5', 'S', '5'], ['T', '1', 'S', '2'], ['T', '8', 'S', '2'], ['T', '6', 'S', '1'], ['T', '9', 'S', '2'], ['T', '4', 'S', '1'], ['T', '10', 'S', '1']]
BRANCH = [['B', '3', 'T', '1'], ['B', '3', 'T', '7'], ['B', '4', 'S', '3'], ['B', '11', 'T', '3'], ['B', '11', 'T', '5'], ['B', '12', 'T', '6'], ['B', '12', 'T', '8'], ['B', '13', 'T', '4'], ['B', '13', 'T', '9'], ['B', '14', 'T', '2'], ['B', '14', 'T', '10']]
Each element holds information as such:
# Example
STEP[0] = ['S', '1', 'B', '3']
Where:
'S' is the STEP type
'1' is the STEP number id
'B' is the linked BRANCH type
'3' is the linked BRANCH number id
Starting from a STEP the data is all linked, so using the linked reference you can find the next element and the next until another STEP is reached.
These are some parameters of the data:
STEPS are connected to single BRANCHES
BRANCHES are connected to one or more TRANSITIONS
TRANSITIONS can be connected to a single BRANCH or STEP
The BRANCH data can have a fork where a single BRANCH id has one or more options for TRANSITIONS.
I would like to combine these forks under the same BRANCH id, i.e.:
# BRANCH[0] and BRANCH[1] both have an id of '3'
# therefore, need to be combined
BRANCH[0] = ['B', '3', 'T', ['1', '7']]
This should be done to create a new list that combines all 'like' BRANCHES.
My attempt thus far (did not get very far):
for i in B:
    if i[1] == B['all except current i'][1]
        # append the branch id and the two transitions
I'm pretty sure there are easier ways, but based on your example, you can try:
BRANCH = [['B', '3', 'T', '1'], ['B', '3', 'T', '7'], ['B', '4', 'S', '3'], ['B', '11', 'T', '3'], ['B', '11', 'T', '5'], ['B', '12', 'T', '6'], ['B', '12', 'T', '8'], ['B', '13', 'T', '4'], ['B', '13', 'T', '9'], ['B', '14', 'T', '2'], ['B', '14', 'T', '10']]
tmp = {}
final = []
# Collect the linked ids for each "type-number" key.
for x in BRANCH:
    if f"{x[0]}-{x[1]}" not in tmp:
        tmp[f"{x[0]}-{x[1]}"] = [x[3]]
    else:
        tmp[f"{x[0]}-{x[1]}"].append(x[3])
# Rebuild one entry per key, skipping duplicates.
for k, v in tmp.items():
    one, two = k.split("-")
    for x in BRANCH:
        if x[0] == one and x[1] == two:
            if [one, two, x[2], v] not in final:
                final.append([one, two, x[2], v])
print(final)
[['B', '3', 'T', ['1', '7']], ['B', '4', 'S', ['3']], ['B', '11', 'T', ['3', '5']], ['B', '12', 'T', ['6', '8']], ['B', '13', 'T', ['4', '9']], ['B', '14', 'T', ['2', '10']]]
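One of the "easier ways" could look like the sketch below, which groups in a single pass using a dictionary keyed by a tuple. It assumes that entries should be merged when their first three fields match (type, number, linked type), which is slightly stricter than keying on the first two fields only. The names grouped and combined are illustrative, and the BRANCH data is shortened.

```python
# Shortened copy of the BRANCH data from the question.
BRANCH = [['B', '3', 'T', '1'], ['B', '3', 'T', '7'], ['B', '4', 'S', '3'],
          ['B', '11', 'T', '3'], ['B', '11', 'T', '5']]

grouped = {}
for kind, number, link_type, link_id in BRANCH:
    # Key on the first three fields; collect linked ids in order of appearance.
    grouped.setdefault((kind, number, link_type), []).append(link_id)

# Turn each (key, ids) pair back into the list shape used in the question.
combined = [[*key, ids] for key, ids in grouped.items()]

print(combined)
# [['B', '3', 'T', ['1', '7']], ['B', '4', 'S', ['3']], ['B', '11', 'T', ['3', '5']]]
```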
You can use a test for similarity of the branches and then loop over the branches checking similarity. The rest is just guarding against duplicates and massaging the data to a list of lists. I did randomize the data and add another item to check that it wouldn't choke on more than a pair of similar branches.
# Check similarity (first three fields equal).
def similar_p(one, two):
    for item in range(len(one) - 1):
        if one[item] != two[item]:
            return False
    return True
# Data. Works sorted and not.
branches = [
['B', '14', 'T', '2'],
['B', '12', 'T', '6'],
['B', '14', 'T', '10'],
['B', '13', 'T', '4'],
['B', '3', 'T', '9'],
['B', '12', 'T', '8'],
['B', '13', 'T', '9'],
['B', '3', 'T', '7'],
['B', '4', 'S', '3'],
['B', '11', 'T', '5'],
['B', '3', 'T', '1'],
['B', '11', 'T', '3'],
]
merge_dict = {}
# Loop over branches. Uncomment print statements to watch the action.
for i in range(len(branches)):
    # print('check for similars to branch {}'.format(branches[i]))
    # try/except to ensure the dictionary item is actually there.
    try:
        # print(merge_dict[tuple(branches[i][0:3])])
        if branches[i][3] not in merge_dict[tuple(branches[i][0:3])]:
            merge_dict[tuple(branches[i][0:3])].append(branches[i][3])
            # print('initial appending to branch {}'.format(branches[i]))
    except (KeyError):
        merge_dict[tuple(branches[i][0:3])] = [branches[i][3]]
        # print('starting branch {}'.format(branches[i]))
    for j in range((i + 1), len(branches), 1):
        if similar_p(branches[i], branches[j]):
            if branches[j][3] not in merge_dict[tuple(branches[i][0:3])]:
                merge_dict[tuple(branches[i][0:3])].append(branches[j][3])
                # print('appending similar branch {} to branch {}'.format(branches[j], branches[i]))
merged = list()
# Massage into a list. Sorting is on you, kid.
for k, v in merge_dict.items():
    if len(v) == 1:
        merged.append([*k, *v])
    else:
        merged.append([*k, v])
print(merged)
Output:
[['B', '14', 'T', ['2', '10']], ['B', '12', 'T', ['6', '8']], ['B', '13', 'T', ['4', '9']], ['B', '3', 'T', ['9', '7', '1']], ['B', '4', 'S', '3'], ['B', '11', 'T', ['5', '3']]]
I have a large list. An example of this list would be:
list = ['1', '2', '3', 'x', '4', '5', '6', 'x', '7', '8', '9', 'x']
I am struggling to create separate lists from this total list value separated by the value 'x'.
So for example, I need the above list to be converted to this:
list1 = ['1', '2', '3',]
list2 = ['4', '5', '6',]
list3 = ['7', '8', '9',]
Thank you for any help that you can give.
You can use a temporary list to collect the values between the 'x' markers. Whenever you reach an 'x', append the temporary list to your list of results and start a new temporary list.
my_list = ['1', '2', '3', 'x', '4', '5', '6', 'x', '7', '8', '9', 'x']
new_list = []
temp_list = []
for value in my_list:
    if value != 'x':
        temp_list.append(value)
    else:
        new_list.append(temp_list)
        temp_list = []
for sub_list in new_list:
    print(sub_list)
Results:
['1', '2', '3']
['4', '5', '6']
['7', '8', '9']
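The loop above only stores a sublist when it reaches an 'x', so any values after the last 'x' would be dropped if the input did not end with the delimiter. A minimal sketch that also keeps such a trailing remainder follows; the function name split_on and the sample input are hypothetical. Consecutive delimiters still produce empty sublists, as in the loop above.

```python
def split_on(values, delimiter):
    """Split values into sublists at each delimiter, keeping a trailing remainder."""
    groups = []
    current = []
    for value in values:
        if value == delimiter:
            groups.append(current)
            current = []
        else:
            current.append(value)
    if current:  # items after the last delimiter
        groups.append(current)
    return groups


print(split_on(['1', '2', 'x', '3', 'x', '4'], 'x'))  # [['1', '2'], ['3'], ['4']]
```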
Just for fun here is a version using all kinds of fun methods but IMO harder to follow.
test_list = ['1', '2', '3', 'x', '4', '5', '6', 'x', '7', '8', '9', 'x']
idx_list = [idx + 1 for idx, val in enumerate(test_list) if val == 'x']
res = [test_list[i: j] for i, j in zip([0] + idx_list, idx_list +
([len(test_list)] if idx_list[-1] != len(test_list) else []))]
print(res)
Here's a slightly more general version using groupby that will work for all iterables:
from itertools import groupby
def split(iterable, delimiter):
    return [list(v) for k, v in groupby(iterable, delimiter.__eq__) if not k]
l = ['1', '2', '3', 'x', '4', '5', '6', 'x', '7', '8', '9', 'x']
print(split(l, 'x'))
# [['1', '2', '3'], ['4', '5', '6'], ['7', '8', '9']]
You can do this concisely by going through a string, provided every item is a single character, as in this example. I join the list into a string so I can use split(), which cuts it into substrings at each 'x'. I then convert those substrings back to lists and drop the last, empty one.
l = ['1', '2', '3', 'x', '4', '5', '6', 'x', '7', '8', '9', 'x']
s = "".join(l)
list1,list2,list3 = [list(substring) for substring in s.split('x')][:-1] #To skip the empty list from the trailing 'x'