How to unpack values from a file - python

Suppose my input is a file that reads:
0->54:15
1->41:12
2->35:6
3->42:10
4->34:7
5->58:5
6->55:12
7->39:6
8->36:12
9->38:15
10->53:13
11->56:12
12->51:5
13->48:8
14->60:14
15->46:12
16->57:6
17->52:9
18->40:11
This is actually an adjacency list. I want my code to read the file and take the values as u=0, v=54, w=15 for each line, and then carry on with my plan. How can I do this? Thank you in advance for taking the time to read and answer this.

Using .split would be good.
For each line in the file (You can get this by using the open() function) split it using the arrow and the colon.
for line in lines:
    split_line = line.strip().split("->")  # Split by the arrow first
    split_line = [split_line[0]] + split_line[1].split(":")  # Then split the rest by the colon
    u, v, w = split_line  # Note u, v, and w are strings
I would recommend using JSON format so you can use the json module in Python to parse the file into variables easily.
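As a minimal sketch of the split approach, the helper below also converts the three pieces to int. The name parse_edge is made up for illustration, and the code assumes every line has exactly the u->v:w shape shown in the question.

```python
def parse_edge(line):
    """Parse one 'u->v:w' line into a tuple of three ints."""
    u, rest = line.strip().split("->")  # e.g. '0' and '54:15'
    v, w = rest.split(":")              # e.g. '54' and '15'
    return int(u), int(v), int(w)


# Sample lines copied from the question; a real script would
# iterate over the opened file instead of this list.
lines = ["0->54:15", "1->41:12", "2->35:6"]
edges = [parse_edge(line) for line in lines]
print(edges)
```

A line that does not match the expected shape raises ValueError here, which is usually preferable to silently reading wrong values.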

If you had a single string:
import re
s = \
'''0->54:15
1->41:12
2->35:6
3->42:10
4->34:7
5->58:5
6->55:12
7->39:6
8->36:12
9->38:15
10->53:13
11->56:12
12->51:5
13->48:8
14->60:14
15->46:12
16->57:6
17->52:9
18->40:11'''
s = s.split('\n')
output = [re.split('->|:', x) for x in s]
output
[['0', '54', '15'], ['1', '41', '12'], ['2', '35', '6'], ['3', '42', '10'], ['4', '34', '7'], ['5', '58', '5'], ['6', '55', '12'], ['7', '39', '6'], ['8', '36', '12'], ['9', '38', '15'], ['10', '53', '13'], ['11', '56', '12'], ['12', '51', '5'], ['13', '48', '8'], ['14', '60', '14'], ['15', '46', '12'], ['16', '57', '6'], ['17', '52', '9'], ['18', '40', '11']]
If you want a dictionary:
d = {x[0]:[x[1],x[2]] for x in output}
d
{'0': ['54', '15'], '1': ['41', '12'], '2': ['35', '6'], '3': ['42', '10'], '4': ['34', '7'], '5': ['58', '5'], '6': ['55', '12'], '7': ['39', '6'], '8': ['36', '12'], '9': ['38', '15'], '10': ['53', '13'], '11': ['56', '12'], '12': ['51', '5'], '13': ['48', '8'], '14': ['60', '14'], '15': ['46', '12'], '16': ['57', '6'], '17': ['52', '9'], '18': ['40', '11']}
If you want a dataframe:
import pandas as pd
df = pd.DataFrame(output, columns=['u','v','w'])
df
     u   v   w
0    0  54  15
1    1  41  12
2    2  35   6
3    3  42  10
4    4  34   7
5    5  58   5
6    6  55  12
7    7  39   6
8    8  36  12
9    9  38  15
10  10  53  13
11  11  56  12
12  12  51   5
13  13  48   8
14  14  60  14
15  15  46  12
16  16  57   6
17  17  52   9
18  18  40  11

Here is how you can use re.split() to split strings with multiple delimiters:
from re import split
with open('file.txt', 'r') as f:
    l = f.read().splitlines()
lst = [list(filter(None, split(r'[(\-\>):]', s))) for s in l]  # raw string keeps the backslashes intact
print(lst)
Output:
[['0', '54', '15'],
['1', '41', '12'],
['2', '35', '6'],
['3', '42', '10'],
['4', '34', '7'],
['5', '58', '5'],
['6', '55', '12'],
['7', '39', '6'],
['8', '36', '12'],
['9', '38', '15'],
['10', '53', '13'],
['11', '56', '12'],
['12', '51', '5'],
['13', '48', '8'],
['14', '60', '14'],
['15', '46', '12'],
['16', '57', '6'],
['17', '52', '9'],
['18', '40', '11']]
Breaking it down:
This: lst = [list(filter(None, split(r'[(\-\>):]', s))) for s in l] is the equivalent of:
lst = []  # The main list
for s in l:  # For every line in the list of lines
    uvw = split(r'[(\-\>):]', s)  # Split on each delimiter character; "->" leaves an empty string between "-" and ">"
    uvw = list(filter(None, uvw))  # There is an empty string in the list, so filter it out
    lst.append(uvw)  # Add the list to the main list
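Since the question describes the data as an adjacency list, a possible next step is to group the parsed triples by source node. This is only a sketch: the EDGE pattern and the build_adjacency name are assumptions for illustration, the fields are assumed to be non-negative integers, and the sample input (with a repeated source node) is made up.

```python
import re
from collections import defaultdict

# One edge per line: source, arrow, target, colon, weight.
EDGE = re.compile(r"(\d+)->(\d+):(\d+)")


def build_adjacency(lines):
    """Map each source node u to a list of (v, w) pairs."""
    adjacency = defaultdict(list)
    for line in lines:
        match = EDGE.fullmatch(line.strip())
        if match is None:
            continue  # skip blank or malformed lines
        u, v, w = map(int, match.groups())
        adjacency[u].append((v, w))
    return dict(adjacency)


# Made-up sample: node 0 appears twice and one line is blank.
graph = build_adjacency(["0->54:15", "1->41:12", "", "0->35:6"])
print(graph)
```

Matching the whole line with one pattern avoids the empty string that the character-class split produces, so no filter step is needed.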

I'm going to challenge the way that you're getting the input file in the first place: if you have any control over how you get this input, I'd encourage you to change its format. (If not, maybe this answer will help people who have a similar issue in the future).
There is typically little reason to "roll your own" serialization and deserialization like this - it's reinventing the wheel, given that most modern languages have built-in libraries to do this already. Rather, if at all possible, you should use a standard serialization and deserialization mechanism like Python pickle or a JSON serializer (or even a CSV, so that you can use a CSV parser).
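To illustrate the CSV option, here is a rough sketch that writes three edges with the standard csv module and reads them back. The column names u, v and w follow the question; io.StringIO stands in for a real file, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

edges = [(0, 54, 15), (1, 41, 12), (2, 35, 6)]

# Write the edges as CSV with a header row.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["u", "v", "w"])
writer.writerows(edges)

# Read them back; DictReader takes the field names from the header row.
buffer.seek(0)
loaded = [(int(r["u"]), int(r["v"]), int(r["w"])) for r in csv.DictReader(buffer)]
print(loaded)
```

With a real file you would pass open(path, newline="") to csv.writer and csv.DictReader in place of the buffer.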

Related

How to append a 2-dimensional list to a 3-dimensional list? [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 1 year ago.
It should work, but it doesn't, here's the code:
with open('data.txt') as f:
    lines = f.readlines()
# creating a 3-dimensional list for the tables
all_tables = []
table = []
row = []
for line in lines:
    if line != "\n":
        line = line.rstrip()
        row = line.split(" ")
        while "" in row:
            row.remove("")
        for number in row:
            number = int(number)
        table.append(row)
    else:  # if line is empty
        print(f"adding this table: {table}")
        all_tables.append(table)
        print(f"all_tables is now: {all_tables}")
        table.clear()
# did this, since the last table wouldn't get appended
# since the code block beneath the else won't get executed the last time
print(f"adding this table: {table}")
all_tables.append(table)
print(f"all_tables are now: {all_tables}")
but the output of the code is:
adding this table: [['97', '18', '90', '62', '17'], ['98', '88', '49', '41', '74'], ['66', '9', '83', '69', '91'], ['33', '57', '3', '71', '43'], ['11', '50', '7', '10', '28']]
all_tables is now: [[['97', '18', '90', '62', '17'], ['98', '88', '49', '41', '74'], ['66', '9', '83', '69', '91'], ['33', '57', '3', '71', '43'], ['11', '50', '7', '10', '28']]]
adding this table: [['6', '34', '13', '5', '9'], ['50', '21', '66', '77', '3'], ['60', '74', '40', '12', '33'], ['69', '57', '99', '18', '95'], ['70', '72', '49', '71', '87']]
all_tables is now: [[['6', '34', '13', '5', '9'], ['50', '21', '66', '77', '3'], ['60', '74', '40', '12', '33'], ['69', '57', '99', '18', '95'], ['70', '72', '49', '71', '87']], [['6', '34', '13', '5', '9'], ['50', '21', '66', '77', '3'], ['60', '74', '40', '12', '33'], ['69', '57', '99', '18', '95'], ['70', '72', '49', '71', '87']]]
adding this table: [['75', '12', '11', '91', '56'], ['82', '22', '18', '77', '10'], ['85', '1', '13', '89', '31'], ['62', '69', '39', '5', '92'], ['16', '49', '21', '60', '64']]
all_tables are now: [[['75', '12', '11', '91', '56'], ['82', '22', '18', '77', '10'], ['85', '1', '13', '89', '31'], ['62', '69', '39', '5', '92'], ['16', '49', '21', '60', '64']], [['75', '12', '11', '91', '56'], ['82', '22', '18', '77', '10'], ['85', '1', '13', '89', '31'], ['62', '69', '39', '5', '92'], ['16', '49', '21', '60', '64']], [['75', '12', '11', '91', '56'], ['82', '22', '18', '77', '10'], ['85', '1', '13', '89', '31'], ['62', '69', '39', '5', '92'], ['16', '49', '21', '60', '64']]]
Process finished with exit code 0
with my data.txt being:
97 18 90 62 17
98 88 49 41 74
66 9 83 69 91
33 57 3 71 43
11 50 7 10 28

6 34 13 5 9
50 21 66 77 3
60 74 40 12 33
69 57 99 18 95
70 72 49 71 87

75 12 11 91 56
82 22 18 77 10
85 1 13 89 31
62 69 39 5 92
16 49 21 60 64
So instead of appending "table" to "all_tables", it appends it and changes all existing elements of "all_tables" to "table".
How can I prevent that from happening?
I just want to add the 2-dimensional list to the 3-dimensional list
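One way to avoid this, sketched under the assumption that the tables in data.txt are separated by blank lines: all_tables.append(table) stores a reference to the same list object, and table.clear() then empties that object in place, so every entry in all_tables changes together. Binding table to a fresh list after each append keeps the stored tables independent. The function name read_tables and the shortened sample rows are made up for illustration.

```python
def read_tables(lines):
    """Group rows of ints into tables separated by blank lines."""
    all_tables = []
    table = []
    for line in lines:
        if line.strip():
            # split() with no argument also swallows repeated spaces
            table.append([int(number) for number in line.split()])
        elif table:
            all_tables.append(table)
            table = []  # start a NEW list instead of clearing the stored one
    if table:  # the last table has no blank line after it
        all_tables.append(table)
    return all_tables


sample = ["97 18 90\n", "98 88 49\n", "\n", "6 34 13\n", "50 21 66\n"]
tables = read_tables(sample)
print(tables)
```

Appending a copy, for example all_tables.append(table.copy()) before table.clear(), would work as well for this case because the rows themselves are never modified afterwards.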

How to check if a value in a text file exists in a 2D array?

I have a 2D array that looks like this:
[['A.J. Greer', 'COL', 'LW', '15', '1', '1', '2', '14', '9', '20', '5'],
['Aaron Ekblad', 'FLA', 'D', '82', '13', '24', '37', '47', '180', '114', '88'],
['Adam Clendening', 'CLS', 'D', '4', '0', '0', '0', '0', '3', '1', '3'],
['Adam Cracknell', 'FA', 'C', '2', '0', '0', '0', '0', '3', '6', '0'],
['Adam Erne', 'DET', 'LW', '65', '7', '13', '20', '40', '70', '159', '26'],
['Adam Gaudette', 'VAN', 'C', '56', '5', '7', '12', '18', '55', '48', '15'],
['Adam Henrique', 'ANH', 'C', '82', '18', '24', '42', '24', '122', '78', '71'],
['Adam Johnson', 'PIT', 'C', '6', '0', '2', '2', '0', '3', '11', '3'],
['Adam Larsson', 'EDM', 'D', '82', '3', '17', '20', '44', '117', '256', '128'],
['Adam Lowry', 'WPG', 'LW', '78', '12', '11', '23', '33', '105', '223', '49'],
['Adam McQuaid', 'FA', 'D', '50', '3', '4', '7', '42', '28', '122', '88'],
['Adam Pelech', 'NYI', 'D', '78', '5', '16', '21', '24', '110', '149', '116'],
['Adrian Kempe', 'LA', 'C', '81', '12', '16', '28', '50', '118', '86', '21'],
['Alan Quine', 'CGY', 'C', '13', '3', '2', '5', '6', '11', '14', '2'],
['Alec Martinez', 'LA', 'D', '60', '4', '14', '18', '8', '78', '78', '135'],
['Aleksander Barkov', 'FLA', 'C', '82', '35', '61', '96', '8', '206', '28', '61'],
['Alex Biega', 'VAN', 'D', '41', '2', '14', '16', '22', '91', '101', '43'],
['Alex Chiasson', 'EDM', 'RW', '73', '22', '16', '38', '32', '123', '85', '31']]
It's a list of players and their stats. I also have a text file that looks like this:
Name Team Pos Games G A Pts PIM SOG Hits BS
================================================================================
A.J. Greer COL LW 15 1 1 2 14 9 20 5
Aaron Ekblad FLA D 82 13 24 37 47 180 114 88
Adam Clendening CLS D 4 0 0 0 0 3 1 3
Adam Cracknell FA C 2 0 0 0 0 3 6 0
Adam Erne DET LW 65 7 13 20 40 70 159 26
Adam Gaudette VAN C 56 5 7 12 18 55 48 15
Adam Henrique ANH C 82 18 24 42 24 122 78 71
Adam Johnson PIT C 6 0 2 2 0 3 11 3
I want to check whether a player in my text file exists in my 2D list, and if he does, I want to add up all their point totals. This is what I have done so far:
sum = 0
f = open(filename, "r")
lines = f.readlines()
for names in lines:
    if names == stat_list[0]:
        sum += stat_list[6]
return sum
However I keep getting zero, any thoughts?
I tried doing this to check each line in my text file and to only check the names in the beginning but it still gives me 0.
sum = 0
f = open(filename, "r")
lines = f.readlines()
while True:
    for names in lines:
        if names[20] == stat_list[0]:
            sum += stat_list[6]
    return sum
Hint: try debugging or adding a bunch of print statements - your code is probably doing exactly what you tell it to.
I would guess the problem would be in this line:
if names == stat_list[0]:
You haven't shown us what stat_list is, but the code is probably comparing the name ('A.J. Greer') to the entire first row of the 2D array (stat_list[0]). That comparison is always false, so sum never changes.
You might want to add a counter like so:
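sum = 0
f = open(filename, "r")
lines = f.readlines()
index = 0
for names in lines:
    # compare against the name field (column 0) of the current row
    if names == stat_list[index][0]:
        # column 6 holds the points as a string, so index the row and convert it
        sum += int(stat_list[index][6])
    index += 1
return sum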
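A somewhat more defensive sketch is shown below. It assumes that each data line of the text file begins with the player's name followed by a space, and that column 6 of each row in the 2D list holds the points (Pts). The function name total_points and the idea of passing the lines in as a list are illustrative choices, not part of the original code.

```python
def total_points(lines, stat_list):
    """Sum the Pts column (index 6) for players whose name starts a line."""
    points_by_name = {row[0]: int(row[6]) for row in stat_list}
    total = 0
    for line in lines:
        for name, points in points_by_name.items():
            if line.startswith(name + " "):
                total += points
                break  # a line holds at most one player
    return total


# Rows and lines copied from the question (shortened).
stat_list = [
    ['A.J. Greer', 'COL', 'LW', '15', '1', '1', '2', '14', '9', '20', '5'],
    ['Aaron Ekblad', 'FLA', 'D', '82', '13', '24', '37', '47', '180', '114', '88'],
    ['Adam Erne', 'DET', 'LW', '65', '7', '13', '20', '40', '70', '159', '26'],
]
lines = [
    "Name Team Pos Games G A Pts PIM SOG Hits BS\n",
    "A.J. Greer COL LW 15 1 1 2 14 9 20 5\n",
    "Aaron Ekblad FLA D 82 13 24 37 47 180 114 88\n",
]
result = total_points(lines, stat_list)
print(result)
```

The header and separator lines simply match no player and are ignored. In a real script, lines would come from f.readlines().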

Python, Rearranging a numpy array by column 0 value, signed integers

I've got a folder with a dataset which is poorly sorted, and I'd like to rearrange the information that I'm pulling from it as I'm reading it. Therefore I am wondering, is there an easy way to sort the following input:
[['-10' '10']
 ['-10' '20']
 ['-15' '10']
 ['-15' '20']
 ['-5' '10']
 ['-5' '20']
 ['0' '10']
 ['0' '20']
 ['10' '10']
 ['10' '20']
 ['15' '10']
 ['15' '20']
 ['5' '10']
 ['5' '20']]
into the following output:
[['-15' '10']
 ['-15' '20']
 ['-10' '10']
 ['-10' '20']
 ['-5' '10']
 ['-5' '20']
 ['0' '10']
 ['0' '20']
 ['5' '10']
 ['5' '20']
 ['10' '10']
 ['10' '20']
 ['15' '10']
 ['15' '20']]
How about using pandas dataframe?
import pandas as pd
data = [['5', '10'], ['4', '20']]
dataframe = pd.DataFrame(data).sort_values(by=0)  # sort by column 0; note the values are strings, so they are compared as text
print(dataframe)
#Output:
# 0 1
#1 4 20
#0 5 10
I'm afraid you'll need to cast your str values to int for the desired sort order. Then, you just want to sort a list by multiple attributes. If you want to have str values in the output, too, you'll also need to cast backwards.
import operator
a = [['-10', '10'],
['-10', '20'],
['-15', '10'],
['-15', '20'],
['-5', '10'],
['-5', '20'],
['0', '10'],
['0', '20'],
['10', '10'],
['10', '20'],
['15', '10'],
['15', '20'],
['5', '10'],
['5', '20']]
print(a)
b = [[int(e[0]), int(e[1])] for e in a] # to int
b = sorted(b, key=operator.itemgetter(0, 1)) # sort
b = [[str(e[0]), str(e[1])] for e in b] # to str
print(b)
Output:
[['-10', '10'], ['-10', '20'], ['-15', '10'], ['-15', '20'], ['-5', '10'], ['-5', '20'], ['0', '10'], ['0', '20'], ['10', '10'], ['10', '20'], ['15', '10'], ['15', '20'], ['5', '10'], ['5', '20']]
[['-15', '10'], ['-15', '20'], ['-10', '10'], ['-10', '20'], ['-5', '10'], ['-5', '20'], ['0', '10'], ['0', '20'], ['5', '10'], ['5', '20'], ['10', '10'], ['10', '20'], ['15', '10'], ['15', '20']]
Hope that helps!
EDIT: Or just use some lambda expression in sorted:
c = sorted(a, key = lambda x: (int(x[0]), int(x[1])))
print(c)
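To see why the cast to int matters, here is a small illustration using only the first-column values from the question. Strings are compared character by character, so '-10' sorts before '-15' and '15' before '5'. The variable names are made up for the example.

```python
values = ['-10', '-15', '-5', '0', '10', '15', '5']

as_text = sorted(values)              # compares character by character
as_numbers = sorted(values, key=int)  # compares the numeric values

print(as_text)
print(as_numbers)
```

Passing key=int leaves the elements themselves as strings; only the comparison uses the converted values.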

Python: sort list except first line

I have this list :
[['Nom', 'Francais', 'Anglais', 'Maths'], ['Catherine', '9', '17', '9'], ['Karim', '12', '15', '11'], ['Rachel', '15', '15', '14'], ['Roger', '12', '14', '12'], ['Gabriel', '7', '13', '8'], ['Francois', '14', '8', '15'], ['Henri', '10', '12', '13'], ['Stephane', '18', '12', '8'], ['Karine', '9', '10', '10'], ['Marie', '10', '10', '10'], ['Claire', '15', '9', '12'], ['Marine', '12', '9', '12']]
I want to sort it by name (in other words, in alphabetical order of the [0] element of each inner list), but I don't want the first list (['Nom', 'Francais', 'Anglais', 'Maths']) to be sorted with the others. How can I do that?
Thanks a lot !
You can use slice assignment:
>>> from pprint import pprint # just to have a nice display
>>> data = [['Nom', 'Francais', 'Anglais', 'Maths'], ['Catherine', '9', '17', '9'], ['Karim', '12', '15', '11'], ['Rachel', '15', '15', '14'], ['Roger', '12', '14', '12'], ['Gabriel', '7', '13', '8'], ['Francois', '14', '8', '15'], ['Henri', '10', '12', '13'], ['Stephane', '18', '12', '8'], ['Karine', '9', '10', '10'], ['Marie', '10', '10', '10'], ['Claire', '15', '9', '12'], ['Marine', '12', '9', '12']]
>>> pprint(data)
[['Nom', 'Francais', 'Anglais', 'Maths'],
['Catherine', '9', '17', '9'],
['Karim', '12', '15', '11'],
['Rachel', '15', '15', '14'],
['Roger', '12', '14', '12'],
['Gabriel', '7', '13', '8'],
['Francois', '14', '8', '15'],
['Henri', '10', '12', '13'],
['Stephane', '18', '12', '8'],
['Karine', '9', '10', '10'],
['Marie', '10', '10', '10'],
['Claire', '15', '9', '12'],
['Marine', '12', '9', '12']]
>>> data[1:] = sorted(data[1:])
>>> pprint(data)
[['Nom', 'Francais', 'Anglais', 'Maths'],
['Catherine', '9', '17', '9'],
['Claire', '15', '9', '12'],
['Francois', '14', '8', '15'],
['Gabriel', '7', '13', '8'],
['Henri', '10', '12', '13'],
['Karim', '12', '15', '11'],
['Karine', '9', '10', '10'],
['Marie', '10', '10', '10'],
['Marine', '12', '9', '12'],
['Rachel', '15', '15', '14'],
['Roger', '12', '14', '12'],
['Stephane', '18', '12', '8']]
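If you would rather sort on the name only and build a new list instead of modifying data in place, a sketch along these lines should work. The shortened data and the names header, rows and result are chosen for illustration.

```python
data = [['Nom', 'Francais', 'Anglais', 'Maths'],
        ['Karim', '12', '15', '11'],
        ['Catherine', '9', '17', '9'],
        ['Claire', '15', '9', '12']]

header, *rows = data  # split off the first row
result = [header] + sorted(rows, key=lambda row: row[0])  # sort by name only
print(result)
```

Using an explicit key means rows with the same name keep their original relative order, because sorted is stable, rather than being ordered by the remaining columns.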
Personally, I'd do something like this. But it assumes you're semi-comfortable with Pandas. This gives you a lot more flexibility to do more with the data.
import pandas as pd
nl = [['Nom', 'Francais', 'Anglais', 'Maths'], ['Catherine', '9', '17', '9'], ['Karim', '12', '15', '11'], ['Rachel', '15', '15', '14'], ['Roger', '12', '14', '12'], ['Gabriel', '7', '13', '8'], ['Francois', '14', '8', '15'], ['Henri', '10', '12', '13'], ['Stephane', '18', '12', '8'], ['Karine', '9', '10', '10'], ['Marie', '10', '10', '10'], ['Claire', '15', '9', '12'], ['Marine', '12', '9', '12']]
df = pd.DataFrame(nl[1:], columns = nl[0])  # the first row supplies the column names
df.sort_values(by = 'Nom', inplace = True)
df.reset_index(drop = True, inplace = True)
which yields:
          Nom  Francais  Anglais  Maths
0   Catherine         9       17      9
1      Claire        15        9     12
2    Francois        14        8     15
3     Gabriel         7       13      8
4       Henri        10       12     13
5       Karim        12       15     11
6      Karine         9       10     10
7       Marie        10       10     10
8      Marine        12        9     12
9      Rachel        15       15     14
10      Roger        12       14     12
11   Stephane        18       12      8
and then if you need a .csv per your most recent comment, it's simply:
df.to_csv('/directory/my_filename.csv', index = False)

get max and min value in dictionary

Illinois: ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3']
Indiana: ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2']
Those are in my dictionary as d.
How would I get the largest and smallest value for each key in the dictionary, along with the index where each value occurs?
For example:
In Illinois, 26 is the largest value, which is index 5, and 3 is the smallest value, which is index 15.
In Indiana, 13 is the largest value, which is index 7, and 2 is the smallest value, which is index 14.
The output:
Illinois: 26 in index 5 and 3 in index 15
Indiana: 13 in index 7 and 2 in index 14
How would I do this?
d = {}
for row in csv_f:
d[row[0]]=row[1:]
You can print the max and min in a format like yours as follows
(assuming you only want the first occurrence):
MY_D = {'Illinois': ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3'],
'Indiana': ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2']}
for k, v in MY_D.items():
    # This assumes that everything in v is an int, or rather can be converted to one.
    my_l = [int(n) for n in v]
    # if not
    # my_l = [int(n) for n in v if n.isdigit()]
    _max, _min = max(my_l), min(my_l)
    print("%s: Min - %d in index %d, Max - %d in index %d" % (k, _min, my_l.index(_min), _max, my_l.index(_max)))
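The expected output in the question counts positions from 1 (26 at index 5, 3 at index 15), whereas list.index counts from 0. A small sketch that returns 1-based positions of the first occurrence is given below; the function name max_min_positions is hypothetical, and every value is assumed to convert cleanly to int.

```python
def max_min_positions(values):
    """Return ((max, position), (min, position)) with 1-based positions."""
    numbers = [int(v) for v in values]
    largest, smallest = max(numbers), min(numbers)
    # list.index returns the first occurrence, counted from 0, hence the + 1
    return ((largest, numbers.index(largest) + 1),
            (smallest, numbers.index(smallest) + 1))


d = {
    'Illinois': ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3'],
    'Indiana': ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2'],
}
for state, values in d.items():
    (largest, max_pos), (smallest, min_pos) = max_min_positions(values)
    print(f"{state}: {largest} in index {max_pos} and {smallest} in index {min_pos}")
```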
Here is a solution returning a dict {state: ((max_index, max_value), (min_index, min_value))}; the values stay strings:
d = {
'Illinois': ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3'],
'Indiana': ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2']
}
maxmin = {}
for state, numbers in d.items():
    maxmin[state] = (
        max(enumerate(numbers), key=lambda x: int(x[1])),
        min(enumerate(numbers), key=lambda x: int(x[1]))
    )
print(maxmin)
Bit thrown together, but seems to do the job.
d = {"Illinois": ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3'],
     "Indiana": ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2']}
if __name__ == "__main__":
    print(d)
    for state in d:
        # pairs each number with its index: (number, index)
        pairs = [(int(d[state][x]), x) for x in range(len(d[state]))]
        minpair = min(pairs)
        maxpair = max(pairs)
        print("%s: %d in index %d and %d in index %d" % (state, maxpair[0], maxpair[1],
                                                         minpair[0], minpair[1]))
Output:
{'Illinois': ['13', '12', '18', '23', '26', '25', '24', '19', '13', '10', '15', '14', '14', '4', '3'], 'Indiana': ['7', '6', '7', '8', '11', '11', '13', '12', '7', '7', '7', '7', '9', '2', '2']}
Illinois: 26 in index 4 and 3 in index 14
Indiana: 13 in index 6 and 2 in index 13
To get around a blank string, you could break up the list comprehension into
pairs = []
for x in range(len(d[state])):
    try:
        pairs.append((int(d[state][x]), x))
    except ValueError:
        pass  # not a valid number
