Reading row elements in a new array in python - python

I am converting a function so that it builds its array from a different kind of input data.
The original code was:
note_inf_track = np.array([(n.note, n.onset/div, n.duration/div, n.velocity, n.channel, track_nr)
                           for n in m.tracks[track_nr].notes],
                          dtype = [('pitch', np.int),
                                   ('onset', np.float),
                                   ('duration', np.float),
                                   ('velocity', np.int),
                                   ('channel', np.int),
                                   ('track', np.int)])
Now my input data is a 2-dimensional list, I am not working with notes anymore.
for line in lines:
    #print("b");
    element = [];
    for x in line.split(','):
        element.append(x.strip('\r\n'));
    elements.append(element);
note_inf_track = np.array([(round((round(np.asarray(elements[2], dtype="float")))), (round(np.asarray(elements[0], dtype="float"))),(round(np.asarray(elements[:][1], dtype="float"))))],
dtype = [('pitch', np.int),
('onset', np.float),
('duration', np.float)])
I am struggling to add the columns all at once.
elements[2] seems to give me the row instead of the column, and I can't seem to replace the for loop. Maybe my syntax is all off; I am used to Java and C++ and fairly new to Python.
--Update--
Based on Tarun Gaba's answer, I tried this:
note_inf_track = np.array([((round(el[2])), float(el[0]),float(el[1])) for el in elements],
dtype = [('pitch', np.int)
('onset', np.float),
('duration', np.float)]);
Gives me an error:
note_inf_track = np.array([((round(el[2])), float(el[0]),float(el[1])) for el in elements],
TypeError: a float is required
Here is the output of print(elements):
[['0.066667', ' 0.200000', ' 50.180000', ' 0.000644'], ['0.266667', ' 0.266667', ' 59.180000', ' 0.006583'], ['0.550000', ' 0.366667', ' 59.180000', ' 0.002129'], ['0.933333', ' 0.350000', ' 59.180000', ' 0.005972'], ['1.316667', ' 0.050000', ' 59.180000', ' 0.010053'], ['1.366667', ' 0.166667', ' 61.180000', ' 0.008109'], ['1.550000', ' 0.233333', ' 61.180000', ' 0.009170'], ['1.783333', ' 0.416667', ' 63.180000', ' 0.023811'], ['2.250000', ' 0.166667', ' 63.180000', ' 0.016253'], ['2.416667', ' 0.850000', ' 64.180000', ' 0.019314'], ['3.300000', ' 0.116667', ' 64.180000', ' 0.018684'], ['3.433333', ' 0.133333', ' 64.180000', ' 0.016786'], ['3.583333', ' 0.333333', ' 63.180000', ' 0.008623'], ['4.816667', ' 0.383333', ' 63.180000', ' 0.036858'], ['5.200000', ' 0.166667', ' 61.180000', ' 0.006060'], ['5.366667', ' 0.366667', ' 63.180000', ' 0.010417'], ['5.783333', ' 0.333333', ' 63.180000', ' 0.008371'], ['6.116667', ' 0.383333', ' 64.180000', ' 0.007488'], ['6.533333', ' 0.233333', ' 64.180000', ' 0.014582'], ['6.766667', ' 0.333333', ' 63.180000', ' 0.004457'], ['7.533333', ' 0.516667', ' 61.180000', ' 0.004700'], ['8.050000', ' 0.316667', ' 63.180000', ' 0.006959'], ['8.366667', ' 0.300000', ' 64.180000', ' 0.013522'], ['8.666667', ' 0.166667', ' 63.180000', ' 0.008083'], ['8.833333', ' 0.150000', ' 64.180000', ' 0.010620'], ['8.983333', ' 0.250000', ' 63.180000', ' 0.004493'], ['9.233333', ' 0.116667', ' 64.180000', ' 0.012834'], ['9.350000', ' 0.333333', ' 63.180000', ' 0.005321'], ['9.716667', ' 0.300000', ' 64.180000', ' 0.006902'], ['10.033333', ' 0.183333', ' 63.180000', ' 0.002515'], ['10.216667', ' 0.133333', ' 62.180000', ' 0.005928'], ['10.350000', ' 0.600000', ' 63.180000', ' 0.004920'], ['10.950000', ' 0.133333', ' 64.180000', ' 0.006754'], ['11.083333', ' 0.116667', ' 63.180000', ' 0.003831'], ['11.200000', ' 0.316667', ' 62.180000', ' 0.002493']]

elements is a list of lists here.
To access the 3rd column (which is what you seem to be trying with elements[2]), you need to do something like this:
elements = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
column = [i[2] for i in elements]
print(column)
# [3, 6, 9]
For your case, it should be something along the lines of:
np.array([(el[2], float(el[0]), float(el[1])) for el in elements], dtype= .....)
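As an aside, the column extraction described above can also be sketched with zip, which transposes rows into columns. This is only an illustration on the toy list from this answer; the name columns is made up for the example.

```python
elements = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]

# zip(*elements) pairs up the first items of every row, then the second
# items, and so on, which yields one tuple per column.
columns = list(zip(*elements))

print(columns[2])  # (3, 6, 9)
```

Each column comes back as a tuple, so wrap it in list(...) if it needs to be modified.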

The problem is that your data is read as list of strings.
Modify your code from:
element.append(x.strip('\r\n'));
To:
element.append(float(x.strip('\r\n')));
To have your data as floats. You could also use round(float(...)) if you need rounded data.
Then put the data into a numpy array:
>>> import numpy as np
>>> data = np.array(elements)
And access the columns as data[:, column_idx], e.g. for the 3rd column:
>>> data[:, 2]

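Putting the pieces together, here is a minimal sketch of how the structured array from the question might be built once the strings are converted. It assumes, as in the asker's update, that column 0 is the onset, column 1 the duration and column 2 the pitch, and it uses only the first three rows of the printed elements as sample data. The built-in int and float stand in for np.int and np.float, which recent NumPy releases no longer provide.

```python
import numpy as np

# Sample data: the first three rows of the question's `elements`,
# still stored as strings with leading spaces.
elements = [
    ['0.066667', ' 0.200000', ' 50.180000', ' 0.000644'],
    ['0.266667', ' 0.266667', ' 59.180000', ' 0.006583'],
    ['0.550000', ' 0.366667', ' 59.180000', ' 0.002129'],
]

# Convert every field to float first; float() tolerates the leading spaces.
data = np.array([[float(x) for x in row] for row in elements])

# Build one tuple per row, in the field order of the structured dtype.
note_inf_track = np.array(
    [(int(round(pitch)), onset, duration)
     for onset, duration, pitch in data[:, :3]],
    dtype=[('pitch', int), ('onset', float), ('duration', float)],
)

print(note_inf_track['pitch'])  # [50 59 59]
```

Fields of the result are then addressed by name, e.g. note_inf_track['onset'].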
Related

How to make a function that lays mines to random squares on nested list?

The field is created like this:
field = []
for row in range(10):
    field.append([])
    for col in range(15):
        field[-1].append(" ")
Tuples represent free squares where mines can be laid:
free = []
for x in range(15):
    for y in range(10):
        free.append((x, y))
I have to lay the mines through this function:
def lay_mines(field, free, number_of_mines):
    for _ in number_of_mines:
        mines = random.sample(free, number_of_mines)
        field(mines) = ["x"]
I was thinking of using random.sample() or random.choice(), but I just can't get it to work. How can I place the string "x" at a certain random coordinate?
import random
def lay_mines(x, y, number_of_mines=0):
    f = [list(' ' * x) for _ in range(y)]
    for m in random.sample(range(x * y), k=number_of_mines): # random sampling without replacement
        f[m % y][m // y] = 'X'
    return f
field = lay_mines(15, 10, 20)
print(*field, sep='\n')
Prints:
['X', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', 'X', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', 'X', ' ', ' ', ' ', ' ', 'X', 'X', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ']
[' ', 'X', ' ', ' ', 'X', ' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', 'X', 'X', 'X', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', 'X', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'X']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'X', ' ', 'X', ' ', ' ', ' ', ' ']
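If the function has to keep the signature from the question (field, free, number_of_mines), a possible variant is sketched below. It assumes that each tuple in free is (x, y) with x as the column and y as the row, as in the question's loops, and that removing used squares from free is the desired behaviour; neither point is confirmed by the asker.

```python
import random

def lay_mines(field, free, number_of_mines):
    """Place mines on distinct random free squares; mutates field and free."""
    # random.sample picks without replacement, so no square is chosen twice.
    for x, y in random.sample(free, number_of_mines):
        field[y][x] = "x"     # rows are indexed by y, columns by x
        free.remove((x, y))   # the square is no longer free

field = [[" " for _ in range(15)] for _ in range(10)]
free = [(x, y) for x in range(15) for y in range(10)]

lay_mines(field, free, 20)
print(*field, sep="\n")
```

Because the sample is drawn before any square is removed, the loop never iterates over a list that is being modified.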

Printing just the elements in a 2D list [closed]

I am making a text-based game and the map is a 2D list. The way I have been doing it until now is like this:
self.map = [[" " for i in range(34)]
            for i in range(40)] # initialize the 2d array
for layer in self.map: #print the array
    print(layer)
However, this prints the commas and quote marks as well. Is there any way to print just the elements of the list, laid out in the same rows as the text printed by the code above?
EDIT
This is how I would like to print it, but without the speech marks and commas:
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Use join to convert each list to a single string:
self.map = [[" " for i in range(34)]
            for i in range(40)] # initialize the 2d array
for layer in self.map: #print the array
    print( "".join(layer) )
When you call print(layer), it prints a representation of that object, which is a list of strings.
To print the individual strings consecutively in lines you could use this code:
for layer in self.map: #iterate over the layers
    for item in layer: #iterate over the items in the layer
        print(item, end='') # the end='' prevents a line break
    print('') #make a line break
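Both approaches above can be wrapped in a small helper, sketched here. The names render and game_map are invented for the example, and the 3x5 grid with one marked square is only there so that the output is visible; the real map in the question is 40x34 and held in self.map.

```python
def render(grid, sep=""):
    """Return the grid as text, one row per line, items joined by sep."""
    return "\n".join(sep.join(row) for row in grid)

# Hypothetical small map with one marked square.
game_map = [["." for _ in range(5)] for _ in range(3)]
game_map[1][2] = "@"

print(render(game_map))        # rows printed back to back: ..@.. in the middle
print(render(game_map, " "))   # same rows with a space between items
```

Passing sep=" " keeps the spacing of the original list printout while dropping the brackets, quotes and commas.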

For loop not working as intended

I am in the middle of my course work and I am now having trouble with one of my for loops.
def update():
    update=[]
    update1=[]
    with open('Stock2.txt','r') as stockFile:
        for eachLine in stockFile:
            eachLine=eachLine.strip().split()
            update.append(eachLine)
        update.remove(update[0])
        stockFile.close()
    with open('Stock2.txt','r') as stockFile:
        for eachLine in stockFile:
            eachLine=eachLine.strip().split(' ')
            update1.append(eachLine)
        update1.remove(update1[0])
    for eachList in update1:
        loopCon=-1
        for eachItem in eachList:
            loopCon+=1
            if eachItem=='':
                eachList[loopCon]=' '
    count=-1
    for eachList in update1:
        for eachItem in eachList:
            count+=1
            if eachItem != ' ':
                print(count)
The last for loop that I have been working on loops fine, but when I add one to count on every pass of the inner loop 'for eachItem in eachList:', it prints seemingly random numbers, as follows:
0 10 14 21 28 35 36 46 62 69 76 83 84 94 111
Here is the stock file I am using - Stock2.txt
GTIN-8 Product-Name Price(£) CSL ROL TSL
95820194 Windows-10-64bit 119.99 0 1 3
68196167 Cheese 1.00 0 3 8
62017014 Bread 0.93 0 3 9
86179616 10tb-memory-stick 916.96 0 0 4
19610577 Freddo 0.15 0 2 9
So on.
Is there anything I have done wrong here? I probably would not be able to spot it easily, as I have only been using Python for almost a year.
Thank you for your time.
You increment count outside the if that prints. Try this instead:
for eachList in update1:
    for eachItem in eachList:
        if eachItem != ' ':
            count+=1
            print(count)
If I put a print(update1) statement just before your last for loop, i.e., before the statement for eachList in update1:, I get the following output:
[['95820194', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'Windows-10-64bit', ' ', ' ', ' ', '119.99', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '1', ' ', ' ', ' ', ' ', ' ', ' ', '3'], ['68196167', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'Cheese', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '1.00', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '3', ' ', ' ', ' ', ' ', ' ', ' ', '8'], ['62017014', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'Bread', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '0.93', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '3', ' ', ' ', ' ', ' ', ' ', ' ', '9'], ['86179616', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '10tb-memory-stick', ' ', ' ', '916.96', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '4'], ['19610577', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'Freddo', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '0.15', ' ', ' ', ' ', ' ', ' ', ' ', '0', ' ', ' ', ' ', ' ', ' ', ' ', '2', ' ', ' ', ' ', ' ', ' ', ' ', '9']]
So the output isn't random at all. You are traversing each list inside update1 and incrementing count for every element that eachItem takes.
However, you print count only when eachItem != ' '. So, as you can see, it prints 0 when eachItem == '95820194', then 10 when eachItem == 'Windows-10-64bit', and so on. The count is still incremented when eachItem == ' '; it is just not printed.
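The blank entries come from split(' '), which treats every single space as a separator. A minimal sketch of the difference follows; the sample line is made up to match the spacing implied by the output above (ten spaces between the first two fields).

```python
# Hypothetical line: two fields separated by ten spaces.
line = "68196167" + " " * 10 + "Cheese"

by_space = line.split(" ")  # every extra space yields an empty string
by_blank = line.split()     # a run of whitespace counts as one separator

print(len(by_space), by_space.index("Cheese"))  # 11 10
print(by_blank)                                 # ['68196167', 'Cheese']

# enumerate gives the position within the row without a manual counter.
for position, item in enumerate(by_blank):
    print(position, item)
```

With split() the fields sit at positions 0, 1, 2 and so on, so the padding step that replaces '' with ' ' is no longer needed.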

Sorting data in a list using .sort()

I have a list of numbers that I need to sort before running some calculations. I noticed that the .sort() method sorts the 2-digit numbers (10-99) and the 3-digit numbers (100-999) separately, so I end up with the wrong values for min(), max() and the median. Any idea why this is happening?
before sorting:
[' 75.0', ' 82.43', ' 112.11', ' 89.93', ' 103.19', ' 80.6', ' 113.44', ' 105.44', ' 95.54', ' 121.98', ' 114.25', ' 109.84', ' 90.48', ' 105.84', ' 82.89', ' 113.64', ' 102.73', ' 104.57', ' 100.83', ' 75.59', ' 79.86', ' 91.11', ' 94.75', ' 109.89', ' 117.39', ' 74.71', ' 71.04', ' 92.97', ' 88.87', ' 92.95', ' 86.67', ' 101.46', ' 92.4', ' 85.2', ' 107.19', ' 117.81', ' 90.95', ' 82.02', ' 87.31', ' 106.53', ' 86.28', ' 106.62', ' 107.57', ' 89.38', ' 105.88', ' 74.45', ' 90.03', ' 107.96', ' 77.42', ' 98.9', ' 109.81', ' 102.51', ' 116.71', ' 82.92', ' 81.78', ' 74.42', ' 76.27', ' 73.84', ' 75.55', ' 102.29', ' 108.1', ' 98.84', ' 101.48', ' 77.75', ' 98.57', ' 70.31', ' 78.28', ' 80.18']
and after sorting
[' 100.83', ' 101.46', ' 101.48', ' 102.29', ' 102.51', ' 102.73', ' 103.19', ' 104.57', ' 105.44', ' 105.84', ' 105.88', ' 106.53', ' 106.62', ' 107.19', ' 107.57', ' 107.96', ' 108.1', ' 109.81', ' 109.84', ' 109.89', ' 112.11', ' 113.44', ' 113.64', ' 114.25', ' 116.71', ' 117.39', ' 117.81', ' 121.98', ' 70.31', ' 71.04', ' 73.84', ' 74.42', ' 74.45', ' 74.71', ' 75.0', ' 75.55', ' 75.59', ' 76.27', ' 77.42', ' 77.75', ' 78.28', ' 79.86', ' 80.18', ' 80.6', ' 81.78', ' 82.02', ' 82.43', ' 82.89', ' 82.92', ' 85.2', ' 86.28', ' 86.67', ' 87.31', ' 88.87', ' 89.38', ' 89.93', ' 90.03', ' 90.48', ' 90.95', ' 91.11', ' 92.4', ' 92.95', ' 92.97', ' 94.75', ' 95.54', ' 98.57', ' 98.84', ' 98.9']
Because you have a list of strings, it is sorted in lexicographical order. If you are sure the list only holds float values (as strings), then use the key argument to convert those strings to float while sorting. Example:
l.sort(key=float)
If you want to convert the complete list to float (since you say you want to take mean / median, etc. later), then use:
l = list(map(float, l))
For Python 2.x, the list(..) is not needed as map() returns a list.
And if you are converting the complete list to float, then you do not need the above sort() call with the key argument; a plain .sort() will work.
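A short sketch of the difference, using four of the strings from the question's list (the variable names are only for illustration):

```python
values = [' 75.0', ' 112.11', ' 89.93', ' 103.19']

# Strings compare character by character, so '1...' sorts before '7...'.
print(sorted(values))             # [' 103.19', ' 112.11', ' 75.0', ' 89.93']

# key=float compares by numeric value but keeps the items as strings.
print(sorted(values, key=float))  # [' 75.0', ' 89.93', ' 103.19', ' 112.11']

# Converting once makes min(), max() and later arithmetic behave numerically.
numbers = sorted(map(float, values))
print(min(numbers), max(numbers))  # 75.0 112.11
```

float() ignores the leading spaces, so no separate strip() call is required here.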

Python: Reading csv file into lists and an array

I am new to Python, and this is my first post here, so I hope you will bear with me. I am having big trouble reading a csv file into a desired format. My file consists of 132 columns, and the head of the file looks like this:
['10520', ' 386681375.82149398', ' 85.25775430', ' -56.07840500', ' 173', ' 153', ' 151', ' 161', ' 180', ' 167', ' 189', ' 171', ' 173', ' 171', ' 207', ' 169', ' 173', ' 168', ' 184', ' 168', ' 201', ' 197', ' 204', ' 201', ' 210', ' 239', ' 211', ' 227', ' 247', ' 248', ' 266', ' 276', ' 322', ' 336', ' 331', ' 381', ' 358', ' 483', ' 532', ' 709', ' 841', ' 1004', ' 1128', ' 1540', ' 1945', ' 2747', ' 3718', ' 5378', ' 6273', ' 8415', ' 12727', ' 18248', ' 24103', ' 33688', ' 40744', ' 52821', ' 65535', ' 59114', ' 55225', ' 49919', ' 51894', ' 58381', ' 50376', ' 48315', ' 42337', ' 30577', ' 24078', ' 24337', ' 22432', ' 20191', ' 19999', ' 17674', ' 22519', ' 22542', ' 22644', ' 23966', ' 21033', ' 21326', ' 20257', ' 20441', ' 21859', ' 26976', ' 32514', ' 34732', ' 45555', ' 48416', ' 34952', ' 28511', ' 24611', ' 18843', ' 17081', ' 14592', ' 13550', ' 13011', ' 15370', ' 15827', ' 15232', ' 16054', ' 14823', ' 14538', ' 12544', ' 11865', ' 11442', ' 10089', ' 10340', ' 11269', ' 11336', ' 11873', ' 10012', ' 9824', ' 9488', ' 7696', ' 9273', ' 9502', ' 8752', ' 8341', ' 8192', ' 8293', ' 8067', ' 8402', ' 9258', ' 9290', ' 8144', ' 8009', ' 7660', ' 6772', ' 6008', ' 6792', ' 6993', ' 6662', ' 7047', ' 6662 ']
['10520', ' 386681375.86699998', ' 85.25527360', ' -56.09263480', ' 113', ' 102', ' 120', ' 124', ' 117', ' 127', ' 124', ' 118', ' 128', ' 120', ' 125', ' 120', ' 140', ' 135', ' 144', ' 127', ' 143', ' 148', ' 141', ' 153', ' 142', ' 142', ' 149', ' 152', ' 168', ' 180', ' 196', ' 188', ' 196', ' 246', ' 259', ' 270', ' 337', ' 360', ' 506', ' 540', ' 625', ' 887', ' 1122', ' 1251', ' 2007', ' 2883', ' 3238', ' 4370', ' 6240', ' 9164', ' 10751', ' 16656', ' 20996', ' 27753', ' 37774', ' 35377', ' 38637', ' 39265', ' 35183', ' 38830', ' 32149', ' 25455', ' 27272', ' 24488', ' 21036', ' 20931', ' 17166', ' 17019', ' 18196', ' 15450', ' 15120', ' 15934', ' 15021', ' 14936', ' 16253', ' 16457', ' 15873', ' 19667', ' 23150', ' 26140', ' 35761', ' 42594', ' 61758', ' 65535', ' 42354', ' 28672', ' 25173', ' 20344', ' 15883', ' 14432', ' 10575', ' 11342', ' 12348', ' 13229', ' 19632', ' 23456', ' 18102', ' 15600', ' 13425', ' 9962', ' 8281', ' 7609', ' 6948', ' 7391', ' 8878', ' 10006', ' 11295', ' 10073', ' 9410', ' 10354', ' 10667', ' 10054', ' 9011', ' 8793', ' 9055', ' 7463', ' 6692', ' 8051', ' 8330', ' 7369', ' 6612', ' 6328', ' 6545', ' 6235', ' 5895', ' 5085', ' 4876', ' 5154', ' 4649', ' 5226', ' 6137', ' 5354 ']
and I am interested in getting:
four lists/vectors/1D arrays (or whatever) of the first four columns.
The next 128 columns I would like to get into an array.
I would like to get the output without ([] , ' ") and other non-numeric characters.
So far the code looks like this:
import sys, math, numpy
from numpy import *
from scipy import *
import csv
try:
    ifile = sys.argv[1]
    #; ofile = sys.argv[2]
except:
    print "Usage:", sys.argv[0], "ifile"; sys.exit(1)
# Open and read file from std, and assign first four (orbit, time, lat, lon) columns to four lists, and last 128 columns (waveforms) to an array.
ifile = open(ifile)
orbit = []
time = []
lat = []
lon = []
#wvf= [[],[]]
try:
    reader = csv.reader(ifile, delimiter=',')
    for row in reader:
        orbit.append(row[0])
        time.append(row[1])
        lat.append(row[2])
        lon.append(row[3])
        # wvf = [row[4:132] for row in reader] row[0:128] for col in len(reader)]
        wvf = [row[4:132]],[row[1:128]]
finally:
    ifile.close()
...and now do something with data...
I have thought about first splitting all lines, and thereafter gathering the last 128 columns into the array, but I haven't managed to do it.
I hope you have an idea of what I wish to accomplish, and are able to help me out.
Thanks
You can load the file into a numpy array using np.genfromtxt. An advantage of doing it this way is that the data goes directly from the file to a space-efficient numpy array. If you use the csv module, and store the data in Python lists, then your data will consume a lot more memory.
import sys
import numpy as np
try:
    ifile = sys.argv[1]
    #; ofile = sys.argv[2]
except IndexError:
    print("Usage:", sys.argv[0], "ifile"); sys.exit(1)
# Open and read file from std, and assign first four (orbit, time, lat, lon)
# columns to four arrays, and last 128 columns (waveforms) to an array.
def remove_bracket(line):
    # strip brackets, quotes and spaces before converting to float
    return float(line.strip("][ '"))
data = np.genfromtxt(ifile, delimiter = ',',
                     dtype = 'float',
                     converters = {i:remove_bracket for i in range(132)}
                     )
orbit = data[:,0]
time = data[:,1]
lat = data[:,2]
lon = data[:,3]
wvf = data[:,4:132]  # columns 4..131, i.e. all 128 waveform columns
print(wvf)
Note that the variables orbit, time, etc. are "views" of data -- they are not copies of data, and so do not require (much) additional memory. This also means that modifying orbit will also affect data, and vice versa.
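The view behaviour can be checked on a small stand-in array. This is only a sketch: the 3x4 array built with np.arange is made up for the example and is not the asker's data.

```python
import numpy as np

# Hypothetical stand-in for the array loaded from the file.
data = np.arange(12, dtype=float).reshape(3, 4)

orbit = data[:, 0]   # basic slicing returns a view, not a copy
orbit[0] = 99.0      # the write goes through to data
print(data[0, 0])                     # 99.0
print(np.shares_memory(orbit, data))  # True

lat = data[:, 2].copy()  # .copy() gives an independent array
lat[0] = -1.0
print(data[0, 2])                     # 2.0
```

Calling .copy() on a slice is the usual way out when a column must be changed without touching data.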
Simply:
wvf = []
try:
    reader = csv.reader(ifile, delimiter=',')
    for row in reader:
        # ...
        wvf.append(row[4:132])
finally:
    ifile.close()
Initialize wvf to be an empty list like the others, then append one sub-list (slice) per row of data.
(Just in case your data is really big and you want to optimise your memory usage: there's the array module for efficient storage.)
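For completeness, a self-contained sketch of the csv-module route that also converts the fields to numbers. The two-row sample below is invented and has only three waveform columns instead of 128, and io.StringIO stands in for the opened file; the variable names follow the question.

```python
import csv
import io

# Hypothetical sample: 4 leading columns plus 3 waveform columns per row.
sample = io.StringIO(
    "10520, 386681375.82, 85.25, -56.07, 173, 153, 151\n"
    "10520, 386681375.86, 85.25, -56.09, 113, 102, 120\n"
)

orbit, time, lat, lon, wvf = [], [], [], [], []
for row in csv.reader(sample, delimiter=','):
    values = [float(field) for field in row]  # float() ignores the spaces
    orbit.append(int(values[0]))
    time.append(values[1])
    lat.append(values[2])
    lon.append(values[3])
    wvf.append(values[4:])  # everything after the first four columns

print(orbit)  # [10520, 10520]
print(wvf)    # [[173.0, 153.0, 151.0], [113.0, 102.0, 120.0]]
```

Since every row has the same length, wvf can later be passed to numpy.array(wvf) if a 2D array is preferred over a list of lists.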
