I have a text file in the following format:
a,b,c,d,
1,1,2,3,
4,5,6,7,
1,2,5,7,
6,9,8,5,
How can I read it into a list efficiently so as to get the following
output?
list=[[1,4,1,6],[1,5,2,9],[2,6,5,8],[3,7,7,5]]
Let's assume that the file is named spam.txt:
$ cat spam.txt
a,b,c,d,
1,1,2,3,
4,5,6,7,
1,2,5,7,
6,9,8,5,
Using list comprehensions and the zip() built-in function, you can write a program such as:
>>> with open('spam.txt', 'r') as file:
...     file.readline()  # skip the first line
...     rows = [[int(x) for x in line.split(',')[:-1]] for line in file]
...     cols = [list(col) for col in zip(*rows)]
...
'a,b,c,d,\n'
>>> rows
[[1, 1, 2, 3], [4, 5, 6, 7], [1, 2, 5, 7], [6, 9, 8, 5]]
>>> cols
[[1, 4, 1, 6], [1, 5, 2, 9], [2, 6, 5, 8], [3, 7, 7, 5]]
Additionally, zip(*rows) is based on unpacking argument lists, which unpacks a list or tuple so that its elements can be passed as separate positional arguments to a function. In other words, zip(*rows) is reduced to zip([1, 1, 2, 3], [4, 5, 6, 7], [1, 2, 5, 7], [6, 9, 8, 5]).
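As a minimal sketch of that equivalence, the snippet below reuses the rows from the example above and checks that unpacking with * gives the same result as passing each row by hand:

```python
rows = [[1, 1, 2, 3], [4, 5, 6, 7], [1, 2, 5, 7], [6, 9, 8, 5]]

# zip(*rows) passes each inner list as its own positional argument ...
unpacked = list(zip(*rows))
# ... so it matches spelling the four arguments out explicitly.
explicit = list(zip(rows[0], rows[1], rows[2], rows[3]))

print(unpacked == explicit)  # True
print(unpacked[0])           # (1, 4, 1, 6)
```

Note that zip yields tuples, which is why the answer wraps each column in list().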
EDIT:
This is a version based on NumPy for reference:
>>> import numpy as np
>>> with open('spam.txt', 'r') as file:
...     ncols = len(file.readline().split(',')) - 1
...     data = np.fromiter((int(v) for line in file for v in line.split(',')[:-1]), int, count=-1)
...     cols = data.reshape(data.size // ncols, ncols).transpose()  # // keeps the shape an integer on Python 3
...
>>> cols
array([[1, 4, 1, 6],
       [1, 5, 2, 9],
       [2, 6, 5, 8],
       [3, 7, 7, 5]])
You can try the following code:
x0 = []
with open('yourfile.txt') as f:
    for line in f:
        fields = line.split(',')  # the file is comma-separated
        x0.append(fields[0])      # index 0 is the first column
for value in x0:
    print(value)
Here the first column is appended to x0. You can collect the other columns in the same way.
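One possible way to collect every column in a single pass is sketched below using the standard csv module. The in-memory sample stands in for the real file and mirrors the question's data; the names sample, header and columns are only illustrative.

```python
import csv
import io

# Stand-in for open('yourfile.txt'); the contents mirror the question's file.
sample = io.StringIO("a,b,c,d,\n1,1,2,3,\n4,5,6,7,\n1,2,5,7,\n6,9,8,5,\n")

reader = csv.reader(sample)
# The trailing comma produces an empty last field, so drop empty names.
header = [name for name in next(reader) if name]
columns = [[] for _ in header]
for row in reader:
    # zip stops after len(columns) items, which skips the trailing empty field.
    for column, value in zip(columns, row):
        column.append(int(value))

print(columns)  # [[1, 4, 1, 6], [1, 5, 2, 9], [2, 6, 5, 8], [3, 7, 7, 5]]
```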
You can use the data_py package to read column-wise data from a file.
Install the package with:
pip install data-py==0.0.1
Example
from data_py import datafile
df1=datafile("C:/Folder/SubFolder/data-file-name.txt")
df1.separator=","
[Col1,Col2,Col3,Col4,Col5]=["","","","",""]
[Col1,Col2,Col3,Col4,Col5]=df1.read([Col1,Col2,Col3,Col4,Col5],lineNumber)
print(Col1,Col2,Col3,Col4,Col5)
For details please follow the link https://www.respt.in/p/python-package-datapy.html
Related
There are two lists of lists, I need to make one list out of them.
a = [[1,2,3],[4,5,6]]
b = [[1,2,3],[4,5,6]]
I_need = [[1,1,2,2,3,3],[4,4,5,5,6,6]]
Or, one more question: how can I duplicate a single list to get the same result?
I will be glad for any help!
Since you tagged your question with NumPy, I assume that you
want a NumPy-only solution.
To make the elements of the two source arrays easier to tell apart, I defined them as:
a = [[ 1, 2, 3],[ 4, 5, 6]]
b = [[10,20,30],[40,50,60]]
To get your expected result, run:
import numpy as np
result = np.dstack([a, b]).reshape(2, -1)
The result is:
array([[ 1, 10, 2, 20, 3, 30],
[ 4, 40, 5, 50, 6, 60]])
If you want a plain Python list (instead of a NumPy array),
append .tolist() to the expression.
If you have python lists:
I_need = [[e for x in zip(*l) for e in x] for l in zip(a,b)]
output: [[1, 1, 2, 2, 3, 3], [4, 4, 5, 5, 6, 6]]
If you have numpy arrays:
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[1,2,3],[4,5,6]])
I_need = np.c_[a.ravel(), b.ravel()].reshape(2,-1)
output:
array([[1, 1, 2, 2, 3, 3],
[4, 4, 5, 5, 6, 6]])
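For the second part of the question (duplicating a single list to get the same result), a minimal pure-Python sketch could repeat each element twice within its row. The name doubled is only illustrative.

```python
a = [[1, 2, 3], [4, 5, 6]]

# Emit every element twice, row by row.
doubled = [[x for x in row for _ in range(2)] for row in a]

print(doubled)  # [[1, 1, 2, 2, 3, 3], [4, 4, 5, 5, 6, 6]]
```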
I want to extract a range of columns. I know how to do that in NumPy, but I don't want to use the NumPy slicing operator.
import numpy as np
a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
arr = np.array(a)
k = 0
print(arr[k:, k+1]) # --> [2 7]
print([[a[r][n+1] for n in range(0,k+1)] for r in range(k,len(a))][0]) # --> [2]
What's wrong with the second statement?
You're overcomplicating it. Get the rows with a[k:], then get a cell with row[k+1].
>>> [row[k+1] for row in a[k:]]
[2, 7]
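The same pattern extends to a contiguous range of columns by slicing each row. This is only a sketch; start and stop are hypothetical names for the column bounds and are not part of the question.

```python
a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
k = 0
start, stop = 1, 4  # hypothetical column range: indices 1, 2 and 3

# One column, as in the answer above.
single = [row[k + 1] for row in a[k:]]
# A contiguous range of columns, using a plain list slice on each row.
block = [row[start:stop] for row in a[k:]]

print(single)  # [2, 7]
print(block)   # [[2, 3, 4], [7, 8, 9]]
```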
a = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
k = 0
print(list(list(zip(*a[k:]))[k+1])) # [2, 7]
Is this what you're looking for?
cols = [1,2,3] # extract middle 3 columns
cols123 = [[l[col] for col in cols] for l in a]
# [[2, 3, 4], [7, 8, 9]]
I have a 2D list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]], and I want to delete columns in a loop.
For example, columns with index: 0(first) and 2(last) - - the result after deletions should be: [8, 5, 5].
There is a problem, because when I delete the 0th column, the size of the list is decreased to (0,1), and the 2nd index is out of scope.
What is the fastest method to delete columns in a loop without the out-of-scope problem?
For a better picture:
[[1, 8, 3],
[4, 5, 6],
[0, 5, 7]]
Plain Python lists have no built-in column deletion; you have to iterate over the rows and remove the item at the given index from each one.
Alternatively, pandas can do the task, although it is designed for tabular data analysis rather than for this specific purpose.
import pandas as pd
s = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
df = pd.DataFrame(s,columns=['val1','val2','val3'])
li = df.drop('val1',axis=1).values.tolist()
Now li looks like this:
[[8, 3], [5, 6], [5, 7]]
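The iterate-and-remove approach mentioned at the start of this answer can be sketched as follows. Deleting from the highest index down is one way to avoid the out-of-range problem described in the question; the names data, to_remove and flat are only illustrative.

```python
data = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
to_remove = [0, 2]

# Delete from the highest index down so that earlier deletions
# do not shift the positions of the columns still to be removed.
for index in sorted(to_remove, reverse=True):
    for row in data:
        del row[index]

print(data)  # [[8], [5], [5]]

# Flatten the single remaining column if a flat list is wanted.
flat = [row[0] for row in data]
print(flat)  # [8, 5, 5]
```

This mutates data in place; build a new list instead if the original rows are still needed.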
You can use numpy like this:
import numpy as np
my_list = np.array([[1, 8, 3], [4, 5, 6], [0, 5, 7]])
new_list = my_list[:, 1].copy()
print(new_list)
Output:
[8 5 5]
numpy.delete(your_list, index, axis) does the same job:
new_list = np.delete(my_list, (0, 2), axis=1)
Here (0, 2) are the indices of columns 0 and 2, and axis=1 tells NumPy that they are column indices, not row indices.
To delete rows 0 and 2 instead, change axis=1 to axis=0.
The output is slightly different:
array([[8],
       [5],
       [5]])
For a pure python approach:
my_list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
new_list = [value[1] for value in my_list]
print(new_list)
Output:
>>> [8, 5, 5]
If L is a 2D list, this drops the first column of every row:
print(list(map(lambda x: x[1:], L)))
data= [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
index_to_remove=[0,2]
[list(x) for x in zip(*[d for i,d in enumerate(zip(*data)) if i not in index_to_remove])]
If I understood your question correctly, you want to keep the middle element (index 1) of each list. In that case I would suggest creating a new list. There may well be better ways, but you could try this and see if it works for you:
twoD_list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
def keep_col(twoD_list, index_to_keep=1):
    final_list = []
    for x in twoD_list:
        final_list.append(x[index_to_keep])
    return final_list
final_list = keep_col(twoD_list, 1)
Final output:
[8,5,5]
This assumes you always want only the second element and that the inner lists always have at least two elements.
Pure Python with a list comprehension:
lst = [
    [1, 8, 3],
    [4, 5, 6],
    [0, 5, 7],
]
filtered_lst = [
    inner_element
    for inner_lst in lst
    for i, inner_element in enumerate(inner_lst)
    if i == 1
]
print(filtered_lst)
# [8, 5, 5]
If you want, you can then reassign the new list to the old variable:
lst = filtered_lst
The advantages of this method are:
no need to worry about the list being altered while you iterate it,
no need to import other libraries
list comprehension is built-in
list comprehension is often the fastest way to filter a list (see for example this article)
easier to read and maintain than other solutions (in my opinion).
Via itemgetter to extract the value at index 1.
from operator import itemgetter
my_list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
result = list(map(itemgetter(1), my_list))
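As a small sketch of how this extends, itemgetter also accepts several indices, in which case it returns one tuple per row. The names single and pairs are only illustrative.

```python
from operator import itemgetter

my_list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]

# One index: a flat list of values.
single = list(map(itemgetter(1), my_list))
# Several indices: one tuple per row.
pairs = list(map(itemgetter(0, 2), my_list))

print(single)  # [8, 5, 5]
print(pairs)   # [(1, 3), (4, 6), (0, 7)]
```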
Try this:
my_list = [[1, 8, 3], [4, 5, 6], [0, 5, 7]]
filter_col=[0,2]
col_length=3
my_list=[[x[i] for i in range(col_length) if i not in filter_col] for x in my_list]
You do not want to directly mutate the list that you are working on;
this uses a list comprehension to create a new list from the existing one.
Edit:
I just saw that you wanted only a flat list.
Assuming you want only one element from each inner list, you can use:
my_list=[x[1] for x in my_list]
I would like to store the right-to-left diagonal of a list in another list, without importing anything if possible.
E.g. list =
[[1, 4, 6],
 [6, 3, 7],
 [2, 7, 9]]
Say I'd like to store [6, 3, 2] in another list; how would I go about doing it? I have tried many ways for hours and still can't find a solution.
With a list comprehension:
l =[[1, 4, 6],
[6, 3, 7],
[2, 7, 9]]
diagonal = [row[-i] for i, row in enumerate(l, start=1)]
print(diagonal)
Output
[6, 3, 2]
The following snippet
l =[[1, 4, 6],
[6, 3, 7],
[2, 7, 9]]
d = len(l)
a = []
for i in range(0, d):
    a.append(l[i][d-1-i])
print(a)
results in the output you expected:
[6, 3, 2]
You can use a list comprehension and use list indexing twice to select your row and column:
L = [[1, 4, 6],
[6, 3, 7],
[2, 7, 9]]
n = len(L)
res = [L[i][n-i-1] for i in range(n)]
# [6, 3, 2]
An alternative formulation is to use enumerate, as in @OlivierMelançon's solution.
If you can use a 3rd party library, you can use NumPy to extract the diagonal of a flipped array:
import numpy as np
arr = np.array(L)
res = np.diag(np.fliplr(arr))
# array([6, 3, 2])
When you want to create a list from another list, a list comprehension is a very good way to go.
a = yourlist
print([a[i][(i+1)*-1] for i in range(len(a))])
This list comprehension loops through the rows, taking the last element of the first row, the second-to-last element of the second row, and so on.
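The same idea can be wrapped in a small helper, sketched here under the assumption that the input is a square list of lists. The name anti_diagonal and the shape check are illustrative additions, not part of the answer above.

```python
def anti_diagonal(matrix):
    """Return the top-right to bottom-left diagonal of a square list of lists."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("expected a square matrix")
    # Row i contributes its element at position -(i + 1):
    # the last item, then the second to last, and so on.
    return [row[-(i + 1)] for i, row in enumerate(matrix)]


grid = [[1, 4, 6], [6, 3, 7], [2, 7, 9]]
print(anti_diagonal(grid))  # [6, 3, 2]
```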
Using NumPy and a 90-degree rotation:
import numpy as np
arr = np.array([[1, 4, 6],[6, 3, 7],[2, 7, 9]])
np.diag(np.rot90(arr))
Output :
array([6, 3, 2])
Or without using NumPy:
rows = [[1, 4, 6],[6, 3, 7],[2, 7, 9]]
res = []
i = -1
for elm in rows:
    res.append(elm[i])  # take the element at index -1, then -2, ...
    i -= 1
print(res)
#[6, 3, 2]
My Python code generates a list every time it loops:
list = np.genfromtxt('temp.txt', usecols=3, dtype=[('floatname','float')], skip_header=1)
But I want to save each one. I need a list of lists, right?
So I tried:
list[i] = np.genfromtxt('temp.txt', usecols=3, dtype=[('floatname','float')], skip_header=1)
But Python now tells me that "list" is not defined. I'm not sure how I go about defining it. Also, is a list of lists the same as an array??
Thank you!
You want to create an empty list, then append the created list to it. This will give you the list of lists. Example:
>>> l = []
>>> l.append([1,2,3])
>>> l.append([4,5,6])
>>> l
[[1, 2, 3], [4, 5, 6]]
Create your list before the loop; otherwise it will be recreated on every iteration.
>>> list1 = []
>>> for i in range(10):
...     list1.append(list(range(i, 10)))
...
>>> list1
[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5, 6, 7, 8, 9], [2, 3, 4, 5, 6, 7, 8, 9], [3, 4, 5, 6, 7, 8, 9], [4, 5, 6, 7, 8, 9], [5, 6, 7, 8, 9], [6, 7, 8, 9], [7, 8, 9], [8, 9], [9]]
Use the append method, e.g.:
lst = []
line = np.genfromtxt('temp.txt', usecols=3, dtype=[('floatname','float')], skip_header=1)
lst.append(line)
First of all, do not use list as a variable name: it shadows the built-in list type.
I'm not entirely clear on what you're asking (a little more context would help), but maybe this is helpful:
my_list = []
my_list.append(np.genfromtxt('temp.txt', usecols=3, dtype=[('floatname','float')], skip_header=1))
my_list.append(np.genfromtxt('temp2.txt', usecols=3, dtype=[('floatname','float')], skip_header=1))
That will create a list (a mutable sequence type in Python) called my_list holding the output of the two np.genfromtxt() calls at the first two indexes.
The first can be referenced with my_list[0] and the second with my_list[1].
Just came across the same issue today...
To create a list of lists, first convert your data (an array or another type of variable) into lists. Then create a new empty list and append to it the lists you just created. You end up with a list of lists:
list_1=data_1.tolist()
list_2=data_2.tolist()
listoflists = []
listoflists.append(list_1)
listoflists.append(list_2)
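Put together in a loop, the pattern might look like the sketch below. Here load_column is a hypothetical stand-in for the per-iteration np.genfromtxt(...) call from the question and returns made-up values, so the example runs without any data files.

```python
def load_column(i):
    """Hypothetical stand-in for the per-iteration np.genfromtxt(...) call."""
    return [float(i), float(i) * 2]


results = []  # create the outer list once, before the loop
for i in range(3):
    results.append(load_column(i))  # one inner list per iteration

print(results)        # [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
print(results[1][0])  # 1.0
```

Each inner list is then reachable as results[i], and a single value as results[i][j].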