How would I get Python to run through a .txt document, find a specific heading, and then put the information from each line into a list for printing? And then, once finished, look for another heading and do the same with the information there...
If you had a csv file as follows:
h1,h2,h3
a,b,c
d,e,f
g,h,i
Then the following would do as you request (if I understood you correctly):
def getColumn(title,file):
    result = []
    with open(file) as f:
        # rstrip() removes the newline so the last header matches too
        headers = f.readline().rstrip().split(',')
        index = headers.index(title)
        for l in f.readlines():
            result.append(l.rstrip().split(',')[index])
    return result
For example:
print(getColumn("h1",'cf.csv') )
>>> ['a', 'd', 'g']
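A variant of the same idea, sketched with the standard csv module, which also copes with quoted fields. The name get_column and the in-memory sample are assumptions for illustration; the function takes any iterable of lines, so an open file should work as well.

```python
import csv
import io

def get_column(title, lines):
    """Return every value found under the header `title`."""
    reader = csv.DictReader(lines)  # the first row is used as the header row
    return [row[title] for row in reader]

# In-memory stand-in for the CSV file shown above (assumption for the demo)
sample = io.StringIO("h1,h2,h3\na,b,c\nd,e,f\ng,h,i\n")
column = get_column("h3", sample)
print(column)  # ['c', 'f', 'i']
```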
File test.txt
a
b
c
heading1
d
e
f
heading2
g
h
heading3
>>> from itertools import takewhile
>>> with open('test.txt') as f:
...     for heading in ('heading1', 'heading2', 'heading3'):
...         # Python 3: the built-in map is lazy, so itertools.imap is not needed
...         items = list(takewhile(heading.__ne__, map(str.rstrip, f)))
...         print(items)
...
['a', 'b', 'c']
['d', 'e', 'f']
['g', 'h']
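Note that the snippet above collects the lines that come before each heading. If the goal is the lines that follow a heading, which may be closer to what the question asks, a rough sketch could look like this. The function name read_sections and the sample lines are assumptions, not part of the original answers.

```python
def read_sections(lines, headings):
    """Map each heading to the list of non-empty lines that follow it."""
    sections = {}
    current = None  # heading we are currently collecting lines for
    for raw in lines:
        line = raw.rstrip("\n")
        if line in headings:
            current = line
            sections[current] = []
        elif current is not None and line:
            sections[current].append(line)
    return sections

# Stand-in for the lines of test.txt (assumption for the demo)
text = ["a", "b", "c", "heading1", "d", "e", "f", "heading2", "g", "h", "heading3"]
sections = read_sections(text, {"heading1", "heading2", "heading3"})
print(sections["heading1"])  # ['d', 'e', 'f']
```

Lines that appear before the first heading (a, b, c here) are ignored in this sketch.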
Related
I'm trying to find the number of rows and columns in a matrix file. The matrix doesn't have spaces between the characters but does have separate lines. The sample down below should return 3 rows and 5 columns but that's not happening.
Also, when I print the matrix, each line has \n in it, and I want to remove that. I tried .split('\n'), but that didn't help. I ran this script earlier with a different data set separated by commas, with line.split(',') in the code, and that worked: it returned the correct number of rows and columns and printed the matrix with no \n. I'm not sure what changed by removing the comma from the line.split() call.
import sys
import numpy
with open(sys.argv[1], "r") as f:
    m = [[char for char in line.split(' ')] for line in f if line.strip('\n') ]
m_size = numpy.shape(m)
print(m)
print("%s, %s" % m_size)
Sample data:
aaaaa
bbbbb
ccccc
Output:
[['aaaaa\n'], ['bbbbb\n'], ['ccccc']]
3, 1
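IIUC:
import sys
import numpy as np
with open(sys.argv[1]) as f:
    # strip() removes the newline; iterating a string yields its characters
    m = np.array([[char for char in line.strip()] for line in f])
>>> m
array([['a', 'a', 'a', 'a', 'a'],
       ['b', 'b', 'b', 'b', 'b'],
       ['c', 'c', 'c', 'c', 'c']], dtype='<U1')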
>>> m.shape
(3, 5)
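The reason the original code reported one column is that the lines contain no spaces, so line.split(' ') returns a one-element list holding the whole line, newline included. If only the row and column counts are needed, a plain-Python sketch without numpy might be enough; the lines list below is a stand-in for the file contents and is an assumption.

```python
lines = ["aaaaa\n", "bbbbb\n", "ccccc"]  # stand-in for the lines of the file

# list(s) splits a string into characters; strip() drops the trailing newline
matrix = [list(line.strip()) for line in lines if line.strip()]
rows = len(matrix)
cols = len(matrix[0]) if matrix else 0
print(rows, cols)  # 3 5
```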
I am importing a list of strings into Python from a text file. When I print the list to check it has been done properly, I see it is double bracketed.
From what I understand this means it is multidimensional, but this is not what I want. How do I prevent this?
An example of my code is shown below.
test_list_a_file = "path/to/test_list_a_file"
with open(test_list_a_file, 'r') as openfile:
    test_list_a = [line.split() for line in openfile.readlines()]
print(test_list_a[0:4])
This returns...
[['A'], ['B'], ['C'], ['D']]
When I manually create my list in python to see what happens:
test_list_b = ['A', 'B', 'C', 'D', 'E', 'F']
print(test_list_b[0:4])
It works just fine...
['A', 'B', 'C', 'D']
The file test_list_a_file looks like this:
A
B
C
D
E
F
What's wrong with the way I am importing it?
test_list_a_file = "path/to/test_list_a_file"
with open(test_list_a_file, 'r') as openfile:
    test_list_a = [line.strip() for line in openfile.readlines()]
should work
or alternatively
with open(test_list_a_file, 'r') as openfile:
    test_list_a = openfile.read().split()
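The double brackets come from str.split, which always returns a list, while str.strip returns a string. A quick check on a single line (the variable names are only for illustration):

```python
line = "A\n"  # what one line of the file looks like when read

as_list = line.split()    # splits on whitespace and returns a list
as_string = line.strip()  # removes surrounding whitespace and returns a string

print(as_list)    # ['A']
print(as_string)  # A
```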
you can use a loop
like this:
`for i in range(len(test_list_b)):
    print(test_list_b[i])`
output:
A
B
C
D
E
F
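Try line.rstrip('\n'). It removes the newline that Enter leaves at the end of each line in the text file and returns a plain string. (line.split('\n') would return a list such as ['A', ''], so the result would still be nested.)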
I would like to convert a list of strings into a dictionary.
The list looks like this after I have split it into the separate words:
[['ice'], ['tea'], ['silver'], ['gold']]
I want to convert it to a dictionary which looks like this:
{ 1 : ['i', 'c', 'e'],
  2 : ['t','e','a'],
  3 : ['s','i','l','v','e','r'],
  4 : ['g','o','l','d'] }
This is my code thus far:
import itertools
def anagram1(dict):
    with open('words.txt', 'r') as f:
        data = f.read()
        data = data.split()
        x = []
        y = []
        for word in data:
            x1 = word.split()
            x.append(x1)
            for letters in word:
                y1 = letters.split()
                y.append(y1)
        d = dict(itertools.zip_longest(*[iter(y)] * 2, fillvalue=""))
To which I receive the following error:
TypeError: 'dict' object is not callable
import pprint
l = [['ice'], ['tea'], ['silver'], ['gold']]
d = {idx: list(item[0]) for idx, item in enumerate(l, start=1)}
pprint.pprint(d)
{1: ['i', 'c', 'e'],
 2: ['t', 'e', 'a'],
 3: ['s', 'i', 'l', 'v', 'e', 'r'],
 4: ['g', 'o', 'l', 'd']}
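As for the TypeError itself: it most likely comes from the parameter named dict, which hides the built-in dict inside the function. Assuming anagram1 was called with a dictionary, dict(...) then tries to call that argument. A minimal sketch of the effect, with hypothetical function names broken and fixed:

```python
def broken(dict):
    # `dict` here is the argument, not the built-in type
    return dict([("a", 1)])

def fixed(words):
    # with the parameter renamed, the built-in dict is visible again
    return dict(enumerate(words, start=1))

try:
    broken({})
except TypeError as exc:
    message = str(exc)
    print(message)  # 'dict' object is not callable

result = fixed(["ice", "tea"])
print(result)  # {1: 'ice', 2: 'tea'}
```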
The following should do the job:
with open('file.txt', 'r') as f:
    data = f.read()
data = data.split()
# start=1 gives keys 1, 2, ...; list(v) splits each word into letters
data_dict = {i: list(v) for i, v in enumerate(data, start=1)}
I have a file with a list of letters corresponding to another letter:
A['B', 'D']
B['A', 'E']
C[]
D['A', 'G']
E['B', 'H']
F[]
G['D']
H['E']
I need to import these lists to their corresponding letter, to hopefully have variables that look like this:
vertexA = ['B', 'D']
vertexB = ['A', 'E']
vertexC = []
vertexD = ['A', 'G']
vertexE = ['B', 'H']
vertexF = []
vertexG = ['D']
vertexH = ['E']
What would be the best way to do this? I tried searching for an answer but was unlucky in doing so. Thanks for any help.
You can try using a dictionary rather than separate variables; I think it also makes it easier to populate your data from your text file.
vertex = {}
vertex['A'] = ['B', 'D']
vertex['A']
>>> ['B', 'D']
When you read your input file, the inputs should look like this:
string='A["B","C"]'
So, we know that the first letter is the name of the list.
import ast
your_list=ast.literal_eval(string[1:])
your_list:
['B', 'C']
You can take care of the looping, reading file, and string manipulation for proper naming...
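For completeness, the looping part might be sketched like this. The lines list stands in for the lines read from the file and is an assumption, as is the one-letter-name format.

```python
import ast

# Stand-in for the lines of the input file (assumption for the demo)
lines = ["A['B', 'D']", "B['A', 'E']", "C[]", "G['D']"]

vertex = {}
for line in lines:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    name, rest = line[0], line[1:]         # first character is the list's name
    vertex[name] = ast.literal_eval(rest)  # safely parses "['B', 'D']" into a list

print(vertex["A"])  # ['B', 'D']
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for text read from a file.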
Building a dictionary would probably be best. Each letter of the alphabet would be a key, and then the value would be a list of associated letters. Here's a proof of concept (not tested):
from string import ascii_uppercase
vertices = {}
# instantiate dict with uppercase letters of alphabet
for c in ascii_uppercase:
    vertices[c] = []
# iterate over file and populate dict (text mode, so each line is a str)
with open("out.txt", "r") as f:
    for i, line in enumerate(f):
        if line[0].upper() not in ascii_uppercase:
            # you probably want to do some additional error checking
            print("Error on line {}: {}".format(i, line))
        else: # valid uppercase letter at beginning of line
            list_open = line.index('[')
            list_close = line.rindex(']') + 1 # one past end
            # probably would want to validate record is in correct format before getting here
            # translate hack to remove unwanted chars (Python 3 form of translate)
            row_values = line[list_open:list_close].translate(str.maketrans('', '', "[] '")).split(',')
            # do some validation for cases where row_values is empty
            vertices[line[0].upper()].extend([e for e in row_values if e.strip() != ''])
Using it would then be easy:
for v in vertices['B']:
    print(v)  # do something with v
File A.txt:
A['B', 'D']
B['A', 'E']
C[]
D['A', 'G']
E['B', 'H']
F[]
G['D']
H['E']
The code:
with open('A.txt','r') as file:
    file=file.read().splitlines()
listy=[[elem[0],elem[1:].strip('[').strip(']').replace("'",'').replace(' ','').split(',')] for elem in file]
This makes a nested list, but as Christian Dean said, a dictionary is a better way to go.
Result:
[['A', ['B', 'D']], ['B', ['A', 'E']], ['C', ['']], ['D', ['A', 'G']], ['E', ['B', 'H']], ['F', ['']], ['G', ['D']], ['H', ['E']]]
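If a dictionary is preferred over the nested list, the result can be converted in one step. This sketch also drops the empty strings that show up for vertices without neighbours (C and F); the name vertices and the shortened sample are assumptions.

```python
# Shortened copy of the nested list shown above
listy = [['A', ['B', 'D']], ['C', ['']], ['G', ['D']]]

# keep only non-empty neighbour names, so C maps to an empty list
vertices = {name: [n for n in neighbours if n] for name, neighbours in listy}
print(vertices)  # {'A': ['B', 'D'], 'C': [], 'G': ['D']}
```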
Given a string that is a sequence of several values separated by a comma:
mStr = 'A,B,C,D,E'
How do I convert the string to a list?
mList = ['A', 'B', 'C', 'D', 'E']
You can use the str.split method.
>>> my_string = 'A,B,C,D,E'
>>> my_list = my_string.split(",")
>>> print(my_list)
['A', 'B', 'C', 'D', 'E']
If you want to convert it to a tuple, just do:
>>> print(tuple(my_list))
('A', 'B', 'C', 'D', 'E')
If you are looking to append to a list, try this:
>>> my_list.append('F')
>>> print(my_list)
['A', 'B', 'C', 'D', 'E', 'F']
If the string includes integers and you want to avoid casting them to int individually, you can do:
mList = [int(e) if e.isdigit() else e for e in mStr.split(',')]
This is called a list comprehension, and it is based on set-builder notation.
ex:
>>> mStr = "1,A,B,3,4"
>>> mList = [int(e) if e.isdigit() else e for e in mStr.split(',')]
>>> mList
[1, 'A', 'B', 3, 4]
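For readers new to comprehensions, the one-liner above should be equivalent to this explicit loop (same variable names as in the example):

```python
mStr = "1,A,B,3,4"

mList = []
for e in mStr.split(','):
    if e.isdigit():
        mList.append(int(e))  # digit-only pieces become integers
    else:
        mList.append(e)       # everything else stays a string

print(mList)  # [1, 'A', 'B', 3, 4]
```

Note that isdigit() is False for strings such as '-3' or '1.5', so those would stay strings in both versions.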
Consider the following in order to handle the case of an empty string:
>>> my_string = 'A,B,C,D,E'
>>> my_string.split(",") if my_string else []
['A', 'B', 'C', 'D', 'E']
>>> my_string = ""
>>> my_string.split(",") if my_string else []
[]
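The guard matters because splitting an empty string on an explicit separator does not give an empty list. A quick check (the variable names are only for illustration):

```python
empty = ""

with_separator = empty.split(",")  # one empty string, not an empty list
without_separator = empty.split()  # whitespace splitting drops empty results
guarded = empty.split(",") if empty else []

print(with_separator)     # ['']
print(without_separator)  # []
print(guarded)            # []
```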
>>> some_string='A,B,C,D,E'
>>> new_tuple= tuple(some_string.split(','))
>>> new_tuple
('A', 'B', 'C', 'D', 'E')
You can split that string on , and directly get a list:
mStr = 'A,B,C,D,E'
list1 = mStr.split(',')
print(list1)
Output:
['A', 'B', 'C', 'D', 'E']
You can also convert it to an n-tuple:
print(tuple(list1))
Output:
('A', 'B', 'C', 'D', 'E')
You can use this function to convert comma-delimited single-character strings to a list:
def stringtolist(x):
    mylist=[]
    for i in range(0,len(x),2):
        mylist.append(x[i])
    return mylist
# splits a string according to delimiters
'''
Let's make a function that can split a string
into a list according to the given delimiters.
example data: cat;dog:greff,snake/
example delimiters: ,;- /|:
'''
def string_to_splitted_array(data,delimeters):
    # result list
    res = []
    # we will add chars into sub_str until
    # we reach a delimiter
    sub_str = ''
    for c in data: # iterate over data char by char
        # if we reached a delimiter, we store the result
        if c in delimeters:
            # avoid empty strings
            if len(sub_str)>0:
                # looks like a valid string.
                res.append(sub_str)
                # reset sub_str to start over
                sub_str = ''
        else:
            # c is not a delimiter, so it is
            # part of the string.
            sub_str += c
    # there may not be a delimiter at the end of data.
    # if sub_str is not empty, we should add it to the list.
    if len(sub_str)>0:
        res.append(sub_str)
    # result is in res
    return res
# test the function.
delimeters = ',;- /|:'
# read the csv data from console.
csv_string = input('csv string:')
# let's check that it works.
splitted_array = string_to_splitted_array(csv_string,delimeters)
print(splitted_array)
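The same multi-delimiter split can probably be done more compactly with re.split from the standard library. This is only a sketch; the function name split_on_any is an assumption, and the sample data is the one from the comment above.

```python
import re

def split_on_any(data, delimiters):
    """Split data on any run of the given delimiter characters."""
    # re.escape keeps characters such as '-' or '|' literal inside the class
    pattern = "[" + re.escape(delimiters) + "]+"
    # drop the empty strings that appear at the edges of the data
    return [part for part in re.split(pattern, data) if part]

parts = split_on_any("cat;dog:greff,snake/", ",;- /|:")
print(parts)  # ['cat', 'dog', 'greff', 'snake']
```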