How to call elements from a .txt file with python? - python

I have a .txt file that was saved with python. It has the form:
file_inputs
Where the first line is just a title that helps me remember the order of each element that was saved and the second line is a sequence of a string ('eos') and other elements inside. How can I call the elements so that inputs[0] returns a string ('eos') and inputs[1] returns the number "5", for example?

I am not sure why you want inputs[1] to return 5.
However, here is the standard (simple) way to read a text file with Python:
f = open('/path/to/file.txt', 'r')
content = f.read()
# do whatever you want here
f.close()
To get eos printed first, you might want to iterate through the content string until you find a space.
For the 5, I don't know.

If I understand correctly, you will have to do something like this:
input = open(<file_name>, 'r')
input = input.readlines()
input.pop(0)  # remove the title line
# now you have a list in which each line of the .txt file is a str
new_input = [None] * len(input)
for index, line in enumerate(input):
    # split each line on commas, so new_input becomes a list of lists of elements
    new_input[index] = line.strip().split(",")
# in the end you should be able to call
new_input[0][1]  # first line, second element; if I didn't mess up, this should be '5'
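Building on the answers above, here is a minimal sketch that also converts each element to a Python value, so that the number comes back as 5 rather than '5'. The question does not show the actual file contents, so this assumes the second line holds comma-separated elements without surrounding brackets; the names read_inputs and parse_element are made up for this example.

```python
import ast


def parse_element(text):
    """Convert one field to a Python value, falling back to the raw string."""
    text = text.strip()
    try:
        # Handles literals such as 5, 2.5 or 'eos' (with quotes).
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Bare words such as eos are not literals, so keep them as strings.
        return text


def read_inputs(path, sep=","):
    """Return the elements found on the second line of the file at `path`."""
    with open(path, "r") as f:
        f.readline()             # skip the title line
        data_line = f.readline()
    return [parse_element(field) for field in data_line.split(sep)]
```

With a second line such as eos, 5 this should give inputs[0] as a string and inputs[1] as an int. If the file uses a different separator, pass it as sep.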

Related

reading line in output file that repeats but has different associated values

I'm trying to use python to open and read a file with a line that repeats in my output. The line is:
"AVE. CELL LNTHS[bohr] = 0.4938371E+02 0.4938371E+02 0.4938371E+02"
The values change in each line (with every step), but all lines start with AVE. CELL LNTHS[bohr]. I want to take the first of the three values from every line and make a list. The image is a snip of the output file and the repeating line of interest.
You can use the float command to convert a string to number. Also, use split to split the line first on the '=' then on space. Lastly, use list comprehension to build a list from the parts of the string.
path_to_file = r"C:\Documents\whatever.csv"
with open(path_to_file, "r") as file:
    for line in file:
        if line.startswith("AVE. CELL LNTHS[bohr]"):
            values = [float(x) for x in line.split("=")[1].split()]
            # Do something with the values
            print(values)
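Since the question asks for a list of only the first value from each matching line, the same split-and-convert idea could be sketched as follows. The function name first_cell_lengths is made up, and the second data line in the sample is invented purely for illustration.

```python
PREFIX = "AVE. CELL LNTHS[bohr]"


def first_cell_lengths(lines):
    """Collect the first value after '=' from every line starting with PREFIX."""
    result = []
    for line in lines:
        if line.startswith(PREFIX):
            fields = line.split("=", 1)[1].split()
            result.append(float(fields[0]))
    return result


# The first line is copied from the question; the third one is made-up sample data.
sample = [
    "AVE. CELL LNTHS[bohr] = 0.4938371E+02 0.4938371E+02 0.4938371E+02\n",
    "some other output line\n",
    "AVE. CELL LNTHS[bohr] = 0.5000000E+02 0.5000000E+02 0.5000000E+02\n",
]
print(first_cell_lengths(sample))
```

An open file object can be passed instead of the sample list, because iterating over a file also yields one line at a time.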

Printing characters from a given sequence till a certain range only. How to do this in Python?

I have a file in which I have a sequence of characters. I want to read the second line of that file and want to read the characters of that line to a certain range only.
I tried this code, however, it is only printing specific characters from both lines. And not printing the range.
with open("irumfas.fas", "r") as file:
    first_chars = [line[1] for line in file if not line.isspace()]
print(first_chars)
Can anyone help in this regard? How can I give a range?
Below is the sequence that I want to print. But I want to start printing the characters from the second line of the sequence, up to a certain range only.
IRUMSEQ
ATTATAAAATTAAAATTATATCCAATGAATTCAATTAAATTAAATTAAAGAATTCAATAATATACCCCGGGGGGATCCAATTAAAAGCTAAAAAAAAAAAAAAAAAA
The following approach can be used.
Consider the file contains
RANDOMTEXTSAMPLE
SAMPLERANDOMTEXT
RANDOMSAMPLETEXT
with open('sampleText.txt') as sampleText:
    content = sampleText.read()
content = content.split("\n")[1]
content = content[:6]
print(content)
Output will be
SAMPLE
I think you want something like this:
with open("irumfas.fas", "r") as file:
    second_line = file.readlines()[1]
print(second_line[0:9])
readlines() will give you a list of the lines -- which we index to get only the 2nd line. Your existing code will iterate over all the lines (which is not what you want).
As for extracting a certain range, you can use slices to select the range of characters you want from that line -- in the example above, it's the first 9.
You can slice the second line (lines[1]) as you would slice a list.
You were very close:
end = 6 # number of characters
with open("irumfas.fas", "r") as file:
    lines = [line.strip() for line in file if not line.isspace()]
print(lines[1][:end]) # lines[1] is the second non-blank line
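The answers above read every line before picking the second one. For a large sequence file, a possible variation is to stop reading right after the second line. This is only a sketch; the helper name second_line_prefix is made up for this example.

```python
from itertools import islice


def second_line_prefix(path, end):
    """Return the first `end` characters of the second line of the file."""
    with open(path, "r") as f:
        # islice(f, 1, 2) skips line 0 and yields only line 1, if it exists.
        line = next(islice(f, 1, 2), "")
    return line.rstrip("\n")[:end]
```

Calling second_line_prefix("irumfas.fas", 6) would then return the first six characters of the sequence line, or an empty string if the file has fewer than two lines.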

Reading and taking specific file contents in a list in python

I have a file containing:
name: Sam
placing: 2
quote: I'll win.
name: Jamie
placing: 1
quote: Be the best.
and I want to read the file through python and append specific contents into a list. I want my first list to contain:
rank = [['Sam', 2],['Jamie', 1]]
and second list to contain:
quo = ["I'll win", 'Be the best']
First off, I start reading the file by:
def read_file():
    filename = open("player.txt","r")
    playerFile = filename
    player = [] #first list
    quo = [] #second list
    for line in playerFile: #going through each line
        line = line.strip().split(':') #strip new line
        print(line) #checking purpose
        player.append(line[1]) #index out of range
        player.append(line[2])
        quo.append(line[3])
I'm getting an index out of range in the first append. I have split by ':' but I can't seem to access it.
When you do line = line.strip().split(':') with line = "name: Sam",
you will receive ['name', ' Sam'], so the first append should work.
The second one, player.append(line[2]), will not work.
As zython said in the comments, you need to know the format of the file; any blank line or other change in the file can make your script fail.
You should analyze the file differently:
If you can rely on the fact that "name" and "quote" are always existing fields in each player's data, you should look for these field names.
For example:
for line in file:
    # Go over each line and add to the player list only the lines with "name" in them
    if "name" in line:
        # A line with "name" was found - do what you need with it
        player.append(line.split(":")[1])
A few problems,
The program attempts to read three lines worth of data in a single iteration of the for loop. But that won't work, because the loop, and the split command are parsing only a single line per iteration. It will take three loop iterations to read a single entry from your file.
The program needs handling for blank lines. Generally, when reading files like this, you probably want a lot of error handling, the data is usually not formatted perfectly. My suggestion is to check for blank lines, where line has only a single value which is an empty string. When you see that, ignore the line.
The program needs to collect the first and second lines of each entry, and put those into a temporary array, then append the temporary array to player. So you'll need to declare that temporary array above, populate first with the name field, next with the placing field, and finally append it to player.
Zero-based indexing. Remember that the first item of an array is list[0], not list[1]
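The points above could be put together roughly like this. It is a sketch, not the only way to do it: it assumes every entry lists name before placing, and the function name parse_players is made up for this example.

```python
def parse_players(lines):
    """Build rank ([[name, placing], ...]) and quo ([quote, ...]) from the lines."""
    rank, quo = [], []
    entry = []                       # temporary list for the current player
    for line in lines:
        line = line.strip()
        if not line:                 # ignore blank lines
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "name":
            entry = [value]          # start a new entry with the name
        elif key == "placing":
            entry.append(int(value))
            rank.append(entry)       # entry is complete: [name, placing]
        elif key == "quote":
            quo.append(value)
    return rank, quo


sample = [
    "name: Sam\n", "placing: 2\n", "quote: I'll win.\n",
    "\n",
    "name: Jamie\n", "placing: 1\n", "quote: Be the best.\n",
]
rank, quo = parse_players(sample)
print(rank)
print(quo)
```

Using partition(":") instead of split(":") keeps any further colons inside a quote intact, and an open file can be passed in place of the sample list.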
I think you are confused about how to check a line and add its content to one of the two lists based on what it contains. You could use in to check which line you are currently on. This works assuming your text file is the same as given in the question.
rank, quo = [], []
for line in playerFile:
    splitted = line.strip().split(": ")
    if "name" in line:
        name = splitted[1]
    elif "placing" in line:
        rank.append([name, splitted[1]])
    elif "quote" in line:
        quo.append(splitted[1])
print(rank) # [['Sam', '2'], ['Jamie', '1']]
print(quo) # ["I'll win.", 'Be the best.']
Try this code:
def read_file():
    filename = open("player.txt", "r")
    playerFile = filename
    player = []
    rank = []
    quo = []
    for line in playerFile:
        value = line.strip().split(": ")
        if "name" in line:
            player.append(value[1])
        if "placing" in line:
            player.append(value[1])
        if "quote" in line:
            quo.append(value[1])
            rank.append(player)
            player = []
    print(rank)
    print(quo)
read_file()

Transferring numbers from a text file to an array

I have a text file that has three lines and would like the first number of each line stored in an array, the second in another, and so on and so forth. I would also like it to print the array out.
The text file:
0,1,2,3,0
1,3,0,0,2
2,0,3,0,1
The code I'm using (I've only shown the first array for simplicity):
f=open("ballot.txt","r")
for line in f:
    num1=line[0]
    num1=[]
    print(num1)
I expect the result for it to print out the first number of each line:
0
1
2
The actual result I get is:
[]
[]
[]
It looks like you reset num1, right? Every time through the loop, num1 is reset to an empty list before it is printed.
f=open("ballot.txt","r")
for line in f:
    num1=line[0]
    #num1=[] <-- remove this line
    print(num1)
This will print the first character of each line. If you want the first number (i.e. everything before the first comma), you can try this:
f=open("ballot.txt","r")
for line in f:
    num1=line.split(',')[0]
    print(num1)
You read in the line fine and assign the first char of the line to the variable, but then you overwrite the variable with an empty list.
f=open("ballot.txt","r")
for line in f:
    num1=line.strip().split(',')[0] # splits the line by commas and grabs 1st val
    #    ^^^^^^^^^^^^^^^^^^^^^^^^^^
    print(num1)
This should do what you want. In your simple case, it's index 0, but you could index any value.
Since the file is comma-delimited, splitting the line by the comma will give you all the columns. Then you index the one you want. The strip() call gets rid of the newline character (which would otherwise be hanging off the last column value).
As for the big picture, trying to get lists from each column, read in the whole file into a data structure. Then process the data structure to make your lists.
def get_column_data(data, index):
    return [values[index] for values in data]
with open("ballot.txt", "r") as f:
    data = f.read()
data_struct = []
for line in data.splitlines():
    values = line.split(',')
    data_struct.append(values)
print(data, '\nData Struct is ', data_struct)
print(get_column_data(data_struct, 0))
print(get_column_data(data_struct, 1))
The get_column_data function parses the data structure and makes a list (via list comprehension) of the values of the proper column.
In the end, the data_struct is a list of lists, which can be accessed as a two-dimensional array if you wanted to do that.
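If you want all the column lists at once rather than one index at a time, one possible shortcut is to transpose the rows with zip. The sketch below works on the file contents from the question given as a string; the function name columns_from_text is made up, and it assumes every line has the same number of values.

```python
def columns_from_text(text):
    """Return one list of ints per column of comma-separated text."""
    rows = [line.split(",") for line in text.splitlines() if line.strip()]
    # zip(*rows) pairs up the first values, then the second values, and so on.
    return [[int(value) for value in column] for column in zip(*rows)]


ballot = "0,1,2,3,0\n1,3,0,0,2\n2,0,3,0,1\n"
columns = columns_from_text(ballot)
print(columns[0])  # the first number of each line
print(columns[1])  # the second number of each line
```

Note that zip stops at the shortest row, so a short line would silently truncate every column.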

Removing lines from a txt file based on the structure of the line

Code:
with open("filename.txt", 'r') as f: #I'm not sure about reading it as r because I would be removing lines.
    lines = f.readlines() #stores each line in the txt into 'lines'.
invalid_line_count = 0
for line in lines: #this iterates through each line of the txt file.
    if line is invalid:
        # something which removes the invalid lines.
        invalid_line_count += 1
print("There were " + str(invalid_line_count) + " amount of invalid lines.")
I have a text file like so:
1,2,3,0,0
2,3,0,1,0
0,0,0,1,2
1,0,3,0,0
3,2,1,0,0
The valid line structure is 5 values split by commas.
For a line to be valid, it must have a 1, 2, 3 and two 0's. It doesn't matter in what position these numbers are.
An example of a valid line is 1,2,3,0,0
An example of an invalid line is 1,0,3,0,0, as it does not contain a 2 and has 3 0's instead of 2.
I would like to be able to iterate through the text file and remove invalid lines.
and maybe a little message saying "There were x amount of invalid lines."
Or maybe as suggested:
As you read each line from the original file, test it for validity. If it passes, write it out to the new file. When you're finished, rename the original file to something else, then rename the new file to the original file.
I think that the csv module may help so I read the documentation and it doesn't help me.
Any ideas?
You can't remove lines from a file, per se. Rather, you have to rewrite the file, including only the valid lines. Either close the file after you've read all the data and reopen it in mode "w", or write to a new file as you process the lines (which takes less memory in the short term).
Your main problem with detecting line validity seems to be handling the input. You want to convert the input text to a list of values; this is a skill you should get from learning your tools. The ones you need here are split to divide the line, and int to convert the values. For instance:
line_vals = line.split(',')
Now iterate through line_vals, and convert each to integer with int.
Validity: you need to count the quantity of each value you have in this list. You should be able to count things by value; if not, back up to your prior lessons and review basic logic and data flow. If you want the advanced method for this, use collections.Counter, which is a convenient type of dictionary that accumulates counts from any sequence.
Does that get you moving? If you're still lost, I recommend some time with a local tutor.
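As a rough illustration of the split, int and collections.Counter steps described in this answer, a validity check might look like the following. The names REQUIRED and is_valid are made up for this sketch.

```python
from collections import Counter

# A valid line has exactly two 0's and one each of 1, 2 and 3.
REQUIRED = Counter({0: 2, 1: 1, 2: 1, 3: 1})


def is_valid(line):
    """Return True if the comma-separated line has exactly the required values."""
    try:
        values = [int(v) for v in line.strip().split(",")]
    except ValueError:
        return False                 # an empty or non-numeric field
    return Counter(values) == REQUIRED


for text in ["1,2,3,0,0", "1,0,3,0,0"]:
    print(text, is_valid(text))
```

Comparing two Counter objects checks both which values are present and how many of each, so the length test comes for free.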
One of the possible right approaches:
with open('filename.txt', 'r+') as f: # opening file in read/write mode
    inv_lines_cnt = 0
    valid_list = [0, 0, 1, 2, 3] # sorted list of valid values
    lines = f.read().splitlines()
    f.seek(0)
    f.truncate(0) # truncating the initial file
    for l in lines:
        if sorted(map(int, l.split(','))) == valid_list:
            f.write(l+'\n')
        else:
            inv_lines_cnt += 1
print("There were {} amount of invalid lines.".format(inv_lines_cnt))
The output:
There were 2 amount of invalid lines.
The final filename.txt contents:
1,2,3,0,0
2,3,0,1,0
3,2,1,0,0
This is a mostly language-independent problem. What you would do is open another file for writing. As you read each line from the original file, test it for validity. If it passes, write it out to the new file. When you're finished, rename the original file to something else, then rename the new file to the original file.
For a line to be valid, each line must have a 1, 2, 3 and 2 0's. It doesn't matter in what position these numbers are.
CHUNK_SIZE = 65536
def _is_valid(line):
    """Check if a line is valid.
    A line is valid if it is of length 5 and contains '1', '2', '3',
    in any order, as well as '0', twice.
    :param list line: The values of the line to check.
    :return: True if the line is valid, else False.
    :rtype: bool
    """
    if len(line) != 5:
        # If there's not exactly five elements in the line, return false
        return False
    if all(x in line for x in {"1", "2", "3"}) and line.count("0") == 2:
        # Builtin `all` checks if a condition (in this case `x in line`)
        # applies to all elements of a certain iterator.
        # `list.count` returns the amount of times a specific
        # element appears in it. If "0" appears exactly twice in the line
        # and the `all` call returns True, the line is valid.
        return True
    # If the previous block doesn't execute, the line isn't valid.
    return False
def get_valid_lines(path):
    """Get the valid lines from a file.
    The valid lines will be written to `path`.
    :param str path: The path to the file.
    :return: None
    :rtype: None
    """
    invalid_lines = 0
    contents = []
    valid_lines = []
    with open(path, "r") as f:
        # Open the `path` parameter in reading mode.
        while True:
            chunk = f.read(CHUNK_SIZE)
            # Read up to `CHUNK_SIZE` characters (65536) from the file.
            if not chunk:
                # An empty string means we reached the end of the file.
                break
            contents.append(chunk)
            # If the chunk is not empty, add it to the contents.
    contents = "".join(contents).splitlines()
    # `contents` holds chunks of up to 65536 characters. We need to join
    # them using `str.join`. We then split all of this by newlines, to get
    # each individual line.
    for line in contents:
        # `_is_valid` expects the list of values, so split on commas first.
        if not _is_valid(line=line.split(",")):
            invalid_lines += 1
        else:
            valid_lines.append(line)
    print("Found {} invalid lines".format(invalid_lines))
    with open(path, "w") as f:
        for line in valid_lines:
            f.write(line)
            f.write("\n")
I'm splitting this up into two functions, one to check if a line is valid according to your rules, and a second one to manipulate a file. If you want to return the valid lines instead, just remove the second with statement and replace it with return valid_lines.
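The write-to-a-new-file-then-rename approach described in this answer could be sketched as follows. This is a variation under stated assumptions: it swaps the new file in with os.replace instead of first renaming the original to keep a backup, and both the function name filter_file_in_place and the is_valid callback parameter are made up for this example.

```python
import os
import tempfile


def filter_file_in_place(path, is_valid):
    """Keep only the lines of `path` accepted by is_valid; return how many were dropped."""
    invalid = 0
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        with os.fdopen(fd, "w") as dst, open(path, "r") as src:
            for line in src:
                if is_valid(line.rstrip("\n")):
                    dst.write(line if line.endswith("\n") else line + "\n")
                else:
                    invalid += 1
        os.replace(tmp_path, path)   # swap the filtered file into place
    except BaseException:
        os.remove(tmp_path)          # clean up the temporary file on failure
        raise
    return invalid
```

Because lines are processed one at a time, the whole file never has to be held in memory, and the original stays untouched until the new file has been written completely.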
