Python, check strings for values and remove other characters

f = open("new_sample.txt", "r")
for ranges_in_file in f:
    if ranges_in_file.find('ranges:') != -1:
        new_data = ranges_in_file.split(" ")
        print('success')
Hi guys, I am currently reading a .txt file line by line to look for a certain value, and I am able to find the line. For example: ranges: [1.3,1.9,2.05,inf,1.64]. How do I store that line in a list and then remove any excess characters in it, such as the word "ranges" and "inf"?

Assuming you have already read the lines of the file and turned the line into a list such as
import math
ranges_in_file = [1.3, 1.9, 2.05, math.inf, 1.64]
you can use a list comprehension to keep only the values you need:
wanted = [x for x in ranges_in_file if x not in [math.inf, "range"]]

You can separate the label ranges from the list using
a, b = ranges_in_file.split(':')
b = b.strip()
Now b contains the required list as a string. Use c = b[1:-1].split(',') to convert it into a list. You can then iterate over the list and discard any values you don't want. (Remember that all entries of the list are still strings!)
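A minimal sketch of the split-and-strip idea, assuming every line looks like ranges: [1.3,1.9,2.05,inf,1.64] (a label, a colon, then a bracketed comma-separated list). The helper name parse_ranges is hypothetical and does not come from the answers here.

```python
import math

def parse_ranges(line):
    """Return the finite floats from a line such as 'ranges: [1.3,inf,1.64]'."""
    # Split once on the first colon: label on the left, list text on the right.
    _label, _, payload = line.partition(":")
    # Drop surrounding whitespace (including the newline) and the brackets.
    payload = payload.strip().strip("[]")
    values = []
    for token in payload.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate an empty list or a trailing comma
        number = float(token)  # float() also accepts 'inf' and 'nan'
        if math.isfinite(number):
            values.append(number)
    return values

print(parse_ranges("ranges: [1.3,1.9,2.05,inf,1.64]\n"))  # [1.3, 1.9, 2.05, 1.64]
```

Converting to float first and filtering with math.isfinite means "inf", "-inf" and "nan" are all dropped without comparing strings.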

For new_sample.txt:
ranges: [1.3,1.9,2.05,inf,1.64]
you can split the data on the space and then on ",":
f = open("new_sample.txt", "r")
file = f.read().splitlines()
f.close()
result = []
for i in file:
    b = i.split(" ")
    if b[0] == "ranges:":
        temp = b[1][1:-1].split(",")
        for j in temp:
            if j not in ["inf"]:
                result.append(j)
print(result)
OUTPUT: ['1.3', '1.9', '2.05', '1.64']

If your lines always look like this, that is, they start with "ranges: " followed by a list of values, and the only thing you want to remove is inf, you can easily turn a line into a list of floats using the map function:
line = "ranges: [1.3,1.9,2.05,inf,1.64]"
values = list(map(float, [x.strip('[]') for x in (line.split(' ')[1]).split(',') if 'inf' not in x]))
Output:
[1.3, 1.9, 2.05, 1.64] # list of float values
You can then apply this to every line of the file that starts with 'ranges:', which gives you a list containing each line's list of values. Note the use of with open(...), which is safer for files in general because the file is always closed properly no matter what happens.
values = []
with open('new_sample.txt', 'r') as f:
    for line in f.readlines():
        if line.startswith('ranges:'):
            line_values = list(map(float, [x.strip #.... and so on, see above
            values.append(line_values)
But if your lines can be different, a more general approach is needed.
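One possible shape for such a more general approach is to pull out every numeric token with a regular expression instead of relying on the exact layout. This is only a sketch: the names NUMBER, extract_numbers and collect_ranges are hypothetical, and the pattern assumes plain decimal or exponent notation.

```python
import re

# Matches optionally signed numbers such as 1.3, -72.84, 10 or 2e-3.
NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def extract_numbers(line):
    """Return every numeric token in the line as a float, ignoring other text."""
    return [float(token) for token in NUMBER.findall(line)]

def collect_ranges(lines):
    """Gather the numbers from each line that starts with 'ranges:'."""
    return [extract_numbers(line) for line in lines if line.startswith("ranges:")]

sample = [
    "header: scan 7\n",
    "ranges: [1.3,1.9,2.05,inf,1.64]\n",
    "ranges: [0.5, nan, 3]\n",
]
print(collect_ranges(sample))  # [[1.3, 1.9, 2.05, 1.64], [0.5, 3.0]]
```

Because "inf" and "nan" contain no digits, they are skipped without any special case. The trade-off is that digits embedded in other words on a matching line would also be picked up.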

Related

What's the fastest way to convert a list into a python list?

Say I have a list like such:
1
2
3
abc
What's the fastest way to convert this into python list syntax?
[1,2,3,"abc"]
The way I currently do it is to use regex in a text editor. I was looking for a way where I can just throw in my list and have it convert immediately.

Read the file, split into lines, convert each line if numeric.
# Read the file
with open("filename.txt") as f:
    text = f.read()
# Split into lines
lines = text.splitlines()
# Convert if numeric
def parse(line):
    try:
        return int(line)
    except ValueError:
        return line
lines = [parse(line) for line in lines]

If it's in a text file, as you mentioned in the comments, you can simply read it and split on newlines. For example:
with open("FileName.txt", "r") as f:
    L = f.read().split("\n")
print(L)
where L is the list.
If you want the values to have the correct type instead of all being strings, you can loop over the list and check each entry. For example:
for i in range(len(L)):
    if L[i].isnumeric():
        L[i] = int(L[i])
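One caveat worth illustrating: str.isnumeric() returns False for negative numbers and decimals, so those entries would stay strings. The sketch below shows the difference next to a try/except fallback; to_number is a hypothetical helper name.

```python
def to_number(text):
    """Convert text to int or float when possible; otherwise return it unchanged."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text  # not a number, keep the original string

items = ["1", "-2", "3.5", "abc"]
print([s.isnumeric() for s in items])  # [True, False, False, False]
print([to_number(s) for s in items])   # [1, -2, 3.5, 'abc']
```

Note that float() also accepts strings such as "inf" and "nan", so filter those out first if they should remain text.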

Index error iterating over list in python

So I have this file that contains 2 words each line. It looks like this.
[/lang:F </lang:foreign>
[lipmack] [lipsmack]
[Fang:foreign] <lang:foreign>
The first word is incorrectly formatted and the second one is correctly formatted. I am trying to put them in a dictionary. Below is my code.
textFile = open("hinditxt1.txt", "r+")
textFile = textFile.readlines()
flist = []
for word in textFile:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict[flist[num][0]] = flist[num][1]
First I split each line, then I try to put the words in a dictionary. But for some reason I get "IndexError: list index out of range" when trying to put them in the dictionary. What can I do to fix it? Thanks!

It is better in Python to iterate over the items of a list rather than over a range of indices. My guess is that the IndexError comes from a line in the input file that is blank or does not contain any spaces.
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {}
for k, v in flist:
    fdict[k] = v
print(fdict)
The code above avoids accessing elements of the list by index by iterating over the items of the list itself. Note that a blank or one-word line will still fail here, with a ValueError from the unpacking, so such lines have to be filtered out first. We can simplify this further with a dict comprehension:
with open("input.txt", 'r') as f:
    flist = [line.split() for line in f]
fdict = {k: v for k, v in flist}
print(fdict)
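If blank or malformed lines are indeed the cause, one way to guard against them is to check the number of words before unpacking. This is a sketch under that assumption; build_mapping is a hypothetical name, and the sample lines stand in for the file contents.

```python
def build_mapping(lines):
    """Map the first word of each line to the second, skipping malformed lines."""
    mapping = {}
    skipped = []
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) != 2:
            skipped.append(number)  # blank line or wrong number of words
            continue
        wrong, right = parts
        mapping[wrong] = right
    return mapping, skipped

sample = [
    "[/lang:F </lang:foreign>\n",
    "[lipmack] [lipsmack]\n",
    "\n",
    "[Fang:foreign] <lang:foreign>\n",
]
mapping, skipped = build_mapping(sample)
print(mapping)
print(skipped)  # [3]
```

Returning the skipped line numbers makes it easy to see which lines of the input need fixing.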

With dictionaries you can also use the .update() method to add new key-value pairs. It would look more like:
for num in range(len(flist)):
    fdict.update({flist[num][0]: flist[num][1]})
A full example without file reading would look like:
in_words = ["[/lang:F </lang:foreign>",
            "[lipmack] [lipsmack]",
            "[Fang:foreign] <lang:foreign>"]
flist = []
for word in in_words:
    flist.append(word.split())
fdict = dict()
for num in range(len(flist)):
    fdict.update({flist[num][0]: flist[num][1]})
print(fdict)
Yielding:
{'[lipmack]': '[lipsmack]', '[Fang:foreign]': '<lang:foreign>', '[/lang:F': '</lang:foreign>'}
The order of your output may vary, since dictionaries did not guarantee insertion order before Python 3.7.
As @Alex points out, the IndexError is likely caused by improperly formatted lines in your data (i.e. a line with only 1 or 0 items on it). I suspect the most likely cause is a \n at the end of your file that makes the last line(s) blank.

Storing Data in the correct way in Python

In Keywords.txt I have these words and their 'values':
alone,1
amazed,10
amazing,10
bad,1
best,10
better,7
excellent,10
These are some of the keywords and their 'values' that I need to store in a data structure, a list. Each line will be later used to access/extract the word and its 'value'.
The list I made in a while loop was:
line = KeywordFile.readline()
while line != '':
    line = KeywordFile.readline()
    line = line.rstrip()
And I tried to convert it to a list form by doing this:
list=[line]
However, when I print the list, I get this:
['amazed,10']
['amazing,10']
['bad,1']
['best,10']
['better,7']
['excellent,10']
I don't think I'll be able to extract my 'values' from the lists that easily if they are inside quotation marks.
I'm looking to get lists in this format:
['amazed',10]
['amazing',10]
['bad',1]
Thanks in advance!

You want to split the line into a list based on the string delimiter ','.
list = line.split(',')
Then you want to convert the second value in each list into an int.
list[1] = int(list[1])
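Put together for a whole file, the two steps might look like the sketch below. It assumes every non-blank line has exactly one comma; load_keywords is a hypothetical name, and the variable is called pairs rather than list so that the built-in list type is not shadowed.

```python
def load_keywords(lines):
    """Turn lines such as 'amazed,10' into [word, value] pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines, e.g. a trailing newline at end of file
        word, value = line.split(",")
        pairs.append([word, int(value)])
    return pairs

sample = ["alone,1\n", "amazed,10\n", "amazing,10\n", "\n"]
print(load_keywords(sample))  # [['alone', 1], ['amazed', 10], ['amazing', 10]]
```

An open file object can be passed in place of sample, since iterating over a file yields its lines.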

Put the line amazed,10 into sample.csv:
import csv
with open('sample.csv') as csvfile:
    inputs = csv.reader(csvfile, delimiter=',')
    for row in inputs:
        row = [int(x) if x.isdigit() else x for x in row]  # convert numeric strings to int, which removes the quotes
        print(row)
Output:
['amazed', 10]

How do I read each line from a file, recognize variables contained within it, and sort them in the main program

Basically I have a program where a name will be input, and three numbers calculated, and put into a list like this:
Each time that the code has been run, it will write this list into a text file. After multiple runs of the program, the text file looks like this:
What I need to do is read each line of the text file and find the highest number out of each line, and print them out with the name attached like so, highest to lowest:
I figured that I would have to use something like 'for line in lines' and 'if num1>num2 and num1>num3: line=[name,num1]'.
However, I realised that I couldn't take the variables back out of line. I have spent a long time researching and trying to find a way to do it, but I haven't been successful at all. Does anyone know how to do this, so that Python prints out the contents of the file arranged from highest number to lowest number, like so:

You can first get rid of the first and last characters ('[' and ']'), split the line by commas, convert the three numbers to integers, and then take the maximum. Something like:
lines = []
with open(fname) as f:
    lines = f.readlines()
for line in lines:
    line = line.strip()[1:-1] # drop the trailing newline, then '[' and ']'
    splitted = line.split(",") # split string by commas
    numbers = list(map(int, splitted[1:])) # get numbers and convert to ints
    maxNum = max(numbers) # search maximum
    # now you can print
    print("[" + splitted[0] + ", " + str(maxNum) + "]")

You can parse the file as user3599110 correctly suggests, but you still have to sort your results. To do that, before printing anything, store the (name, maxNum) tuples in a list and then sort that list by maxNum.
The code looks something like:
lines = []
with open(fname) as f:
    lines = f.readlines()
res_list = []
for line in lines:
    line = line.strip()[1:-1] # drop the trailing newline, then '[' and ']'
    splitted = line.split(",") # split string by commas
    numbers = list(map(int, splitted[1:])) # get numbers and convert to ints
    maxNum = max(numbers) # search maximum
    res_list.append((splitted[0], maxNum)) # save (name, maxNum) on the list
# now we sort the list based on 'maxNum', which is the second element of each tuple.
# We use a simple lambda function here, and reverse=True
sorted_list = sorted(res_list, key=lambda x: x[1], reverse=True)
# now you can print
for item in sorted_list:
    print("[" + item[0] + ", " + str(item[1]) + "]")

Don't hand-parse this. These are Python list literals, at least in your example, and Python can parse them correctly, safely, and easily with ast.literal_eval:
import ast
from operator import itemgetter
with open(fname) as f:
    # Parse to Python lists
    all_lists = map(ast.literal_eval, filter(str.strip, f))
    # Convert to name, maxval pairs
    max_vals = ([lst[0], max(lst[1:])] for lst in all_lists)
    # Sort by the maxval from highest to lowest
    sorted_vals = sorted(max_vals, key=itemgetter(1), reverse=True)
# Print results
for elem in sorted_vals:
    print(elem)
No complex hand-rolled parsers, you let Python do all the heavy lifting (and it will run faster for it). ast.literal_eval is similar to regular eval, but only parses code consisting purely of Python literals; it can't execute arbitrary code, removing the major security/stability risks of using eval.

Trying to input data from a txt file in a list, then make a list, then assign values to the lines

allt = []
with open('towers1.txt','r') as f:
    towers = [line.strip('\n') for line in f]
for i in towers:
    allt.append(i.split('\t'))
print allt[0]
Now I need help. I'm inputting this text
mw91 42.927 -72.84 2.8
yu9x 42.615 -72.58 2.3
HB90 42.382 -72.679 2.4
and when I output I'm getting
['mw91 42.927 -72.84 2.8']
Where in my code, and with what functions, can I pick out the 1st, 2nd, 3rd and 4th values in this list and in all the rows below it? I'm trying
allt[0][2] or
allt[i][2]
but that doesn't give me -72.84; it raises an error, and at other times I get "list has no attribute split".
Update: maybe I need to use enumerate? I also need to make sure that the middle two values I'm inputting can be used as numbers and not strings, because I'm subtracting them.

Are you sure those are tabs? You can call split with no argument and it automatically splits on whitespace (which means you won't have to strip newlines beforehand either). I copied your sample into a file and got it to work like this:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        allt.append(line.split())
>>> print(allt[0])
['mw91', '42.927', '-72.84', '2.8']
>>> print(allt[0][1])
42.927
Footnote: if you get rid of your first list comprehension, you only iterate over the file once, which is less wasteful.
I just saw that you want help converting the float values as well. Assuming that line.split() splits up the data correctly, something like the following should work:
allt = []
with open('towers1.txt','r') as f:
    for line in f:
        first, *_else = line.split() # Python 3
        data = [first]
        float_nums = [float(x) for x in _else]
        data.extend(float_nums)
        allt.append(data)
>>> print(allt[0])
['mw91', 42.927, -72.84, 2.8]
For Python 2, replace first, *_else = line.split() with the following:
first, _else = line.split()[0], line.split()[1:]
Finally (in response to the comments below), if you want a list of a certain set of values, you will have to iterate again, and this is where list comprehensions are useful. If you want the [2] index value of each element in allt, do something like this:
>>> some_items = [item[2] for item in allt]
>>> some_items
[-72.84, -72.58, -72.679]
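Once the numeric fields are floats they can be used directly in arithmetic, which is what the question was ultimately after. A small Python 3 sketch, assuming whitespace-separated lines with a name followed by numbers; parse_tower is a hypothetical helper name.

```python
def parse_tower(line):
    """Split a line like 'mw91 42.927 -72.84 2.8' into a name and floats."""
    name, *numbers = line.split()
    return [name] + [float(n) for n in numbers]

lines = ["mw91 42.927 -72.84 2.8\n", "yu9x 42.615 -72.58 2.3\n"]
rows = [parse_tower(line) for line in lines]

# Subtract the second field of the two rows; both are floats now.
difference = rows[0][1] - rows[1][1]
print(rows[0])               # ['mw91', 42.927, -72.84, 2.8]
print(round(difference, 3))  # 0.312
```

The result is rounded only for display, because floating-point subtraction can leave tiny representation errors.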

[] implies a list.
'' implies a string.
allt = ['mw91 42.927 -72.84 2.8']
allt is a list that contains a single string:
allt[0] --> 'mw91 42.927 -72.84 2.8'
allt[0][2] --> '9'
allt[0].split() --> ['mw91', '42.927', '-72.84', '2.8']
allt[0].split()[2] --> '-72.84' # This is still a string.
float(allt[0].split()[2]) --> -72.84 # This is now a float.

I think this should also work (in Python 3, map is lazy, so wrap it in list() before the file is closed):
with open('towers.txt', 'r') as f:
    allt = list(map(str.split, f))
And if you need the values after the first one to be floats:
with open('towers.txt', 'r') as f:
    allt = [line[:1] + list(map(float, line[1:])) for line in map(str.split, f)]
