Grab Specific elements from a List after reading a file - python

I am using Python and I have a text file with results from a previous complex code. It wrote to a file called 'results' structured by:
xml file name.xml
['chebi:28726', 'chebi:27466', 'chebi:27721', 'chebi:15532', 'chebi:15346']
xml file name.xml
['chebi:27868', 'chebi:27668', 'chebi:15471', 'chebi:15521', 'chebi:15346']
xml file name.xml
['chebi:28528', 'chebi:28325', 'chebi:10723', 'chebi:28493', 'chebi:15346']
etc...
my current code is:
file = open("results.txt", "r")
data = file.readlines()
for a in data:
    print(a)
The problem is I want to grab the specific elements within that list, for example chebi:28528, and convert them from their current compounds into a different format. I wrote the code for this conversion already, but am having trouble with the step before the actual conversion of the compounds.
The problem is that I need to be able to loop through the file and select each element from that list but I am unable to do so.
If I do
for a in data:
    for b in a:
        print(b)
It selects each individual character and not the entire word (chebi:28528).
Is there a way I can loop through the text file and grab just the specific ChEBI compounds so that I can then convert them into the format I need? Python is treating the entire list of compounds as one element, so indexing into that list returns a single character rather than a compound.

Assuming your file is as shown above, it looks like you have lists in raw text format. You can loop over the individual elements by converting each line to a Python list using ast or something similar.
You had the right idea, but you are actually looping through characters. How about this?
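import ast
with open('results.txt', 'r') as f:
    data = f.readlines()
for line in data:
    # skip the file-name lines, which contain no list
    if '[' not in line:
        continue
    # turn the text "['chebi:...', ...]" into a real Python list
    ls = ast.literal_eval(line)
    for word in ls:
        if 'chebi' in word:
            process_me(word)  # process_me stands for your own conversion function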
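
A self-contained sketch of the same approach, using a few in-memory lines in place of results.txt so it can be run as is. The helper name extract_compounds is hypothetical, and the sample lines are shortened versions of the ones in the question.

```python
import ast

def extract_compounds(lines):
    """Return every 'chebi:...' string found in lines that hold a list literal."""
    compounds = []
    for line in lines:
        line = line.strip()
        # File-name lines do not start with '[', so they are skipped.
        if not line.startswith('['):
            continue
        # literal_eval safely parses the text into a real Python list.
        for item in ast.literal_eval(line):
            if item.startswith('chebi:'):
                compounds.append(item)
    return compounds

sample = [
    "xml file name.xml\n",
    "['chebi:28726', 'chebi:27466']\n",
    "xml file name.xml\n",
    "['chebi:27868', 'chebi:15346']\n",
]
print(extract_compounds(sample))
```

With a real file, passing the open file object (or the result of readlines()) in place of sample should work the same way, since both yield one line at a time.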

Related

Python 3: Pulling specific data from documents

I am new to python and using it for my internship. My goal is to pull specific data from about 100 .ls documents (all in the same folder) and then write it to another .txt file and from there import it into excel. My problem is I can read all the files, but cannot figure out how to pull the specifics from that file into a list. From the list I want to write them into a .txt file and then import to excel.
Is there any way to make readlines() capture only certain lines?
It's hard to know exactly what you want without an example or sample code/content. What you might do is create a list and append the desired line to it.
result_list = [] # Create an empty list
with open("myfile.txt", "r") as f:
    Lines = f.readlines() # read the lines of the file
for line in Lines: # loop through the lines
    if "desired_string" in line:
        result_list.append(line) # if the line contains the string, the line is added
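
Since the question mentions about 100 .ls files in one folder, the same filter can be applied to every file in turn. This is only a sketch: the function names collect_lines and write_lines, the "*.ls" pattern, and the search string are assumptions, because the question gives no sample content.

```python
from pathlib import Path

def collect_lines(folder, needle, pattern="*.ls"):
    """Collect lines containing `needle` from every file in `folder` matching `pattern`."""
    matches = []
    # sorted() keeps the file order predictable from run to run
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, "r") as f:
            for line in f:
                if needle in line:
                    matches.append(line.rstrip("\n"))
    return matches

def write_lines(lines, out_path):
    """Write one collected line per row of a plain text file."""
    with open(out_path, "w") as out:
        for line in lines:
            out.write(line + "\n")

# Hypothetical usage:
# found = collect_lines("my_folder", "desired_string")
# write_lines(found, "results.txt")
```

The resulting text file has one match per line, which Excel can import directly.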

Is there a way to select certain elements from a separate text file in Python?

I have started to make an anagram solver with Python, but since I was using a lot of words (every word in the English dictionary), I didn't want them as an array in my Python file and instead had them in a separate text file called dictionaryArray2.txt.
I can easily import my text file and display my words on the screen using Python but I cannot find a way to select a specific element from the array rather than displaying them all.
When I do print(dictionary[2]) it prints a single character of the whole file (the one at index 2) rather than the element at index 2 of the array. I never get any errors. It just doesn't work.
I have tried multiple things but they all have the same output.
My code below:
f = open("dictionaryArray2.txt", "r")
dictionary = f.read()
f.close()
print(dictionary[2])
If you want to split the content of dictionaryArray2 into separate words, do:
f = open("dictionaryArray2.txt", "r")
dictionary = f.read()
f.close()
words = dictionary.split()
print(words[2])
If you want to split the content of dictionaryArray2 into separate lines, do:
f = open("dictionaryArray2.txt", "r")
dictionary = f.readlines()
f.close()
print(dictionary[2])
I think the problem is that you're reading the entire file into a single long string. If your input dictionary has one word per line, I think what you want is to turn a text file like this:
apple
bat
into something like this:
dictionary = ['apple', 'bat']
There's an existing answer that might offer some useful code examples, but in brief, f.read() returns the entire content of the file object f as one string. f.readlines(), on the other hand, returns a list with one string per line.
To quote from the official Python docs:
If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
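
A small sketch of the difference, using io.StringIO as a stand-in for the opened dictionaryArray2.txt. The three sample words are made up for illustration.

```python
import io

text = "apple\nbat\ncat\n"

# read() returns one string, so indexing gives a single character.
whole = io.StringIO(text).read()
print(whole[2])

# split() breaks that string on whitespace into a list of words.
words = whole.split()
print(words[2])

# readlines() returns a list of lines, each still ending in a newline.
lines = io.StringIO(text).readlines()
print(lines[2].strip())
```

Indexing the string gives a character from "apple", while indexing either list gives the whole third word.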

How to replace a string character in place in a 2D list?

So I have to take a .csv file (which can be downloaded by clicking this: http://ge.tt/7lx5Boj2/) and convert it into a 2D list.
My code currently does so, but with one problem.
Each element of the nested list is being read as a big string rather than a list of elements because an apostrophe is being added at the beginning and end of each nested list.
For example, rather than:
["ID","Name","Type 1","Type 2","Generation","Legendary"]
I am getting:
['"ID","Name","Type 1","Type 2","Generation","Legendary"']
To resolve this, I tried to make a nested for loop to replace every apostrophe in the list with an empty character but my code doesn't do anything. It just prints the exact same string as if the replace operation never happened.
def read_info_file(filename):
    opened_infocsv = open(filename, 'r') #opens the given .csv file in INFO format.
    linebylinelist = [line.splitlines() for line in opened_infocsv] #converts entire .csv into a 2D list
    opened_infocsv.close()
    print(linebylinelist)
    print('\n')
    for i in linebylinelist:
        for l in i:
            l.replace("'","")
    print(linebylinelist)
read_info_file('info_file5.csv')
Any ideas on fixing this? NOTE: I am not allowed to import CSV
EDIT : I tried changing .replace to .strip and it still doesn't work. I honestly have no idea how to fix this.
I believe the root of the problem has to do with the way in which I converted the CSV into a 2d list using list comprehension. Maybe it is possible to convert a CSV into a 2d list without converting the lines to strings first.
str.replace does not change the current string - it returns a copy of the string with all occurrences of substring old replaced by new. You should assign the result of the function back to the current list item.
for i in linebylinelist:
    for kk,ss in enumerate(i):
        i[kk] = ss.replace("'","")
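
Note that the apostrophes in the printed output are only how Python displays a string; they are not characters in the data, so there is nothing for replace to remove. Each row is still one string because it was never split on commas. Since the question rules out importing csv, here is a minimal sketch that splits manually. The function names are hypothetical, and it assumes no field contains a comma or an escaped quote.

```python
def parse_csv_line(line):
    """Split one CSV line on commas and drop the double quotes around each field.

    Assumption: no field contains a comma or an escaped quote.
    """
    return [field.strip().strip('"') for field in line.strip().split(',')]

def read_rows(lines):
    """Turn an iterable of CSV lines into a 2D list, skipping blank lines."""
    return [parse_csv_line(line) for line in lines if line.strip()]

sample = ['"ID","Name","Type 1","Type 2","Generation","Legendary"\n']
print(read_rows(sample))
```

With a real file, the open file object can be passed to read_rows in place of sample.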
Use the csv module to read a CSV file.
Also, open the file with a context manager, as in the code below.
import csv
filename = 'info_file5.csv'
with open(filename, 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

How to read each line of a file to a separate list to process them individually

There are already several questions on similar topics, but none of them solves my problem.
I've written multiple lists to a text file. There, every line represents a list. Looks like this:
1: ['4bf58dd8d48988d1ce941735', '4bf58dd8d48988d157941735', '4bf58dd8d48988d1f1931735', etc.]
2: ['4bf58dd8d48988d16a941735', '4bf58dd8d48988d1f6941735', '4bf58dd8d48988d143941735', etc.]
...
I created it with:
with open('user_interest.txt', 'w') as f:
    for x in range(1, 1084):
        temp = df.get_group(x)
        temp_list = temp['CategoryID'].tolist()
        f.write(str(temp_list) + "\n")
If I read the file I get the whole file as a list. If I then access the lines, I have them as class string! But I want them again as a list like before I stored them.
with open('user_interest.txt', 'r') as file:
    for line in file:
        #temp_list.append(line)
        print(similarity_score(user_1_list, temp_list))
Here, line is of class string, not a list as I wanted. The idea with temp_list doesn't really work either.
(user_1_list is a fixed value, while temp_list is not.)
Here's the context of the question: I want every line to be processed in my similarity_score function. I don't need the lists "forever" just hand it over to my function. This function should be applied to every line.
The function calculates cosine similarity and I have to find top 10 most similar users to a given user. So I have to compare each other user with my given user (user_1_list).
Pseudocode:
read line
convert line to a list
give list to my function
read next line ...
It's probably just an easy fix, but I don't get it yet. I don't want each line wrapped in a new nested list
[['foo', 'bar', ...]]
nor do I want all lines merged into a single list.
Thanks for any help and just ask if you need more information!
You should use a proper serializer like JSON to write your lists. Then, you can use the same to deserialize them:
import json
# when writing the lists
f.write(json.dumps(temp_list) + "\n")
# when reading
lst = json.loads(line)
Use Pickle or JSON to serialize/deserialize your data
If you absolutely need to do it your way, you can use ast.literal_eval. You can get some help here.
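
A runnable sketch of both options, with io.StringIO standing in for user_interest.txt. The two short lists are sample values taken from the question, not real data.

```python
import ast
import io
import json

lists = [['4bf58dd8d48988d1ce941735', '4bf58dd8d48988d157941735'],
         ['4bf58dd8d48988d16a941735']]

# Write one JSON array per line (StringIO stands in for the real file).
buffer = io.StringIO()
for temp_list in lists:
    buffer.write(json.dumps(temp_list) + "\n")

# Read back line by line; each line becomes a list again.
buffer.seek(0)
restored = [json.loads(line) for line in buffer]
print(restored == lists)

# For a file already written with str(temp_list), ast.literal_eval parses the line.
legacy_line = str(lists[0]) + "\n"
print(ast.literal_eval(legacy_line.strip()) == lists[0])
```

In the real loop, the parsed list can be handed straight to similarity_score without being stored anywhere.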

How to take floats from a txt to a Python list as strings

I am very new at programming. I have the following problem.
I want to take some floats from a .txt file, and add them to a Python list as strings, with a comma between them, like this:
.TXT:
194220.00 38.4397984 S 061.1720742 W 0.035
194315.00 38.4398243 S 061.1721378 W 0.036
Python:
myList = ('38.4397984,061.1720742','38.4398243,061.1721378')
Does anybody know how to do this? Thank you!
There are three key pieces you'll need to do this. You'll need to know how to open files, how to iterate through the lines while the file is open, and how to split each line into its parts.
Once you know all these things, it's as simple as concatenating the pieces you want and adding them to your list.
my_list = []
with open('path/to/my/file.txt') as f:
    for line in f:
        words = line.split()
        # join the second and fourth columns with a comma between them
        my_list.append(words[1] + ',' + words[3])
print(my_list)
Python has a built-in function open(fileName, mode) that returns a file object.
fileName is a string with the name of the file.
mode is another string that states how the file will be used, e.g. 'r' for reading and 'w' for writing.
f = open('file.txt', 'r')
This creates a file object in the variable f. f now has several methods you can use to read the data in the file. The most common is f.read(size), where size is optional.
text = f.read()
This saves the entire content of the file in the variable text.
Now you want to split the string. A string is an object with a method called split() that creates a list of words from the string, splitting on whitespace.
myList = text.split()
In your code you gave us a tuple, which, judging from the variable name, I am not sure is what you were looking for. Make sure to read up on the difference between a tuple and a list. The procedure to create a tuple is a bit different.
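
A minimal sketch that puts the pieces together on the two sample lines from the question, producing both a list and a tuple. The function name read_coordinates is hypothetical, and the column positions (second and fourth) are assumed from the sample.

```python
def read_coordinates(lines):
    """Return 'lat,lon' strings built from the 2nd and 4th columns of each line."""
    pairs = []
    for line in lines:
        words = line.split()
        if len(words) < 4:
            continue  # skip blank or malformed lines
        pairs.append(words[1] + "," + words[3])
    return pairs

sample = [
    "194220.00 38.4397984 S 061.1720742 W 0.035\n",
    "194315.00 38.4398243 S 061.1721378 W 0.036\n",
]
my_list = read_coordinates(sample)   # a list of strings
my_tuple = tuple(my_list)            # the same values as a tuple
print(my_list)
```

Because the values stay strings, the leading zero in 061.1720742 is preserved.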
