create a list of tuples from csv file - python

I am a Python beginner struggling to create and save a list of tuples from a CSV file in Python.
The code I got for now is:
def load_file(filename):
    fp = open(filename, 'Ur')
    data_list = []
    for line in fp:
        data_list.append(line.strip().split(','))
    fp.close()
    return data_list
and then I would like to save the file
def save_file(filename, data_list):
    fp = open(filename, 'w')
    for line in data_list:
        fp.write(','.join(line) + '\n')
    fp.close()
Unfortunately, my code returns a list of lists, not a list of tuples... Is there a way to create one list containing multiple tuples without using the csv module?

split() returns a list; if you want a tuple, convert it to one:
data_list.append(tuple(line.strip().split(',')))
Please use the csv module.
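A minimal sketch of that approach (assuming Python 3, where open() takes a newline argument):

import csv

def load_file(filename):
    # csv.reader handles quoting and embedded commas, which a plain
    # split(',') does not; each row comes back as a list we turn into a tuple
    with open(filename, newline='') as fp:
        return [tuple(row) for row in csv.reader(fp)]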

First question: why is a list of lists bad? In the sense of "duck typing", it should work just as well, so maybe think about that again.
If you really need a list of tuples - only small changes are needed.
Change the line
data_list.append(line.strip().split(','))
to
data_list.append(tuple(line.strip().split(',')))
That's it.
If you ever want to get rid of custom code (less code is better code), you could stick to the csv module. I'd strongly recommend using as many library methods as possible.
To show off some advanced Python features, your load_file method could also look like this:
def load_file(filename):
    with open(filename, 'Ur') as fp:
        data_list = [tuple(line.strip().split(',')) for line in fp]
    return data_list
I use a list comprehension here; it's very concise and easy to understand.
Additionally, I use the with statement, which closes your file pointer even if an exception occurs within your code. Please always use with when working with external resources such as files.
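Along the same lines, the save side can also lean on the library. A sketch of a csv-based save_file (again assuming Python 3):

import csv

def save_file(filename, data_list):
    # writerows emits one CSV row per tuple/list in data_list
    with open(filename, 'w', newline='') as fp:
        csv.writer(fp).writerows(data_list)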

Just wrap tuple() around line.strip().split(',') and you'll get a list of tuples.

Related

I want to write a list with brackets into my file in python

I want to generate a log file in which I have to print two lists for each of about 50 input files, so there are approximately 100 lists reported in the log file. I tried using pickle.dump, but it adds some strange characters at the beginning of each value. Also, it writes each value on a different line, and the enclosing brackets are not shown.
Here is a sample output from a test code.
import pickle
x = [1, 2, 3, 4]
fp = open('log.csv', 'w')
pickle.dump(x, fp)
fp.close()
output:
I want my log file to report:
list 1 is: [1,2,3,4]
If you want your log file to be readable, you are approaching it the wrong way by using pickle, which "implements binary protocols", i.e. its output is not human-readable.
To get what you want, replace the line
pickle.dump(x,fp)
with
fp.write('list 1 is: ')
fp.write(str(x))
This requires minimal change to the rest of your code. However, good practice would be to restructure it in a cleaner style.
pickle is for storing objects in a form which you could use to recreate the original object. If all you want to do is create a log message, the builtin __str__ method is sufficient.
x = [1, 2, 3, 4]
with open('log.csv', 'w') as fp:
    # Python 3 syntax (or add: from __future__ import print_function)
    print('list 1 is: {}'.format(x), file=fp)
Python's pickle is used to serialize objects: basically, a way for an object and its hierarchy to be stored on your computer for later use.
If your goal is to write data to a csv, then read the csv file and output what you read inside of it, then read below.
Writing To A CSV File
import csv

my_list = [1, 2, 3, 4]  # avoid naming a variable "list": that shadows the builtin
myFile = open('yourFile.csv', 'w', newline='')
writer = csv.writer(myFile)
writer.writerow(my_list)
myFile.close()
The function writerow() writes each element of an iterable (each element of the list, in your case) into its own column of a single row. You can run through each of your lists and write each one to its own row this way. If you want to write multiple rows at once, check out writerows(), sketched below.
There is no separate save step: the data is written to the file directly, and flushed for certain once the file is closed.
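For instance, a short sketch of writerows() with two hypothetical lists:

import csv

rows = [[1, 2, 3, 4], [5, 6, 7, 8]]  # hypothetical data
with open('yourFile.csv', 'w', newline='') as myFile:
    csv.writer(myFile).writerows(rows)  # each inner list becomes one row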
Reading A CSV File
import csv

with open('example.csv', newline='') as File:
    reader = csv.reader(File)
    for row in reader:
        print(row)
This will run through all the rows in your csv file and print each one to the console.

How to read each line of a file to a separate list to process them individually

There are already several questions on similar topics, but none of them solves mine.
I've written multiple lists to a text file. There, every line represents a list. Looks like this:
1: ['4bf58dd8d48988d1ce941735', '4bf58dd8d48988d157941735', '4bf58dd8d48988d1f1931735', etc.]
2: ['4bf58dd8d48988d16a941735', '4bf58dd8d48988d1f6941735', '4bf58dd8d48988d143941735', etc.]
...
I created it with:
with open('user_interest.txt', 'w') as f:
    for x in range(1, 1084):
        temp = df.get_group(x)
        temp_list = temp['CategoryID'].tolist()
        f.write(str(temp_list) + "\n")
If I read the file, I get the whole file as a list, and when I access the lines, each one is a string! But I want them as lists again, like before I stored them.
with open('user_interest.txt', 'r') as file:
    for line in file:
        #temp_list.append(line)
        print(similarity_score(user_1_list, temp_list))
line is a string here, not the list I wanted. The idea with temp_list doesn't really work either.
(user_1_list is a fixed value, while temp_list is not.)
Here's the context of the question: I want every line to be processed by my similarity_score function. I don't need the lists "forever", just to hand each one over to my function. This function should be applied to every line.
The function calculates cosine similarity and I have to find top 10 most similar users to a given user. So I have to compare each other user with my given user (user_1_list).
Pseudo code:
read line
convert line to a list
give list to my function
read next line ...
Probably it's just an easy fix, but I don't get it yet. I don't want each line wrapped in a nested list like
[['foo', 'bar', ...]]
nor do I want them all in one single list.
Thanks for any help and just ask if you need more information!
You should use a proper serializer like JSON to write your lists. Then, you can use the same to deserialize them:
import json
# when writing the lists
f.write(json.dumps(temp_list) + "\n")
# when reading
lst = json.loads(line)
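Plugged into the loop from the question (similarity_score and user_1_list are the asker's names, assumed to be defined elsewhere), that might look like:

import json

with open('user_interest.txt') as file:
    for line in file:
        temp_list = json.loads(line)  # each line parses back into a list
        print(similarity_score(user_1_list, temp_list))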
Use Pickle or JSON to serialize/deserialize your data
If you absolutely need to do it your way, you can use ast.literal_eval.
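A small sketch of the literal_eval route, using a made-up line in the question's format:

import ast

line = "['4bf58dd8d48988d1ce941735', '4bf58dd8d48988d157941735']"
temp_list = ast.literal_eval(line)  # safely parses the Python literal back into a list
print(temp_list[0])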

Dictionary to string not being read back as a dictionary

Since the JSON and pickle methods aren't working out, I've decided to save my dictionaries as strings; writing works, but they aren't being read back.
For example, the dictionary:
a = {'name': 'joe'}
Save:
file = open("save.txt", "w")
file.write(str(a))
file.close()
And that works.
But my load method doesn't read it.
Load:
f = open("save.txt", "r")
a = f
f.close()
So a just ends up as the file object, not my dictionary.
I really don't want to use json or pickle, is there any way I could get this method working?
First, you're not actually reading anything from the file (the file object is not its contents). Second, once you fix that, you're going to get a string and will need to transform it into a dictionary.
Fortunately, both are straightforward to address:
from ast import literal_eval

with open("save.txt") as infile:
    data = literal_eval(infile.read())
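A complete round trip with that approach might look like this sketch:

from ast import literal_eval

a = {'name': 'joe'}

# save: the str()/repr() of a dict is a valid Python literal
with open("save.txt", "w") as outfile:
    outfile.write(str(a))

# load: read the text back and evaluate it as a literal
with open("save.txt") as infile:
    a = literal_eval(infile.read())

print(a['name'])  # joe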

Saving an Element in an Array Permanently

I am wondering if it is possible to do what is explained in the title in Python. Let me explain myself better. Say you have an array:
list = []
You then have a function that takes a user's input as a string and appends it to the array:
def appendStr(list):
    str = raw_input("Type in a string.")
    list.append(str)
I would like to know if it's possible to save the changes the user made in the list even after the program has closed. So if the user closed the program, opened it again, and printed the list the strings he/she added would appear. Thank you for your time. This may be a duplicate question and if so I'm sorry, but I couldn't find another question like this for Python.
A simpler solution would be to use json:
import json

li = []

def getinput(li):
    li.append(raw_input("Type in a string: "))
To save the list you would do the following
savefile = open("backup.json", "w")  # open(), not the old file() builtin
savefile.write(json.dumps(li))
savefile.close()
And to load the file you simply do
savefile = open("backup.json")
li = json.loads(savefile.read())
You may want to handle the case where the file does not exist (see the sketch below). One thing to note is that complex structures like class instances cannot be stored as JSON directly.
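One way to handle the missing file, as a sketch (os.path.exists keeps it compatible with the question's Python 2 code):

import json
import os

if os.path.exists("backup.json"):
    with open("backup.json") as savefile:
        li = json.loads(savefile.read())
else:
    li = []  # no backup yet, start with an empty list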
You will have to save it into a file:
Writing to a file
with open('output.txt', 'w') as f:
    for item in lst:  # note: don't name your variable "list"; that shadows the builtin type
        f.write(str(item) + '\n')
Reading from a file (at the start of the program)
with open('output.txt') as f:
    lst = f.read().splitlines()  # splitlines() avoids the trailing empty string split('\n') leaves
If the element is a string, writing it to a file as suggested is a good way to go.
But if the element is not a string, "pickling" might be the keyword you are looking for.
Documentation is here:
https://docs.python.org/2/library/pickle.html
It seems to me this post answers your question:
Saving and loading multiple objects in pickle file?
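A minimal pickling sketch for non-string elements (the filename and data are made up):

import pickle

items = [1, 'two', {'three': 3}]  # hypothetical mixed-type list

# pickle requires binary mode
with open('backup.pkl', 'wb') as f:
    pickle.dump(items, f)

with open('backup.pkl', 'rb') as f:
    items = pickle.load(f)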

how to compare values in an existing dictionary and update the dictionary back to a file?

I am making a utility of sorts with a dictionary. What I am trying to achieve is this:
for each XML file that I parse, the existing dictionary is loaded from a file (output.dict), compared/updated for the current key, and stored back along with the existing values. I tried has_key(), but I get an AttributeError; it does not work.
Since I am trying one file at a time, it creates multiple dictionaries, and I am unable to compare them. This is where I am stuck.
def createUpdateDictionary(servicename, xmlfile):
    dictionary = {}
    if path.isfile == 'output.dict':
        dictionary.update (eval(open('output.dict'),'r'))

    for event, element in etree.iterparse(xmlfile):
        dictionary.setdefault(servicename, []).append(element.tag)

    f = open('output.dict', 'a')
    write_dict = str(dictionary2)
    f.write(write_dict)
    f.close()
(Here servicename is just a split('.') of xmlfile, which forms the key; the values are nothing but the elements' tag names.)
def createUpdateDictionary(servicename, xmlfile):
    dictionary = {}
    if path.isfile == 'output.dict':
        dictionary.update (eval(open('output.dict'),'r'))
There is a typo: the 'r' argument belongs to open(), not eval(). Furthermore, you cannot evaluate a file object as returned by open(); you have to read() the contents first.
f = open('output.dict', 'a')
write_dict = str(dictionary2)
f.write(write_dict)
f.close()
Here, you are appending the string representation to the file. The string representation is not guaranteed to represent the dictionary completely. It is meant to be readable by humans to allow inspection, not to persist the data.
Moreover, since you are using 'a' to append the data, you are storing multiple copies of the updated dictionary in the file. Your file might look like:
{}{"foo": []}{"foo": [], "bar":[]}
This is clearly not what you want; you won't even be able to eval() it later (syntax error!).
Since eval() will execute arbitrary Python code, it is considered evil and you really should not use it for object serialization. Either use pickle, which is the standard way of serialization in Python, or use json, which is a human-readable standard format supported by other languages as well.
import json

def createUpdateDictionary(servicename, xmlfile):
    with open('output.dict', 'r') as fp:
        dictionary = json.load(fp)

    # ... process XML, update dictionary ...

    with open('output.dict', 'w') as fp:
        json.dump(dictionary, fp)
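For completeness, a fuller sketch that also guards against a missing file; os.path.isfile replaces the broken comparison from the question, and etree is assumed to be xml.etree.ElementTree (or lxml.etree), as in the question:

import json
import os
from xml.etree import ElementTree as etree

def createUpdateDictionary(servicename, xmlfile):
    dictionary = {}
    if os.path.isfile('output.dict'):  # call isfile(), don't compare it to a string
        with open('output.dict', 'r') as fp:
            dictionary = json.load(fp)

    for event, element in etree.iterparse(xmlfile):
        dictionary.setdefault(servicename, []).append(element.tag)

    with open('output.dict', 'w') as fp:  # 'w' overwrites instead of appending copies
        json.dump(dictionary, fp)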
