Python: filling a List of objects from a .txt file

For starters I've programmed in C++ for the past year and a half, and this is the first time I'm using Python.
The objects have two int attributes, say i_ and j_.
The text file is as follows:
1,0
2,0
3,1
4,0
...
What I want to do is have the list filled with objects with correct attributes. For example,
print(myList[2].i_, myList[2].j_, end = ' ')
would return
3 1
Here's my attempt after reading a little online.
class myClass:
    def __init__(self, i, j):
        self.i_ = i
        self.j_ = j

with open("myFile.txt") as f:
    myList = [list(map(int, line.strip().split(','))) for line in f]
    for line in f:
        i = 0
        while (i < 28):
            myList.append(myClass(line.split(","), line.split(",")))
            i +=1
But it doesn't work obviously.
Thanks in advance!

Since you're working with a CSV file you might want to use the csv module. First you would pass the file object to the csv.reader function and it will return an iterable of rows from the file. From there you can cast it to a list and slice it to the 29 rows you are required to have. Finally, you can iterate over the rows (e.g. [1,0]) and simply unpack them in the class constructor.
import csv

class MyClass:
    def __init__(self, i, j):
        self.i = int(i)
        self.j = int(j)

    def __repr__(self):
        return f"MyClass(i={self.i}, j={self.j})"

with open('test.txt', newline='') as f:
    # csv.reader yields each row as a list of strings, e.g. ['1', '0']
    rows = list(csv.reader(f))[:29]

my_list = [MyClass(*row) for row in rows]
for obj in my_list:
    print(obj.i, obj.j)
print(len(my_list))

I'm not sure you really want to stick with this format:
print(myList[2].i_, myList[2].j_, end = ' ')
My solution is coded fairly manually, and I am using a dictionary to store i and j:
result = {'i':[],
'j':[]}
Below is my code:
result = {'i': [],
          'j': []}
with open('a.txt', 'r') as myfile:
    data = myfile.read().replace('\n', ',')
print(data)
a = data.split(",")
print(a)
b = [x for x in a if x]
print(b)
for i in range(0, len(b)):
    if i % 2 == 0:
        result['i'].append(b[i])
    else:
        result['j'].append(b[i])
print(result['i'])
print(result['j'])
print(str(result['i'][2]) + "," + str(result['j'][2]))
The result: 3,1

I'm not sure what you're trying to do with myList = [list(map(int, line.strip().split(','))) for line in f]. This will give you a list of lists with those pairs converted to ints. But you really want objects from those numbers. So let's do that directly as we iterate through the lines in the file and do away with the next while loop:
my_list = []
with open("myFile.txt") as f:
    for line in f:
        nums = [int(i) for i in line.strip().split(',') if i]
        if len(nums) >= 2:
            my_list.append(myClass(nums[0], nums[1]))
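
For anyone coming from C++, the same idea can be wrapped in a small helper. This is only a sketch: Pair, load_pairs and the in-memory sample text are hypothetical names standing in for the asker's class and file, and it assumes every non-blank line holds exactly two comma-separated integers.

```python
import io


class Pair:
    """Hypothetical stand-in for the asker's class with two int attributes."""

    def __init__(self, i, j):
        self.i_ = i
        self.j_ = j


def load_pairs(lines):
    """Build one Pair per non-blank line of the form 'i,j'."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        i, j = line.split(",")
        pairs.append(Pair(int(i), int(j)))
    return pairs


# io.StringIO stands in for open("myFile.txt"); both yield lines when iterated.
sample = io.StringIO("1,0\n2,0\n3,1\n4,0\n")
my_list = load_pairs(sample)
print(my_list[2].i_, my_list[2].j_)  # prints: 3 1
```

Passing any iterable of lines (an open file, a list, a StringIO) keeps the parsing separate from where the text comes from, which makes it easy to try out without a file on disk.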

Related

Increment 2 different parameters separately in a for loop

I have the following code. The file file.txt contains a list of variables. Some of them should be str type and others should be int type.
var = [None] * 3
j = 0
with open("file.txt", "r") as f:
    content = f.readline().split(";")
for i in range(2, 5):
    var[j] = int(content[i])
    j += 1
Instead of incrementing j manually, I'd like to do it in a cleaner way (e.g. within the 'instructions' of the for loop, or something like that).
What would be a shorter/better way to handle this task?

You can use enumerate:
for j, i in enumerate(range(2, 5)):
    var[j] = int(content[i])
Also, you don't need to initialize var at all - just use a list comprehension:
var = [int(content[i]) for i in range(2, 5)]
Another approach (may be less Pythonic/less efficient/less readable):
You can zip two ranges together:
for j, i in zip(range(len(range(2, 5))), range(2, 5)):
    var[j] = int(content[i])
You know that the second range is range(2, 5) and want the first range to run from zero up to (but not including) len(range(2, 5)) - that's range(len(range(2, 5))).

The idiomatic way to count the current iteration index is by using enumerate:
for j, i in enumerate(range(2, 5)):
    var[j] = int(content[i])
(There's no need to initialize j = 0 in this case.)
However, your example code would usually just be written as:
with open("file.txt", "r") as f:
    content = f.readline().split(";")
var = [int(x) for x in content[2:5]]
which uses language features such as
a slice ([2:5]) to select a part of a list
a list comprehension to create a new list from an input sequence
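
To see the two approaches side by side, here is a small sketch. The line contents are made up for illustration (the original file.txt is not shown), and it assumes fields 2 to 4 are the ones that should become int while the others stay str.

```python
# A made-up line in the same ';'-separated layout as the question's file.
line = "alice;bob;10;20;30;carol\n"
content = line.strip().split(";")

# enumerate pairs a running counter with each item, so no manual j is needed.
var = [None] * 3
for j, text in enumerate(content[2:5]):
    var[j] = int(text)

# The slice plus a comprehension builds the same list in one step.
numbers = [int(x) for x in content[2:5]]
names = content[:2] + content[5:]  # fields that stay as str

print(var)      # [10, 20, 30]
print(numbers)  # [10, 20, 30]
print(names)    # ['alice', 'bob', 'carol']
```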

How to separate different input formats from the same text file with Python

I'm new to programming and Python, and I'm looking for a way to distinguish between two input formats in the same input text file. For example, let's say I have an input file like this, where values are comma-separated:
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
The format is N followed by N lines of Data1, then M followed by M lines of Data2. I tried opening the file, reading it line by line and storing it in one single list, but I'm not sure how to go about producing 2 lists for Data1 and Data2, such that I would get:
Data1 = ["Washington,A,10", "New York,B,20", "Seattle,C,30", "Boston,B,20", "Atlanta,D,50"]
Data2 = ["New York,5", "Boston,10"]
My initial idea was to iterate through the list until I found an integer i, remove the integer from the list and continue for the next i iterations all while storing the subsequent values in a separate list, until I found the next integer and then repeat. However, this would destroy my initial list. Is there a better way to separate the two data formats in different lists?

You could use itertools.islice and a list comprehension:
from itertools import islice
string = """
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
"""
result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
          for parts in [string.split("\n")]
          for idx, line in enumerate(parts)
          if line.isdigit()]
print(result)
This yields
[['Washington,A,10', 'New York,B,20', 'Seattle,C,30', 'Boston,B,20', 'Atlanta,D,50'], ['New York,5', 'Boston,10']]
For a file, you need to change it to:
with open("testfile.txt", "r") as f:
    result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
              for parts in [f.read().split("\n")]
              for idx, line in enumerate(parts)
              if line.isdigit()]
print(result)

You're definitely on the right track.
If you want to preserve the original list here, you don't actually have to remove integer i; you can just go on to the next item.
Code:
originalData = []
formattedData = []
with open("data.txt", "r") as f:
    f = list(f)
    originalData = f
    i = 0
    while i < len(f):  # Iterate through every line
        try:
            n = int(f[i])  # See if line can be cast to an integer
            originalData[i] = n  # Change string to int in original
            formattedData.append([])
            for j in range(n):
                i += 1
                item = f[i].replace('\n', '')
                originalData[i] = item  # Remove newline char in original
                formattedData[-1].append(item)
        except ValueError:
            print("File has incorrect format")
        i += 1
print(originalData)
print(formattedData)

The following code will produce a list results which is equal to [Data1, Data2].
The code assumes that each count matches the number of entries that actually follow it. That means it will not work for a file like this, where the count is 2 but three entries follow:
2
New York,5
Boston,10
Seattle,30
The code:
# get the data from the text file
with open('filename.txt', 'r') as file:
    lines = file.read().splitlines()
results = []
index = 0
while index < len(lines):
    # Find the start and end values.
    start = index + 1
    end = start + int(lines[index])
    # Everything from the start up to and excluding the end index gets added
    results.append(lines[start:end])
    # Update the index
    index = end
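
A variation on the same idea, in case it helps: instead of tracking indexes, you can walk a single iterator and let itertools.islice pull the next N lines after each count. This is just a sketch; read_blocks is a hypothetical helper name, and like the code above it assumes every count is followed by exactly that many lines.

```python
from itertools import islice


def read_blocks(lines):
    """Split count-prefixed text into a list of blocks.

    Assumes each block starts with an integer N followed by exactly N lines.
    """
    it = iter(lines)
    blocks = []
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between blocks
        count = int(header)
        # islice consumes the next `count` lines from the same iterator.
        blocks.append([line.strip() for line in islice(it, count)])
    return blocks


sample = """5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
"""
data1, data2 = read_blocks(sample.splitlines())
print(data1)
print(data2)
```

Because the for loop and islice share one iterator, the lines taken for a block are never seen again as headers, and the original input is left untouched. An open file object can be passed in place of sample.splitlines().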

Removing quotes from 2D array python

I am currently trying to write code that compares powers with big exponents without calculating them, using their logarithms instead. I have a file containing 1000 lines. Each line contains two integers separated by a comma. I got stuck at the point where I tried to remove the quotes from the array. I tried many ways, none of which worked. Here is my code:
The function split() from myLib takes two arguments: a list, and the number of elements each smaller list should contain. It splits the original list accordingly and appends the smaller lists to a new one.
import math
import myLib
i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
fArr = myLib.split(fArr, 1)
# place to get rid of quotes
print(fArr)
while i < len(fArr):
    cmpr = int(fArr[i][1]) * math.log(int(fArr[i][0]))
    if cmpr > record:
        record = cmpr
        print(record)
    i = i + 1
This is what my array looks like:
[['519432,525806\n'], ['632382,518061\n'], ... ['172115,573985\n'], ['13846,725685\n']]
I tried to find a way around the 2d array and tried:
i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
#fArr = myLib.split(fArr, 1)
fArr = [x.replace("'", '') for x in fArr]
print(fArr)
while i < len(fArr):
    cmpr = int(fArr[i][1]) * math.log(int(fArr[i][0]))
    if cmpr > record:
        record = cmpr
        print(i)
    i = i + 1
But the output looked like this:
['519432,525806\n', '632382,518061\n', '78864,613712\n', ...
And the numbers in their current state cannot be treated as integers or floats, so this isn't working either:
[int(i) for i in lst]
The expected output for the array itself would look like this, so I can pick one of the numbers and work with it:
[[519432,525806], [632382,518061], [78864,613712]...
I would really appreciate your help since I'm still very new to Python and programming in general.
Thank you for your time.

You can avoid all of your problems by simply using numpy's convenient loadtxt function:
import numpy as np
arr = np.loadtxt('p099_base_exp.txt', delimiter=',')
arr
array([[519432., 525806.],
[632382., 518061.],
[ 78864., 613712.],
...,
[325361., 545187.],
[172115., 573985.],
[ 13846., 725685.]])
If you need a one-dimensional array:
arr.flatten()
# array([519432., 525806., 632382., ..., 573985., 13846., 725685.])

This is your missing piece:
fArr = [[int(num) for num in line.rstrip("\n").split(",")] for line in fArr]
Here, rstrip("\n") removes the trailing \n character from the line, and the string is then split on the comma, so each line becomes a list whose elements are the numbers from that line, still as strings. Calling int() on each element then converts them to the int data type.
The code below should do the job if you don't want to import an additional library.
import math
i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
fArr = [[int(num) for num in line.rstrip("\n").split(",")] for line in fArr]
print(fArr)
while i < len(fArr):
    cmpr = fArr[i][1] * math.log(fArr[i][0])
    if cmpr > record:
        record = cmpr
        print(i)
    i = i + 1
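
Once the lines are parsed into integer pairs, the while loop can also be replaced by max() with a key function. A rough sketch, using only three sample lines from the question rather than the full file; log_size is a hypothetical helper name.

```python
import math

# A few lines in the same "base,exponent" format as the question's file.
lines = ["519432,525806\n", "632382,518061\n", "78864,613712\n"]

pairs = [[int(num) for num in line.strip().split(",")] for line in lines]


def log_size(pair):
    """Return exponent * log(base), which orders pairs like base ** exponent."""
    base, exp = pair
    return exp * math.log(base)


# enumerate(..., start=1) gives 1-based line numbers; max picks the largest.
best_line, best_pair = max(enumerate(pairs, start=1),
                           key=lambda item: log_size(item[1]))
print(best_line, best_pair)
```

This works because log is monotonic: comparing exp * log(base) gives the same ordering as comparing base ** exp, without ever building the huge numbers.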

This snippet will transform your array to 1D array of integers:
from itertools import chain
arr = [['519432,525806\n'], ['632382,518061\n']]
new_arr = [int(i.strip()) for i in chain.from_iterable(i[0].split(',') for i in arr)]
print(new_arr)
Prints:
[519432, 525806, 632382, 518061]
For 2D output you can use this:
arr = [['519432,525806\n'], ['632382,518061\n']]
new_arr = [[int(i) for i in v] for v in (i[0].split(',') for i in arr)]
print(new_arr)
This prints:
[[519432, 525806], [632382, 518061]]

new_list = []
a = ['519432,525806\n', '632382,518061\n', '78864,613712\n']
for i in a:
    new_list.append(list(map(int, i.split(","))))
print(new_list)
Output:
[[519432, 525806], [632382, 518061], [78864, 613712]]
To flatten new_list:
from functools import reduce
flat_list = reduce(lambda x, y: x + y, new_list)
print(flat_list)
Output:
[519432, 525806, 632382, 518061, 78864, 613712]

Printing to a .csv file from a Random List

When I create a random list of numbers like so:
import random
import sys
columns = 10
rows = 10
for x in range(rows):
    a_list = []
    for i in range(columns):
        a_list.append(str(random.randint(1000000,99999999)))
    values = ",".join(str(i) for i in a_list)
    print(values)
then all is well.
But when I attempt to send the output to a file, like so:
sys.stdout = open('random_num.csv', 'w')
for i in a_list:
    print(", ".join(map(str, a_list)))
only the last row is output, 10 times. How do I write the entire list to a .csv file?

In your first example, you're creating a new list for every row. (By the way, you don't need to convert them to strs twice).
In your second example, you print the last list you had created previously. Move the output into the first loop:
import random
columns = 10
rows = 10
with open("random_num.csv", "w") as outfile:
    for x in range(rows):
        a_list = [random.randint(1000000,99999999) for i in range(columns)]
        values = ",".join(str(i) for i in a_list)
        print(values)
        outfile.write(values + "\n")

Tim's answer works well, but I think you are trying to print to terminal and the file in different places.
So with minimal modifications to your code, you can use a new variable all_list
import random
import sys
all_list = []
columns = 10
rows = 10
for x in range(rows):
    a_list = []
    for i in range(columns):
        a_list.append(str(random.randint(1000000,99999999)))
    values = ",".join(str(i) for i in a_list)
    print(values)
    all_list.append(a_list)
sys.stdout = open('random_num.csv', 'w')
for a_list in all_list:
    print(", ".join(map(str, a_list)))

The csv module takes care of a bunch of the crap needed for dealing with csv files.
As you can see below, you don't need to worry about conversion to strings or adding line-endings.
import csv
import random
columns = 10
rows = 10
# In Python 3, open the file in text mode with newline="" for the csv module.
with open("random_num.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile)
    for x in range(rows):
        a_list = [random.randint(1000000,99999999) for i in range(columns)]
        writer.writerow(a_list)
Sum of all numbers in a file

I have been fiddling around with this code for ages and cannot figure out how to make it pass the doctests. The output is always 1000 less than the correct answer. Is there a simple way to change this code so that it gives the desired output?
my code is:
def sum_numbers_in_file(filename):
    """
    Return the sum of the numbers in the given file (which only contains
    integers separated by whitespace).
    >>> sum_numbers_in_file("numb.txt")
    19138
    """
    f = open(filename)
    m = f.readline()
    n = sum([sum([int(x) for x in line.split()]) for line in f])
    f.close()
    return n
the values in the file are:
1000
15000
2000
1138

The culprit is:
m = f.readline()
When you call f.readline(), it consumes the first line (1000), so that value is never seen by the list comprehension that iterates over the rest of the file. Hence the error.
This should work:
def sum_numbers_in_file(filename):
    """
    Return the sum of the numbers in the given file (which only contains
    integers separated by whitespace).
    >>> sum_numbers_in_file("numb.txt")
    19138
    """
    f = open(filename)  # read-only is enough; nothing is written
    m = f.readlines()
    n = sum([sum([int(x) for x in line.split()]) for line in m])
    f.close()
    return n

You pull out the first line and store it in m. Then never use it.
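
You can see the effect without a file on disk. In this small sketch io.StringIO stands in for the opened numb.txt (it behaves like a text file for reading), and first and rest are names made up for the demonstration.

```python
import io

# io.StringIO behaves like an open text file for reading purposes.
f = io.StringIO("1000\n15000\n2000\n1138\n")

first = f.readline()  # consumes "1000\n" and advances the file position
rest = sum(int(x) for line in f for x in line.split())

print(repr(first))         # '1000\n'
print(rest)                # 18138
print(rest + int(first))   # 19138
```

Iterating over f after readline() starts at the second line, which is exactly the missing 1000.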

You could use two for-loops in one generator expression:
def sum_numbers_in_file(filename):
    """
    Return the sum of the numbers in the given file (which only contains
    integers separated by whitespace).
    >>> sum_numbers_in_file("numb.txt")
    19138
    """
    with open(filename) as f:
        return sum(int(x)
                   for line in f
                   for x in line.split())
The generator expression above is equivalent to
result = []
for line in f:
    for x in line.split():
        result.append(int(x))
return sum(result)
