How to output integer arrays to a file in Python?

I have an array of 3,000,000 ints which I want to output to a file. How can I do that?
Also, is this
for i in range(1000):
    for k in range(1000):
        (r, g, b) = rgb_im.getpixel((i, k))
        rr.append(r)
        gg.append(g)
        bb.append(b)
d.extend(rr)
d.extend(gg)
d.extend(bb)
a good practice for joining arrays together?
All of the arrays are declared like this: d = array('B')
EDIT:
Managed to output all ints delimited by ' ' with this:
from PIL import Image
import array
side = 500
for j in range(1000):
    im = Image.open(r'C:\Users\Ivars\Desktop\RS\Shape\%02d.jpg' % (j))
    rgb_im = im.convert('RGB')
    d = array.array('B')
    rr = array.array('B')
    gg = array.array('B')
    bb = array.array('B')
    f = open(r'C:\Users\Ivars\Desktop\RS\ShapeData\%02d.txt' % (j), 'w')
    for i in range(side):
        for k in range(side):
            (r, g, b) = rgb_im.getpixel((i, k))
            rr.append(r)
            gg.append(g)
            bb.append(b)
    d.extend(rr)
    d.extend(gg)
    d.extend(bb)
    o = ' '.join(str(t) for t in d)
    print('#', j, ' - ', len(o))
    f.write(o)
    f.close()

If you're using Python >= 2.6, you can use the print function from the future!
from __future__ import print_function

# your code
# This will print a string representation of the array to the file.
# If you need it formatted differently, you'll have to construct the string yourself.
print(d, file=open('/path/to/file.txt', 'w'))

# You can join the items with a space to get just the numbers:
print(' '.join(str(t) for t in d), file=open('/path/to/file.txt', 'w'))
This has the side effect of turning print from a statement into a function, so you'll have to wrap whatever you want printed in ()

You want tofile(), which requires you to open a file object. See https://docs.python.org/2/library/array.html and https://docs.python.org/2/library/stdtypes.html#bltin-file-objects. Also, have you considered using NumPy?
import array

a = array.array('B')
b = array.array('B')
a.append(3)
a.append(4)
print a
print b
with open('c:/test.dat', 'wb') as f:  # tofile() writes raw bytes, so open in binary mode
    a.tofile(f)
with open('c:/test.dat', 'rb') as f:
    b.fromfile(f, 2)
print b
EDIT: Based on your edit, you can use numpy with PIL and generate the array in a line or two, without looping. See, e.g., Conversion between Pillow Image object and numpy array changes dimension for example code.
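To give the idea, here is a rough sketch (the file names are placeholders, and the within-channel ordering is row-major rather than the x-then-y order of the getpixel loop):
from PIL import Image
import numpy as np

im = Image.open('shape.jpg').convert('RGB')
a = np.asarray(im)                    # shape (height, width, 3), dtype uint8
flat = a.transpose(2, 0, 1).ravel()   # all R values, then all G, then all B
with open('shape.txt', 'w') as f:
    f.write(' '.join(str(v) for v in flat))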

Related

Removing quotes from 2D array python

I am currently trying to execute code that evaluates powers with big exponents without calculating them, by working with their logs instead. I have a file containing 1000 lines. Each line contains two integers separated by a comma. I got stuck at the point where I tried to remove the quotes from the array. I tried many ways, none of which worked. Here is my code:
The function split() from myLib takes two arguments: a list, and how many elements to split the original list into. It does so and appends the smaller lists to the new one.
import math
import myLib

i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
fArr = myLib.split(fArr, 1)
# place to get rid of quotes
print(fArr)
while i < len(fArr):
    cmpr = int(fArr[i][1]) * math.log(int(fArr[i][0]))
    if cmpr > record:
        record = cmpr
        print(record)
    i = i + 1
This is what my array looks like:
[['519432,525806\n'], ['632382,518061\n'], ... ['172115,573985\n'], ['13846,725685\n']]
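For reference, here is a rough guess at what myLib.split(fArr, 1) might look like (the library itself isn't shown), since it produces the nested single-element lists above:
def split(lst, n):
    # Split lst into chunks of n elements and append each chunk to a new list.
    out = []
    for i in range(0, len(lst), n):
        out.append(lst[i:i + n])
    return out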
I also tried to find a way around the 2D array with this:
i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
#fArr = myLib.split(fArr, 1)
fArr = [x.replace("'", '') for x in fArr]
print(fArr)
while i < len(fArr):
    cmpr = int(fArr[i][1]) * math.log(int(fArr[i][0]))
    if cmpr > record:
        record = cmpr
        print(i)
    i = i + 1
But output looked like this:
['519432,525806\n', '632382,518061\n', '78864,613712\n', ...
And the numbers in their current state cannot be converted to integers or floats, so this isn't working either:
[int(i) for i in lst]
The expected output for the array itself would look like this, so I can pick one of the numbers and work with it:
[[519432,525806], [632382,518061], [78864,613712]...
I would really appreciate your help since I'm still very new to Python and programming in general.
Thank you for your time.
You can avoid all of your problems by simply using numpy's convenient loadtxt function:
import numpy as np
arr = np.loadtxt('p099_base_exp.txt', delimiter=',')
arr
array([[519432., 525806.],
[632382., 518061.],
[ 78864., 613712.],
...,
[325361., 545187.],
[172115., 573985.],
[ 13846., 725685.]])
If you need a one-dimensional array:
arr.flatten()
# array([519432., 525806., 632382., ..., 573985., 13846., 725685.])
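As a follow-up sketch (not part of the original answer), the record computation from the question can then be done without the while loop:
import numpy as np

arr = np.loadtxt('base_exp.txt', delimiter=',')
scores = arr[:, 1] * np.log(arr[:, 0])
print(scores.argmax(), scores.max())  # line index of the record and its value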
This is your missing piece:
fArr = [[int(num) for num in line.rstrip("\n").split(",")] for line in fArr]
Here, rstrip("\n") removes the trailing \n character from the line, and then the string is split on "," so that each line becomes a list whose elements are the numbers from that line, still as strings. Then int() is called on each list element to convert it to the int data type.
The code below should do the job if you don't want to import an additional library.
import math

i = 0
record = 0
cmpr = 0
with open("base_exp.txt", "r") as f:
    fArr = f.readlines()
fArr = [[int(num) for num in line.rstrip("\n").split(",")] for line in fArr]
print(fArr)
while i < len(fArr):
    cmpr = fArr[i][1] * math.log(fArr[i][0])
    if cmpr > record:
        record = cmpr
        print(i)
    i = i + 1
This snippet will transform your array into a 1D array of integers:
from itertools import chain
arr = [['519432,525806\n'], ['632382,518061\n']]
new_arr = [int(i.strip()) for i in chain.from_iterable(i[0].split(',') for i in arr)]
print(new_arr)
Prints:
[519432, 525806, 632382, 518061]
For 2D output you can use this:
arr = [['519432,525806\n'], ['632382,518061\n']]
new_arr = [[int(i) for i in v] for v in (i[0].split(',') for i in arr)]
print(new_arr)
This prints:
[[519432, 525806], [632382, 518061]]
new_list = []
a = ['519432,525806\n', '632382,518061\n', '78864,613712\n']
for i in a:
    new_list.append(list(map(int, i.split(","))))
print(new_list)
Output:
[[519432, 525806], [632382, 518061], [78864, 613712]]
In order to flatten new_list:
from functools import reduce

flat_list = reduce(lambda x, y: x + y, new_list)
print(flat_list)
Output:
[519432, 525806, 632382, 518061, 78864, 613712]

Appending rows to make matrix in python from text file

I have a text file with a matrix in this form: http://textuploader.com/d0qmb
Each integer must occupy its own spot in the matrix. I have written code that lets me print an array for each row of the matrix, but I have no idea how to append the arrays to each other to create the matrix.
import numpy as np

# rows, cols not used in code. Just for info
rows = 9
cols = 93

with open('bob.txt') as f:
    while True:
        i = 0
        str = f.readline()
        str = str.strip()
        d = list(str)
        d = map(int, d)
        if not str:
            break
        print(d)
        i += 1
import numpy as np

array = []
with open('bob.txt', 'r') as f:
    for line in f:
        array.append([int(i) for i in list(line) if i.isdigit()])
numpy_array = np.array(array)
[int(i) for i in list(line) if i.isdigit()] is a list comprehension in Python.
It's roughly the same thing as:
for character in line:
    if character is a digit:
        cast the character to an int and append it to the list
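Written out as ordinary code (a small sketch of the same thing):
row = []
for character in line:
    if character.isdigit():
        row.append(int(character))
# row now holds the digits of this line as ints, ready to append to array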

2d array save python

I have a program that creates a 2D array in Python, but how do I save it as a CSV file? It is:
value_a = int(input("Type in a value for a: "))
value_b = int(input("Now a value for b: "))
value_c = int(input("And a value for c: "))
d = value_a + value_b + value_c
result = [[value_a, value_b, value_c, d]]  # put the initial values into the array
number_of_loops = int(input("type in the number of loops the program must execute: "))

def loops(a, b, c, n):
    global result
    for i in range(n):
        one_loop = []  # assign an empty array for the result of one loop
        temp_a = a
        a = ((a + 1) * 2)  # This adds 1 to a and then multiplies by 2
        one_loop.append(str(a))
        b = b * 2
        one_loop.append(b)
        c = (temp_a + b)
        one_loop.append(c)
        d = a + b + c
        one_loop.append(d)
        result.append(one_loop)
        print(result)

loops(value_a, value_b, value_c, number_of_loops)
print(result)
It prints OK, but how do I save the array as a CSV file?
Use csvwriter.writerows():
import csv

with open(filename, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(result)
If you're able to use third-party libraries and you're going to be working with 2d (or more) arrays in Python, I'd recommend you use a library like numpy or pandas. Numpy includes a method to write out arrays as csv files called savetxt. Good luck!
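For example, a minimal sketch (the output file name is just an example), assuming result is the nested list built in the question:
import numpy as np

np.savetxt('result.csv', np.array(result, dtype=float), delimiter=',', fmt='%g')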
Python comes with CSV writing and reading functionality. See The Python Standard Library » 13.1 csv — CSV File Reading and Writing for fuller documentation, but here is a quick example taken from that page and adapted to your problem:
import csv

with open('eggs.csv', 'w', newline='') as csvfile:  # the docs example uses 'wb', which is for Python 2
    spamwriter = csv.writer(csvfile, delimiter=' ',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    for row in result:
        spamwriter.writerow(row)

How to create a big file quickly with Python

I have the following code for producing a big text file:
import random

d = 3
n = 100000
f = open("input.txt", 'a')
s = ""
for j in range(0, d-1):
    s += str(round(random.uniform(0, 1000), 3)) + " "
s += str(round(random.uniform(0, 1000), 3))
f.write(s)
for i in range(0, n-1):
    s = ""
    for j in range(0, d-1):
        s += str(round(random.uniform(0, 1000), 3)) + " "
    s += str(round(random.uniform(0, 1000), 3))
    f.write("\n" + s)
f.close()
But it seems to be pretty slow, even just to generate 5 GB of this.
How can I make it better? I wish the output to be like:
796.802 691.462 803.664
849.483 201.948 452.155
144.174 526.745 826.565
986.685 238.462 49.885
137.617 416.243 515.474
366.199 687.629 423.929
Well, of course, the whole thing is I/O bound. You can't output the file faster than the storage device can write it. Leaving that aside, there are some optimizations that could be made.
Your method of building up a long string from several shorter strings is suboptimal. You're saying, essentially, s = s1 + s2. When you tell Python to do this, it concatenates two string objects to make a new string object. This is slow, especially when repeated.
A much better way is to collect the individual string objects in a list or other iterable, then use the join method to run them together. For example:
>>> ''.join(['a', 'b', 'c'])
'abc'
>>> ', '.join(['a', 'b', 'c'])
'a, b, c'
Instead of n-1 string concatenations to join n strings, this does the whole thing in one step.
There's also a lot of repeated code that could be combined. Here's a cleaner design, still using the loops.
import random

d = 3
n = 1000
f = open('input.txt', 'w')
for i in range(n):
    nums = []
    for j in range(d):
        nums.append(str(round(random.uniform(0, 1000), 3)))
    s = ' '.join(nums)
    f.write(s)
    f.write('\n')
f.close()
A cleaner, briefer, more Pythonic way is to use a list comprehension:
import random

d = 3
n = 1000
f = open('input.txt', 'w')
for i in range(n):
    nums = [str(round(random.uniform(0, 1000), 3)) for j in range(d)]
    f.write(' '.join(nums))
    f.write('\n')
f.close()
Note that in both cases, I wrote the newline separately. That should be faster than concatenating it to the string, since I/O is buffered anyway. If I were joining a list of strings without separators, I'd just tack on a newline as the last string before joining.
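For instance, a tiny sketch of that variant (the pieces would already carry their own separators where needed):
nums.append('\n')
f.write(''.join(nums))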
As Daniel's answer says, numpy is probably faster, but maybe you don't want to get into numpy yet; it sounds like you're kind of a beginner at this point.
Using numpy is probably faster:
import numpy
d = 3
n = 100000
data = numpy.random.uniform(0, 1000,size=(n,d))
numpy.savetxt("input.txt", data, fmt='%.3f')
This could be a bit faster:
import random

nlines = 100000
col = 3
with open('input.txt', 'w') as f:
    for line in range(nlines):
        f.write('{} {} {}\n'.format(*(round(random.uniform(0, 1000), 3)
                                      for e in range(col))))
or use fixed-width float formatting:
for line in range(nlines):
    numbers = [random.uniform(0, 1000) for e in range(col)]
    f.write('{:6.3f} {:6.3f} {:6.3f}\n'.format(*numbers))
If you want the loop to run for a very long time so the file can grow without any practical limit, you can do it like this:
import random

d = 3
f = open('input.txt', 'w')
for i in range(10**9):
    nums = [str(round(random.uniform(0, 1000), 3)) for j in range(d)]
    f.write(' '.join(nums))
    f.write('\n')
f.close()
The loop will keep writing until it finishes or you press Ctrl-C.

savetxt save only last loop data

Can someone please explain?
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
numpy.savetxt('test.txt',(a,b))
This script saves the data fine. But when I run it through a loop it prints everything, yet it does not save everything. Why?
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
for i,j in zip(a,b):
    print i,j
    numpy.savetxt('test.txt',(i,j))
You overwrite the previous data each time you call numpy.savetxt().
A solution, using a temporary buffer list:
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
out = []
for i,j in zip(a,b):
    print i,j
    out.append((i,j))
numpy.savetxt('test.txt', out)
numpy.savetxt will overwrite the previously written file, so you only get the result of the last iteration.
A faster way is to open the file once and pass the file handle to savetxt:
import numpy
a = ([1,2,3,4,5])
b = ([6,7,8,9,10])
with open('test.txt','wb') as ftext:  # 'wb' to create a new file, 'ab' to append to an existing one
    for i,j in zip(a,b):
        print i,j
        numpy.savetxt(ftext,(i,j))
This is noticeably faster with a large array.
You should collect all the (i, j) pairs and save them in a single call rather than overwriting the file each time:
import numpy as np
a = np.array([1,2,3,4,5])
b = np.array([6,7,8,9,10])
np.savetxt('test.txt', np.column_stack((a, b)))
