How to create numpy arrays automatically?

I wanted to create arrays in a for loop so that the array names are assigned automatically.
Using a for loop did not work, and neither did creating a dictionary with numpy.array() in it. I have run out of ideas.
I am not very confident with Python yet.
import numpy as np
for file_name in folder:
    file_name = np.array()
    file_name.extend((blabla, blabla1))
I expected to get arrays with automatically assigned names, like file_name1, file_name2, ...
Instead, I got the warning "redeclared file_name defined above without usage", and the line file_name = np.array() failed with:
TypeError: array() missing required argument 'object' (pos 1) ...

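You can do it with globals() if you really want to use the strings as variable names. Note that np.array() needs an argument, for example an empty list.
globals()[file_name] = np.array([])
Example:
>>> globals()['test'] = 1
>>> test
1
Of course, this populates the global namespace. locals() is the function-level counterpart, but assigning through locals() inside a function is not guaranteed to create or update a local variable.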

As @Mark Meyer said in a comment, you should use a dictionary (dict in Python) with file_name as the key.
As for your error: when you create a numpy array, you must pass the data as the first argument, typically a sequence such as a list or tuple.
For example:
>>> folder = ['file1', 'file2']
>>> blabla = 0
>>> blabla1 = 1
>>> {f: np.array((blabla, blabla1)) for f in folder}
{'file1': array([0, 1]), 'file2': array([0, 1])}
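The dictionary approach can also replace the .extend() call from the question, which numpy arrays do not have. The following is a minimal sketch, assuming the per-file values are collected in a plain list first; the names measurements and arrays are made up for illustration and stand in for whatever is actually read from each file.

```python
import numpy as np

# Hypothetical stand-ins for the file names and the values read from each file.
folder = ["file1", "file2"]
measurements = {"file1": [0, 1], "file2": [2, 3, 4]}

arrays = {}
for file_name in folder:
    values = []                              # a plain list is cheap to grow
    values.extend(measurements[file_name])   # lists support extend()
    arrays[file_name] = np.array(values)     # convert once, at the end

print(arrays["file2"])
```

Each array is then reached by its key, for example arrays["file1"], instead of by a generated variable name.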

Related

Dynamically adding functions to array columns

I'm trying to dynamically add function calls that fill in array columns. I will be accessing the array millions of times, so it needs to be quick.
My idea is to store the function call in a dictionary, looked up by a string variable:
numpy_array[row,column] = dict[key[index containing function call]]
The full code I'm working with is too large to post here, so this is an equivalent simplified example of what I've tried.
def hello(input):
    return input
dict1 = {}
# another function returns the name and ID values
name = 'hello'
ID = 0
dict1["hi"] = globals()[name](ID)
print(dict1)
but the function is called immediately when using
globals()[name](ID)
instead of the call hello(0) being stored in the dictionary for later use.
I'm a bit out of my depth here.
What is the proper way to implement this?
Is there a more efficient way than a dictionary lookup on every call of
numpy_array[row,column] = dict[key[index containing function call]]
given that I will be accessing and updating it millions of times?
I don't know whether the dictionary is looked up every time the array is written to, or whether the column's location is already cached.
I would appreciate the help.
Edit
Ultimately, what I'm trying to do is initialize some arrays, dictionaries, and values with a function:
def initialize(*args):
    create arrays and dictionaries
    assign values to global and local variables, arrays, dictionaries
Each time the initialize() function is used, it creates a new set of variables (names, values, etc.) that point to a different function with a different set of variables.
I have a numpy array in which I want to store information from the function and the associated values created by initialize().
In other words, for hello(0) in the example above, that is the name of the function, its value, and some other things set up within initialize().
What I'm trying to do is add the function with these settings to the numpy array as a new column before I run the main program.
As another example: if I were setting up hello() (and hello() were a complex function), using initialize() might give me a value of 1, for hello(1).
If I use initialize() again, it might give me a value of 2, for hello(2).
If I used it one more time, it might give the value 0 for the function goodbye(0).
So in this scenario, let's say I have an array
array[row,0] = stuff()
array[row,1] = things()
array[row,2] = more_stuff()
array[row,3] = more_things()
Now I want it to look like
array[row,0] = stuff()
array[row,1] = things()
array[row,2] = more_stuff()
array[row,3] = more_things()
array[row,4] = hello(1)
array[row,5] = hello(2)
array[row,6] = goodbye(0)
As a third example:
def function1():
    do something
def function2():
    do something
def function3():
    do something
numpy_array(size)
initialize():
    do some stuff
    then add function1(23) to the next column in numpy_array
initialize():
    do some stuff
    then add function2(5) to the next column in numpy_array
initialize():
    do some stuff
    then add function3(50) to the next column in numpy_array
So, as you can see, I need to permanently append new columns to the array and feed them with the function and value chosen by initialize(), without manual intervention.
Fundamentally, I need to figure out how to assign a call to an array column based on a string value without executing the call on assignment.
Edit #2
I guess my explanations weren't clear enough.
Here is another way to look at it.
I'm trying to dynamically assign functions to an additional column in a numpy array based on the output of a function.
The functions added to the array column will be used to fill the array with data millions of times.
The functions added can be various functions with various input values, and the number of functions added can vary.
I've tried assigning the functions to a dictionary using exec(), eval(), and globals(), but each of these calls the function immediately during assignment instead of storing it.
numpy_array = np.array((1,5))
def some_function():
    do some stuff
    return ('other_function(15)')
# somehow add 'other_function(15)' to the array column.
numpy_array[1,6] = other_function(15)
The functions returned by some_function() may or may not exist each time the program is run, so the functions added to the array are also dynamic.

I'm not sure this is what the OP is after, but here is a way to look up functions indirectly by name:
def make_fun_dict():
    magic = 17
    def foo(x):
        return x + magic
    def bar(x):
        return 2 * x + 1
    def hello(x):
        return x**2
    return {k: f for k, f in locals().items() if hasattr(f, '__name__')}
mydict = make_fun_dict()
>>> mydict
{'foo': <function __main__.make_fun_dict.<locals>.foo(x)>,
'bar': <function __main__.make_fun_dict.<locals>.bar(x)>,
'hello': <function __main__.make_fun_dict.<locals>.hello(x)>}
>>> mydict['foo'](0)
17
Example usage:
x = np.arange(5, dtype=int)
names = ['foo', 'bar', 'hello', 'foo', 'hello']
>>> np.array([mydict[name](v) for name, v in zip(names, x)])
array([17, 3, 4, 20, 16])
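To store a function together with its argument without calling it, which is what the question asks for, functools.partial from the standard library is one option. The sketch below is only an illustration under assumptions: the functions hello and goodbye, the registry dictionary, the requested list (standing in for what repeated initialize() calls might produce), and fill_row are all hypothetical names.

```python
from functools import partial

import numpy as np

def hello(x):
    return x * 10

def goodbye(x):
    return x - 1

# Name -> function lookup; nothing is called here.
registry = {"hello": hello, "goodbye": goodbye}

# Hypothetical output of repeated initialize() calls: (function name, argument).
requested = [("hello", 1), ("hello", 2), ("goodbye", 0)]

# partial() binds the argument but defers the call until the filler is invoked.
column_fillers = [partial(registry[name], arg) for name, arg in requested]

def fill_row(array, row):
    """Call every stored filler and write its result into the matching column."""
    for col, filler in enumerate(column_fillers):
        array[row, col] = filler()

data = np.zeros((2, len(column_fillers)))
fill_row(data, 0)
print(data)
```

Because the name lookup happens once, when column_fillers is built, the loop that fills the array only iterates over a list of ready-made callables and does no dictionary access.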

Closing files with dask's from_array function

I am using Dask's from_array function. Following the docs at https://docs.dask.org/en/latest/array-creation.html, I do the following:
>>> import h5py
>>> import dask.array as da
>>> f = h5py.File('myfile.hdf5') # HDF5 file
>>> d = f['/data/path'] # Pointer on on-disk array
>>> x = da.from_array(d, chunks=(1000, 1000))
In this example, do you agree that I should close the HDF5 file after processing the data?
If so, it might be useful for Dask array to accept the file pointer and the dataset key, so that it could include a routine that closes the source file, if any, when the dask array object is destroyed.
I know that a good way to proceed would be like this:
>>> import h5py
>>> with h5py.File('myfile.hdf5') as f: # HDF5 file
...     d = f['/data/path'] # Pointer on on-disk array
...     x = da.from_array(d, chunks=(1000, 1000))
But sometimes this is not really handy. For example, in my code, I have a function that returns a dask array from a file path, with some sanity checks in between, a bit like:
>>> import h5py
>>> def get_dask_array(filepath, key):
...     f = h5py.File(filepath) # HDF5 file
...     # ... some sanity checks here
...     d = f[key] # Pointer on on-disk array
...     # ... some sanity checks here
...     return da.from_array(d, chunks=(1000, 1000))
In this case, I find it ugly to also return the file pointer and keep it around for the duration of the processing before closing it.
Any suggestions on how I should do this?
Thank you in advance for your answers,
Regards,
Edit: for now I am using a global variable inside a package, as follows:
@atexit.register
def clean_files():
    # assumes SOURCE_FILES holds the opened h5py.File objects
    for f in SOURCE_FILES:
        if os.path.isfile(f.filename):
            f.close()
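One pattern that might avoid both the global list and returning the file pointer is to turn the helper into a context manager, so the caller gets the data object and the file is closed when the with block ends. The sketch below only illustrates the pattern with the standard library: FakeFile is a made-up stand-in for h5py.File, and open_dataset is a hypothetical helper. With the real libraries, the opener would presumably be h5py.File and the yielded object the result of da.from_array, and any computation on the lazy array would have to happen inside the with block, while the file is still open.

```python
from contextlib import contextmanager

class FakeFile:
    """Made-up stand-in for h5py.File: a closable mapping of datasets."""

    def __init__(self, path):
        self.path = path
        self.closed = False
        self._data = {"/data/path": [1, 2, 3]}

    def __getitem__(self, key):
        if self.closed:
            raise ValueError("file is closed")
        return self._data[key]

    def close(self):
        self.closed = True

@contextmanager
def open_dataset(filepath, key, opener=FakeFile):
    """Open the file, run the checks, yield the dataset, always close the file."""
    f = opener(filepath)
    try:
        dataset = f[key]      # sanity checks would go around this lookup
        yield dataset         # with dask: yield da.from_array(dataset, chunks=...)
    finally:
        f.close()             # runs even if the caller's block raises

with open_dataset("myfile.hdf5", "/data/path") as d:
    total = sum(d)            # all processing happens while the file is open
```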

How to find types of all attributes of an object in Python?

I have a tensor
x = torch.tensor([1, 2, 3])
I did this
len(dir(x))
which gave this,
464
I want to know how many of these 464 attributes are builtin_function_or_method, or method, or any other type.
How do I list the types of the attributes of a tensor?

help(x) generates some basic documentation on whatever you pass in. It'll tell you the type of the object, attributes, methods on it, etc.

Usually, the attributes you are not supposed to access start with _ or __. So, [att for att in dir(x) if not att.startswith('_')]
If you want to exclude functions too, add and not callable(getattr(x, att)) to the condition. (callable(att) would test the name string itself, which is never callable.)

This is what I did to get the types of all attributes of a tensor.
Import modules and create the tensor:
import torch
from collections import defaultdict
x = torch.tensor([1., 2., 3.])
The list comprehension below gives a list of attributes along with their types:
a = [(f'x.{i}', type(getattr(x, i))) for i in dir(x)]
Using defaultdict, build a dictionary that groups the attributes by type:
e = defaultdict(list)
for i, j in a:
    e[j].append(i)
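To answer the "how many of each type" part directly, the grouping can be reduced to a count with collections.Counter. This is a rough sketch rather than a tested recipe for tensors: attribute_type_counts is a hypothetical helper, a plain list is used as a stand-in object so the example runs without torch, and the try/except is there on the assumption that some objects raise when certain attributes are accessed.

```python
from collections import Counter

def attribute_type_counts(obj):
    """Count the attributes of obj by the name of each attribute's type."""
    counts = Counter()
    for name in dir(obj):
        try:
            value = getattr(obj, name)
        except Exception:            # some attributes may raise when accessed
            counts["<inaccessible>"] += 1
            continue
        counts[type(value).__name__] += 1
    return counts

sample = [1, 2, 3]   # stand-in object; a tensor would be passed the same way
counts = attribute_type_counts(sample)
for type_name, n in counts.most_common():
    print(f"{type_name}: {n}")
```

The counts always add up to len(dir(obj)), so the total can be checked against the 464 reported in the question.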

Global Scope when accessing array element inside function

When I assign a value to an array variable, the scope of the variable remains local (see loc()).
However, if I access an element of the array, the scope becomes global (see glob()).
import numpy as np
M = np.array([1])
def loc():
    M = 2
    return 0
def glob():
    M[0] = 3
    return 0
loc()
print(M)  # [1]
glob()
print(M)  # [3]
Why does this happen? How can I locally modify the elements of an array without modifying the array globally? I need a loop inside my function that changes one element at a time.

You're mixing several things here.
First of all, M = 2 creates a local variable named M (you can see it in locals()) and prevents you from accessing the original M later on (although you're not doing it... But just to make a point). That's sometimes referred to as "shadowing".
Second of all, the np.array is a mutable object (the opposite of an immutable object), and changes to it will reflect in any reference to it. What you have in your glob function is a reference to M.
You can look at an np.array as a piece of memory that has many names, and if you changed it, the changes will be evident no matter what name you're using to access it. M[0] is simply a reference to a specific part of this memory. This reflects the object's "state".
If you did something like:
M = np.array([1])
def example():
    another_name_for_M = M
    another_name_for_M[0] = 2
you would still see the global M changing, even though you're using a new name to access it. (Assigning M = 2 inside the same function would make M local for the whole function, and reading it in the first line would then raise UnboundLocalError.)
If you used a string, a tuple, a frozenset and the like, which are all immutable objects that cannot be (easily) changed, you wouldn't be able to change their state.
Now to your question: if you don't want the function to mutate the array, pass it a copy made with np.copy instead of the original:
import numpy as np
my_array = np.array([1])
def array_mutating_function(some_array):
    some_array[0] = 1337
    print(some_array)  # prints [1337]
# send copy to prevent mutating the original array
array_mutating_function(np.copy(my_array))
print(my_array)  # prints [1]
This effectively makes the array immutable from the function's point of view, since the function has no reference to the original unless it uses its name from the outer scope, which is probably not a good idea regardless.
If the function should never change any array passed to it, make the copy inside the function instead:
def array_mutating_function(some_array):
    some_array = np.copy(some_array)
    some_array[0] = 1337
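The three cases discussed above (rebinding the name, mutating in place, and mutating a copy) can be compared side by side. This is a small sketch with made-up function names; the loop in mutate_copy mirrors the "one element at a time" requirement from the question.

```python
import numpy as np

M = np.array([1, 2, 3])

def rebind():
    M = np.array([0, 0, 0])   # creates a new local name; the global M is untouched
    return M

def mutate():
    M[0] = 99                 # no assignment to the name M, so this is the global array

def mutate_copy():
    local = M.copy()          # independent buffer owned by this call
    for i in range(len(local)):
        local[i] = -1         # element-wise changes stay local
    return local

rebind()
after_rebind = M.tolist()
mutate()
after_mutate = M.tolist()
result = mutate_copy()
after_copy = M.tolist()
print(after_rebind, after_mutate, result.tolist(), after_copy)
```

Only mutate() changes the global array; the other two leave it as it was before the call.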

Simply put:
A function cannot rebind a global variable unless it declares the name as global inside the function.
It can, however, modify a mutable global object in place.
Check:
import numpy as np
M = np.array([1])
def loc():
    global M
    M = 2
    return 0
def glob():
    M[0] = 3
    return 0
loc()
print(M)  # 2

Export function that names the export file after the input variable

I'm looking to get a function to export Numpy arrays, but to use the name of the variable input to the function as the name of the exported file. Something like:
MyArray = [some numbers]
def export(Varb):
    return np.savetxt("%s.dat" %Varb.name, Varb)
export(MyArray)
that will output a file called 'MyArray.dat' filled with [some numbers]. I can't work out how to do the 'Varb.name' bit. Any suggestions?
I'm new to Python and programming so I hope there is something simple I've missed!
Thanks.

I don't recommend this coding style, but you can do it this way:
import copy
myarray = list(range(4))
for k, v in copy.copy(locals()).items():
    if v is myarray:
        print(k)
This prints myarray. To do this in a function useful for exports, use:
import copy
def export_with_name(arg):
    """Export the variable's string representation, using its name as the filename."""
    for k, v in copy.copy(globals()).items():
        if v is arg:
            with open(k, 'w') as handle:
                handle.write(repr(arg))
locals() and globals() both return dictionaries with the variable names as keys and the variable values as values. The comparison uses is rather than ==, because == on a numpy array is elementwise and generally cannot be used directly in an if test.
Use the function the following way:
some_data = list(range(4))
export_with_name(some_data)
This gives a file called some_data with
[0, 1, 2, 3]
as content.
Tested and compatible with Python 2.7 and 3.3

You can't. Python objects don't know what named variables happen to be referencing them at any particular time. By the time the variable has been dereferenced and sent to the function, you don't know where it came from. Consider this bit of code:
map(export, myvarbs)
Here, the varbs were in some sort of container and didn't even have a named variable referencing them.
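Since the object cannot report its own variable name, a simple alternative is to pass the name explicitly, for example as the key of a dictionary. The sketch below assumes that approach; export_all is a hypothetical helper, and a temporary directory is used only so the demo does not write into the working directory.

```python
import os
import tempfile

import numpy as np

def export_all(named_arrays, directory):
    """Save each array to '<name>.dat' inside directory and return the paths."""
    paths = []
    for name, array in named_arrays.items():
        path = os.path.join(directory, name + ".dat")
        np.savetxt(path, array)
        paths.append(path)
    return paths

out_dir = tempfile.mkdtemp()          # throwaway directory for the demo
my_array = np.array([1.0, 2.0, 3.0])
written = export_all({"MyArray": my_array}, out_dir)
print(written)
```

This also covers the map(export, myvarbs) case, because the names travel with the data instead of being recovered from the calling code.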

import inspect
import re
number_array = [8,3,90]
def varname(p):
    # Look two frames up: past export() to the line that called export(...).
    caller = inspect.currentframe().f_back.f_back
    for line in inspect.getframeinfo(caller)[3]:
        m = re.search(r'\bexport\s*\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*\)', line)
        if m:
            return m.group(1)
def export(arg):
    file_name = "%s.dat" % varname(arg)
    with open(file_name, "w") as fd:
        fd.write(str(arg))
export(number_array)
See "How can you print a variable name in python?" for more details about varname().
