I am running a function developed by Esri to get a list of values from an integer column of a spatial table (the same behaviour is observed even when running the function on a non-spatial table). According to the help, I should get a NumPy structured array back. After running the function, I have a NumPy array. I print it like this:
in_table = r"C:\geodb101#server.sde\DataTable" #
data = arcpy.da.TableToNumPyArray(in_table, "Field3")
print data
Which gives me back this in IDE (copy/pasted from IDE interpreter):
[(20130825,) (20130827,) (20130102,)]
I am running:
allvalues = data.tolist()
and getting:
[(20130825,), (20130827,), (20130102,)]
Same result when running data.reshape(len(data)).tolist(), as suggested in the comments.
Running type() tells me that in the first case it is <type 'numpy.ndarray'> and in the second case <type 'list'>. I expect the output list in another format: [20130825, 20130827, 20130102]. What am I doing wrong, or what else should I do to get the output list in that format?
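For reference, the tuple-per-row shape comes from the structured dtype that TableToNumPyArray produces. A minimal sketch with plain NumPy (the field name "Field3" is taken from the question; the integer dtype is an assumption) shows that indexing by field name before calling tolist() gives the flat list:

```python
import numpy as np

# Recreate a structured array like the one TableToNumPyArray returns.
data = np.array([(20130825,), (20130827,), (20130102,)],
                dtype=[("Field3", "<i8")])

# tolist() on a structured array yields a list of tuples...
assert data.tolist() == [(20130825,), (20130827,), (20130102,)]

# ...but indexing by field name first gives a plain 1-D array,
# whose tolist() is the flat list asked for.
flat = data["Field3"].tolist()
print(flat)  # [20130825, 20130827, 20130102]
```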
I have a possible approach, but I'm not 100% sure it will work, as I can't figure out how you got tuples into an array (when I tried to create an array of tuples, the tuples got converted to arrays). In any case, give this a shot:
my_list = map(lambda x: x[0], my_np_array_with_tuples_in_it)
This assumes you're dealing specifically with the single element tuples you describe above. And like I said, when I tried to recreate your circumstances, numpy did some conversion moves that I don't fully understand (not really a numpy expert).
Hope that helps.
Update: Just saw the new edits. Not sure if my answer applies anymore.
Update 2: Glad that worked, here's a bit of elaboration.
Lambda is basically just an inline function, and is a construct common in a lot of languages. It's essentially a temporary, anonymous function. You could have just as easily done something like this:
def my_main_func():
    def extract_tuple_value(tup):
        return tup[0]
    my_list = map(extract_tuple_value, my_np_array_with_tuples_in_it)
But as you can see, the lambda version is more concise. The "x" in my initial example is the equivalent of "tup" in the more verbose example.
Lambda expressions are generally limited to very simple operations, basically one line of logic, which is what is returned (there is no explicit return statement).
Update 3: After chatting with a buddy and doing some research, list comprehension is definitely the way to go (see Python List Comprehension Vs. Map).
From acushner's comment below, you can definitely go with this instead:
my_list = [tup[0] for tup in my_np_array_with_tuples_in_it]
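A quick self-contained check (sample tuples taken from the question) that the map/lambda version and the comprehension agree; note that under Python 3 the map call is wrapped in list() to materialize it:

```python
my_np_array_with_tuples_in_it = [(20130825,), (20130827,), (20130102,)]

# map + lambda, materialized with list() (needed under Python 3)
via_map = list(map(lambda x: x[0], my_np_array_with_tuples_in_it))

# the list-comprehension equivalent
via_comprehension = [tup[0] for tup in my_np_array_with_tuples_in_it]

assert via_map == via_comprehension == [20130825, 20130827, 20130102]
```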
Related
So I'm trying to use the map function with a lambda to write each item of a list to a txt file, one per line:
map(lambda x: text_file.write(f"{x}\n"), itemlist_with_counts_formatted)
I understand that map returns a map object, but I don't need the return value.
What I want is for map to apply the lambda, which appends "\n" to the end of each item in the given list, and write each result to the file.
I thought that is what map should do (call the function with arguments taken from the iterable), but nothing gets written to the txt file.
For clarity, I can totally do this with a list comprehension but I wanted to learn how to use map (and properly anonymous lambdas), so am looking for help solving it using these two functions specifically (if possible).
I have also tried it without the f string, using just x + "\n" but this doesn't work either
Yes, the txt file is open, and yes, I can get it to work using other methods; the problem is exclusive to how I'm using map or how I'm using lambda, which must be wrong in some way. I've been doing this for 6 weeks, so it's probably something stupid, but I've tried to figure it out myself and I just can't, and I can't find anything on here. I appreciate any help I can get.
You should really not use map for this task.
It looks fancy, but this is the same as using list comprehensions for side effects. It's considered bad practice.
[print(i) for i in range(3)]
Which should be replaced with:
for i in range(3):
    print(i)
In your case, use:
for item in itemlist_with_counts_formatted:
    text_file.write(f"{item}\n")
Why your code did not work:
map returns a lazy iterator; nothing is evaluated until something consumes it. You would need to do:
list(map(lambda x: text_file.write(f"{x}\n"), itemlist_with_counts_formatted))
But, again, don't, this is useless, less efficient and less explicit.
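The laziness is easy to demonstrate with a side-effecting function (a toy list.append stands in for the file write here):

```python
side_effects = []
m = map(side_effects.append, [1, 2, 3])

# Nothing has run yet: map only built an iterator.
assert side_effects == []

# Consuming the iterator is what actually triggers the calls.
list(m)
assert side_effects == [1, 2, 3]
```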
But I really want a one-liner!
Then use:
text_file.write('\n'.join(itemlist_with_counts_formatted))
NB. unlike the other alternatives in this answer, this one does not add a trailing '\n' in the end of the file.
I really, really want to use map:
text_file.writelines(map(lambda x: f'{x}\n', itemlist_with_counts_formatted))
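A sketch comparing the three alternatives on an in-memory file (io.StringIO stands in for the open text file, and the item list is made up):

```python
import io

items = ["a", "b", "c"]

# Plain for loop
buf1 = io.StringIO()
for item in items:
    buf1.write(f"{item}\n")

# writelines + map
buf2 = io.StringIO()
buf2.writelines(map(lambda x: f"{x}\n", items))

# '\n'.join: note the missing trailing newline
buf3 = io.StringIO()
buf3.write("\n".join(items))

assert buf1.getvalue() == buf2.getvalue() == "a\nb\nc\n"
assert buf3.getvalue() == "a\nb\nc"
```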
I think the problem is that this use of the map function is a bit improper. As the documentation says, map returns a lazy iterator over the results, and nothing is evaluated (so nothing is written) until that iterator is consumed.
I'd suggest you use map only to add the line ending, and then feed the resulting iterator to the writelines function, something like:
text_file.writelines(map(lambda x: f"{x}\n", itemlist_with_counts_formatted))
(Not tested)
I have a config file, in which items can be a single element or a list.
pct_accepted=0.75
pct_rejected=0.35, 0.5
Upon reading them back, they are all strings:
config['pct_accepted']='0.75'
config['pct_rejected']=['0.35', '0.5']
Is there a clean method of converting them to float other than having to check whether they are a scalar or a list?
My attempt for now is :
for k in ['pct_accepted', 'pct_rejected']:
    if isinstance(config[k], list):
        config[k] = [float(item) for item in config[k]]
    elif isinstance(config[k], str):
        config[k] = float(config[k])
Doesn't look so neat.
Since you included the numpy tag:
In [161]: np.array('12.23',float).tolist()
Out[161]: 12.23
In [162]: np.array(['12.23','12.23'],float).tolist()
Out[162]: [12.23, 12.23]
short, sweet and overkill!
There's no clean way, simply because the conversion is not valid on a list: something has to look at the data type. You can hide that in a function, but it's still there.
You can shorten the code a bit by mapping the conversion over the list branch: something such as map(float, config[k]) will perhaps make it look a little better to you.
You can also store the type in a variable, and test the variable twice, rather than using two isinstance calls. This saves a few characters, and doesn't scale well, but it works nicely for simple applications.
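A minimal sketch of hiding the check in a function, as suggested above (the helper name to_float is mine, and the sample config values are taken from the question):

```python
def to_float(value):
    """Convert a scalar string, or a list of strings, to float(s)."""
    if isinstance(value, list):
        return [float(v) for v in value]
    return float(value)

config = {"pct_accepted": "0.75", "pct_rejected": ["0.35", "0.5"]}
config = {k: to_float(v) for k, v in config.items()}

assert config == {"pct_accepted": 0.75, "pct_rejected": [0.35, 0.5]}
```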
The startswith query below runs fine in my PyCharm environment (Python 2.7):
df['starts_with'] = map(lambda x: x.startswith('Wash'), df['CTYNAME'])
When running the same in a Jupyter Notebook I am receiving below value in my 'starts_with' column:
'<map object at 0x7fbfe6954470>'
I understand that it might be purely a Jupyter issue; however, is there a different approach to this query to get around the error in Jupyter? 'starts_with' shall be used as a boolean mask in a next step.
Best,
P
It's probably not an error. In Python 3, which your Jupyter kernel is most likely running, the builtin map no longer returns a ready list as it did in Python 2.7; instead it returns a lazy iterator that only produces values when something consumes it.
If you want it to be a list, you can try:
list(map(....))
Otherwise, you could try iterating it and see if you get a result you expect.
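A sketch of both routes (the CTYNAME values here are invented; df['CTYNAME'].str.startswith is the idiomatic pandas way, which avoids map entirely):

```python
import pandas as pd

# Toy data; the real CTYNAME values are not shown in the question.
df = pd.DataFrame({"CTYNAME": ["Washington", "Dane", "Washburn"]})

# Wrapping map() in list() materializes the lazy iterator (Python 3).
df["starts_with"] = list(map(lambda x: x.startswith("Wash"), df["CTYNAME"]))

# The idiomatic pandas equivalent, usable directly as a boolean mask:
mask = df["CTYNAME"].str.startswith("Wash")

assert df["starts_with"].tolist() == mask.tolist() == [True, False, True]
```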
Sorry for troubling you, figured it out myself while editing my initial inquiry: I just concatenated the query with my previous line for boolean masking.
print(map(lambda x: df.startswith('Wash'), df['CTYNAME'])[PREVIOUS_DF])
I have an existing dict that has keys but no values. I would like to populate the values by iterating over two lists at the same time like so:
for (pair,name) in enumerate(zip([[0,1],[0,2],[0,3],[1,2],[1,3],[2,3]], ['pair1','pair2','pair3','pair4','pair5','pair6'])):
    my_dict[tuple(name)] = pair
However I get the error: unhashable type: list.
So it seems my attempt to cast the list to a tuple doesn't work. I chose tuple because, according to what I read in other posts, it is the better way to go.
Can someone adjust this method to work as desired? I'm also open to other solutions.
Update
I will take the blame for not putting my whole function in the post. I thought being more concise would make things easier to understand, but in the end some important details were overlooked. Sorry for that. I'm working with numpy and sklearn. Here is my whole function:
pair_names = ['pair1','pair2','pair3','pair4','pair5','pair6']
pair_dict = {p:[] for p in pair_names}
for (pair,key) in zip([[0,1],[0,2],[0,3],[1,2],[1,3],[2,3]], ['pair1','pair2','pair3','pair4','pair5','pair6']):
    x = iris.data[:,pair]
    y = iris.target
    clf = DecisionTreeClassifier().fit(x,y)
    decision_boundaries = decision_areas(clf,[0,7,0,3])
    pair_dict[key] = decision_boundaries
Going on the suggestions from the answers to this question so far, I removed enumerate and simply used zip. Unfortunately, on the line clf = DecisionTreeClassifier().fit(x,y) I now get an error: number of samples does not match number of labels. Which I find odd, because I didn't change the sample size at all. My only guess is that it has something to do with enumerate or zip, because that is the only difference from the original function in the documentation example.
Maybe what you want is:
{ tuple(x):y for (x,y) in zip([[0,1],[0,2],[0,3],[1,2],[1,3],[2,3]], ['pair1','pair2','pair3','pair4','pair5','pair6'])}
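A runnable sketch (data taken from the question) showing why enumerate was the culprit, and what the zip-only version yields:

```python
pairs = [[0, 1], [0, 2], [0, 3], [1, 2], [1, 3], [2, 3]]
names = ['pair1', 'pair2', 'pair3', 'pair4', 'pair5', 'pair6']

# enumerate(zip(...)) yields (index, (pair, name)) tuples; unpacking that
# as (pair, name) makes `name` the ([...], '...') inner tuple, so
# tuple(name) still contains an unhashable list -- hence the error.
first = next(iter(enumerate(zip(pairs, names))))
assert first == (0, ([0, 1], 'pair1'))

# Dropping enumerate pairs the two lists directly:
pair_dict = {tuple(pair): name for pair, name in zip(pairs, names)}
assert pair_dict[(1, 2)] == 'pair4'
```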
This may seem like an odd question, but why doesn't Python iterate over a single object by default?
I feel it would increase the resilience of for loops in low-level programming/simple scripts.
At the same time, it would promote sloppiness in defining data structures properly, and it clashes with strings being iterable by character.
E.g.
x = 2
for a in x:
    print(a)
As opposed to:
x = [2]
for a in x:
    print(a)
Are there any reasons?
FURTHER INFO: I am writing a function that takes a column/multiple columns from a database and puts them into a list of lists. It would just be visually "nice" to have a number instead of a single-element list, without putting type sorting into the function (probably me just being OCD again, though).
Pardon the slightly ambiguous question; this is a "why is it so?", not a "how to?". But in my ignorant world, I would prefer integers to be iterable for the case of the above-mentioned function. So why is it not implemented? Is it because adding an __iter__ to the integer object would be an extra strain on computing?
Discussion Points
Is an __iter__ too much of a drain on machine resources?
Do programmers want an exception to be thrown, as they expect integers to be non-iterable?
It brings up the idea that if you can't do it already, why not just allow it, since people in the status quo will keep doing what they've been doing unaffected (unless, of course, the previous point is what they want); and
From a set theory perspective, I guess a set strictly does not contain itself, so it may not be mathematically "pure".
Python cannot iterate over an object that is not 'iterable'.
The for loop actually calls built-in methods on the iterable data type which allow it to extract elements from the iterable.
Non-iterable data types don't have these methods, so there is no way to extract elements from them.
This Stack Overflow question on checking whether an object is iterable is a great resource.
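The standard check is simply to ask iter() for an iterator and catch the TypeError, sketched here:

```python
def is_iterable(obj):
    """Return True if obj can be iterated over."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False

assert is_iterable([2]) is True
assert is_iterable("foo") is True   # strings iterate by character
assert is_iterable(2) is False      # int defines no __iter__
```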
The problem is with the definition of "single object". Is "foo" a single object? (Hint: it is an iterable of three one-character strings.) Is [[1, 2, 3]][0] a single object? (It is only one object, with 3 elements.)
The short answer is that there is no generalizable way to do it. However, you can write functions that have knowledge of your problem domain and can do conversions for you. I don't know your specific case, but suppose you want to handle an integer or list of integers transparently. You can create your own iterator:
def col_iter(item):
    if isinstance(item, int):
        yield item
    else:
        for i in item:
            yield i

x = 2
for a in col_iter(x):
    print a

y = [1,2,3,4]
for a in col_iter(y):
    print a
The only thing that I can think of is that Python for loops look for something to iterate through, not just a value. If you think about it, what would the value of "a" be? If you want it to be the number 2, then you don't need the for loop in the first place. If you want it to go through 1, 2 or 0, 1, 2, then you want for a in range(x):. Not positive that's the answer you're looking for, but it's what I've got.