Python: Adding data to list

I am learning lists and trying to create a list and add data to it.
mylist=[]
mylist[0]="hello"
This generates an error.
Why can't we add members to lists like this, as we do with arrays in JavaScript?
These are also dynamic, and we can add as many members of any data type as we like.
In JavaScript this works:
var ar=[];
ar[0]=333;
Why doesn't this work in Python, and why can we only use append() to add to a list?

mylist[0] = 'hello' is syntactic sugar for mylist.__setitem__(0, 'hello').
As per the docs for object.__setitem__(self, key, value):
The same exceptions should be raised for improper key values as for
the __getitem__() method.
The docs for __getitem__ state specifically what leads to IndexError:
if value outside the set of indexes for the sequence (after any
special interpretation of negative values), IndexError should be
raised.
As to the purpose behind this design decision, one can write several chapters to explain why list has been designed in this way. You should familiarise yourself with Python list indexing and slicing before making judgements on its utility.
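A short sketch of the behaviour described above, assuming Python 3: assigning to an index that does not exist yet raises IndexError, while append() grows the list. The variable names are only illustrative.

```python
mylist = []

# Index 0 does not exist in an empty list, so __setitem__ raises IndexError.
try:
    mylist[0] = "hello"
except IndexError as exc:
    error_message = str(exc)

# append() creates a new slot at the end instead of writing to an existing one.
mylist.append("hello")

# Now that index 0 exists, assigning to it is allowed.
mylist[0] = "goodbye"
print(mylist)
```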

Lists in Python are fundamentally different to arrays in languages like C. You do not create a list of a fixed size and assign elements to indexes in it. Instead you either create an empty list and append elements to it, or use a list-comprehension to generate a list from a type of expression.
In your case, you want to add to the end, so you must use the .append method:
mylist.append('hello')
#["hello"]
And an example of a list comprehension:
squares = [x**2 for x in range(10)]
#[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
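If index-style assignment is what you are after, one option is to create the slots first. This is only a sketch, not the one right approach; the size 3 and the values are made up for illustration.

```python
# Pre-fill the list with placeholders so that indexes 0, 1 and 2 exist.
mylist = [None] * 3
mylist[0] = "hello"
mylist[2] = 333

# insert() and extend() also grow a list without index assignment.
mylist.insert(1, "inserted")   # shifts later items one position to the right
mylist.extend(["a", "b"])      # appends every item of another iterable
print(mylist)
```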

Related

How can I pass each element of a set to a function?

I have a set with multiple tuples: set1 = {(1,1),(2,1)} for example.
Now I want to pass each tuple of the set to a method with this signature: process_tuple(self, tuple).
I am doing it with a for loop like this:
for tuple in set1:
    process_tuple(tuple)
Is there a better way to do it?
Your question is basically "how can I loop without using a loop". While it's possible to do what you're asking without an explicit for loop, the loop is by far the clearest and best way to go.
There are some alternatives, but mostly they're just changing how the loop looks, not preventing it in the first place. If you want to collect the return values from the calls to your function in a list, you can use a list comprehension to build the list at the same time as you loop:
results = [process_tuple(tuple) for tuple in set1]
You can also do set or dict comprehensions if those seem useful to your specific needs. For example, you could build a dictionary mapping from the tuples in your set to their processed results with:
results_dict = {tuple: process_tuple(tuple) for tuple in set1}
If you don't want to write out for tuple in set1 at all, you could use the builtin map function to do the looping and passing of values for you. It returns an iterator, which you'll need to fully consume to run the function over the full input. Passing the map object to list sometimes makes sense, for instance, to convert inputs into numbers:
user_numbers = list(map(int, input("Enter space-separated integers: ").split()))
But I'd also strongly encourage you to think of your current code as perhaps the best solution. Just because you can change it to something else, doesn't mean you should.
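To illustrate the point about map returning an iterator, here is a small sketch assuming Python 3. The body of process_tuple is a made-up stand-in (it just sums the pair and records the call); the real function is not shown in the question.

```python
calls = []

def process_tuple(pair):
    # Hypothetical stand-in: record the call and return the sum of the pair.
    calls.append(pair)
    return sum(pair)

set1 = {(1, 1), (2, 1)}

lazy_results = map(process_tuple, set1)
calls_before = len(calls)        # nothing has been processed yet

# Consuming the iterator is what actually runs the function.
results = sorted(lazy_results)   # sorted, because sets have no fixed order
calls_after = len(calls)
print(calls_before, calls_after, results)
```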

Why python for loops don't default to one iteration for single objects

This may seem like an odd question, but why doesn't Python "iterate" through a single object by default?
I feel it would increase the resilience of for loops for low level programming/simple scripts.
At the same time it promotes sloppiness in defining data structures properly though. It also clashes with strings being iterable by character.
E.g.
x = 2
for a in x:
    print(a)
As opposed to:
x = [2]
for a in x:
    print(a)
Are there any reasons?
FURTHER INFO: I am writing a function that takes a column/multiple columns from a database and puts it into a list of lists. It would just be visually "nice" to have a number instead of a single element list without putting type sorting into the function (probably me just being OCD again though)
Pardon the slightly ambiguous question; this is a "why is it so?", not a "how to?". In my ignorant world, I would prefer integers to be iterable for the case of the above-mentioned function. So why is it not implemented? Is it because adding an __iter__ to the integer object would be an extra strain on computing?
Discussion Points
Is an __iter__ too much of a drain on machine resources?
Do programmers want an exception to be thrown because they expect integers to be non-iterable?
It raises the question: if you can't do it already, why not allow it, since people used to the status quo can keep doing what they've been doing unaffected (unless, of course, the previous point is what they want)?
From a set theory perspective, I guess a set strictly does not contain itself, so it may not be mathematically "pure".
Python cannot iterate over an object that is not 'iterable'.
The 'for' loop actually calls built-in methods of the iterable data type (such as __iter__), which allow it to extract elements from the iterable.
Non-iterable data types don't have these methods, so there is no way to extract elements from them.
This Stack Overflow question on checking whether an object is iterable is a great resource.
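One common way to check whether something is iterable, sketched here under the assumption of Python 3, is to call the built-in iter() and catch the TypeError. The helper name is_iterable is hypothetical.

```python
def is_iterable(obj):
    """Return True if iter() accepts obj, False otherwise."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

print(is_iterable(2))      # an int has no __iter__
print(is_iterable([2]))    # a list does
print(is_iterable("foo"))  # so does a string
```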
The problem is with the definition of "single object". Is "foo" a single object (Hint: it is an iterable with three strings)? Is [[1, 2, 3]][0] a single object (It is only one object, with 3 elements)?
The short answer is that there is no generalizable way to do it. However, you can write functions that have knowledge of your problem domain and can do conversions for you. I don't know your specific case, but suppose you want to handle an integer or list of integers transparently. You can create your own iterator:
def col_iter(item):
    if isinstance(item, int):
        yield item
    else:
        for i in item:
            yield i
x = 2
for a in col_iter(x):
    print(a)
y = [1,2,3,4]
for a in col_iter(y):
    print(a)
The only thing I can think of is that Python for loops look for something to iterate through, not just a value. If you think about it, what would the value of "a" be? If you want it to be the number 2, then you don't need the for loop in the first place. If you want it to go through 1, 2 or 0, 1, 2, then you want something like for a in range(1, x + 1): or for a in range(x + 1):. I'm not positive that's the answer you're looking for, but it's what I've got.

Defining multiple objects based on size of two lists (python)

I am trying to find a way of creating objects based on the size of two lists. I have to create an object for each combination of indices of the two lists, i.e. if both lists are of length 3, 9 new objects should be created and defined.
The lists can be rather long, and it would make the script a lot nicer if I did not have to use if statements to go through all possible combinations.
At first I thought I could do the following:
for i in range(len(list1)):
    for j in range(len(list2)):
        Name_of_Object+[i]+[j] = (object definition)
But this is not possible and I get the following error:
SyntaxError: can't assign to operator
But is there a way of creating objects based on indices of a list?
Best,
Martin
(I am using the Canopy environment to do my python programming.)
Why not store these objects in a nested list? Then you can access each one as li[i][j]:
li = []
for i in range(len(list1)):
    tempLi = []
    for j in range(len(list2)):
        tempLi.append((object definition))
    li.append(tempLi)
We can't assign to an expression built with operators, but we can index into a nested list (a two-dimensional array), that is:
Name_of_Object[i][j] = (object definition)
Maybe you could do your assignment with exec:
exec("Name_of_Object{0}{1} = object declaration".format(i,j))
But I don't think this is a good idea, because you won't be able to refer to your objects later in your program without another exec, unless you specifically write out Name_of_Object01, Name_of_Object02, ...
For instance, if you need to loop over your instances each time you want to do something with them, you will need to write:
exec("Name_of_Object{0}{1}.method(...)".format(i,j))
So I think you should use a multidimensional array.
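A minimal sketch of the container-based approach, assuming Python 3. The list contents and the make_object helper are invented for illustration, since the real object definition is not shown. It builds the same objects twice: once as a nested list, and once as a dict keyed by the (i, j) index pair.

```python
from itertools import product

list1 = ["a", "b", "c"]
list2 = [10, 20, 30]

def make_object(i, j):
    # Hypothetical stand-in for the real object definition.
    return {"row": list1[i], "col": list2[j]}

# Nested list: access an object as grid[i][j].
grid = [[make_object(i, j) for j in range(len(list2))]
        for i in range(len(list1))]

# Dict keyed by index pair: access an object as by_index[(i, j)].
by_index = {(i, j): make_object(i, j)
            for i, j in product(range(len(list1)), range(len(list2)))}

print(grid[1][2], by_index[(1, 2)])
```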

Adding Elements from a List of Lists to a Set?

I'm attempting to add elements from a list of lists into a set. For example if I had
new_list=[['blue','purple'],['black','orange','red'],['green']]
How would I receive
new_set=(['blue','purple'],['black','orange','red'],['green'])
I'm trying to do this so I can use intersection to find out what elements appear in 2 sets. I thought this would work...
results=set()
results2=set()
for element in new_list:
    results.add(element)
for element in new_list2:
    results2.add(element)
results3=results.intersection(results2)
but I keep receiving:
TypeError: unhashable type: 'list'
for some reason.
Convert the inner lists to tuples, as sets can only store hashable objects (and lists, being mutable, are not hashable):
In [72]: new_list=[['blue','purple'],['black','orange','red'],['green']]
In [73]: set(tuple(x) for x in new_list)
Out[73]: set([('blue', 'purple'), ('black', 'orange', 'red'), ('green',)])
How would I receive
new_set=(['blue','purple'],['black','orange','red'],['green'])
Well, despite the misleading name, that's not a set of anything, that's a tuple of lists. To convert a list of lists into a tuple of lists:
new_set = tuple(new_list)
Maybe you wanted to receive this?
new_set=set([['blue','purple'],['black','orange','red'],['green']])
If so… you can't. A set cannot contain unhashable values like lists. That's what the TypeError is telling you.
If this weren't a problem, all you'd have to do is write:
new_set = set(new_list)
And anything more complicated you write will have exactly the same problem as just calling set, so there's no tricky way around it.
Of course you can have a set of tuples, since they're hashable. So, maybe you wanted this:
new_set=set([('blue','purple'),('black','orange','red'),('green',)])
That's easy too. Assuming your inner lists are guaranteed to contain nothing but strings (or other hashable values), as in your example, it's just:
new_set = set(map(tuple, new_list))
Or, if you use a sort-based set class, you don't need hashable values, just fully-ordered values. For example:
new_set = sortedset(new_list)
Python doesn't come with such a thing in the standard library, but there are some great third-party implementations you can install, like blist.sortedset or bintrees.FastRBTree.
Of course sorted-set operations aren't quite as fast as hash operations in general, but often they're more than good enough. (For a concrete example, if you have 1 million items in the list, hashing will make each lookup 1 million times faster; sorting will only make it 50,000 times faster.)
Basically, any output you can describe or give an example of, we can tell you how to get that, or that it isn't a valid object you can get… but first you have to tell us what you actually want.
By the way, if you're wondering why lists aren't hashable, it's just because they're mutable. If you're wondering why most mutable types aren't hashable, the FAQ explains that.
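The hashability difference can be checked directly with the built-in hash(). This sketch assumes Python 3 and also shows a related pitfall: parentheses alone do not make a one-element tuple, the trailing comma does.

```python
# A list is mutable, so hash() refuses it.
try:
    hash(['green'])
except TypeError as exc:
    list_error = str(exc)

# A tuple of strings is hashable, so it can be a set member.
tuple_hash = hash(('green',))

not_a_tuple = ('green')    # just a parenthesised string
one_tuple = ('green',)     # the trailing comma makes it a tuple

print(list_error, type(not_a_tuple).__name__, type(one_tuple).__name__)
```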
Make the element a tuple before adding it to the set:
new_list=[['blue','purple'],['black','orange','red'],['green']]
new_list2=[['blue','purple'],['black','green','red'],['orange']]
results=set()
results2=set()
for element in new_list:
    results.add(tuple(element))
for element in new_list2:
    results2.add(tuple(element))
results3=results.intersection(results2)
print(results3)
results in (on Python 3):
{('blue', 'purple')}
Set elements have to be hashable.
For adding lists to a set, use tuples instead.
For adding sets to a set, use frozenset instead.
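Which of the two to choose depends on whether order matters. In this sketch (Python 3, with a second list invented so that one colour pair appears in reverse order), tuples treat ('blue', 'purple') and ('purple', 'blue') as different elements, while frozensets treat them as equal.

```python
new_list = [['blue', 'purple'], ['black', 'orange', 'red'], ['green']]
new_list2 = [['purple', 'blue'], ['black', 'green', 'red'], ['orange']]

# Tuples keep order, so the reversed pair does not match.
common_tuples = set(map(tuple, new_list)) & set(map(tuple, new_list2))

# frozensets ignore order (and duplicates), so the pair matches.
common_frozensets = (set(map(frozenset, new_list))
                     & set(map(frozenset, new_list2)))

print(common_tuples, common_frozensets)
```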

length of Python list when list has a single value

Okay I concede that I didn't ask the question very well. I will update my question to be more precise.
I am writing a function that takes a list as an argument. I want to check the length of the list so I can loop through the list.
The problem that I have is when the list has only one entry, len(myList) returns the length of that entry (the length of the string) and not the length of the list which should be == 1.
I can fix this if I force the argument to be passed as a single-value list ['val']. But I would prefer my API to allow the user to pass either a value or a list of values.
example:
def myMethod(self,dataHandle, data,**kwargs):
    comment = kwargs.get('comment','')
    _dataHandle= list()
    _data = list()
    _dataHandle.append(dataHandle)
    _data.append(data)
    for i in range(_dataHandle):
        # do stuff.
I would like to be able to call my method either by
myMethod('ed', ed.spectra,comment='down welling irradiance')
or by
myMethod(['ed','lu'] , [ed.spectra,lu.spectra] , comments = ['downwelling', 'upwelling radiance'])
Any help would be greatly appreciated. It might not seem like a big deal to pass ['ed'], but it breaks the consistency of my API so far.
The proper python syntax for a list consisting of a single item is [ 'ed' ].
What you're doing with list('ed') is asking python to convert 'ed' to a list. This is a consistent metaphor in python: when you want to convert something to a string, you say str(some_thing). Any hack you'd use to make list('ed') return a list with just the string 'ed' would break python's internal metaphors.
When python sees list(x), it will try to convert x to a list. If x is iterable, it does something more or less equivalent to this:
def make_list(x):
    ret_val = []
    for item in x:
        ret_val.append(item)
    return ret_val
Because your string 'ed' is iterable, python will convert it to a list of length two: [ 'e', 'd' ].
The cleanest idiomatic python in this case might be to have your function accept a variable number of arguments, so instead of this
def my_func(itemList):
    ...
you'd do this
def my_func(*items):
    ...
And instead of calling it like this
my_func(['ed','lu','lsky'])
You'd call it like this:
my_func('ed', 'lu', 'lsky')
In this way you can accept any number of arguments, and your API will be nice and clean.
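A small sketch of the variable-argument approach, assuming Python 3. The function body (upper-casing each item) is invented just to have something to run. Inside the function, items is always a tuple, so a single argument needs no special case, and an existing list can still be passed by unpacking it with *.

```python
def my_func(*items):
    # items is a tuple of all positional arguments, even if there is only one.
    return [item.upper() for item in items]

single = my_func('ed')
several = my_func('ed', 'lu', 'lsky')

# A caller who already has a list can unpack it at the call site.
handles = ['ed', 'lu']
unpacked = my_func(*handles)

print(single, several, unpacked)
```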
You can ask if your variable is a list:
def my_method(my_var):
    if isinstance(my_var, list):
        for my_elem in my_var:
            pass  # do stuff with my_elem
    else: # my_var is not a list
        pass  # do stuff with my_var
EDIT: Another option is to try iterating over it, and if that fails (raises an exception), you assume it is a single element:
def my_method(my_var):
    try:
        for my_elem in my_var:
            pass  # do stuff with my_elem
    except TypeError: # my_var is not iterable
        pass  # do stuff with my_var
The good thing about this second option is that it works not only for lists, like the first one, but for anything that is iterable (sets, dicts, tuples, etc.). Be aware that strings are iterable too, so a single string would be looped over character by character.
You do actually need to put your string in a list if you want it to be treated like a list
EDIT
I see that at some point there was a list in front of the string. list, contrary to what you may think, doesn't create a list of one item. It calls __iter__ on the string object and iterates over each item. Thus it makes a list of characters.
Hopefully this makes it clearer:
>>> print(list('abc'))
['a', 'b', 'c']
>>> print(list(('abc',)))
['abc']
list('ed') does not create a list containing a single element, 'ed'. list(x) in general does not create a list containing a single element, x. In fact, if you had been using numbers rather than strings (or anything else non-iterable), this would have been blindingly obvious to you:
>>> list('ed')
['e', 'd']
>>> list(3)
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    list(3)
TypeError: 'int' object is not iterable
>>>
So you are in fact passing a list with multiple elements to your method, which is why len is returning greater than 1. It's not returning the length of the first element of the list.
For your method to allow passing either a single item or a list, you'd have to do some checking to see if it's a single item first, and if it is create a list containing it with myVar = [myVar], then run your loop.
However, this sort of API is tricky to implement and use, and I would not recommend it. The most natural way to check whether you've been given a collection or an item is to see if myVar is iterable. However, this fails for strings, which are iterable. Strings unfortunately straddle the boundary between a collection and an individual data item; we very often use them as data items containing a "chunk of text", but they are also collections of characters, and Python allows them to be used as such.
Such an API also is likely to cause you to one day accidentally pass a list that you're thinking of as a single thing and expecting the method to treat it as a single thing. But it's a list, so suddenly the code will behave differently.
It also raises questions about what you do with other data types. A dictionary is not a list, but it can be iterated. If you pass a dictionary as myVar, will it be treated as a list containing a single dictionary, or will it iterate over the keys of the dictionary? How about a tuple? What about a custom class implementing __iter__? What if the custom class implementing __iter__ is trying to be "string-like" rather than "list-like"?
All these questions lead to surprises if the caller guesses/remembers wrongly. Surprises when programming lead to bugs. IMHO, it's better to just live with the extra two characters of typing ([ and ]), and have your API be clean and simple.
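For completeness, the check described above (wrapping a single item with myVar = [myVar]) could be sketched as follows, assuming Python 3. The helper name as_list and its policy are assumptions, and they illustrate the ambiguity: only lists and tuples are treated as collections, so strings and dictionaries are wrapped as single items.

```python
def as_list(value):
    """Copy lists and tuples into a new list; wrap anything else in a list."""
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]

print(as_list('ed'))            # a string counts as one item here
print(as_list(['ed', 'lu']))    # a list is kept as a collection
print(as_list({'key': 1}))      # a dict is wrapped, not iterated
```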
I run into this same problem frequently. Building a list from an empty list, as you are doing with the "_dataHandle= list()" line, is common in Python because we don't reserve memory in advance. Therefore, the list will often transition from empty, to one element, to multiple elements. Python does not index a one-element list any differently from a longer one; the problem here is that range() expects an integer, not a list. If you iterate over the list directly, the solution is simple. Instead of:
for i in range(_dataHandle):
use:
for myvar in _dataHandle:
In this case, if there is only one element, the loop only iterates once as you would expect.
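Iterating directly also works when the handles and their data need to be paired up. A sketch assuming Python 3, with made-up values standing in for the spectra: zip walks both lists together, and enumerate supplies an index when one is really needed.

```python
_dataHandle = ['ed']
_data = [[0.1, 0.2, 0.3]]   # hypothetical stand-in for ed.spectra

# zip pairs each handle with its data; one element means one iteration.
visited = []
for handle, values in zip(_dataHandle, _data):
    visited.append((handle, len(values)))

# enumerate gives the index alongside the item, without range(len(...)).
indexed = [(i, handle) for i, handle in enumerate(_dataHandle)]

print(visited, indexed)
```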
