Python: Splitting lists into multiple lists

In Python, suppose I have a list:
instruments = ['apd', 'dd', 'dow', 'ecl']
How can I split this list so that it creates one empty list per name:
apd[]
dd[]
dow[]
ecl[]
Thanks for the help.

You would do this:
dictionaries = {i:[] for i in instruments}
and you would refer to each list this way:
dictionaries['apd']
dictionaries['dd']
dictionaries['dow']
dictionaries['ecl']
This is considered much better practice than creating the lists as separate variables in the current namespace, which would both pollute the namespace and be unpythonic.
mshsayem has the method to place the lists in the current scope, but the question is, what benefits do you get from putting them in your current scope?
Standard use cases:
you already know the names of the items, and want to refer to them directly by name, i.e. apd.append
you don't know the names yet, but you'll use eval or locals to get the lists, i.e. eval('apd').append or locals()['apd'].append
Both can be satisfied using dictionaries:
dictionaries['<some name that can be set programmatically or with a constant>'].append
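As a minimal sketch of the dictionary approach (the sample readings and the `name` variable are made up for illustration), both use cases reduce to a key lookup:

```python
instruments = ['apd', 'dd', 'dow', 'ecl']

# One empty list per instrument name, keyed by that name
dictionaries = {name: [] for name in instruments}

# Case 1: the name is known in advance
dictionaries['apd'].append(101.5)

# Case 2: the name is only known at run time (hypothetical value)
name = 'dow'
dictionaries[name].append(33.2)

print(dictionaries)
# {'apd': [101.5], 'dd': [], 'dow': [33.2], 'ecl': []}
```

No eval or locals() lookup is needed in either case, and the lists stay grouped in one object that can be passed around.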

Try this:
instruments = ['apd', 'dd', 'dow', 'ecl']
l_dict = locals()
for i in instruments:
    l_dict[i] = []
This will create the apd, dd, dow, and ecl lists in the local scope.
Snakes and Cofee's idea is better though.
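One caveat worth checking before relying on this: in CPython, assigning into the dictionary returned by locals() only creates variables at module level, where locals() is the same dictionary as globals(). Inside a function the write does not create a local variable. The sketch below (the function name is hypothetical) shows the lookup failing:

```python
def make_lists_in_function():
    instruments = ['apd', 'dd', 'dow', 'ecl']
    l_dict = locals()
    for i in instruments:
        l_dict[i] = []
    try:
        # 'apd' is never assigned in this function, so Python looks it up
        # as a global name, which was never defined
        return apd
    except NameError:
        return 'apd was not created'

result = make_lists_in_function()
print(result)  # apd was not created
```

This is one more reason to prefer the dictionary approach, which behaves the same way in every scope.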

Related

Generating a list from several dataframes

I have several data frames named data1, data2, data3, data4, ... data100. How can I store them in a list so that I'm able to plot them with a for loop?
Thanks in advance.

The problem you are experiencing is a symptom of using numbered variables.
You can avoid the problem entirely by using a list of DataFrames (e.g. data)
instead of using numbered variables (data1, data2, data3, etc.)
The trick is to avoid creating the numbered variables in the first place.
If you have code of the form
data1 = ...
data2 = ...
data3 = ...
Try to replace it with something like
data = []
data.append(...)
data.append(...)
data.append(...)
or better yet, use a list comprehension or a for-loop to define data. For more specific suggestions, show us the code that defines the numbered variables.
Then you could loop over the DataFrames with
for df in data:
    df.plot(...)
If for some reason you cannot prevent someone else from defining the numbered variables, then you could use globals() (or locals()) to access the numbered variables programmatically:
g = globals()
data = [g['data{}'.format(i)] for i in range(1, 101)]
globals() returns a dictionary whose keys are string representations of names
in the global namespace. The associated values are the Python objects bound to
those names. Thus, you can use globals() to look up the values bound to
variable names based on the string representation of those variable names.
Use locals() if the variable names are defined in the local (rather than global) namespace.
Still, try to avoid using numbered variables. This use of globals() is merely a workaround for trouble someone else is causing. Using string formatting to look up variable names is not great programming style when simple integer indexing (of a list) should suffice. The best solution is to convince that someone to stop using numbered variables and to instead deliver the values in a list.
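As a rough sketch of the list-based layout, assuming a hypothetical make_frame(i) helper that stands in for whatever code currently produces data1, data2, and so on (plain dictionaries are used instead of real DataFrames so the snippet runs without pandas):

```python
def make_frame(i):
    # Hypothetical stand-in for the code that builds the i-th DataFrame
    return {'id': i, 'values': [i, 2 * i, 3 * i]}

# Build all 100 objects in one list instead of 100 numbered variables
data = [make_frame(i) for i in range(1, 101)]

# Integer indexing replaces the numbered names:
# data[0] plays the role of data1, data[99] the role of data100
first = data[0]
last = data[99]

# Loop over the first few items, as you would when plotting
for df in data[:3]:
    print(df['id'], df['values'])
```

With real DataFrames, the loop body would call df.plot(...) as shown earlier in the answer.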

Variable Variables in Python

I have a list of n strings. For example, strings = ('path1', 'path2', 'path3')
I want to create n variables that are equal to functions on these strings. For example:
s1=pygame.mixer.Sound('path1')
s2=pygame.mixer.Sound('path2')
s3=pygame.mixer.Sound('path3')
I've looked this up a few times before and answers always seem to refer to dictionaries. I am not too familiar with dictionaries although I know their basic function. I don't know how I would use a dictionary to accomplish this.

The problem with dynamically creating variables is: how do you plan on referring to them in your code? You'll need some abstracted mechanism for dealing with 0..n objects, so you might as well store them in a data type that can deal with collections. How you store them depends on what you want to do with them. The two most obvious choices are list and dict.
Generally, you'll use a list if you want to deal with them sequentially:
paths = ['path1', 'path2', 'path3']
sounds = [ pygame.mixer.Sound(path) for path in paths ]
# play all sounds sequentially
for sound in sounds:
    sound.play()
Whereas dict is used if you have some identifier you want to use to refer to the items:
paths = ['path1', 'path2', 'path3']
sounds = { path: pygame.mixer.Sound(path) for path in paths }
# play a specific sound
sounds[name].play()
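To try both layouts without pygame installed, here is a sketch that swaps in a hypothetical FakeSound class for pygame.mixer.Sound; only the container logic is the point:

```python
class FakeSound:
    """Hypothetical stand-in for pygame.mixer.Sound, for illustration only."""

    def __init__(self, path):
        self.path = path

    def play(self):
        return 'playing ' + self.path

paths = ['path1', 'path2', 'path3']

# List: handle the sounds in order
sounds_list = [FakeSound(path) for path in paths]
played = [sound.play() for sound in sounds_list]

# Dict: look up one sound by its identifier
sounds_by_path = {path: FakeSound(path) for path in paths}
name = 'path2'  # hypothetical key chosen at run time
chosen = sounds_by_path[name].play()

print(played)  # ['playing path1', 'playing path2', 'playing path3']
print(chosen)  # playing path2
```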

You don't need to use a dictionary. Use map.
s = list(map(pygame.mixer.Sound, strings))
The above statement calls pygame.mixer.Sound with each of the strings in strings as its argument and collects the results in a list. (In Python 2, map already returns a list; in Python 3 it returns a lazy iterator, hence the list() call.) Then you can access the items like any other list element.
s1 = s[0] # based on your previous definition

The idea is you use a dictionary (or a list) instead of doing that. The simplest way is with a list:
sounds = [pygame.mixer.Sound(path) for path in strings]
You then access them as sounds[0], sounds[1], sounds[2], etc.

List comprehension : is there a concise way to refer to the initial expression in the if condition?

I am trying to write a pretty simple list comprehension of the form
[initial-expression for name in collection if condition(initial-expression)]
But I am facing a case where the initial expression embeds some 'advanced' logic that I do not want to duplicate in the if condition.
Verbose solution
At this point, I wrote :
[alias for alias in [initial-expression for name in collection]
if condition(alias)]
Since the initial expression (in the outermost list comprehension) is the identity, it seems overkill.
Is there a common way to refer to an initial expression in the if condition using some symbolic name ?

Yup, it's called filter and map :)
filter(condition, map(lambda name: initial-expression, collection))

In practice, you should go with your "verbose" solution. There is one improvement I would make for the sake of code clarity and efficiency.
Change:
mylist = [alias for
          alias in [initial_expression for
                    name in collection]
          if condition(alias)]
to
aliases = (initial_expression for name in collection)
mylist = [alias for alias in aliases if condition(alias)]
or
aliases = (initial_expression for name in collection)
mylist = list(filter(condition, aliases))
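A concrete version of this two-step form, with made-up stand-ins for the placeholders (name.strip() as the initial expression and a non-empty check as the condition):

```python
collection = ['  apple ', '', ' kiwi', '   ']

def condition(alias):
    # Hypothetical condition: keep non-empty results
    return len(alias) > 0

# Generator expression: values are computed lazily, so no intermediate
# list is built before filtering
aliases = (name.strip() for name in collection)
mylist = [alias for alias in aliases if condition(alias)]

print(mylist)  # ['apple', 'kiwi']
```

Note that a generator can only be consumed once, so aliases has to be recreated if both variants are tried in the same session.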

You can essentially create an alias in a list comprehension by adding an extra for clause that iterates over a single-element list. Like so:
[alias for name in collection for alias in [initial-expression] if condition(alias)]
It's not any less verbose than your example, but it's basically equivalent to what you'd get if python allowed something like let alias = initial-expression in a list comprehension; it doesn't change the structure of the list comprehension the way your example does.
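For illustration, here is the single-element for clause with hypothetical stand-ins (name.strip() as the initial expression, a non-empty check as the condition). On Python 3.8 or later, an assignment expression offers another way to name the value; that variant is an addition here, not part of the answer above:

```python
collection = ['  apple ', '', ' kiwi', '   ']

def condition(alias):
    # Hypothetical condition: keep non-empty results
    return len(alias) > 0

# Bind the expression to a name by looping over a one-element list
with_inner_for = [alias
                  for name in collection
                  for alias in [name.strip()]
                  if condition(alias)]

# Python 3.8+ only: bind the value inside the condition with :=
with_walrus = [alias
               for name in collection
               if condition(alias := name.strip())]

print(with_inner_for)  # ['apple', 'kiwi']
print(with_walrus)     # ['apple', 'kiwi']
```

In both forms name.strip() is evaluated once per element, which is the point of the question.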

what is meant by parallel lists in Python

I have just started using Python and I can't figure out what is meant by a parallel list.
Any info would be great. I think it just means using two lists to store info.

"Parallel lists" is a variation on the term "parallel array". The idea is that instead of having a single array/list/collection of records (objects with attributes, in Python terminology) you have a separate array/list/collection for each field of a conceptual record.
For example, you could have Person records with a name, age, and occupation:
people = [
Person(name='Bob', age=38, occupation=PROFESSIONAL_WEASEL_TRAINER),
Person(name='Douglas', age=42, occupation=WRITER),
# etc.
]
or you could have "parallel lists" for each attribute:
names = ['Bob', 'Douglas', ...]
ages = [38, 42, ...]
occupations = [PROFESSIONAL_WEASEL_TRAINER, WRITER, ...]
Both of these approaches store the same information, but depending on what you're doing one may be more efficient to deal with than the other. Using a parallel collection can also be handy if you want to sort of "annotate" a given collection without actually modifying the original.
(Parallel arrays were also really common in languages that didn't support proper records but which did support arrays, like many versions of BASIC for 8-bit machines.)
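The two layouts can be compared side by side in a small sketch. The use of collections.namedtuple for Person and the string occupations are assumptions made here so the snippet is self-contained:

```python
from collections import namedtuple

# Record layout: one object per person
Person = namedtuple('Person', ['name', 'age', 'occupation'])
people = [
    Person(name='Bob', age=38, occupation='weasel trainer'),
    Person(name='Douglas', age=42, occupation='writer'),
]

# Parallel-list layout: one list per field, matched up by index
names = [p.name for p in people]
ages = [p.age for p in people]
occupations = [p.occupation for p in people]

# Index i across the parallel lists describes the same person as people[i]
i = 1
same_person = (names[i], ages[i], occupations[i]) == tuple(people[i])

# zip turns the parallel lists back into records
rebuilt = [Person(*fields) for fields in zip(names, ages, occupations)]

print(same_person)        # True
print(rebuilt == people)  # True
```

The main risk with parallel lists is that nothing keeps them in sync: inserting into or sorting one list without the others silently breaks the correspondence.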

The term 'parallel lists' doesn't exist. Maybe you're talking about iterating through two lists in parallel, which means you iterate over both at the same time. For more, read "How can I iterate through two lists in parallel in Python?".

If you're trying to iterate over corresponding items in two or more lists, see itertools.izip (or just use the zip builtin if you're using Python 3).

The only time I've seen this term in use was when I was using Haskell. See here:
http://www.haskell.org/ghc/docs/5.00/set/parallel-list-comprehensions.html
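Essentially the Python equivalent is:
[(x, y) for x, y in zip(range(1, 3), range(1, 3))]
Note that this is not the same as nesting two for clauses, which would produce every combination of x and y. In other words, you can just use zip/izip for this.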
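A quick way to see what zip gives you compared with nested for clauses in a comprehension (the variable names are illustrative):

```python
xs = range(1, 3)
ys = range(1, 3)

# Nested for clauses: every x is combined with every y
nested = [(x, y) for x in xs for y in ys]

# zip: the i-th x is paired with the i-th y
paired = [(x, y) for x, y in zip(xs, ys)]

print(nested)  # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(paired)  # [(1, 1), (2, 2)]
```

Only the zip form walks the two sequences in step, which is what "in parallel" means here.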

It sometimes refers to two lists whose elements are in correspondence. See this SO question.

List Comprehensions: References to the Components

In sum: I need to write a List Comprehension in which i refer to the list that is being created by that List Comprehension.
This might not be something you need to do every day, but i don't think it's unusual either.
Maybe there's no answer here--still, please don't tell me i ought to use a for loop. That might be correct, but it's not helpful. The reason is the problem domain: this line of code is part of an ETL module, so performance is relevant, and so is the need to avoid creating a temporary container--hence my wish to code this step in a L/C. If a for loop would work for me here, i would just code one.
In any event, i am unable to write this particular list comprehension. The reason: the expression i need to write has this form:
[ some_function(s) for s in raw_data if s not in this_list ]
In that pseudo-code, "this_list" refers to the list created by evaluating that list comprehension. And that's why i'm stuck--because this_list isn't built until my list comprehension is evaluated, and because this list isn't yet built by the time i need to refer to it, i don't know how to refer to it.
What i have considered so far (and which might be based on one or more false assumptions, though i don't know exactly where):
doesn't the python interpreter have to give this list-under-construction a name? i think so
that temporary name is probably taken from some bound method used to build my list ('sum'?)
but even if i went to the trouble to find that bound method and assuming that it is indeed the temporary name used by the python interpreter to refer to the list while it is under construction, i am pretty sure you can't refer to bound methods directly; i'm not aware of such an explicit rule, but those methods (at least the few that i've actually looked at) are not valid python syntax. I'm guessing one reason why is so that we do not write them into our code.
so that's the chain of my so-called reasoning, and which has led me to conclude, or at least guess, that i have coded myself into a corner. Still i thought i ought to verify this with the Community before turning around and going a different direction.

There used to be a way to do this using the undocumented fact that while the list was being built its value was stored in a local variable named _[1].__self__. However that quit working in Python 2.7 (maybe earlier, I wasn't paying close attention).
You can do what you want in a single list comprehension if you set up an external data structure first. Since all your pseudo code seemed to be doing with this_list was checking it to see if each s was already in it -- i.e. a membership test -- I've changed it into a set named seen as an optimization (checking for membership in a list can be very slow if the list is large). Here's what I mean:
raw_data = [c for c in 'abcdaebfc']
seen = set()
def some_function(s):
    seen.add(s)
    return s
print([some_function(s) for s in raw_data if s not in seen])
# ['a', 'b', 'c', 'd', 'e', 'f']
If you don't have access to some_function, you could put a call to it in your own wrapper function that added its return value to the seen set before returning it.
Even though it wouldn't be a list comprehension, I'd encapsulate the whole thing in a function to make reuse easier:
def some_function(s):
    # do something with or to 's'...
    return s
def add_unique(function, data):
    result = []
    seen = set(result) # init to empty set
    for s in data:
        if s not in seen:
            t = function(s)
            result.append(t)
            seen.add(t)
    return result
print(add_unique(some_function, raw_data))
# ['a', 'b', 'c', 'd', 'e', 'f']
In either case, I find it odd that the list being built in your pseudo code that you want to reference isn't comprised of a subset of raw_data values, but rather the result of calling some_function on each of them -- i.e. transformed data -- which naturally makes one wonder what some_function does such that its return value might match an existing raw_data item's value.

I don't see why you need to do this in one go. Either iterate through the initial data first to eliminate duplicates - or, even better, convert it to a set as KennyTM suggests - then do your list comprehension.
Note that even if you could reference the "list under construction", your approach would still fail because s is not in the list anyway - the result of some_function(s) is.

As far as I know, there is no way to access a list comprehension as it's being built.
As KennyTM mentioned (and if the order of the entries is not relevant), then you can use a set instead. If you're on Python 2.7/3.1 and above, you even get set comprehensions:
{ some_function(s) for s in raw_data }
Otherwise, a for loop isn't that bad either (although it will scale terribly)
l = []
for s in raw_data:
    item = some_function(s)
    if item not in l:
        l.append(item)

Why don't you simply do: [some_function(s) for s in set(raw_data)]
That should do what you are asking for. Except when you need to preserve the order of the previous list.
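If order does matter, one option (an addition to the answer above, assuming Python 3.7 or later, where dict is guaranteed to preserve insertion order) is to deduplicate with dict.fromkeys instead of set. The some_function below is a hypothetical placeholder:

```python
raw_data = list('abcdaebfc')

def some_function(s):
    # Hypothetical transformation, for illustration only
    return s.upper()

# dict.fromkeys keeps the first occurrence of each key, in input order
unique_in_order = list(dict.fromkeys(raw_data))
result = [some_function(s) for s in unique_in_order]

print(result)  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Like the set version, this deduplicates the inputs, not the outputs of some_function.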
