I have a function that takes as an argument either a list of objects or a single object. I then want to loop through the elements of the list or operate on the single object if it is not a list.
Below, I use numpy.atleast_1d().tolist() to ensure that a loop works whether or not the argument is a list or a single object. However, I am not sure if converting the object to a numpy array and then to a list may cause any unintended changes to the object.
Is there a way to ensure the argument is transformed into a list if it is not a list? I have two possible solutions in a simple example below, but wanted to know if there are any better ones.
import numpy as np
def printer1(x):
    for xi in np.atleast_1d(x).tolist():
        print(xi)
def printer2(x):
    if type(x) != list:
        x = [x]
    for xi in x:
        print(xi)
x1 = 'a'
x2 = ['a','b','c']
printer1(x1)
printer1(x2)
printer2(x1)
printer2(x2)
I'm using Python 2.7
In your function you can add a check for whether the argument is a list. I think this is one way to do it. You don't even need numpy for this.
def foo(x):
    x = [x] if not isinstance(x, list) else x
    print x # or do whatever you want to do
    # or
    for value in x:
        print value
foo('a')
foo(['a','b'])
output:
['a']
a
['a', 'b']
a
b
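The same check can be wrapped in a small reusable helper. This is only a sketch for Python 3: the name `as_list` is hypothetical, and converting tuples with `list()` is an assumption about what the caller wants. Strings are deliberately treated as single values rather than as sequences of characters.

```python
def as_list(x):
    """Return x unchanged if it is a list, otherwise wrap it in a list."""
    if isinstance(x, list):
        return x
    if isinstance(x, tuple):
        # Assumption: a tuple should be treated as a collection of items.
        return list(x)
    # Strings and other single values become a one-element list.
    return [x]


for value in ("a", ["a", "b"], ("a", "b"), 3):
    print(as_list(value))
```

Because the original object is returned or wrapped as-is, no element is converted to another type along the way.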
To ensure that the value is a list even if it has only one element, declare it inside square brackets:
foo = ['stringexample']
foo2 = ['a','b']
for foos in foo:
    print (foos)
for foos2 in foo2:
    print (foos2)
This way, even though 'foo' holds only a single string, it still behaves as a list with one element.
Also, you could try this:
declare an empty list
use youremptylist.extend(incoming value)
This adds the elements of each incoming value to the list, even if there is only one.
As Roni is saying, you can use this:
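def printer(x):
    finalList = []
    finalList.extend(x)
    print finalList
If x is a list, its elements are added to finalList and you can iterate through it. Note that extend needs an iterable: a string such as 'a' is added character by character, and a non-iterable single value such as an integer raises a TypeError.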
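A quick way to check how `list.extend` treats different kinds of input is to try it on each. The sketch below is for Python 3, and `show_extend` is a hypothetical helper used only for this demonstration.

```python
def show_extend(x):
    """Try list.extend(x) and report what happens."""
    result = []
    try:
        result.extend(x)
    except TypeError as exc:
        return "TypeError: {}".format(exc)
    return result


print(show_extend(["a", "b"]))  # the elements are appended one by one
print(show_extend("ab"))        # a string is iterated character by character
print(show_extend(5))           # an int is not iterable, so extend fails
```

So `extend` alone is only safe when every incoming value is already an iterable whose elements you want.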
If you want loopable things mostly untouched and non loopables behave like a 1-element list you could do something like:
def forceiter(x):
    return getattr(x,"__iter__",lambda:(x,))()
Demo:
for x in [1,[2],range(3),"abc",(),{3:3,4:"x"}, np.logspace(0,3,4)]:
    print(x,end=" --> ")
    for i in forceiter(x):
        print(i,end=" ")
    print()
# 1 --> 1
# [2] --> 2
# range(0, 3) --> 0 1 2
# abc --> a b c
# () -->
# {3: 3, 4: 'x'} --> 3 4
# [ 1. 10. 100. 1000.] --> 1.0 10.0 100.0 1000.0
I'm trying to change all 5's into 100's. I know you should use a list comprehension, but why doesn't this work? Can someone explain the theory? Thank you.
d = [5,1,1,1,5]
def f1(seq):
    for i in seq:
        if i==5:
            i = 100
    return seq
print (f1(d))
This line:
i = 100
Rebinds the local variable i, which was bound to a value taken from seq, to the value 100. It does not change seq itself.
To change the value in the sequence, you could do:
for index, item in enumerate(seq):
    if item == 5:
        seq[index] = 100
On each iteration, enumerate yields a pair: the index as a number, and the item itself.
See the docs on lists and (Python 2) enumerate.
You could have also written:
for index in range(len(seq)):
    if seq[index] == 5:
        seq[index] = 100
Which you may prefer, but is sometimes considered less clean.
The Python assignment operator binds a value to a name. Your loop for i in seq binds a value from seq to the local name i on every iteration. i = 100 then binds the value 100 to i. This does not affect the original sequence, and the binding will be changed again in the next iteration of the loop.
You can use enumerate to list the indices along with the values of seq and perform the binding that way:
def f1(seq):
    for n, i in enumerate(seq):
        if i == 5:
            seq[n] = 100
    return seq
Even simpler may be to just iterate over the indices:
def f2(seq):
    for n in range(len(seq)):
        if seq[n] == 5:
            seq[n] = 100
    return seq
The options shown above will modify the sequence in-place. You do not need to return it except for convenience. There are also options for creating a new sequence based on the old one. You can then rebind the variable d to point to the new sequence and drop the old one.
The easiest and probably most Pythonic method would be using a list comprehension:
d = [5, 1, 1, 1, 5]
d = [100 if x == 5 else x for x in d]
You can also use map:
d = list(map(lambda x: 100 if x == 5 else x, d))
In Python 3, map returns a lazy iterator, so I have wrapped it in list to retain the same output type. This would not be necessary in Python 2, where map already returns a list.
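The difference between modifying a list in place and rebinding a name to a new list can be made visible with a second name for the same object. This is a small Python 3 sketch; the name `alias` is introduced here purely for illustration.

```python
d = [5, 1, 1, 1, 5]
alias = d  # a second name for the same list object

# In-place modification: both names see the change.
for n, value in enumerate(d):
    if value == 5:
        d[n] = 100
in_place_shared = alias == [100, 1, 1, 1, 100]
print(in_place_shared)

# Rebinding: d now refers to a new list, alias still refers to the old one.
d = [0 if value == 100 else value for value in d]
print(d)      # the new list
print(alias)  # the old list, untouched by the rebinding
```

This matters when other code holds a reference to the original list: only in-place changes are visible through that reference.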
Take the following example:
def f1(seq):
    for i in seq:
        if i==5:
            i = 100
            # at this point (assuming i was 5), i = 100 but seq is still [3,5,7]
            # because i is just a local name bound to the item; rebinding it does not change the list
    ...
f1([3,5,7])
You could instead loop through the indices and set the value at that index in the list:
d = [5,1,1,1,5]
def f1(seq):
    for i in range(len(seq)):
        if seq[i]==5:
            seq[i] = 100
    return seq
print(f1(d))
# [100, 1, 1, 1, 100]
You should update the element in the list itself, like this:
def f1(seq):
    for i in range(len(seq)): # run through the indexes of the list
        if seq[i]==5: # check whether seq at index i is 5
            seq[i] = 100 # update the list at the same index to 100
    return seq
In the original code, i is a new variable created inside the loop, so assigning to it does not change the element inside the list.
NOTE:
A list is a mutable object, therefore changing seq inside the function will affect the list even outside the function.
You can read more about mutable and immutable objects here
I'm trying to take the values from the previous function and use them in another function. This is my first programming class and language, and I'm totally lost.
I figured out how to take the variables from astlist and put them into the function distance, but now Python is telling me I can't use these variables in an equation because they're in a list now? Is that what it's saying?
I'm also just printing the lists to see if they are running. These are two of my functions, and the functions are both defined in my main function.
I'm taking these lists and eventually putting them into files, but I need to figure out why the equation isn't working first. Thanks!
def readast():
    astlist = []
    for j in range(15):
        list1 = []
        for i in range(3):
            x = random.randint(1,1000)
            y = random.randint(1,1000)
            z = random.randint(1,1000)
            list1.append([x,y,z])
            astlist.append(list1)
    print(astlist)
    return astlist
def distance(astlist):
    distlist = []
    for row in range(len(astlist)):
        x, y, z = astlist[row]
        x1 = x**2
        y2 = y**2
        z2 = z**2
        equation = math.sqrt(x+y+z)
        distlist.append(equation)
    print(distlist)
    return distlist
The variable astlist is a list. You're adding list1 to it several times which is also a list. But you're also adding a list to list1 each time: list1.append([x,y,z]). So ultimately astlist is a list containing multiple lists which each contain a list with three integers.
So when you write x,y,z=astlist[row] the variables x, y and z are actually lists, not integers. This means you're trying to compute x**2 but x is a list, not a number. This is why Python is giving you an error message as ** doesn't support raising a list to a power.
I'm not sure what you're trying to accomplish with all these lists but you should change the code so that you're only trying to raise numbers to the power of two and not lists.
There are a few problems here:
Firstly, the loop at the top of readast() sets list1 to [] 15 times, and I'm not sure what you're trying to do here. If you are trying to generate 15 sets of x,y,z coordinates then it is the second range (the range(3) in your example) that you need to change.
Then you keep adding lists of [x,y,z] to (the same) list1, then adding the whole of list1 to astlist. However, Python stores a reference to the list rather than a copy, so when you add items to list1, they show up wherever list1 is included in another list:
In this example the random numbers are replaced with sequential numbers for clarity (the first random number is 1, the second 2 and so on):
After first cycle of loop:
list1: [[1,2,3]]
astlist: [[[1,2,3]]]
After second cycle of loop:
list1: [[1,2,3],[4,5,6]]
astlist: [[[1,2,3],[4,5,6]],[[1,2,3],[4,5,6]]]
and so on
As you can see, list1 is now a list of lists, and astlist is now a list of duplicates of list1 (a list of lists of lists)
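The trace above can be reproduced with a short Python 3 sketch. The names `inner`, `outer` and `snapshots` are made up for this illustration; the second loop shows one possible way to store independent copies instead.

```python
inner = []
outer = []

for start in (1, 4):
    inner.append([start, start + 1, start + 2])
    outer.append(inner)  # appends a reference to inner, not a copy

print(outer)
print(outer[0] is outer[1])  # both entries are the same list object

# Appending a copy instead freezes the contents at that moment.
snapshots = []
inner = []
for start in (1, 4):
    inner.append([start, start + 1, start + 2])
    snapshots.append(list(inner))  # shallow copy of inner
print(snapshots)
```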
list1 is probably redundant and you probably want just
astlist.append([x,y,z])
in the first bit.
In the second function, you use
for row in range(len(astlist)):
    x,y,z=astlist[row]
    ...
but actually the following would be better:
for row in astlist:
    x,y,z=row
    ...
or even:
for x,y,z in astlist:
    ...
as for loops in Python iterate over members of a sequence (or other iterable value) rather than just being a simple counter. What you are doing with the range(len(astlist)) construct is actually generating the sequence of indices [0,1,2...] and iterating over that.
If you particularly need a numerical index then you can use the enumerate function which returns a series of (index,value) pairs that you can iterate over thus:
for i,value in enumerate(['apple','banana','cherry']):
    print 'value {} is {}'.format(i,value)
value 0 is apple
value 1 is banana
value 2 is cherry
Hope this helps
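Putting these suggestions together, a corrected version might look like the sketch below (Python 3). The names `read_asteroids` and `distances` are hypothetical, and the formula is an assumption: the unused x1, y2 and z2 in the question suggest the intent was the Euclidean distance from the origin, sqrt(x**2 + y**2 + z**2).

```python
import math
import random


def read_asteroids(count=15):
    """Return a flat list of [x, y, z] coordinate triples."""
    asteroids = []
    for _ in range(count):
        x = random.randint(1, 1000)
        y = random.randint(1, 1000)
        z = random.randint(1, 1000)
        asteroids.append([x, y, z])  # one triple per asteroid, no extra nesting
    return asteroids


def distances(asteroids):
    """Assumed intent: Euclidean distance of each point from the origin."""
    return [math.sqrt(x**2 + y**2 + z**2) for x, y, z in asteroids]


asteroids = read_asteroids()
print(distances(asteroids))
print(distances([[3, 4, 12]]))
```

With a flat list of triples, unpacking into x, y, z gives numbers rather than lists, so the arithmetic works.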
I will refer to the specific type error (TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'). Lists work with functions that iterate over them, such as sum(array), but they do not support the power operation: array**2 and pow(array,2) both raise this error. You can solve this with an extra step as follows:
x1 = [j**2 for j in x]
I also recommend using the sum function to add up the squared values; it takes a single iterable:
sum(x1)
That way you apply the power of 2 to each element in the list, getting a new list and avoiding the error message you asked about. It seems to me that you are trying to compute the L2 norm of your data; if so, I think you are missing half of it.
How do I create tuples from two randomly generated lists with a separate function? The zip function only creates a tuple of the two arrays as a whole, but I need the numbers to be paired like (1,2),(3,4). Thanks.
import random
def ars():
    arr1 = []
    arr2 = []
    for i in range(10):
        x = random.randrange(100)
        arr1.append(x)
    for j in range(10):
        y = random.randrange(100)
        arr2.append(y)
    return(arr1,arr2)
x = ars()
print x
y = zip(ars())
print y
zip function accepts multiple iterables as its arguments, so you simply have to unpack the values from the returned tuple with * (splat operator):
y = zip(*ars())
With zip(([1], [2])) only one iterable is submitted (that tuple).
In zip(*([1], [2])) you unpack 2 lists from tuple, so zip receives 2 iterables.
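The difference between the two calls is easy to see with small fixed lists. This sketch is written for Python 3, where zip returns an iterator, hence the list() calls; the values are made up for illustration.

```python
arr1 = [1, 3, 5]
arr2 = [2, 4, 6]
both = (arr1, arr2)  # what a function returning two lists hands back

# One argument: zip sees a single iterable whose items are the two lists.
print(list(zip(both)))
# Unpacked: zip sees two iterables and pairs their elements.
print(list(zip(*both)))
```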
You can avoid having to zip by using map
def ars(arr_len, rand_max):
    return [map(random.randrange, [rand_max]*2) for x in range(arr_len)]
Call ars like ars(10,100). If you really need tuples instead of lists, wrap the map call in tuple().
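On Python 3, map returns an iterator rather than a list, so the list built by the snippet above would hold map objects. A variant that builds the pairs directly might look like this sketch; `random_pairs` is a hypothetical name, not part of the answer above.

```python
import random


def random_pairs(arr_len, rand_max):
    """Return arr_len tuples of two random integers in [0, rand_max)."""
    return [
        (random.randrange(rand_max), random.randrange(rand_max))
        for _ in range(arr_len)
    ]


pairs = random_pairs(10, 100)
print(pairs)
```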
I'm trying to parse a tuple of the form:
a=(1,2)
or
b=((1,2), (3,4)...)
where for a's case the code would be:
x, y = a
and b would be:
for element in b:
    x, y = element
Is there a fast and clean way to accept both forms? This is in a MIDI receive callback
(x is a pointer to a function to run, and y is intensity data to be passed to a light).
# If your input is in in_seq...
if hasattr(in_seq[0], "__iter__"):
    pass # b case
else:
    pass # a case
This basically checks to see if the first element of the input sequence is iterable. If it is, then it's your second case (since a tuple is iterable), if it's not, then it's your first case.
If you know for sure that the inputs will be tuples, then you could use this instead:
if isinstance(in_seq[0], tuple):
    pass # b case
else:
    pass # a case
Depending on what you want to do, your handling for the 'a' case could be as simple as bundling the single tuple inside a larger tuple and then calling the same code on it as the 'b' case, e.g...
b_case = (a_case,)
Edit: as pointed out in the comments, a better version might be...
from collections.abc import Iterable # on Python 2: from collections import Iterable
if isinstance(in_seq[0], Iterable):
    pass # ...
The right way to do that would be:
a = ((1,2),) # note the difference
b = ((1,2), (3,4), ...)
for pointer, intensity in a:
    pass # here you do what you want
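If the sender cannot be changed to always wrap a single pair, the wrapping can be done on the receiving side. The following is a sketch in Python 3 under the assumption that every pair is a tuple and that x itself is never a tuple; `iter_pairs` is a hypothetical helper name.

```python
def iter_pairs(data):
    """Yield (x, y) pairs from either a single pair or a tuple of pairs."""
    if data and not isinstance(data[0], tuple):
        data = (data,)  # wrap the single pair so one loop handles both forms
    for x, y in data:
        yield x, y


print(list(iter_pairs((1, 2))))
print(list(iter_pairs(((1, 2), (3, 4)))))
```

The callback can then use a single `for x, y in iter_pairs(message):` loop for both forms.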
def __init__(self,emps=str(""),l=[">"]):
    self.str=emps
    self.bl=l
def fromFile(self,seqfile):
    opf=open(seqfile,'r')
    s=opf.read()
    opf.close()
    lisst=s.split(">")
    if s[0]==">":
        lisst.pop(0)
    nlist=[]
    for x in lisst:
        splitenter=x.split('\n')
        splitenter.pop(0)
        splitenter.pop()
        splitstring="".join(splitenter)
        nlist.append(splitstring)
    nstr=">".join(nlist)
    nstr=nstr.split()
    nstr="".join(nstr)
    for i in nstr:
        self.bl.append(i)
    self.str=nstr
    return nstr
def getSequence(self):
    print self.str
    print self.bl
    return self.str
def GpCratio(self):
    pgenes=[]
    nGC=[]
    for x in range(len(self.lb)):
        if x==">":
            pgenes.append(x)
    for i in range(len(pgenes)):
        if i!=len(pgenes)-1:
            c=krebscyclus[pgenes[i]:pgenes[i+1]].count('c')+0.000
            g=krebscyclus[pgenes[i]:pgenes[i+1]].count('g')+0.000
            ratio=(c+g)/(len(range(pgenes[i]+1,pgenes[i+1])))
            nGC.append(ratio)
    return nGC
s = Sequence()
s.fromFile('D:\Documents\Bioinformatics\sequenceB.txt')
print 'Sequence:\n', s.getSequence(), '\n'
print "G+C ratio:\n", s.GpCratio(), '\n'
I don't understand why it gives the error:
in GpCratio for x in range(len(self.lb)): AttributeError: Sequence instance has no attribute 'lb'.
When I print the list in getSequence it prints the correct DNA sequence list, but I cannot use the list to search for nucleotides. My university only allows me to input one file and does not allow any arguments other than "self" in the method definitions.
By the way, this is all inside a class called Sequence, but the site would not let me post the class line.
Looks like a typo. You define self.bl in your __init__() routine, then try to access self.lb.
(Also, emps=str("") is redundant - emps="" works just as well.)
But even if you correct that typo, the loop won't work:
for x in range(len(self.bl)): # This iterates over a list like [0, 1, 2, 3, ...]
    if x==">": # This condition will never be True
        pgenes.append(x)
You probably need to do something like
pgenes=[]
for x in self.bl:
    if x==">": # Shouldn't this be != ?
        pgenes.append(x)
which can also be written as a list comprehension:
pgenes = [x for x in self.bl if x==">"]
In Python, you hardly ever need len(x) or for n in range(...); you rather iterate directly over the sequence/iterable.
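If the loop is meant to collect the positions of the separators (an assumption, based on how pgenes is later used for slicing in the question), enumerate gives the index alongside each character. A small Python 3 sketch with a made-up sequence:

```python
sequence = ">ATG>CCGTA>GG"

# enumerate yields (index, character) pairs, so no range(len(...)) is needed.
separator_positions = [i for i, ch in enumerate(sequence) if ch == ">"]
print(separator_positions)
```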
Since your program is incomplete and lacking sample data, I can't run it here to find all its other deficiencies. Perhaps the following can point you in the right direction. Assuming a string that contains the characters ATCG and >:
>>> gene = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"
>>> pgene = ''.join(x for x in gene if x!=">")
>>> pgene
'ATGAATCCGGTAATTGGCATACTGTAGATGATAGGAGGCTAG'
>>> ratio = float(pgene.count("G") + pgene.count("C")) / (pgene.count("A") + pgene.count("T"))
>>> ratio
0.75
If, however, you don't want to look at the entire string but at separate genes (where > is the separator), use something like this:
>>> gene = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"
>>> genes = [g for g in gene.split(">") if g !=""]
>>> genes
['ATGAATCCGGTAATTGGCATACTGTAG', 'ATGATAGGAGGCTAG']
>>> nGC = [float(g.count("G")+g.count("C"))/(g.count("A")+g.count("T")) for g in genes]
>>> nGC
[0.6875, 0.875]
However, if you want to calculate GC content, then of course you don't want (G+C)/(A+T) but (G+C)/(A+T+G+C) --> nGC = [float(g.count("G")+g.count("C"))/len(g) for g in genes].
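The per-gene GC content could be wrapped in a small function, as in this Python 3 sketch. The name `gc_content`, the case-insensitive counting and the 0.0 result for an empty record are assumptions made for illustration.

```python
def gc_content(gene):
    """Fraction of G and C bases in gene, case-insensitive."""
    gene = gene.upper()
    if not gene:
        return 0.0  # assumption: report 0.0 for an empty record
    return (gene.count("G") + gene.count("C")) / len(gene)


record = ">ATGAATCCGGTAATTGGCATACTGTAG>ATGATAGGAGGCTAG"
genes = [g for g in record.split(">") if g]
print([round(gc_content(g), 3) for g in genes])
```

In Python 3 the `/` operator already performs true division, so the float() call used in the Python 2 examples is not needed here.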