Jupyter iPython Notebook and Command Line yield different results - python

I have the following Python 2.7 code:
def average_rows2(mat):
    '''
    INPUT: 2 dimensional list of integers (matrix)
    OUTPUT: list of floats
    Use map to take the average of each row in the matrix and
    return it as a list.
    Example:
    >>> average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]])
    [4.75, 6.25]
    '''
    return map(lambda x: sum(x)/float(len(x)), mat)
When I run it in my browser using iPython notebook, I get the following output:
[4.75, 6.25]
However, when I run the code's file on Command Line (Windows), I get the following error:
>python -m doctest Delete.py
**********************************************************************
File "C:\Delete.py", line 10, in Delete.average_rows2
Failed example:
average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]])
Expected:
[4.75, 6.25]
Got:
<map object at 0x00000228FE78A898>
**********************************************************************
Why does the command line toss an error? Is there a better way to structure my function?

It seems like your command line is running Python 3. The builtin map returns a list in Python 2, but an iterator (a map object) in Python 3. To turn the latter into a list, apply the list constructor to it:
# Python 2
average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]]) == [4.75, 6.25]
# => True
# Python 3
list(average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]])) == [4.75, 6.25]
# => True
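As for structuring the function better, one option is to build the list inside the function so that callers and the doctest see the same result on either Python version. This is a minimal sketch, assuming that returning a real list is acceptable for your use case:

```python
def average_rows2(mat):
    """Return the average of each row of a 2-D list as a list of floats.

    >>> average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]])
    [4.75, 6.25]
    """
    # list(...) forces the map iterator on Python 3 and is harmless on Python 2.
    return list(map(lambda row: sum(row) / float(len(row)), mat))


print(average_rows2([[4, 5, 2, 8], [3, 9, 6, 7]]))  # [4.75, 6.25]
```

A list comprehension, [sum(row) / float(len(row)) for row in mat], would behave the same way if the exercise did not ask for map.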

Related

How do I fix this syntax error with list slicing

I am trying to use list slicing to rotate the values in a list, but I can't figure out why my brackets are not closing. The issue in question is on line 3. It throws an invalid syntax error saying that "[" is not closed.
The code is below:
def rotate_list(data, amount):
    result = []
    result.append(data = [amount:])
    return result
print(rotate_list([1,2,3,4,5,6,7,8,9],5)) # [5, 6, 7, 8, 9, 1, 2, 3, 4]
print(rotate_list([1,2,3,4,5,6,7,8,9],9)) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
[amount:] is not syntactically correct on its own, because a slice has to be applied to the list being sliced, as in data[amount:]. Try rewriting your function as
def rotate_list(data, amount):
    return data[amount:] + data[:amount]
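Note that data[amount:] + data[:amount] rotates to the left, so for an amount of 5 it returns [6, 7, 8, 9, 1, 2, 3, 4, 5]. The expected outputs in the question's comments correspond to a rotation to the right instead. A minimal sketch of that variant, assuming a right rotation is what is intended (the name rotate_right and the modulo handling are illustrative additions, not part of the original answer):

```python
def rotate_right(data, amount):
    """Return a new list with the last `amount` items moved to the front."""
    if not data:
        return []
    # Wrap amounts larger than the list length; k == 0 returns a plain copy.
    k = amount % len(data)
    return data[-k:] + data[:-k]


print(rotate_right([1, 2, 3, 4, 5, 6, 7, 8, 9], 5))  # [5, 6, 7, 8, 9, 1, 2, 3, 4]
print(rotate_right([1, 2, 3, 4, 5, 6, 7, 8, 9], 9))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```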

Errors while importing Operator (Python)

I am a little confused after a couple of failed attempts to import operator. Along with a couple of examples, I've shared a Python docs link for reference below.
What I expect to happen below is that operator.mul will multiply 3 * 4 from the data list, so the answer starts [3, 12, ...], and then multiply 12 by the next element, 6, to give [3, 12, 72, ...]. However, importing operator here isn't working as expected.
The Output I'm expecting for this problem is:
[3, 12, 72, 144, 144, 1296, 0, 0, 0, 0]
Running the below code in PythonTutor.com gives me an Error:
ImportError: cannot import name 'operator'
from itertools import operator
data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
list(accumulate(data, operator.mul))
I've gotten the same type of error running this in Jupyter notebook:
ImportError Traceback (most recent call last)
<ipython-input-1-bc61652bebb8> in <module>
----> 1 from itertools import operator
2
3 data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
4 list(accumulate(data, operator.mul))
ImportError: cannot import name 'operator' from 'itertools' (unknown location)
I've spell-checked about 100 times and I've run this on both PythonTutor and Jupyter Notebook, and both give me errors. Could this be an issue with itertools?
Below is from The Python Docs. I'm using the first case:
operator.mul(a, b)
I'll share for your reference: Here
----> operator.mul(a, b)
operator.__mul__(a, b)
Return a * b, for a and b numbers.
Why isn't this working, and how can I fix it?
operator is its own module, not part of itertools:
import itertools
import operator
Note that itertools.accumulate doesn't modify the iterable it is given. It returns a new object which you are not using above. Consider assigning it to a new variable:
data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
accumulated_list = list(itertools.accumulate(data, operator.mul))
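If you prefer to keep the bare accumulate(...) call from the question, the same fix can be sketched as follows. The variable name running_product is only illustrative:

```python
from itertools import accumulate  # accumulate lives in itertools
import operator                   # operator is a separate standard-library module

data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
# Each element is the product of all items seen so far.
running_product = list(accumulate(data, operator.mul))
print(running_product)  # [3, 12, 72, 144, 144, 1296, 0, 0, 0, 0]
```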

How to (log) transform *args arguments without losing structure

I am attempting to apply statistical tests to some datasets with variable numbers of groups. This causes a problem when I try to perform a log transformation for said groups while maintaining the ability to perform the test function (in this case scipy's kruskal()), which takes a variable number of arguments, one for each group of data.
The code below is an idea of what I want. Naturally stats.kruskal([np.log(i) for i in args]) does not work, as kruskal() does not expect a list of arrays, but one argument for each array. How do I perform log transformation (or any kind of alteration, really), while still being able to use the function?
import scipy.stats as stats
import numpy as np
def t(*args):
    test = stats.kruskal([np.log(i) for i in args])
    return test
a = [11, 12, 4, 42, 12, 1, 21, 12, 6]
b = [1, 12, 4, 3, 14, 8, 8, 6]
c = [2, 2, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8]
print(t(a, b, c))
If I understand correctly, putting * in front of the list you build when calling kruskal should do the trick:
test = stats.kruskal(*[np.log(i) for i in args])
The asterisk unpacks the list and passes each entry as a separate positional argument to the function being called, which is kruskal here.
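To see the difference the asterisk makes without needing scipy installed, here is a small sketch using only the standard library. The function group_sizes is a hypothetical stand-in for a variadic function such as kruskal(), and math.log stands in for np.log:

```python
import math


def group_sizes(*groups):
    # Hypothetical stand-in for kruskal(): reports how many groups it received
    # and how long each one is.
    return [len(g) for g in groups]


def transformed_sizes(*args):
    # Transform every group, keeping one list per group.
    logged = [[math.log(x) for x in group] for group in args]
    # The * passes each transformed group as its own argument.
    return group_sizes(*logged)


a = [11, 12, 4]
b = [1, 12]
print(transformed_sizes(a, b))  # [3, 2]: two groups, as intended
print(group_sizes([a, b]))      # [2]: without *, one argument holding both groups
```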

Dataframe with fixed length (over writing)

I wrote code that generates a large amount of data in each round, so I only need to store the data for the last 10 rounds. How can I create a dataframe that erases the oldest object when I add a new object (overwriting)? The order of observations, from old to new, should be maintained. Is there any simple function or data format to do this?
Thanks in advance!
You could use this function:
def ins(arr, item):
    if len(arr) < 10:
        arr.insert(0, item)
    else:
        arr.pop()
        arr.insert(0, item)
ex = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ins(ex, 'a')
print(ex)
# ['a', 1, 2, 3, 4, 5, 6, 7, 8, 9]
ins(ex, 'b')
print(ex)
# ['b', 'a', 1, 2, 3, 4, 5, 6, 7, 8]
For this to work you MUST pass a list as the argument to ins(), so that the new item is inserted at the front and the 10th item is removed (if there is one).
(I assumed that the question is not pandas-specific, but rather about a way to store a maximum number of items in an array.)
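If a plain Python container is enough, the standard library's collections.deque with maxlen handles this bookkeeping itself and keeps the items ordered from oldest to newest. This is a minimal sketch, assuming each round's data can be stored as a single item; the names last_rounds and round_number are illustrative:

```python
from collections import deque

# A deque with maxlen=10 silently drops the oldest item when an 11th is appended.
last_rounds = deque(maxlen=10)

for round_number in range(1, 13):  # rounds 1 through 12
    last_rounds.append(round_number)

print(list(last_rounds))  # [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```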

Piping a pipe-delimited flat file into python for use in Pandas and Stats

I have searched a lot, but haven't found an answer to this.
I am trying to pipe in a flat file of data and put it into something Python can read and that I can analyze (for instance, to perform a t-test).
First, I created a simple pipe delimited flat file:
1|2
3|4
4|5
1|6
2|7
3|8
8|9
and saved it as "simpledata".
Then I created a bash script in nano as
#!/usr/bin/env python
import sys
from scipy import stats
A = sys.stdin.read()
print A
paired_sample = stats.ttest_rel(A[:,0],A[:,1])
print "The t-statistic is %.3f and the p-value is %.3f." % paired_sample
Then I save the script as pairedttest.sh and run it as
cat simpledata | pairedttest.sh
The error I get is
TypeError: string indices must be integers, not tuple
Thanks for your help in advance
Are you trying to call this?
paired_sample = stats.ttest_rel([1,3,4,1,2,3,8], [2,4,5,6,7,8,9])
If so, it can't work as written. A is just a string when you read it from stdin, so you can't index it with A[:,0] and A[:,1]. You need to build the two lists from the string. The most obvious way is like this:
left = []
right = []
for line in A.splitlines():
    l, r = line.split("|")
    left.append(int(l))
    right.append(int(r))
print left
print right
This will output:
[1, 3, 4, 1, 2, 3, 8]
[2, 4, 5, 6, 7, 8, 9]
So you can call stats.ttest_rel(left, right)
Or to be really clever and make a (nearly impossible to read) one-liner out of it:
z = zip(*[map(int, line.split("|")) for line in A.splitlines()])
This will output:
[(1, 3, 4, 1, 2, 3, 8), (2, 4, 5, 6, 7, 8, 9)]
So you can call stats.ttest_rel(*z)
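The snippets above are Python 2, where zip and map return lists. On Python 3 they return iterators, so a sketch along the following lines may be easier to follow there. It assumes the text has already been read (in the script it would come from sys.stdin.read()); parse_columns and sample are hypothetical names, and the scipy call is left out so the block runs with the standard library alone:

```python
def parse_columns(text):
    """Split pipe-delimited lines into two lists of integers."""
    # Skip blank lines such as a trailing newline at the end of the file.
    rows = [line.split("|") for line in text.splitlines() if line.strip()]
    left = [int(l) for l, _ in rows]
    right = [int(r) for _, r in rows]
    return left, right


# Stand-in for the contents of "simpledata" as read from stdin.
sample = "1|2\n3|4\n4|5\n1|6\n2|7\n3|8\n8|9\n"
left, right = parse_columns(sample)
print(left)   # [1, 3, 4, 1, 2, 3, 8]
print(right)  # [2, 4, 5, 6, 7, 8, 9]
```

The two lists can then be passed to stats.ttest_rel(left, right) as described above.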
