Making a histogram in Python

I have been trying to make a histogram using the data given in survey below.
#represents the "information" (the summarized data)
ranking = [0,0,0,0,0,0,0,0,0,0,0]
survey = [1,5,3,4,1,1,1,2,1,2,1,3,4,5,1,7]
for i in range(len(survey)):
    ranking[survey[i]]+=1
#create histogram
print("\nCreating a histogram from values: ")
print("%3s %5s %7s"%("Element", "Value", "Histogram"))
for i in range(len(ranking)):
    print("%7d %5d %-s"%(i+1, ranking[i+1], "*" * ranking[i+1]))
Here is exactly what the shell displays when I run my code:
Creating a histogram from values:
Element Value Histogram
1 7 *******
2 2 **
3 2 **
4 2 **
5 2 **
6 0
7 1 *
8 0
9 0
10 0
Traceback (most recent call last):
File "C:\folder\file23.py", line 17, in <module>
print("%7d %5d %-s"%(i+1, ranking[i+1], "*" * ranking[i+1]))
IndexError: list index out of range
My expected output is the same as above, just without the traceback.
The shell displays the right values; I'm just unsure what the error message means. How can I fix this?

When i reaches its highest value, len(ranking) - 1, your use of ranking[i+1] is clearly "out of range"! Use range(len(ranking) - 1) in the for loop to avoid the error.
The counting can be simplified, too:
import collections
ranking = collections.Counter(survey)
for i in range(min(ranking), max(ranking)+1):
    print("%7d %5d %-s"%(i, ranking[i], "*" * ranking[i]))
Here you need min and max because a Counter is mapping-like, not sequence-like: iterating over it yields its keys (the survey values), and a missing key simply counts as 0. It will still work fine (and you can use range(0, max(ranking)+1) if you prefer).

Your index i is incremented up to len(ranking)-1, which is the last valid index in ranking, but you're trying to access ranking[i+1], hence the IndexError.
Fix:
for i in range(len(ranking)-1):
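Putting the fix together, here is a minimal sketch of the corrected histogram. It assumes the survey values lie between 1 and 10; histogram_rows and max_value are illustrative names, not part of the original code.

```python
survey = [1, 5, 3, 4, 1, 1, 1, 2, 1, 2, 1, 3, 4, 5, 1, 7]

def histogram_rows(values, max_value=10):
    """Return (element, count, bar) tuples for elements 1..max_value."""
    # Index 0 is unused so that counts[v] lines up with survey value v.
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1
    # Loop over the valid indices 1..max_value only, so nothing is out of range.
    return [(i, counts[i], "*" * counts[i]) for i in range(1, max_value + 1)]

print("%7s %5s %s" % ("Element", "Value", "Histogram"))
for element, count, bar in histogram_rows(survey):
    print("%7d %5d %-s" % (element, count, bar))
```

Iterating directly over the indices that exist (instead of indexing with i+1) avoids the off-by-one altogether.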

Related

Data analysis - MD analysis Python

I am seeing this error, need help on this!
warnings.warn("Failed to guess the mass for the following atom types: {}".format(atom_type))
Traceback (most recent call last):
File "traj_residue-z.py", line 48, in <module>
protein_z=protein.centroid()[2]
IndexError: index 2 is out of bounds for axis 0 with size 0

The problem was solved through a discussion in the mailing list thread https://groups.google.com/g/mdnalysis-discussion/c/J8oJ0M9Rjb4/m/kSD2jURODQAJ
In brief: The traj_residue-z.py script contained the line
protein=u.select_atoms('resid 1-%d' % (nprotein_res))
It turned out that the selection 'resid 1-%d' % (nprotein_res) would not select anything because the input GRO file started with resid 1327
1327LEU N 1 2.013 3.349 8.848 0.4933 -0.2510 0.2982
1327LEU H1 2 1.953 3.277 8.893 0.0174 0.1791 0.3637
1327LEU H2 3 1.960 3.377 8.762 0.6275 -0.5669 0.1094
...
and hence the selection of resids starting at 1 did not match anything. This produced an empty AtomGroup protein.
The subsequent centroid calculation
protein_z=protein.centroid()[2]
failed because for an empty AtomGroup, protein.centroid() returns an empty array and so trying to get the element at index 2 raises IndexError.
The solution (thanks to @IAlibay) was to either
change the selection string 'resid 1-%d' to accommodate the start and stop resids, or
select just the first nprotein_res residues by slicing the ResidueGroup: protein = u.residues[:nprotein_res].atoms
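As a rough illustration of the first option, the selection string can be built from the first resid actually present in the file. The helper name resid_range_selection and the value 10 for nprotein_res are assumptions made for this sketch; only the 'resid START-STOP' form and the starting resid 1327 come from the discussion above.

```python
def resid_range_selection(first_resid, n_residues):
    """Build a 'resid START-STOP' selection string covering n_residues residues."""
    last_resid = first_resid + n_residues - 1
    return "resid %d-%d" % (first_resid, last_resid)

# The GRO file above starts at resid 1327; nprotein_res = 10 is a made-up value.
selection = resid_range_selection(1327, 10)
print(selection)  # resid 1327-1336

# In the script this string would replace 'resid 1-%d' % (nprotein_res), e.g.
# protein = u.select_atoms(selection)
```

Checking that the resulting AtomGroup is not empty before calling centroid() would also turn the confusing IndexError into an obvious failure.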

Using If else statements in constraints for MIP solver OR-tools in Python

I'm using the MIP solver of OR-Tools for Python and I have stumbled on a problem declaring a constraint. The constraint in question, which is represented by the image below, is about the proportion between male and female animals (a):
So far I tried two ways to do this:
for t in times:
    solver.Add(
        solver.Sum(
            if a in animais_male:
                [n_animals[(a,t)]*sell[(a, t)]
            elif a in animais_female:
                -proportion_max*n_animals[(a,t)]*sell[(a, t)]
            for a in animais
            ]) <= 0)
This one returned this error:
File "<ipython-input-20-95a1db1c4418>", line 5
if a in animals_male:
^
SyntaxError: invalid syntax
And the other solution I tried was:
for t in dias_considerados:
    solver.Add(
        solver.Sum(
            [n_animals[(a,t)]*sell[(a, t)]
            for a in animals_male
            -proportion_max*n_animals[(a,t)]*sell[(a, t)]
            for a in animals_female
            ]) <= 0)
And I got this error:
TypeError Traceback (most recent call last)
<ipython-input-21-2dfedfd19682> in <module>()
4 [n_animals[(a,t)]*sell[(a, t)]
5 for a in animals_male
----> 6 -proportion_max*n_animals[(a,t)]*sell[(a, t)]
7 for a in animals_female
8 ]) <= 0)
TypeError: 'SumArray' object is not iterable
n_animals[(a,t)] is the number of animals, represented as n_(a,t) in the image and sell[(a,t)] is the I_(a,t) binary decision variable in the image.
If anyone could help, I would appreciate it! If I need to explain my situation better, I am available to do so.
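For what it's worth, the first attempt fails because if/elif statements cannot appear inside a list; a conditional expression (x if cond else y) can. The sketch below shows that pattern with plain numbers standing in for the solver variables. All the data values are invented for illustration, and signed_terms is a hypothetical helper; with the real solver, the resulting list would presumably be passed to solver.Sum(...) as in the question.

```python
# Made-up stand-in data; in the real model these would be solver variables/parameters.
animals_male = ["m1", "m2"]
animals_female = ["f1", "f2"]
animals = animals_male + animals_female
proportion_max = 2
n_animals = {("m1", 0): 3, ("m2", 0): 1, ("f1", 0): 2, ("f2", 0): 4}
sell = {("m1", 0): 1, ("m2", 0): 0, ("f1", 0): 1, ("f2", 0): 1}

def signed_terms(t):
    """One term per animal: +n*sell for males, -proportion_max*n*sell for females."""
    return [
        (1 if a in animals_male else -proportion_max) * n_animals[(a, t)] * sell[(a, t)]
        for a in animals
    ]

lhs = sum(signed_terms(0))
print(lhs, lhs <= 0)
```

An alternative with the same meaning is to build two separate lists (one comprehension over animals_male, one over animals_female) and concatenate them with + before summing.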

How to calculate the verification digit of the Tax ID in the country of Paraguay (calcular digito verificador del RUC)

In the country of Paraguay (South America) each taxpayer has a Tax ID (called RUC: Registro Único del Contribuyente) assigned by the government (Ministerio de Hacienda, Secretaría de Tributación).
This RUC is a number followed by a verification digit (dígito verificador), for example 123456-0. The government tells you the verification digit when you request your RUC.
Is there a way for me to calculate the verification digit based on the RUC? Is it a known formula?
In my case, I have a database of suppliers and customers, collected over the years by several employees of the company.
Now I need to run checks to see if all the RUCs were entered correctly or if there are typing mistakes.
My preference would be a Python solution, but I'll take whatever solutions I get to point me in the right direction.
Edit: This is a self-answer to share knowledge that took me hours/days to find. I marked this question as "answer your own question" (don't know if that changes anything).

The verification digit of the RUC is calculated using a formula very similar (but not identical) to a method called Modulo 11; that is at least the information I got from reading the following tech sites (content is in Spanish):
https://www.yoelprogramador.com/funncion-para-calcular-el-digito-verificador-del-ruc/
http://groovypy.wikidot.com/blog:02
https://es.wikipedia.org/wiki/C%C3%B3digo_de_control#M.C3.B3dulo_11
I analyzed the solutions provided in the mentioned pages and ran my own tests against a list of RUCs and their known verification digits, which led me to a final formula that returns the expected output, but which is DIFFERENT from the solutions in the mentioned links.
The final formula I got to calculate the verification digit of the RUC is shown in this example (80009735-1):
Multiply each digit of the RUC (without considering the verification digit) by a factor based on the position of the digit within the RUC (starting from the right side of the RUC) and sum all the results of these multiplications:
RUC:              8        0        0        0        9        7        3        5
Position:         7        6        5        4        3        2        1        0
Multiplications:  8x(7+2)  0x(6+2)  0x(5+2)  0x(4+2)  9x(3+2)  7x(2+2)  3x(1+2)  5x(0+2)
Results:          72       0        0        0        45       28       9        10
Sum of results:   164
Divide the sum by 11 and use the remainder of the division to determine the verification digit:
If the remainder is greater than 1, then the verification digit is 11 - remainder
If the remainder is 0 or 1, then the verification digit is 0
In our example:
Sum of results: 164
Division: 164 / 11 ==> quotient 14, remainder 10
Verification digit: 11 - 10 ==> 1
Here is my Python version of the formula:
def calculate_dv_of_ruc(input_str):
    # assure that we have a string
    if not isinstance(input_str, str):
        input_str = str(input_str)
    # try to convert to 'int' to validate that it contains only digits.
    # I suspect that this is faster than checking each char independently
    int(input_str)
    the_sum = 0
    for i, c in enumerate(reversed(input_str)):
        the_sum += (i + 2) * int(c)
    base = 11
    _, rem = divmod(the_sum, base)
    if rem > 1:
        dv = base - rem
    else:
        dv = 0
    return dv
Testing this function shows that it returns the expected results and raises an error when the input contains characters other than digits:
>>> calculate_dv_of_ruc(80009735)
1
>>> calculate_dv_of_ruc('80009735')
1
>>> calculate_dv_of_ruc('80009735A')
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "<input>", line 8, in calculate_dv_of_ruc
ValueError: invalid literal for int() with base 10: '80009735A'
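For the database check described in the question, a small wrapper can compare a stored "number-digit" string against the computed digit. This is only a sketch: it assumes RUCs are stored in the form 80009735-1, and the names ruc_check_digit and is_valid_ruc are made up here; the arithmetic is the same as in the function above.

```python
def ruc_check_digit(number):
    """Compute the verification digit using the formula described above."""
    digits = str(number)
    total = sum((i + 2) * int(c) for i, c in enumerate(reversed(digits)))
    rem = total % 11
    return 11 - rem if rem > 1 else 0

def is_valid_ruc(ruc):
    """Return True if a 'NUMBER-DV' string carries the matching verification digit."""
    number, sep, dv = ruc.strip().partition("-")
    # Reject entries without a dash, with non-digit characters, or with a multi-digit DV.
    if not sep or not number.isdecimal() or not dv.isdecimal() or len(dv) != 1:
        return False
    return ruc_check_digit(number) == int(dv)

for entry in ["80009735-1", "80009735-2", "8000973A-1"]:
    print(entry, is_valid_ruc(entry))
```

Returning False for malformed entries, rather than raising, makes it easy to list every suspicious row in one pass.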

Rolling apply function must be real number, not Nonetype

I'm trying to use rolling and apply to print each window,
but I get an error that says:
File "pandas/_libs/window.pyx", line 1649, in pandas._libs.window.roll_generic
TypeError: must be real number, not NoneType
My code is as follows:
def print_window(window):
    print(window)
    print('==================')
def example():
    df = pd.read_csv('window_example.csv')
    df.rolling(5).apply(print_window)
My data is like
number  sum  mean
1       1    1
2       3    1.5
3       6    2
4       10   2.5
5       15   3
6       20   4
How should I solve this error?
I didn't find similar questions on this error
Thanks !

This behavior appeared in pandas 1.0.0. The function passed to apply is now expected to return a single number for each window, which becomes the value of the corresponding column in the result. Your print_window returns None, hence the error.
https://pandas.pydata.org/pandas-docs/version/1.0.0/reference/api/pandas.core.window.rolling.Rolling.apply.html#pandas.core.window.rolling.Rolling.apply
A workaround for your code would be :
import pandas as pd
def print_window(window):
    print(window)
    print('==================')
    return 0  # apply needs a number back for each window
def example():
    df = pd.read_csv('window_example.csv')
    df.rolling(5).apply(print_window)

How can I create a decremented numberPyramid(num) in Python?

I'm trying to create a pyramid that looks like the picture below (numberPyramid(6)), where the pyramid itself isn't made of numbers but is a blank space with the numbers around it. The function takes a parameter called "num", which is the number of rows in the pyramid. How would I go about doing this? I need to use a for loop, but I'm not sure how to implement it. Thanks!
666666666666
55555  55555
4444    4444
333      333
22        22
1          1

def pyramid(num_rows, block=' ', left='', right=''):
    for idx in range(num_rows):
        print('{py_layer:{num_fill}{align}{width}}'.format(
            py_layer='{left}{blocks}{right}'.format(
                left=left,
                blocks=block * (idx*2),
                right=right),
            num_fill=format((num_rows - idx) % 16, 'x'),
            align='^',
            width=num_rows * 2))
This works by using Python's string format method in an interesting way: the block of spaces is the string being printed, centered in a field of width num_rows * 2, and the row's number is the fill character that pads the rest of the row.
Using the built-in format() function with the 'x' format code yields the hex digit without the leading 0x, which lets you build pyramids of up to 15 rows.
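To see the fill-and-center mechanism in isolation, here is a small sketch. The helper fill_row is a made-up name for illustration; the format specification itself (fill character, '^' for centering, then the width) is standard Python.

```python
def fill_row(inner, fill, width):
    """Center `inner` in a field of `width` characters, padding with `fill`."""
    return "{:{fill}^{width}}".format(inner, fill=fill, width=width)

print(fill_row("  ", "5", 12))   # 55555  55555
print(format("ab", "7^12"))      # 77777ab77777
print(format(10, "x"))           # a (hex digit, no 0x prefix)
```

Because the fill must be a single character, row numbers above 9 need the hex trick to stay one character wide.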
Sample:
In [45]: pyramid(9)
999999999999999999
88888888  88888888
7777777    7777777
666666      666666
55555        55555
4444          4444
333            333
22              22
1                1
Other pyramid "blocks" could be interesting:
In [52]: pyramid(9, '_')
999999999999999999
88888888__88888888
7777777____7777777
666666______666666
55555________55555
4444__________4444
333____________333
22______________22
1________________1
With the added left and right options and showing hex support:
In [57]: pyramid(15, '_', '/', '\\')
ffffffffffffff/\ffffffffffffff
eeeeeeeeeeeee/__\eeeeeeeeeeeee
dddddddddddd/____\dddddddddddd
ccccccccccc/______\ccccccccccc
bbbbbbbbbb/________\bbbbbbbbbb
aaaaaaaaa/__________\aaaaaaaaa
99999999/____________\99999999
8888888/______________\8888888
777777/________________\777777
66666/__________________\66666
5555/____________________\5555
444/______________________\444
33/________________________\33
2/__________________________\2
/____________________________\

First the code:
max_depth = int(input("Enter max depth of pyramid (2 - 9): "))
for i in range(max_depth, 0, -1):
    print(str(i)*i + " "*((max_depth-i)*2) + str(i)*i)
Output:
(numpyramid)macbook:numpyramid joeyoung$ python numpyramid.py
Enter max depth of pyramid (2 - 9): 6
666666666666
55555  55555
4444    4444
333      333
22        22
1          1
How this works:
Python has a built-in function named range() which can help you build the iterator for your for-loop. You can make it decrement instead of increment by passing in -1 as the 3rd argument.
Our for loop will start at the user supplied max_depth (6 for our example) and i will decrement by 1 for each iteration of the loop.
Now the output line should do the following:
Print out the current iterator number (i), repeated i times.
Figure out how much white space to add in the middle. This will be max_depth minus the current iterator number, multiplied by 2, because the gap grows by two characters on each iteration.
Attach the whitespace to the first set of repeated numbers.
Attach a second set of repeated numbers: the current iterator number (i) repeated i times.
When you print a string, you can repeat it by following it with an asterisk (*) and the number of times you want it repeated.
For example:
>>> # Repeats the character 'A' 5 times
... print("A"*5)
AAAAA
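Since the question asks for a function that takes num, the same loop can be wrapped up as below. This is a sketch under the assumption that num is between 1 and 9 (two-digit numbers would break the alignment); number_pyramid is an illustrative name, and returning the rows as a list is a choice made here to keep the function easy to check.

```python
def number_pyramid(num):
    """Return the rows of the pyramid, from num down to 1."""
    rows = []
    for i in range(num, 0, -1):
        side = str(i) * i                 # the digit i, repeated i times
        gap = " " * ((num - i) * 2)       # blank space grows by 2 per row
        rows.append(side + gap + side)
    return rows

print("\n".join(number_pyramid(6)))
```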
