Counting the number of rows filled with tuples - python

As part of parsing a PDB file, I've extracted a set of coordinates (x, y, z) for particular atoms that I want to exist as floats. However, I also need to know how many sets of coordinates I have extracted.
Below is my code through the coordinate extraction, and what I thought would produce the count of how many sets of three coordinates I've extracted.
When using len(coordinates), I unfortunately get back 3 for every set of coordinates, i.e. the number of values in each tuple (the x, y, and z coordinates).
Any insight into how to properly count the number of sets would be helpful. I'm quite new to Python and am still at the stage of being unsure whether I am even asking this correctly!
from sys import argv
with open(argv[1]) as pbd:
    print()
    for line in pbd:
        if line[:4] == 'ATOM':
            atom_type = line[13:16]
            if atom_type == "CA" or "N" or "C":
                x = float(line[31:38])
                y = float(line[39:46])
                z = float(line[47:54])
                coordinates = (x, y, z)
                # printing (coordinates) gives
                # (36.886, 53.177, 21.887)
                # (38.323, 52.817, 21.996)
                # (38.493, 51.553, 22.83)
                # (37.73, 51.314, 23.77)
                print(len(coordinates))
                # printing len(coordinates)) gives
                # 3
                # 3
                # 3
                # 3
Thank you for any insight!

If you want to count the number of specific atoms in your file, try this:
from sys import argv
with open(argv[1]) as pbd:
    print()
    atomCount = 0
    for line in pbd:
        if line[:4] == 'ATOM':
            # strip the column padding so "CA " compares equal to "CA"
            atom_type = line[13:16].strip()
            # atom_type == "CA" or "N" or "C" is always true; test membership instead
            if atom_type in ("CA", "N", "C"):
                atomCount += 1
    print(atomCount)
Basically, you traverse the whole PDB file and check the type of each atom (taken from line[13:16] in your code). Each time you encounter one of your desired atom types, you increase a counter variable by 1.

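Your coordinates variable is a single tuple that gets overwritten on every pass through the loop, so len(coordinates) is always 3. Tuples are ordered and immutable; it is better to append each set of coordinates to a list and take the length of the list afterwards.
coordinates=[]
for ....:
    coordinates.append([x,y,z])
len(coordinates) # should be 4 I guess.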
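Putting the two answers together, a minimal sketch might look like the following. It assumes the same column slices as the question's code; the function name collect_coordinates and its wanted parameter are hypothetical, introduced only for illustration.

```python
def collect_coordinates(lines, wanted=("N", "CA", "C")):
    """Return a list of (x, y, z) float tuples for the wanted atom names."""
    coordinates = []
    for line in lines:
        if line[:4] != "ATOM":
            continue
        # Same slices as in the question; strip() removes the column padding.
        atom_type = line[13:16].strip()
        if atom_type in wanted:
            x = float(line[31:38])
            y = float(line[39:46])
            z = float(line[47:54])
            coordinates.append((x, y, z))
    return coordinates
```

Called on the open file, for example coords = collect_coordinates(pbd), len(coords) is then the number of coordinate sets, while len(coords[0]) is still 3.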

Related

Program to draw line when y coordinates match or z coordinates

I am working with IronPython in the Revit application.
Below is the code I was trying in Python. Help would be appreciated.
From the list of points, there is a first point and a second point. I have created functions for them.
The script should check whether the y coordinates are the same and draw a line if so.
It's not working and returns an unexpected error: a new line error.
# The inputs to this node will be stored as a list in the IN variables.
points = IN[0]
# Place your code below this line
lines = []
def fp(x)
    firstpoint = points[x]
    return firstpoint
def sp(x)
    secondpoint = points[x+1]
    return secondpoint
x = 0
while x <= points.Count:
    if (fp(x).Y == sp(x).Y) or (fp(x).Z == sp(x).Z):
        setlines = Line.ByStartPointEndPoint(fp(x), sp(x))
        lines.append(setlines)
    x = x + 1
# Assign your output to the OUT variable.
OUT = lines
As @itprorh66 points out, there's really not enough info here to definitively answer your question, but one issue is that you're comparing what I assume are floats for exact equality.
fp(x).Y == sp(x).Y
Instead of comparing for direct equality, you'll need to compare for equality within a tolerance. There is some discussion of how to do that in "What is the best way to compare floats for almost-equality in Python?"
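As a minimal sketch of such a tolerance check: the helper name nearly_equal and the default tolerance of 1e-6 are assumptions made for illustration, and a suitable tolerance depends on the units of your model. math.isclose does the same job from Python 3.5 on; a hand-written check like this one also runs on older interpreters.

```python
def nearly_equal(a, b, tol=1e-6):
    """Return True when a and b differ by at most tol (an absolute tolerance)."""
    return abs(a - b) <= tol

# Exact comparison fails for values that differ only by rounding error.
exact = (0.1 + 0.2 == 0.3)
tolerant = nearly_equal(0.1 + 0.2, 0.3)
```

In the loop from the question, the condition would then read something like nearly_equal(fp(x).Y, sp(x).Y) or nearly_equal(fp(x).Z, sp(x).Z).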

How do you make a list of numpy.float64?

I am using Python. I computed these numpy.float64 values, which show the Chicago Cubs' average wins by decade.
yr1874to1880 = np.mean(wonArray[137:143])
yr1881to1890 = np.mean(wonArray[127:136])
yr1891to1900 = np.mean(wonArray[117:126])
yr1901to1910 = np.mean(wonArray[107:116])
yr1911to1920 = np.mean(wonArray[97:106])
yr1921to1930 = np.mean(wonArray[87:96])
yr1931to1940 = np.mean(wonArray[77:86])
yr1941to1950 = np.mean(wonArray[67:76])
yr1951to1960 = np.mean(wonArray[57:66])
yr1961to1970 = np.mean(wonArray[47:56])
yr1971to1980 = np.mean(wonArray[37:46])
yr1981to1990 = np.mean(wonArray[27:36])
yr1991to2000 = np.mean(wonArray[17:26])
yr2001to2010 = np.mean(wonArray[7:16])
yr2011to2016 = np.mean(wonArray[0:6])
I want to put them together, but I don't know how. I tried to make a list, but it did not work. Does anyone know how to combine them so I can put them in a graph? I want to make a scatter plot with matplotlib. Thank you.
So with what you've shown, each variable you're setting becomes a float value. You can make them into a list by declaring:
list_of_values = [yr1874to1880, yr1881to1890, ...]
Adding all of the declared values to this results in a list of floats. For example, with just the two values above added:
>>> print(list_of_values)
[139.5, 131.0]
So that should explain how to obtain a list with the data from np.mean(). However, I'm guessing another question being asked is "how do I scatter plot this?" Using what is provided here, we have one axis of data, but to plot we need another (can't have a graph without x and y). Decide what the average wins is going to be compared against, and then that can be iterated over. For example, I'll use a simple integer in "decade" to act as the x axis:
import matplotlib.pyplot as plt
decade = 1
for i in list_of_values:
    y = i
    x = decade
    decade += 1
    plt.scatter(x, y)
plt.show()
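Rather than one variable per decade, the means can also be built directly as a list. The sketch below is only an illustration: the won list and the chunk size are made-up stand-ins for wonArray and its slices, and statistics.mean from the standard library is used in place of np.mean so that it runs without NumPy.

```python
from statistics import mean

# Hypothetical stand-in for wonArray: one win total per season.
won = [70, 80, 90, 60, 75, 85, 95, 65, 72, 88, 91, 67]
chunk = 4  # hypothetical number of seasons per group

# One mean per consecutive group of `chunk` seasons.
group_means = [mean(won[i:i + chunk]) for i in range(0, len(won), chunk)]

# x values 1, 2, 3, ... one per group, like the "decade" counter above.
xs = list(range(1, len(group_means) + 1))
```

Both lists can then be passed to matplotlib in a single call, plt.scatter(xs, group_means), instead of calling plt.scatter once per point.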

Python: Plot step function for true/false signals

I have a Python dictionary containing for each variable a tuple with an array of points in time and an array of numbers (1/0) representing the Boolean values that the variable holds at a certain point in time. For example:
dictionary["a"] = ([0,1,3], [1,1,0])
means that the variable "a" is true at both point in time 0 and 1, at point in time 2 "a" holds an arbitrary value and at point in time 3 it is false.
I would like to generate a plot using matplotlib.pyplot that looks something like this:
I already tried something like:
import matplotlib.pyplot as plt
plt.figure(1)
graphcount = 1
for x in dictionary:
    plt.subplot(len(dictionary), 1, graphcount)
    plt.step(dictionary[x][0], dictionary[x][1])
    plt.xlabel("time")
    plt.ylabel(x)
    graphcount += 1
plt.show()
but it does not give me the right results. For example, if dictionary["a"] = ([2], [1]) no line is shown at all. Can someone please point me in the right direction on how to do this? Thank you!
According to your description the line should start at the first point and end at the last point. If the first and last points are the same then your line will be made of only one point. In order to see a line with only one point you need to use a visible marker.
Regarding the location of the jumps, the docstring says:
where: [ ‘pre’ | ‘post’ | ‘mid’ ]
If ‘pre’ (the default), the interval from x[i] to x[i+1] has level y[i+1].
If ‘post’, that interval has level y[i].
If ‘mid’, the jumps in y occur half-way between the x-values.
So I guess you want 'mid'.
import matplotlib.pyplot as plt
dictionary = {}
dictionary['a'] = ([0,1,3], [1,1,0])
dictionary['b'] = ([2], [1])
plt.figure(1)
graphcount = 1
for x in dictionary:
    plt.subplot(len(dictionary), 1, graphcount)
    # 'o-' adds a marker at every point, so a single-point series is still visible
    plt.step(dictionary[x][0], dictionary[x][1], 'o-', where='mid')
    plt.xlabel("time")
    plt.ylabel(x)
    graphcount += 1
plt.show()
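To make the 'pre' and 'post' rules from the quoted docstring concrete, here is a small pure-Python illustration. It is not matplotlib's implementation; step_level is a hypothetical helper that only reports which level would be drawn between two sample points.

```python
def step_level(xs, ys, t, where="pre"):
    """Level drawn at a time t that lies strictly between two sample points.

    Follows the quoted docstring: with 'pre' the interval from xs[i] to
    xs[i+1] has level ys[i+1]; with 'post' it has level ys[i].
    """
    for i in range(len(xs) - 1):
        if xs[i] < t < xs[i + 1]:
            return ys[i + 1] if where == "pre" else ys[i]
    return None  # t is not strictly inside any interval

times, values = [0, 1, 3], [1, 1, 0]  # dictionary["a"] from the question
level_pre = step_level(times, values, 2, where="pre")
level_post = step_level(times, values, 2, where="post")
```

For the unsampled time 2, 'pre' gives the level of the following point and 'post' the level of the preceding one, so the choice of where decides what the plot shows there. A series with a single point has no interval at all, which is why nothing is drawn without a marker.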

Free up numbers from string

I have a very annoying output format from a program for my x,y,r values, namely:
circle(201.5508,387.68505,2.298685) # text={1}
circle(226.21442,367.48613,1.457215) # text={2}
circle(269.8067,347.73605,1.303065) # text={3}
circle(343.29599,287.43024,6.5938) # text={4}
Is there a way to get the 3 numbers out into an array without doing manual labor?
So I want the above input to become
201.5508,387.68505,2.298685
226.21442,367.48613,1.457215
269.8067,347.73605,1.303065
343.29599,287.43024,6.5938
If you mean that the circle(...) construct is the output you want to parse, try something like this:
import re
a = """circle(201.5508,387.68505,2.298685) # text={1}
circle(226.21442,367.48613,1.457215) # text={2}
circle(269.8067,347.73605,1.303065) # text={3}
circle(343.29599,287.43024,6.5938) # text={4}"""
for line in a.split("\n"):
    # only search the part before '#', so the number in text={1} is not picked up
    print([float(x) for x in re.findall(r"\d+(?:\.\d+)?", line.split("#")[0])])
Otherwise, you might mean that you want to call circle with numbers taken from an array containing 3 numbers, which you can do as:
arr = [343.29599,287.43024,6.5938]
circle(*arr)
A bit unorthodox, but since the format of your file is valid Python code and there are probably no security risks from untrusted code, why not simply define a circle function that puts all the circles into a list, and then execute the file:
circles = []
def circle(x, y, r):
    circles.append((x, y, r))
# Python 2 offers execfile('circles.txt'); Python 3 has no execfile
exec(open('circles.txt').read())
circles is now a list containing triplets of x, y and r:
[(201.5508, 387.68505, 2.298685),
(226.21442, 367.48613, 1.457215),
(269.8067, 347.73605, 1.303065),
(343.29599, 287.43024, 6.5938)]
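If executing the file is not desirable, a sketch that reads only the text inside each circle(...) could look like this. The names CIRCLE_RE and parse_circles are hypothetical, and the pattern assumes every entry has the plain circle(x,y,r) form shown in the question.

```python
import re

# Capture everything between "circle(" and the next ")".
CIRCLE_RE = re.compile(r"circle\(([^)]*)\)")

def parse_circles(text):
    """Return one [x, y, r] list of floats per circle(...) entry in text."""
    rows = []
    for match in CIRCLE_RE.finditer(text):
        rows.append([float(v) for v in match.group(1).split(",")])
    return rows

sample = """circle(201.5508,387.68505,2.298685) # text={1}
circle(226.21442,367.48613,1.457215) # text={2}"""
rows = parse_circles(sample)
```

Because only the text inside the parentheses is converted, the number in the trailing text={...} comment is ignored.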

Read file elements into 3 different arrays

I have a space-delimited file with values for x, y, z. I need to visualise the data, so I guess I need to read the file into 3 separate arrays (X, Y, Z) and then plot them. How do I read the file into 3 separate arrays? I have this so far, which removes the whitespace element at the end of every line.
import csv
import sys

def fread(f=None):
    """Reads in test and training CSVs."""
    X = []
    Y = []
    Z = []
    if f is None:
        print("No file given to read, exiting...")
        sys.exit(1)
    read = csv.reader(open(f, 'r'), delimiter=' ')
    for line in read:
        # drop the empty field produced by the trailing space
        line = line[:-1]
I tried to add something like:
for x,y,z in line:
    X.append(x)
    Y.append(y)
    Z.append(z)
But I get an error like "ValueError: too many values to unpack"
I have done lots of googling, but nothing seems to address reading every element of a file into separate arrays.
I should add that my data isn't sorted nicely into rows/columns; it just looks like this:
"107745590026 2 0.02934046648 0.01023879368 3.331810236 2 0.02727724425 0.07867902517 3.319272757 2 0.01784882881"......
Thanks!
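EDIT: If your data isn't actually separated into 3-element lines (and is instead one long space-separated list of values), you could flatten it into a single list and use Python list slicing with a stride to make this easier:
values = [v for row in read for v in row if v]  # flatten the rows; a csv.reader cannot be sliced directly
X = values[::3]
Y = values[1::3]
Z = values[2::3]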
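As a self-contained illustration of the stride idea, the sketch below reads made-up data from an in-memory file; the sample values and the assumption that the values repeat as x, y, z are hypothetical, since the real layout of the file isn't clear from the question.

```python
import io

# Hypothetical sample: one long run of space-separated x y z values,
# with a trailing space at the end of each line as in the question.
sample = io.StringIO("1.0 2.0 3.0 4.0 5.0 6.0 \n7.0 8.0 9.0 \n")

# split() with no argument splits on any whitespace and drops empty fields,
# so the trailing space on each line needs no special handling.
values = [float(v) for v in sample.read().split()]

X = values[0::3]  # every third value, starting with the first
Y = values[1::3]  # every third value, starting with the second
Z = values[2::3]  # every third value, starting with the third
```

With a real file, open(f) would take the place of the io.StringIO object.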
This error happens because for x, y, z in line loops over the individual fields of line and tries to unpack each field (a string) into three names. Unpacking the line itself fails in the same way if a line contains more than three space-separated values, and it's unclear from your question exactly what you'd want to do in those cases. If you're using Python 3, you could put the first element of a line into X, the second into Y, and all the rest of that line into Z with the following:
x, y, *z = line
X.append(x)
Y.append(y)
Z.extend(z)
If you're not using Python 3, you can perform the same basic logic in a slightly more verbose way:
for i, elem in enumerate(line):
    if i == 0:
        X.append(elem)
    elif i == 1:
        Y.append(elem)
    else:
        Z.append(elem)
