Python script has to be run twice to execute - python

I have written a Python script to gather and analyze data from a .csv file and then plot it using the matplotlib.pyplot module. I'm using numpy.genfromtxt() to gather the data.
The first time I run the file, nothing happens. I get the console message:
>>> runfile('C:/my_filepath/thing.py')
and nothing more. If I run the file again, then it executes, prints the stuff, the plot comes up etc:
runfile('C:/my_filepath/thing.py')
~the stuff it's supposed to print~
More info: This problem only occurs on my laptop computer, which leads me to believe it has something to do with my matplotlib installation (mysterious to me because I installed an identical Anaconda package on my desktop and there is no issue). On the laptop the plot window is separate, and on the desktop the plot displays in the console. Maybe that's relevant.
Has anyone had this issue?
edit2: This only happens if I try to run the program multiple times in the same console. If I run the script in a fresh console it works the first time; after that I have to run it twice for every execution. If I close the matplotlib window, kill the console, and open a new one, I can execute it fine every time.
edit: Here is a working example which exhibits the odd behavior. Don't make fun of my code - I learned Python over a weekend.
import os
import re
import numpy as np
from numpy import genfromtxt as gft
import matplotlib.pyplot as plt

def getnames(Pattern):
    # construct the list of files in the directory
    CSVList = []
    for FileName in os.listdir():
        if re.compile(Pattern).search(FileName):
            CSVList.append(FileName)
    print(len(CSVList), 'files found.')
    # sort the list by the integer values of the last 4 digits of the filename
    CSVList.sort(key=lambda x: int(x.rsplit('.')[0][-4:]))
    return CSVList

def xy_extract(Data):
    # first column is x, second column is y
    x = np.array([row[0] for row in Data])
    y = np.array([row[1] for row in Data])
    return x, y

CSVList = getnames(r'(?i)\.csv$')
print(CSVList)
#plt.figure(figsize=(10,5))
plt.xlim(799,3999)
for Filename in CSVList:
    Data = gft(Filename, delimiter=',',skip_header=2)
    x, y = xy_extract(Data)
    plt.plot(x, y,label=Filename)
plt.show()
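The file-selection step of the script above can be exercised on its own, without any plotting. The following is a minimal sketch, assuming file names that end in four digits before the extension (as the script's sort key implies). `select_csv_files` and the sample names are hypothetical and not part of the original post.

```python
import re


def select_csv_files(names, pattern=r"(?i)\.csv$"):
    """Return the names matching `pattern`, ordered by their trailing 4-digit number."""
    matcher = re.compile(pattern)
    matches = [name for name in names if matcher.search(name)]
    # Same idea as the key in the post: take the part before the first dot,
    # keep its last four characters, and compare them as an integer.
    return sorted(matches, key=lambda name: int(name.split(".")[0][-4:]))


# Hypothetical file names, for illustration only.
sample = ["scan0010.csv", "notes.txt", "scan0002.CSV", "scan0001.csv"]
print(select_csv_files(sample))
```

Taking a list of names rather than calling os.listdir() directly keeps the sketch independent of the working directory.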

Related

Numpy is installed and has been running but suddenly I get missing module error despite reinstalling anaconda and purging numpy before re-installation

I was using numpy in a simple Python script that calls and runs another program in a loop. It was working just fine for several days. Then I needed to go somewhere, so I tried to cancel the process; I was lazy and ended up just closing the terminal. The next time I tried to run it I started getting the following error:
(base) cb27g11@cb27g11-Precision-7510:~/Development/Ciara$ sudo ipython3 Test_script.py
ModuleNotFoundError Traceback (most recent call last)
/home/cb27g11/Development/Ciara/Test_script.py in ()
1 import subprocess
2 import fileinput
----> 3 import numpy as np
4
5 tan_beta = [20, 25, 30, 35, 40] #Array of tan(beta) values
ModuleNotFoundError: No module named 'numpy'
But numpy is definitely still installed. I use anaconda and numpy is in the files for that. If I start a shell in the terminal and import numpy there's no problem, and "which python" returns:
/home/cb27g11/anaconda3/bin/python
So it seems as though the path is intact.
I've tried completely uninstalling numpy and Anaconda, purging my computer of any files related to numpy and then reinstalling both, with no success.
I then installed numpy via pip3, and I've been trawling through environment and Anaconda files to see if there's a broken path, but I'm struggling to find anything to fix this.
Just in case there's something wrong with my little script, I'll add it below:
import subprocess
import fileinput
import numpy as np

tan_beta = [20, 25, 30, 35, 40]  # Array of tan(beta) values
sin_bma = np.linspace(0.9, 1.1, 22)  # Array of sin(beta-alpha) values
hcm = 250  # Mass of charged Higgses in GeV
textfilepath = '/home/cb27g11/models/mg_run.txt'  # path to txt file madgraph will use
Process = 'AD_noCharge'

def the_main_event():
    with open('/home/cb27g11/Development/Ciara/mg_run_basic.txt', 'r') as old_card:
        text = old_card.read()  # stores string of old_card ready for editing
    for i in range(0, len(tan_beta)):
        tb = tan_beta[i]  # loops over tan(beta) values
        for j in range(0, len(sin_bma)):
            sbma = sin_bma[j]  # loops over sin(beta-alpha) values
            make_input(tb, sbma, hcm, text)
            run_madgraph()

def make_input(Tb, Sbma, Hcm, Text):
    # inputs are the value of tan_beta, the value of sin(beta-alpha), the desired mass for the charged Higgses and a string of text
    with open(textfilepath, 'w') as new_card:
        # simulation card, the .txt file that gets fed to madgraph
        sim_card = Text.replace('TBV', str(Tb))
        sim_card = sim_card.replace('SBMAV', str(Sbma))
        sim_card = sim_card.replace('HCMV', str(Hcm))
        sim_card = sim_card.replace('NAME', str(Process)+'_' + str(Tb) +'_' + str(Sbma))
        new_card.write(sim_card)  # saves new txt file for madgraph in ~/models (as this is linked to docker as an input)

def run_madgraph():
    # starts madgraph in docker with a volume type output linked to $HOME/output on laptop and mount type input linked to $HOME/models, then tells it to run mg5_aMC using mg_run.txt as a file for it.
    subprocess.run('sudo docker run -t -i -v $HOME/outputs:/var/MG_outputs --mount type=bind,source=$HOME/models,target=/app hfukuda/madgraph /home/hep/MG5_aMC_v2_6_3_2/bin/mg5_aMC /app/mg_run.txt', shell=True)

the_main_event()
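The post compares what "which python" reports in a shell with a script that is launched through sudo ipython3. One way to check, from inside the script itself, which interpreter is actually running and whether it can locate a module is sketched below. `module_report` is a hypothetical helper for illustration and does not come from the original post.

```python
import importlib.util
import sys


def module_report(module_name):
    """Describe the running interpreter and whether it can find `module_name`."""
    # find_spec returns None when a top-level module cannot be located.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,  # path of the interpreter running this code
        "module": module_name,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(module_report("numpy"))
```

Running it once from the shell where the import works and once the same way the failing command is launched would show whether both use the same interpreter.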

Function does not finish executing in `hist` function only on second time

I'm trying to generate a histogram from a pandas DataFrame. It is generated the first time the function is called. However, when the create_histogram function is called a second time, it gets stuck at h = df.hist(bins=3, column="amount"). By "stuck" I mean that the statement never finishes executing and execution does not continue to the next line, yet it does not raise an error or break out of the execution either. What exactly is the problem here and how can I fix it?
import matplotlib.pyplot as plt
...
...
def create_histogram(self, field):
    df = self.main_df  # This is a DataFrame
    h = df.hist(bins=20, column="amount")
    fileContent = StringIO()
    plt.savefig(fileContent, dpi=None, facecolor='w', edgecolor='w',
                orientation='portrait', papertype=None, format="png",
                transparent=False, bbox_inches=None, pad_inches=0.5,
                frameon=None)
    content = fileContent.getvalue()
    return content
Finally I figured this out myself.
Whenever I executed the function I always got the following log message, but I ignored it because I did not understand what it meant.
Backend TkAgg is interactive backend. Turning interactive mode on.
Then I realised that it might be running in interactive mode, which was not what I wanted. I found that there is a way to turn it off, which is given below.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
And this fixed my issue.
NOTE: matplotlib.use('Agg') should be called immediately after importing matplotlib and before importing matplotlib.pyplot, in the order shown here.
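For reference, the fix can be combined with the histogram-to-memory step in one small script. This is only a sketch under a few assumptions: it targets Python 3, so it swaps in io.BytesIO where the question used StringIO (PNG data is binary), it plots a plain list instead of a DataFrame column, and `render_histogram_png` is a hypothetical name.

```python
import io

import matplotlib

matplotlib.use("Agg")  # select the non-interactive backend before pyplot is imported
import matplotlib.pyplot as plt


def render_histogram_png(values, bins=20):
    """Draw a histogram of `values` and return it as PNG bytes."""
    fig, ax = plt.subplots()
    ax.hist(values, bins=bins)
    buffer = io.BytesIO()  # in-memory binary buffer instead of a file on disk
    fig.savefig(buffer, format="png")
    plt.close(fig)  # release the figure so repeated calls do not accumulate figures
    return buffer.getvalue()


png = render_histogram_png([1, 2, 2, 3, 3, 3, 4, 4, 5])
print(len(png), "bytes")
```

Closing the figure after saving is a design choice of this sketch rather than part of the original answer; it keeps each call independent of the previous one.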

Why won't my code execute both in Terminal & on Spyder IDE?

I'm analysing this data set using ML techniques in Python 3.5 on the Spyder IDE (Ubuntu), and my program is supposed to work fine (it matches the tutorial program exactly), but it does nothing when run: nothing gets printed or returned. The Spyder console displays the following and does nothing after that:
runfile('/media/username/Laniakea/Projects/Training/SPYDER/classifier/sk_classifier.py', wdir='/media/username/Laniakea/Projects/Training/SPYDER/classifier')
I used to get this when a new program starts to run, and the output would follow but here, I get nothing. My program:
from sklearn import svm
import pandas as pd
import numpy as np
df_pickled_train2 = pd.read_pickle('df_train.pickle')
df_pickled_test2 = pd.read_pickle('df_test.pickle')
df_pickled_train2_y = pd.read_pickle('df_train_y.pickle')
df_pickled_test2_y = pd.read_pickle('df_test_y.pickle')
X = np.array(df_pickled_train2)
y = np.array(df_pickled_train2_y)
X_test = np.array(df_pickled_test2)
y_test = np.array(df_pickled_test2_y)
clf = svm.SVC(kernel='linear')
clf.fit(X,y.ravel())
print(clf.score(X_test,y_test))
print("Done")
If you want to see how the pickles get created (and this program runs fine - it even prints out the final line "Done" or anything else I want it to print):
import pandas as pd
import numpy as np
df_train = pd.read_csv('Adult-Incomes/train-labelled-final-variables-condensed-coded-countries-removed-unlabelled-income-to-the-left-relabelled-copy.csv')
df_test = pd.read_csv('Adult-Incomes/test-final-variables-cleaned-coded-copy-unlabelled.csv')
df_train_no_y = df_train.drop('Income', axis=1)
df_test_no_y = df_test.drop(df_test.columns[0],axis=1)
df_train_y = pd.DataFrame(df_train['Income'])
df_train_y.to_pickle('df_train_y.pickle')
df_test_y = df_test[df_test.columns[0]]
df_test_y.to_pickle('df_test_y.pickle')
df_test_no_y.to_pickle('df_test.pickle')
df_train_no_y.to_pickle('df_train.pickle')
print ("DONE")
PS: Even when run from the terminal, it starts but produces nothing. Normally the terminal would print the output and then prompt for another command, but here the cursor simply moves to the next line and stays there. It doesn't seem to be hung, as the cursor blinks and the computer stays responsive. It feels as if the code somehow sends the interpreter into limbo.
P.P.S: I even suspected that it was running a complex algorithm that genuinely needed time, so I left it running overnight. Nothing happened even then.
Can someone tell me why my program won't run or display anything?

Pyplot "cannot connect to X server localhost:10.0" despite ioff() and matplotlib.use('Agg')

I have a piece of code that is called by another function, carries out some calculations for me and then plots the output to a file. Because the whole script can take a while to run on larger datasets, and because I may want to analyse several datasets at a time, I start it in screen, then disconnect, close my PuTTY session and check back on it the next day. I am using Ubuntu 14.04. My code looks as follows (I have skipped the calculations):
import shelve
import os, sys, time
import numpy
import timeit
import logging
import csv
import itertools
import graph_tool.all as gt
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
plt.ioff()
#Do some calculations
print 'plotting indeg'
# Let's plot its in-degree distribution
in_hist = gt.vertex_hist(g, "in")
y = in_hist[0]
err = numpy.sqrt(in_hist[0])
err[err >= y] = y[err >= y] - 1e-2
plt.figure(figsize=(6,4))
plt.errorbar(in_hist[1][:-1], in_hist[0], fmt="o",
             label="in")
plt.gca().set_yscale("log")
plt.gca().set_xscale("log")
plt.gca().set_ylim(0.8, 1e5)
plt.gca().set_xlim(0.8, 1e3)
plt.subplots_adjust(left=0.2, bottom=0.2)
plt.xlabel("$k_{in}$")
plt.ylabel("$NP(k_{in})$")
plt.tight_layout()
plt.savefig("in-deg-dist.png")
plt.close()
print 'plotting outdeg'
#Do some more stuff
The script runs perfectly happily until it gets to the plotting commands. To try to get to the root of the problem I am currently running it in PuTTY without screen and with no X11 applications. The output I get is the following:
plotting indeg
PuTTY X11 proxy: unable to connect to forwarded X server: Network error: Connection refused
: cannot connect to X server localhost:10.0
I presume this is caused by the code trying to open a window, but I thought that explicitly calling plt.ioff() would disable that. Since it didn't, I followed this thread (Generating matplotlib graphs without a running X server) and specified the backend, but that didn't solve the problem either. Where might I be going wrong?
The calling function also calls other functions that use matplotlib. These are only called after this one, but their dependencies get loaded when the import statements run. Because those dependencies were loaded first, the later matplotlib.use('Agg') declaration in this module had no effect. Moving that declaration to the main script solved the problem.
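A quick way to see whether some earlier import has already pulled in pyplot is to look in sys.modules before selecting the backend. The sketch below assumes it is placed at the very top of the main script; `select_agg_backend` is a hypothetical helper, not part of the original code.

```python
import sys

import matplotlib


def select_agg_backend():
    """Switch to the Agg backend and report whether pyplot was already imported."""
    # If this is True, some earlier import loaded pyplot before the backend was chosen.
    pyplot_was_loaded = "matplotlib.pyplot" in sys.modules
    matplotlib.use("Agg")
    return pyplot_was_loaded, matplotlib.get_backend()


# Run this before importing any module that itself imports matplotlib.pyplot.
already_loaded, backend = select_agg_backend()
print("pyplot imported earlier:", already_loaded, "| active backend:", backend)
```

If the first value prints as True, the import order is worth checking, since another module chose when pyplot was loaded.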

SPM Dicom Convert in python (Ipython/ Nipype)

I am new to Python, or more specifically IPython. I have been working through the steps to run what should be a very simple DICOM conversion of an MRI image file in a statistical package called SPM, as described by Nipype. I can't get it to run and was wondering what I am doing wrong. I am not getting an error message; instead, there is no file change or output. It just hangs. Does anyone have any idea what I might be doing wrong? It's likely that I am missing something very simple here (sorry :(
import os
from pylab import *
from glob import glob
from nipype.interfaces.matlab import MatlabCommand as mlab
mlab.set_default_paths('/home/orkney_01/s1252042/matlab/spm8')
from nipype.interfaces.spm.utils import DicomImport as di
os.chdir('/sdata/images/projects/ASD_MM/1/datafiles/restingstate_files')
filename = "reststate_directories.txt"
restingstate_files_list = [line.strip() for line in open(filename)]
for x in restingstate_files_list:
    os.chdir( x )
    y = glob('*.dcm')
    conversion = di(in_files = y)
    print(res.outputs)
You are creating a DicomImport interface, but you are not actually running it. You should call run() on the interface instance you created, for example res = conversion.run().
Also, it is best to tell the interface where to run by setting base_dir on it (for example conversion.base_dir = '/some/path') before running.
Finally, you may also want to print the contents of restingstate_files_list to check you are finding the DICOM directories correctly.
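The last check can be done without Nipype at all. The following is a minimal sketch, assuming the list file holds one directory per line. `count_dicom_files` is a hypothetical helper, and resolving relative entries against the list file's own location is an assumption of this sketch, not something stated in the question.

```python
from pathlib import Path


def count_dicom_files(list_file):
    """Map each directory named in `list_file` to its number of *.dcm files.

    Directories that do not exist are reported with a count of None.
    """
    list_path = Path(list_file)
    counts = {}
    for line in list_path.read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # skip blank lines
        # Relative entries are resolved against the folder holding the list file;
        # absolute entries are used as they are.
        directory = list_path.parent / name
        counts[name] = len(list(directory.glob("*.dcm"))) if directory.is_dir() else None
    return counts
```

Calling count_dicom_files("reststate_directories.txt") and printing the result shows at a glance which directories are missing or contain no .dcm files. Because it never calls os.chdir, the result does not depend on the current working directory.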
