Process finished with exit code -1073741819 (0xC0000005) - Rpy2 - python

I have searched for this error on Stack Overflow and other websites, but I cannot find a solution to my problem.
Basically, I have a Python program that uses the rpy2 module to call some R functions from Python.
The problem is that when I run the code I sometimes, but not always, encounter this error. I am on Windows. Sometimes, after I restart my PC, the code runs for longer, but eventually the error pops up again. What should I do?
I have Python 3.6.7 with PyCharm 2018.3.3. However, I doubt the problem comes from PyCharm, because the same thing happens when I run the program from cmd, except that the program halts immediately without showing the message "Process finished with exit code -1073741819 (0xC0000005)". That message only appears in PyCharm.
I have rpy2 version 2.9.5.
Code description
I know roughly which part of the code is causing this, but I cannot optimize it any further. In this part of the code, inside cross-validation, I am overpopulating each of the train and validation sets in a certain way. To do that, I combine X_train and y_train back into one data frame, overpopulate that data frame, split it back into the updated, overpopulated X_train and y_train, and perform my analysis on those. I think that combining the two numpy arrays into a pandas DataFrame and then splitting them apart again is what creates this memory error. It is also important to note that this happens in every fold, and I am doing a 10-fold cross-validation with 10 repeats. However, the same thing happens when I run this on a desktop PC rather than on my laptop, and my laptop has plenty of GBs free. Could this be a Python/rpy2 error?
Code snippet
# I am calling this function inside each fold
df_combined = self.prepare_data(X_train, y_train)
and then after calling prepare_data() I do as follows:
# THE apply_f1(), apply_f2(), apply_f3(), and apply_f4() ARE THE FUNCTIONS
# THAT USE rpy2 INTERNALLY
if self.f1:
    X_train_inner, y_train_inner = self.apply_f1(df_combined)
elif self.f2:
    X_train_inner, y_train_inner = self.apply_f2(df_combined)
elif self.f3:
    X_train_inner, y_train_inner = self.apply_f3(df_combined)
else:
    X_train_inner, y_train_inner = self.apply_f4(df_combined)
The prepare_data() function:
def prepare_data(self, X_train, y_train):
    '''
    concatenates X_train_inner and y_train_inner into one, and make them a data frame
    so we are able to process the data frame by SMOGN, RandUnder, GN, or SMOTER
    '''
    # reshape + rename
    X_train_samp = X_train
    y_train_samp = y_train.reshape(-1, 1)
    # combine two numpy arrays together into one numpy array
    combined = np.concatenate((X_train_samp, y_train_samp), axis=1)
    # transform X_train + y_train into a pandas dataframe
    column_names = self.other + [self.target_variable]
    df_combined = pd.DataFrame(combined, columns=column_names)
    # convert the combined pandas dataframe to R Data.Frame
    df_combined = pandas2ri.py2ri(df_combined)
    return df_combined
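The combine-then-split round trip described in the question can be checked on its own, without rpy2, to see whether that step is even a plausible culprit. The following is only a minimal sketch: the helper names `combine` and `split`, the column names, and the demo arrays are hypothetical and not taken from the asker's class.

```python
import numpy as np
import pandas as pd


def combine(X, y, feature_names, target_name):
    """Stack features and target into one DataFrame (target is the last column)."""
    combined = np.concatenate((X, y.reshape(-1, 1)), axis=1)
    return pd.DataFrame(combined, columns=list(feature_names) + [target_name])


def split(df, target_name):
    """Recover the feature matrix and target vector from a combined DataFrame."""
    X = df.drop(columns=[target_name]).to_numpy()
    y = df[target_name].to_numpy()
    return X, y


# Hypothetical demo data standing in for one cross-validation fold.
X_demo = np.arange(6, dtype=float).reshape(3, 2)
y_demo = np.array([10.0, 20.0, 30.0])

df_demo = combine(X_demo, y_demo, ["a", "b"], "target")
X_back, y_back = split(df_demo, "target")
print(df_demo.shape, np.array_equal(X_back, X_demo), np.array_equal(y_back, y_demo))
```

If this pure numpy/pandas round trip runs repeatedly without trouble, the conversion to and from R objects is the more likely place to look.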

I got this same error message, "Process finished with exit code -1073741819 (0xC0000005)", with PyCharm 2021.1.
It happened because I had selected Python 3.9 as the interpreter while PyCharm was actually trying to use Python 3.10, and in fact I only had Python 3.8 installed.
In my case, the error disappeared once I selected Python 3.8 as the interpreter.
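For reference, the two numbers in the message are the same value: the negative number is the signed 32-bit reading of the hexadecimal Windows status code. 0xC0000005 is STATUS_ACCESS_VIOLATION, and 0xC0000374, which is quoted further down this page, is STATUS_HEAP_CORRUPTION. A small sketch of the conversion follows; the function name `decode_exit_code` and the lookup table are illustrative only and cover just these two codes.

```python
# NTSTATUS values relevant to the messages quoted on this page.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}


def decode_exit_code(code):
    """Return (hex string, symbolic name or None) for a Windows exit code."""
    unsigned = code & 0xFFFFFFFF  # reinterpret the signed 32-bit value as unsigned
    return "0x{:08X}".format(unsigned), KNOWN_STATUS.get(unsigned)


print(decode_exit_code(-1073741819))
print(decode_exit_code(-1073740940))
```

Both codes indicate that the crash happened in native code rather than in Python itself, which is why no Python traceback is printed.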

Related

Some lines of code in Jupyter notebook become too slow

I always run my code in Jupyter and everything used to work well, but recently some code takes a long time to execute, for example:
corr = train.corr()
highest_corr = corr[corr.index[abs(corr["claim"])>0.006]]
or even just:
train = pd.read_csv('Desktop\\train.csv')
So what is the problem, and how can I solve it?
PS: I read elsewhere that I should not use large values for:
pd.set_option('display.max_column', 120)
So I removed it, but the problem remains.
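A first step with a slowdown like this is to measure which statement actually takes the time. The sketch below uses only the standard library; the `timed` helper and the stand-in workload are hypothetical, and in a notebook the body of the `with` block would be replaced by the slow line, such as `train.corr()` or the `pd.read_csv` call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum of squares", timings):
    total = sum(i * i for i in range(100_000))  # stand-in for the slow statement

for label, seconds in timings.items():
    print("{}: {:.4f} s".format(label, seconds))
```

Comparing the numbers for reading the file and for computing the correlation shows whether the time is spent on disk access or on the computation itself.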

No console output using Keras model.fit() function

I'm following this tutorial to perform time series classifications using Transformers with Keras and TensorFlow. I'm using Windows 10 and the PyDev Eclipse plugin. Unfortunately, my program stops and the console output is completely blank every time I run the following code:
n_classes = len(np.unique(y_train))
input_shape = np.array(x_trainScaled).shape[0:]
model = build_model(n_classes,input_shape,head_size=256,num_heads=4,ff_dim=4,num_transformer_blocks=4,mlp_units=[128],mlp_dropout=0.4,dropout=0.25)
model.compile(loss="sparse_categorical_crossentropy",optimizer=keras.optimizers.Adam(learning_rate=1e-4),metrics=["sparse_categorical_accuracy"])
print(model.summary())
callbacks = [keras.callbacks.EarlyStopping(patience=100, restore_best_weights=True)]
model.fit(x_trainScaled,y_train,validation_split=0.2,epochs=200,batch_size=64,callbacks=callbacks)
pathToModel = 'my/path/to/model/'
model.save(pathToModel)
Even earlier warnings and print statements are completely erased, and I have no idea what is going on. If I comment out the model.fit(...) statement, the program crashes with an error message resulting from a model.predict(...) call.
Any help is highly appreciated.
The solution was to transform the input data and labels to numpy arrays first. Thus, calling the fit function as follows:
model.fit(np.array(x_trainScaled),np.array(y_train),validation_split=0.2,epochs=200,batch_size=64,callbacks=callbacks)
worked perfectly fine for me, as opposed to:
model.fit(x_trainScaled,y_train,validation_split=0.2,epochs=200,batch_size=64,callbacks=callbacks)
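To check whether the inputs need this conversion, it can help to look at their type, shape and dtype before calling fit. This is a sketch under assumptions: the nested lists below are hypothetical stand-ins for x_trainScaled and y_train, and the float32 dtype is simply a common choice for features, not something stated in the answer.

```python
import numpy as np

# Hypothetical stand-ins for x_trainScaled and y_train held as plain lists.
x_list = [[[0.1], [0.2], [0.3]], [[0.4], [0.5], [0.6]]]
y_list = [0, 1]

# Convert once, up front, so that fit() receives arrays.
x_arr = np.asarray(x_list, dtype="float32")
y_arr = np.asarray(y_list)

print(type(x_list).__name__, "->", type(x_arr).__name__, x_arr.shape, x_arr.dtype)
print(type(y_list).__name__, "->", type(y_arr).__name__, y_arr.shape, y_arr.dtype)
```

Here the features end up with shape (samples, timesteps, channels) and the labels with shape (samples,).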

Process finished with exit code -1073740940 (0xc0000374) using Scikit-learn KernelPCA

First of all, I tried to perform dimensionality reduction on my n_samples x 53 data using scikit-learn's Kernel PCA with a precomputed kernel. The code worked without any issues when I first tried it with 50 samples. However, when I increased the number of samples to 100, I suddenly got the following message.
Process finished with exit code -1073740940 (0xC0000374)
Here are the details of what I want to do:
I want to find the optimal value of the kernel-function hyperparameter in my Kernel PCA function, defined as follows.
from sklearn.decomposition import KernelPCA as drm
from somewhere import costfunction
from somewhere_else import customkernel

def kpcafun(w, X):
    # X is the sample matrix
    # w is the kernel hyperparameter
    n_princomp = 2
    drmodel = drm(n_components=n_princomp, kernel='precomputed')
    k_matrix = customkernel(X, X, w)
    transformed_x = drmodel.fit_transform(k_matrix)
    cost = costfunction(transformed_x)
    return cost
Therefore, to optimize the hyperparams I used the following code.
from scipy.optimize import minimize
# assume that wstart and optimbound are already defined
res = minimize(kpcafun, wstart, method='L-BFGS-B', bounds=optimbound, args=(X,))
The strange thing is that when I stepped through the first 10 iterations of the optimization in the debugger, nothing unusual happened and all variable values seemed normal. But when I turned off the breakpoints and let the program continue, the message appeared without any error notification.
Does anyone know what might be wrong with my code, or have any tips for resolving a problem like this?
Thanks
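Since the process dies without a Python traceback, one thing that may help narrow down where it happens is the standard library's faulthandler module, which tries to dump the Python stack when the interpreter hits a fatal error. The sketch below is an assumption-laden starting point, not a fix: the helper name `enable_crash_log` and the log path are hypothetical, and whether anything gets written for a heap-corruption exit depends on how the process is terminated.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Ask faulthandler to write Python tracebacks to `path` on a fatal error.

    The returned file object must stay open for as long as the handler is
    enabled, so the caller should keep a reference to it.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Hypothetical log location; any writable path works.
crash_log_path = os.path.join(tempfile.gettempdir(), "crash_traceback.log")
crash_log = enable_crash_log(crash_log_path)
print("faulthandler enabled:", faulthandler.is_enabled())
```

Calling this at the top of the script, before the optimization starts, means that a traceback left in the log file would point at the Python line that was executing when the native code failed.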

Why won't my code execute in either the terminal or the Spyder IDE?

I'm analysing this data set using ML techniques in Python 3.5 in the Spyder IDE (Ubuntu), and my program should work fine (it matches the tutorial program exactly), but it does nothing when run: nothing gets printed or returned. The Spyder console displays the following and does nothing after that:
runfile('/media/username/Laniakea/Projects/Training/SPYDER/classifier/sk_classifier.py', wdir='/media/username/Laniakea/Projects/Training/SPYDER/classifier')
I normally see this line when a program starts to run, followed by the output, but here I get nothing. My program:
from sklearn import svm
import pandas as pd
import numpy as np
df_pickled_train2 = pd.read_pickle('df_train.pickle')
df_pickled_test2 = pd.read_pickle('df_test.pickle')
df_pickled_train2_y = pd.read_pickle('df_train_y.pickle')
df_pickled_test2_y = pd.read_pickle('df_test_y.pickle')
X = np.array(df_pickled_train2)
y = np.array(df_pickled_train2_y)
X_test = np.array(df_pickled_test2)
y_test = np.array(df_pickled_test2_y)
clf = svm.SVC(kernel='linear')
clf.fit(X,y.ravel())
print(clf.score(X_test,y_test))
print("Done")
If you want to see how the pickles get created (and this program runs fine - it even prints out the final line "Done" or anything else I want it to print):
import pandas as pd
import numpy as np
df_train = pd.read_csv('Adult-Incomes/train-labelled-final-variables-condensed-coded-countries-removed-unlabelled-income-to-the-left-relabelled-copy.csv')
df_test = pd.read_csv('Adult-Incomes/test-final-variables-cleaned-coded-copy-unlabelled.csv')
df_train_no_y = df_train.drop('Income', axis=1)
df_test_no_y = df_test.drop(df_test.columns[0],axis=1)
df_train_y = pd.DataFrame(df_train['Income'])
df_train_y.to_pickle('df_train_y.pickle')
df_test_y = df_test[df_test.columns[0]]
df_test_y.to_pickle('df_test_y.pickle')
df_test_no_y.to_pickle('df_test.pickle')
df_train_no_y.to_pickle('df_train.pickle')
print ("DONE")
PS: Even when run from the terminal, it simply executes but does nothing. Normally the cursor would move to the next line and the output would be printed before the prompt returns, but here it simply stays there. It does not even appear hung: the cursor blinks and the computer stays responsive. It feels as if the code somehow sends the interpreter into limbo.
P.P.S: I even suspected that it was running a complex algorithm that genuinely needed time, so I left it running overnight. Nothing happened even then.
Can someone tell me why my program won't run or display anything?

Python - wxGrid.DeleteCols() : Segmentation Fault

I am working with wxPython, specifically a wx.Grid, and when I attempt to delete a column from the grid the program crashes and the terminal says "Segmentation Error".
dataGrid.CreateGrid(30, 20)
...
dataGrid.DeleteCols()
That is pretty much the code. I can delete rows, just not columns.
If I remove the delete column line it works fine.
According to the following thread, you may have to set the column labels yourself or you could get an error:
https://groups.google.com/forum/?fromgroups=#!topic/wxpython-users/IpARv4wVoqw
A newer thread says that you may have to call IncRef: https://groups.google.com/forum/?fromgroups=#!topic/wxpython-users/I1kndNvIEEQ
The documentation at http://www.wxpython.org/docs/api/wx.grid.Grid-class.html lists the parameters as DeleteCols(self, pos, numCols, updateLabels).
