I'm writing a machine learning project for fun, but I've run into an interesting error that I can't seem to fix. I'm using scikit-learn (LinearSVC, train_test_split), NumPy, and a few other small libraries like collections.
The project is a comment classifier: you put in a comment, it spits out a classification. The problem is a MemoryError (Unable to allocate 673. MiB for an array with shape (7384, 11947) and data type float64) when doing a train_test_split to check the classifier's accuracy, specifically when I call model.fit.
My program finds 11,947 unique words and I have a large training sample (14,769), but I've never had an issue with running out of RAM before. In fact, I'm not running out of RAM: I have 32 GB, and the program uses less than 1 GB before it gives up.
Is there something obvious I'm missing?
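For context, here is a minimal sketch of this kind of comment-classifier setup (the tiny comment list and labels are purely illustrative). Note that CountVectorizer returns a scipy sparse matrix, so the full (n_samples, n_unique_words) array is never materialized as dense float64:

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Illustrative data only; the real project has ~14,769 comments and 11,947 unique words.
comments = ["great post", "utter spam", "thanks for sharing",
            "buy now", "very helpful", "click this link"]
labels = np.array([0, 1, 0, 1, 0, 1])

# fit_transform returns a sparse matrix rather than a dense float64 array.
X = CountVectorizer().fit_transform(comments)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, stratify=labels, random_state=0)

model = LinearSVC()
model.fit(X_train, y_train)  # LinearSVC accepts sparse input directly
print(model.score(X_test, y_test))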
Related
I am trying to learn deep learning, especially how to do speaker diarization (GitHub link).
I built an environment on Ubuntu to run a speaker diarization project from GitHub.
However, I got a 'Killed' message while generating the speaker embeddings.
I found that the error occurred in the code below (from here):
np.savez('training_data', train_sequence=train_sequence, train_cluster_id=train_cluster_id)
After reducing the number of epochs, I found that training_data.npz is only about 500 MB (much smaller than my RAM), and generating the embeddings finishes normally with just a few epochs.
But after that, I got the same error when loading this small training_data.npz for training:
train_data = np.load('./ghostvlad/training_data.npz')
To summarize, I don't understand why my system can't save or load this small file.
(P.S. Sorry for my poor English)
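For reference, a minimal sketch of an alternative save/load pattern that avoids pulling everything into RAM when the data is read back: save each array as its own uncompressed .npy file and open the large one memory-mapped (the file names and array shapes below are placeholders):

import numpy as np

# Placeholder arrays standing in for the real train_sequence / train_cluster_id.
train_sequence = np.random.rand(1000, 256).astype(np.float32)
train_cluster_id = np.array(["spk0"] * 500 + ["spk1"] * 500)

# Save each array as its own plain .npy file instead of one .npz archive.
np.save("train_sequence.npy", train_sequence)
np.save("train_cluster_id.npy", train_cluster_id)

# mmap_mode="r" maps the file lazily, so the whole array does not have to fit
# in RAM at once. This works for plain .npy files, not for .npz archives.
train_sequence = np.load("train_sequence.npy", mmap_mode="r")
train_cluster_id = np.load("train_cluster_id.npy")  # small, load normally
print(train_sequence.shape, train_cluster_id.shape)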
I'm trying to use the Python dlib.train_shape_predictor function to train on a very large set of images (~50,000).
I've created an XML file containing the necessary data, but it seems that train_shape_predictor loads all the referenced images into RAM before it starts training. This leads to the process getting killed because it uses over 100 GB of RAM. Even a trimmed-down dataset uses over 20 GB (the machine only has 16 GB of physical memory).
Is there some way to get train_shape_predictor to load images on demand, instead of all at once?
I'm using Python 3.7.2 and dlib 19.16.0, installed via pip on macOS.
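For context, a minimal sketch of the training call in question, with placeholder file names and example option values:

import dlib

options = dlib.shape_predictor_training_options()
options.num_threads = 4   # example value
options.be_verbose = True

# dataset.xml references the ~50,000 images; all of them appear to be
# loaded into RAM before training starts.
dlib.train_shape_predictor("dataset.xml", "predictor.dat", options)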
I posted this as an issue on the dlib GitHub and got this response from the author:
It's not reasonable to change the code to cycle back and forth between disk and ram like that. It will make training very slow. You should instead buy more RAM, or use smaller images.
As designed, large training sets need tons of RAM.
I'm trying to train a model (an implementation of a research paper) on a K80 GPU with 12 GB of memory available for training. The dataset is about 23 GB, and after data extraction it shrinks to 12 GB for the training script.
At about the 4640th step (max_steps being 500,000), I receive the following error saying Resource Exhausted, and the script stops soon after:
The memory usage at the beginning of the script is:
I went through a lot of similar questions and found that reducing the batch size might help, but I have already reduced the batch size to 50 and the error persists. Is there any other solution besides switching to a more powerful GPU?
This does not look like a GPU Out Of Memory (OOM) error, but rather like you ran out of space on your local drive when saving your model checkpoint.
Are you sure that you have enough space on your disk, and that the folder you're saving to doesn't have a quota?
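If disk space is the suspect, a quick way to check from Python is the standard library's shutil.disk_usage (point it at wherever the checkpoints are written):

import shutil

# Point this at the directory where the training script writes its checkpoints.
checkpoint_dir = "."
total, used, free = shutil.disk_usage(checkpoint_dir)
print(f"free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")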
I randomly encounter the same error whenever I run an XGBoost model (both a normal run and a grid search). The error message says:
H2OConnectionError: Local server has died unexpectedly. RIP.
I don't know what's happening; I tried changing versions, but that didn't work. I'm currently using version 3.18.0.5. Does anyone have any idea what is going on? Thanks in advance.
The only time I've seen this happen is when H2O runs out of memory. Please check that you have enough memory: an H2O cluster should have at least 4x as much RAM as the dataset you're trying to train a model on (data size on disk).
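As a concrete example (assuming the Python client), the JVM heap for a locally started cluster can be set explicitly; the 16G value below is illustrative and should follow the 4x rule of thumb above:

import h2o

# Only applies when h2o.init actually launches a local server.
# Rule of thumb: at least ~4x the on-disk size of the training data.
h2o.init(max_mem_size="16G")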
I have large datasets of 2-3 GB. I am using the NLTK Naive Bayes classifier with this data as training data. The code runs fine on small datasets, but on large datasets it runs for a very long time (more than 8 hours) and then crashes without much of an error message. I believe this is because of a memory issue.
Also, after classifying the data I want to dump the classifier to a file so that it can be used later on test data. This step also takes too much time and then crashes, as it loads everything into memory first.
Is there a way to resolve this?
Another question: is there a way to parallelize this whole operation, i.e. to parallelize the classification of this large dataset using a framework like Hadoop/MapReduce?
You will probably need to increase the available memory to overcome this problem. I hope these links help you:
Python Memory Management
Parallelism in Python
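For what it's worth, here is a minimal self-contained sketch of both ideas from the question: pickling the trained classifier to disk, and classifying feature sets in parallel with the standard library's multiprocessing module (the tiny training set and file name are illustrative only):

import pickle
from multiprocessing import Pool

import nltk

# Tiny illustrative training set of NLTK-style (feature dict, label) pairs.
train_set = [
    ({"contains(good)": True}, "pos"),
    ({"contains(bad)": True}, "neg"),
]
classifier = nltk.NaiveBayesClassifier.train(train_set)

# Dump the trained classifier so it can be reused later on test data.
with open("classifier.pickle", "wb") as f:
    pickle.dump(classifier, f, protocol=pickle.HIGHEST_PROTOCOL)

def classify_one(features):
    return classifier.classify(features)

if __name__ == "__main__":
    test_docs = [{"contains(good)": True}, {"contains(bad)": True}]
    with Pool(processes=2) as pool:
        print(pool.map(classify_one, test_docs))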