I have a lot of CSV files, each containing approximately 1000 rows and 2 columns, where the data looks like this:
21260.35679 0.008732499
21282.111 0.008729349
21303.86521 0.008721652
21325.61943 0.008708224
These two columns are the features, and the output will be a device name. Each CSV file contains data from a specific device at different times, and there are many devices. What I am trying to do is train on this data and then classify the device name using a CNN. If any incoming data falls outside the trained observations, it should be classified as an anomaly.
I am trying to convert those values into an image matrix so that I can use a CNN to train on this data. But what I am concerned about is that the second column contains float values less than 1 and close to zero. If I convert them to integers they all become zero, and if all the values become zero the data no longer makes any sense.
How can I solve this? And is it even possible to use a CNN on these datasets?
From your description, your problem seems to be sequence classification.
You have many temporal sequences. Each sequence has the same number of 2D elements and is associated with a device. Given a sequence as input, you want to predict the corresponding device.
This kind of temporal dependency is better captured by RNNs. I would suggest taking a look at LSTMs.
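As a rough illustration only, a minimal Keras sketch of such a sequence classifier might look like this (assuming each CSV is loaded as a 1000×2 array with min-max scaled columns; num_devices and all other names are placeholders, not a prescribed solution):

import numpy as np
from tensorflow.keras import layers, models

# X: array of shape (num_sequences, 1000, 2), y: integer device labels
def build_model(num_devices, seq_len=1000, n_features=2):
    model = models.Sequential([
        layers.Input(shape=(seq_len, n_features)),
        layers.LSTM(64),                                  # summarises the whole sequence
        layers.Dense(num_devices, activation="softmax"),  # one probability per device
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

A low maximum softmax probability at prediction time could serve as a crude anomaly signal, though proper anomaly detection usually needs a dedicated approach.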
Related
I have a dataset of images and two regression values per image in a CSV file. For example, "img1.jpg" has two numeric values, "x_cord" and "y_cord", stored in annotation.csv. I want to train my neural network with the images and these two values from the CSV file, but I'm not able to load them. Could someone suggest a way to load both of them together and feed them to the neural network?
Many thanks.
I'm not able to load them together. I have tried flow_from_dataframe, but it only takes one numeric value, so I don't know how to load multiple numeric values along with an image.
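One way this could be wired up is with tf.data instead of an image generator (a sketch only; the "filename" column name, image size, and paths are assumptions, while "x_cord" and "y_cord" come from annotation.csv above):

import pandas as pd
import tensorflow as tf

df = pd.read_csv("annotation.csv")   # assumed columns: filename, x_cord, y_cord

def load_example(filename, targets):
    img = tf.io.read_file(filename)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, (224, 224)) / 255.0
    return img, targets               # targets is the length-2 vector (x_cord, y_cord)

ds = tf.data.Dataset.from_tensor_slices(
    (df["filename"].tolist(), df[["x_cord", "y_cord"]].to_numpy(dtype="float32"))
)
ds = ds.map(load_example).batch(32)

# model.fit(ds) then works with a network whose final layer is Dense(2)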
I would like to implement a deep neural network in Python (preferably PyTorch, but TensorFlow is also possible) which predicts the next location and the time of arrival at that location. For the raw data, I have a CSV file with a sequence of three values: latitude, longitude, and time:
39.984702,116.318417,2008-10-23,02:53:04
39.984683,116.31845,2008-10-23,02:53:10
39.984686,116.318417,2008-10-23,02:53:15
...
The number of such rows is around 100,000. So here is my question: how should I split, normalize, and transform the data in order to feed it into the DNN (preferably a GRU or LSTM, though from what I have read CNNs are also possible) and receive as output a predicted location and time of arrival?
Based on my current research, what should be done is to split the data into sequences (of length n), normalize the values, change the format of the time (certainly not feeding it as a string), and treat the last value in each sequence as the label when training the DNN.
A simple code example would be really helpful, given my difficulty understanding the different dimensions of the inputs and outputs for the NNs.
Just a tip: for the time, I would transform it into a Unix epoch timestamp.
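A minimal sketch of that preprocessing plus a tiny GRU, assuming PyTorch and pandas (the file name, column names, and window length are illustrative, not prescriptive):

import pandas as pd
import numpy as np
import torch
import torch.nn as nn

df = pd.read_csv("trajectory.csv", header=None,
                 names=["lat", "lon", "date", "time"])
ts = pd.to_datetime(df["date"] + " " + df["time"])
df["epoch"] = (ts - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")   # Unix epoch seconds

# min-max normalise each feature into [0, 1]
feats = df[["lat", "lon", "epoch"]].to_numpy(dtype="float64")
feats = (feats - feats.min(axis=0)) / (feats.max(axis=0) - feats.min(axis=0))
feats = feats.astype("float32")

# sliding windows: the first n-1 steps are the input, the last step is the label
n = 20
X = np.stack([feats[i:i + n - 1] for i in range(len(feats) - n)])
y = np.stack([feats[i + n - 1] for i in range(len(feats) - n)])

class NextStepGRU(nn.Module):
    def __init__(self, n_features=3, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_features)    # predicts lat, lon, epoch
    def forward(self, x):                            # x: (batch, steps, features)
        _, h = self.gru(x)
        return self.head(h[-1])

model = NextStepGRU()
pred = model(torch.from_numpy(X[:8]))                # (8, 3) predicted next steps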
I am working on a prediction problem where a user has access to multiple targets and each access is a separate row. Below is the data:
import pandas as pd

df = pd.DataFrame({"ID": [12567, 12567, 12567, 12568, 12568],
                   "UnCode": ["LLLLLLL", "LLLLLLL", "LLLLLLL", "KKKKKK", "KKKKKK"],
                   "CoCode": [1000, 1000, 1000, 1111, 1111],
                   "CatCode": [1, 1, 1, 2, 2],
                   "RoCode": ["KK", "KK", "KK", "MM", "MM"],
                   "Target": [12, 4, 6, 1, 6]})
**Here ID is unique per user but can be repeated if the user has accessed multiple targets, and a target can be repeated as well if it is accessed by different IDs.**
I have one-hot encoded this data and used it for prediction with binary relevance, where my X is constant and the target varies.
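A rough sketch of that setup, assuming scikit-learn, with binary relevance implemented via OneVsRestClassifier and the targets per ID binarised with MultiLabelBinarizer (the base estimator and the grouping are illustrative):

from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

# one row per ID: one-hot encode the constant features, collect the accessed targets
X = pd.get_dummies(df.drop(columns=["Target"]).drop_duplicates("ID").set_index("ID"))
targets = df.groupby("ID")["Target"].apply(list)

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(targets.loc[X.index])      # one binary column per target

# binary relevance: an independent binary classifier per target column
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, Y)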
The problem I am facing with this approach is that the data becomes very sparse, and the number of features in my original data is around 1300.
Can someone suggest whether this approach is correct or not, and what other methods/approaches I can use for this type of problem? Also, can this problem be treated as multilabel classification?
I do not understand how model.predict(...) works on a time series forecasting problem. I usually use it with a CNN, where it is pretty straightforward, but for time series I don't understand what it returns.
For example, I am currently doing an exercise where I have to forecast power consumption using an LSTM. I succeeded in training my model, but when I want to know what the power consumption will be tomorrow (so with no data except past values), I don't know what input to use.
Traditional ML algorithms, which you might be more used to, generally expect the data in a 2D structure: one row per example and one column per feature.
For sequential data, such as a stream of timed events associated with each user, it's also possible to create a lagged 2D dataset, where the history of different features for different IDs is aligned into single rows: one row per ID, with columns like feature1_t, feature1_t-1, feature2_t, feature2_t-1, and so on.
This can be a good way to work because once your data is in the correct shape you can use it with models that are fast to set up and train. However, models using features engineered this way generally don't have any capacity to "learn" anything about the natural sequence of the data. A tree-based ensemble model receiving this format treats feature 1 at time t and at time t-1 in the example above as completely independent, and this can severely limit its predictive power.
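For example, a lagged table like that could be built with pandas (a sketch only; column names are illustrative):

import pandas as pd

# long format: one row per (id, time) observation
events = pd.DataFrame({
    "id":       [1, 1, 1, 2, 2, 2],
    "t":        [0, 1, 2, 0, 1, 2],
    "feature1": [0.1, 0.2, 0.3, 1.0, 1.1, 1.2],
})

# shift within each id to create lagged columns, then keep the latest row per id
events["feature1_t-1"] = events.groupby("id")["feature1"].shift(1)
lagged = events.sort_values("t").groupby("id").tail(1)
print(lagged)   # one row per id: feature1 (time t) plus feature1_t-1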
There are types of deep learning architecture specifically designed for modelling sequence data, called recurrent neural nets (RNNs). Two of the most popular cells to use in these are long short-term memory (LSTM) and gated recurrent units (GRU). There's a good post on how LSTM cells work here, but the TL;DR is that they have a structure that allows them to learn from sequences of data.
Cells like LSTM expect a 3D tensor of input data. We arrange it so that one axis holds the data features, a second axis holds the sequence steps (like time ticks), and the third axis stacks the different examples we want to predict a single "y" value for. Using the same type of dataset as the lagged example above, this gives an array of shape (examples, sequence steps, features).
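Concretely, building that 3D array might look something like this (a sketch with made-up values, two features per step):

import numpy as np

# per-id sequences of two features, each of length 3 (steps ordered in time)
seq_a = np.array([[0.1, 10.0], [0.2, 11.0], [0.3, 12.0]])   # id 1: (steps, features)
seq_b = np.array([[1.0, 20.0], [1.1, 21.0], [1.2, 22.0]])   # id 2

X = np.stack([seq_a, seq_b])   # shape (examples, steps, features) == (2, 3, 2)
y = np.array([0.4, 1.3])       # one target value per example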
The ability to learn patterns in sequences of data like this is particularly beneficial for both time series and text data, which are naturally ordered.
To return to your original question, when you want to predict something for your test set you'll need to pass the model sequences represented just like the ones it was trained on (this is a reasonably good rule of supervised learning in general). For example, if the model was trained on data like the last example above, you'll need to pass it a 2D (steps × features) slice for each ID you want to make a prediction for.
You should explore the way the original training data is represented and make sure you understand it well, as you'll need to create the same shape of data to make predictions. If you have your training data in a pandas DataFrame or NumPy arrays, X_train.shape is a great place to start to see what the dimensionality is; you can then inspect entries along each axis until you get a good feel for the data it contains.
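For instance (a sketch only; X_train and model come from your own training code, and the shapes shown are illustrative):

import numpy as np

print(X_train.shape)                 # e.g. (500, 3, 2): 500 examples, 3 steps, 2 features

new_seq = np.array([[0.5, 15.0],     # one new 2D sequence: (steps, features)
                    [0.6, 16.0],
                    [0.7, 17.0]], dtype="float32")
batch = new_seq[np.newaxis, ...]     # add the examples axis -> shape (1, 3, 2)
pred = model.predict(batch)          # one prediction for this one sequence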
I am currently struggling with the problem of combining static features with a sequence of input data within a batch.
I have two channels of input data. One is processed via a convolutional neural network (e.g. VGG-16 or comparable) and outputs a feature map.
My second input channel contains a variable-length list of input data.
Each single entry of that list, together with the calculated feature map, should be fed into a classifier.
I know that I can use a TimeDistributed wrapper to process sequences of data, but that only partially solves my problem:
The calculation of the feature map in the first input channel is costly and should only be performed once per batch.
As the list in the second channel has a variable length, I cannot make use of a repeat layer to duplicate the feature map properly. Additionally, I run into memory problems, as I cannot hold several hundred (or thousand) copies of the feature map in GPU memory.
What is the best way to properly combine static data (computed once per batch) with a sequence of data?
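For reference, a rough sketch of the wiring described above in the Keras functional API; the pooled feature vector, input sizes, and tiling approach are all assumptions rather than a recommended solution, and note that tf.tile still materialises one copy of the pooled vector per list entry (although pooling shrinks it considerably compared to the full feature map):

import tensorflow as tf
from tensorflow.keras import layers, Model

# channel 1: image -> CNN backbone -> one pooled feature vector per sample
img_in = layers.Input(shape=(224, 224, 3))
backbone = tf.keras.applications.VGG16(include_top=False, weights=None)
feat_vec = layers.GlobalAveragePooling2D()(backbone(img_in))      # (batch, 512)

# channel 2: variable-length list of entries, here 10-dimensional each
seq_in = layers.Input(shape=(None, 10))                           # (batch, T, 10)

# tile the static vector along the list axis so each entry sees it
def tile_static(args):
    vec, seq = args
    steps = tf.shape(seq)[1]
    return tf.tile(tf.expand_dims(vec, 1), [1, steps, 1])         # (batch, T, 512)

tiled = layers.Lambda(tile_static)([feat_vec, seq_in])
combined = layers.Concatenate(axis=-1)([seq_in, tiled])           # (batch, T, 522)

# per-entry classifier applied to every (entry, feature-vector) pair
out = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"))(combined)
model = Model(inputs=[img_in, seq_in], outputs=out)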