Using the R language interface to TensorFlow, it is possible to program deep neural networks with the tfestimators library. In this post, I provide a simple example of how to create a classifier.
The problem is to predict the way the Sensex will move based on the following parameters:
- Historical prices of the Sensex
- Historical prices of the Nifty
- Historical prices of Gold
- The date on which the above 3 prices were recorded
- The day of the month of the above date
- The month of the above date
- The year of the above date
- The day of the week of the date
- The Julian day of the date
The Sensex move will be reported as 0 if the Sensex goes up compared to the previous day's index, 1 if it goes down compared to the previous day's index, and 2 if it remains the same as the previous day's index.
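As a small sketch of this encoding rule (the prices below are made-up values used purely for illustration):

# Toy illustration of the movement encoding; the prices are made-up values
sensex <- c(35000, 35200, 35100, 35100)
move <- c(2, ifelse(diff(sensex) > 0, 0, ifelse(diff(sensex) < 0, 1, 2)))
move
# [1] 2 0 1 2   (the first day has no previous value, so it is coded as 2)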
Before we start programming the predictor, let us look at the libraries that we require.
if("lubridate" %in% rownames(installed.packages()) == FALSE) {install.packages("lubridate")} library(lubridate) if("tensorflow" %in% rownames(installed.packages()) == FALSE) {install.packages("tensorflow"); library(tensorflow); install_tensorflow()} library(tensorflow) if("tfestimators" %in% rownames(installed.packages()) == FALSE) {install.packages("tfestimators")} library(tfestimators)
Next, let us load the data and have a look at it.
df <- read.csv(file = "PriceIndex.csv", header = TRUE, sep = ",")
df$ObsDate <- as.Date(df$ObsDate, "%Y-%m-%d")
df$ObsMonth <- month(df$ObsDate)
df$YearNum <- year(df$ObsDate)
df$WeekDay <- as.POSIXlt(df$ObsDate)$wday
df$ObsDay <- day(df$ObsDate)
df$ObsJulianDay <- as.numeric(format(df$ObsDate, "%j"))
Let us have a look at the first few and the last few records in the data.
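A quick way to inspect both ends of the data frame (the printed tables are not reproduced here):

head(df)   # first few records
tail(df)   # last few records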
We need to create the labels for the Sensex movement based on the historical data. These labels will be used to train the model.
# Column 6 of df holds the SENSEX value; compare each day with the previous day.
# The first row has no previous day, so it keeps the default label 2.
df$sensexMove <- 2
for (i in 2:nrow(df)) {
  if (is.na(df[i, 6]) | is.na(df[i - 1, 6])) {
    sensexMovement <- 2
  } else if (df[i, 6] > df[i - 1, 6]) {
    sensexMovement <- 0
  } else if (df[i, 6] < df[i - 1, 6]) {
    sensexMovement <- 1
  } else {
    sensexMovement <- 2
  }
  df$sensexMove[i] <- sensexMovement
}
df$sensexMove <- as.factor(as.integer(df$sensexMove))
Now, we start building the model.
First, we need to set up the input function. The input function defines the features (independent variables) and the response (dependent variable). In our case, the dependent variable is the Sensex movement label. The independent variables are the values of the Sensex, Nifty and Gold, along with the year, month, day of the month, day of the week and Julian day of the observation date.
sensex_input_fn <- function(data, num_epochs = 1) {
  input_fn(data,
           features = c("ObsJulianDay", "ObsMonth", "ObsDay", "WeekDay",
                        "YearNum", "SENSEX", "NIFTY", "GOLD"),
           response = "sensexMove",
           batch_size = 32,
           num_epochs = num_epochs)
}
Next, we need to set up the feature columns. Feature columns can be categorical or numeric. We use only numeric columns in this example.
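For reference, this is roughly how a categorical feature column could be declared if we wanted to treat WeekDay as a category rather than a number. This is only an illustration and is not used in the rest of the example:

# Illustration only: WeekDay as a categorical feature with a fixed vocabulary,
# wrapped in an indicator column so that a DNN model can consume it
weekday_cat <- column_categorical_with_vocabulary_list("WeekDay", vocabulary_list = 0:6)
weekday_ind <- column_indicator(weekday_cat)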
sensex_cols <- feature_columns(
  column_numeric("ObsJulianDay"),
  column_numeric("ObsMonth"),
  column_numeric("ObsDay"),
  column_numeric("WeekDay"),
  column_numeric("YearNum"),
  column_numeric("NIFTY"),
  column_numeric("SENSEX"),
  column_numeric("GOLD")
)
Next, we need to set up the model. We set up a Deep Neural Network classifier. The model_dir parameter specifies where the model will be stored on disk. The stored model can be retrieved at a later point in time to make predictions.
We can set up the hidden layers of the neural network using the hidden_units parameter. In this example, I have set up 3 hidden layers, each containing 8 neurons.
modelSensex <- dnn_classifier(hidden_units = c(8, 8, 8),
                              feature_columns = sensex_cols,
                              n_classes = 3,
                              model_dir = './Move')
The complete signature of the dnn_classifier function is as follows.
dnn_classifier(hidden_units, feature_columns, model_dir = NULL,
n_classes = 2L, weight_column = NULL, label_vocabulary = NULL,
optimizer = "Adagrad", activation_fn = "relu", dropout = NULL,
input_layer_partitioner = NULL, config = NULL)
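As an illustration of the optional arguments, here is a variant that sets the optimizer, activation function and dropout explicitly. The hyperparameter values and the './MoveAlt' directory are arbitrary choices for the sketch, not tuned or used elsewhere in this post:

# Illustrative variant with explicit optimizer, activation and dropout
modelSensexAlt <- dnn_classifier(hidden_units = c(16, 16),
                                 feature_columns = sensex_cols,
                                 n_classes = 3,
                                 optimizer = "Adagrad",
                                 activation_fn = "relu",
                                 dropout = 0.2,
                                 model_dir = './MoveAlt')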
Next, we set up the data for training and testing. We use 90% of the available data for training and 10% of the available data for testing.
cutOfDate <- "2018-04-10" dfModel <- subset(df, OBSDATE <= as.Date(cutOfDate, "%Y-%m-%d")) indicesSensex <- sample(1:nrow(dfModel), size = 0.90 * nrow(dfModel)) trainSensex <- df[indicesSensex, ] testSensex <- df[-indicesSensex, ] # train the model modelSensex %>% train(sensex_input_fn(trainSensex, num_epochs = 10))
The following factor levels of 'sensexMove' have been encoded:
- '0' => 0
- '1' => 1
- '2' => 2
2018-05-13 19:38:55.117949: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
[\] Training -- loss: 20.01, step: 542
Once the model has been trained, we can evaluate it.
modelSensex %>% evaluate(sensex_input_fn(testSensex))
The following factor levels of 'sensexMove' have been encoded:
- '0' => 0
- '1' => 1
- '2' => 2
[/] Evaluating -- loss: 2.31, step: 8
# A tibble: 1 x 4
  average_loss  loss global_step accuracy
         <dbl> <dbl>       <dbl>    <dbl>
1         1.11  31.3        542.    0.327
Once the model has been created, it is persisted on disk in the directory defined by the model_dir parameter.
We can write the following code to load the model if it is already available, or generate the model if it has not been created yet.
# Set generateSensexModel to 1 to (re)build the model, 0 to reuse the persisted one
generateSensexModel <- 0

if (generateSensexModel == 1) {
  modelSensex <- dnn_classifier(hidden_units = c(8, 8, 8),
                                feature_columns = sensex_cols,
                                n_classes = 3,
                                model_dir = './Move')

  cutOfDate <- "2018-04-10"
  dfModel <- subset(df, ObsDate <= as.Date(cutOfDate, "%Y-%m-%d"))
  indicesSensex <- sample(1:nrow(dfModel), size = 0.90 * nrow(dfModel))
  trainSensex <- dfModel[indicesSensex, ]
  testSensex <- dfModel[-indicesSensex, ]

  # Train the model
  modelSensex %>% train(sensex_input_fn(trainSensex, num_epochs = 10))

  # Evaluate the model
  modelSensex %>% evaluate(sensex_input_fn(testSensex))

  # Persist the model directory path for later use
  saved_model_dir <- model_dir(modelSensex)
  saveRDS(saved_model_dir, file = "SensexModelDir.rds")
}
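When generateSensexModel is 0, the persisted model can be picked up again. A minimal sketch, assuming the checkpoints written earlier are still present in the saved directory:

if (generateSensexModel == 0) {
  # Read back the persisted model directory path
  saved_model_dir <- readRDS("SensexModelDir.rds")
  # Re-creating the estimator with the same feature columns and model_dir
  # restores the latest checkpoint from that directory
  modelSensex <- dnn_classifier(hidden_units = c(8, 8, 8),
                                feature_columns = sensex_cols,
                                n_classes = 3,
                                model_dir = saved_model_dir)
}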
Now that the model has been created, it can be used to predict the value of the dependent variable, i.e. the direction of the Sensex move.
# TodayDate and YesterdayRecord are assumed to hold the prediction date and
# the previous day's observed record, respectively.

# Form the prediction parameters
ObsJulianDay <- c(as.numeric(format(TodayDate, "%j")))
ObsMonth <- c(month(TodayDate))
ObsDay <- c(day(TodayDate))
WeekDay <- c(as.POSIXlt(TodayDate)$wday)
YearNum <- c(year(TodayDate))
NIFTY <- c(YesterdayRecord$Nifty)
GOLD <- c(YesterdayRecord$Gold)
# SENSEX is also a feature of the model; the previous day's value is assumed
# to be available as YesterdayRecord$Sensex
SENSEX <- c(YesterdayRecord$Sensex)
obs <- data.frame(ObsJulianDay, ObsMonth, ObsDay, WeekDay, YearNum,
                  SENSEX, NIFTY, GOLD)

# Predict the movement class (0 = up, 1 = down, 2 = unchanged)
movement <- (modelSensex %>% predict(sensex_input_fn(obs)))
movement <- as.numeric(paste(unlist(movement['class_ids']), collapse = ''))
We can check how our model performed on the test data.
For this, we create a prediction for each of the test data points.
movement <- (modelSensex %>% predict(sensex_input_fn(testSensex)))
movePreds <- unlist(movement['class_ids'])
fMovePreds <- as.factor(movePreds)
fOriginal <- as.factor(testSensex$sensexMove)
Now, we can check the Confusion Matrix and see the performance.
install.packages("caret") install.packages("e1071") library(caret) library(e1071) confusionMatrix(data=fMovePreds, reference=fOriginal, positive = NULL, dnn = c("Prediction", "Reference"))
Confusion Matrix and Statistics

          Reference
Prediction  0  1  2
         0 13 23 15
         1 20 16 15
         2 55 31 38

Overall Statistics

               Accuracy : 0.2965
                 95% CI : (0.2377, 0.3606)
    No Information Rate : 0.3894
    P-Value [Acc > NIR] : 0.9986

                  Kappa : -0.039
 Mcnemar's Test P-Value : 2.676e-06

Statistics by Class:

                     Class: 0 Class: 1 Class: 2
Sensitivity           0.14773   0.2286   0.5588
Specificity           0.72464   0.7756   0.4557
Pos Pred Value        0.25490   0.3137   0.3065
Neg Pred Value        0.57143   0.6914   0.7059
Prevalence            0.38938   0.3097   0.3009
Detection Rate        0.05752   0.0708   0.1681
Detection Prevalence  0.22566   0.2257   0.5487
Balanced Accuracy     0.43618   0.5021   0.5073
Please do leave a reply with your comments.