Finance Train

Measure Model Performance in R Using ROCR Package

November 13, 2019, 9:21 pm

R’s ROCR package can be used for evaluating and visualizing the performance of classifiers / fitted models. It is helpful for estimating performance measures and plotting these measures over a range of...

Create a Confusion Matrix in R

November 13, 2019, 9:32 pm

A confusion matrix is a tabular representation of Actual vs Predicted values. As you can see, the confusion matrix avoids “confusion” by measuring the actual and predicted values in a tabular format....
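
As a minimal sketch of the idea, the base R function table() can cross-tabulate actual against predicted classes. The label vectors below are made up for illustration (1 = default, 0 = fully paid) and are not taken from the article.

```r
# Hypothetical labels, invented for this example (1 = default, 0 = fully paid)
actual    <- c(1, 0, 0, 1, 0, 1, 0, 0)
predicted <- c(1, 0, 1, 0, 0, 1, 0, 0)

# Rows are actual classes, columns are predicted classes
conf_matrix <- table(Actual = actual, Predicted = predicted)
print(conf_matrix)

# Accuracy is the share of observations on the diagonal
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
print(accuracy)
```

The off-diagonal cells hold the two kinds of error: actual 0 predicted as 1 (false positives) and actual 1 predicted as 0 (false negatives).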

Credit Risk Modelling – Case Study – Lending Club Data

November 14, 2019, 3:57 am

To build a good model, it is important to use high quality data. For the purpose of this course, we will use the loan data available from LendingClub’s website. LendingClub is a US peer-to-peer lending...

Explore Financial Data in R

November 14, 2019, 5:32 pm

Now that we have the data file in our working directory, we can load it in our R session and start exploring it. Use the following command to load the data into R. The “stringsAsFactors = FALSE”...
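
The effect of stringsAsFactors = FALSE can be sketched with a small temporary file. The file contents and column names below are invented for illustration; the article's actual file name is not shown in the excerpt, so none is assumed here.

```r
# Write a tiny, made-up CSV to a temporary file (stand-in for the real data file)
tmp <- tempfile(fileext = ".csv")
writeLines(c("loan_amnt,grade", "5000,A", "12000,C"), tmp)

# With stringsAsFactors = FALSE, text columns are read as character vectors
loans <- read.csv(tmp, stringsAsFactors = FALSE)
str(loans)

# With stringsAsFactors = TRUE, the same column becomes a factor
loans_f <- read.csv(tmp, stringsAsFactors = TRUE)
str(loans_f)
```

Keeping text as character avoids accidental factor levels while the columns are still being cleaned; a column can be converted to a factor later where a model needs it.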

Explore Loan Data in R – Loan Grade and Interest Rate

November 14, 2019, 9:11 pm

There is no set path to how one would go about analyzing a data set. Typically, a data scientist would spend quite some time exploring and observing the data to understand it well. Let’s look at some...

Credit Risk Modelling – Required R Packages

November 14, 2019, 9:15 pm

During our analysis, we will make use of various R packages. So, let’s look at what these packages are, then install and load them in R. dplyr: The ‘dplyr’ package provides a set of tools for efficiently...

Loan Data – Training and Test Data Sets

November 14, 2019, 9:24 pm

For building the model, we will divide our data into two different data sets, namely training and testing datasets. The model will be built using the training set and then we will test it on the...
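
A random split of this kind might look as follows in base R. The 75/25 ratio, the seed, and the toy data frame are assumptions made for illustration; the excerpt does not state which proportions the article uses.

```r
set.seed(42)  # arbitrary seed, used only so the split is reproducible

# Toy stand-in for the loan data (100 made-up rows)
loans <- data.frame(
  id        = 1:100,
  loan_amnt = seq(1000, by = 250, length.out = 100)
)

# Draw 75% of the row indices at random for training (assumed ratio)
train_idx  <- sample(seq_len(nrow(loans)), size = floor(0.75 * nrow(loans)))
train_data <- loans[train_idx, ]
test_data  <- loans[-train_idx, ]

nrow(train_data)
nrow(test_data)
```

Because the test rows are selected with a negative index, no row can end up in both sets.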

Data Cleaning in R – Part 1

November 14, 2019, 9:31 pm

Discarding Attributes: LendingClub also provides a data dictionary that contains details of all attributes of our dataset. We can use that dictionary to understand more about the data columns we have...

Data Cleaning in R – Part 2

November 14, 2019, 11:03 pm

Attributes with Zero Variance: Datasets can sometimes contain attributes (predictors) that have near-zero variance, or may have just one value. Such variables are considered to have less predictor...
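
The single-value case can be sketched in base R by counting distinct values per column. The data frame and its column names are hypothetical, and this check only catches columns that are exactly constant; the near-zero-variance case needs a frequency-based test such as the nearZeroVar function in the caret package.

```r
# Toy data with one constant column (names are invented for illustration)
df <- data.frame(
  loan_amnt   = c(5000, 12000, 8000, 15000),
  policy_code = c(1, 1, 1, 1),  # same value in every row
  term        = c("36 months", "60 months", "36 months", "36 months"),
  stringsAsFactors = FALSE
)

# Count distinct values in each column
n_unique <- sapply(df, function(col) length(unique(col)))

# Columns with a single distinct value carry no information for a model
constant_cols <- names(n_unique)[n_unique <= 1]
df_reduced    <- df[, !(names(df) %in% constant_cols), drop = FALSE]

constant_cols
names(df_reduced)
```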

Data Cleaning in R – Part 3

November 15, 2019, 12:43 am

Default by States: We take a look at the default rate for each state. We filter out states that have too few loans (fewer than 1000). Order States by Default Rate: We can order states by default...
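
The filter-then-rank step could be sketched in base R as below. The column names addr_state and is_default, the toy rows, and the threshold of 2 are assumptions chosen so the example stays small; the excerpt uses a threshold of 1000 loans on the real data.

```r
# Made-up loans: state and a 0/1 default flag (column names are assumptions)
loans <- data.frame(
  addr_state = c("CA", "CA", "CA", "NY", "NY", "TX"),
  is_default = c(1, 0, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

min_loans <- 2  # toy threshold; the excerpt uses 1000 on the full dataset

# Keep only states with enough loans for the rate to be meaningful
counts   <- table(loans$addr_state)
keep     <- names(counts)[counts >= min_loans]
filtered <- loans[loans$addr_state %in% keep, ]

# Default rate per state, ordered from highest to lowest
rates <- tapply(filtered$is_default, filtered$addr_state, mean)
rates <- sort(rates, decreasing = TRUE)
print(rates)
```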

Advanced Concept of Risk-reward Ratio in Trading

November 15, 2019, 8:09 pm

Naïve traders often think that by following the simple concept of the risk-reward ratio, they can make huge money. Things don’t work like this in the real world, however. Most of the time, the new traders in...

5 Factors That Influence the Stock Market – Explained

November 15, 2019, 8:57 pm

While the success of a trader relies mostly on their abilities to anticipate market changes and act upon them, the stock market is known for being fairly volatile. For decades now, experts have...

Data Cleaning in R – Part 5

November 15, 2019, 10:19 pm

Numeric Features: Let’s look at all the numeric features we have left. We will transform annual_inc, revol_bal, avg_cur_bal, bc_open_to_buy by dividing them by funded_amnt (the loan amount). We can now...
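
The ratio transformation described here can be sketched in base R as follows. The column names come from the excerpt; the two rows of values are made up for illustration.

```r
# Toy rows with the columns named in the text (values are invented)
loans <- data.frame(
  funded_amnt    = c(10000, 20000),
  annual_inc     = c(50000, 80000),
  revol_bal      = c(5000, 10000),
  avg_cur_bal    = c(2000, 4000),
  bc_open_to_buy = c(1000, 3000)
)

# Express each dollar amount relative to the size of the loan
ratio_cols <- c("annual_inc", "revol_bal", "avg_cur_bal", "bc_open_to_buy")
loans[ratio_cols] <- lapply(
  loans[ratio_cols],
  function(col) col / loans$funded_amnt
)

print(loans)
```

After the transformation, annual_inc reads as income in multiples of the loan amount, which makes borrowers with different loan sizes comparable.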

Remove Dimensions By Fitting Logistic Regression

November 15, 2019, 10:24 pm

We will use the preProcess function from the caret package to center and scale (Normalize) the data. The scale transform calculates the standard deviation for an attribute and divides each value by...
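
What centering and scaling compute can be shown by hand in base R. This is a sketch of the arithmetic only, not the caret preProcess call the article uses, and the vectors are made up for illustration.

```r
# Made-up training values for a single attribute
train_x <- c(1, 2, 3, 4, 5)

# Center: subtract the training mean. Scale: divide by the training sd.
mu <- mean(train_x)
s  <- sd(train_x)
train_scaled <- (train_x - mu) / s

# New data must reuse the training mean and sd, not its own
test_x      <- c(6, 0)
test_scaled <- (test_x - mu) / s

mean(train_scaled)  # approximately 0
sd(train_scaled)    # 1
```

Reusing the training statistics on the test set keeps both sets on the same scale and avoids leaking information from the test data into the preprocessing.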

Create a Function and Prepare Test Data in R

November 16, 2019, 11:13 pm

When we build the model, we will need the same set of columns in the test data, and we will also need to apply to test_data all the same transformations that we have already done. Kept Columns: Create...

Building Credit Risk Model

November 16, 2019, 11:17 pm

The loan data typically have a higher proportion of good loans. We can achieve high accuracy just by labeling all loans as Fully Paid. For our test data, we get 70.3% accuracy by just following the...
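
The majority-class baseline can be sketched in a few lines of base R. The ten labels below are made up and give 70% rather than the 70.3% reported for the article's test data.

```r
# Made-up outcomes: 7 good loans and 3 bad ones
status <- c(rep("Fully Paid", 7), rep("Charged Off", 3))

# Baseline "model": predict the majority class for every loan
baseline_pred <- rep("Fully Paid", length(status))

# Accuracy equals the share of Fully Paid loans in the data
baseline_acc <- mean(baseline_pred == status)
print(baseline_acc)
```

Any real model has to beat this number to be useful, which is why accuracy alone is a weak measure on imbalanced loan data.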

Logistic Regression Model in R

November 17, 2019, 5:41 pm

To build our first model, we will tune Logistic Regression to our training dataset. First we set the seed (to any number; we have chosen 100) so that we can reproduce our results. Then we create a...
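
As a rough illustration of seeding and fitting, the sketch below uses base R's glm() on simulated data. It is a stand-in rather than the article's own code, whose remaining steps are cut off in the excerpt; the simulated predictor x, the outcome y, and the coefficients used to generate them are all assumptions.

```r
set.seed(100)  # the seed value mentioned in the text

# Simulate a toy binary outcome that depends on one predictor
n <- 200
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * x))
train <- data.frame(x = x, y = y)

# Fit a logistic regression with base R
fit <- glm(y ~ x, data = train, family = binomial)
print(coef(fit))

# Predicted probabilities of the positive class
prob <- predict(fit, type = "response")
summary(prob)
```

Setting the seed before any random step means the simulated data, and therefore the fitted coefficients, are the same on every run.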

Support Vector Machine (SVM) Model in R

November 17, 2019, 5:47 pm

A support vector machine (SVM) is a supervised learning technique that analyzes data and isolates patterns applicable to both classification and regression. The classifier is useful for choosing...

Random Forest Model in R

November 18, 2019, 6:11 am

Now we will tune the RandomForest model. As with SVM, we tune the parameters based on 5% of the downsampled data. The procedure is exactly the same as for the SVM model. Below we have reproduced the code for Random Forest...

Extreme Gradient Boosting in R

November 18, 2019, 6:18 am

Extreme Gradient Boosting has a very efficient implementation. Unlike SVM and RandomForest, we can tune the parameters using the whole downsampled set. We focus on varying Ridge & Lasso regularization...
