**What determines the interest rate on your loan? This is an analysis of about 110,000 consumer loans from a peer-to-peer lending marketplace called Prosper Loans (Prosper.com)**

**Analysis performed using R programming language.**

## Distributions of Single Variables

Let's look at the variables individually to get a sense of them.

### Loan Amount

The median loan amount is $6500. We also notice peaks at $10000, $15000 and $20000. This is likely because people tend to borrow in round amounts.
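A minimal sketch of how the median and the most common loan amount could be checked in R. The data frame below is a small made-up sample, not the Prosper data; only the column name `LoanOriginalAmount` is taken from the model output later in this analysis.

```r
# Toy stand-in for the loans data (values are invented for illustration).
loans_toy <- data.frame(
  LoanOriginalAmount = c(2500, 3000, 4000, 5000, 6500, 10000, 10000, 15000, 20000)
)

# Median loan amount.
median_amount <- median(loans_toy$LoanOriginalAmount)

# Count how often each amount occurs and sort so the most frequent comes first.
amount_counts <- sort(table(loans_toy$LoanOriginalAmount), decreasing = TRUE)
most_common <- as.numeric(names(amount_counts)[1])

print(median_amount)
print(most_common)
```

On the real data, the same frequency table would show whether round amounts dominate the counts.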

### Borrower Rate

Let's look at the overall profile of interest rates offered to the borrower population.

We observe a peak at 18% (which is the mean) and another one at 32%. One could venture a guess that this is a flat rate offered to risky borrowers.

### Lender Yield

```
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##  -0.010   0.124   0.173   0.183   0.240   0.492
```

The distribution of Lender Yield closely resembles that of the borrower rate but sits slightly lower. The difference is accounted for by various fees and collection charges.

### Estimated Lender Loss

About a quarter of loans register a loss (29084 out of 113937), and the median loss, when one occurs, is 7.2%.
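The "about a quarter" figure follows directly from the two counts quoted above. A short check in R, using only those numbers (the variable names are illustrative):

```r
# Counts quoted in the text.
loans_with_loss <- 29084
total_loans <- 113937

# Share of loans that register a loss, as a percentage rounded to one decimal.
loss_share_pct <- round(100 * loans_with_loss / total_loans, 1)
print(loss_share_pct)
```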

### Borrower Rating

Borrower ratings form a roughly bell-shaped distribution across the ordered rating categories. The largest fraction of borrowers is rated “C”.

### Income Range

Income ranges also follow a roughly bell-shaped distribution. The majority of borrowers earn between $25000 and $75000, and the median monthly income is $4667. Interestingly, there is little difference between income ranges when it comes to borrower rating.

### Debt to Income Ratio

The Debt to Income Ratio of the best borrowers (AA) tapers off sharply at 25%. Also, borrowers with the lowest income have the biggest spread in Debt to Income Ratio.

### Monthly Loan Payments

The highest frequency of monthly payments is at $200. Higher income ranges tend to have a bigger spread of monthly payments. Better rated customers have higher monthly payments.

### Number of Investors

Riskier borrowers attract fewer investors.

### Borrower Ratings vs. Loan Amounts

Higher-quality borrowers are able to borrow larger amounts.

### Borrower Ratings vs. Interest Rates

The highest rated customers (AA) get better interest rates. “HR” rated borrowers pay the most interest.
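A minimal sketch of how rates could be compared across ratings in R. The sample below is invented and covers only three ratings; the column names `CustRating` and `BorrowerRate` are the ones that appear in the model output later in this analysis.

```r
# Toy stand-in for the loans data; the rates below are invented for illustration.
rating_levels <- c("AA", "A", "B", "C", "D", "E", "HR")
loans_toy <- data.frame(
  CustRating = factor(c("AA", "AA", "C", "C", "HR", "HR"), levels = rating_levels),
  BorrowerRate = c(0.07, 0.09, 0.18, 0.20, 0.31, 0.33)
)

# Median rate per rating; ratings absent from the toy sample come back as NA.
median_rate <- tapply(loans_toy$BorrowerRate, loans_toy$CustRating, median)
print(median_rate)
```

Declaring the factor levels in order from AA to HR keeps the summary in rating order rather than alphabetical order.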

### Employment and Interest Rates

Borrowers who are “Employed” or “Full-Time” pay lower interest rates. The “Self-Employed” category pays the highest interest rates.

### Occupations and Interest Rates

Borrowers in professional occupations pay less interest. Laborers pay the most.

## Interest Rate Profile by Income

Let's look at the overall profile of interest rates offered, by Income.

It is interesting to see that borrowers with the highest income pay the least interest and vice versa.

## Interest Rate Profile by Employment Status

Let's look at the overall profile of interest rates offered, by Employment Status

The distribution of interest rates across the categories is largely uniform.

### Effective Yield and Borrower Quality

Interestingly, borrowers that are riskier than average (HR,E,D) produce more effective return for the investor than better quality borrowers.

### Borrower Quality and Losses

Let's look at the overall profile of Estimated Loss

There are multiple peaks in the distribution, but most losses occur at 8%. Unsurprisingly, the riskiest borrowers (E,HR) produce more loss than safer borrowers (AA,A)

### Income Ranges and Interest Rates

There is no difference between income ranges when it comes to interest rates. It is purely a function of borrower quality.

### Income Range vs. Monthly Payments

Not surprisingly, borrowers with higher income pay more in monthly payments. But the spread in monthly payments appears to be independent of the borrower's rating.

### Interest Rates vs. Borrower Ratings

As it should be, highest quality borrowers pay the least interest.

### Prosper Ratings and Credit Scores

Credit scores (from ratings agencies) agree with the customer rating provided by Prosper.

### Borrower Rating vs. Open Credit Lines

The number of open credit lines appears to be independent of the quality of the borrower.

### Borrower Rating vs. Bank Card Utilization

Interestingly, A and B rated borrowers have the highest bank card utilization. It is possible that the other borrowers tend to have debt that is not credit card debt.

### Loan Amount and Income Range by Rating

Not surprisingly, people with higher income borrow more money.

### Number of Investors vs. Customer Rating per Income Range

It is interesting to see that the lenders have no preference for borrowers of higher quality or with better income.

### Income Ranges vs. Loan Amounts

It looks like the loan amounts are roughly proportional to the income ranges of borrowers. Variance in loan amounts is partly explained by the quality of the borrower.

### Borrower Ratings vs. Interest Rates

As we already saw, higher quality borrowers get better rates.

### Credit Scores vs. Interest Rates

For credit scores between 650 and 850, there are differences between income ranges when it comes to interest rates. Unsurprisingly, higher rated borrowers have higher credit scores and get better rates.

### Borrower Ratings vs. Effective Yields

As we already saw, riskier borrowers produce better yields with higher spreads. There's virtually no difference between income ranges in yield rates.

## Loss Rate Profile by Income

Let's look at the overall profile of loss rates, by income.

Unemployed borrowers produce the most loss, unsurprisingly.

## Loss Rate Profile by Employment Status

Let's look at the overall profile of loss rates, by employment status.

Once again, unemployed borrowers produce the most loss.

### Borrower Ratings vs. Loss

Not surprisingly, riskier customers produce more losses. The distribution is independent of income range.

### Percent Funded vs. Borrower Rating

Virtually all loans get funded regardless of borrower quality.

### Loan Amounts vs. Borrower Rates per Borrower Quality, faceted by Employment Status

While higher quality borrowers get better rates, “Employed” borrowers have the least variation in their rates, followed by “Full-Time” borrowers.

### Loan Amounts vs. Borrower Rates per Borrower Quality, faceted by Income

It appears that income does not matter when it comes to borrower rates. It's more of a function of borrower quality.

### Credit Scores and Interest Rates

### Prosper Score and Interest Rates

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## ProsperScore BorrowerRate
## ProsperScore . -0.650
## BorrowerRate -0.650 .
```

Prosper Score and interest rates are negatively correlated at -0.650.
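A minimal sketch of computing a Pearson correlation like this one with base R's `cor()`. The text does not name the tool that produced the output above, so `cor()` is used here as a stand-in, and the data is invented; only the column names `ProsperScore` and `BorrowerRate` come from that output.

```r
# Toy stand-in for the loans data, including missing values (all values invented).
loans_toy <- data.frame(
  ProsperScore = c(2, 4, 5, 7, 9, NA, 10),
  BorrowerRate = c(0.30, 0.25, 0.22, 0.15, 0.10, 0.20, NA)
)

# Pearson correlation, using only rows where both values are present.
r <- cor(loans_toy$ProsperScore, loans_toy$BorrowerRate,
         use = "complete.obs", method = "pearson")
print(round(r, 3))
```

Without `use = "complete.obs"`, `cor()` returns `NA` as soon as either column contains a missing value, which matters for a data set with as many gaps as this one.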

### Delinquency and Interest Rates

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## LoanCurrentDaysDelinquent BorrowerRate
## LoanCurrentDaysDelinquent . 0.136
## BorrowerRate 0.136 .
```

There is a very weak relationship between days delinquent and interest rates.

### Loan Amount and Interest Rates

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## LoanOriginalAmount BorrowerRate
## LoanOriginalAmount . -0.329
## BorrowerRate -0.329 .
```

Loan amounts and interest rates are negatively correlated at -0.329, a weak-to-moderate relationship.

### Monthly Income and Interest Rates

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## StatedMonthlyIncome BorrowerRate
## StatedMonthlyIncome . -0.089
## BorrowerRate -0.089 .
```

There is virtually no linear relationship between stated monthly income and interest rates.

### Debt to Income Ratio and Interest Rates

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## DebtToIncomeRatio BorrowerRate
## DebtToIncomeRatio . 0.063
## BorrowerRate 0.063 .
```

Weak relationship between Debt to Income ratio and interest rate

### Credit Score and Estimated Loss

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## CreditScore EstimatedLoss
## CreditScore . -0.511
## EstimatedLoss -0.511 .
```

Moderate negative correlation between credit scores and loss.

### Prosper Score and Estimated Loss

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## ProsperScore EstimatedLoss
## ProsperScore . -0.674
## EstimatedLoss -0.674 .
```

Moderate negative correlation between Prosper Scores and loss, and a stronger one than for credit scores alone (-0.674 versus -0.511).

### Delinquency and Estimated Loss

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## LoanCurrentDaysDelinquent EstimatedLoss
## LoanCurrentDaysDelinquent . 0.195
## EstimatedLoss 0.195 .
```

Weak correlation between delinquency and loss.

### Loan Amount and Estimated Loss

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## LoanOriginalAmount EstimatedLoss
## LoanOriginalAmount . -0.430
## EstimatedLoss -0.430 .
```

Moderate negative correlation between Loan Amount and loss.

### Loan Amounts and Monthly Incomes

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## LoanOriginalAmount StatedMonthlyIncome
## LoanOriginalAmount . 0.201
## StatedMonthlyIncome 0.201 .
```

Weak positive correlation between Loan Amount and Monthly Income.

### Monthly Incomes and Debt to Income Ratio

```
##
## CORRELATIONS
## ============
## - correlation type: pearson
## - correlations shown only when both variables are numeric
##
## DebtToIncomeRatio StatedMonthlyIncome
## DebtToIncomeRatio . -0.123
## StatedMonthlyIncome -0.123 .
```

Weak negative correlation between Debt to Income Ratio and Monthly Income.

### Linear Model - Borrower Rates

Let's try to build a linear model that explains the borrower's interest rates in terms of the independent variables: Credit Score, Rating, Prosper Score, Total number of loans, Delinquency and the Loan Amount.

```
## Call:
## lm(formula = BorrowerRate ~ CreditScore + CustRating + ProsperScore +
## TotalProsperLoans + LoanCurrentDaysDelinquent + LoanOriginalAmount,
## data = loans)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.15865 -0.01203 -0.00089 0.01286 0.16471
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.43e-01 3.83e-03 37.35 <2e-16 ***
## CreditScore -6.21e-05 4.47e-06 -13.90 <2e-16 ***
## CustRatingA 2.99e-02 7.46e-04 40.02 <2e-16 ***
## CustRatingB 6.62e-02 8.61e-04 76.86 <2e-16 ***
## CustRatingC 1.05e-01 9.51e-04 110.97 <2e-16 ***
## CustRatingD 1.49e-01 1.05e-03 141.38 <2e-16 ***
## CustRatingE 2.02e-01 1.21e-03 166.55 <2e-16 ***
## CustRatingHR 2.18e-01 1.27e-03 171.00 <2e-16 ***
## ProsperScore -2.38e-03 9.25e-05 -25.73 <2e-16 ***
## TotalProsperLoans -9.37e-04 2.01e-04 -4.67 3e-06 ***
## LoanCurrentDaysDelinquent 1.58e-05 1.02e-06 15.42 <2e-16 ***
## LoanOriginalAmount 6.11e-07 3.20e-08 19.07 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.0222 on 19785 degrees of freedom
## (94140 observations deleted due to missingness)
## Multiple R-squared: 0.919, Adjusted R-squared: 0.919
## F-statistic: 2.04e+04 on 11 and 19785 DF, p-value: <2e-16
```

We are able to explain 92% of the variation in interest rates in terms of these independent variables.
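A minimal sketch of fitting this kind of model with `lm()`. The data is simulated with an invented relationship between rating and rate, and the formula uses only two of the six predictors; only the column names `BorrowerRate`, `CustRating` and `LoanOriginalAmount` are taken from the output above.

```r
set.seed(1)
n <- 200
rating_levels <- c("AA", "A", "B", "C", "D", "E", "HR")

# Simulated stand-in for the loans data (not the Prosper data).
loans_toy <- data.frame(
  CustRating = factor(sample(rating_levels, n, replace = TRUE), levels = rating_levels),
  LoanOriginalAmount = sample(seq(1000, 25000, by = 500), n, replace = TRUE)
)

# Invented relationship: the rate rises by 3 points per rating step, plus noise.
loans_toy$BorrowerRate <- 0.06 + 0.03 * (as.integer(loans_toy$CustRating) - 1) +
  rnorm(n, sd = 0.01)

# Blank out some predictor values to mimic missing data.
loans_toy$LoanOriginalAmount[1:20] <- NA

# By default, lm() drops rows with a missing value in any model variable.
fit <- lm(BorrowerRate ~ CustRating + LoanOriginalAmount, data = loans_toy)
rows_used <- nobs(fit)
r_squared <- summary(fit)$r.squared
print(rows_used)
```

Listing `AA` first in the factor levels makes it the reference category, which is consistent with the output above showing coefficients for A through HR but none for AA.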

### Linear Model - Estimated Loss

Now let's try to build a linear model that explains the estimated loss on the loan in terms of the interest rate, customer rating, credit score and the borrower's debt to income ratio.

```
##
## Call:
## lm(formula = EstimatedLoss ~ BorrowerRate + CustRating + CreditScore +
## DebtToIncomeRatio, data = loans)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.04117 -0.00523 -0.00007 0.00436 0.20980
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.61e-02 7.48e-04 21.53 <2e-16 ***
## BorrowerRate 1.89e-01 1.60e-03 118.14 <2e-16 ***
## CustRatingA 9.06e-03 1.72e-04 52.51 <2e-16 ***
## CustRatingB 2.23e-02 2.10e-04 106.48 <2e-16 ***
## CustRatingC 3.82e-02 2.56e-04 149.22 <2e-16 ***
## CustRatingD 5.79e-02 3.28e-04 176.51 <2e-16 ***
## CustRatingE 8.23e-02 4.02e-04 204.92 <2e-16 ***
## CustRatingHR 1.12e-01 4.40e-04 253.63 <2e-16 ***
## CreditScore -2.12e-05 9.36e-07 -22.62 <2e-16 ***
## DebtToIncomeRatio -6.00e-04 1.10e-04 -5.45 5e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.00963 on 77547 degrees of freedom
## (36380 observations deleted due to missingness)
## Multiple R-squared: 0.956, Adjusted R-squared: 0.956
## F-statistic: 1.87e+05 on 9 and 77547 DF, p-value: <2e-16
```

We are able to explain more than 95% of the variation in Loss to Investors in terms of these independent variables.
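The row counts reported by the two model summaries can be reconciled with a little arithmetic. The sketch below uses only numbers printed in the outputs above (residual degrees of freedom, number of coefficients, and rows deleted due to missingness); the variable names are illustrative.

```r
# Rows used by a linear model = residual degrees of freedom + number of coefficients.
used_rate <- 19785 + 12   # borrower rate model: intercept + 11 predictors
used_loss <- 77547 + 10   # estimated loss model: intercept + 9 predictors

# Adding back the rows dropped for missing values gives the size of the data set.
total_rate <- used_rate + 94140
total_loss <- used_loss + 36380

# Share of loans each model could not use, in percent.
dropped_rate_pct <- round(100 * 94140 / total_rate, 1)
dropped_loss_pct <- round(100 * 36380 / total_loss, 1)

print(c(total_rate, total_loss))
print(c(dropped_rate_pct, dropped_loss_pct))
```

Both totals come to 113937 loans, the same count quoted in the Estimated Lender Loss section. The borrower rate model is therefore fit on fewer than one in five loans, which is worth keeping in mind when reading its R-squared.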

## Interest Rate Profile by Borrower Quality

Let's look at the overall profile of interest rates offered, by borrower quality.

Each Borrower quality rating appears to have at least two peaks, except for the highest rated (AA). They generally blend into each other.

## Loss Rate by Borrower Quality

Let's look at the overall profile of loss rates, by borrower quality.

Each Borrower quality rating appears to have multiple peaks. Losses are higher with lower