
Not your typical financial risk model: A detailed data analysis example

TimManns
Last updated: 2010/10/29 at 3:04 AM

I’ve not done a lot of analysis in the finance industry, and my Google searches didn’t yield helpful insights for similar data mining. I just finished a project and would like some feedback. I’m trying to explain this as a data preparation and analysis approach to solve a specific problem. I’ve described it as best I could without names or actual data. I also produced a lot of presentation material and extra information about the segments that isn’t described here. If anyone has relevant words of wisdom, or suggestions for a different approach they would have taken, then please describe it! Otherwise, perhaps this will be helpful to others…

The business problem to solve was generating customer insight about businesses with loans, taking into account each client business’s financial health and loan repayment risk.

The first thing we concentrated on was tax payments. The data I had access to contained typical finance account monthly summaries (e.g. balance at close of month, total $ of transactions) but also two years of detailed transactional history of all outgoing and inbound money transfers/payments (including tax payments made by many thousands of businesses). We examined two years of summary data, plus only those transactions where the money transfer/payment involved the account number belonging to the tax man.

The core idea was to understand each business’s tax payments over time in order to get an accurate view of its financial health. Obviously this would be very important in predicting future loan repayments or the likelihood of future financial problems. One main objective was to understand whether tax payment behavior differed significantly between customers; a secondary consideration was the risk profile of any subgroups or segments that could be identified.

It was a quick preliminary investigation (less than two weeks’ work) so I tackled the problem very simplistically to meet deadlines.

For the majority of client businesses tax payments occur quarterly or monthly, so I first summarized the data to a quarterly aggregation, for example;

As you can see above, each customer could have many records (actually a maximum of 8, one for each quarter over a two-year period), each record showing the account balance at the end of the quarter and the net sum of payments made to (or from!) the tax man.
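A minimal sketch of that quarterly roll-up, in plain Python. The row layout, the sample values and the function name `to_quarterly` are all hypothetical, since the real data isn’t shown here; it also assumes the quarter-end balance is simply the closing balance of the last month present in the quarter.

```python
from collections import defaultdict

# Hypothetical monthly rows:
# (customer_id, year, month, closing_balance, net_tax_payment)
monthly = [
    ("C1", 2009, 1, 1000.0, 0.0),
    ("C1", 2009, 2, 1200.0, 0.0),
    ("C1", 2009, 3, 900.0, 300.0),
    ("C1", 2009, 4, 1100.0, 0.0),
    ("C1", 2009, 6, 1500.0, 250.0),
]

def to_quarterly(rows):
    """Roll monthly rows up to one record per customer per quarter.

    Balance is the closing balance of the latest month seen in the quarter;
    tax is the net sum of payments across the quarter.
    """
    acc = defaultdict(lambda: {"last_month": 0, "balance": 0.0, "tax": 0.0})
    for cust, year, month, balance, tax in rows:
        quarter = (month - 1) // 3 + 1
        rec = acc[(cust, year, quarter)]
        rec["tax"] += tax
        if month >= rec["last_month"]:
            rec["last_month"] = month
            rec["balance"] = balance
    return [
        (cust, year, quarter, rec["balance"], rec["tax"])
        for (cust, year, quarter), rec in sorted(acc.items())
    ]

quarterly = to_quarterly(monthly)
for row in quarterly:
    print(row)
```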

Then I created two offset copies of Tax Payments, one being the previous record (Lag) and the other being the subsequent record (Lead) like so;
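The offset copies can be sketched as follows for a single customer’s quarters, already sorted in time order. The function name `add_lag_lead` and the payment values are made up for illustration; `None` marks the first and last quarters, where no previous or subsequent record exists.

```python
def add_lag_lead(values):
    """Return (value, lag, lead) triples for one customer's ordered quarters.

    lag is the previous quarter's value and lead is the next quarter's;
    None marks the edges where no neighbour exists.
    """
    out = []
    for i, value in enumerate(values):
        lag = values[i - 1] if i > 0 else None
        lead = values[i + 1] if i < len(values) - 1 else None
        out.append((value, lag, lead))
    return out

tax_by_quarter = [300.0, 250.0, 0.0, 400.0]  # hypothetical, one customer
rows = add_lag_lead(tax_by_quarter)
for row in rows:
    print(row)
```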

I then simply scaled the data so that everything was between 0 and 1 by using;

(X – (minimum of X)) / ((maximum of X) – (minimum of X))

Here X is one of the variables representing quarterly account balance or tax payments, and the minimum and maximum are taken within each Customer ID.
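As a rough sketch, the per-customer rescaling could look like this in plain Python. The customer IDs and balances are invented, and mapping a constant series to zeros is my assumption for the case where the maximum equals the minimum.

```python
def min_max_scale(values):
    """Rescale one customer's values to the 0-1 range.

    A constant series has max == min, so it is mapped to all zeros
    instead of dividing by zero (an assumption made for this sketch).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical quarterly balances, keyed by customer ID
balances = {"C1": [900.0, 1500.0, 1200.0], "C2": [50.0, 50.0, 50.0]}
scaled = {cust: min_max_scale(vals) for cust, vals in balances.items()}
print(scaled)
```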

For example the raw data here;

Got rescaled to;

I rescaled all the raw balance and tax payment variables this way so that I could later run a Pearson’s correlation and k-means clustering, and also graph the data easily on the same axis (to directly compare balance and tax payments). Some business customers had very large account balances, but small tax payments.

For example I could eventually generate a line chart like this showing a specific business’ relationship between balance (dotted line) and tax payments (bold red line);

I then ran a simple Pearson’s correlation with the variable ‘Balance’ correlated against the three tax payment variables (original, lag, and lead), grouped by Customer ID. This outputs three correlation scores per customer: the first for account balance and tax payments in the same quarter, the second for current account balance and the previous quarter’s tax payments, and the third for current account balance and the following quarter’s tax payments.
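Here is a small sketch of the three correlations for one customer, written in plain Python rather than whatever tool was used on the project. The helper names and the sample series are hypothetical. Lag and lead are handled by slicing, so the edge quarters with no neighbour are simply dropped from that pair.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # a flat series has no defined correlation
    return sxy / sqrt(sxx * syy)

def balance_tax_correlations(balance, tax):
    """Correlate balance with same-period, previous and next tax payments."""
    return {
        "original": pearson(balance, tax),
        "lag": pearson(balance[1:], tax[:-1]),   # balance[t] vs tax[t-1]
        "lead": pearson(balance[:-1], tax[1:]),  # balance[t] vs tax[t+1]
    }

# Hypothetical series for one customer, in quarter order
balance = [1.0, 2.0, 3.0, 4.0, 5.0]
tax = [2.0, 1.0, 4.0, 3.0, 5.0]
scores = balance_tax_correlations(balance, tax)

# Pick the variant with the highest defined score
best = max((k for k in scores if scores[k] is not None), key=scores.get)
print(scores, best)
```

Running this once per Customer ID gives the three scores per customer. Whether "highest" should mean the largest signed value or the largest absolute value is a design choice; the sketch uses the signed value.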

My thought process was to use the highest correlation score (along with balance and tax payment amounts as described below) to build k-means clusters to segment the customer base. Hopefully the segments would reflect, amongst other things, the strongest relationship between account balance and tax payments.

I joined the correlation outputs to the data and then I flipped/transposed and summarized the data so that each quarter was a new column for balance and tax payments, creating a very wide and summarized data set. For example;

…also including the correlation, lag, lead and original value variables in the single record per customer…
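The transpose step might be sketched like this. The column naming (`balance_q1`, `tax_q1`, …), the function name `pivot_wide` and the sample rows are assumptions for illustration only; missing quarters are filled with `None` so that every customer ends up with the same set of columns.

```python
def pivot_wide(long_rows, n_quarters=8):
    """Turn (customer, quarter_index, balance, tax) rows into one dict per customer."""
    wide = {}
    for cust, quarter, balance, tax in long_rows:
        rec = wide.setdefault(cust, {})
        rec[f"balance_q{quarter}"] = balance
        rec[f"tax_q{quarter}"] = tax
    # Fill quarters a customer has no record for, so all records share columns
    for rec in wide.values():
        for quarter in range(1, n_quarters + 1):
            rec.setdefault(f"balance_q{quarter}", None)
            rec.setdefault(f"tax_q{quarter}", None)
    return wide

# Hypothetical rescaled values, two quarters only to keep the output short
long_rows = [("C1", 1, 0.0, 0.4), ("C1", 2, 1.0, 0.6), ("C2", 1, 0.5, 1.0)]
wide = pivot_wide(long_rows, n_quarters=2)
print(wide)
```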

Now I had a dataset with a nice single record per customer, and I concentrated on representing the growth or decline in tax payments over the two-year period. I did this quite simply by converting the raw payments into percentages of the sum of each customer’s payments over the two years. In some cases a high proportion of the customer’s payments occurred many months ago, which represents a decline in recent quarters.
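A sketch of the percentage conversion, with invented payment figures. Returning zeros when a customer’s payments sum to zero is my assumption; net refunds (negative quarters) would also need some thought in real data.

```python
def payment_shares(payments):
    """Express each quarter's payment as a share of the customer's total."""
    total = sum(payments)
    if total == 0:
        return [0.0] * len(payments)  # assumption: no payments means no shares
    return [p / total for p in payments]

# Hypothetical customer whose payments all fall in the first year
payments = [400.0, 300.0, 200.0, 100.0, 0.0, 0.0, 0.0, 0.0]
shares = payment_shares(payments)
first_year = sum(shares[:4])
second_year = sum(shares[4:])
print(shares, first_year, second_year)
```

A customer like this one, with all of the share in the first year and none in the second, is the kind of record that shows up as a decline in recent quarters.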

I then built a K-means model using inputs such as;

– the highest correlation score (of the three per customer) and categorical encoding of the correlations (eg. ‘negative correlation’ / ‘positive correlation’, ‘lag’ / ‘lead’ etc)
– Data manipulated payment sums
– Variables representing growth or decline in payments over time.
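To make the clustering step concrete, here is a bare-bones k-means (Lloyd’s algorithm) in plain Python. It is not the tool or settings used on the project. The three features per customer (best correlation score, a 0/1 flag for whether the best score was the lead variant, and the share of payments in the most recent year) and all the values are hypothetical stand-ins for the inputs listed above.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (centroids, labels). Initial centroids are k distinct points
    sampled with a fixed seed so the run is repeatable.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical features: (best correlation, is_lead flag, recent payment share)
customers = [
    (0.90, 1.0, 0.10), (0.85, 1.0, 0.15), (0.80, 1.0, 0.20),
    (-0.50, 0.0, 0.70), (-0.60, 0.0, 0.80), (-0.40, 0.0, 0.75),
]
centroids, labels = kmeans(customers, k=2)
print(labels)
```

Encoding the categorical inputs as 0/1 flags is one simple option; because k-means is distance based, the scale of each input affects how much it drives the segments.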

The segments that were generated have proved to perform very well. Many features of the client business that were not used in the segmentation (eg number of accounts per client, and risk propensity) could be distinguished quite clearly by each segment.

When I examined the incidence of risk (failure or problems repaying a business loan) for a three-month period (also with a three-month gap) I found some segments had almost double the risk propensity.

Timeline described below;

As you can see, there were a very small number of risk outcomes (just 204 in three months) but each of these is very high value, so any lift in risk prediction is beneficial. I hate working with such small samples, but sometimes you get given lemons….

Suppose I built five clusters, here’s an example summary of the type of results I managed to get;

Where ‘Risk Index’ is simply calculated as;

(‘% Of Total Risk’ – ‘% Of Client Count’ ) / ‘% Of Client Count’

So, this is showing that cluster 5 has a 67.91% higher propensity to be a bad risk than the entire base (well, in the analysis…). Conversely, cluster 2 is much less likely (-70%) to be a bad risk than the average customer.
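The Risk Index calculation can be sketched in a few lines. The cluster names and counts below are invented purely to show the arithmetic and are not the project’s results.

```python
def risk_index(pct_of_total_risk, pct_of_client_count):
    """(% of total risk - % of client count) / % of client count."""
    return (pct_of_total_risk - pct_of_client_count) / pct_of_client_count

# Hypothetical clusters: name -> (client count, risk outcomes)
clusters = {"A": (1000, 10), "B": (1000, 30)}
total_clients = sum(clients for clients, _ in clusters.values())
total_risk = sum(risk for _, risk in clusters.values())

index = {
    name: risk_index(risk / total_risk, clients / total_clients)
    for name, (clients, risk) in clusters.items()
}
print(index)
```

In this made-up case cluster A holds half the clients but a quarter of the risk outcomes, so its index is negative (less risky than the base), while cluster B holds half the clients and three quarters of the outcomes, so its index is positive.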

Maybe not your typical financial risk model….

TimManns October 29, 2010