Thursday, October 22, 2009

Investing from Data Mining

Business intelligence, data mining and similar techniques combine the computational power of computers with algorithms from research to find meaning in data. From a statistician's point of view, the goal of investing is simple: find the stock that will give you the best capital return, since the best dividend return cannot exceed the best capital return.


Given that goal, with Reuters data you can zoom in on a single label: price per share. Ignoring everything else, I did a quick run to find which fields correlate most closely with price per share, without even cleaning the data. The best were Gross Dividend, Total Revenue and Cash from Operating Activities, each with a correlation coefficient of around 0.5 to 0.6. A coefficient of +0.6 indicates a moderate positive relationship: companies with a higher Gross Dividend tend to have a higher price. It does not mean that price rises by 0.6 when Gross Dividend rises by 1; the coefficient measures how consistently the two move together, not by how much. From a statistician's point of view, and without any knowledge of the data, it means that companies with more dividend, total revenue and cash from operating activities tend to have higher share prices too.
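To make the "quick run" concrete, here is a minimal sketch of ranking fields by their Pearson correlation with price per share. It is not the tool or data I actually used: the function names (`pearson`, `rank_by_correlation`), the column names and all the figures are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(table, target):
    """Return (column, r) pairs sorted by |r| against the target column."""
    scores = [
        (name, pearson(values, table[target]))
        for name, values in table.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Made-up figures for five hypothetical companies, NOT the Reuters data.
table = {
    "price_per_share": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gross_dividend": [0.02, 0.04, 0.06, 0.08, 0.10],
    "total_revenue": [100.0, 300.0, 200.0, 500.0, 400.0],
    "total_debt": [50.0, 40.0, 30.0, 20.0, 10.0],
}

ranking = rank_by_correlation(table, "price_per_share")
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

In this toy table, gross_dividend has r close to 1 even though price moves by 1.0 for every 0.02 of dividend. That is a reminder that r describes how tightly two columns move together, not how much one changes per unit of the other.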

Since dividends seem to be closely tied to price, I took an additional step to find which companies had not paid any dividend and are most likely to pay one based on their financial statements. Again using data mining, I found that Swiber is the most likely to pay a dividend. However, the company currently seems to be taking on more leverage rather than paying out to its shareholders.

Limitations of Approach:
Past Performance Is Not An Indicator Of Future Results. As many unit trusts state in their small print, "Past Performance Is Not An Indicator Of Future Results". This approach assumes that you can use past data to predict the future.

Data cleaning. Typically, in data mining, we clean the data to remove outliers, which are extraordinary data points that skew the results. However, I didn't do that, since I don't have much prior knowledge of the data, and statistics can easily be manipulated to improve the apparent accuracy of a result. Also, the result could probably be improved by normalizing some of the data, for example converting revenue, operating cash flow, etc. to per-share units. Nevertheless, I am assuming that I have no knowledge of the data and am only interested in finding the stock that will maximize capital gain. Besides, I only did a simple analysis out of interest.
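For anyone who wants to try the cleaning I skipped, the two steps mentioned above could be sketched roughly as follows. This is only an illustration under my own assumptions: the helper names (`add_per_share`, `drop_outliers`), the field names, the 1.5 x IQR rule and all the figures are hypothetical, and other outlier rules would be just as defensible.

```python
from statistics import quantiles


def add_per_share(rows, field):
    """Return copies of the rows with a `<field>_per_share` value added."""
    return [
        dict(row, **{field + "_per_share": row[field] / row["shares"]})
        for row in rows
    ]


def drop_outliers(rows, field, k=1.5):
    """Keep rows whose `field` lies within k * IQR of the quartiles."""
    values = [row[field] for row in rows]
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [row for row in rows if low <= row[field] <= high]


# Made-up companies and figures, purely for illustration.
rows = [
    {"name": "A", "revenue": 80.0, "shares": 100.0},
    {"name": "B", "revenue": 180.0, "shares": 200.0},
    {"name": "C", "revenue": 50.0, "shares": 50.0},
    {"name": "D", "revenue": 400.0, "shares": 400.0},
    {"name": "E", "revenue": 110.0, "shares": 100.0},
    {"name": "F", "revenue": 240.0, "shares": 200.0},
    {"name": "G", "revenue": 130.0, "shares": 100.0},
    {"name": "H", "revenue": 2500.0, "shares": 100.0},  # extreme value
]

normalized = add_per_share(rows, "revenue")
cleaned = drop_outliers(normalized, "revenue_per_share")
kept = [row["name"] for row in cleaned]
print(kept)
```

Normalizing first matters here: on raw revenue, company D looks large simply because it has more shares, while on a per-share basis only H stands out. Whether H should be dropped is exactly the kind of judgment call that needs knowledge of the data.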

I have also been looking at stock prices only once a week, on Mondays, mostly to make more efficient use of my time and to avoid being myopic. Sometimes, when you look too closely, you miss the big changes. I will slowly try to switch to looking only monthly, and hopefully only yearly. Warren Buffett once mentioned having one good idea each year. With fewer choices, you will hopefully filter out the weaker ones and buy only the best. I still keep updated on certain keywords on SGX and on news about the holdings I am buying. But these are mostly "push" based, so I only act if there is news, rather than watching the news for something interesting.
