Text Mining on Financial News

ThemosKalafatis
Last updated: 2008/11/16 at 12:51 PM

As discussed previously, an analyst should pay particular attention to problem representation, especially when dealing with text data. This post discusses one way to do that. Something has to give, however: there is no perfect solution for this task.

First of all, we have to find the source of the news. It could be financial news sites such as Bloomberg or the Financial Times, or RSS feed URLs such as the ones provided by MarketWatch. RSS feeds might be the better choice because the news is already categorized by feed type, which can be a great help to some analysts.
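As a rough illustration of why RSS feeds are convenient, here is a minimal Python sketch that pulls the headlines out of an RSS 2.0 document and keeps the feed title as a ready-made category. The feed text, the headlines and the `parse_feed` helper are all hypothetical; a real feed would be downloaded from the provider's URL, and its layout may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 snippet standing in for a downloaded feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Commodities</title>
    <item>
      <title>Oil falls below 60 dollars a barrel</title>
      <pubDate>Fri, 14 Nov 2008 09:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Gold steady as dollar weakens</title>
      <pubDate>Fri, 14 Nov 2008 11:05:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Return (feed_title, [(pub_date, headline), ...]) from RSS 2.0 text."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    feed_title = channel.findtext("title", default="")
    items = [
        (item.findtext("pubDate", default=""), item.findtext("title", default=""))
        for item in channel.findall("item")
    ]
    return feed_title, items


# The feed title acts as the predetermined category for every headline in it.
feed_title, items = parse_feed(SAMPLE_FEED)
for pub_date, headline in items:
    print(feed_title, "|", pub_date, "|", headline)
```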

After finding the news sources and writing the necessary code to retrieve the information, we could end up with a text file of news records.

I use a ‘^’ separator to differentiate between:

1) A date stamp
2) A date string
3) The news string
4) A characterization of the news (important or unimportant)
5) A categorization of the financial news
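A file in this layout can be read with a few lines of Python. The sketch below is only an illustration: the two sample records, the field names and the `read_news` helper are made up, and the exact formats of the date stamp and date string are assumptions.

```python
import csv
import io

# Hypothetical records in the five-field, '^'-separated layout described above.
SAMPLE = (
    "20081114093000^Fri 14 Nov 2008^Oil falls below 60 dollars a barrel^important^commodities\n"
    "20081114110500^Fri 14 Nov 2008^Analyst comments on weekly trading^unimportant^markets\n"
)

# Assumed field names, in the order of the list above.
FIELDS = ["date_stamp", "date_string", "news", "importance", "category"]


def read_news(handle):
    """Parse '^'-separated lines into one dictionary per news record."""
    reader = csv.DictReader(
        handle, fieldnames=FIELDS, delimiter="^", quoting=csv.QUOTE_NONE
    )
    return list(reader)


rows = read_news(io.StringIO(SAMPLE))

# Keep only the headlines that were labelled as important.
important = [row["news"] for row in rows if row["importance"] == "important"]
print(important)
```

With a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle.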

This simple file could provide the basis for a training file for text categorization. Assuming that we have trained algorithms to classify news automatically, we could use one classifier to first label news as important or unimportant and pass only the important news to a second classifier, which does the detailed categorization.
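The two-stage idea can be sketched as follows. This is not the setup used for the project; it is a toy multinomial Naive Bayes written in plain Python, trained on a handful of invented headlines, and every name in it (`NaiveBayes`, `TRAIN`, `classify`, the category labels) is hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(doc_count / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training records: (headline, importance, category).
TRAIN = [
    ("oil price drops on recession fear", "important", "commodities"),
    ("ecb cuts euro interest rate", "important", "central_banks"),
    ("oil barrel price rises", "important", "commodities"),
    ("ecb signals euro rate decision", "important", "central_banks"),
    ("analyst attends charity dinner", "unimportant", None),
    ("exchange cafeteria gets new menu", "unimportant", None),
]

# Stage 1 sees every record; stage 2 is trained on the important ones only.
stage1 = NaiveBayes().fit([t for t, _, _ in TRAIN], [imp for _, imp, _ in TRAIN])
important_rows = [(t, cat) for t, imp, cat in TRAIN if imp == "important"]
stage2 = NaiveBayes().fit(
    [t for t, _ in important_rows], [cat for _, cat in important_rows]
)


def classify(headline):
    """Return (importance, category); category is None for unimportant news."""
    if stage1.predict(headline) != "important":
        return ("unimportant", None)
    return ("important", stage2.predict(headline))


print(classify("oil price drops"))
print(classify("analyst attends charity dinner"))
```

The point of the cascade is that the second classifier never has to learn what unimportant news looks like, so its training set stays smaller and cleaner.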

Another option is to use clustering. The classification approach detailed above requires a tremendous amount of labeling work, depending on how much data you plan to collect: more data means more work, while less data usually, but not always, means less accuracy.

But how could clustering be performed on such data? Simply put, we use field number (4) of our training text file to train a clustering algorithm and then see what ‘classes’ the algorithm comes up with.

So let’s look at a small clustering example. This is a capture from WEKA just before the clustering process:


As you can see, I have produced a training file that essentially contains the ‘buzzwords’ of financial news: barrel, recession, Yen, Euro, ECB, price, consumer, etc. The file is then analyzed by the K-means algorithm to extract clusters of headlines that share similar ‘buzzwords’. Each cluster is assigned a number, so each news header ultimately falls into one cluster.
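To make the mechanics concrete, here is a small K-means sketch in plain Python. It is not WEKA and not the actual training file: the buzzword list is a short assumed subset, the headlines are reduced to hand-picked buzzword sets, and the centroids are naively initialised from the first k points, which is an assumption made only to keep the example deterministic.

```python
# Assumed, shortened buzzword vocabulary; the real training file is larger.
BUZZWORDS = ["fear", "decrease", "us", "economy", "futures", "price",
             "oil", "banking", "recession", "euro", "ecb"]


def vectorize(words):
    """Turn a list of buzzwords into a 0/1 presence vector."""
    present = {w.lower() for w in words}
    return [1.0 if b in present else 0.0 for b in BUZZWORDS]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=20):
    """Plain K-means; centroids start at the first k points (naive choice)."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical headlines, already reduced to their buzzwords.
headlines = [
    ["Fear", "Decrease", "US", "Economy", "Futures"],
    ["Euro", "ECB", "Price"],
    ["Fear", "Decrease", "US", "Price", "Oil", "Banking", "Recession"],
    ["Euro", "ECB"],
]
points = [vectorize(h) for h in headlines]
assignment, centroids = kmeans(points, k=2)
print(assignment)
```

Each entry of `assignment` is the cluster number of the corresponding headline, which mirrors how every news header ends up with one cluster number.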

After running the K-means algorithm I ended up with 16 clusters. Let’s look at two instances that K-means placed in cluster ‘6’:

Instance_number : 130.0

Fear
Decrease
US
Economy
Futures

and

Instance_number : 174.0

Fear
Decrease
US
Price
Oil
Banking
Recession

So the first instance is about fear of a drop in the US economy, which results in US futures dropping, and the second instance must be something about a decrease in oil prices and banking stocks because of fear of a US recession. Not bad at all…
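One quick way to see why these two instances ended up together is to look at the buzzwords they share. The short Python sketch below does that with set operations. The Jaccard ratio is only an inspection aid chosen for illustration; K-means itself works on distances between vectors, not on this measure.

```python
# Buzzwords of the two instances listed above.
instance_130 = {"Fear", "Decrease", "US", "Economy", "Futures"}
instance_174 = {"Fear", "Decrease", "US", "Price", "Oil", "Banking", "Recession"}

# Buzzwords present in both instances.
shared = instance_130 & instance_174

# Jaccard similarity: shared buzzwords divided by all distinct buzzwords.
jaccard = len(shared) / len(instance_130 | instance_174)

print(sorted(shared))
print(round(jaccard, 3))
```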

But not so fast: clustering presents a lot of problems later in the process. Remember that what we are after is to combine text mining and data mining to better understand how the markets react. Should one use classification or clustering? There are many more things to take into consideration, and for obvious reasons I cannot disclose all the details of such a project, but I hope to give the interested reader a good enough introduction to the subject.
