FAQ FinSentS Sentiment Analysis by InfoTrie

Frequently Asked Questions

 

A measure of the bullishness or bearishness of the language used in media coverage of a given stock on a given day. Scores range from -5 (extremely negative coverage) to +5 (extremely positive coverage); a score of 0 indicates an absence of articles for that day.

Alternatively, scores may be expressed on a 0 to 10 scale, with 0 still indicating an absence of articles for that day, 1 to 3 a negative score, 4 to 7 a neutral score, and 8 to 10 a positive score.
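As an illustration, the banding on the 0 to 10 scale can be written as a small lookup. This is only a minimal sketch: the function name `label_score` and the label strings are hypothetical and are not part of any FinSentS interface.

```python
def label_score(score: int) -> str:
    """Map a 0-10 sentiment score to the band described in the text."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score == 0:
        return "no articles"   # 0 means no coverage that day
    if score <= 3:
        return "negative"      # 1, 2, 3
    if score <= 7:
        return "neutral"       # 4, 5, 6, 7
    return "positive"          # 8, 9, 10
```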

The overall sentiment score is computed as a weighted average of sentiment scores over news titles, headers and company-specific phrases. A significant correlation between news sentiment and stock prices is observed across most listings.

We can also provide sentiment scores from 1 – 100 on demand.

The scores are weighted averages of the individual scores of news, blogs and Twitter during the day. Upon request, we can provide in-depth details on our computation.

Put simply, we handle textual financial data using data mining and text mining methods. Our engine performs the following steps to generate a sentiment score:

  1. Filtering and selection of relevant information per topic
  2. Computation of a relevance score per article
  3. Computation of a sentiment score per article
  4. Computation of the weighted average score of the individual topic
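The four steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the actual engine: the `Article` class, the keyword filter, the `min_relevance` threshold and the choice of relevance as the averaging weight are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Article:
    text: str
    relevance: float  # assumed per-article relevance score in [0, 1]
    sentiment: float  # assumed per-article sentiment score in [-5, 5]


def topic_score(articles, topic, min_relevance=0.2):
    """Relevance-weighted average sentiment for one topic."""
    # Step 1: keep only articles that mention the topic (naive keyword filter).
    selected = [a for a in articles if topic.lower() in a.text.lower()]
    # Steps 2 and 3: relevance and sentiment are assumed to be precomputed
    # per article; here we only drop articles with low relevance.
    kept = [a for a in selected if a.relevance >= min_relevance]
    total = sum(a.relevance for a in kept)
    if total == 0:
        return 0.0  # no usable articles for the day
    # Step 4: weighted average over the remaining articles.
    return sum(a.relevance * a.sentiment for a in kept) / total
```

For example, two articles on the same company scored 4.0 (relevance 1.0) and -2.0 (relevance 0.5) would give (4.0 - 1.0) / 1.5 = 2.0.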

Sentiment scores are generated in real time. A benchmark clocked the engine at over a million tuples processed per second per node, so latency is now fully real time.

Sentiment scores are computed at various levels (article, aspect and entity). We draw information from the headline, header, body and even footer. Different (customisable) weights are assigned to each section.
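The section weighting described above could look something like the sketch below. The weight values and the names `SECTION_WEIGHTS` and `article_score` are hypothetical placeholders chosen for illustration; the real weights are customisable and are not stated here.

```python
# Hypothetical weights per article section (customisable).
SECTION_WEIGHTS = {"headline": 0.4, "header": 0.3, "body": 0.2, "footer": 0.1}


def article_score(section_scores, weights=SECTION_WEIGHTS):
    """Combine per-section sentiment scores into one article score."""
    # Ignore sections that have no configured weight.
    present = {s: v for s, v in section_scores.items() if s in weights}
    total = sum(weights[s] for s in present)
    if total == 0:
        return 0.0
    # Normalise by the weights of the sections actually present.
    return sum(weights[s] * v for s, v in present.items()) / total
```

Normalising by the weights of the sections that are present means an article with no footer, for instance, is not penalised for the missing section.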

Three main methods can be provided:

A statistical approach (relying mainly on positive/negative word counts).

The statistical approach itself is not binary. It is model-based and can be easily calibrated.

Below is a non-exhaustive list of the outputs:

  • Count of positive words
  • Count of negative words
  • Weight of the sentence positions
  • Relevance score
  • Volume of news available
  • Buzz score

It still allows for word disambiguation. This method is fast and scalable (especially across languages), but it lacks the precision that the grammar-based approach can achieve.
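To make the word-count idea concrete, here is a minimal sketch that produces two of the outputs listed above (counts of positive and negative words) plus a simple polarity ratio. The tiny `POSITIVE` and `NEGATIVE` word sets are invented for the example and stand in for a real dictionary; the polarity formula is an assumption, not the calibrated model.

```python
import re

# Hypothetical miniature lexicon; a real dictionary is far larger.
POSITIVE = {"gain", "growth", "profit", "upgrade"}
NEGATIVE = {"loss", "deficit", "lawsuit", "downgrade"}


def word_count_score(text):
    """Count positive and negative words and derive a polarity in [-1, 1]."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    polarity = 0.0 if total == 0 else (pos - neg) / total
    return {"positive": pos, "negative": neg, "polarity": polarity}
```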

A grammar-based approach.

By far the largest selection of technologies for exploiting grammar in sentiment analysis comes from the use of HMM- or CRF-type sequence modelling, and consequently this is a major component of the grammar-based approach. This type of machine learning uses syntactic and other features as binary-valued functions in learning to label windows of text.

A machine/deep learning based approach.

The deep learning model actually builds up a representation of whole sentences based on the sentence structure. It computes the sentiment based on how words compose the meaning of longer phrases. This way, the model is not as easily fooled as previous models.

For English content, we use a customised version of the Harvard dictionary.

We then identify new sentiment words from a finance-related corpus, combined with manual checking.

AND, OR, BUT, EITHER – OR, NEITHER – NOR

We also handle context-dependent sentiment words as (context_sentiment_word, aspect) pairs. For example, the word “long”:

  • “The deficit is expected to last long” (negative)
  • “A long position is recommended on the stock” (positive)
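One way to picture the (context_sentiment_word, aspect) pairs is as keys in a lookup table, as in the sketch below. The table contents and the names `CONTEXT_LEXICON` and `context_polarity` are hypothetical, and matching the aspect by simple token presence is a simplifying assumption.

```python
import re

# Hypothetical lexicon keyed by (context_sentiment_word, aspect).
CONTEXT_LEXICON = {
    ("long", "deficit"): -1,   # a long-lasting deficit is negative
    ("long", "position"): +1,  # a long position is positive
}


def context_polarity(word, sentence):
    """Return the polarity of `word` given the aspect found in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    if word not in tokens:
        return 0
    for (w, aspect), polarity in CONTEXT_LEXICON.items():
        if w == word and aspect in tokens:
            return polarity
    return 0  # no known aspect: treat the word as neutral
```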

The dictionary and its features are periodically reviewed against the latest corpus.

In order to manage this issue, we modify the engine slightly:

  • Processing financial content only
  • Detecting synonyms, acronyms and other nicknames of entities, topics and people
  • Identifying the corpus for relevant entities, topics and people. This can be calibrated using machine learning (for instance, to recognise documents talking about a given sector or industry)
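The synonym and acronym detection mentioned above can be pictured as mapping each alias to a canonical entity name. The sketch below is a deliberately simple assumption: the `ALIASES` table and the `resolve_entities` function are hypothetical, and matching is done on single tokens only.

```python
import re

# Hypothetical alias table: lowercase alias -> canonical entity name.
ALIASES = {
    "apple": "Apple Inc.",
    "aapl": "Apple Inc.",
    "microsoft": "Microsoft Corp",
    "msft": "Microsoft Corp",
}


def resolve_entities(text):
    """Return the sorted canonical entities whose aliases appear in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sorted({ALIASES[t] for t in tokens if t in ALIASES})
```

A lookup like this cannot tell the company from the fruit, which is where restricting input to financial content and calibrating a classifier on the relevant corpus come in.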

FAQ - Interpreting Stock Sentiment Analysis

News analytics can react more sharply than spot price movements, which can make them an effective way to identify market signals.

Economics research analysts, journalists and bloggers use various forms of reports to follow trending topics. Essentially, these are the topics that will increase their sales or, in the case of websites, their number of visitors.

Many small and medium-sized companies do not have daily news activity, which could explain this sensitivity. It also gives rise to a particular news analytic that helps to detect market patterns, called “Buzz”.

Content of buzz activities can be provided on demand.

IFeed Sentiment Analysis API

Yes, we have more than 15 years of historical data. For personal analysis, the daily sentiment can be extracted in CSV format from the website itself, from our Excel Plugin or from our API.

FAQ - InfoTrie FinSentS Sentiment Analysis

  • Investors/Traders / Brokers / Sales
  • Economics research cells
  • Market and Credit Risk Managers
  • Compliance cells

At the macro level, we aid in real time decision making for the following groups of people:

  • Hedge Fund owners
  • Asset Management companies
  • Financial Institutions (Central Banks, Investment Banks, Private Banks)
  • Market Infrastructure (Exchanges, Depositaries)
  • Market Data providers

FAQ - Stock Forex Commodities Cryptos Sentiment Analysis

FinSentS computes sentiment scores for:

  1. 50,000+ stocks and major global stock indices
  • 15,000+ North American stocks
  • 8,000+ European stocks
  • 4,000+ Japanese stocks
  • 14,000+ Asian stocks excluding Japan
  • Chinese stocks: http://tushare.org/
  • 3,000+ Australia / New Zealand stocks
  • 1,000+ South American stocks
  2. Commodities

More than 181 kinds of commodities including:

  • Base and Precious Metals
  • Soft
  • Energy
  3. FOREX and Currencies

Hundreds of combinations of currencies and currency pairs:

  • AUD
  • EUR
  • GBP
  • HKD
  • INR
  • JPY
  • NOK
  • NZD
  • SEK
  • SGD
  • USD
  4. Macroeconomic topics such as unemployment, interest rates, etc.
  5. Hundreds of political events and topics such as:
  • Syria War
  • Elections
  • Shut Down
  • Management appointment
  • Regulation (Dodd Frank, EMIR, FATCA, Tobin)

FAQ - Multi Lingual Sentiment Analysis

Yes. The InfoTrie FinSentS engine is truly multilingual, although at the entry price FinSentS processes only English. For instance, we could offer analytics in Mandarin, German, French, Spanish, Bahasa, Japanese, Korean, etc. We are also open to suggestions on new languages to explore.

FAQ - Sources for Sentiment Analysis

For our Financial News and Sentiment Screener, FinSentS, we use:

  1. Newspaper websites (10,000+ sources for sentiment processing and millions of sources for news tracking), e.g. 4-Traders, Equities.com, CNBC, Bloomberg, Business Week, Street Insider.
  2. Financial blogs (10,000+ sources), e.g. ZeroHedge, Washington Post, Paul Krugman, Naked Capital.
  3. Twitter company accounts (the 500 companies in the S&P 500 Index), e.g. Apple Inc., Microsoft Corp, Exxon Mobil.
  4. Multi-language sources:
  • Chinese, e.g. Sina Weibo, Shanghai Stock Exchange, ShenZhen Stock Exchange.
  • French, German, Italian, Spanish, Bahasa, …

We already include private sources such as economics research papers and Bloomberg’s news feed. Additional private sources of information can be processed on demand.

Additional data and feeds from:

  1. Premium sources, such as Bloomberg, Reuters or Dow Jones.
  2. Social media, such as Weibo, Twitter, StockTwits…
  3. Chat rooms, company and analyst research, reports, SEC and global filings, broker research, conference calls, investor relations presentations, social media, real-time news and press releases.

We also explore voice and video on a project basis.

Our system can be configured to assign different weights in the sentiment computation for different corporate actions such as earnings announcements, M&A, dividends, etc.

This is generally reviewed during the calibration process, where we can apply various numerical models and machine learning techniques.

The quality control of InfoTrie’s data set (news, blogs, and social media) relies on the following fundamental building blocks:

  1. The sources are primarily handpicked from a selection of:

  • Websites (free press, financial blogs, regulator websites, exchange websites)
  • Dedicated economic research from investment banks
  • Dedicated news feeds (“Bloomberg real time content”, “Dow Jones real time content”)
  • Reliable Twitter accounts or important hashtags
  2. Automatic relevance computation and noise filtering

“Relevance” scores are computed automatically and in real time for any topic or entity on news, tweets or blogs.

Multiple additional “noise filters” are in place to ensure the quality of the data (for instance on Twitter’s hashtags).

  3. Topic classification

News article topics are automatically labelled based on their titles and contents with machine learning algorithms.

Additional datasets (chat, forum, text, analyst and company reports, transcripts, PDF, excel, recorded phone conversation, video, etc) can be considered if required.
