What is text mining?
Text mining is an interdisciplinary field that draws on methods and theory from information retrieval, machine learning, statistics, computational linguistics and data mining. Industry sectors that use text mining include publishing and media, telecommunications technology, insurance and financial markets. Its value is also being recognized in areas such as market analysis: more businesses are using text mining to analyze competitors, monitor customer opinions and better understand the market and their position within it.
Businesses are increasingly using social media sites such as Facebook, Twitter, Google+, LinkedIn and Instagram for market analysis. Traditional methods of collecting data for market analysis, such as focus groups or face-to-face interviews, are becoming more costly and more time consuming, and they suffer from rising non-response. Meanwhile, more people are expressing their views and opinions on social media platforms, which makes this information accessible and free to businesses that take the time to collect it.
Text mining is similar to data mining, except that the data source is a collection of text documents, which may be structured or unstructured. The idea is to extract useful information from that collection.
Before text can be mined, it first has to be gathered. On the web, this is where scraping comes in. Once we have scraped the web, or more usually a particular site, for the text we want, we need to clean it up and prepare it for mining and analysis. One common form of text analysis is sentiment analysis, which is used in a wide range of industries, including insurance and food services, and by large consumer brands such as IBM and DHL. Usually, social media sites such as Twitter are scraped for text, which is then analyzed as needed.
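The clean-up step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `clean_text` and the three regular expressions are hypothetical choices, not a standard recipe, and the right cleaning rules depend on the site being scraped and the analysis planned.

```python
import re

# Assumed cleaning rules for short social media posts (illustrative only).
URL_PATTERN = re.compile(r"https?://\S+")           # web links
MENTION_PATTERN = re.compile(r"@\w+")               # @username mentions
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")   # words, allowing one apostrophe


def clean_text(raw):
    """Lowercase the text, drop links and mentions, and return a list of word tokens."""
    text = raw.lower()
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_PATTERN.sub(" ", text)
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    post = "Loving the new pizza from @PizzaCo! http://example.com #yum"
    print(clean_text(post))
```

The resulting list of lowercase tokens is a convenient input for the analysis steps discussed next, since each token can be looked up directly in a word list.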
What is sentiment analysis?
Sentiment analysis, or opinion mining, is the extraction of subjective information from text documents. Usually the goal is to determine the attitude of the author toward a particular topic, or the overall polarity or subjectivity of the document or text exchange. What exactly is polarity or subjectivity? It may be thought of as classifying the overall feel of the text as positive, negative or neutral. So we now have a classification problem: assigning the text to one of two or more classes. Emotion recognition is an extension of sentiment analysis in which the goal is to gain a better understanding of the opinion expressed in the text. The classes are refined to emotions such as anger, disgust, fear, happiness, sadness and surprise. A dictionary of positive and negative words is used as a reference for sentiment analysis. What is a dictionary in sentiment analysis? It is a listing of words together with the polarity and/or emotion each word expresses. Sometimes it is necessary to make your own dictionary or to add words to an existing one.
Term | Emotion |
Wretch | Sadness |
Wretched | Sadness |
Wroth | Anger |
Wrothful | Anger |
Yucki | Disgust |
Yucky | Disgust |
Zeal | Joy |
Zealous | Joy |
Term | Polarity |
Worsening | Negative |
Worship | Positive |
Worst | Negative |
Worth | Positive |
Worthwhile | Positive |
Worthiness | Positive |
Worthless | Negative |
Worthlessly | Negative |
Worthlessness | Negative |
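A dictionary lookup of this kind can be sketched in Python as follows. This is a minimal sketch, not a production method: the two dictionaries hold only a few of the entries listed above, the function names `score_polarity` and `count_emotions` are hypothetical, and the rule of counting positive words against negative words is one simple assumption among many possible scoring schemes.

```python
from collections import Counter

# A few entries taken from the word lists above; a real dictionary is far larger.
POLARITY = {
    "worsening": "negative", "worship": "positive", "worst": "negative",
    "worth": "positive", "worthwhile": "positive", "worthless": "negative",
}
EMOTION = {
    "wretched": "sadness", "wroth": "anger", "yucky": "disgust",
    "zeal": "joy", "zealous": "joy",
}


def score_polarity(tokens):
    """Classify a token list as positive, negative or neutral by counting dictionary hits."""
    counts = Counter(POLARITY[t] for t in tokens if t in POLARITY)
    score = counts["positive"] - counts["negative"]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no dictionary words found, or an equal number of each


def count_emotions(tokens):
    """Count how often each emotion in the dictionary is expressed by the tokens."""
    return Counter(EMOTION[t] for t in tokens if t in EMOTION)


if __name__ == "__main__":
    review = "the worst and most worthless purchase but the staff were zealous".split()
    print(score_polarity(review))
    print(count_emotions(review))
```

Words that are missing from the dictionary are simply ignored here, which is one reason why extending an existing dictionary with domain-specific terms can matter in practice.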
Sentiment analysis can be used for marketing purposes: to get a sense of how a company is performing, to redesign marketing and advertising campaigns, or to analyze the competition. Social media platforms such as Facebook, Twitter and Instagram give consumers a place to express their views on events, products and more. Even the ratings on Amazon and other shopping platforms can provide information on consumer opinions, giving businesses an idea of which products are sought after and which are not.
For more details on text mining and sentiment analysis please have a look at our text mining project, Tweet Mining – Addressing the Elephant in the Room.
References:
- Social media competitive analysis and text mining: A case study in the pizza industry, Wu He, Shenghua Zha, Ling Li, Volume 33, p. 464-473, 2013.
- Practical text mining and statistical analysis for non-structured text data applications, Gary Miner, John Elder IV, Thomas Hill, Academic Press, 2012.
- Social Media Analytics: Data Mining Applied to Insurance Twitter Posts, Roosevelt C. Mosely Jr., Casualty Actuarial Society E-Forum, Winter 2012, Volume 2, p. 1-36.
- More than words: Social networks’ text mining for consumer brand sentiments, Mohamed M. Mostafa, Expert Systems with Applications, Volume 40, p. 4241-4251, 2013.
- SentiWordNet, URL: http://sentiwordnet.isti.cnr.it/