## 16.1 Introduction to Text Mining

Text mining, also referred to as text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text. This information is typically obtained by identifying patterns and trends in the text, using techniques such as statistical pattern learning.

• Welbers et al. (2017) provide a gentle introduction to text analytics using R.

• Text mining has gained momentum and is used in analytics worldwide. Common application areas include:

• Sentiment analysis

• Stock market prediction and other financial applications

• Customer influence

• News analytics

• Social network analysis

• Customer service and help desk

### 16.1.1 Text Data

• Text data is ubiquitous in social media analytics.

• Sources include traditional media, social media, survey data, and numerous others.

• The modern information age produces a massive quantity of text.

• The growing availability of, and interest in, text data has driven the development of a variety of statistical approaches for analysing it.

### 16.1.2 Generic Text Mining System

• The following figure shows a general text mining system (Source: )
knitr::include_graphics("fig-2.png")

### 16.1.3 Data Pre-processing in Text Mining

• The following figure summarises the main steps in a typical data pre-processing stage of text mining.
knitr::include_graphics("fig-3.png")
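
To make the pre-processing stage more concrete, the sketch below runs a few commonly used steps in base R. The exact steps shown in the figure are not reproduced here, so the choice of steps (lowercasing, stripping punctuation, tokenising, removing stop words, building a document-term matrix) is an assumption for illustration. The toy documents, the short stop-word list, and the helper name `preprocess` are all hypothetical. In practice, a dedicated text mining package would normally be used instead.

```r
# Toy corpus (hypothetical example documents)
docs <- c(
  "Text mining derives high-quality information from text!",
  "Social media posts, surveys and news are common sources of TEXT data."
)

# Tiny illustrative stop-word list (an assumption; real lists are much longer)
stop_words <- c("a", "and", "are", "from", "of", "the")

# Hypothetical helper: turn one raw document into a vector of cleaned tokens
preprocess <- function(doc, stop_words) {
  doc <- tolower(doc)                     # 1. normalise case
  doc <- gsub("[^a-z ]+", " ", doc)       # 2. replace punctuation and digits with spaces
  tokens <- strsplit(doc, " +")[[1]]      # 3. tokenise on runs of spaces
  tokens <- tokens[nzchar(tokens)]        #    drop empty strings
  tokens[!tokens %in% stop_words]         # 4. remove stop words
}

tokens <- lapply(docs, preprocess, stop_words = stop_words)

# 5. Build a document-term matrix: one row per document, one column per term
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(tok) tabulate(match(tok, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dtm) <- vocab

print(tokens[[1]])
print(dtm[, c("text", "data")])
```

The resulting `dtm` holds raw term counts per document, which is the kind of structured input that statistical approaches to text typically start from. Steps such as stemming or term weighting are deliberately left out to keep the sketch short.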

### References

Feldman, Ronen, and James Sanger. 2007. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data. Cambridge University Press.