
Text Analytics in R
1 August, 2022
What is text analytics?
Why is it important?
Getting Started
Mark Twain Books

Snapshot of the mark_twain data frame
Identifying Stop Words

View of the first few rows of the stop_words data set
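For a quick look at a stop-word list in R, here is a minimal sketch. It assumes that the stop_words object mentioned above is the data set shipped with the tidytext package, which this post does not state explicitly.

```r
library(tidytext)

# Assumption: stop_words is the data set bundled with tidytext.
# It has one row per stop word and a column naming the source lexicon.
data(stop_words)

# Show the first few rows.
head(stop_words)

# Count how many entries each lexicon contributes.
table(stop_words$lexicon)
```

Each row pairs a word with the lexicon it came from, so the same word can appear more than once.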
Tokenizing and Removing Stop Words

Source: Stanford NLP — Tokenizing example

We can see the words are now individually separated, and each word is tagged with its book ID
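A result like this could be produced with something along these lines. This is only a sketch: it assumes the dplyr and tidytext packages, and the small books data frame and its gutenberg_id values are made-up stand-ins for the real downloaded text.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the book data: one row per line of text,
# with gutenberg_id identifying which book the line came from.
books <- data.frame(
  gutenberg_id = c(1L, 1L, 2L),
  text = c("The time has come to talk of many things",
           "It was the best time of the year",
           "A river runs past the old town"),
  stringsAsFactors = FALSE
)

tidy_books <- books %>%
  unnest_tokens(word, text) %>%       # split text into one lowercase word per row
  anti_join(stop_words, by = "word")  # drop rows whose word is in the stop-word list

tidy_books
```

unnest_tokens() keeps the other columns, which is why every word still carries its book ID after the split.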
Frequency Distribution of Words

The most frequent word is "time".
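One way to get a frequency table like this is dplyr's count(). The sketch below uses a tiny invented tidy_books data frame (one word per row, stop words already removed), so the counts are illustrative and not the real results.

```r
library(dplyr)

# Hypothetical tokenized data: one word per row, stop words already removed.
tidy_books <- data.frame(
  gutenberg_id = c(1L, 1L, 1L, 2L, 2L, 2L),
  word = c("time", "river", "time", "boat", "time", "river"),
  stringsAsFactors = FALSE
)

# Count occurrences of each word; sort = TRUE puts the most frequent first.
word_counts <- tidy_books %>%
  count(word, sort = TRUE)

word_counts
```

The result has one row per distinct word and a column n holding its count, which is a convenient shape for plotting later.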
Visualizing the Data

Overview
Stay tuned — I will be sharing more tutorials about using Twitter’s API to extract and scrape tweets.