textclean

Text Cleaning Tools

v0.9.7 · Mar 4, 2026 · GPL-2

Description

Tools to clean and process text. The tools are geared toward finding substrings that are poorly suited to analysis and either normalizing them (replacing or removing them in favor of more analysis-friendly substrings; see Sproat, Black, Chen, Kumar, Ostendorf, & Richards (2001) <doi:10.1006/csla.2001.0169>) or extracting them into new variables. For example, emoticons are common in text but are not always handled well by analysis algorithms. The replace_emoticon() function replaces emoticons with word equivalents.
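
As a minimal sketch of the emoticon example above, the snippet below calls replace_emoticon() on a small character vector. It assumes textclean and its dependencies are installed; the sample strings are made up for illustration, and the exact replacement words depend on the package's default emoticon lookup table, so no output is shown.

```r
# Assumes textclean is installed, e.g. via install.packages("textclean")
library(textclean)

# Hypothetical sample input; any character vector works
x <- c("Great job :-)", "Not happy about this :(", "No emoticon here")

# Replace emoticons with word equivalents using the default lookup table
cleaned <- replace_emoticon(x)

print(cleaned)
```

The result is a character vector of the same length as the input, which can then be passed on to tokenizers or other analysis steps that would otherwise stumble on the raw emoticons.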

Downloads

7.4K

Last 30 days

1444th

21.5K

Last 90 days

121.9K

Last year

Trend: +21.2% (30d vs prior 30d)

CRAN Check Status

14 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-macos-arm64 OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

OK 14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 11, 2026
NOTE 13 OK · 1 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026
NOTE r-patched-linux-x86_64

Rd files

checkRd: (-1) check_text.Rd:31: Lost braces in \itemize; \value handles \item{}{} directly
checkRd: (-1) check_text.Rd:32: Lost braces in \itemize; \value handles \item{}{} directly
checkRd: (-1) check_text.Rd:33: Lost braces in \itemize; \value hand
...[truncated]...
ck_text.Rd:53: Lost braces in \itemize; \value handles \item{}{} directly
checkRd: (-1) replace_html.Rd:12: Lost braces
    12 | \item{symbol}{logical.  If code{TRUE} the symbols are retained with appropriate
       |                                ^

Reverse Dependencies (9)

suggests

Dependency Network

Dependencies: data.table, english, glue, lexicon (>= 1.0.0), mgsub, qdapRegex, stringi, textshape (>= 1.0.1)
Reverse dependencies: NUSS, SemanticDistance, sentimentr, spell.replacer, text2emotion, textstem, upstartr, LilRhino, misc

Version History

new 0.9.7 Mar 10, 2026
updated 0.9.7 ← 0.9.3 diff Mar 4, 2026
updated 0.9.3 ← 0.9.2 diff Jul 22, 2018
updated 0.9.2 ← 0.7.3 diff Jun 8, 2018
updated 0.7.3 ← 0.7.2 diff Apr 23, 2018
updated 0.7.2 ← 0.6.3 diff Apr 18, 2018
updated 0.6.3 ← 0.5.1 diff Jan 13, 2018
updated 0.5.1 ← 0.3.1 diff Dec 11, 2017
updated 0.3.1 ← 0.3.0 diff Feb 21, 2017
updated 0.3.0 ← 0.2.0 diff Jan 21, 2017
new 0.2.0 Jan 9, 2017