udpipe

Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

v0.8.16 · Jan 30, 2026 · MPL-2.0

Description

This natural language processing toolkit provides language-agnostic 'tokenization', 'parts of speech tagging', 'lemmatization' and 'dependency parsing' of raw text. In addition to text parsing, the package lets you train annotation models on 'treebank' data in 'CoNLL-U' format, as described at <https://universaldependencies.org/format.html>. The techniques are explained in detail in the paper 'Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe', available at <doi:10.18653/v1/K17-3009>. The toolkit also contains functions for commonly used manipulations of texts that have been enriched with the parser output: collocations, token co-occurrence, document-term matrix handling, term frequency-inverse document frequency (TF-IDF) calculations, information retrieval metrics (Okapi BM25), handling of multi-word expressions, keyword detection (Rapid Automatic Keyword Extraction, noun phrase extraction, syntactical patterns), sentiment scoring and semantic similarity analysis.
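
As a rough illustration of the post-annotation helpers listed above, the following R sketch computes term frequencies, a document-term matrix, TF-IDF weights, co-occurrences and RAKE keywords. It assumes the pre-annotated example dataset `brussels_reviews_anno` that ships with the package, so no model download is needed. The choice of Dutch reviews and of the `NN`/`JJ` tags is purely illustrative, and argument defaults may differ between package versions.

```r
library(udpipe)

# Pre-annotated example data bundled with the package (assumption: the
# dataset has doc_id, language, lemma and xpos columns, as in the docs).
data(brussels_reviews_anno)
anno <- subset(brussels_reviews_anno, language == "nl" & !is.na(lemma))

# Term frequencies per document, restricted to noun lemmas.
nouns <- subset(anno, xpos %in% c("NN"))
tf <- document_term_frequencies(nouns[, c("doc_id", "lemma")])

# Sparse document-term matrix and TF-IDF weights per term.
dtm <- document_term_matrix(tf)
tfidf <- dtm_tfidf(dtm)

# How often two noun lemmas occur in the same document.
cooc <- cooccurrence(nouns, group = "doc_id", term = "lemma")

# Rapid Automatic Keyword Extraction over nouns and adjectives.
rake <- keywords_rake(anno, term = "lemma", group = "doc_id",
                      relevant = anno$xpos %in% c("NN", "JJ"))

head(cooc)
head(rake)
```

To run the same steps on your own raw text, you would first annotate it with a downloaded model (for example via `udpipe_download_model()`, `udpipe_load_model()` and `udpipe_annotate()`, followed by `as.data.frame()`), which requires network access for the initial download.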

Downloads

CRAN
Last 30 days: 5.4K (1615th)
Last 90 days: 19K
Last year: 105.5K
Trend: -24.4% (30d vs prior 30d)

r2u CRAN
Last 30 days: 63
Last 90 days: 344
Last year: 1.3K
Trend: -57.1% (30d vs prior 30d)

CRAN Check Status

13 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

OK 13 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Jun 8, 2026
ERROR 12 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Jun 7, 2026
ERROR r-devel-linux-x86_64-debian-gcc

PDF version of manual

Rd conversion errors:
Converting parsed Rd's to LaTeX ......Warning in file(out, "wt") :
  cannot open file '/tmp/RtmpAvN8LI/ltx24990631382743/udpipe_accuracy.tex': No space left on device
Error in file(out, "wt") : cannot open the connection
OK 12 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 25, 2026
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 22, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is 25.5Mb
  sub-directories of 1Mb or more:
    dummydata   1.4Mb
    libs       21.5Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is 26.8Mb
  sub-directories of 1Mb or more:
    dummydata   1.4Mb
    libs       22.9Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.5Mb
  sub-directories of 1Mb or more:
    dummydata   1.4Mb
    libs        2.5Mb
ERROR 10 OK · 3 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 18, 2026
ERROR r-devel-windows-x86_64

whether package can be installed

Installation failed.
See 'd:/Rcompile/CRANpkg/local/4.6/udpipe.Rcheck/00install.out' for details.
NOTE r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

installed package size

Same installed-size details as reported for these flavors on Apr 22, 2026.
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026
NOTE r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64

installed package size

Same installed-size details as reported for these flavors on Apr 22, 2026.

Reverse Dependencies (18)

Dependency Network

Dependencies: Rcpp, data.table, Matrix
Reverse dependencies: cleanNLP, corpustools, finnsurveytext, sentixr, sumup, tall, ACEP, BTM, birddog, doc2vec, nametagger, pseudobibeR, text2vec, textplot, textrank, +3 more

Version History

29 tracked
new 0.8.16 Mar 10, 2026
updated 0.8.16 ← 0.8.15 Jan 29, 2026
updated 0.8.15 ← 0.8.14 Nov 27, 2025
updated 0.8.14 ← 0.8.13 Nov 25, 2025
updated 0.8.13 ← 0.8.12 Nov 25, 2025
updated 0.8.12 ← 0.8.11 Sep 3, 2025
updated 0.8.11 ← 0.8.10 Jan 5, 2023
updated 0.8.10 ← 0.8.9 Nov 9, 2022
updated 0.8.9 ← 0.8.8 Mar 23, 2022
updated 0.8.8 ← 0.8.6 Dec 1, 2021
updated 0.8.6 ← 0.8.5 May 31, 2021
updated 0.8.5 ← 0.8.4-1 Dec 9, 2020
updated 0.8.4-1 ← 0.8.4 Oct 11, 2020
updated 0.8.4 ← 0.8.3 Oct 9, 2020
updated 0.8.3 ← 0.8.2 Jul 5, 2019