Description
Interfaces with the 'Hugging Face' tokenizers library to provide implementations of today's most widely used tokenizers, such as the 'Byte-Pair Encoding' algorithm <https://huggingface.co/docs/tokenizers/index>. It is fast at both training new vocabularies and tokenizing text.
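For readers unfamiliar with Byte-Pair Encoding, the idea can be sketched in a few lines of plain Python. This is only an illustrative toy, not the code used by this package or by the 'Hugging Face' library: it works on characters rather than bytes, skips pre-tokenization and special tokens, and the function names (`train_bpe`, `merge_pair`, `encode`) are hypothetical.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy, character-level)."""
    # Each distinct word starts as a tuple of single characters, with its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Take the most frequent pair; ties are broken by the pair itself so the
        # result is deterministic (an arbitrary choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize one word by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = train_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)                    # [('o', 'w'), ('l', 'ow')]
print(encode("lowest", merges))  # ['low', 'e', 's', 't']
```

Training repeatedly fuses the most frequent adjacent pair, so frequent fragments such as `low` become single tokens while rare suffixes stay split into characters.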
Downloads
Last 30 days: 20.2K (865th)
Last 90 days: 43.7K
Last year: 128.4K
Trend: +54% (30d vs prior 30d)
CRAN Check Status
8 NOTE · 6 OK across 14 flavors
| Flavor | Status |
|---|---|
| r-devel-linux-x86_64-debian-clang | NOTE |
| r-devel-linux-x86_64-debian-gcc | NOTE |
| r-devel-linux-x86_64-fedora-clang | NOTE |
| r-devel-linux-x86_64-fedora-gcc | NOTE |
| r-devel-macos-arm64 | OK |
| r-devel-windows-x86_64 | NOTE |
| r-oldrel-macos-arm64 | NOTE |
| r-oldrel-macos-x86_64 | NOTE |
| r-oldrel-windows-x86_64 | OK |
| r-patched-linux-x86_64 | NOTE |
| r-release-linux-x86_64 | OK |
| r-release-macos-arm64 | OK |
| r-release-macos-x86_64 | OK |
| r-release-windows-x86_64 | OK |
Check details (8 non-OK)
NOTE
r-devel-linux-x86_64-debian-clang
compiled code
File ‘tok/libs/tok.so’: Found non-API call to R: ‘R_UnboundValue’
Compiled code should not call non-API entry points in R. See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual, and section ‘Moving into C API compliance’ for issues with the use of non-API entry points.
NOTE
r-devel-linux-x86_64-debian-gcc
compiled code
File ‘tok/libs/tok.so’: Found non-API call to R: ‘R_UnboundValue’
Compiled code should not call non-API entry points in R. See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual, and section ‘Moving into C API compliance’ for issues with the use of non-API entry points.
NOTE
r-devel-linux-x86_64-fedora-clang
compiled code
File ‘tok/libs/tok.so’: Found non-API call to R: ‘R_UnboundValue’
Compiled code should not call non-API entry points in R. See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual, and section ‘Moving into C API compliance’ for issues with the use of non-API entry points.
NOTE
r-devel-linux-x86_64-fedora-gcc
compiled code
File ‘tok/libs/tok.so’: Found non-API call to R: ‘R_UnboundValue’
Compiled code should not call non-API entry points in R. See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual, and section ‘Moving into C API compliance’ for issues with the use of non-API entry points.
NOTE
r-devel-windows-x86_64
compiled code
File 'tok/libs/x64/tok.dll': Found non-API call to R: 'R_UnboundValue'
Compiled code should not call non-API entry points in R. See 'Writing portable packages' in the 'Writing R Extensions' manual, and section 'Moving into C API compliance' for issues with the use of non-API entry points.
NOTE
r-oldrel-macos-arm64
installed package size
installed size is 6.5Mb
sub-directories of 1Mb or more:
libs 5.7Mb
NOTE
r-oldrel-macos-x86_64
installed package size
installed size is 6.6Mb
sub-directories of 1Mb or more:
libs 5.9Mb
NOTE
r-patched-linux-x86_64
compiled code
File ‘tok/libs/tok.so’: Found non-API calls to R: ‘R_MissingArg’, ‘R_UnboundValue’
Compiled code should not call non-API entry points in R. See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual, and section ‘Moving into C API compliance’ for issues with the use of non-API entry points.
Check History
Mar 10, 2026: NOTE (12 OK · 2 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE)
NOTE
r-oldrel-macos-arm64
installed package size
installed size is 6.5Mb
sub-directories of 1Mb or more:
libs 5.7Mb
NOTE
r-oldrel-macos-x86_64
installed package size
installed size is 6.6Mb
sub-directories of 1Mb or more:
libs 5.9Mb