contentanalysis

Scientific Content and Citation Analysis from PDF Documents

v1.0.0 · Mar 7, 2026 · GPL (>= 3)

Description

Provides comprehensive tools for extracting and analyzing scientific content from PDF documents, including citation extraction, reference matching, text analysis, and bibliometric indicators. Supports multi-column PDF layouts, 'CrossRef' API <https://www.crossref.org/documentation/retrieve-metadata/rest-api/> integration, and advanced citation parsing.

Downloads

19.3K · Last 30 days · 889th

55.9K · Last 90 days

96.7K · Last year

Trend: -0.6% (30d vs prior 30d)

CRAN Check Status

1 ERROR
3 NOTE
10 OK
Flavor Status
r-devel-linux-x86_64-debian-clang ERROR
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-macos-arm64 OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 NOTE
r-oldrel-macos-x86_64 NOTE
r-oldrel-windows-x86_64 NOTE
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK
Check details (4 non-OK)
ERROR r-devel-linux-x86_64-debian-clang

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' length 543702 bytes (530 KB)
==================================================
downloaded 530 KB


Quitting from introduction.Rmd:216-223 [reference-sources]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<error/vctrs_error_subscript_oob>
Error in `analysis$parsed_references[, c("ref_first_author", "ref_year", "ref_journal",
    "ref_source")]`:
! Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
---
Backtrace:
    ▆
 1. ├─utils::head(...)
 2. ├─...[]
 3. └─tibble:::`[.tbl_df`(...)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Error: processing vignette 'introduction.Rmd' failed with diagnostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
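
The failure above comes from asking for a column (`ref_journal`) that is not in `analysis$parsed_references`. A minimal sketch of a defensive selection follows. It assumes that a preview should simply skip absent columns. The data frame is a made-up stand-in for the real object, and the values in it are illustrative only. It is not the package's own fix.

```r
# Hypothetical stand-in for analysis$parsed_references; the real object is a
# tibble and its actual columns may differ. Values are invented for illustration.
parsed_references <- data.frame(
  ref_first_author = c("Smith", "Lee"),
  ref_year = c("2020", "2021"),
  ref_source = c("crossref", "parsed"),
  stringsAsFactors = FALSE
)

# Columns the vignette chunk requests, as listed in the error message.
wanted <- c("ref_first_author", "ref_year", "ref_journal", "ref_source")

# Keep only the requested columns that are present, preserving their order.
available <- intersect(wanted, names(parsed_references))
missing_cols <- setdiff(wanted, names(parsed_references))

# Report what was skipped instead of stopping with an error.
if (length(missing_cols) > 0) {
  message("Skipping missing columns: ", paste(missing_cols, collapse = ", "))
}

preview <- head(parsed_references[, available, drop = FALSE])
print(preview)
```

The same `intersect()` pattern works on a tibble. With dplyr, `select(any_of(wanted))` is an equivalent way to tolerate missing columns.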
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb

Check History

ERROR 10 OK · 3 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 3, 2026
ERROR r-devel-linux-x86_64-debian-clang

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' leng
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 1, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
ERROR 10 OK · 3 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Mar 31, 2026
ERROR r-devel-linux-x86_64-debian-gcc

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' leng
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  5.3Mb
  sub-directories of 1Mb or more:
    doc   4.7Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb

Reverse Dependencies (1)

imports

Dependency Network

Dependencies: base64enc, dplyr, httr2, igraph, jsonlite, magrittr, openalexR (>= 2.0.2), pdftools, purrr, stringr (>= 1.5.2), tibble, tidyr, tidytext (>= 0.4.3), visNetwork

Reverse dependencies: bibliometrix

Version History

new 1.0.0 Mar 10, 2026
updated 1.0.0 ← 0.2.1 Mar 6, 2026
updated 0.2.1 ← 0.2.0 Dec 11, 2025
new 0.2.0 Oct 29, 2025