contentanalysis

Scientific Content and Citation Analysis from PDF Documents

v1.1.1 · Jun 15, 2026 · GPL (>= 3)

Description

Provides comprehensive tools for extracting and analyzing scientific content from PDF documents, including citation extraction, reference matching, text analysis, and bibliometric indicators. Supports multi-column PDF layouts, 'CrossRef' API <https://www.crossref.org/documentation/retrieve-metadata/rest-api/> integration, and advanced citation parsing.

Downloads

CRAN

17.6K

Last 30 days

882nd

61.3K

Last 90 days

158K

Last year

Trend: -16.8% (30d vs prior 30d)

r2u CRAN

22

Last 30 days

98

Last 90 days

198

Last year

Trend: -29% (30d vs prior 30d)
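The trend figures compare the last 30 days against the 30 days before that. As a rough sketch, assuming the dashboard uses the ordinary percent-change formula (the page does not state its exact method), the prior-window count can be backed out from the values shown. The function names below are illustrative only, and the CRAN figure of 17.6K is rounded, so its implied prior is approximate.

```python
def pct_change(current: float, prior: float) -> float:
    """Percent change of the current window relative to the prior window."""
    return (current - prior) / prior * 100.0


def implied_prior(current: float, trend_pct: float) -> float:
    """Invert pct_change: recover the prior-window count from a trend value."""
    return current / (1.0 + trend_pct / 100.0)


# r2u: 22 downloads in the last 30 days with a reported trend of -29%.
r2u_prior = implied_prior(22, -29.0)

# CRAN: about 17.6K downloads (rounded on the page) with a trend of -16.8%.
cran_prior = implied_prior(17_600, -16.8)

print(f"r2u implied prior 30d:  {r2u_prior:.1f}")
print(f"CRAN implied prior 30d: {cran_prior:.0f}")
```

Under that assumption, the r2u numbers are consistent with about 31 downloads in the prior window, and the CRAN numbers with roughly 21K.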

CRAN Check Status

13 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

OK 13 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Jun 30, 2026
ERROR 12 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Jun 29, 2026
ERROR r-devel-windows-x86_64

re-building of vignette outputs

Error(s) in re-building vignettes:
--- re-building 'introduction.Rmd' using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' length 543
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building 'introduction.Rmd'

SUMMARY: processing the following file failed:
  'introduction.Rmd'

Error: Vignette re-building failed.
Execution halted
OK 13 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Jun 28, 2026
ERROR 12 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Jun 27, 2026
ERROR r-devel-linux-x86_64-debian-gcc

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' leng
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
OK 13 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Jun 9, 2026
ERROR 12 OK · 0 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Jun 8, 2026
ERROR r-devel-linux-x86_64-debian-gcc

R code for possible problems

Fatal error: cannot create 'R_TempDir'
OK 12 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 25, 2026
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 8, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
ERROR 10 OK · 3 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Apr 3, 2026
ERROR r-devel-linux-x86_64-debian-clang

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' leng
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Apr 1, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
ERROR 10 OK · 3 NOTE · 0 WARNING · 1 ERROR · 0 FAILURE Mar 31, 2026
ERROR r-devel-linux-x86_64-debian-gcc

re-building of vignette outputs

Error(s) in re-building vignettes:
  ...
--- re-building ‘introduction.Rmd’ using rmarkdown
trying URL 'https://raw.githubusercontent.com/massimoaria/contentanalysis/master/inst/examples/example_paper.pdf'
Content type 'application/octet-stream' leng
...[truncated]...
nostics:
Can't subset columns that don't exist.
✖ Column `ref_journal` doesn't exist.
--- failed re-building ‘introduction.Rmd’

SUMMARY: processing the following file failed:
  ‘introduction.Rmd’

Error: Vignette re-building failed.
Execution halted
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE 11 OK · 3 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026
NOTE r-oldrel-macos-arm64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb
NOTE r-oldrel-macos-x86_64

installed package size

installed size is  5.3Mb
  sub-directories of 1Mb or more:
    doc   4.7Mb
NOTE r-oldrel-windows-x86_64

installed package size

installed size is  6.4Mb
  sub-directories of 1Mb or more:
    doc    4.7Mb
    help   1.4Mb

Code

Structure

Lines of code

19,724

Files

100

Compiled share

0%

Has compiled src

No

Language breakdown

R 10,136 (51.4%) · Tests 7,116 (36.1%) · Docs 1,703 (8.6%) · Vignettes 769 (3.9%)

API

Exported functions

24

Internal functions

60

Recent export changes

v1.1.0: +1 classify_rhetorical_moves
v1.0.0: +2 describe_citation_clusters, plot_citation_clusters

Testing & CI

Has tests

Yes

Test-to-code ratio

0.70
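The ratio above appears to follow from the language breakdown listed earlier on this page. A minimal sketch, assuming the ratio is test lines divided by R source lines (the page does not define it), reproduces both the percentage shares and the 0.70 figure from the reported line counts.

```python
# Line counts as reported in the language breakdown.
loc = {"R": 10136, "Tests": 7116, "Docs": 1703, "Vignettes": 769}

# The four categories should add up to the reported total of 19,724 lines.
total = sum(loc.values())

# Share of each category, rounded to one decimal place as on the page.
shares = {name: round(100 * n / total, 1) for name, n in loc.items()}

# Assumed definition: test lines per line of R source code.
test_to_code = round(loc["Tests"] / loc["R"], 2)

print(total)
print(shares)
print(test_to_code)
```

The counts sum to the stated 19,724 lines, and the rounded shares match the breakdown, which supports reading the ratio as tests relative to R code rather than to all lines.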

testthat edition

3

CI present

No

CI type

None

PR gated

No

Docs

Return-value doc rate

100%

\dontrun example ratio

100%

Roxygen coverage

95.8%

Has pkgdown

No

NEWS present

Yes

Health & Security signals

Informational signals; not verdicts.

on.exit coverage

0%

Unsafe pattern score

0

Dep constraint coverage

92.9%

Secret pattern count

2

Bundled 3rd-party code

2 items

Portability & License

Min R version

4.1.0

System requirements

C++ standard

License

GPL (>= 3)

License flags

SPDX valid, OSI approved

History

Versions

5

First release

2025-10-30

Latest release

2026-06-16

Avg cadence

58 days

Cold removal rate

Dep drift

0

LOC over versions

v0.2.0: 15,244 LOC · v0.2.1: 15,635 LOC · v1.0.0: 17,203 LOC · v1.1.0: 19,724 LOC · v1.1.1: 19,724 LOC

Per-file churn detail lives in the source pipeline: https://github.com/r-observatory/cran-code-metrics.

Reverse Dependencies (1)

imports

Dependency Network

Dependencies: base64enc, dplyr, httr2, igraph, jsonlite, magrittr, openalexR (>= 2.0.2), pdftools, purrr, stringr (>= 1.5.2), tibble, tidyr, tidytext (>= 0.4.3), visNetwork
Reverse dependencies: bibliometrix

Version History

6 tracked
updated 1.1.1 ← 1.1.0 diff Jun 16, 2026
updated 1.1.0 ← 1.0.0 diff May 19, 2026
new 1.0.0 Mar 10, 2026
updated 1.0.0 ← 0.2.1 diff Mar 6, 2026
updated 0.2.1 ← 0.2.0 diff Dec 11, 2025
new 0.2.0 Oct 29, 2025