archiveRetriever

Retrieve Archived Web Pages from the 'Internet Archive'

v0.4.1 · Oct 16, 2025 · Apache License (>= 2.0)

Description

Scrape content from archived web pages stored in the 'Internet Archive' (<https://archive.org>) using a systematic workflow: get an overview of the mementos available for the respective homepage, retrieve the URLs and links of the page, and finally scrape the content. The output is stored in tibbles, which can then easily be used for further analysis.
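
The workflow described above (overview, URLs, links, scrape) might look roughly like the following sketch. It uses the package's exported functions `archive_overview()`, `retrieve_urls()`, `retrieve_links()` and `scrape_urls()`; the wrapper name `scrape_first_article`, the example homepage and dates, the XPaths in `example_paths`, and the assumption that the links table has a column called `links` are illustrative and should be checked against the package documentation.

```r
# A minimal sketch of the overview -> URLs -> links -> scrape workflow.
# The archiveRetriever:: calls only run when the function is called,
# and each of them queries the Internet Archive over the network.
scrape_first_article <- function(homepage, start_date, end_date, paths) {
  # 1. Overview of the mementos available for the homepage in the date range.
  overview <- archiveRetriever::archive_overview(
    homepage = homepage, startDate = start_date, endDate = end_date
  )

  # 2. Memento URLs of the homepage in the same date range.
  mementos <- archiveRetriever::retrieve_urls(
    homepage = homepage, startDate = start_date, endDate = end_date
  )

  # 3. Links found on the first memento.
  #    Assumption: the result has a column named `links`.
  links <- archiveRetriever::retrieve_links(ArchiveUrls = mementos[1])

  # 4. Scrape the first linked page using a named vector of XPaths.
  content <- archiveRetriever::scrape_urls(
    Urls = links$links[1], Paths = paths
  )

  list(overview = overview, mementos = mementos,
       links = links, content = content)
}

# Hypothetical XPaths; real ones depend on the markup of the target site.
example_paths <- c(title = "//h1", text = "//article//p")

# Not run here (needs network access and the package installed):
# result <- scrape_first_article("www.nytimes.com", "2020-10-01",
#                                "2020-10-07", example_paths)
# result$content
```

Keeping the four steps in one function is only a convenience for illustration; in practice each step is usually inspected on its own, since the links found on a memento typically need filtering before scraping.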

Downloads

CRAN

358

Last 30 days

11095th

900

Last 90 days

5.3K

Last year

Trend: +43.8% (30d vs prior 30d)

r2u CRAN

10

Last 30 days

25

Last 90 days

137

Last year

Trend: -28.6% (30d vs prior 30d)

autoCRAN

9

Last 7 days

57

Last 30 days

0

All-time

autoCRAN-only: this name is served only by autoCRAN, so the count is exact.

CRAN Check Status

13 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE (Mar 10, 2026)

Dependency Network

Dependencies: anytime, dplyr, ggplot2, gridExtra, httr, jsonlite, lubridate, rvest, stringr, tibble, tidyr, xml2

Version History

11 tracked
new 0.4.1, Mar 10, 2026
updated 0.4.1 ← 0.4.0, Oct 15, 2025
updated 0.4.0 ← 0.3.1, Jun 10, 2024
updated 0.3.1 ← 0.3.0, Dec 22, 2022
updated 0.3.0 ← 0.2.0, Dec 19, 2022
updated 0.2.0 ← 0.1.2, Jun 20, 2022
updated 0.1.2 ← 0.1.1, Jun 6, 2022
updated 0.1.1 ← 0.1.0, Mar 2, 2022
updated 0.1.0 ← 0.0.2, May 26, 2021
updated 0.0.2 ← 0.0.1, Mar 18, 2021
new 0.0.1, Mar 9, 2021