pickmax

Split and Coalesce Duplicated Records

v0.1.0 · Jul 15, 2025 · GPL-3

Description

Deduplicates datasets by retaining the most complete and informative records. Identifies duplicated entries based on a specified key column, calculates completeness scores for each row, and compares values within groups. When differences between duplicates exceed a user-defined threshold, records are split into unique IDs; otherwise, they are coalesced into a single, most complete entry. Returns a list containing the original duplicates, the split entries, and the final coalesced dataset. Useful for cleaning survey or administrative data where duplicated IDs may reflect minor data entry inconsistencies.
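The split-or-coalesce logic described above can be illustrated with a minimal Python sketch. This is not the package's R implementation and does not reproduce its interface: the function names (`completeness`, `conflicting_columns`, `split_or_coalesce`), the way differences are measured (the number of columns holding conflicting non-missing values), and the `_1`, `_2` suffix used for new IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def completeness(row, key):
    """Count the non-missing (not None) fields of a row, ignoring the key column."""
    return sum(1 for col, val in row.items() if col != key and val is not None)


def conflicting_columns(group, key):
    """Count columns in which the group holds more than one distinct non-missing value."""
    columns = {col for row in group for col in row if col != key}
    return sum(
        1
        for col in columns
        if len({row.get(col) for row in group if row.get(col) is not None}) > 1
    )


def split_or_coalesce(rows, key, threshold):
    """Hypothetical helper: returns the duplicates, the split rows, and the final dataset."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    duplicates, split, final = [], [], []
    for key_value, group in groups.items():
        if len(group) == 1:
            final.append(dict(group[0]))  # unique key: keep the row unchanged
            continue
        duplicates.extend(group)
        if conflicting_columns(group, key) > threshold:
            # Too different: treat as distinct records and give each a unique ID.
            for i, row in enumerate(group, start=1):
                new_row = {**row, key: f"{key_value}_{i}"}  # suffix format is an assumption
                split.append(new_row)
                final.append(new_row)
        else:
            # Similar enough: fill each column from the most complete row that has a value.
            ranked = sorted(group, key=lambda r: completeness(r, key), reverse=True)
            merged = {key: key_value}
            for row in ranked:
                for col, val in row.items():
                    if col != key and merged.get(col) is None:
                        merged[col] = val
            final.append(merged)
    return {"duplicates": duplicates, "split": split, "final": final}


rows = [
    {"id": "A", "age": 34, "city": "Lima", "job": None},
    {"id": "A", "age": 34, "city": None, "job": "nurse"},
    {"id": "B", "age": 51, "city": "Quito", "job": "teacher"},
    {"id": "B", "age": 19, "city": "Cusco", "job": "student"},
    {"id": "C", "age": 40, "city": "Bogota", "job": None},
]
result = split_or_coalesce(rows, key="id", threshold=1)
```

With a threshold of 1, the two `A` rows have no conflicting values and are merged into one complete record, the two `B` rows conflict in three columns and are split into `B_1` and `B_2`, and the single `C` row passes through unchanged.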

Downloads

CRAN

201

Last 30 days

23113th

506

Last 90 days

1.8K

Last year

Trend: +54.6% (30d vs prior 30d)

r2u CRAN

0

Last 30 days

18

Last 90 days

75

Last year

Trend: -100% (30d vs prior 30d)

autoCRAN

8

Last 7 days

15

Last 30 days

0

All-time

autoCRAN-only: this name is served only by autoCRAN, so the count is exact.

CRAN Check Status

13 OK
Flavor Status
r-devel-linux-x86_64-debian-clang OK
r-devel-linux-x86_64-debian-gcc OK
r-devel-linux-x86_64-fedora-clang OK
r-devel-linux-x86_64-fedora-gcc OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK

Check History

Mar 10, 2026: 14 OK · 0 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE

Code

Structure

Lines of code

191

Files

6

Compiled share

0%

Has compiled src

No

Language breakdown

R 133 (69.6%) · Docs 58 (30.4%)

API

Exported functions

1

Internal functions

0

Recent export changes

v0.1.0: +1 (pickmax)

Testing & CI

Has tests

No

Test-to-code ratio

0.00

testthat edition

CI present

No

CI type

None

PR gated

No

Docs

Return-value doc rate

100%

\dontrun example ratio

0%

Roxygen coverage

100%

Has pkgdown

No

NEWS present

No

Health & Security signals

Informational signals; not verdicts.

on.exit coverage

Unsafe pattern score

0

Dep constraint coverage

0%

Secret pattern count

0

Bundled 3rd-party code

2 items

Portability & License

Min R version

System requirements

C++ standard

License

GPL-3

License flags

SPDX valid, OSI approved

History

Versions

1

First release

2025-07-15

Latest release

2025-07-15

Avg cadence

Cold removal rate

Dep drift

0

Per-file churn detail lives in the source pipeline: https://github.com/r-observatory/cran-code-metrics.

Dependency Network

[Dependency graph. Legend: dependencies, reverse dependencies. Nodes: dplyr, rlang, magrittr, pickmax]

Version History

1 tracked
new 0.1.0 Mar 10, 2026

R Observatory began tracking this package on Mar 10, 2026; it first appeared on CRAN Jul 15, 2025. Releases before tracking aren’t shown.