meerva

Analysis of Data with Measurement Error Using a Validation Subsample

v0.2-2 · Oct 27, 2021 · GPL-3

Description

Data for analysis are sometimes obtained by convenient or inexpensive means, yielding "surrogate" variables for quantities that could be measured more accurately, though less conveniently or at greater expense, as "reference" variables, which are treated as measured without error. Analyzing surrogate variables measured with error generally yields biased estimates when the objective is inference about the reference variables. It is often thought that ignoring measurement error in surrogate variables only biases effects toward the null hypothesis, but this need not be the case: measurement error may bias parameter estimates either toward or away from the null hypothesis.

If a data set has surrogate variable data for the full sample and reference variable data for a randomly selected subsample, one can assess the bias that measurement error introduces in parameter estimation and use this information to derive improved estimates based on all available data. These estimates, which combine the reference variables from the validation subsample with the surrogate variables from the whole sample, can be interpreted as starting with the estimate from the reference variables in the validation subsample and "augmenting" it with additional information from the surrogate variables. This suggests the term "augmented" estimate.

The meerva package calculates these augmented estimates in the regression setting when there is a randomly selected subsample with both surrogate and reference variables. Measurement errors may be differential or non-differential, in any or all predictors (simultaneously) as well as in the outcome. The augmented estimates derive, in part, from the multivariate correlation between the regression model parameter estimates from the reference variables and those from the surrogate variables, both computed on the validation subsample. Because the validation subsample is chosen at random, any bias imposed by measurement error, whether non-differential or differential, is reflected in this correlation, which can be used to derive estimates for the reference variables using data from the whole sample.

The main functions in the package are meerva.fit, which calculates estimates for a dataset, and meerva.sim.block, which simulates multiple datasets as described by the user and analyzes them, storing the regression coefficient estimates for inspection. The augmented estimates, and how measurement error may arise in practice, are described in more detail by Kremers WK (2021) <arXiv:2106.14063> and are an extension of the works by Chen Y-H, Chen H. (2000) <doi:10.1111/1467-9868.00243>, Chen Y-H. (2002) <doi:10.1111/1467-9868.00324>, Wang X, Wang Q (2015) <doi:10.1016/j.jmva.2015.05.017> and Tong J, Huang J, Chubak J, et al. (2020) <doi:10.1093/jamia/ocz180>.
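The idea of starting from the validation-subsample estimate and "augmenting" it with surrogate information can be illustrated in the simplest possible setting: estimating a mean instead of regression coefficients. The sketch below is an illustration only. It is written in Python with the standard library, it is not the meerva implementation, and it does not use the meerva API. The function names `augmented_mean` and `simulate`, the simulated measurement-error model, and the sample sizes are all assumptions made for this example. The adjustment shown is the classical one: the validation estimate, minus cov(reference, surrogate) / var(surrogate) from the validation subsample, times the difference between the surrogate mean in the validation subsample and in the whole sample.

```python
import random


def mean(values):
    return sum(values) / len(values)


def augmented_mean(y_val, ys_val, ys_all):
    """Augmented estimate of the mean of the reference variable.

    y_val  : reference values in the validation subsample
    ys_val : surrogate values for the same validation records
    ys_all : surrogate values for the whole sample (validation included)
    """
    beta_val = mean(y_val)    # estimate from reference data only
    gamma_val = mean(ys_val)  # surrogate estimate, validation subsample
    gamma_all = mean(ys_all)  # surrogate estimate, whole sample

    # Slope linking the two validation-based estimates: cov(y, ys) / var(ys).
    cov = sum((a - beta_val) * (b - gamma_val) for a, b in zip(y_val, ys_val))
    var = sum((b - gamma_val) ** 2 for b in ys_val)
    if var == 0:
        raise ValueError("surrogate is constant in the validation subsample")
    slope = cov / var

    # Start from the validation estimate and augment it with the
    # surrogate information available for the whole sample.
    return beta_val - slope * (gamma_val - gamma_all)


def simulate(n_all=2000, n_val=200, seed=1):
    """Hypothetical data: the surrogate is a biased, noisy copy of the reference."""
    rng = random.Random(seed)
    y = [rng.gauss(10.0, 2.0) for _ in range(n_all)]
    ys = [1.5 + 0.8 * v + rng.gauss(0.0, 1.0) for v in y]
    val_idx = rng.sample(range(n_all), n_val)  # random validation subsample
    y_val = [y[i] for i in val_idx]
    ys_val = [ys[i] for i in val_idx]
    return y_val, ys_val, ys


if __name__ == "__main__":
    y_val, ys_val, ys_all = simulate()
    print("surrogate mean, whole sample:", mean(ys_all))
    print("reference mean, validation only:", mean(y_val))
    print("augmented estimate:", augmented_mean(y_val, ys_val, ys_all))
```

Two limiting cases show what the adjustment does in this sketch. If the surrogate is an exact linear function of the reference variable, the augmented estimate reproduces the mean the reference variable would have over the whole sample. If the surrogate is unrelated to the reference variable, the slope is near zero and the estimate falls back to the validation-only mean.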

Downloads

Last 30 days: 830 (4296th)
Last 90 days: 2.5K
Last year: 9.1K

Trend: -2.6% (30d vs prior 30d)

CRAN Check Status

4 NOTE · 10 OK (14 flavors)

Flavor Status
r-devel-linux-x86_64-debian-clang NOTE
r-devel-linux-x86_64-debian-gcc NOTE
r-devel-linux-x86_64-fedora-clang NOTE
r-devel-linux-x86_64-fedora-gcc NOTE
r-devel-macos-arm64 OK
r-devel-windows-x86_64 OK
r-oldrel-macos-arm64 OK
r-oldrel-macos-x86_64 OK
r-oldrel-windows-x86_64 OK
r-patched-linux-x86_64 OK
r-release-linux-x86_64 OK
r-release-macos-arm64 OK
r-release-macos-x86_64 OK
r-release-windows-x86_64 OK
Check details (4 non-OK)
NOTE r-devel-linux-x86_64-debian-clang, r-devel-linux-x86_64-debian-gcc

CRAN incoming feasibility

Maintainer: ‘Walter K Kremers <kremers.walter@mayo.edu>’

The Description field contains
  <arXiv:2106.14063> and is an extension of the works by Chen Y-H, Chen
Please refer to arXiv e-prints via their arXiv DOI <doi:10.48550/arXiv.YYMM.NNNNN>.

NOTE r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc

dependencies in R code

Namespace in Imports field not imported from: ‘dplyr’
  All declared Imports should be used.

Check History

NOTE 10 OK · 4 NOTE · 0 WARNING · 0 ERROR · 0 FAILURE Mar 10, 2026

Dependency Network

Dependencies: survival, dplyr, tidyr, ggplot2, mvtnorm, matrixcalc
Reverse dependencies: none shown

Version History

new 0.2-2 Mar 10, 2026
updated 0.2-2 ← 0.2-1 Oct 26, 2021
updated 0.2-1 ← 0.1-2 May 12, 2021
updated 0.1-2 ← 0.1-1 Apr 26, 2021
new 0.1-1 Apr 18, 2021