Several R packages to handle missing values in clustering, multilevel data analysis and high-dimensional data analysis.
clusterMI is an R package for clustering with missing values. To this end, it relies on multiple imputation. The package offers various multiple imputation methods dedicated to clustered individuals (Audigier et al. (2021)). In addition, it allows pooling results both in terms of partition and instability (Audigier and Niang (2023)). Among other applications, these functionalities can be used to choose the number of clusters in the presence of missing values.
More details are available in the associated vignette.
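The workflow described above (impute several times, cluster each imputed data set, then pool the partitions and assess instability) could look roughly like the following sketch. It assumes the function names `prodna`, `imputedata` and `clusterMI` and the `wine` data set as recalled from the package documentation; the number of clusters and of imputed data sets are arbitrary illustrative choices, and the argument names should be checked against the installed version.

```r
library(clusterMI)

# Example data shipped with the package (assumption: 'wine' with a 'cult' column)
data(wine)
set.seed(123456)

# Drop the known grouping variable and add missing values artificially
wine.na <- wine
wine.na$cult <- NULL
wine.na <- prodna(wine.na)

# Multiple imputation dedicated to clustered individuals
# (nb.clust = 3 and m = 5 are illustrative; m should be larger in practice)
res.imp <- imputedata(data.na = wine.na, nb.clust = 3, m = 5)

# Cluster each imputed data set and pool the results
res.pool <- clusterMI(res.imp)

# Pooled partition and instability measures
table(res.pool$part)
res.pool$instability
```

Comparing the instability obtained for several candidate numbers of clusters is one way such output might be used to choose that number.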
micemd is an R package dedicated to multiple imputation with two-level data.
Why use micemd?
Statistical analysis often needs to account for a multilevel structure. For example, a two-level structure occurs when individual data from several studies are aggregated, as in individual participant data (IPD) meta-analysis: individuals are at the lower level and studies at the higher level. However, the variables of each study are often incomplete (sporadically missing) and often differ between studies (leading to systematically missing variables), which makes such data challenging to analyse. micemd offers several solutions to overcome these issues.
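To make the two kinds of missingness concrete, here is a minimal base R sketch on a toy data set. The data frame `ipd`, its variable names and its values are hypothetical and serve only as an illustration; no imputation is performed.

```r
# Toy two-level data: 3 studies (higher level), 4 individuals each (lower level)
set.seed(1)
ipd <- data.frame(
  study = rep(1:3, each = 4),
  age   = round(rnorm(12, mean = 50, sd = 10)),
  bmi   = round(rnorm(12, mean = 25, sd = 3), 1)
)

# Sporadically missing: age is absent for a few individuals within studies
ipd$age[c(2, 7)] <- NA

# Systematically missing: bmi was never recorded in study 3
ipd$bmi[ipd$study == 3] <- NA

# Proportion of missing values per study and variable
miss_by_study <- aggregate(is.na(ipd[, c("age", "bmi")]),
                           by = list(study = ipd$study), FUN = mean)
print(miss_by_study)
```

A proportion equal to 1 within a study flags a systematically missing variable, whereas proportions strictly between 0 and 1 indicate sporadically missing values.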
What are its functionalities?
micemd is an add-on for the mice R package, which performs multiple imputation using chained equations. Its additional functionalities consist of:
missMDA is an R package that allows you to:
To apply a statistical method to an incomplete data set using missMDA, see this vignette.
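As a rough idea of what such an analysis can look like, the sketch below imputes an incomplete data set with a PCA model and then runs a standard PCA on the completed data. It is only one possible workflow, not necessarily the one followed in the vignette; it assumes the `orange` example data shipped with missMDA and uses FactoMineR for the final analysis.

```r
library(missMDA)
library(FactoMineR)

# Incomplete example data set shipped with missMDA
data(orange)

# Estimate the number of components used for imputation
# (default generalized cross-validation criterion)
nb <- estim_ncpPCA(orange, ncp.min = 0, ncp.max = 5)

# Impute the missing entries with a PCA model
res.comp <- imputePCA(orange, ncp = nb$ncp)

# Apply the statistical method (here PCA) to the completed data
res.pca <- PCA(res.comp$completeObs, graph = FALSE)
summary(res.pca)
```

Single imputation of this kind yields a completed data set but does not reflect the uncertainty due to missing values; multiple imputation is the usual route when that uncertainty matters.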
Other R packages dealing with missing data are listed in the CRAN Task View on missing data (Josse et al. (2025)).