
gmeans provides an R implementation of the G-means algorithm, a clustering algorithm that extends k-means by determining the number of clusters automatically. The algorithm was introduced by Hamerly and Elkan (2003).
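
As a rough illustration of how G-means decides whether a cluster should be split, the sketch below fits two child centers inside a cluster, projects the points onto the direction connecting them, and tests that projection for normality. It is not the package's implementation: `should_split()` and its `alpha` argument are hypothetical names, and the Shapiro-Wilk test from base R stands in for the Anderson-Darling test used by G-means so that the example runs without extra packages.

```r
# Hypothetical helper, not part of the gmeans API: decide whether a single
# cluster looks non-Gaussian and should therefore be split in two.
should_split <- function(x, alpha = 0.05) {
  x <- as.matrix(x)
  # Fit two child centers inside the cluster.
  fit <- stats::kmeans(x, centers = 2, nstart = 10)
  # Direction connecting the two child centers.
  v <- fit$centers[1, ] - fit$centers[2, ]
  # Project every point onto that direction to get a one-dimensional sample.
  proj <- as.vector(x %*% v) / sum(v^2)
  # Assumption: Shapiro-Wilk as a stand-in for the Anderson-Darling test.
  p_value <- stats::shapiro.test(proj)$p.value
  # A small p-value rejects normality, which suggests keeping the split.
  p_value < alpha
}

set.seed(1)
one_blob <- matrix(rnorm(400), ncol = 2)
two_blobs <- rbind(one_blob, matrix(rnorm(400, mean = 8), ncol = 2))

should_split(one_blob)
should_split(two_blobs)
```

Repeating this check for every cluster, and refitting after each accepted split, is what lets the number of clusters grow until all clusters look Gaussian.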

Installation

You can install the development version of gmeans from GitHub with:

# install.packages("pak")
pak::pak("m-muecke/gmeans")

Usage

library(gmeans)

km <- gmeans(mtcars)
km
#> K-means clustering with 2 clusters of sizes 18, 14
#> 
#> Cluster means:
#>        mpg      cyl     disp        hp     drat       wt     qsec        vs
#> 1 23.97222 4.777778 135.5389  98.05556 3.882222 2.609056 18.68611 0.7777778
#> 2 15.10000 8.000000 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#>          am     gear     carb
#> 1 0.6111111 4.000000 2.277778
#> 2 0.1428571 3.285714 3.500000
#> 
#> Clustering vector:
#>           Mazda RX4       Mazda RX4 Wag          Datsun 710      Hornet 4 Drive 
#>                   1                   1                   1                   1 
#>   Hornet Sportabout             Valiant          Duster 360           Merc 240D 
#>                   2                   1                   2                   1 
#>            Merc 230            Merc 280           Merc 280C          Merc 450SE 
#>                   1                   1                   1                   2 
#>          Merc 450SL         Merc 450SLC  Cadillac Fleetwood Lincoln Continental 
#>                   2                   2                   2                   2 
#>   Chrysler Imperial            Fiat 128         Honda Civic      Toyota Corolla 
#>                   2                   1                   1                   1 
#>       Toyota Corona    Dodge Challenger         AMC Javelin          Camaro Z28 
#>                   1                   2                   2                   2 
#>    Pontiac Firebird           Fiat X1-9       Porsche 914-2        Lotus Europa 
#>                   2                   1                   1                   1 
#>      Ford Pantera L        Ferrari Dino       Maserati Bora          Volvo 142E 
#>                   2                   1                   2                   1 
#> 
#> Within cluster sum of squares by cluster:
#> [1] 58920.54 93643.90
#>  (between_SS / total_SS =  75.5 %)
#> 
#> Available components:
#> 
#> [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
#> [6] "betweenss"    "size"         "iter"         "ifault"
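
The components listed in the printed output match those of a `stats::kmeans()` fit, so the result can presumably be inspected in the same way. The sketch below assumes this and uses `stats::kmeans()` with two centers as a stand-in for `gmeans(mtcars)`, so it runs without the package installed; the cluster labels it produces may differ from the output shown above.

```r
# Stand-in for gmeans(mtcars): a plain k-means fit with two centers.
set.seed(42)
fit <- stats::kmeans(mtcars, centers = 2, nstart = 10)

# Cluster sizes and the cars assigned to the first cluster.
fit$size
rownames(mtcars)[fit$cluster == 1]

# Share of the total sum of squares explained by the clustering,
# the quantity printed as between_SS / total_SS.
fit$betweenss / fit$totss
```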
  • nortest: R package for testing the composite hypothesis of normality.