Package 'partitionComparison'

Title: Implements Measures for the Comparison of Two Partitions
Description: Provides several measures ((dis)similarity, distance/metric, correlation, entropy) for comparing two partitions of the same set of objects. The different measures can be assigned to three different classes: Pair comparison (containing the famous Jaccard and Rand indices), set based, and information theory based. Many of the implemented measures can be found in Albatineh AN, Niewiadomska-Bugaj M and Mihalko D (2006) <doi:10.1007/s00357-006-0017-z> and Meila M (2007) <doi:10.1016/j.jmva.2006.11.013>. Partitions are represented by vectors of class labels which allow a straightforward integration with existing clustering algorithms (e.g. kmeans()). The package is mostly based on the S4 object system.
Authors: Fabian Ball [aut, cre, cph, ctb], Andreas Geyer-Schulz [cph]
Maintainer: Fabian Ball <[email protected]>
License: MIT + file LICENSE
Version: 0.2.6
Built: 2024-10-13 06:35:14 UTC
Source: https://github.com/kit-iism-em/partitioncomparison


partitionComparison: Implements Measures for the Comparison of Two Partitions

Description

Provides several measures ((dis)similarity, distance/metric, correlation, entropy) for comparing two partitions of the same set of objects. The different measures can be assigned to three different classes: Pair comparison (containing the famous Jaccard and Rand indices), set based, and information theory based. Many of the implemented measures can be found in Albatineh AN, Niewiadomska-Bugaj M and Mihalko D (2006) doi:10.1007/s00357-006-0017-z and Meila M (2007) doi:10.1016/j.jmva.2006.11.013. Partitions are represented by vectors of class labels which allow a straightforward integration with existing clustering algorithms (e.g. kmeans()). The package is mostly based on the S4 object system.

Details

This package provides a large collection of measures to compare two partitions. Some survey articles for these measures are cited below; the seminal paper for each individual measure is given in the documentation of the corresponding function.

Most functionality is implemented as S4 classes and methods so that the package can easily be adapted to special needs and specifications. The main class is Partition, which merely wraps an atomic vector of length n for storing the class label of each object. The computation of all measures is designed to work on vectors of class labels.

All partition comparison methods can be called in the same way: <measure method>(p, q) with p, q being the two partitions (as Partition instances). One often does not explicitly want to transform the vector of class labels (as output of another package's function/algorithm) into Partition instances before using measures from this package. For convenience, the function registerPartitionVectorSignatures exists which dynamically creates versions of all measures that will directly work with plain R vectors.

Author(s)

Maintainer: Fabian Ball [email protected] [copyright holder, contributor]

Other contributors: Andreas Geyer-Schulz [copyright holder]

References

Albatineh AN, Niewiadomska-Bugaj M, Mihalko D (2006). “On Similarity Indices and Correction for Chance Agreement.” Journal of Classification, 23(2), 301–313. ISSN 0176-4268, doi:10.1007/s00357-006-0017-z.

Meila M (2007). “Comparing Clusterings–an Information Based Distance.” Journal of Multivariate Analysis, 98(5), 873–895. doi:10.1016/j.jmva.2006.11.013.

See Also

Useful links: https://github.com/kit-iism-em/partitioncomparison

Examples

# Generate some data
set.seed(42)
data <- cbind(x=c(rnorm(50), rnorm(30, mean=5)), y=c(rnorm(50), rnorm(30, mean=5)))
# Run k-means with two/three centers
data.km2 <- kmeans(data, 2)
data.km3 <- kmeans(data, 3)

# Load this library
library(partitionComparison)
# Register the measures to take ANY input
registerPartitionVectorSignatures(environment())
# Compare the clusters
randIndex(data.km2$cluster, data.km3$cluster)
# [1] 0.8101266

Subsetting Partition instances

Description

This method overrides the standard subset assignment (x[i, j] <- value) to prevent alteration, which makes partitions (i.e. class labels) immutable.

Usage

## S4 replacement method for signature 'Partition'
x[i, j] <- value

Arguments

x

A Partition instance

i

Extract

j

Extract

value

Extract

Author(s)

Fabian Ball [email protected]


Adjusted Rand Index

Description

Compute the Adjusted Rand Index (ARI)

\frac{2(N_{00}N_{11} - N_{10}N_{01})}{N'_{01}N_{12} + N'_{10}N_{21}}

Usage

adjustedRandIndex(p, q)

## S4 method for signature 'Partition,Partition'
adjustedRandIndex(p, q)

## S4 method for signature 'PairCoefficients,missing'
adjustedRandIndex(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • adjustedRandIndex(p = Partition, q = Partition): Compute given two partitions

  • adjustedRandIndex(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Hubert L, Arabie P (1985). “Comparing Partitions.” Journal of Classification, 2(1), 193–218.

Examples

isTRUE(all.equal(adjustedRandIndex(new("Partition", c(0, 0, 0, 1, 1)), 
                                   new("Partition", c(0, 0, 1, 1, 1))), 1/6))
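
As a cross-check of the formula, the following base-R sketch evaluates it by hand for the example above. It does not call the package; the variable names are illustrative, and the four coefficient values are the ones shown in the computePairCoefficients example of this manual.

```r
# Pair coefficients for P = (0, 0, 0, 1, 1) and Q = (0, 0, 1, 1, 1)
N11 <- 2; N10 <- 2; N01 <- 2; N00 <- 4

# Derived coefficients, following the definitions given in this manual
N21  <- N11 + N10   # N_{21}
N12  <- N11 + N01   # N_{12}
N01p <- N00 + N01   # N'_{01}
N10p <- N00 + N10   # N'_{10}

# Adjusted Rand Index according to the formula above
ari <- 2 * (N00 * N11 - N10 * N01) / (N01p * N12 + N10p * N21)
all.equal(ari, 1/6)  # TRUE
```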

Baulieu Index 1

Description

Compute the index 1 of Baulieu

\frac{ N^2 - N(N_{10} + N_{01}) + (N_{10} - N_{01})^2 }{ N^2 }

Usage

baulieu1(p, q)

## S4 method for signature 'Partition,Partition'
baulieu1(p, q)

## S4 method for signature 'PairCoefficients,missing'
baulieu1(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • baulieu1(p = Partition, q = Partition): Compute given two partitions

  • baulieu1(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Baulieu FB (1989). “A Classification of Presence/Absence Based Dissimilarity Coefficients.” Journal of Classification, 6(1), 233–246. ISSN 0176-4268, 1432-1343, doi:10.1007/BF01908601.

Examples

isTRUE(all.equal(baulieu1(new("Partition", c(0, 0, 0, 1, 1)), 
                          new("Partition", c(0, 0, 1, 1, 1))), 0.76))

Baulieu Index 2

Description

Compute the index 2 of Baulieu

\frac{ N_{11}N_{00} - N_{10}N_{01} }{ N^2 }

Usage

baulieu2(p, q)

## S4 method for signature 'Partition,Partition'
baulieu2(p, q)

## S4 method for signature 'PairCoefficients,missing'
baulieu2(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • baulieu2(p = Partition, q = Partition): Compute given two partitions

  • baulieu2(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Baulieu FB (1989). “A Classification of Presence/Absence Based Dissimilarity Coefficients.” Journal of Classification, 6(1), 233–246. ISSN 0176-4268, 1432-1343, doi:10.1007/BF01908601.

Examples

isTRUE(all.equal(baulieu2(new("Partition", c(0, 0, 0, 1, 1)), 
                          new("Partition", c(0, 0, 1, 1, 1))), 0.04))

Classification Error Distance

Description

Compute the classification error distance

1 - \frac{1}{n} \max_{\sigma}{\sum_{C \in \cal{P}}{|C \cap \sigma(C)|}}

where \sigma is a weighted matching between the clusters of the two partitions. The nodes are the classes of each partition; the weights are the numbers of objects the classes have in common.

Usage

classificationErrorDistance(p, q)

## S4 method for signature 'Partition,Partition'
classificationErrorDistance(p, q)

Arguments

p

The partition P

q

The partition Q

Methods (by class)

  • classificationErrorDistance(p = Partition, q = Partition): Compute given two partitions

Hint

This measure is implemented using lp.assign from the lpSolve package to compute the maximum-weight matching of a weighted bipartite graph.

Author(s)

Fabian Ball [email protected]

References

Meila M, Heckerman D (2001). “An Experimental Comparison of Model-Based Clustering Methods.” Machine Learning, 42(1), 9–29.

Meila M (2005). “Comparing Clusterings: An Axiomatic View.” In Proceedings of the 22nd International Conference on Machine Learning, ICML '05, 577–584. ISBN 978-1-59593-180-1, doi:10.1145/1102351.1102424.

Examples

isTRUE(all.equal(classificationErrorDistance(new("Partition", c(0, 0, 0, 1, 1)), 
                                             new("Partition", c(0, 0, 1, 1, 1))), 0.2))
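
To make the matching step concrete, here is a minimal base-R sketch that reproduces the value 0.2 for the example above. It is not the package's implementation: instead of lp.assign it simply tries every one-to-one assignment of clusters, which is only feasible for a handful of clusters. The helper all_permutations and all variable names are hypothetical.

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)
n <- length(p)

# Overlap counts |C ∩ D| for every pair of clusters
overlap <- unclass(table(p, q))

# Pad to a square matrix so every cluster can be matched (possibly to an empty one)
k <- max(dim(overlap))
padded <- matrix(0, nrow = k, ncol = k)
padded[seq_len(nrow(overlap)), seq_len(ncol(overlap))] <- overlap

# Hypothetical helper: all permutations of a vector (brute force)
all_permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (rest in all_permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], rest)
    }
  }
  out
}

# Total overlap of each candidate matching; keep the best one
matched <- vapply(all_permutations(seq_len(k)),
                  function(s) sum(padded[cbind(seq_len(k), s)]),
                  numeric(1))
ced <- 1 - max(matched) / n
all.equal(ced, 0.2)  # TRUE
```

A linear assignment solver avoids the factorial growth of this enumeration, which is presumably why the package relies on lp.assign.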

Compare two partitions with all measures

Description

Compute the comparison between two partitions for all available measures.

Usage

compareAll(p, q)

## S4 method for signature 'Partition,Partition'
compareAll(p, q)

Arguments

p

The partition P

q

The partition Q

Value

Instance of data.frame with columns measure and value

Methods (by class)

  • compareAll(p = Partition, q = Partition): Compare given two Partition instances

Warning

This method will identify every generic S4 method that has a signature "Partition", "Partition" (including signatures with following "missing" parameters, e.g. "Partition", "Partition", "missing") as a partition comparison measure, except this method itself (otherwise: infinite recursion). This means one has to take care when defining other methods with the same signature in order not to produce unwanted side-effects!

Author(s)

Fabian Ball [email protected]

Examples

compareAll(new("Partition", c(0, 0, 0, 1, 1)), new("Partition", c(0, 0, 1, 1, 1)))
## Not run: 
                        measure       value
 1            adjustedRandIndex 0.166666667
 2                     baulieu1 0.760000000
 3                     baulieu2 0.040000000
 4  classificationErrorDistance 0.200000000
 5                  czekanowski 0.500000000
 6                dongensMetric 2.000000000
 7                 fagerMcGowan 0.250000000
 8          folwkesMallowsIndex 0.500000000
 9              gammaStatistics 0.166666667
 10              goodmanKruskal 0.333333333
 11               gowerLegendre 0.750000000
 12                      hamann 0.200000000
 13          jaccardCoefficient 0.333333333
 14                  kulczynski 0.500000000
 15                  larsenAone 0.800000000
 16                 lermanIndex 0.436435780
 17                mcconnaughey 0.000000000
 18            minkowskiMeasure 1.000000000
 19                mirkinMetric 8.000000000
 20           mutualInformation 0.291103166
 21       normalizedLermanIndex 0.166666667
 22 normalizedMutualInformation 0.432538068
 23                     pearson 0.006944444
 24                      peirce 0.166666667
 25                   randIndex 0.600000000
 26              rogersTanimoto 0.428571429
 27                   russelRao 0.200000000
 28               rvCoefficient 0.692307692
 29                sokalSneath1 0.583333333
 30                sokalSneath2 0.200000000
 31                sokalSneath3 0.333333333
 32      variationOfInformation 0.763817002
 33                    wallaceI 0.500000000
 34                   wallaceII 0.500000000

## End(Not run)

Compute the four coefficients N_{11}, N_{10}, N_{01}, N_{00}

Description

Given two partitions P and Q of the same n objects, each described as a vector of cluster ids of length n, compute the four coefficients (N_{11}, N_{10}, N_{01}, N_{00}) on which all pair comparison measures are based.

Usage

computePairCoefficients(p, q)

Arguments

p

The partition P

q

The partition Q

Author(s)

Fabian Ball [email protected]

Examples

pc <- computePairCoefficients(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1)))
isTRUE(all.equal(N11(pc), 2))
isTRUE(all.equal(N10(pc), 2))
isTRUE(all.equal(N01(pc), 2))
isTRUE(all.equal(N00(pc), 4))
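
The counting behind these four numbers can be sketched in base R without the package. The function pair_coefficients below is a hypothetical helper, and the reading of the indices is an assumption: N_{11} counts object pairs placed together in both partitions, N_{10} pairs together in P only, N_{01} pairs together in Q only, and N_{00} pairs separated in both. It enumerates all pairs explicitly, so it is meant for small examples only.

```r
# Hypothetical helper: count object pairs by co-membership in p and q
pair_coefficients <- function(p, q) {
  stopifnot(length(p) == length(q))
  idx <- utils::combn(length(p), 2)        # all unordered pairs of objects
  same_p <- p[idx[1, ]] == p[idx[2, ]]     # pair shares a cluster in P
  same_q <- q[idx[1, ]] == q[idx[2, ]]     # pair shares a cluster in Q
  c(N11 = sum(same_p & same_q),
    N10 = sum(same_p & !same_q),           # assumed: together in P only
    N01 = sum(!same_p & same_q),           # assumed: together in Q only
    N00 = sum(!same_p & !same_q))
}

pc <- pair_coefficients(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))
sum(pc) == choose(5, 2)  # TRUE: the four counts cover all 10 pairs
```

For this example the sketch gives N11 = 2, N10 = 2, N01 = 2 and N00 = 4, in agreement with the checks above.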

Czekanowski Index

Description

Compute the Czekanowski index

\frac{2N_{11}}{2N_{11} + N_{10} + N_{01}}

Usage

czekanowski(p, q)

## S4 method for signature 'Partition,Partition'
czekanowski(p, q)

## S4 method for signature 'PairCoefficients,missing'
czekanowski(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • czekanowski(p = Partition, q = Partition): Compute given two partitions

  • czekanowski(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Czekanowski J (1932). “"Coefficient of Racial Likeness" und "Durchschnittliche Differenz".” Anthropologischer Anzeiger, 9(3/4), 227–249.

Examples

isTRUE(all.equal(czekanowski(new("Partition", c(0, 0, 0, 1, 1)), 
                             new("Partition", c(0, 0, 1, 1, 1))), 0.5))

Dongen's Metric

Description

Compute Dongen's metric

2n - \sum_{C \in P} \max_{D \in Q} |C \cap D| - \sum_{D \in Q} \max_{C \in P} |C \cap D|

Usage

dongensMetric(p, q)

## S4 method for signature 'Partition,Partition'
dongensMetric(p, q)

Arguments

p

The partition P

q

The partition Q

Methods (by class)

  • dongensMetric(p = Partition, q = Partition): Compute given two partitions

Author(s)

Fabian Ball [email protected]

References

van Dongen S (2000). “Performance Criteria For Graph Clustering And Markov Cluster Experiments.” Technical Report INS-R 0012, CWI.

See Also

projectionNumber

Examples

isTRUE(all.equal(dongensMetric(new("Partition", c(0, 0, 0, 1, 1)), 
                               new("Partition", c(0, 0, 1, 1, 1))), 2))
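
The two sums in the formula are the row-wise and column-wise maxima of the cluster overlap table. A minimal base-R sketch (illustrative variable names, not the package's code) reproduces the value 2 for the example above:

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)
n <- length(p)

# Contingency table of overlaps |C ∩ D|; rows are clusters of P, columns clusters of Q
overlap <- table(p, q)

# 2n minus the best overlap of each cluster of P and of each cluster of Q
dongen <- 2 * n - sum(apply(overlap, 1, max)) - sum(apply(overlap, 2, max))
dongen == 2  # TRUE
```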

Entropy

Description

Compute the Shannon entropy

-\sum_{i} p_i \log_b p_i

Usage

entropy(x, log_base)

## S4 method for signature 'numeric,numeric'
entropy(x, log_base)

## S4 method for signature 'Partition,numeric'
entropy(x, log_base)

## S4 method for signature 'ANY,missing'
entropy(x, log_base = exp(1))

Arguments

x

A probability distribution

log_base

Optional base of the logarithm (default: e)

Methods (by class)

  • entropy(x = Partition, log_base = numeric): Entropy of a partition represented by x

Hint

This method is used internally for measures based on information theory

Author(s)

Fabian Ball [email protected]

Examples

isTRUE(all.equal(entropy(c(.5, .5)), log(2)))
isTRUE(all.equal(entropy(c(.5, .5), 2), 1))
isTRUE(all.equal(entropy(c(.5, .5), 4), .5))

# Entropy of a partition
isTRUE(all.equal(entropy(new("Partition", c(0, 0, 1, 1, 1))), entropy(c(2/5, 3/5))))
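
For readers who want to see the formula spelled out, here is a small base-R sketch. The function shannon_entropy is a hypothetical stand-in, not the package's entropy method; dropping zero probabilities (treating 0 log 0 as 0) is an assumption of this sketch.

```r
# Hypothetical stand-in for the formula -sum_i p_i log_b p_i
shannon_entropy <- function(prob, log_base = exp(1)) {
  prob <- prob[prob > 0]                    # assumption: 0 * log(0) counts as 0
  -sum(prob * log(prob, base = log_base))
}

all.equal(shannon_entropy(c(.5, .5)), log(2))  # TRUE
all.equal(shannon_entropy(c(.5, .5), 2), 1)    # TRUE

# For a partition, the distribution is given by the relative class sizes
labels <- c(0, 0, 1, 1, 1)
class_prob <- as.vector(table(labels)) / length(labels)   # 2/5 and 3/5
all.equal(shannon_entropy(class_prob), shannon_entropy(c(2/5, 3/5)))  # TRUE
```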

Fager & McGowan Index

Description

Compute the index of Fager and McGowan

\frac{N_{11}}{\sqrt{N_{21}N_{12}}} - \frac{1}{2\sqrt{N_{21}}}

Usage

fagerMcGowan(p, q)

## S4 method for signature 'Partition,Partition'
fagerMcGowan(p, q)

## S4 method for signature 'PairCoefficients,missing'
fagerMcGowan(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • fagerMcGowan(p = Partition, q = Partition): Compute given two partitions

  • fagerMcGowan(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Fager EW, McGowan JA (1963). “Zooplankton Species Groups in the North Pacific: Co-Occurrences of Species Can Be Used to Derive Groups Whose Members React Similarly to Water-Mass Types.” Science, 140(3566), 453–460.

Examples

isTRUE(all.equal(fagerMcGowan(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1))), 0.25))

Fowlkes & Mallows Index

Description

Compute the index of Fowlkes and Mallows

\sqrt{\frac{N_{11}}{N_{21}} \frac{N_{11}}{N_{12}}}

which is a combination of the two Wallace indices.

Usage

folwkesMallowsIndex(p, q)

## S4 method for signature 'Partition,Partition'
folwkesMallowsIndex(p, q)

## S4 method for signature 'PairCoefficients,missing'
folwkesMallowsIndex(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • folwkesMallowsIndex(p = Partition, q = Partition): Compute given two partitions

  • folwkesMallowsIndex(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Fowlkes EB, Mallows CL (1983). “A Method for Comparing Two Hierarchical Clusterings.” Journal of the American Statistical Association, 78(383), 553–569.

See Also

wallaceI wallaceII

Examples

isTRUE(all.equal(folwkesMallowsIndex(new("Partition", c(0, 0, 0, 1, 1)), 
                                     new("Partition", c(0, 0, 1, 1, 1))), 0.5))
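
The formula is the geometric mean of the two ratios N_{11}/N_{21} and N_{11}/N_{12}. The base-R sketch below evaluates it by hand for the example above, using the coefficient values from the computePairCoefficients example. The variable names are illustrative, and no claim is made about which ratio corresponds to wallaceI and which to wallaceII.

```r
# Pair coefficients for P = (0, 0, 0, 1, 1) and Q = (0, 0, 1, 1, 1)
N11 <- 2; N10 <- 2; N01 <- 2

N21 <- N11 + N10          # N_{21}
N12 <- N11 + N01          # N_{12}

ratio_21 <- N11 / N21     # first factor under the square root
ratio_12 <- N11 / N12     # second factor under the square root

fm <- sqrt(ratio_21 * ratio_12)
all.equal(fm, 0.5)  # TRUE
```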

Gamma Statistics

Description

Compute the Gamma statistics

\frac{N_{11}N_{00} - N_{10}N_{01}}{\sqrt{ N_{21}N_{12}N'_{10}N'_{01} }}

Usage

gammaStatistics(p, q)

## S4 method for signature 'Partition,Partition'
gammaStatistics(p, q)

## S4 method for signature 'PairCoefficients,missing'
gammaStatistics(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • gammaStatistics(p = Partition, q = Partition): Compute given two partitions

  • gammaStatistics(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Yule GU (1900). “On the Association of Attributes in Statistics: With Illustrations from the Material of the Childhood Society, &c.” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 194, 257–319.

Examples

isTRUE(all.equal(gammaStatistics(new("Partition", c(0, 0, 0, 1, 1)), 
                                 new("Partition", c(0, 0, 1, 1, 1))), 1/6))

Goodman & Kruskal Index

Description

Compute the index of Goodman and Kruskal

\frac{N_{11}N_{00} - N_{10}N_{01}}{N_{11}N_{00} + N_{10}N_{01}}

Usage

goodmanKruskal(p, q)

## S4 method for signature 'Partition,Partition'
goodmanKruskal(p, q)

## S4 method for signature 'PairCoefficients,missing'
goodmanKruskal(p, q)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • goodmanKruskal(p = Partition, q = Partition): Compute given two partitions

  • goodmanKruskal(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Goodman LA, Kruskal WH (1954). “Measures of Association for Cross Classifications.” Journal of the American Statistical Association, 49(268), 732–764. ISSN 0162-1459, doi:10.1080/01621459.1954.10501231.

Examples

isTRUE(all.equal(goodmanKruskal(new("Partition", c(0, 0, 0, 1, 1)), 
                                new("Partition", c(0, 0, 1, 1, 1))), 1/3))

Gower & Legendre Index

Description

Compute the index of Gower and Legendre

\frac{N_{11} + N_{00}}{N_{11} + \frac{1}{2}\left(N_{10} + N_{01}\right) + N_{00}}

Usage

gowerLegendre(p, q)

## S4 method for signature 'Partition,Partition'
gowerLegendre(p, q)

## S4 method for signature 'PairCoefficients,missing'
gowerLegendre(p, q)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • gowerLegendre(p = Partition, q = Partition): Compute given two partitions

  • gowerLegendre(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Gower JC, Legendre P (1986). “Metric and Euclidean Properties of Dissimilarity Coefficients.” Journal of Classification, 3(1), 5–48. ISSN 0176-4268, 1432-1343, doi:10.1007/BF01896809.

Examples

isTRUE(all.equal(gowerLegendre(new("Partition", c(0, 0, 0, 1, 1)), 
                               new("Partition", c(0, 0, 1, 1, 1))), 0.75))

Hamann Coefficient

Description

Compute the Hamann coefficient

\frac{(N_{11} + N_{00}) - (N_{10} + N_{01})}{N}

Usage

hamann(p, q)

## S4 method for signature 'Partition,Partition'
hamann(p, q)

## S4 method for signature 'PairCoefficients,missing'
hamann(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • hamann(p = Partition, q = Partition): Compute given two partitions

  • hamann(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Hamann U (1961). “Merkmalsbestand Und Verwandtschaftsbeziehungen Der Farinosae: Ein Beitrag Zum System Der Monokotyledonen.” Willdenowia, 2(5), 639–768. ISSN 0511-9618.

Examples

isTRUE(all.equal(hamann(new("Partition", c(0, 0, 0, 1, 1)), 
                        new("Partition", c(0, 0, 1, 1, 1))), 0.2))

Jaccard Coefficient

Description

Compute the Jaccard coefficient

\frac{N_{11}}{N_{11} + N_{10} + N_{01}}

Usage

jaccardCoefficient(p, q)

## S4 method for signature 'Partition,Partition'
jaccardCoefficient(p, q)

## S4 method for signature 'PairCoefficients,missing'
jaccardCoefficient(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • jaccardCoefficient(p = Partition, q = Partition): Compute given two partitions

  • jaccardCoefficient(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Jaccard P (1908). “Nouvelles Recherches Sur La Distribution Florale.” Bulletin de la Société Vaudoise des Sciences Naturelles, 44(163), 223–270.

Examples

isTRUE(all.equal(jaccardCoefficient(new("Partition", c(0, 0, 0, 1, 1)), 
                                    new("Partition", c(0, 0, 1, 1, 1))), 1/3))

Kulczynski Index

Description

Compute the Kulczynski index

\frac{1}{2} \left(\frac{N_{11}}{N_{21}} + \frac{N_{11}}{N_{12}} \right)

Usage

kulczynski(p, q)

## S4 method for signature 'Partition,Partition'
kulczynski(p, q)

## S4 method for signature 'PairCoefficients,missing'
kulczynski(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • kulczynski(p = Partition, q = Partition): Compute given two partitions

  • kulczynski(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Kulczynski S (1927). “Zespoly Roslin w Pieninach.” Bull. Intern. Acad. Pol. Sci. Lett. Cl. Sci. Math. Nat., B (Sci. Nat.), 1927(Suppl 2), 57–203.

Examples

isTRUE(all.equal(kulczynski(new("Partition", c(0, 0, 0, 1, 1)), 
                            new("Partition", c(0, 0, 1, 1, 1))), 0.5))

Larsen & Aone Measure

Description

Compute the measure of Larsen and Aone

\frac{1}{|\cal{P}|} \sum_{C \in \cal{P}}{\max_{D \in \cal{Q}}{\frac{2|C \cap D|}{|C| + |D|}}}

Usage

larsenAone(p, q)

## S4 method for signature 'Partition,Partition'
larsenAone(p, q)

Arguments

p

The partition P

q

The partition Q

Methods (by class)

  • larsenAone(p = Partition, q = Partition): Compute given two partitions

Author(s)

Fabian Ball [email protected]

References

Larsen B, Aone C (1999). “Fast and Effective Text Mining Using Linear-Time Document Clustering.” In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '99, 16–22. ISBN 1-58113-143-7, doi:10.1145/312129.312186.

Examples

isTRUE(all.equal(larsenAone(new("Partition", c(0, 0, 0, 1, 1)), 
                            new("Partition", c(0, 0, 1, 1, 1))), 0.8))
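
Read cluster by cluster, the formula takes for each cluster C of P its best match D in Q, scored as 2|C ∩ D| / (|C| + |D|), and averages these scores over the clusters of P. A base-R sketch with illustrative variable names (not the package's code) reproduces the value 0.8 for the example above:

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)

# Overlaps |C ∩ D|; rows are clusters of P, columns clusters of Q
overlap <- unclass(table(p, q))
size_p <- rowSums(overlap)   # |C| for each cluster of P
size_q <- colSums(overlap)   # |D| for each cluster of Q

# Score 2|C ∩ D| / (|C| + |D|) for every pair of clusters
score <- 2 * overlap / outer(size_p, size_q, "+")

# Best match per cluster of P, averaged over the clusters of P
larsen_aone <- mean(apply(score, 1, max))
all.equal(larsen_aone, 0.8)  # TRUE
```

Because the average runs over the clusters of P only, swapping the arguments can change the result in general.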

Lerman Index

Description

Compute the Lerman index

\frac{N_{11} - E(N_{11})}{\sqrt{\sigma^2(N_{11})}}

Usage

lermanIndex(p, q, c = NULL)

## S4 method for signature 'Partition,Partition,missing'
lermanIndex(p, q, c = NULL)

## S4 method for signature 'Partition,Partition,PairCoefficients'
lermanIndex(p, q, c = NULL)

Arguments

p

The partition P

q

The partition Q

c

PairCoefficients or NULL

Methods (by class)

  • lermanIndex(p = Partition, q = Partition, c = missing): Compute given two partitions

  • lermanIndex(p = Partition, q = Partition, c = PairCoefficients): Compute given the partitions and pair coefficients

Author(s)

Fabian Ball [email protected]

References

Lerman IC (1988). “Comparing Partitions (Mathematical and Statistical Aspects).” In Bock H (ed.), Classification and Related Methods of Data Analysis, 121–132.

Hubert L, Arabie P (1985). “Comparing Partitions.” Journal of Classification, 2(1), 193–218.

Denœud L, Guénoche A (2006). “Comparison of Distance Indices Between Partitions.” In Batagelj V, Bock H, Ferligoj A, Žiberna A (eds.), Data Science and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, 21–28. Springer Berlin Heidelberg. ISBN 978-3-540-34415-5 978-3-540-34416-2.

See Also

normalizedLermanIndex

Examples

isTRUE(all.equal(lermanIndex(new("Partition", c(0, 0, 0, 1, 1)), 
                             new("Partition", c(0, 0, 1, 1, 1))), 2/sqrt(21)))

McConnaughey Index

Description

Compute the McConnaughey index

\frac{N_{11}^2 - N_{10}N_{01}}{N_{21}N_{12}}

Usage

mcconnaughey(p, q)

## S4 method for signature 'Partition,Partition'
mcconnaughey(p, q)

## S4 method for signature 'PairCoefficients,missing'
mcconnaughey(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • mcconnaughey(p = Partition, q = Partition): Compute given two partitions

  • mcconnaughey(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

McConnaughey BH, Laut LP (1964). The Determination and Analysis of Plankton Communities. Lembaga Penelitian Laut.

Examples

isTRUE(all.equal(mcconnaughey(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1))), 0))

Minkowski Measure

Description

Compute the Minkowski measure

\sqrt{ \frac{N_{10} + N_{01}}{N_{11} + N_{10}} }

Usage

minkowskiMeasure(p, q)

## S4 method for signature 'Partition,Partition'
minkowskiMeasure(p, q)

## S4 method for signature 'PairCoefficients,missing'
minkowskiMeasure(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • minkowskiMeasure(p = Partition, q = Partition): Compute given two partitions

  • minkowskiMeasure(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Minkowski H (1911). Gesammelte Abhandlungen von Hermann Minkowski, Zweiter Band, number 2. B. G. Teubner, Leipzig, Berlin.

Examples

isTRUE(all.equal(minkowskiMeasure(new("Partition", c(0, 0, 0, 1, 1)), 
                                  new("Partition", c(0, 0, 1, 1, 1))), 1))

Mirkin Metric

Description

Compute the Mirkin metric

2(N_{10} + N_{01})

Usage

mirkinMetric(p, q)

## S4 method for signature 'Partition,Partition'
mirkinMetric(p, q)

## S4 method for signature 'PairCoefficients,missing'
mirkinMetric(p, q = NULL)

Arguments

p

The partition P or an instance of PairCoefficients

q

The partition Q or NULL

Methods (by class)

  • mirkinMetric(p = Partition, q = Partition): Compute given two partitions

  • mirkinMetric(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball [email protected]

References

Mirkin BG, Chernyi LB (1970). “Measurement of the Distance Between Partitions of a Finite Set of Objects.” Automation and Remote Control, 31(5), 786–792.

Examples

isTRUE(all.equal(mirkinMetric(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1))), 8))

Mutual Information

Description

Compute the mutual information

\sum_{C \in P} \sum_{D \in Q} {\frac{|C \cap D|}{n} \log n\frac{|C \cap D|}{|C| |D|}}

Usage

mutualInformation(p, q)

## S4 method for signature 'Partition,Partition'
mutualInformation(p, q)

Arguments

p

The partition P

q

The partition Q

Methods (by class)

  • mutualInformation(p = Partition, q = Partition): Compute given two partitions

Author(s)

Fabian Ball [email protected]

References

Vinh NX, Epps J, Bailey J (2010). “Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance.” Journal of Machine Learning Research, 11, 2837–2854.

See Also

normalizedMutualInformation

Examples

isTRUE(all.equal(mutualInformation(new("Partition", c(0, 0, 0, 1, 1)), 
                 new("Partition", c(0, 0, 1, 1, 1))), 4/5*log(5/3) + 1/5*log(5/9)))
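
The expected value in the example can be traced through the formula with a short base-R sketch. It uses the natural logarithm and skips empty overlaps (treating 0 log 0 as 0); both are assumptions of this sketch, and the variable names are illustrative rather than taken from the package.

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)
n <- length(p)

# Joint distribution |C ∩ D| / n over pairs of clusters
joint <- unclass(table(p, q)) / n

# Product of the marginals (|C| / n) * (|D| / n)
expected <- outer(rowSums(joint), colSums(joint))

# Sum over non-empty overlaps; joint / expected equals n |C ∩ D| / (|C| |D|)
nz <- joint > 0
mi <- sum(joint[nz] * log(joint[nz] / expected[nz]))
all.equal(mi, 4/5 * log(5/3) + 1/5 * log(5/9))  # TRUE
```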

Method to retrieve the complex coefficient N

Description

It is defined as N = N_{11} + N_{10} + N_{01} + N_{00}, which equals \binom{n}{2}, with n the number of objects

Usage

N(obj)

## S4 method for signature 'PairCoefficients'
N(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the coefficient N_{00}

Description

Method to retrieve the coefficient N_{00}

Usage

N00(obj)

## S4 method for signature 'PairCoefficients'
N00(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the coefficient N_{01}

Description

Method to retrieve the coefficient N_{01}

Usage

N01(obj)

## S4 method for signature 'PairCoefficients'
N01(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the complex coefficient N'_{01}

Description

It is defined as N'_{01} = N_{00} + N_{01}

Usage

N01p(obj)

## S4 method for signature 'PairCoefficients'
N01p(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the coefficient N_{10}

Description

Method to retrieve the coefficient N_{10}

Usage

N10(obj)

## S4 method for signature 'PairCoefficients'
N10(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the complex coefficient N'_{10}

Description

It is defined as N'_{10} = N_{00} + N_{10}

Usage

N10p(obj)

## S4 method for signature 'PairCoefficients'
N10p(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the coefficient N_{11}

Description

Method to retrieve the coefficient N_{11}

Usage

N11(obj)

## S4 method for signature 'PairCoefficients'
N11(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the complex coefficient N_{12}

Description

It is defined as N_{12} = N_{11} + N_{01}

Usage

N12(obj)

## S4 method for signature 'PairCoefficients'
N12(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Method to retrieve the complex coefficient N_{21}

Description

It is defined as N_{21} = N_{11} + N_{10}

Usage

N21(obj)

## S4 method for signature 'PairCoefficients'
N21(obj)

Arguments

obj

Instance of PairCoefficients

Author(s)

Fabian Ball [email protected]


Normalized Lerman Index

Description

Compute the normalized Lerman index

L(P, Q) / \sqrt{L(P, P)L(Q, Q)}

where L is the Lerman index.

Usage

normalizedLermanIndex(p, q, c = NULL)

## S4 method for signature 'Partition,Partition,missing'
normalizedLermanIndex(p, q, c = NULL)

## S4 method for signature 'Partition,Partition,PairCoefficients'
normalizedLermanIndex(p, q, c = NULL)

Arguments

p

The partition P

q

The partition Q

c

PairCoefficients or NULL

Methods (by class)

  • normalizedLermanIndex(p = Partition, q = Partition, c = missing): Compute given two partitions

  • normalizedLermanIndex(p = Partition, q = Partition, c = PairCoefficients): Compute given the partitions and pair coefficients

Author(s)

Fabian Ball [email protected]

References

Lerman IC (1988). “Comparing Partitions (Mathematical and Statistical Aspects).” In Bock H (ed.), Classification and Related Methods of Data Analysis, 121–132.

Hubert L, Arabie P (1985). “Comparing Partitions.” Journal of Classification, 2(1), 193–218.

See Also

lermanIndex

Examples

isTRUE(all.equal(normalizedLermanIndex(new("Partition", c(0, 0, 0, 1, 1)), 
                                       new("Partition", c(0, 0, 1, 1, 1))), 1/6))

Normalized Mutual Information

Description

Compute the mutual information ($MI$), normalized either by the minimum or maximum of the two partition entropies ($H$)

$$\frac{MI(P, Q)}{\varphi(H(P), H(Q))},\ \varphi \in \{\min, \max\}$$

or by their sum (with a factor of 2, i.e. by their arithmetic mean)

$$\frac{2 \cdot MI(P, Q)}{H(P) + H(Q)}$$

Usage

normalizedMutualInformation(p, q, type = c("min", "max", "sum"))

## S4 method for signature 'Partition,Partition,character'
normalizedMutualInformation(p, q, type = c("min", "max", "sum"))

## S4 method for signature 'Partition,Partition,missing'
normalizedMutualInformation(p, q, type = NULL)

Arguments

p

The partition $P$

q

The partition $Q$

type

One of "min" (default), "max" or "sum"

Methods (by class)

  • normalizedMutualInformation(p = Partition, q = Partition, type = character): Compute given two partitions

  • normalizedMutualInformation(p = Partition, q = Partition, type = missing): Compute given two partitions with type="min"

Author(s)

Fabian Ball

References

Kvalseth TO (1987). “Entropy and Correlation: Some Comments.” IEEE Transactions on Systems, Man and Cybernetics, 17(3), 517–519. ISSN 0018-9472, doi:10.1109/TSMC.1987.4309069.

See Also

mutualInformation, entropy

Examples

isTRUE(all.equal(normalizedMutualInformation(
                   new("Partition", c(0, 0, 0, 1, 1)),
                   new("Partition", c(0, 0, 1, 1, 1)), "min"),
                 normalizedMutualInformation(
                   new("Partition", c(0, 0, 0, 1, 1)), 
                   new("Partition", c(0, 0, 1, 1, 1)), "max")
                 ))
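The three normalizations can be sketched in base R without the package. This is only an illustration under stated assumptions: `shannon()` and `nmi_sketch()` are hypothetical helper names, plain label vectors stand in for Partition instances, and the natural logarithm is assumed.

```r
# Shannon entropy (natural logarithm) of a probability vector; 0 * log(0) is taken as 0.
shannon <- function(prob) {
  prob <- prob[prob > 0]
  -sum(prob * log(prob))
}

# Hypothetical helper: normalized mutual information of two label vectors.
nmi_sketch <- function(p, q, type = c("min", "max", "sum")) {
  type <- match.arg(type)
  joint <- unclass(table(p, q)) / length(p)   # joint label distribution
  hp <- shannon(rowSums(joint))               # H(P)
  hq <- shannon(colSums(joint))               # H(Q)
  mi <- hp + hq - shannon(as.vector(joint))   # MI = H(P) + H(Q) - H(P, Q)
  switch(type,
         min = mi / min(hp, hq),
         max = mi / max(hp, hq),
         sum = 2 * mi / (hp + hq))
}

p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)
nmi_values <- sapply(c("min", "max", "sum"), function(type) nmi_sketch(p, q, type))
```

For these two label vectors both entropies are equal, so all three entries of `nmi_values` coincide, which is what the example above checks for "min" and "max".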

S4 class to represent coefficients of object pairs for the comparison of two object partitions (say $P$ and $Q$).

Description

S4 class to represent coefficients of object pairs for the comparison of two object partitions (say $P$ and $Q$).

Slots

N11

The number of object pairs that are together in a cluster in both partitions

N00

The number of object pairs that are together in a cluster in neither partition

N10

The number of object pairs that are together in a cluster only in partition $P$

N01

The number of object pairs that are together in a cluster only in partition $Q$

Author(s)

Fabian Ball

See Also

N11 N10 N01 N00
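To make the four slots concrete, the counts can be reproduced by brute force in base R. This is a minimal sketch, not the package's implementation; `pair_counts()` is a hypothetical helper that works on plain label vectors.

```r
# Count object pairs by co-membership in two label vectors (base R only).
pair_counts <- function(p, q) {
  stopifnot(length(p) == length(q))
  same_p <- outer(p, p, "==")   # TRUE where two objects share a cluster in P
  same_q <- outer(q, q, "==")   # TRUE where two objects share a cluster in Q
  pairs <- upper.tri(same_p)    # selects each unordered pair exactly once
  c(N11 = sum(same_p[pairs] & same_q[pairs]),
    N10 = sum(same_p[pairs] & !same_q[pairs]),
    N01 = sum(!same_p[pairs] & same_q[pairs]),
    N00 = sum(!same_p[pairs] & !same_q[pairs]))
}

counts <- pair_counts(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))
# N11 = 2, N10 = 2, N01 = 2, N00 = 4
```

The four counts always sum to the number of unordered pairs, here choose(5, 2) = 10. The quadratic memory use of `outer()` makes this suitable for small examples only.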


Simple S4 class to represent a partition of objects as vector of class labels.

Description

This class is a wrapper around a vector but allows only the atomic vectors logical, numeric, integer, complex, character, raw. The reason for this is that only those types seem to make sense as class labels. Furthermore, class labels are immutable.

Author(s)

Fabian Ball

Examples

p <- new("Partition", c(0, 0, 1, 1, 1))
q <- new("Partition", c("a", "a", "b", "b", "b"))

## Not run: 
# This won't work:
new("Partition", c(list("a"), "a", "b", "b", "b"))
p[2] <- 2

## End(Not run)

Pearson Index

Description

Compute the Pearson index

$$\frac{N_{11}N_{00} - N_{10}N_{01}}{N_{21}N_{12}N'_{01}N'_{10}}$$

Usage

pearson(p, q)

## S4 method for signature 'Partition,Partition'
pearson(p, q)

## S4 method for signature 'PairCoefficients,missing'
pearson(p, q)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • pearson(p = Partition, q = Partition): Compute given two partitions

  • pearson(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Pearson K (1926). “On the Coefficient of Racial Likeness.” Biometrika, 18(1/2), 105–117. ISSN 0006-3444, doi:10.2307/2332498.

Examples

isTRUE(all.equal(pearson(new("Partition", c(0, 0, 0, 1, 1)), 
                         new("Partition", c(0, 0, 1, 1, 1))), 1/144))

Peirce Index

Description

Compute the Peirce index

$$\frac{N_{11}N_{00} - N_{10}N_{01}}{N_{21}N'_{01}}$$

Usage

peirce(p, q)

## S4 method for signature 'Partition,Partition'
peirce(p, q)

## S4 method for signature 'PairCoefficients,missing'
peirce(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • peirce(p = Partition, q = Partition): Compute given two partitions

  • peirce(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Peirce CS (1884). “The Numerical Measure of the Success of Predictions.” Science, 4(93), 453–454.

Examples

isTRUE(all.equal(peirce(new("Partition", c(0, 0, 0, 1, 1)), 
                        new("Partition", c(0, 0, 1, 1, 1))), 1/6))
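The expected values 1/144 (Pearson) and 1/6 (Peirce) can be checked by hand. The sketch below plugs in the pair counts of the example partitions, counted manually, and assumes $N'_{01} = N_{00} + N_{01}$ by symmetry with $N'_{10}$; the variable names are illustrative only.

```r
# Pair counts for p = (0, 0, 0, 1, 1) and q = (0, 0, 1, 1, 1), counted by hand.
n11 <- 2; n10 <- 2; n01 <- 2; n00 <- 4

# Compound coefficients (the N'_{01} definition is assumed by symmetry).
n21  <- n11 + n10   # pairs together in P
n12  <- n11 + n01   # pairs together in Q
n10p <- n00 + n10   # N'_{10}
n01p <- n00 + n01   # N'_{01}

numerator     <- n11 * n00 - n10 * n01
pearson_value <- numerator / (n21 * n12 * n01p * n10p)   # 4 / 576 = 1/144
peirce_value  <- numerator / (n21 * n01p)                # 4 / 24  = 1/6
```

Both indices share the numerator $N_{11}N_{00} - N_{10}N_{01}$ and differ only in the normalizing product.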

Compute the projection number of two partitions

Description

Given two partitions $P$ and $Q$, represented as vectors of cluster ids (p, q), compute the projection number: for each cluster of $P$, take its maximum overlap with any cluster of $Q$, and sum these maxima over all clusters of $P$.

Usage

projectionNumber(p, q)

Arguments

p

Partition $P$

q

Partition $Q$

Author(s)

Fabian Ball

See Also

dongensMetric

Examples

isTRUE(all.equal(projectionNumber(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1)), 4))
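The definition translates almost directly into a contingency table. The following base-R sketch uses a hypothetical `projection_number()` helper and is meant to illustrate the definition, not to mirror the package's code.

```r
# Hypothetical helper: sum of maximum cluster overlaps of P's clusters with Q's clusters.
projection_number <- function(p, q) {
  overlap <- table(p, q)        # rows: clusters of P, columns: clusters of Q
  sum(apply(overlap, 1, max))   # best-matching cluster of Q for every cluster of P
}

forward  <- projection_number(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))   # 2 + 2 = 4
# The projection number is not symmetric in its arguments:
coarse_to_fine <- projection_number(c(1, 1, 1, 1), c(1, 1, 2, 2))   # 2
fine_to_coarse <- projection_number(c(1, 1, 2, 2), c(1, 1, 1, 1))   # 2 + 2 = 4
```

The second pair of calls shows why the argument order matters: a single cluster of four objects overlaps each half by 2, while each half is fully contained in the single cluster.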

Rand Index

Description

Compute the Rand index

$$\frac{N_{11} + N_{00}}{N}$$

Usage

randIndex(p, q)

## S4 method for signature 'Partition,Partition'
randIndex(p, q)

## S4 method for signature 'PairCoefficients,missing'
randIndex(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • randIndex(p = Partition, q = Partition): Compute given two partitions

  • randIndex(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Rand WM (1971). “Objective Criteria for the Evaluation of Clustering Algorithms.” Journal of the American Statistical Association, 66(336), 846–850.

Examples

isTRUE(all.equal(randIndex(new("Partition", c(0, 0, 0, 1, 1)),
                           new("Partition", c(0, 0, 1, 1, 1))), 0.6))

Make comparison measures usable with any vectors

Description

The comparison measures expect instances of the class Partition as parameters. If you do not want to explicitly convert an arbitrary vector of class labels (for example, the result of another package's algorithm) into a Partition instance, call this function: it creates methods for all measures that accept "ANY" input, which is implicitly converted to Partition.

Usage

registerPartitionVectorSignatures(e)

Arguments

e

The environment to register the methods in (in most cases, environment() is fine)

Author(s)

Fabian Ball

Examples

library(partitionComparison)
randIndex(new("Partition", c(0, 0, 0, 1, 1)), new("Partition", c(0, 0, 1, 1, 1)))
# [1] 0.6
## Not run: randIndex(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))
# Error in (function (classes, fdef, mtable) :
# unable to find an inherited method for function 'randIndex' for signature '"numeric", "numeric"'
registerPartitionVectorSignatures(environment())
randIndex(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))
# [1] 0.6
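The general S4 pattern behind such a catch-all registration can be sketched with a toy generic. Everything here is hypothetical (`ToyPartition`, `toyAgreement`) and only illustrates how an "ANY" method can convert its input and dispatch again; it is not how the package defines its methods.

```r
library(methods)

# Hypothetical stand-ins, not part of partitionComparison.
setClass("ToyPartition", slots = c(labels = "numeric"))
setGeneric("toyAgreement", function(p, q) standardGeneric("toyAgreement"))

# The measure itself is only defined for the wrapper class ...
setMethod("toyAgreement", signature("ToyPartition", "ToyPartition"),
          function(p, q) mean(p@labels == q@labels))

# ... and a catch-all method converts other input, then dispatches again.
setMethod("toyAgreement", signature("ANY", "ANY"),
          function(p, q) toyAgreement(new("ToyPartition", labels = p),
                                      new("ToyPartition", labels = q)))

agreement <- toyAgreement(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))   # 4 of 5 labels match: 0.8
```

Because S4 prefers the most specific signature, calls with wrapper instances still reach the first method directly, and only plain vectors take the conversion path.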

Rogers & Tanimoto Index

Description

Compute the index of Rogers and Tanimoto

$$\frac{N_{11} + N_{00}}{N_{11} + 2(N_{10} + N_{01}) + N_{00}}$$

Usage

rogersTanimoto(p, q)

## S4 method for signature 'Partition,Partition'
rogersTanimoto(p, q)

## S4 method for signature 'PairCoefficients,missing'
rogersTanimoto(p, q)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • rogersTanimoto(p = Partition, q = Partition): Compute given two partitions

  • rogersTanimoto(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Rogers DJ, Tanimoto TT (1960). “A Computer Program for Classifying Plants.” Science, 132(3434), 1115–1118. ISSN 0036-8075, 1095-9203, doi:10.1126/science.132.3434.1115.

Examples

isTRUE(all.equal(rogersTanimoto(new("Partition", c(0, 0, 0, 1, 1)), 
                                new("Partition", c(0, 0, 1, 1, 1))), 3/7))

Russel & Rao Index

Description

Compute the index of Russel and Rao

$$\frac{N_{11}}{N}$$

Usage

russelRao(p, q)

## S4 method for signature 'Partition,Partition'
russelRao(p, q)

## S4 method for signature 'PairCoefficients,missing'
russelRao(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • russelRao(p = Partition, q = Partition): Compute given two partitions

  • russelRao(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Russel PF, Rao TR (1940). “On Habitat and Association of Species of Anopheline Larvae in South-Eastern Madras.” Journal of the Malaria Institute of India, 3(1), 153–178.

Examples

isTRUE(all.equal(russelRao(new("Partition", c(0, 0, 0, 1, 1)), 
                           new("Partition", c(0, 0, 1, 1, 1))), 0.2))
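The Rand, Rogers & Tanimoto, and Russel & Rao indices documented above all follow from the same four pair counts, which can also be derived from contingency tables instead of enumerating pairs. The sketch below assumes that $N$ denotes the total number of object pairs, $\binom{n}{2}$; all names are illustrative.

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)

# Pair counts via binomial coefficients of cluster (overlap) sizes.
n_pairs <- choose(length(p), 2)         # assumed meaning of N
n11 <- sum(choose(table(p, q), 2))      # pairs together in both partitions
n21 <- sum(choose(table(p), 2))         # pairs together in P (N11 + N10)
n12 <- sum(choose(table(q), 2))         # pairs together in Q (N11 + N01)
n10 <- n21 - n11
n01 <- n12 - n11
n00 <- n_pairs - n11 - n10 - n01

rand_value            <- (n11 + n00) / n_pairs                          # 6 / 10
rogers_tanimoto_value <- (n11 + n00) / (n11 + 2 * (n10 + n01) + n00)    # 6 / 14
russel_rao_value      <- n11 / n_pairs                                  # 2 / 10
```

Rogers & Tanimoto penalizes disagreeing pairs twice as heavily as Rand, while Russel & Rao ignores pairs that are separated in both partitions.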

RV Coefficient

Description

Compute the RV coefficient

$$\frac{n + 2N_{11}(p)}{\sqrt{(2N_{21}(p) + n) (2N_{12}(p) + n)}}$$

Usage

rvCoefficient(p, q)

## S4 method for signature 'Partition,Partition'
rvCoefficient(p, q)

## S4 method for signature 'PairCoefficients,missing'
rvCoefficient(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • rvCoefficient(p = Partition, q = Partition): Compute the RV coefficient given two partitions

  • rvCoefficient(p = PairCoefficients, q = missing): Compute the RV coefficient given the pair coefficients

Author(s)

Fabian Ball

References

Robert P, Escoufier Y (1976). “A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient.” Journal of the Royal Statistical Society. Series C (Applied Statistics), 25(3), 257–265. ISSN 00359254.

Youness G, Saporta G (2004). “Some Measures of Agreement between Close Partitions.” Student, 51, 1–12.

Examples

isTRUE(all.equal(rvCoefficient(new("Partition", c(0, 0, 0, 1, 1)), 
                               new("Partition", c(0, 0, 1, 1, 1))), 9/13))
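One way to read the formula is through co-membership matrices: if $A$ and $B$ are the $n \times n$ indicator matrices of "same cluster" in $P$ and $Q$ (diagonal included), then the sum of $A_{ij}B_{ij}$ equals $n + 2N_{11}$ and the sum of $A_{ij}^2$ equals $n + 2N_{21}$. The base-R sketch below checks this reading on the example partitions; it is an illustration under that assumption, not the package's implementation.

```r
p <- c(0, 0, 0, 1, 1)
q <- c(0, 0, 1, 1, 1)

# Co-membership indicator matrices (1 if two objects share a cluster, diagonal included).
A <- outer(p, p, "==") * 1
B <- outer(q, q, "==") * 1

# Matrix form: <A, B> / sqrt(<A, A> <B, B>) with the elementwise inner product.
rv_matrix <- sum(A * B) / sqrt(sum(A * A) * sum(B * B))

# Pair-count form with n = 5, N11 = 2, N21 = 4, N12 = 4 (counted by hand).
n <- length(p)
rv_counts <- (n + 2 * 2) / sqrt((2 * 4 + n) * (2 * 4 + n))
```

Both expressions evaluate to 9/13 for this example, matching the value checked above.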

Sokal & Sneath Index 1

Description

Compute the index 1 of Sokal and Sneath

$$\frac{1}{4} \left( \frac{N_{11}}{N_{21}} + \frac{N_{11}}{N_{12}} + \frac{N_{00}}{N'_{10}} + \frac{N_{00}}{N'_{01}} \right)$$

Usage

sokalSneath1(p, q)

## S4 method for signature 'Partition,Partition'
sokalSneath1(p, q)

## S4 method for signature 'PairCoefficients,missing'
sokalSneath1(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • sokalSneath1(p = Partition, q = Partition): Compute given two partitions

  • sokalSneath1(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Sokal RR, Sneath PHA (1963). Principles of numerical taxonomy. Freeman, San Francisco.

Examples

isTRUE(all.equal(sokalSneath1(new("Partition", c(0, 0, 0, 1, 1)), 
                             new("Partition", c(0, 0, 1, 1, 1))), 7/12))

Sokal & Sneath Index 2

Description

Compute the index 2 of Sokal and Sneath

$$\frac{N_{11}}{N_{11} + 2(N_{10} + N_{01})}$$

Usage

sokalSneath2(p, q)

## S4 method for signature 'Partition,Partition'
sokalSneath2(p, q)

## S4 method for signature 'PairCoefficients,missing'
sokalSneath2(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • sokalSneath2(p = Partition, q = Partition): Compute given two partitions

  • sokalSneath2(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Sokal RR, Sneath PHA (1963). Principles of numerical taxonomy. Freeman, San Francisco.

Examples

isTRUE(all.equal(sokalSneath2(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1))), 0.2))

Sokal & Sneath Index 3

Description

Compute the index 3 of Sokal and Sneath

$$\frac{N_{11}N_{00}}{\sqrt{N_{21}N_{12}N'_{01}N'_{10}}}$$

Usage

sokalSneath3(p, q)

## S4 method for signature 'Partition,Partition'
sokalSneath3(p, q)

## S4 method for signature 'PairCoefficients,missing'
sokalSneath3(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • sokalSneath3(p = Partition, q = Partition): Compute given two partitions

  • sokalSneath3(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Sokal RR, Sneath PHA (1963). Principles of numerical taxonomy. Freeman, San Francisco.

Examples

isTRUE(all.equal(sokalSneath3(new("Partition", c(0, 0, 0, 1, 1)), 
                              new("Partition", c(0, 0, 1, 1, 1))), 1/3))
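The three Sokal & Sneath indices documented above can be compared side by side as functions of the four pair counts. The helper `sokal_sneath_sketch()` is hypothetical, and $N'_{01} = N_{00} + N_{01}$ is assumed by symmetry with $N'_{10}$.

```r
# Hypothetical helper: all three Sokal & Sneath indices from the four pair counts.
sokal_sneath_sketch <- function(n11, n10, n01, n00) {
  n21  <- n11 + n10   # pairs together in P
  n12  <- n11 + n01   # pairs together in Q
  n10p <- n00 + n10   # N'_{10}
  n01p <- n00 + n01   # N'_{01} (assumed definition)
  c(index1 = (n11 / n21 + n11 / n12 + n00 / n10p + n00 / n01p) / 4,
    index2 = n11 / (n11 + 2 * (n10 + n01)),
    index3 = n11 * n00 / sqrt(n21 * n12 * n01p * n10p))
}

# Counts for p = (0, 0, 0, 1, 1) and q = (0, 0, 1, 1, 1), counted by hand.
ss_values <- sokal_sneath_sketch(n11 = 2, n10 = 2, n01 = 2, n00 = 4)
# index1 = 7/12, index2 = 1/5, index3 = 1/3
```

Note that the plain formulas for index 1 and index 3 divide by zero when one of the compound coefficients is zero, for example when no pair is together in $P$.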

Variation of Information

Description

Compute the variation of information

$$H(P) + H(Q) - 2MI(P, Q)$$

where $MI$ is the mutual information and $H$ the partition entropy.

Usage

variationOfInformation(p, q)

## S4 method for signature 'Partition,Partition'
variationOfInformation(p, q)

Arguments

p

The partition $P$

q

The partition $Q$

Methods (by class)

  • variationOfInformation(p = Partition, q = Partition): Compute given two partitions

Author(s)

Fabian Ball

References

Meila M (2003). “Comparing Clusterings by the Variation of Information.” In Schölkopf B, Warmuth MK (eds.), Learning Theory and Kernel Machines, volume 2777 of Lecture Notes in Computer Science, 173–187. Springer Berlin / Heidelberg. ISBN 978-3-540-40720-1.

Meila M (2007). “Comparing Clusterings–an Information Based Distance.” Journal of Multivariate Analysis, 98(5), 873–895. doi:10.1016/j.jmva.2006.11.013.

See Also

mutualInformation, entropy

Examples

isTRUE(all.equal(variationOfInformation(new("Partition", c(0, 0, 0, 1, 1)),
                                        new("Partition", c(0, 0, 1, 1, 1))),
                                        0.763817))
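The expected value 0.763817 can be reproduced in base R. Since $MI(P, Q) = H(P) + H(Q) - H(P, Q)$, the variation of information equals $2H(P, Q) - H(P) - H(Q)$, which the sketch below evaluates with the natural logarithm (an assumption that is consistent with the value above). The helper names are hypothetical.

```r
# Shannon entropy (natural logarithm) of a probability vector; empty cells are skipped.
shannon <- function(prob) {
  prob <- prob[prob > 0]
  -sum(prob * log(prob))
}

# Hypothetical helper: variation of information of two label vectors.
vi_sketch <- function(p, q) {
  n <- length(p)
  h_joint <- shannon(as.vector(table(p, q)) / n)   # H(P, Q)
  h_p <- shannon(as.vector(table(p)) / n)          # H(P)
  h_q <- shannon(as.vector(table(q)) / n)          # H(Q)
  2 * h_joint - h_p - h_q
}

vi_value <- vi_sketch(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))   # approximately 0.763817
```

Identical partitions give a value of 0, in line with the measure being a distance.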

Wallace I

Description

Compute Wallace's index I

$$\frac{N_{11}}{N_{21}}$$

Usage

wallaceI(p, q)

## S4 method for signature 'Partition,Partition'
wallaceI(p, q)

## S4 method for signature 'PairCoefficients,missing'
wallaceI(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • wallaceI(p = Partition, q = Partition): Compute given two partitions

  • wallaceI(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Wallace DL (1983). “A Method for Comparing Two Hierarchical Clusterings: Comment.” Journal of the American Statistical Association, 78(383), 569–576.

See Also

folwkesMallowsIndex

Examples

isTRUE(all.equal(wallaceI(new("Partition", c(0, 0, 0, 1, 1)), 
                          new("Partition", c(0, 0, 1, 1, 1))), 0.5))

Wallace II

Description

Compute Wallace's index II

$$\frac{N_{11}}{N_{12}}$$

Usage

wallaceII(p, q)

## S4 method for signature 'Partition,Partition'
wallaceII(p, q)

## S4 method for signature 'PairCoefficients,missing'
wallaceII(p, q = NULL)

Arguments

p

The partition $P$ or an instance of PairCoefficients

q

The partition $Q$ or NULL

Methods (by class)

  • wallaceII(p = Partition, q = Partition): Compute given two partitions

  • wallaceII(p = PairCoefficients, q = missing): Compute given the pair coefficients

Author(s)

Fabian Ball

References

Wallace DL (1983). “A Method for Comparing Two Hierarchical Clusterings: Comment.” Journal of the American Statistical Association, 78(383), 569–576.

See Also

folwkesMallowsIndex

Examples

isTRUE(all.equal(wallaceII(new("Partition", c(0, 0, 0, 1, 1)), 
                           new("Partition", c(0, 0, 1, 1, 1))), 0.5))
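In the example above both Wallace indices equal 0.5 because the two partitions have the same number of co-clustered pairs. A case where they differ makes the asymmetry visible. The base-R sketch below uses a hypothetical `wallace_sketch()` helper on plain label vectors.

```r
# Hypothetical helper: Wallace I (N11 / N21) and Wallace II (N11 / N12) from label vectors.
wallace_sketch <- function(p, q) {
  pairs  <- upper.tri(diag(length(p)))   # each unordered pair exactly once
  same_p <- outer(p, p, "==")[pairs]     # pair is together in P
  same_q <- outer(q, q, "==")[pairs]     # pair is together in Q
  n11 <- sum(same_p & same_q)
  c(I = n11 / sum(same_p), II = n11 / sum(same_q))
}

symmetric_case  <- wallace_sketch(c(0, 0, 0, 1, 1), c(0, 0, 1, 1, 1))   # I = 0.5, II = 0.5
asymmetric_case <- wallace_sketch(c(0, 0, 0, 0), c(0, 0, 1, 1))         # I = 1/3, II = 1
```

In the asymmetric case every pair that is together in Q is also together in P, so index II reaches 1, while only 2 of the 6 pairs together in P survive in Q.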