undersampling.core

ClassPurityMaximization

class ClassPurityMaximization extends Algorithm

Class Purity Maximization algorithm. Original paper: "An Unsupervised Learning Approach to Resolving the Data Imbalanced Issue in Supervised Learning Problems in Functional Genomics" by Kihoon Yoon and Stephen Kwek.

Linear Supertypes
Algorithm, AnyRef, Any

Instance Constructors

  1. new ClassPurityMaximization(data: Data, seed: Long = System.currentTimeMillis(), minorityClass: Any = -1)

    data

    data to work with

    seed

    seed to use. If it is not provided, the current system time is used

    minorityClass

    indicates the minority class. If it is set to -1, the class with the fewest instances is used
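
The default for minorityClass can be illustrated with a small sketch. It assumes the class labels are available as a plain sequence; MinorityClassSketch and resolveMinorityClass are hypothetical names used only for illustration and are not part of the library, whose internal handling may differ.

```scala
object MinorityClassSketch {
  // Hypothetical helper, not part of the library: decides which label is
  // treated as the minority class, following the documented rule for -1.
  // Assumes `labels` is non-empty.
  def resolveMinorityClass(labels: Seq[Any], minorityClass: Any = -1): Any =
    if (minorityClass == -1) {
      // Count the instances of each label and pick the least frequent one.
      labels
        .groupBy(identity)
        .map { case (label, instances) => (label, instances.size) }
        .minBy(_._2)
        ._1
    } else {
      // An explicit value is used as given.
      minorityClass
    }
}
```

If two classes are tied for the fewest instances, this sketch does not guarantee which one is returned.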

Value Members

  1. def sample(file: Option[String] = None, distance: Distance = Distances.EUCLIDEAN): Data

    Undersampling method based on ClassPurityMaximization clustering

    file

    file to store the log. If it is set to None, no log is written

    distance

    distance to use when calling the NNRule algorithm

    returns

    Data structure with all the important information
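
The distance parameter is passed to the nearest neighbour rule. The sketch below shows, under stated assumptions, how a 1-nearest-neighbour rule with a pluggable distance could look. NNRuleSketch, DistanceFn, euclidean and nnRule are hypothetical names; the library's own Distance type and NNRule implementation are not reproduced here and may work differently.

```scala
object NNRuleSketch {
  // Assumption: a distance is modelled as a plain function over feature
  // vectors of equal length. This stands in for the library's Distance type.
  type DistanceFn = (Array[Double], Array[Double]) => Double

  // Euclidean distance: square root of the sum of squared differences.
  val euclidean: DistanceFn = (a, b) =>
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // 1-nearest-neighbour rule: return the label of the training point that is
  // closest to `query`. On ties the earlier training point wins.
  // Assumes `train` is non-empty.
  def nnRule(
      train: Seq[(Array[Double], Any)],
      query: Array[Double],
      distance: DistanceFn = euclidean
  ): Any =
    train.reduceLeft { (best, candidate) =>
      if (distance(best._1, query) <= distance(candidate._1, query)) best
      else candidate
    }._2
}
```

Passing a different function as the third argument swaps the metric without changing the rule itself, which mirrors how sample takes its distance as a parameter with a Euclidean default.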