class UndersamplingBasedClustering extends Algorithm
Undersampling Based on Clustering (SBC) algorithm. Original paper: "Under-Sampling Approaches for Improving Prediction of the Minority Class in an Imbalanced Dataset" by Show-Jane Yen and Yue-Shi Lee.
Instance Constructors
-
new
UndersamplingBasedClustering(data: Data, seed: Long = System.currentTimeMillis(), minorityClass: Any = -1)
- data
dataset to undersample
- seed
seed for the random number generator. If it is not provided, the current system time is used
- minorityClass
label of the minority class. If it is set to -1, the class with the fewest instances is used
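The default behaviour of minorityClass can be illustrated with a minimal sketch. It assumes the class labels are available as a plain sequence; resolveMinorityClass is a hypothetical helper written for illustration and is not part of the library.

```scala
object MinorityClassSketch {
  // Hypothetical helper (not part of the library): mimics the documented
  // default, where -1 means "use the class with the fewest instances".
  def resolveMinorityClass(labels: Seq[Any], minorityClass: Any = -1): Any =
    if (minorityClass == -1)
      // Group identical labels and pick the smallest group.
      labels.groupBy(identity).minBy(_._2.size)._1
    else
      minorityClass
}
```

For example, with labels Seq(0, 0, 0, 1, 1) and the default argument, the sketch returns 1. If two classes are tied for the smallest size, which one this sketch picks is unspecified.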
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate() @throws( ... )
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @HotSpotIntrinsicCandidate()
-
def
sample(file: Option[String] = None, method: String = "random", m: Double = 1.0, k: Int = 3, numClusters: Int = 50, restarts: Int = 1, minDispersion: Double = 0.0001, maxIterations: Int = 200): Data
Undersampling method based on SBC
- file
file in which to store the log. If it is set to None, no log is written
- method
selection method to apply. Possible options: random, NearMiss1, NearMiss2, NearMiss3, MostDistant and MostFar
- m
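ratio of majority to minority instances (m:1) used in the SSize calculation, which sets how many majority instances are selected from each cluster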
- k
number of neighbours to use when applying the k-NN rule (typically 3)
- numClusters
number of clusters to be created by the KMeans algorithm
- restarts
number of times to relaunch the KMeans algorithm
- minDispersion
stop the KMeans algorithm if the dispersion is lower than this value
- maxIterations
maximum number of iterations to be run by the KMeans algorithm
- returns
Data structure holding the result of the undersampling
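How m enters the SSize calculation can be sketched as follows. This is a minimal illustration of the per-cluster formula from the Yen and Lee paper, SSize_i = (m * Size_MI) * (Size_MA_i / Size_MI_i) / sum_j (Size_MA_j / Size_MI_j), where a cluster with no minority instances is counted as having one. The object and method names are hypothetical, and the rounding and the cap at the cluster's majority count are assumptions of this sketch; the library's implementation may differ.

```scala
object SSizeSketch {
  // Hypothetical helper (not part of the library).
  // majority(i) and minority(i) are the numbers of majority and minority
  // instances in cluster i; m is the desired majority:minority ratio.
  def sSizes(majority: Seq[Int], minority: Seq[Int], m: Double): Seq[Int] = {
    require(majority.length == minority.length, "one count pair per cluster")
    val totalMinority = minority.sum
    // Majority/minority ratio per cluster; an empty minority counts as 1.
    val ratios = majority.zip(minority).map { case (ma, mi) =>
      ma.toDouble / math.max(mi, 1)
    }
    val ratioSum = ratios.sum
    ratios.zip(majority).map { case (r, ma) =>
      // Assumption: round to the nearest integer and never ask for more
      // majority instances than the cluster actually holds.
      math.min(ma, math.round(m * totalMinority * r / ratioSum).toInt)
    }
  }
}
```

With three clusters holding 500, 300 and 200 majority instances and 10, 50 and 40 minority instances, and m = 1.0, the sketch yields Seq(82, 10, 8): clusters dominated by the majority class contribute more majority instances. The method parameter then decides which instances are picked within each cluster.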
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @native() @throws( ... )