Mutual information with absolute dependency for feature selection in machine learning models
Abstract:
Systems and techniques are provided for determining mutual information with absolute dependency for feature selection. Items may be received from a dataset. Each item may include two random variables. A first random variable may be associated with a first range of discrete values, and a second random variable may be associated with a second range of discrete values. Mutual information between the two random variables may be determined according to one of

$$I(X,Y) = \sum_{x \in X} \sum_{y \in Y} \left| \, p(x,y) \cdot \log\left(\frac{p(x,y)}{p(x) \cdot p(y)}\right) \right|$$

and

$$I(X,Y) = \sum_{x \in X} \sum_{y \in Y} \left| \, p(y) \cdot \log\left(\frac{p(x,y)}{p(x) \cdot p(y)}\right) \right|,$$

where I(X,Y) may be the mutual information between X and Y, x may be a value for X, y may be a value for Y, p(x,y) may be a joint probability distribution function of x and y, p(x) may be a marginal probability distribution function of x, and p(y) may be a marginal probability distribution function of y. The mutual information may be used in a machine learning system to predict a value for one of the random variables for an item for which the value is unknown.
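As a concrete illustration, the following is a minimal sketch of the first, p(x,y)-weighted variant of the absolute-dependency mutual information above, not the patented implementation itself; the function name `absolute_dependency_mi`, the NumPy-based approach, and the example joint distribution are assumptions introduced here for illustration only.

```python
import numpy as np

def absolute_dependency_mi(joint):
    """Mutual information with absolute dependency between two discrete
    random variables X and Y (illustrative sketch, not the patented method).

    joint: 2-D array where joint[i, j] = p(X = x_i, Y = y_j).
    Returns sum over x, y of |p(x, y) * log(p(x, y) / (p(x) * p(y)))|,
    i.e. the first variant in the abstract; the second variant would
    weight each term by p(y) instead of p(x, y).
    """
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x), shape (|X|, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y), shape (1, |Y|)
    # Skip cells with zero joint probability; those terms are
    # conventionally treated as contributing 0.
    mask = joint > 0
    ratio = joint[mask] / (px * py)[mask]
    return np.abs(joint[mask] * np.log(ratio)).sum()

# Usage example: a joint distribution over two binary variables.
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
print(absolute_dependency_mi(joint))
```

Taking the absolute value of each summand means that terms where p(x,y) < p(x)·p(y) (negative log-ratio) add to the score rather than cancel against positive terms, so any deviation from independence increases the measured dependency.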