Class LossMixtureDensity

    • Constructor Detail

      • LossMixtureDensity

        public LossMixtureDensity()
    • Method Detail

      • computeScore

        public double computeScore​(INDArray labels,
                                   INDArray preOutput,
                                   IActivation activationFn,
                                   INDArray mask,
                                   boolean average)
Computes the aggregate score as the sum of the individual scores of each label against the corresponding network output. For the mixture density network, this is the negative log likelihood that the given labels fall within the probability distribution described by the mixture of Gaussians that the network outputs.
        Specified by:
        computeScore in interface ILossFunction
        Parameters:
        labels - Labels to score against the network.
        preOutput - Output of the network (before activation function has been called).
        activationFn - Activation function for the network.
        mask - Mask to be applied to labels (not used for MDN).
average - Whether to return an average instead of the total score (not used).
        Returns:
A single double corresponding to the total score over all label values.
      • computeScoreArray

        public INDArray computeScoreArray​(INDArray labels,
                                          INDArray preOutput,
                                          IActivation activationFn,
                                          INDArray mask)
This method returns the score for each of the given outputs against the given set of labels. For a mixture density network, this is done by extracting the "alpha", "mu", and "sigma" components of each Gaussian and computing the negative log likelihood that the labels fall under the mixture of these Gaussian distributions. The smaller the negative log likelihood, the more likely it is that the given labels were actually drawn from the distribution; minimizing the negative log likelihood therefore moves the mixture toward the parameters that best explain the observed data.
        Specified by:
        computeScoreArray in interface ILossFunction
        Parameters:
labels - Labels giving the sample output that the network should be trying to converge on.
        preOutput - The output of the last layer (before applying the activation function).
        activationFn - The activation function of the current layer.
        mask - Mask to apply to score evaluation (not supported for this cost function).
Returns:
An INDArray containing the negative log likelihood score for each example.
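The score described above can be sketched in plain Java, independent of ND4J. This is a hypothetical illustration of the formula, not the library's implementation: for a single 1-D label y and per-mixture parameters alpha, mu, and sigma, the score is the negative log of the mixture density evaluated at y.

```java
// Hypothetical sketch (pure Java, no ND4J) of the per-example score of a
// mixture density network: NLL(y) = -ln( sum_k alpha_k * N(y; mu_k, sigma_k) ).
public class MdnScoreSketch {

    static double negLogLikelihood(double y, double[] alpha, double[] mu, double[] sigma) {
        double p = 0.0;
        for (int k = 0; k < alpha.length; k++) {
            double z = (y - mu[k]) / sigma[k];
            // Gaussian density N(y; mu_k, sigma_k), weighted by the mixing coefficient.
            p += alpha[k] * Math.exp(-0.5 * z * z) / (Math.sqrt(2.0 * Math.PI) * sigma[k]);
        }
        return -Math.log(p);
    }

    public static void main(String[] args) {
        double[] alpha = {0.5, 0.5};
        double[] mu    = {-1.0, 1.0};
        double[] sigma = {0.5, 0.5};
        // A label near a mixture mode scores lower (better) than one far from both modes.
        System.out.println(negLogLikelihood(1.0, alpha, mu, sigma));
        System.out.println(negLogLikelihood(5.0, alpha, mu, sigma));
    }
}
```

Note how a label close to one of the component means yields a much smaller score than a label in the tails, which is exactly the behavior the minimization exploits.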
      • computeGradient

        public INDArray computeGradient​(INDArray labels,
                                        INDArray preOutput,
                                        IActivation activationFn,
                                        INDArray mask)
        This method returns the gradient of the cost function with respect to the output from the previous layer. For this cost function, the gradient is derived from Bishop's paper "Mixture Density Networks" (1994) which gives an elegant closed-form expression for the derivatives with respect to each of the output components.
        Specified by:
        computeGradient in interface ILossFunction
        Parameters:
        labels - Labels to train on.
        preOutput - Output of neural network before applying the final activation function.
        activationFn - Activation function of output layer.
        mask - Mask to apply to gradients.
        Returns:
        Gradient of cost function with respect to preOutput parameters.
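Bishop's closed-form derivatives can likewise be sketched for the 1-D case. This is a hedged illustration of the paper's formulas under the usual parameterization (alpha via softmax pre-activations zAlpha, sigma via exp of zSigma), not the library's actual gradient code; the class and method names are hypothetical.

```java
// Hypothetical sketch of Bishop's (1994) closed-form MDN gradients for one
// 1-D label y and K mixtures, with alpha = softmax(zAlpha) and sigma = exp(zSigma).
public class MdnGradientSketch {

    static double gaussian(double y, double mu, double sigma) {
        double z = (y - mu) / sigma;
        return Math.exp(-0.5 * z * z) / (Math.sqrt(2.0 * Math.PI) * sigma);
    }

    // Posterior responsibility pi_k = alpha_k N_k / sum_j alpha_j N_j.
    static double[] responsibilities(double y, double[] alpha, double[] mu, double[] sigma) {
        int n = alpha.length;
        double[] pi = new double[n];
        double total = 0.0;
        for (int k = 0; k < n; k++) {
            pi[k] = alpha[k] * gaussian(y, mu[k], sigma[k]);
            total += pi[k];
        }
        for (int k = 0; k < n; k++) pi[k] /= total;
        return pi;
    }

    // dE/dzAlpha_k = alpha_k - pi_k   (gradient w.r.t. the softmax pre-activation)
    static double[] gradZAlpha(double[] alpha, double[] pi) {
        double[] g = new double[pi.length];
        for (int k = 0; k < pi.length; k++) g[k] = alpha[k] - pi[k];
        return g;
    }

    // dE/dmu_k = pi_k * (mu_k - y) / sigma_k^2
    static double[] gradMu(double y, double[] pi, double[] mu, double[] sigma) {
        double[] g = new double[pi.length];
        for (int k = 0; k < pi.length; k++)
            g[k] = pi[k] * (mu[k] - y) / (sigma[k] * sigma[k]);
        return g;
    }

    // dE/dzSigma_k = pi_k * (1 - (y - mu_k)^2 / sigma_k^2)   (sigma = exp(zSigma))
    static double[] gradZSigma(double y, double[] pi, double[] mu, double[] sigma) {
        double[] g = new double[pi.length];
        for (int k = 0; k < pi.length; k++) {
            double z2 = (y - mu[k]) * (y - mu[k]) / (sigma[k] * sigma[k]);
            g[k] = pi[k] * (1.0 - z2);
        }
        return g;
    }
}
```

In the symmetric case (equal alphas and sigmas, means at -1 and 1, label at 0) the responsibilities are equal, the alpha gradient vanishes, and the mu gradients pull the two means toward the label with equal and opposite force, which matches the intuition behind the closed form.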
      • getNMixtures

        public int getNMixtures()
Returns the number of Gaussians this loss function will attempt to fit.
Returns:
Number of Gaussians to fit.
      • getLabelWidth

        public int getLabelWidth()
        Returns the width of each label vector.
        Returns:
Expected width of each label vector.
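These two accessors together determine how wide the network's output layer must be. Assuming the common MDN parameterization of one mixing coefficient and one isotropic sigma per component, plus one mu per label dimension (an assumption — verify against your version of the library), the relationship can be sketched as:

```java
// Hypothetical sketch of the assumed output layout: for each of n mixtures,
// one alpha, one (isotropic) sigma, and labelWidth mus, so the output layer
// needs n * (labelWidth + 2) units. Verify against your library version.
public class MdnLayoutSketch {
    static int expectedOutputWidth(int nMixtures, int labelWidth) {
        return nMixtures * (labelWidth + 2);
    }
}
```

For example, under this assumption a network fitting 5 Gaussians to 3-dimensional labels would need an output layer of width 25.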