Eliezer Silva

Probabilistic modelling, machine learning and information retrieval

Month: May 2015

Exercise: gradient of soft-max error function

May 7, 2015 (updated January 17, 2016) · Eliezer Silva

The soft-max regression model can be used for the k-class classification problem. The model outputs a probability distribution over the k classes, so the activation function h_\theta(x^{(i)}) is given by:

h_\theta(x^{(i)}) = \begin{bmatrix} p(y^{(i)} = 1 | x^{(i)}; \theta) \\ p(y^{(i)} = 2 | x^{(i)}; \theta) \\ \vdots \\ p(y^{(i)} = k | x^{(i)}; \theta) \end{bmatrix} = \frac{1}{ \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} }} } \begin{bmatrix} e^{ \theta_1^T x^{(i)} } \\ e^{ \theta_2^T x^{(i)} } \\ \vdots \\ e^{ \theta_k^T x^{(i)} } \\ \end{bmatrix}
The probability of a single class n is then:
h_{\theta_n}(x^{(i)}) = p(y^{(i)} = n | x^{(i)}; \theta) = \frac{ e^{ \theta_n^T x^{(i)} } }{ \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} }} }
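
As a quick sanity check of the activation above, here is a minimal NumPy sketch (the function name softmax, the (k, d) layout of theta, and the max-shift for numerical stability are my choices, not from the post):

```python
import numpy as np

def softmax(theta, x):
    """h_theta(x): length-k vector of class probabilities p(y = n | x; theta).

    theta : (k, d) array, row theta_n holds the parameters of class n
    x     : (d,) feature vector x^(i)
    """
    z = theta @ x       # logits theta_n^T x^(i), shape (k,)
    z = z - z.max()     # subtracting a constant leaves the ratios unchanged
    e = np.exp(z)       # e^{theta_n^T x^(i)}
    return e / e.sum()  # divide by sum_j e^{theta_j^T x^(i)}
```

Each entry of the returned vector is h_{\theta_n}(x^{(i)}) from the component formula above, and the entries sum to 1.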

The error function is given by:
J(\theta) = - \frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} \delta_{y^{(i)}j} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^k e^{ \theta_l^T x^{(i)} }}\right]=- \frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} \delta_{y^{(i)}j} \log( h_{\theta_j}(x^{(i)}) )\right]

where \delta_{y^{(i)}j} is the Kronecker delta: it equals 1 when y^{(i)} = j and 0 otherwise, so only the log-probability of the true class contributes for each example.
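
Plugging the activation into this error function gives a direct (if unvectorized) implementation. This is a sketch assuming 0-based integer labels y^{(i)} in {0, ..., k-1} rather than the 1-based labels of the formula:

```python
def J(theta, X, y):
    """Averaged negative log-likelihood over m examples.

    X : (m, d) design matrix, row i is x^(i)
    y : (m,) integer labels in {0, ..., k-1}
    """
    m = X.shape[0]
    total = 0.0
    for i in range(m):
        h = softmax(theta, X[i])  # class probabilities for example i
        total += np.log(h[y[i]])  # the delta keeps only the true-class term
    return -total / m
```

Because of the Kronecker delta, the inner sum over j collapses to the single term \log h_{\theta_{y^{(i)}}}(x^{(i)}), which is exactly what the loop accumulates.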

Continue reading “Exercise: gradient of soft-max error function” →

Posted in Navigate · Tagged: calculation, exercise, math, neuralnets
