Sunday, 1 January 2017

Errata #4 .. Lots of Updates

I've been lucky to have readers that take the time to provide feedback, error fixes, and suggestions for things that could be made clearer.

I am really pleased that this happens - it means people are interested, that they care, and want to share their insights.

A few suggestions had built up over recent weeks - and I've updated the content. This is a bigger update than normal.


Thanks

Thanks go to Prof A Abu-Hanna, "His Divine Shadow", Andy, Joshua, Luther, ... and many others who provided valuable ideas and fixes for errors, including in the blog comments sections.


Key Updates

Some of the key updates worth mentioning are:
  • Error in the calculus introduction appendix, in the example explaining how to differentiate $s = t^3$. The second line of working out on page 204 shows $\frac{6 t^2 \Delta x + 4 \Delta x^3}{2\Delta x}$, which should be $\frac{6 t^2 \Delta x + 2 \Delta x^3}{2\Delta x}$. That 4 should be a 2.
  • Another error in the calculus appendix, in the section on functions of functions ... it showed $(x^2 + x)$, which should have been $(x^3 + x)$.
  • Small error on page 65 where $w_{3,1}$ is said to be 0.1 when it should be 0.4. 
  • Page 99 shows the summarised update expression as $\Delta{w_{jk}} = \alpha \cdot sigmoid(O_k) \cdot (1 - sigmoid(O_k)) \cdot O_j^T$ .. it should have been the much simpler ..
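The first correction in the list above (the $s = t^3$ working) can be checked with a few lines of Python. This is only a sketch, and it assumes the book's working uses a symmetric difference, $(t + \Delta x)^3 - (t - \Delta x)^3$, which is what the $2\Delta x$ denominator suggests. The function names are made up for illustration.

```python
# Check the corrected numerator for differentiating s = t^3.
# Assumption: the working uses a symmetric difference,
# (t + dx)^3 - (t - dx)^3, as the 2*dx denominator suggests.

def difference_numerator(t, dx):
    # The numerator expanded directly from the definition.
    return (t + dx) ** 3 - (t - dx) ** 3

def corrected_numerator(t, dx):
    # The corrected expression: 6 t^2 dx + 2 dx^3
    return 6 * t**2 * dx + 2 * dx**3

def misprinted_numerator(t, dx):
    # The misprinted expression: 6 t^2 dx + 4 dx^3
    return 6 * t**2 * dx + 4 * dx**3

# Integers keep the arithmetic exact.
t, dx = 3, 2
print(difference_numerator(t, dx) == corrected_numerator(t, dx))   # True
print(difference_numerator(t, dx) == misprinted_numerator(t, dx))  # False
```

Dividing the corrected numerator by $2\Delta x$ gives $3t^2 + \Delta x^2$, which becomes the familiar $3t^2$ as $\Delta x$ shrinks towards zero.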




Worked Examples Using Output Errors - Updated!

A few readers noticed that the output error used in the worked example illustrating the weight update process is not realistic.

Why? How? Here is an example diagram used in the book.


The output error from the first output layer node (top right) is shown as 1.5. Since the output of that node is the output from a sigmoid function it must be between 0 and 1 (and not including 0 or 1). The target values must also be within this range. That means the error .. the difference between actual and target values .. can't be as large as 1.5. The error can't be bigger than 0.99999... at the very worst. That's why $e_1 = 1.5$ is unrealistic.
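A small Python sketch makes the point concrete. The target values and input sums below are made up for illustration and are not taken from the book. The error is taken as target minus actual output.

```python
import math

def sigmoid(x):
    # Logistic function: output lies strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical targets inside (0, 1) and hypothetical summed inputs.
targets = [0.01, 0.5, 0.99]
inputs = [-5.0, -1.0, 0.0, 1.0, 5.0]

# Error = target - actual, for every combination of target and input.
errors = [t - sigmoid(x) for t in targets for x in inputs]
largest = max(abs(e) for e in errors)

print(largest < 1.0)  # True: the error magnitude never reaches 1
```

Even with a target of 0.99 and an output close to 0, the error stays below 1, so a value like 1.5 can't come out of a network like this.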

The calculations illustrating how we do backpropagation are still ok. The error values were chosen at random ... but it would be better if we had chosen a more realistic error.

The examples in the book have been updated with a new, more realistic output error of 0.8.


Updated Book

The book will be updated with these fixes as soon as the Appendix on how to run the neural networks and MNIST challenge on the Raspberry Pi Zero is updated too - the Raspbian software has seen quite a few updates and probably doesn't need the workarounds described there.

6 comments:

  1. Not sure if this error has been taken into account already. On Page 98, it says that weights are increased when the slope is positive. It should be the other way around.

  2. You are right! I've made the correction to the book.

    For other readers .. here is the text:



    new_w = old_w - ( a * dE/dw )

    The updated weight w_jk is the old weight adjusted by the negative of the error slope we just worked out. It’s negative because we want to increase the weight if we have a positive slope, and decrease it if we have a negative slope, as we saw earlier. The symbol alpha, 𝛂, is a factor which moderates the strength of these changes to make sure we don’t overshoot. It’s often called a learning rate.

    That should say "we want to decrease the weight if we have a positive slope, and increase it if we have a negative slope". The equation itself is correct.

    Replies
    1. Thank you this is what I have been stuck with!

  3. This comment has been removed by the author.

    ReplyDelete
  4. thank you this is what I have been stuck with!!

  5. Thank you for getting in touch ... and helping everyone else too!

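The sign correction discussed in the comments above (page 98) can be illustrated with a minimal Python sketch of the update rule new_w = old_w - a * dE/dw. The function name, the starting weight, the slopes and the learning rate are all made up for illustration.

```python
def update_weight(old_w, slope, learning_rate=0.1):
    # new_w = old_w - a * dE/dw
    # A positive slope reduces the weight, a negative slope increases it.
    return old_w - learning_rate * slope

w = 0.5  # hypothetical starting weight

w_after_positive_slope = update_weight(w, slope=2.0)
w_after_negative_slope = update_weight(w, slope=-2.0)

print(w_after_positive_slope < w)  # True: positive slope, weight decreases
print(w_after_negative_slope > w)  # True: negative slope, weight increases
```

The minus sign is what makes the weight move downhill on the error, whichever side of the minimum it starts on.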