SoftMax activation function
What's the purpose of the softmax function? What sort of problems is it good for? Where exactly is it supposed to go? (I read somewhere that it is only supposed to be applied to the output layer.) Can we do without it?
Alex
I reinvented the softmax function while trying to estimate the probability distribution over chess moves in a given position. I assign a score to each move using a traditional alpha-beta search and then convert the scores to probabilities using softmax. You can then train a few weights in the evaluation function so that the resulting distribution matches a training database as closely as possible.
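To make that concrete, here is a minimal Python sketch of the score-to-probability step. The move scores are made-up numbers, not output from a real search, and the negative log-likelihood loss is only one common way to measure how close the distribution is to a database move; it is an assumption on my part, not necessarily the loss used in the program described above.

```python
import math

def softmax(scores):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def negative_log_likelihood(scores, played_index):
    """Loss for one position: -log of the probability given to the move
    that was actually played (assumed training objective)."""
    return -math.log(softmax(scores)[played_index])

# Hypothetical search scores (in pawns) for three candidate moves.
move_scores = [0.5, 0.2, -1.0]
probs = softmax(move_scores)

# Suppose the database says the first move was played in this position.
loss = negative_log_likelihood(move_scores, played_index=0)
```

Higher-scoring moves get higher probabilities, and the loss shrinks as the played move's probability approaches 1, so minimizing it over many positions pulls the evaluation weights toward the database.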
I do this to decide which branches to explore deeper in an experimental chess program, but then I found out that other people have used the same technique to try to quantify playing styles; see this article.
Anyway, I didn't know until I did a Google search today that this thing was called softmax, but it makes sense. If you multiply all your scores by a large number, you get a probability distribution that just picks the maximum.
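A quick way to see the "large multiplier picks the maximum" effect is to run softmax on the same scores with different scale factors. This is just an illustrative sketch; `scaled_softmax` and the `scale` parameter are names I made up for the example, and the scores are arbitrary.

```python
import math

def scaled_softmax(scores, scale=1.0):
    """Softmax applied to scores multiplied by `scale` (hypothetical helper)."""
    m = max(scores)  # shift by the max for numerical stability
    exps = [math.exp(scale * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [0.5, 0.2, -1.0]  # arbitrary example scores

# Print the distribution for increasingly large multipliers.
for scale in (0.0, 1.0, 10.0, 100.0):
    print(scale, [round(p, 3) for p in scaled_softmax(scores, scale)])
```

With a scale of 0 every option is equally likely, and as the scale grows nearly all of the probability mass moves to the highest score, which is why the function is a "soft" version of picking the max.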