Predictor

Members
  • Content count

    285

Community Reputation

198 Neutral

About Predictor

  • Rank
    Member
  1. Chess AI with Neural Networks

    Quote:Original post by Eralp If you would code a chess AI with NN how would you format the input and the output?

    Directly feeding the piece positions to a neural network is unlikely to be useful. If you really want to apply neural networks to chess, I suggest using a conventional game tree search, with a neural network as the evaluation function. Instead of feeding it the raw board state, though, I'd recommend using a number of hand-crafted features, such as: the difference between opponent and self in the number of possible moves (a crude measure of mobility); whether opponent or self controls the center 4 squares; the difference between opponent and self in a standard material measure (queen = 9, rook = 5, etc.); and so forth.

    An additional challenge with this approach is that no direct performance feedback is available. Most artificial neural networks are constructed via supervised learning, in which errors are fed directly to the learning mechanism. When playing games, the win or loss is only known at the end of the game (after many recalls of the neural network model).
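A minimal sketch of the kind of hand-crafted feature vector described above, in Python. Everything here is an illustrative assumption rather than a fixed design: the board is a hypothetical dict from square name to piece letter (uppercase for self, lowercase for the opponent), the move counts are passed in because move generation is out of scope, the sign convention is "self minus opponent", and simple occupancy of the four center squares stands in as a crude proxy for control.

```python
# Standard material values; kings score 0 because each side always has one.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}
CENTER_SQUARES = ("d4", "e4", "d5", "e5")


def material(board, own):
    """Sum piece values for one side (uppercase = self, lowercase = opponent)."""
    total = 0
    for piece in board.values():
        if piece.isupper() == own:
            total += PIECE_VALUES.get(piece.lower(), 0)
    return total


def evaluation_features(board, own_moves, opponent_moves):
    """Build [material, center, mobility] differences, each as self minus opponent."""
    material_diff = material(board, True) - material(board, False)

    center_diff = 0
    for square in CENTER_SQUARES:
        piece = board.get(square)
        if piece is not None:
            center_diff += 1 if piece.isupper() else -1

    mobility_diff = own_moves - opponent_moves
    return [material_diff, center_diff, mobility_diff]


if __name__ == "__main__":
    board = {"e1": "K", "d1": "Q", "e4": "P", "e8": "k", "a8": "r", "d5": "p"}
    print(evaluation_features(board, own_moves=30, opponent_moves=25))
```

A vector like this would be the input to the network at each leaf of the game tree search; the network's single output would be the position score.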
  2. Quote:Original post by Spacepeon I am looking for some background articles on the use of neural networks to provide an evaluation function in a two-player board game. The main reason is that I recently found this link: http://keldon.net/bluemoon/ where you can find a very good AI for a card game called Blue Moon. On its turn, the AI tries every possible move and evaluates the resulting situations with a neural network. What surprised me is the way the networks are trained, with only minimal prior knowledge of the strategy, by having two networks play a lot of games against each other. (I was also impressed by the efficiency of this approach: the AI is not an easy opponent!) So do you know of any other similar works? Is this kind of training commonly used?

    There have been a number of efforts along these lines. Look for systems like BOXES and Adaptive Critic.
  3. Quote:Original post by alvaro Quote:Original post by Predictor [...sensible stuff that I agree with completely...] Also, consider that "convergence" of the training process is likely not necessary, nor even desirable: by the time the training process quits, you have likely overfit the data. Better results can be had through early stopping or constraining the number of hidden nodes in the model. I have never been very convinced by early stopping. I fully appreciate how much of a danger overfitting is, and the natural solution to me seems to be to reduce the number of free parameters that make up the model (i.e., fewer hidden nodes). If the model has more parameters than what the data grants, won't early stopping give us a function that still has too many "wrinkles", except now they are random instead of overfit? Have you had good experiences using early stopping? Perhaps there is some way of looking at it that I am missing?

    I understand intuitively what you're saying, but all I can tell you is that I have had good experiences with early stopping and that I know several authors who suggest it as well. The flip side, of course, is that exploring the effect of the number of hidden nodes takes more time to compute than having "too many" and stopping early, though I suppose this might not be too bad, especially on today's hardware.
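For readers unfamiliar with the early stopping discussed above, here is a minimal Python sketch of one common form of it. It assumes a held-out validation set whose error is measured after every training epoch; the function name and the `patience` parameter are illustrative choices, not something taken from the thread.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the index of the epoch whose weights would be kept.

    Training is abandoned once the validation error has failed to improve
    for `patience` consecutive epochs after the best epoch seen so far.
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_error = error
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch


if __name__ == "__main__":
    # Hypothetical validation errors: improving at first, then rising (overfitting).
    errors = [0.90, 0.60, 0.45, 0.40, 0.42, 0.47, 0.55, 0.30]
    print(early_stopping_epoch(errors, patience=3))
```

With `patience=3` the loop stops before it ever sees the late dip at the final epoch, which illustrates the trade-off: a small patience saves compute but can miss later improvements.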
  4. Quote:Original post by Deliverance I've been playing with neural networks these days and found quite an interesting thing (for me) about them. I'm trying to solve the XOR problem using a two layer network. I found that in the initial phase every weight must be initialized with some random value. These random values seem to be very important, or so I found. There are combinations of random values that will cause the neural network to not converge to a solution, and I wonder why this is so. How can I initialize the random values so that the neural network will always converge? Take as an example the code here: in the file bpnet.h, replace line 31 from this *** Source Snippet Removed *** to this *** Source Snippet Removed *** Now, when running the sample, you'll see that the neural network did not successfully converge to an approximation of the XOR function. Why is that?

    I don't know for sure, because it could be any of several different causes, but I will guess that your model is falling into local optima: areas which are better than their immediate surroundings, but not actually the best possible solutions. If this is indeed the problem, there are several possible remedies:

      1. re-run multiple times with a different random initialization each time
      2. initialize more intelligently (so that the initial model breaks the data well)
      3. use an optimizer which is less prone to becoming trapped in local optima (hybrid global/local methods may work well for this)

    Also, consider that "convergence" of the training process is likely not necessary, nor even desirable: by the time the training process quits, you have likely overfit the data. Better results can be had through early stopping or constraining the number of hidden nodes in the model.
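The first remedy (re-running from several random initializations and keeping the best result) can be sketched in Python as follows. This is not the code from bpnet.h, which is not shown in the thread; it is a small stand-in 2-2-1 sigmoid network trained by plain online backpropagation, and the learning rate, epoch count, weight range and function names are all assumptions made for illustration.

```python
import math
import random

XOR_DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w_hidden, w_out, x1, x2):
    """Return (hidden activations, network output) for one input pair."""
    hidden = [sigmoid(w[0] * x1 + w[1] * x2 + w[2]) for w in w_hidden]
    z = sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1]
    return hidden, sigmoid(z)


def train_xor(seed, epochs=2000, rate=0.5, n_hidden=2):
    """Train from one seeded random start; return the final summed squared error."""
    rng = random.Random(seed)
    # Each hidden unit holds [weight_x1, weight_x2, bias]; w_out ends with its bias.
    w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for (x1, x2), target in XOR_DATA:
            hidden, y = forward(w_hidden, w_out, x1, x2)
            delta_out = (y - target) * y * (1.0 - y)
            for j, h in enumerate(hidden):
                # Hidden delta uses the output weight before it is updated.
                delta_h = delta_out * w_out[j] * h * (1.0 - h)
                w_out[j] -= rate * delta_out * h
                w_hidden[j][0] -= rate * delta_h * x1
                w_hidden[j][1] -= rate * delta_h * x2
                w_hidden[j][2] -= rate * delta_h
            w_out[-1] -= rate * delta_out
    return sum((forward(w_hidden, w_out, x1, x2)[1] - t) ** 2
               for (x1, x2), t in XOR_DATA)


def best_of_restarts(seeds, **kwargs):
    """Train once per seed and return (lowest error, seed that produced it)."""
    return min((train_xor(seed, **kwargs), seed) for seed in seeds)


if __name__ == "__main__":
    for seed in range(5):
        print(seed, round(train_xor(seed), 4))
    print("best:", best_of_restarts(range(5)))
```

Printing the error per seed makes it easy to see whether some starts end far worse than others, which is the symptom described in the question.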
  5. Bias in Neural Networks

    Quote:Original post by kvsingh Hi, I'm a newbie to the world of ANN. I'm aware of the Gradient Descent Rule and the Backpropagation Theorem. What I don't get is, when is using a bias important? For example, when mapping the AND function, when I use 2 inputs and 1 output, it does not give the correct weights; however, when I use 3 inputs (1 of which is a bias), it gives the correct weights.

    I think the simplest way to understand the usefulness of a bias term is to think about what happens when all of the input variables equal zero. No matter what the weights are on those variables, the output of the linear part of any neuron will be zero. The bias term allows the linear portion of a neuron to output a value other than zero when all inputs are zero.
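A small Python sketch of the point about the bias term. It assumes a single neuron with a hard threshold at zero; the specific weights and the bias of -1.5 are hand-picked for illustration, not learned.

```python
def linear_part(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus an optional bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def step_neuron(inputs, weights, bias=0.0):
    """Fire (1) when the linear part is strictly positive, else 0."""
    return 1 if linear_part(inputs, weights, bias) > 0 else 0


if __name__ == "__main__":
    # Without a bias, all-zero inputs always give a linear part of zero,
    # whatever the weights are.
    print(linear_part([0, 0], [3.0, -2.0]))

    # With a bias of -1.5 and unit weights, the neuron computes AND.
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, step_neuron([a, b], [1.0, 1.0], bias=-1.5))
```

Without a bias, this thresholded neuron cannot represent AND at all: the inputs (1, 0) and (0, 1) force both weights to be non-positive, yet (1, 1) needs their sum to be positive.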
  6. Question on Fuzzy Logic Rule Sets

    Quote:Original post by ColinS I'm trying to make a fuzzy rule set where I'm using a left shoulder, triangle, and right shoulder. My left shoulder is 18,9,0 for minbound,peak,maxbound. My triangle is 9,0,-9 and my right shoulder is 0,-9,-18. I was just wondering if you're allowed to use negative bounds in your fuzzy rule set.

    Yes, that's fine. It's the fuzzy truth values which should not be outside of 0.0 - 1.0.
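To illustrate that negative bounds are harmless, here is a minimal Python sketch of the three membership functions. It assumes the bounds are given in ascending order (left, peak, right), which differs from the descending order in the question, and the function names are illustrative. Whatever the bounds, each function returns a truth value between 0.0 and 1.0.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at the bounds, 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def left_shoulder(x, peak, right):
    """1 at or below `peak`, falling linearly to 0 at `right`."""
    if x <= peak:
        return 1.0
    if x >= right:
        return 0.0
    return (right - x) / (right - peak)


def right_shoulder(x, left, peak):
    """0 at or below `left`, rising linearly to 1 at `peak` and beyond."""
    if x >= peak:
        return 1.0
    if x <= left:
        return 0.0
    return (x - left) / (peak - left)


if __name__ == "__main__":
    for x in (-18, -9, -4.5, 0, 4.5, 9, 18):
        print(x,
              left_shoulder(x, -9, 0),
              triangle(x, -9, 0, 9),
              right_shoulder(x, 0, 9))
```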
  7. Re: Neural networks

    Quote:Original post by Narf the Mouse I do not, at the moment, have the money for a programming book, although I may be able to spend some in a ~week.

    You might try the Usenet comp.ai.neural-nets FAQ, at least as a start: http://www.faqs.org/faqs/ai-faq/neural-nets/part1/

    [Edited by - Predictor on March 17, 2010 10:14:04 AM]
  8. any neural network experts?

    Quote:Original post by Si1ver Hey, does anyone have experience training neural networks with large datasets? I'm doing character recognition and at this point it isn't as feasible to use trial and error to train, especially when I have to leave it on overnight. At the moment I'm training characters A-Z, with 10 training examples for each character. My network structure is 36,36,36,26. Learning rate = 0.05, momentum = 0.01. With higher learning rates or momentums it doesn't seem to converge, it just goes up and down vigorously, but with these it seems to converge slowly, just slower and slower as it gets from 2000 error to 1200 error. And at 1200 error, where the slope seems to get caught in local minima, it is only partially good at recognising characters. My only guess would be that the network isn't complex enough for the dataset? So I am now training with 36,36,36,30,26 to see what happens, but it takes forever. Like one minute per epoch. Any ideas, anyone? Should I try increasing the size of the hidden layers, instead of increasing the number of layers?

    I use early stopping to prevent over-fitting, and my experience is that backpropagation frequently hits its optimum fairly quickly (sometimes in as few as 50 training iterations). Obviously, your experience may be very different, given different software and data.

    As a general note, once the neural network has more than one output node, there exists the possibility that some output nodes are trained before others. This implies that further training will overfit some nodes while others are still underfit. With 26 output nodes, I suggest that there is a strong possibility that this is happening.

    I suggest trying different methods of pre-processing the data. This will often improve results more than tinkering with training settings. Good luck!
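One way to check whether some output nodes are ahead of others, as suggested above, is to track the error of each output node separately instead of a single summed figure. A minimal Python sketch follows, assuming predictions and targets are lists of per-example output vectors; the function name is illustrative.

```python
def per_output_mse(predictions, targets):
    """Mean squared error computed separately for each output node."""
    n_examples = len(targets)
    n_outputs = len(targets[0])
    return [
        sum((p[k] - t[k]) ** 2 for p, t in zip(predictions, targets)) / n_examples
        for k in range(n_outputs)
    ]


if __name__ == "__main__":
    # Hypothetical two-output case: node 0 is already exact, node 1 is stuck at 0.5.
    predictions = [[1.0, 0.5], [0.0, 0.5]]
    targets = [[1.0, 1.0], [0.0, 0.0]]
    print(per_output_mse(predictions, targets))
```

Computed on a held-out set after each epoch, these per-node figures would show which of the 26 outputs have stopped improving while others are still learning.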
  9. AI for a small RTS

    Quote:Original post by MONSTROZZITY Originally I wanted to try to make an RTS game kind of like Supreme Commander, but I realized that I have no real idea how to go about the AI creation process. I'm using Python because I'm using Blender 3D, which I'm very comfortable modeling with, and because it has a built-in game engine. It seems that most of the posts on this site are about C, C++, and C#. Blender 3D doesn't support them though. Even if you don't know how to go about it in Python, any tips for my consideration would be way more than appreciated.

    I imagine that you would get more help with algorithms, here, than with Python specifically. Even a quick search on Bing reveals several resources for game programming in Python, such as: Python Game Programming Challenge; "Game Programming With Python"; Pygame.

    -Will Dwinnell
    Data Mining in MATLAB
  10. Image (Pre-)Processing for Face Detection

    Quote:Original post by willh Quote:Original post by Predictor Assuming that equalization (or whatever pre-processing is being used) is effective at improving performance, then time and data may be economized by not needing such a large development data set. There are plenty of face sets available, for free, online; getting a large training set is trivial.

    Whether those data sets would apply to any given application is an open question. Regardless, there are reasons not to want large training sets, other than the cost of obtaining the data.
  11. Image (Pre-)Processing for Face Detection

    Quote:Original post by willh Quote:Original post by Side Winder Well, I was under the impression that it helps with distinguishing features after greyscaling. Makes the greys more varied so when the data is put into the ANN it's got a broader range of input? Having lots of face samples under different lighting conditions will give your ANN a broad range of inputs. It's not uncommon to use several 1000 positive training samples.

    Assuming that equalization (or whatever pre-processing is being used) is effective at improving performance, then time and data may be economized by not needing such a large development data set.
  12. Image (Pre-)Processing for Face Detection

    Quote:Original post by Side Winder Well, I was under the impression that it helps with distinguishing features after greyscaling. Makes the greys more varied so when the data is put into the ANN it's got a broader range of input?

    Actually, I think equalization might help, but for a different reason: it standardizes the images for lighting. In other words, faces under different lighting should occupy a smaller portion of the input space.
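A minimal Python sketch of histogram equalization, to make the "standardizes for lighting" point concrete. It assumes an 8-bit greyscale image flattened into a list of integer pixel values and uses the usual cumulative-histogram mapping; a real pipeline would more likely call an image library than hand-roll this.

```python
def equalize(pixels, levels=256):
    """Histogram-equalize a flat list of integer grey values in [0, levels)."""
    if not pixels:
        return []

    # Histogram and its running total (the cumulative distribution).
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return list(pixels)  # constant image: nothing to spread out

    scale = (levels - 1) / (n - cdf_min)
    return [round((cdf[value] - cdf_min) * scale) for value in pixels]


if __name__ == "__main__":
    dark = [10, 10, 20, 30, 40]
    bright = [value + 100 for value in dark]  # same "face", brighter lighting
    print(equalize(dark))
    print(equalize(bright))
```

The two hypothetical inputs differ only by a uniform brightness offset and map to the same equalized values, so after this step they occupy the same point in the input space.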
  13. Need to implement something unusual, in JS

    I suggest trying something simple, such as waiting 25% longer than the longest pause between keystrokes. If that is unsatisfactory, then try conditioning on how many characters have been typed in the current answer. So, there might be a separate maximum time between keystrokes for answers that are (at the time of measurement) 1 - 10 characters long, 11 - 20 characters long, and 21+ characters long. Use whichever maximum corresponds to the current length of the answer.

    Quote:Original post by speciesUnknown I've got an unusual AI problem that I need to find a solution for. This is not related to game development, but game developers are so crazy that I'm sure somebody will have a solution. I need to implement some user behaviour interpretation, in JavaScript. Basically, this will gather keystroke frequency data as the user types. The theory is that as they type it logs their typing speed, and the average pause between words or sentences, and then decides whether they have "finished" typing. This will be used to decide when the user has finished answering a question, and advance a "questionnaire" form (in the browser) to the next question. The biggest potential problem that I can think of is that anomalies in the data will prevent the system from making any kind of accurate predictions. I can process the data in PHP when the page loads, but the rest needs to be in real time, in JavaScript. I'm already pretty adept at making JavaScript do things that even Google wouldn't attempt, so the language is not the issue, although I may find that speed is a problem. Does anybody have any ideas for how I can do this?
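The length-conditioned timeout suggested in the reply can be sketched as below. The sketch is in Python for brevity, although the question asks for JavaScript; the logic should carry over directly. The class name, the 5-second default used before any pause has been observed, and the method names are assumptions for illustration.

```python
class TypingMonitor:
    """Tracks the longest pause between keystrokes per answer-length bucket."""

    def __init__(self, margin=1.25, default_timeout=5.0):
        self.margin = margin                  # 1.25 = wait 25% longer than the longest pause
        self.default_timeout = default_timeout
        self.longest_pause = {}               # bucket index -> longest pause seen (seconds)

    @staticmethod
    def bucket(answer_length):
        """Buckets: 1 - 10 characters, 11 - 20 characters, 21+ characters."""
        if answer_length <= 10:
            return 0
        if answer_length <= 20:
            return 1
        return 2

    def record(self, pause_seconds, answer_length):
        """Call on each keystroke with the pause since the previous one."""
        b = self.bucket(answer_length)
        self.longest_pause[b] = max(self.longest_pause.get(b, 0.0), pause_seconds)

    def timeout(self, answer_length):
        """How long to wait, at the current answer length, before advancing."""
        b = self.bucket(answer_length)
        if b not in self.longest_pause:
            return self.default_timeout
        return self.longest_pause[b] * self.margin

    def finished(self, idle_seconds, answer_length):
        return idle_seconds > self.timeout(answer_length)


if __name__ == "__main__":
    monitor = TypingMonitor()
    monitor.record(0.4, 3)
    monitor.record(0.8, 5)
    monitor.record(2.0, 15)
    print(monitor.timeout(4), monitor.timeout(12), monitor.timeout(25))
```

Because only a running maximum per bucket is stored, the per-keystroke cost is constant, so speed should not be a concern in the browser.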
  14. Neural Network Book Recommendations?

    I think that real progress on neural networks has slowed in recent years. I suggest looking into related areas instead. A good example is model ensembles. See, for instance: Combining Models to Improve Classifier Accuracy and Robustness, by Dean W. Abbott. Other areas of interest might be dealing with missing values and attribute selection.

    Quote:Original post by Snikks I'm currently writing a literature review for my doctoral thesis and was wondering what books people here would recommend for information on the history and more recent development of ANNs in all their shapes and sizes? I have a basic working knowledge of ANNs and have implemented a few, as my undergraduate thesis involved working with ANNs and different training techniques.
  15. Understanding Neural Net Programs

    Strictly speaking, backpropagation is short for "backpropagation of errors", which is a general process of sending errors measured at the neural network output back through the successive layers, but it is most often used to refer specifically to some variation of the generalized delta rule, a specific way to train the neural network. Feedforward refers to the way the artificial neurons are connected to one another and how information flows among them. A multilayer perceptron ("MLP") is a particular neural architecture arranged as distinct layers of artificial neurons, which may or may not be trained by backpropagation. Despite referring to particular aspects of a neural network, the above terms are often used interchangeably to mean a multilayer perceptron being trained by the generalized delta rule.

    Anyway, MLPs with nonlinear transfer functions are indeed able to learn to map classes (such as XOR) which are not linearly separable. These are not the only types of artificial neural networks, of course, and there are methods besides neural networks which are used to solve learning problems (tree induction, rule induction, naive Bayes, k-nearest neighbors, regression, discriminant analysis, etc.).

    If you are really interested in MLPs driven by backpropagation, consider the book Neural Networks for Statistical Modeling by Murray Smith. I think it is out of print now, but I found its explanation of how all this stuff works quite accessible.

    Quote:Original post by tobinare I'm trying to learn about basic Neural Networks and programming for them (C++). I've downloaded several source codes from tutorials etc. and libraries. They all seem to work the same by creating the network with layers and neurons and applying backpropagation in some manner. I've compiled several programs and run them easily enough. As a metric to compare their functionality and ease of use I've tried to apply the typical XOR problem to each of them. It seems half of the programs handle this problem well by learning and solving for 1 and 0. The other half seem to solve for 0.5, which is the case for a Multilayer Perceptron code I have. What's the difference between different algorithms in NN's approach to this problem? I read that the XOR problem cannot be solved by MLP, but why does another program solve it easily? I've tried reading several tutorials but I get lost in the nomenclature. I hope someone can shed some light on Backpropagation, FeedForward, etc.
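To make the point about linear separability concrete, here is a minimal Python sketch of a feedforward network with one hidden layer that computes XOR. The weights are set by hand rather than learned, and a hard threshold is used as the nonlinear transfer function; both are simplifying assumptions chosen so the arithmetic can be followed by eye.

```python
def step(z):
    """Hard-threshold transfer function."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two hidden neurons (OR and AND) feeding one output neuron."""
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is on
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are on
    return step(h_or - h_and - 0.5)  # "OR but not AND" is XOR


def single_neuron(x1, x2, w1, w2, bias):
    """A network with no hidden layer, for comparison."""
    return step(w1 * x1 + w2 * x2 + bias)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_mlp(a, b))
```

No choice of `w1`, `w2` and `bias` makes `single_neuron` reproduce XOR, because a single neuron can only split the input plane with one straight line. That limitation applies to the single-layer perceptron, which may be the source of the claim that XOR "cannot be solved".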