# Genetic Algorithm and Sudoku

## Recommended Posts

AndyTang    122
I've recently been interested in Sudoku puzzles and I'm trying to figure out a way to generate the puzzles. At first I thought I could use a genetic algorithm to generate a puzzle through generation iterations as I heard a few people have already tried. Here lies the first stumbling block. I have a decent fitness method and I can choose two puzzles from the population (randomly generated grid filled with numbers) but I cannot figure out a way to cross breed these two grids. Since there doesn't seem to be a way that would help determine a possible solution except by randomly copying numbers from one or the other. Any suggestions? BTW, does any one knows what determines the difficulty of a sudoku problem? Is it just the number of missing spaces.

##### Share on other sites
RunningInt    133
I actually tried doing this about 2 months ago. As you mentioned, there doesn't seem to be a good crossover operator available. In my opinion even if you found a crossover operator that produced valid solutions, it would do little good anyway as it would likely just be effectively macro-mutation (and macro-mutation is highly unlikely to increase fitness).

I chose to just use mutation instead and focused on figuring out some good mutation operators instead. I implemented a substitution mutation and a swap mutation (swapped two random boxes in the grid).

Unfortunately my conclusion was that GA's are not well suited to this problem. The reason is that Suduko requires a perfect solution, wheras the evolutionary process is best suited for problems where near-perfect solutions are acceptable. In Sudoku near-perfect solutions are just wrong solutions.

Using a GA to solve suduko you can easily reach a local maximum where you have just two numbers in the wrong order, but it would take a long series of downhill mutations to reach the global optimum.

Unless you hybridise the algorithm with some sort of domain knowledge (which is kind of cheating) then you may have to leave your machine running for days to find the solution.

I chose to cheat instead. I found that suduko can actually be solved using a sequence of logical steps. In answer to your question about how the difficulty of Suduko puzzles is based, no it is not soley based on the number of missing spaces. There is a sequence of steps you can use to solve a suduko puzzle and more difficult puzzles will require more steps.

The first step I used was to store a list of possible values for each box in the grid. Then the program removes values from this list that cannot go in the box (ie if there is a 5 in the same row, or column as the box then 5 cannot go into the box, so remove 5 is removed from the list of possible values for that box).

After ruling out values for all the boxes in the grid, the next step is to find boxes with only one possible value left. You then obviously assign that value to the box, and then because the grid has changed, you redo step 1 above.

These steps alone can complete the easier Suduko puzzles, but with moderately hard puzzles you reach a stage where there are boxes not yet filled in, but they all have more than one possible value.

The next step is to look at each 3x3 subgrid seperately. You know the rule that a subgrid must contain the numbers 1 to 9. Take the numbers 1 to 9 in turn and check to see if they can only go in one box in the subgrid. If so then fill in that box, and redo from the first step.

The above steps will solve most moderate puzzles, and some hard. But some puzzles are even harder and I got stuck on that stage. My brain wasn't good enough to figure out the next logical step (I have no idea if there are more barriers after that).

Instead I just brute forced the rest, by treating it like a combination lock, and exiting early if a combination broke the suduko constraints. The above steps reduced the combinations enough so that it could be solved instantaneously.

##### Share on other sites
SnakeHunta    148
I fully agree with RunningInt. Sudoku is not a puzzle that will be solved very well at all with a genetic algorithm. The fact that there are strategies and various logical steps necessary says it all. The best way to implement any AI to solve Sudoku is to program it to use the same techniques a human player would use in solving the puzzle. You can easily find these anywhere and in fact using them creates for a very fast puzzle solver ( as opposed to brute force... ).

##### Share on other sites
kevmo    205
Sudoku is a prime example of a class of problems known as Constraint Satisfaction Problems. The formal definition of a CSP is as follows (taken from my AI professors website):

A CSP is a triplet {V, D, C}, where V is a finite number of variables, V={V1, V2, ..., Vn}.

Each variable may be assigned a value from a domain D.

Each member of C is a pair: the first element of the pair is a set of variables, and the second element is the set of legal values for those sets. (Note that C is usually represented with a function, rather than explicitly).

The method to solve such a problem is depth first search with constraint propogation (detailed in the website I cited earlier). The basic idea is for each variable, determine the set of all possible values for it. You perform a depth first search, where every time you try setting a variable, you check that variable with all of the constraints, eliminating illegal values from other variables. You then propogate these changes recursively, doing a similar check for each variable possible set that you changed. If you ever prune a set to the empty set, then you end that branch of the search and backtrack.

In practice, the constraint propogation "pre-processing" will sometimes solve the problem with no search at all. If it doesn't, it usually doesn't take many iterations of the search to find a solution.

##### Share on other sites
Kylotan    9860
Have the last 3 posters all missed the fact that the original poster asked to "generate the puzzles" which is not the same as "solving the puzzles"?

##### Share on other sites
RunningInt    133
LOL yes. whoops. I guess generating one follows a similar method though. First you have to effectively solve one, and then get rid of some of the answers.

##### Share on other sites
kevmo    205
Quote:
 Original post by KylotanHave the last 3 posters all missed the fact that the original poster asked to "generate the puzzles" which is not the same as "solving the puzzles"?

Quote:
 Since there doesn't seem to be a way that would help determine a possible solution except by randomly copying numbers from one or the other.

His post was kind of confusing. He said he wanted to make puzzles, but he also said he was having difficulty getting the computer to solve them. I was answering the latter question.

Additionally, as RunningInt touched on, making a puzzle is essentially solving a puzzle with no constraints, and then replacing some of the squares with blanks. By randomizing which node you expand in the depth first search part of constraint propogation, you can create random solutions. The problem is then reduced to finding a strategy of covering squares given a certain difficulty, checking to make sure the end result is solvable.

##### Share on other sites
RPGeezus    216
Why do you care about cross-over? Is random 'mutation' not enough?

Will

##### Share on other sites
WeirdoFu    205
Quote:
 Original post by RPGeezusWhy do you care about cross-over? Is random 'mutation' not enough?Will

Well, some believe that crossover is the only thing needed in GAs, while others believe that the whole crossover concept is worthless.

In this case, due to the high level of dependency among the values in the candidate solution, crossover may not be exactly possible.

As for generating a problem, I think using a GA may not be the best idea for directly generating a problem. However, you could use a GA to generate an end result, then use a reverse solver to start plucking out values.

##### Share on other sites
TechnoGoth    2937
I was planning on making a little Sudoku game, and the method for generating the puzzles I thought up is as follows.

Basically the there are two phases:

Phase one: Fill the board.

To fill the board you use a simple iterative search. For each box you determine the valid locations for the current number and then place it in a random spot. Then move to the next box. This repeated until the current number is placed in all boxes, then you move to the next number and repeat. Eventually you will have a completed Sudoku.

Phase two: determine values to show:

Once you have a completed board you create a play board by determining the number values to show based on the difficulty level and then show that many random values and you are done.

##### Share on other sites
AndyTang    122
I apologise for the confusing first post.

As KevMo correctly stated, generating a sudoku board is near enough the same as solving a sudoku board. The main difference is that 'solving' one has a head start where the initial values on the board is 'correct' and will lead to a solution.

I've always though that the mutator in a GA provides the random element, whilst the crossover provides the process to move the population towards some possible solution. If the whole breeding process is totally random, then it seems like it is just wildly searching for the solution.
As some of you mention, this means that GA is possibly not suitable for what I am trying to accomplished.

Quote:
 Original post by TechnoGothPhase one: Fill the board.To fill the board you use a simple iterative search. For each box you determine the valid locations for the current number and then place it in a random spot. Then move to the next box. This repeated until the current number is placed in all boxes, then you move to the next number and repeat. Eventually you will have a completed Sudoku.

Does this need some form of backtracking - will it always give a solution and you wont get stuck half way filling the board?

Thanks all for answering, this has been very enlightening.

##### Share on other sites
Martin Proud    124

##### Share on other sites
kevmo    205
Quote:
 Original post by ChrominiumDoes this need some form of backtracking - will it always give a solution and you wont get stuck half way filling the board?

When solving an already generated Sudoku puzzle using constraint propogation, I do not think you would need backtracking. However, when generating a puzzle from scratch using DFS+constraint propogration, you will, because I think it is possible to choose a value for a node that turns out later to be impossible.

##### Share on other sites
Kylotan    9860
Quote:
 Original post by ChrominiumI've always though that the mutator in a GA provides the random element, whilst the crossover provides the process to move the population towards some possible solution. If the whole breeding process is totally random, then it seems like it is just wildly searching for the solution.As some of you mention, this means that GA is possibly not suitable for what I am trying to accomplished.

Crossover and mutation are not so separate. Usually both involve a random factor, but traditionally crossover combines part of one solution with another (in the hope that good parts from each will be combined) while mutation makes small changes to a solution (in the hope that a small but significant improvement can be made). I like to think of crossover as coming up with ideas and mutation as refining them.

I think GAs would be a decent solution to this problem but I expect there is a more rigorous mathematical approach that would work better.

##### Share on other sites
Metorical    580
27 equations. Cij where C00 is the upper left corner, i is the row , j is the column. Probably this CSP thingy but I'm not a mathematician.

C00 + C01 + C02 + ... + C08 = 45 and similarly for all rows, all columns and all 3 by 3 grids...

Solve the matrix. It's an interesting idea to do it with a genetic algorithm but it follows in the vein of using neural nets for everything. Certain techniques should be used in specific cases but quite often the popular techniques get applied instead. However if you're doing it just for fun then enjoy :-) Seems interesting enough!

##### Share on other sites
kevmo    205
Quote:
 Original post by Metorical27 equations. Cij where C00 is the upper left corner, i is the row , j is the column. Probably this CSP thingy but I'm not a mathematician.C00 + C01 + C02 + ... + C08 = 45 and similarly for all rows, all columns and all 3 by 3 grids...Solve the matrix. It's an interesting idea to do it with a genetic algorithm but it follows in the vein of using neural nets for everything. Certain techniques should be used in specific cases but quite often the popular techniques get applied instead. However if you're doing it just for fun then enjoy :-) Seems interesting enough!

These equations aren't enough. A valid Sudoku puzzle is solveable without guessing, so you would need to be able to get an exact answer with the given square values. There are 81 squares on a Sudoku puzzle, but if you only used 27 equations, then you could only solve for 27 unknowns. This means you would need to cover up >50 squares for a solvable puzzle. Howevever, it is possible to make a Sudoku puzzle with only 20 or so givens, which would be 60 unknowns.

The thing missing in your equations is that you need restrictions on the variables to make them in the range of 1-9. You can't really make that restriction if you were to solve your equations by matrix inversion.

This is where the constraint satisfaction part comes in. You augment your given (but modified) constraints with the constraint that each square comes from the set of 1 through 9.

Quote:
 Original post by ChrominiumIf the whole breeding process is totally random, then it seems like it is just wildly searching for the solution.

Take a look at simulated annealing ;) (not for this project, you just may find it interesting) It's an algorithm in the class of minimization/maximization (along with genetic algorithms and hill climbing/gradient descent). It is kind of similar to genetic algorithms with only mutation.

##### Share on other sites
Timkin    864
Quote:
 Original post by KylotanI like to think of crossover as coming up with ideas and mutation as refining them.

They are really both just methods of hypothesis generation by analogy. Remember that the selection operator is responsible for choosing good hypotheses. Crossover attempts to alter the selected hypotheses by switching substates on a submanifold of the state space determined by the crossover site, while mutation essentially nudges a point in the state space a small distance. Crossover has a predominant affect on the convergence rate of the population, however it is also responsible for the asymptotic performance of GAs.

##### Share on other sites
RPGeezus    216
Quote:
Original post by Timkin
Quote:
 Original post by KylotanI like to think of crossover as coming up with ideas and mutation as refining them.

They are really both just methods of hypothesis generation by analogy. Remember that the selection operator is responsible for choosing good hypotheses. Crossover attempts to alter the selected hypotheses by switching substates on a submanifold of the state space determined by the crossover site, while mutation essentially nudges a point in the state space a small distance. Crossover has a predominant affect on the convergence rate of the population, however it is also responsible for the asymptotic performance of GAs.

I'd like to add something to that. :)

Population management can radically affect the usefulness of crossover or mutation.

As an example:
The top scoring member moves on to the next generation AND a copy of this member (mutated or crossed with another member). Same for the second best, etc.. until we have 10 members for the next generation. No monte-carlo selection, we always take the fittest.

In this case, after the population matures a bit, cross-over will become little different from mutation. The members in the population will be nearly identical. A cross-over is more likely to break the member, since the changes tend to be more dramatic, and as the population is mature, it has likely evolved some internal structure that may not take kindly to big changes.

The last few times I've used EP for a problem I've had the program record the number of times a mutation improved fitness, and the number of times cross-over improved fitness. I realize that this is very subjective to the problem being addressed, and may fly in the face of conventional wisdom, but I found that in less than 5% of improvements cross-over was the contributing factor.

I wouldnt be too surprised if it were possible to show, mathmatically, that mutation and cross-over are equivilant under certain constraints.

##### Share on other sites
RunningInt    133
Quote:
 Original post by WeirdoFuWell, some believe that crossover is the only thing needed in GAs, while others believe that the whole crossover concept is worthless.

I think it's worthless. But im not an expert. Do you happen to know if the worth of crossover is contested in acedemia? Just to clarify - I realise crossover is useful in some model situations. But almost all of the problems i've tried GA's on I have never seen how crossover is anything more than a large mutation.

I sometimes get the impression that all the rage for crossover in GA's is simply because crossover exists in nature. Although I realise I might be spouting ignorant foolishness here.

##### Share on other sites
Timkin    864
Quote:
 Original post by RunningIntI realise crossover is useful in some model situations. But almost all of the problems i've tried GA's on I have never seen how crossover is anything more than a large mutation.

That's suggestive of trying to apply GAs to a search problem where they simply aren't appropriate... or of a bad encoding scheme (usually the latter).

Quote:
 I sometimes get the impression that all the rage for crossover in GA's is simply because crossover exists in nature. Although I realise I might be spouting ignorant foolishness here.

If you remove crossover from a GA, then you are left merely with random-walk-hillclimbing...certainly an inefficient search method, but one that is easily applied to many domains (and almost any encoding).

The skill in efficient and successful application of a GA is getting the encoding scheme correct.

I really should re-publish an old thesis of mine on the analysis of GA operators and algorithmic performance... maybe over my xmas downtime...

##### Share on other sites
RPGeezus    216
Quote:
 Original post by TimkinIf you remove crossover from a GA, then you are left merely with random-walk-hillclimbing...certainly an inefficient search method, but one that is easily applied to many domains (and almost any encoding).

Maybe you could explain this in more detail.

Like RunningInt, I too have found cross-over to be much like a large mutation.

Say were on generation 600-- all of the members of a population will be nearly identical. If I move sections from member B in to member A, I'm most likely duplicating code that already exists in member A.

Another way to look at it: Say my cross-over only works 1 instruction at a time (I know this is not how it works, but...) If I take 1 instruction from member B and copy it in to member A, how different is that than randomly adding a new instruction to member A?

I'm open to the very real possibility that my own implementations of this are biased towards mutation. I'd hate to be missing out on some added utility. :)

Thanks Tim.

Will

##### Share on other sites
Kylotan    9860
Quote:
Original post by RPGeezus
Quote:
 Original post by TimkinIf you remove crossover from a GA, then you are left merely with random-walk-hillclimbing...certainly an inefficient search method, but one that is easily applied to many domains (and almost any encoding).

Maybe you could explain this in more detail.

Like RunningInt, I too have found cross-over to be much like a large mutation.

Say were on generation 600-- all of the members of a population will be nearly identical. If I move sections from member B in to member A, I'm most likely duplicating code that already exists in member A.

If all the members are nearly identical, then crossover alone may not do you much good. However, can you say the same about generations 1 to 100?

Quote:
 Another way to look at it: Say my cross-over only works 1 instruction at a time (I know this is not how it works, but...) If I take 1 instruction from member B and copy it in to member A, how different is that than randomly adding a new instruction to member A?

As you said, that is not how it works, so your example doesn't help. Crossover typically exchanges large amounts of each solution, preserving some qualities of the previous solutions involved, which by their presence in the population and appearance at the selection stage, are assumed to have some degree of merit.

As Timkin has implied, the representation is key here. Crossover works well when your problem contains multiple dimensions which are largely independent. You might have one solution which has a good score in one dimension but has not found a good score in another. And a second solution may have the opposite. Crossover 'beats' mutation here in that crossover is able to combine the knowledge inherent in those two solutions into one improved solution. By comparison mutation loses that information.

Personally I think this might work poorly if you are swapping bytecode instructions in and out at an arbitrary point since there is a large degree of forward and backward referencing in imperative languages. I wouldn't think this was a good representation for GAs to optimise. It might work better as a symbol tree where you're swapping branches at random, as the structure of each solution is largely retained there.

##### Share on other sites
Timkin    864
There are two significant inefficiencies with GAs, Hitchhiking and Cloning.

Hitchhiking is the association of a week allele with a strong one through reproduction, whereby the presence of the weak allele is highly correlated with the presence of the strong allele. Because other values of the allele are not so present in the population, only mutation alone can break the correlation, which is difficult to do with low mutation rates seen in most implementations.

Cloning is the effect of crossing two parents and reproducing the same two parents as offspring, simply because of the similarity of subpatters in the parent strings. At full convergence, the probability of producing clones is 1. Indeed, even in initial populations, the probability of cloning is significant. This is the principle result of a thesis I produced back in the mid 90s. I produced a new version of Holland's Schemata Theorem that explains cloning and asymptotic convergence quite naturally. Cloning is the source of the asymptotic convergence of GAs and its cause is crossover.

However, if you remove crossover from your operator set, you are left with only reproduction and mutation. Since mutation will typically only affect a very small percentage of genes in the population in any given generation, then the population as a whole selects only the better members already present. This is hill climbing to a local minima. Any mutations that are not beneficial will usually be rejected by reproduction before they spread into the population. Any that are beneficial are quickly picked up. However, since crossover is not present, they don't spread as quickly as they otherwise might, because they only become associated with already strong alleles by pure chance.

Make sense? I hope so. 8)

Cheers,

Timkin

##### Share on other sites
owl    376
From what I've experienced with genetic programming, I think it feels like a randomly_feed_selective brute force method... and it tends to get stuck in local minimals A LOT. In a run it may be able to find the solution to a simple problem immediately, and in another run it may take lot of time, or simply not find it at all.

I also think that with lots of parallel/processing power it might be useful. But I don't think one can do a much for game development with it with a todays home PCs.

##### Share on other sites
RPGeezus    216
Quote:
 Original post by TimkinHowever, if you remove crossover from your operator set, you are left with only reproduction and mutation. Since mutation will typically only affect a very small percentage of genes in the population in any given generation, then the population as a whole selects only the better members already present. This is hill climbing to a local minima. Any mutations that are not beneficial will usually be rejected by reproduction before they spread into the population. Any that are beneficial are quickly picked up. However, since crossover is not present, they don't spread as quickly as they otherwise might, because they only become associated with already strong alleles by pure chance.Make sense? I hope so. 8)Cheers,Timkin

No, I still do not understand.

I do not understand how cross-over solves the hill-climbing dilema. It seems just as locked in to this problem as mutation alone is.

I'm not sure why you say that beneficial changes will not spread quickly through the population; the 'transfer' rate should be the same.

What if, instead of doing 1 mutation, every so often you would mutate an entire 10% of a member. How would this differ from cross-over with regards to the above?

Kylotan:
Sure, for the first few generations (100 isn't very many).... and after that??? You are right about copying branches in a tree, but you could also just make that a form of mutation.

i.e.
Is there a requirement to take the data from member B and put it in to member A? Could we, just as effectively, copy a section from member A and paste it in to member A? Would this be considered cross-over?