# tricky puzzle

This topic is 4544 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Here is a tricky question that I don't know if it has an answer or not. If you are given N floating point numbers all between 0..1 adding up to an integer. e.g.
{ 0.12345 ,  0.6789,  0.19765,  0.31415,  0.68585  } Total = 2

then question is this: "Decide whether any subset of these numbers adds up to an intger." In this case the answer is:
{ 0.12345,  0.6789,  0.19765 }  Total = 1.0
or
{  0.31415,  0.68585  }         Total = 1.0

Obviously one way of doing this is to simply try out every posible subset. The time this takes will grow exponentially in terms of the number of floats you start with. I wonder if there is a quicker trick to doing this. (Or alternatively if there is a way to prove that there isn't a quicker way of doing it.) For example maybe there is a way to make approximate guesses and then refine them. Or someway of discounting certain subsets so you don't have to check them all? P.S. Maybe you can guess why I wanted to know the answer to this?

##### Share on other sites
The first solution that came into my head was to sort the list and move an iterator foward and an iterator backwards, adding the two numbers until 1 is reached or exceeded. Unfortunatly this wouldn't work for "any integer" (just for 1.0) or for more than 2 numbers at once.

Because of the requirement for "any integer" and because it can be formed by any combination of any number of numbers, it leads me to believe that there is no solution that dosn't involve comming up with all the combinations of available numbers anyway.

Unless you can come up with some kind of clever, n-dimentional partitioning deal that may or may not work [smile].

I would be interested to know what this is for. Particularly because knowing the context may lead to better solutions probably involving different data structures or a slow solution that can be cached.

Also - what about re-use of the numbers? Can we reuse a number in two combinations? This could lead to some unexpected results if your vector was full of integers anyway.

##### Share on other sites
You'll probably have to try a large number of subsets by brute force, but you might be able to speed things up a bit. For example, first find the sum of the entire set: 2. In this case, none of the subsets will add up to any integer higher than 2. This will explode in a huge ball of flame if you introduce negative numbers, so I assume we only have positive input values for now.

For each entry in the list, track the difference between the entry and each integer up to the max. The first entry has 2 - 0.68585 = 1.31415 and 1 - 0.68585 = 0.31415. Since both of these results are positive, the entry can be used in subsets adding up to 1 and 2. For each integer (in this case, 1 and 2) store a list of the entries that can appear in a subset which totals up to that number. The max total (2 in this case) should have each entry in the original list.

Now you can simply look at each integer up to the max, and you only need to bother trying to build subsets out of those entries which can possibly add up to the integer in question. Obviously this only helps you if your input is distributed over the interval [0, max integer]. Within each grouping, you can also probably sort the list in descending order, and exclude pairs that total more than the search parameter. For instance, in the grouping for 1, 0.68685 will appear next to 0.6789, but obviously these two will never appear in the same subset totalling to 1. This should probably cut down the number of subsets you have to try by a large factor. I have half an inkling of a more powerful trick just under the surface here, but I'm too distracted to properly think it out at the moment.

Anyways, this should help a bit with very long lists, but will probably be a net slowdown in short lists (off the top of my head, say, 15 elements).

 My thoughts continue to try to form here [smile] Using the difference you calculated in the early step (2 - 0.68585 = 1.31415, 1 - 0.68585 = 0.31415) you can also exclude even more subsets. Suppose I have a subset S that totals 1.4. If I am considering adding 0.68685 to this subset to total 2, I can check and see that 1.4 > 1.31415, and I do not need to consider the set {0.68585, {A}}. However, I'm not sure if this could lead to something actually useful, or if it's just a goofy half-optimization that could be done by checking if (0.68585 + 1.4 > 2)... err... exercise to the reader [wink]

##### Share on other sites
The context is of taking the real roots of a polynomial (x-x0)(x-x1)(x-x2).. and of seeing if any combination of the factors gives integer coefficients by first seeing if any of the sums of the real numbers give integers. But we only need to consider the fractional part of the reals numbers.

Even though this is the context please ignore it because I think the puzzle is interesting in itself.

So you can't use any number twice, and only one subset is needed to be found.

 Thats right, ApochPiQ, I hadn't thought of that. For example if I was looking for 3 numbers that add up to an integer, I know that that integer must be 1 for a start. Then if any two numbers add up to more than one I can discount them straight away without adding a third number to it.

So that is indeed an optomization. I wonder if that kind of thing would still work if all the starting numbers were quite small? That's opened up a can of worms if ever I saw one.

[Edited by - paulbird on July 1, 2005 10:07:06 AM]

##### Share on other sites
Reminds me of the knapsack problem, which is NP-complete if I'm not mistaken, meaning that if you can solve it in polynomial time, you've just broken most of the cryptographic systems in the world.

##### Share on other sites
SamLowry - well I'm not one to duck away from a challenge!

But I think this problem may have a simpler solution since it is related, in a way, to factoring integer polynomials for which (so I believe) a polynomial time algroithm exists. Although, admittedly, this would require extra constraints on the numbers such that (x-x0)(x-x1)(x-x2) formed a polynomial with integer coefficients.

##### Share on other sites
It is like the knapsack problem (it is like many OR and CO problems), but not quite. The problem is that there is nothing to minimize or maximize contiuously: whether something is an integer or not is not a property that can be described by a linear constraint of the appropriate form, rendering most common approaches such as linear programming or branch-and-bound variants unuseable.

Greetz,

Illco

##### Share on other sites
**************

This just struck me:

transform all the number into complex numbers by:

Zn = exp(2*i*pi*Xn)

then if X1+X2+X3 == integer is equivelent to Z1*Z2*Z3 == 1

Would this help at all?

The question would then be "Given this list of complex unit numbers. Find which subset multiplies together to give 1."

****************

[Edited by - paulbird on July 1, 2005 11:43:43 AM]

##### Share on other sites
Your actually getting closer to the knapsack problem.
Which in case you're unfamiliar with it is: given a set of integers S and a (integer) goal G, which integers from S do you need to add together to get G?
Example: S = {1,6,8,14,29,31}, G = 53 => 8 + 14 + 31.

If you would take the logarithm of each Zn (Un = ln Zn) you need to find a sum again which amounts to 0. You can simplify Un = 2 * Pi * I * Xn, which you can simplify to just Un = Xn mod 1.0 (and still need to get to 0).

In any case, you can transform the problem all you want, it'll remain hard. However, there are special kinds inputs for which it's easy to find a solution (in polynomial time). In fact, there are so many such special/easy cases that a cryptographic encoding system based on the NP-ness of the knapsack problem was considered very unsafe (Merkle Hellman).

For example, it's easy to solve S = {1536, 768, 384, 192, 96, 48, 24, 12, 6, 3, 1}, G = 1008.

##### Share on other sites
if you were going to go about it the brute force way you could easily cut the checking down by half if you counted through the array first to see how many numbers you have and created 2 new dynamic arrays. one that has only the numbers from 0 - 0.5 and the one that contains the numbers > 0.5 && < 1.0.

then you could check from one array1 to array2 since any 2 numbers from the same array would never add up to an integer, assuming that you will never have numbers exactly 0 or exactly 1 which i gather is correct.

if you had 2 million numbers then obviously any brute force way would be slow i just was throwing out an alternative to it.

##### Share on other sites
To me it sounds like all you really need to do is find all the subsets, or the power set. See MathWorld. We had to do that in class once. The following code is straight from my assignment:

//------------------// Returns the power set //------------------Set * Set::operator ~ (void){	// gets the new size of 2^n and adds one to	// account for the element that stores the size	int nPowSize = (2 << (this->m_nSize-1)) + 1;	Set *Pow = new Set[nPowSize];		// store size in the first element of the first	// set in the array	Set cSet;	cSet.Add(nPowSize);	Pow[0] = cSet;		// go through the size of the power set	// and add the element in position j to the 	// current set if its bit is 1	for(int i=0; i < nPowSize; ++i)	{		for(int j=0; j < this->m_nSize; ++j)		{			if(i & (1<<j))			{				Pow[i+1].Add(this->GetElement(j));							}		}			}			return Pow;} // operator ~

We had to write our own set for this, and I know the set implementation isn't the best. The size of the powerset is 2^n, but I had to add the size in there, so that takes up the extra element. Remember that Pow[] returns a set, since it is a set of sets. Then you can go through those subsets in the Pow[] and add up the sum of the elements to check if it is within a reasonable tolerance to an integer.

A little more explaination. What this basically does is count in binary up to the size of the set. Say you have a set of size 3, which is {2,3,4}. The size of the powerset is 8;

000 - {}
001 - {4}
010 - {3}
100 - {2}
011 - {3,4}
101 - {2,4}
110 - {2,3}
111 - {2,3,4}

although in the code, the actual algorithm adds it in this order.

000 - {}
100 - {2}
010 - {3}
110 - {2,3}
001 - {4}
101 - {2,4}
011 - {3,4}
111 - {2,3,4}

[Edited by - njpaul on July 1, 2005 2:37:36 PM]

##### Share on other sites
paulbird says:
Quote:
 Obviously one way of doing this is to simply try out every posible subset .... I wonder if there is a quicker trick to doing this.

njpaul says:
Quote:
 To me it sounds like all you really need to do is find all the subsets

##### Share on other sites
Quote:
 Obviously one way of doing this is to simply try out every posible subset .... I wonder if there is a quicker trick to doing this.

If you have to go through every subset, there is no faster way. All you can hope for is an optimized algorithm.

##### Share on other sites
You don't have to go through every possible subset. I'd say a very large portion of the subsets can be eliminated with some cleverness and a little bit of simple logic.

##### Share on other sites
True, you don't have to go though every subset, but you still have to generate them. You can easily skip the empty set.

##### Share on other sites
let dec f = f -. floor f;;type result = None | Found of float list;;let solve nums =    let lst = ref []    and result = ref None in        let update n =        let todo = ref !lst        and adds = ref [] in        while !todo != [] && !result == None do            let (x, terms) = List.hd !todo in            let x' = dec (x +. n)            and terms' = n :: terms in            print_float x'; print_newline ();            if abs_float x' < sqrt epsilon_float            then result := Found terms'            else (                   adds := (x', terms') :: !adds;                   todo := List.tl !todo                 )        done;        lst := (n, [n]) :: (!adds @ !lst)    in    let todo = ref nums in    while !todo != [] && !result == None do        update (List.hd !todo);        todo := List.tl !todo;    done;    !result;;

Given a list L of reachable values and a number N, a new list L' of reachable values is constructed with L union { x + N | x in L }. The algorithm stops as soon as it hits an integer (x % 1 == +-eps).

##### Share on other sites
Quote:
 Original post by SamLowry *** Source Snippet Removed ***Given a list L of reachable values and a number N, a new list L' of reachable values is constructed with L union { x + N | x in L }. The algorithm stops as soon as it hits an integer (x % 1 == +-eps).
Sounds nice, although I can't immediately tell if it helps a whole lot.

It would be nice if paulbird confirmed that this wasn't homework btw. It's worded very much as though it is.

##### Share on other sites
The solution is to use dynamic programming. Sort the numbers from largest to smallest, and keep a table of all possible sums that you have found so far. Then, iterate through the numbers. For each number, add it to each sum found so far. If it's an integer, print out the set and exit. If it's already in the set, do nothing, otherwise add it to the set.

In the worst case, this will still run in exponential time (and take exponential memory), if all possible subset sums are different, but in most practical examples it will work pretty well.

##### Share on other sites
Quote:
 Original post by AprosenfThe solution is to use dynamic programming. Sort the numbers from largest to smallest, and keep a table of all possible sums that you have found so far. Then, iterate through the numbers. For each number, add it to each sum found so far. If it's an integer, print out the set and exit. If it's already in the set, do nothing, otherwise add it to the set.In the worst case, this will still run in exponential time (and take exponential memory), if all possible subset sums are different, but in most practical examples it will work pretty well.

And that's exactly what the OCaml code posted above does. (except the sorting)

##### Share on other sites
Quote:
 "Decide whether any subset of these numbers adds up to an intger."

My emphisis. Assuming we approach the problem of "try to find a subset", it might be possible to speed things up based on digit math. E.g. if you have a set like so:
6.234655.234537.348582.243885.349938

You can discard the last number, as there's no way you can supplement the 8 with another number to form a zero with carry. Similarly, if you have a set like so:
6.234655.234537.348582.243885.3499382.824762

You'd be able to know that you must include either both of the last two, or neither. Things get much more complicated as the number of numbers with a nonzero digit in that place increase, but it might be possible.

Of note is that this won't work very well if the results are allowed to be somewhat inexact - e.g. 0.50001 + 0.50000 = 0.50001 - which means the algorithm would discard the result due to the one. There's probably a similar algorithm that depends on results being within a delta, or simply optionally adding a fake number within the epsilon to round... (edit: actual FP math eps. would involve some O(N!) algorithms it seems, and probably more complicated than the brute force's O(N!) method))

I may fiddle around with this for awhile for the heck of it, but I won't be posting the actual implementation due to homework-smell, at least not for now.

EDIT: I've decided that this method is probably not very helpful. It still ramps ala O(N!). It may help for smaller sets of data, but with the repeated resortings by digit and whatnot, I doubt it. C'est la vie...

[Edited by - MaulingMonkey on July 2, 2005 8:56:08 AM]

##### Share on other sites
I can confirm this is not homework.

To clear up things the question in as few words as possible is this:

"Given N real numbers, decide whether any subset sums to an integer."

**********************************

I posted the same question on Google Math.Symbolic and this is what they said:

Quote:
 The empty set and the full set are also subsets. Presumably you don't want them. > In >this case the answer is: > >{ 0.12345, 0.6789, 0.19765 } Total = 1.0 >or >{ 0.31415, 0.68585 } Total = 1.0 > Multiplying by 10^d where d is the largest number of digits after the decimal point, you can say equivalently: given N positive integers whose sum is divisible by 10^d, decide whether any nonempty proper subset has a sum divisible by 10^d. I'm pretty sure this problem is NP-complete. If so, no polynomial-time algorithm is known and it's unlikely that there is one. Robert Israel isr...@math.ubc.ca Department of Mathematics http://www.math.ubc.ca/~israel University of British Columbia Vancouver, BC, Canada *******************This is, more or less, a subset sum problem. One way to attempt these is via lattice reduction. (Closely related is to try to find small integer relations and hope that some use all ones as multipliers). For the case of what are called "low density" problems one will typically meet with success. Below is Mathematica code for a formulation due to Lagarias and Odlyzko. A related method, with a better encoding (handles higher density cases) is due to Schnorr and Euchner. One might want to fiddle with the "expon" parameter, or with the criterion that selects rows from the reduced lattice based on whether the first element is sufficiently close to zero. integerSubsetSum[v_] := Module[ {n=Length[v], v2=Rationalize[v], v3, expon=20, lat, redlat, relations}, v3 = Flatten[{10^expon, Table[0,{n}]}]; lat = Transpose[Join[ {Round[10^expon*v2]}, IdentityMatrix[n]]]; lat = Join[lat,{v3}]; redlat = LatticeReduce[lat]; relations = Map[Drop[#,1]&, Select[redlat, Abs[First[#]]<=10^(expon/5) && Max[Abs[Rest[#]]]==1 &]]; relations = Select[relations, Apply[And,Thread[#<=0]] || Apply[And,Thread[#>=0]]&]; Map[v*#&, Abs[relations]] /. 0->Sequence[] ] For your example: vec = {0.12345, 0.6789, 0.19765, 0.31415, 0.68585}; In[82]:= integerSubsetSum[vec] Out[82]= {{0.31415, 0.68585}, {0.12345, 0.6789, 0.19765}} Daniel Lichtblau Wolfram Research

I'm not sure I quite know what they're talking about but I think what they're saying is there's no quick way of doing it.

##### Share on other sites
From my (limimted) understanding of this problem, here is my solution.

Create a new vector, in which new_vector[i] = old_vector[i] mod 1

You then use simulated anneling to find the answer.
The ''goodness' is how far away it is from 1.
if it goes over 1, you restart.

This should be okish.

From,
Nice coder

##### Share on other sites
Quote:
 Original post by Nice CoderFrom my (limimted) understanding of this problem, here is my solution.Create a new vector, in which new_vector[i] = old_vector[i] mod 1You then use simulated anneling to find the answer.The ''goodness' is how far away it is from 1.if it goes over 1, you restart.This should be okish.From,Nice coder

It will fail to match if you have a set of, say, all 0.75s, like so:

{ 0.75 , 0.75 , 0.75 , 0.75 , 0.75 , 0.75 , 0.75 , 0.75 }

In this case, each half of the set adds up to 3.0, but adding any one number to anotherwill bust the 1.0 mark, at 1.5

That said, this concept in tandem with some other algorithms could just fit the bill.

##### Share on other sites
Hi all. Just wanted to say, that I eventually found the solution to this problem on the internet. The algorithm in question is called the LLL-algorithm and it finds approximate integer relations in polynomial time. I implemented it in my computer program for factorising polynomials and it does very well. For example it is good at factoring things like:

6*x+3*x^4+14+7*x^3+2*x^5+x^8 = (x^3+2)*(x^5+3*x+7)

Where the input is the roots of the polynomial multiplied by 1000 and the output is a matrix with 1's and 0's indicating which roots add up to integers.

I haven't quite solved the problem of factoring yet since with

x^8 + 1

although some pairs of roots add up to integers this doesn't necessarily lead to an integer factor so I'll have to add some supplementary conditions.

**************

Now most of all I need to find a way to recombine the roots in polynomial time. Because at the moment using the most naiive method, mutiplying this out:

(x-x0)*(x-x1)*(x-x2)*....(x-xn)

where x0,x1...xn are numbers takes at least 2^n sums even though the result has only n terms which is obviously not very efficient.

##### Share on other sites
This is just one form of the subset sum problem. It is one of the most trivial to describe np-complete problems.

##### Share on other sites

This topic is 4544 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Create an account

Register a new account

• ### Forum Statistics

• Total Topics
628651
• Total Posts
2984043

• 10
• 9
• 9
• 10
• 21