Convert X inputs into numerical output

11 comments, last by phi 14 years, 8 months ago
Hi everyone,

I may be overcomplicating things, but how would I go about taking X inputs and combining them all to create an 'output ID'? The key thing about the ID is that it should take all the inputs into account (preferably with weighting, such as having X1 slightly more important than X2) and produce a number that can be used to compare similarity.

For example:
A) Inputs 1,2,3 ---> <perform function> ---> 0123456
B) Inputs 1,2,4 ---> <perform function> ---> 0123457
C) Inputs 5,3,8 ---> <perform function> ---> 1354565

Notice that A and B are similar, so their outputs are closer to each other than either is to C's output.

Thank you for any help

[Edited by - phi on July 24, 2009 5:30:16 PM]
Depending on what you want to do, you may want to read about Hamming distances.
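
For reference, here is a minimal sketch of what a Hamming distance computes, applied to the example outputs from the original post. The function name `hamming_distance` is just an illustrative choice, and the sketch assumes the IDs are compared as equal-length digit strings.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


# Outputs A, B and C from the original post.
print(hamming_distance("0123456", "0123457"))  # 1 (A vs B: only the last digit differs)
print(hamming_distance("0123456", "1354565"))  # 7 (A vs C: every digit differs)
```

Note that this only measures how far apart two existing outputs are; it says nothing about how to generate them.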
Wow, thanks for the quick reply! The Hamming Distance seems like a good method of finding the 'distance' between the outputs, but how would I generate the outputs to use?
That depends on what you're trying to do.

1) Do the ID's have to be unique? If no, then the problem becomes easier.

2) Does the number of inputs vary during runtime? If yes, the problem seems to get a little harder. Comparison would also become complicated. How would you compare two strings of different sizes? Is "123" closer to "1234" or to "223"?
Hi,
Thanks for the reply.

1) The ID's do not have to be unique (although the fewer duplicates the better, to provide greater granularity)

2) For the time being, each input will have a fixed size, but the sizes differ from one input to the next.
e.g:
Input 1 will always be 4 digits long
Input 2 will always be 6 digits long
Input 3 will always be 3 digits long
etc..

Ahh, I think I misunderstood slightly, I thought all inputs were a single digit (or char).

My naive solution would be to build a string of fields, one for each input, separated by a delimiter, plus an extra field as a 'counter' that increases each time the same set of inputs is seen again. For example:

Inputs [1234],[654321],[987] ---> "1234.654321.987.0"
Inputs [5678],[123456],[789] ---> "5678.123456.789.0"
Inputs [1234],[654321],[987] ---> "1234.654321.987.1" <-- same as 1st input, increase counter
...
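
A minimal sketch of that idea, assuming the inputs arrive as digit strings. The class name `KeyBuilder` and its method `make_key` are hypothetical names made up for illustration.

```python
from collections import defaultdict


class KeyBuilder:
    """Builds delimiter-separated keys and appends a per-key repeat counter."""

    def __init__(self, delimiter: str = "."):
        self.delimiter = delimiter
        # How many times each combination of inputs has been seen so far.
        self._seen = defaultdict(int)

    def make_key(self, *fields: str) -> str:
        base = self.delimiter.join(fields)
        count = self._seen[base]
        self._seen[base] += 1
        return f"{base}{self.delimiter}{count}"


builder = KeyBuilder()
print(builder.make_key("1234", "654321", "987"))  # 1234.654321.987.0
print(builder.make_key("5678", "123456", "789"))  # 5678.123456.789.0
print(builder.make_key("1234", "654321", "987"))  # 1234.654321.987.1
```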

I'm sure there are more optimal solutions, such as using hash codes (a search should reveal lots of articles on hashing).

Also, I still feel like maybe I'm misunderstanding your problem (must be lack of sleep...)
lol I may not have explained it properly, although I think you've got it. I initially had the same idea as you did where I could just literally append everything together, but I felt this may not work given the nature of the task.
Essentially what I'm trying to achieve is the ability to "encode" a certain state into a number. Similar states would have similar numbers. A state is something that is made up of various inputs (Time, Lives, Health etc..). To simplify this processing I'm going to convert all the inputs into numerical values.
However, things get complicated when a previous state is also taken into account.
E.G. A previous state acts as an input to the new state which means that a certain number will be produced even if all other inputs are the same.

I'll take a look at hashing. I need to ensure that the hash function doesn't generate a random number using arbitrary salts, as this would not allow for similarity between states.

Thanks again for the help.


Quote:Original post by phi
I'm going to convert all the inputs into numerical values.

If you are truly turning it into a numerical value, then you will need a type of variable that can store enough bits of info (for the 3-input, 13-digit example you gave, a 32-bit integer wouldn't do). You would have to implement (or your language would have to provide) large/unlimited-digit numeric types. This is assuming that you need to avoid collisions (different states occasionally getting the same ID).

Or just use a string "" of digits (potentially sacrificing memory and speed).

Or find a perfect hash function for your inputs.
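
To make the size point concrete, here is a rough sketch that packs fixed-width fields into a single integer, using the widths phi gave (4, 6 and 3 digits). The function name `pack` is hypothetical. Python integers are arbitrary precision, so nothing overflows here; in a language with fixed-size integers, a 13-digit value is too big for 32 bits but does fit in 64 bits.

```python
WIDTHS = (4, 6, 3)  # digits per input, taken from phi's example


def pack(fields, widths=WIDTHS):
    """Concatenate fixed-width decimal fields into one integer.

    The first field ends up in the most significant digits, so a change
    in it moves the result far more than a change in the last field.
    """
    if len(fields) != len(widths):
        raise ValueError("expected one field per width")
    value = 0
    for field, width in zip(fields, widths):
        if not 0 <= field < 10 ** width:
            raise ValueError(f"{field} does not fit in {width} digits")
        value = value * 10 ** width + field
    return value


state_id = pack((1234, 654321, 987))
print(state_id)              # 1234654321987
print(state_id > 2**32 - 1)  # True: too large for a 32-bit integer
print(state_id < 2**63)      # True: fits in a signed 64-bit integer
```

Because the first field sits in the most significant digits, the numeric difference between two IDs is dominated by the earlier inputs. That gives a crude form of the weighting asked for in the original post, although a small change in the first input will always outweigh any change in the later ones.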

Quote:
although I think you've got it
...
A previous state acts as an input to the new state which means that a certain number will be produced even if all other inputs are the same.


Nope, now I'm confused again :p Isn't that what the 3rd line of this
Quote:
Inputs [1234],[654321],[987] ---> "1234.654321.987.0"
Inputs [5678],[123456],[789] ---> "5678.123456.789.0"
Inputs [1234],[654321],[987] ---> "1234.654321.987.1" <-- same as 1st input, increase counter

did?

Also, I have to ask, what are you trying to accomplish with all this? Perhaps there is an entirely different method that will provide a less complicated solution.


Hi,
Sorry for the late reply, I was unable to access the internet (but I survived! :D). For the time being, ignore any of my constraints mentioned in my other posts.
What I'm trying to do is to have a large number of 'states' and compare their similarities to a given state. So for example, a state consists of 'Lives, Time Played and Level' (although there will be a few more factors besides these three). Another state would consist of the same fields, and so I'd like to compare how similar the two are.
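
One possible reading of this, as a sketch only: treat each state as a tuple of numbers and compare states directly with a weighted distance, rather than encoding them into a single ID first. The function names and the weights below are assumptions made up for illustration; nothing in the thread specifies them.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two equal-length states."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("states and weights must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def most_similar(target, candidates, weights):
    """Return the candidate state closest to the target state."""
    return min(candidates, key=lambda s: weighted_distance(target, s, weights))


# Hypothetical weights: the first field matters most, the last matters least.
weights = (3.0, 2.0, 1.0)
print(most_similar((1, 2, 3), [(1, 2, 4), (5, 3, 8)], weights))  # (1, 2, 4)
```

A smaller distance means more similar states, and the weights control how much each field contributes.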
The slightly more confusing bit comes into a chain of states. I'd like the next state in the chain to incorporate the information from the previous state. To clarify: (read in columns)
1) State X1 = 10,30,50     1) State Y1 = 30,40,10     1) State Z1 = 1,2,3
2) State Z1 = 32,43,56     2) State Z1 = 36,16,75

As you can see, State Z1 would normally be 1,2,3 if it was the initial state. However, when preceded by something else, it changes. This way, I can then compare chains of states rather than individual ones. In case a chain is extremely long, I'd rather not iterate through each state and compare.
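
One way such a chain could be folded into a single comparable value, as a sketch under assumptions: blend each new state with a running summary of everything before it. The blending rule and the `carry` parameter are made up for illustration, and they do not reproduce the example numbers above (32,43,56 and 36,16,75), whose rule is not stated.

```python
def fold_state(previous_summary, state, carry=0.5):
    """Blend a new state with the summary of the chain that preceded it.

    carry is the fraction of the previous summary kept at each step
    (a hypothetical tuning parameter).
    """
    if previous_summary is None:
        # An initial state is its own summary.
        return tuple(float(v) for v in state)
    if len(previous_summary) != len(state):
        raise ValueError("all states in a chain must have the same length")
    return tuple(carry * p + (1.0 - carry) * s
                 for p, s in zip(previous_summary, state))


def summarize_chain(states, carry=0.5):
    """Fold a whole chain of states into one summary tuple."""
    summary = None
    for state in states:
        summary = fold_state(summary, state, carry)
    return summary


print(summarize_chain([(1, 2, 3)]))                # (1.0, 2.0, 3.0)
print(summarize_chain([(10, 30, 50), (1, 2, 3)]))  # (5.5, 16.0, 26.5)
print(summarize_chain([(30, 40, 10), (1, 2, 3)]))  # (15.5, 21.0, 6.5)
```

With this approach the summary can be updated once each time a state is appended, so comparing two chains later only means comparing their two summaries, however long the chains are. The trade-off is that different chains can end up with the same summary.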

Hope that's a bit clearer (although it's quite late, so I may be confusing things further lol)

Thanks
The solution to this kind of problem depends heavily on how you're going to use the data. So without any analogies or generalizations, what exactly do you hope to accomplish by converting these state combinations into values? If you're trying to rate players, there are easier ways to go about it :)

This topic is closed to new replies.
