can i predict future data using a data history?


16 replies to this topic

#1 rouncer   Members   -  Reputation: 291


Posted 02 August 2013 - 05:59 PM

An interesting program would be one that takes a data stream and stores it in a way that lets me compare it against the novel future and tells me how predictable the data was. In other words, it finds patterns in the data: it's a pattern detector that works on bits.

 

This could be interesting because the data could be anything. Would the exact same prediction technique work on any kind of data, no matter what it is?




#2 DJHoy   Members   -  Reputation: 357


Posted 03 August 2013 - 01:49 AM

To answer the question in the title: you can't really predict the future based on past patterns and history, but you can build probabilities of events happening if similar patterns emerge...

 

Basically, many stock predictors work on this model - some with surprising accuracy. However, there is no good way to account for unforeseen events, so you can only really build probabilities of what may occur. Of course, the more event data you have, the more accurate a model you can build.

 

Now, going about actually creating a program for this - you'll want to look into a number of AI techniques probably starting with neural network theory. 

 

Any specific ideas in terms of exactly what sort of data and/or events you are interested in? Or is it more or less just a curiosity?



#3 Álvaro   Crossbones+   -  Reputation: 12920


Posted 03 August 2013 - 01:52 AM

Google for "time series analysis" and "data compression": Both things involve finding patterns in a stream of data.

Of course different kinds of data call for different techniques. Like in most other realms, there is no silver bullet.

Edited by Álvaro, 03 August 2013 - 01:53 AM.


#4 Paradigm Shifter   Crossbones+   -  Reputation: 5254


Posted 03 August 2013 - 02:26 AM

It's not really AI either... it's statistics.

 

The accuracy of the predictions depends on how well the data fits the underlying assumptions (which will usually be a statistical model based on probability density functions).


"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#5 Álvaro   Crossbones+   -  Reputation: 12920


Posted 03 August 2013 - 02:43 AM

It's not really AI either... it's statistics.


Not to diminish the importance of statistics, but predicting what will happen next is most definitely within the realm of AI.

My own view of AI is that it consists of two things:
(1) Building a representation of the world, including how things are, how they behave and how they are likely to progress under different hypothetical scenarios.
(2) Selecting an action that maximizes the expected value of a utility function, based on the predictions set forth by (1).

I tend to think action selection is the core of intelligence, but other people (e.g. Jeff Hawkins) think prediction is the more important part.

#6 Makers_F   Members   -  Reputation: 728


Posted 03 August 2013 - 06:32 AM

Well, the answer to your title's question is simple: no.
If you have no guarantees about how the data was collected and what phenomenon you are observing, there is no reason why past data should describe a future event or a model.
Since you do not have a model, you cannot use any method to infer future events.
You might notice a pattern in which, whenever a bird sings, a dolphin in the ocean is jumping. But since these two actions are not correlated, using the pattern will give you a wrong (or rather, unreliable) prediction.
If you know the domain of the problem, however, you can start to build some rules (there are already tools to do this: search for "data mining"; some programs do exactly what you want, they find patterns in data).

TL;DR
Without knowledge of the domain you are observing, you can find patterns, but they do not mean anything and you cannot safely use them to infer future events.
(Note: the data you are looking at may actually belong to a specific domain with some patterns, even if you don't know it. In that case the patterns would predict future events, but since you do not know that the data belongs to a specific domain, you should not rely on them.)

@Alvaro: I'm not well informed about data compression, but I think the goal is to find the most common patterns and "replace" them with shorter identifiers. At least in the old days. It is not about predicting data that is not present. Am I right?


Edited by Makers_F, 03 August 2013 - 06:33 AM.


#7 Álvaro   Crossbones+   -  Reputation: 12920


Posted 03 August 2013 - 10:46 AM


@Alvaro: I'm not well informed about data compression, but I think the goal is to find the most common patterns and "replace" them with shorter identifiers. At least in the old days. It is not about predicting data that is not present. Am I right?

 

One way to think about compression is: given the sequence up to a certain point, what is the probability distribution of the next symbol? If you can answer that question (probabilistic prediction of the next symbol) well, you can use arithmetic coding to convert the whole string into one whose length approaches the entropy of the original string, which is the best you can do in compression.

 

Some people view the link between compression and intelligence as being very close: http://prize.hutter1.net/hfaq.htm#compai
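As a minimal sketch of that prediction view (not Álvaro's own code, just an illustration): an order-1 Markov model estimated from the history assigns a probability distribution to the next symbol, which an arithmetic coder could then consume. The sharper (lower-entropy) those distributions are, the shorter the coded output.

```python
from collections import Counter, defaultdict

def next_symbol_probs(history):
    """Estimate P(next symbol | previous symbol) from the history (order-1 Markov)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(history, history[1:]):
        counts[prev][nxt] += 1
    following = counts[history[-1]]          # counts of what followed the last symbol
    total = sum(following.values())
    return {sym: c / total for sym, c in following.items()}

# In 'abababab', 'b' was always followed by 'a', so the model is certain:
print(next_symbol_probs("abababab"))  # {'a': 1.0}
```

A real compressor would smooth these counts (so unseen symbols get nonzero probability) and use longer contexts, but the core idea is the same.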



#8 brucedjones   Members   -  Reputation: 441


Posted 03 August 2013 - 09:34 PM

I think a neural network is your best bet for something general. A neural network could easily be trained for many different types of data.

 

Training is the key, however. You couldn't just show it a totally new type of data and expect its answer to be reliable without first training on that type of data.



#9 AngleWyrm   Members   -  Reputation: 554


Posted 04 August 2013 - 03:08 PM

storing [the data stream] in a way [to] tell me how predictable it was.

 

Principal Component Analysis is the process of examining a set of data and determining its most salient features. It is an algorithm: an automatic procedure that rotates the coordinate system to align with the directions along which the data is most spread out.

 

The axes of the resulting new coordinate system represent measured, as-yet-unnamed features of that data.
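A rough sketch of that procedure (using NumPy, with made-up data purely for illustration): diagonalising the covariance matrix of the centered data yields the rotated axes, ordered by how much variance each captures.

```python
import numpy as np

def principal_components(data):
    """Return variances and axes of the rotated coordinate system, largest first."""
    centered = data - data.mean(axis=0)        # PCA works on mean-centered data
    cov = np.cov(centered, rowvar=False)       # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder: biggest variance first
    return eigvals[order], eigvecs[:, order]

# Synthetic 2D points scattered along the line y = x, plus a little noise:
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
points = np.hstack([t, t + 0.1 * rng.normal(size=(200, 1))])
variances, axes = principal_components(points)
# The first axis carries almost all the variance and points roughly along (1, 1).
```

The eigenvalues are the "how predictable along this direction" numbers: a tiny second eigenvalue here says the data is essentially one-dimensional.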


Edited by AngleWyrm, 04 August 2013 - 03:17 PM.

--"I'm not at home right now, but" = lights on, but no ones home

#10 Alistair Sheehy Hutton   Members   -  Reputation: 133


Posted 05 August 2013 - 07:38 AM

Google for "time series analysis" and "data compression": Both things involve finding patterns in a stream of data.

Of course different kinds of data call for different techniques. Like in most other realms, there is no silver bullet.

 

And don't try to second-guess what complexity of solution you will need - start simple and work your way up as needed.  I did some work predicting the results of rugby matches, and a linear sum of 6 variables (3 per team) was enough to get results comparable with the bookmakers'.



#11 Álvaro   Crossbones+   -  Reputation: 12920


Posted 05 August 2013 - 12:06 PM

Oh, Alistair's post reminds me of something I read in the book "Thinking, Fast and Slow", where Kahneman proposes using a sum of simple terms that are easy to measure when trying to predict something. The examples from the book sound exactly like what Alistair described with rugby matches.

 

If you have a lot of data, you can try to estimate weights for each term. If you have a metric shit ton of data, you can get much fancier with non-linear schemes (e.g., neural networks). There is a whole subject called Machine Learning about how to make predictions when you have an abundance of data. But Kahneman says that something simple like the sum often works just fine, and I think he might be right.
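A quick sketch of that "estimate weights from data" step, on synthetic data with made-up weights (just to show the mechanics): ordinary least squares recovers the weights of a linear sum of terms from noisy observed outcomes.

```python
import numpy as np

rng = np.random.default_rng(1)
true_w = np.array([3.0, -2.0, 1.5])            # weights we pretend not to know
X = rng.normal(size=(500, 3))                  # three easy-to-measure terms per example
y = X @ true_w + 0.1 * rng.normal(size=500)    # noisy linear outcome

w, *_ = np.linalg.lstsq(X, y, rcond=None)      # estimate the weights from the data
# w recovers true_w to within the noise level
```

With only a handful of examples the estimates get shaky, which is exactly when Kahneman's equal-weight (plain sum) scheme tends to hold up better.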



#12 Alistair Sheehy Hutton   Members   -  Reputation: 133


Posted 06 August 2013 - 10:13 AM

I am genuinely proud of myself for starting as "stupid" as possible.  I started with a single variable per team and worked my way up.



#13 AngleWyrm   Members   -  Reputation: 554


Posted 08 August 2013 - 11:40 PM

I am genuinely proud of myself for starting as "stupid" as possible.  I started with a single variable per team and worked my way up.

Good for you; people frequently mistake quantity of variables for resolution -- usually with terrible results.


--"I'm not at home right now, but" = lights on, but no ones home

#14 Paradigm Shifter   Crossbones+   -  Reputation: 5254


Posted 09 August 2013 - 08:18 PM

Although there are methods in statistics to detect correlation between variables so you can remove them/make one dependent upon the other.


"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#15 AngleWyrm   Members   -  Reputation: 554


Posted 09 August 2013 - 11:25 PM

Although there are methods in statistics to detect correlation between variables so you can remove them...

Can you give more details on this? It's a great idea.

 

I've seen a similar concept with boolean mathematics, used to factor out any random collection of IF/AND/OR logic and distill it down to the bare minimum that achieves the same input/output states.


Edited by AngleWyrm, 09 August 2013 - 11:27 PM.

--"I'm not at home right now, but" = lights on, but no ones home

#16 Paradigm Shifter   Crossbones+   -  Reputation: 5254


Posted 10 August 2013 - 03:17 AM

Sure, have a look here

 

http://en.wikipedia.org/wiki/Correlation_and_dependence

 

Calculating the covariance between 2 variables is probably a good start; covariance is similar to the dot product of 2 vectors, but for statistical observations instead.

 

http://en.wikipedia.org/wiki/Covariance

 

(check out the "calculating the sample covariance" section for how to compute it).
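For concreteness, the sample formulas from those pages look like this in plain Python (no libraries needed):

```python
def sample_covariance(xs, ys):
    """Unbiased sample covariance: summed co-deviations from the means, over n - 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance normalised by both standard deviations."""
    sx = sample_covariance(xs, xs) ** 0.5
    sy = sample_covariance(ys, ys) ** 0.5
    return sample_covariance(xs, ys) / (sx * sy)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]               # ys = 2 * xs: perfectly correlated
print(correlation(xs, ys))          # ≈ 1.0
```

A correlation near +1 or -1 is the signal that one variable is (linearly) redundant and can be dropped or expressed in terms of the other.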


"Most people think, great God will come from the sky, take away everything, and make everybody feel high" - Bob Marley

#17 wodinoneeye   Members   -  Reputation: 748


Posted 13 August 2013 - 03:36 AM

You can predict (or rather, the program can), but whether it will do so reliably and accurately is the question.

 

You will have to spot/identify particular indicators in the data (factoring the data), which is in itself a difficult problem. Patterns of these factors will then need to be assembled.

 

Assume that there is a sequence of cause and effect - clues/patterns that lead to some subsequent occurrence (which you are trying to predict, so as to be ready to counter or handle it).

 

Building up this predictive knowledge of likely patterns to recognize later is a problem. Training data is needed (bracketing the problem space you are trying to solve for). Guidance by a human is usually needed to tell the system what is relevant (even in a self-training system, you first need to tell it what the good/bad results are, or which cases to look for).

 

Any pattern not previously seen, or one that conflicts with known patterns, will make the prediction questionable. Thus an extensive set of cases often has to be built up, even for simple systems being evaluated/predicted.


--------------------------------------------Ratings are Opinion, not Fact



