Circa, a language with built-in learning

Started by pinacolada
8 comments, last by Kambiz 16 years, 6 months ago
Hey guys, I'd like to introduce a project that I've been working on for a while. It's a programming language called Circa. It's a dataflow-based language that features runtime code mutability and built-in learning/training. I think it has a lot of potential not just for AI, but for other things like procedural graphics. Here's a really simple example of the built-in learning.

a = 0.0?
b = (a + 2) * 3
train(b, target = 9)
When the program reaches the train() call, it asks itself: how can I change things so that b is 9? Then it works backwards and makes changes (only to values marked with a question mark). In this case it would instantly change 'a' to 1.0.

Internally, it accomplishes this by storing a dataflow view of the equations. There are three expression entities (internally called terms) in the code above: one for the 0.0, one for the addition, and one for the multiplication. The train() call tells the multiplication term that its output should be 9. This feedback continues backwards through the code in a way that's essentially the same as the backpropagation algorithm. For an example of how this functionality can be used in cool ways, I wrote the following demo code. Behold, the first working Circa program!
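
To make the mechanics concrete, here's a rough trace of that backward pass, written out in Java. This is just my illustration of the arithmetic, not Circa's actual code:

// Hypothetical trace of the backward pass for: b = (a + 2) * 3, train(b, target = 9).
// Each step inverts one operation to compute a target for its input.
double target = 9.0;      // the multiplication term is told its output should be 9
target = target / 3.0;    // so the multiplication's input (a + 2) should be 3
target = target - 2.0;    // so the addition's input a should be 1
double a = target;        // a is marked trainable ('?'), so it absorbs the change: a = 1.0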

g = null   # java graphics object (passed externally)

center_x = 80.0

# eyes
eye_y = 40.0?
eye_spread = 30.0?

function eye(x, y)
{
  draggable(x,y)
  circle(g, x, y, 10)
  circle(g, x, y, 1)
}

eye(center_x - eye_spread, eye_y) # left eye
eye(center_x + eye_spread, eye_y) # right eye

# mouth
mouth_y = 100.0
mouth_v_scalar = 30.0?
mouth_width = 50.0?
mouth_steps = 10

start_lines()
for step in range(-mouth_steps,mouth_steps+1)
{
  x = center_x + mouth_width * step / mouth_steps
  y = mouth_y + mouth_v_scalar * sqr(step / mouth_steps)

  draggable(x,y)
  line_to(g, x, y)
}
This code draws a simple smiley face, with two bulging eyes and a parabolic mouth. Here's what it looks like (a Java 1.5 applet).

The part that's cool is, you can drag around pieces of the face. The eyes can be moved around, and the smile can be stretched and inverted. And nowhere did I have to tell the system exactly how things should be dragged. I didn't tell it, for example, that moving one eye should move the other eye symmetrically. All I did was write strategically-placed question marks and make calls to draggable(). The rest is implied by the code.

Anyway, that's the concept. I'm planning to release the whole thing free and GPLed, but I don't think it's quite ready to be released yet. Right now it's at the proof-of-concept stage. Feedback, ideas for other applications, or requests for ports to the language of your choice are welcome!
Quote:Original post by pinacolada
I didn't tell it, for example, that moving one eye should move the other eye symmetrically.


No offense, but, yes, you did:

Quote:Original post by pinacolada
eye(center_x - eye_spread, eye_y) # left eye
eye(center_x + eye_spread, eye_y) # right eye


This explicitly defines the eyes as symmetrical about a vertical axis through (center_x, eye_y). In *any* language.

What you seem to be attempting to do is create a symbolic language, like Maple (it's also the symbolic package in MATLAB).
Maybe the eyeballs-being-symmetrical thing was a bad example. The point was, solving is implicit. I don't tell it how much it should change eye_spread, I just tell it how far I moved the mouse.

Maple's solver is definitely similar. The difference is in emphasis and expected usage. I don't know of any version of Maple that you can (or would want to) embed in your game, for example.
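
To illustrate what I mean, here's roughly what a drag gesture turns into, sketched in Java. The names CircaValue and Mouse are made up for illustration; this isn't Circa's actual implementation:

// Hypothetical sketch: dragging a point just trains its drawn coordinates
// toward the mouse; the solver decides how the upstream '?' values
// (eye_spread, eye_y, ...) must change to make that happen.
static void draggable(CircaValue x, CircaValue y, Mouse mouse) {
    if (mouse.isDragging()) {
        x.train(mouse.x());   // e.g. left eye: x = center_x - eye_spread
        y.train(mouse.y());   // feedback propagates back into eye_spread and eye_y
    }
}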
Quote:Original post by pinacolada
Maybe the eyeballs-being-symmetrical thing was a bad example. The point was, solving is implicit. I don't tell it how much it should change eye_spread, I just tell it how far I moved the mouse.

Maple's solver is definitely similar. The difference is in emphasis and expected usage. I don't know of any version of Maple that you can (or would want to) embed in your game, for example.


I'm pretty sure you can.

But my point is, you are trying to sell what you did as something else. From what you show here, you don't have "runtime code mutability", nor "built-in learning/training". What you have is a language that keeps expressions with variables in symbolic form, plus a function to evaluate those expressions.

It's still a great achievement. How do you handle regressions with multiple, infinite, or no solutions?
Ah I see, fair points. :)

This code is just one example. There are other features that are implemented but don't show up here, and lots of planned, unimplemented features.

For runtime code mutability, I don't have an example that shows this off, but I can talk about Circa's internal data structures and how they make it easy.

Code is stored in a dataflow format. Every expression is a Term; every Term has a list of inputs and a TermFunction. So this code sample:

a = sin(1)
b = sqr(a)

produces a dataflow graph that looks like this:

1 ---> sin ---> sqr

When you evaluate a Term, it calls its TermFunction, which stores a result in a local field called "output". If it uses inputs, it reads the "output" from each of its input terms. Each term is only evaluated once per pass.
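
In Java-ish code, the structure I'm describing looks something like this (a minimal sketch of the shapes involved, not Circa's actual source):

import java.util.List;

interface TermFunction {
    double apply(double[] inputs);   // pure function from input values to an output
}

class Term {
    List<Term> inputs;
    TermFunction function;
    double output;                   // cached result, read by downstream terms
    boolean evaluated;               // ensures each term runs only once per pass

    double evaluate() {
        if (evaluated) return output;
        double[] values = new double[inputs.size()];
        for (int i = 0; i < inputs.size(); i++)
            values[i] = inputs.get(i).evaluate();   // read each input's output
        output = function.apply(values);
        evaluated = true;
        return output;
    }
}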

This format is less efficient than compiling to bytecode (in particular, it causes a lot of unnecessary data duplication), but it's much easier to work with. If I want to change that sin to a cosine, all I have to do is find that Term and change one field. If I want to instead divide the result of sin() by 2, what I can do is:
1) create a '2' term
2) create a 'div' term, use the 'sin' and '2' terms as input
3) find every term that's using 'sin' as input, and change that input to the 'div' term

(I'm also planning on adding special functions & syntax that make these kinds of operations easy to do)
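
Using the sketch above, that rewrite is just a few lines. Here constant, apply, DIV, and findUsersOf are hypothetical helpers, not Circa's real API:

Term two = constant(2.0);                    // 1) create a '2' term
Term div = apply(DIV, sinTerm, two);         // 2) create a 'div' term with 'sin' and '2' as inputs
for (Term user : findUsersOf(sinTerm)) {     // 3) rewire every term that reads 'sin'
    int i = user.inputs.indexOf(sinTerm);
    user.inputs.set(i, div);
}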

In terms of built-in learning, I think it qualifies. I guess the smiley example is not that impressive, since it's only solving for one unknown at a time, but it can do more. The solving is done by backpropagation, with gradient descent applied at each function. So essentially, a Circa program is like an oddly-shaped neural network.

In terms of multiple/infinite/no solutions, it probably won't do very well. If there are multiple solutions it will just stop at the first one; if there are no solutions it will get as close as it can. The emphasis isn't really on it being a good general-purpose equation solver :)
Regression can be a tool used for learning, but learning is much more.

Anyway, what will happen with simple equations like this:

x = 1?
y = 2?
b = x*x;
c = x*0;
d = x*y;

train(b, target = -1) // no real solutions
train(b, target = 1)  // two solutions
train(c, target = 9)  // no solutions
train(c, target = 0)  // infinitely many solutions
train(d, target = 9)  // infinitely many solutions

I doubt piecewise parallel gradient descent will produce good results on average... not to bring you down, but think about it. If the system is under-constrained, you have an infinity of solutions, and your "training" is pretty much arbitrary. If the system is perfectly constrained and a unique solution exists, it might very well not settle on that solution. And if the system is over-constrained and no perfect solution exists, it doesn't seem like it will even look for a least-squares solution.
It might be interesting to expose this functionality through a C or Java interface or something similar, and then make the language a wrapper around that. I'm not entirely sure what this would look like, but I get the feeling that the machinery that makes the language work would be just as useful as the language itself.

How does it decide when to stop performing gradient descent?

I get a bit confused reading the program. I'm not used to the data flowing backwards like that. I guess it's similar to the confusion I feel when trying to read Prolog. Hmm... in some sense, maybe this is sort of a numerical Prolog?

Idea for a demo: a spline curve that is manipulated by dragging the curve itself instead of the control points (put the control points in and make them draggable too).

Looks interesting! Have you exposed the ability to "train" based on multiple example values? Otherwise, you should rename it to "solve". It might be interesting to expose that functionality at runtime too (if you haven't already).

Then you could learn new functions entirely based on dynamic examples rather than by fixed expressions. Though the problem I foresee is that a traditional language bound to machine learning libraries would expose more control over the learning process, and it's often necessary to do so to get good results...

Anyway, interesting idea. Is there a download somewhere?


Quote:Original post by Vorpy
It might be interesting to expose this functionality through a C or Java interface or something similar, and then make the language a wrapper around that. I'm not entirely sure what this would look like, but I get the feeling that the machinery that makes the language work would be just as useful as the language itself.


Yeah, I'm adding a lot of methods to let people manipulate a Circa program from the host language (which is currently Java). Embedability is definitely a priority.

Quote:
How does it decide when to stop performing gradient descent?


Right now it doesn't really know; it just does a single training pass with a learning rate of 1.0 (the default).

(Background information: the learning rate is a scalar representing how far we move from the "actual" output toward the "desired" output in a single pass. A learning rate of 1.0 means that we try to reach the "desired" output immediately.)

For these simple examples, this works great. For more complicated examples, it'll probably need to do multiple training passes, with a smaller learning rate. Right now you can specify the learning rate, but there's not (yet) any syntax to tell it to do multiple passes.
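
Written out, the per-pass update I'm describing is just this (illustrative only, using the fields from the earlier sketch):

// Move the trainable value a fraction 'lrate' of the way toward the target.
// lrate = 1.0 jumps straight to the desired output in a single pass;
// smaller rates need repeated passes but handle competing targets better.
output += lrate * (target - output);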

Quote:
I get a bit confused reading the program. I'm not used to the data flowing backwards like that. I guess it's similar to the confusion I feel when trying to read Prolog. Hmm... in some sense, maybe this is sort of a numerical Prolog?


Heh, could be. They are both forms of declarative programming.

Quote:
Have you exposed the ability to "train" based on multiple example values?


Yup, that's an example of something that needs multiple passes and a smaller learning rate. This code would work (although it would need a bunch of iterations):

a = 0.0?
b = a*a + 5.0?
train(a, target = 2, lrate = .1)
train(b, target = 4, lrate = .1)


Side note: I misspoke before when I said that the call to train() does the actual training pass. The train() call just adds to a "feedback" field on the term, so you can write lots of train() calls and all that feedback will accumulate. The actual training pass happens when the script finishes.
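
So the scheme is accumulate-then-apply; roughly like this in the earlier sketch's terms (the field names are my own, and whether feedback is averaged or summed is my assumption for illustration):

class TrainableTerm {
    double output;
    double feedbackSum;    // accumulated by every train() call
    int feedbackCount;

    void train(double target, double lrate) {
        feedbackSum += lrate * (target - output);   // just record; don't change anything yet
        feedbackCount++;
    }

    void applyTraining() {                          // runs once, when the script finishes
        if (feedbackCount > 0)
            output += feedbackSum / feedbackCount;  // assumption: average the accumulated feedback
        feedbackSum = 0;
        feedbackCount = 0;
    }
}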

Quote:
Though the problem I foresee is that a traditional language bound to machine learning libraries would expose more control over the learning process, and it's often necessary to do so to get good results...


Yeah, that's true. I'm trying to find a nice balance: I want to give the user as much control as they need, but also limit the options so that the language stays (relatively) simple to use. It won't be a one-size-fits-all solution to machine learning.

Quote:Is there a download somewhere?


Not yet. There are still a lot of bugs and incomplete areas; trying to use the current version would probably be an exercise in frustration. I'm currently getting it ready for a 0.1 release.
Quote:Original post by pinacolada
a = 0.0?
b = (a + 2) * 3
train(b, target = 9)

Your programming language has a nice syntax, but I can get the same functionality easily in any other language:
#include <iostream>
#include <cmath>

void train(double (*f)(), double target, double& x);

double a = 0.0;
double b;

double _b_()
{
    b = a*a + 0.5;
    return b;
}

int main()
{
    train(_b_, 9, a);
    std::cout << "a = " << a << std::endl; // a = 2.92108
    std::cout << "b = " << b << std::endl; // b = 9.03271
    return 0;
}

// Newton's method with a finite-difference derivative
void train(double (*f)(), double target, double& x)
{
    int maxSteps = 100;
    double eps = 0.01;
    double dx = 0.01;
    double f1, f2, df;
    for (int i = 0; (i < maxSteps) && (std::fabs(f() - target) > eps); i++)
    {
        f1 = f();
        x += dx; f2 = f(); x -= dx;
        df = (f2 - f1) / dx;
        x -= (f1 - target) / df;
    }
}

Is the train function much more than a numerical root finder?
If the equation is analytically solvable, I would use the analytical solution to get better performance and more accurate results; if it's not analytically solvable, I would prefer a well-known algorithm with guaranteed convergence. In a game it's important to know that the train function is fast and won't fail if the problem is solvable. I don't think this can be achieved by a single, general train function.
