RFC: Tangent Programming Language

This topic is 2049 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hey everyone. I have gone through and got the specification ([url="http://www.assembla.com/spaces/tangent-lang/documents/dykAuAHa4r4BviacwqjQWU/download/dykAuAHa4r4BviacwqjQWU"]doc[/url], [url="https://docs.google.com/document/d/1kKOUvMF9ZwJqBA8tEdeuHA-xIIposvzogyPN-KydGI4/edit"]html[/url]) done to a point where I can (re)start work on it based on firm-ish requirements.

Some of the details around code generation and base framework interoperability are not there; mostly because I need to prototype things to determine what they should be. There's also a smattering of TODOs, but quite a bit of the core idea as well as details are there.

At this point I'm looking for more 'meaty' commentary rather than document critique. [b]I want to know what you think.[/b] What do you like? What do you hate? Why?

I understand that the document is lengthy and pretty dry, and I cannot overstate my appreciation for any feedback you can provide. I will look to answer any questions or provide any examples that come up.

Not really a comment or question on the language itself, but was implementing a structurally typed language on top of .NET particularly difficult? It doesn't seem like a natural fit.

The language defaults to structural typing, but emulates .Net style typing via tagging. If you don't have the tag (that can only be provided by explicit inheritance) then the structure doesn't match and the subtyping doesn't validate.
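
To make the tagging idea concrete, here is a minimal sketch in Python rather than Tangent. Everything in it is an assumption made for illustration: the [font=courier new,courier,monospace]StructuralType[/font] model, the [font=courier new,courier,monospace]inherit[/font] helper and the [font=courier new,courier,monospace]tag:Stream[/font] naming are hypothetical and are not taken from the specification. It only shows why a type with the right members still fails the subtype check when it lacks a tag that explicit inheritance alone can provide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StructuralType:
    """Hypothetical model: a type is a set of member names plus a set of tags."""
    name: str
    members: frozenset
    tags: frozenset = field(default_factory=frozenset)


def inherit(name, base, extra_members=()):
    """Explicit inheritance: the only way (in this model) to receive the base's tags."""
    return StructuralType(name, base.members | frozenset(extra_members), base.tags)


def is_subtype(candidate, target):
    """Structural check: candidate needs every member and every tag of target."""
    return target.members <= candidate.members and target.tags <= candidate.tags


# A nominal-style type carries a tag (the tag name is an illustrative convention).
stream = StructuralType("Stream", frozenset({"read", "close"}), frozenset({"tag:Stream"}))
file_stream = inherit("FileStream", stream, {"seek"})

# Same members as FileStream, but declared independently, so it has no tag.
lookalike = StructuralType("Lookalike", frozenset({"read", "close", "seek"}))

# A purely structural type has no tags, so matching members are sufficient.
readable = StructuralType("Readable", frozenset({"read"}))

print(is_subtype(file_stream, stream))  # True: members and tag both present
print(is_subtype(lookalike, stream))    # False: members match, tag is missing
print(is_subtype(lookalike, readable))  # True: plain structural match
```

In this model the tag is just one more piece of structure, which is why nominal-style checks fall out of the same subset test.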

So importing .Net types isn't too terribly bad since their types can be expressed in Tangent types. The [i]opposite[/i] will likely be dodgy. Tangent types will likely go into .Net as either DynamicObject or some internal structure that loses type safety if consumed by other .Net code. Since the goal is to not have to write my own standard library, that's good enough.

Writing the type system in C# isn't particularly difficult, but I've not gotten programs complex enough to know if doing so [i]performantly [/i]is particularly difficult.

Always good to see updates on Tangent!

I like the [i]idea[/i] of the phrase system from a theoretical perspective, although I have itchy reservations about whether or not it would really scale. I'm curious to see more programs written using extensive phrase style, especially beyond trivial demos. For instance, a couple of examples from Rosetta Code would be cool, just to see how Tangent shapes up against other syntactic philosophies.

I'd also love to see an example of literate programming or a good DSL implemented in Tangent, even if it's simple and theoretical, just to get an idea for how it works in real code. Again I really like the concept but it's hard (for me at least, being neck-deep in... ahem... "other" languages) to see how it would work on a substantial scale.


How much of this is implemented, by the way?

Do your IEnumerable<T> functions allow both 'yield' and simple forms? For example in C#, you can do either of the following:

[CODE]List<T> memberlist;

public override IEnumerable<T> GetMembers()
{
    return memberlist;
}
[/CODE]

or:

[CODE]
foreach (var member in memberlist)
    yield return member;
[/CODE]

C# decides between them by seeing if the function contains any 'yield' statements whatsoever. Edited by Nypyren
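
For comparison, Python applies a similar rule: a function becomes a generator as soon as its body contains a [font=courier new,courier,monospace]yield[/font]. The sketch below is an analogy in Python, not C# or Tangent, and the names [font=courier new,courier,monospace]get_members_simple[/font] and [font=courier new,courier,monospace]get_members_lazy[/font] are made up for the illustration.

```python
import inspect

member_list = ["a", "b", "c"]


def get_members_simple():
    # Simple form: hand back the existing collection as-is.
    return member_list


def get_members_lazy():
    # 'yield' form: the mere presence of yield makes this a generator function.
    for member in member_list:
        yield member


print(inspect.isgeneratorfunction(get_members_simple))   # False
print(inspect.isgeneratorfunction(get_members_lazy))     # True
print(list(get_members_lazy()) == get_members_simple())  # True
```

Both forms produce the same elements; only the second defers the work until the caller iterates.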

Have you considered implementing the structural type system by using multiple interface inference? This may let you interop more smoothly with other .Net languages.

Approaching a structural typing system from C# perspective, it seems like you could say:[list]
[*]For each class, generate a corresponding interface providing access to all public properties and functions.
[*]Whenever one interface's members are a superset of another interface's members, the superset interface derives from the subset interface.
[*]No class ever derives from another class; only from interfaces.
[*]All fields are properties so that they play nicely with interfaces.
[/list]
I only briefly skimmed the document and haven't considered this kind of type system in depth before, so you probably know cases where your language features would not work with this kind of underlying implementation. Edited by Nypyren
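
The first two rules in the list above can be sketched roughly as follows. This is a Python model under stated assumptions, not .NET code: classes are reduced to sets of public member names, the [font=courier new,courier,monospace]I[/font] prefix and the [font=courier new,courier,monospace]infer_interfaces[/font] function are hypothetical, and member signatures are ignored entirely.

```python
def infer_interfaces(classes):
    """Map each generated interface to the interfaces it would derive from.

    classes: dict of class name -> set of public member names.
    An interface derives from another when its member set is a strict
    superset of the other's (identical member sets are left unrelated here).
    """
    interfaces = {f"I{name}": frozenset(members) for name, members in classes.items()}
    bases = {}
    for name, members in interfaces.items():
        bases[name] = sorted(
            other
            for other, other_members in interfaces.items()
            if other != name and other_members < members
        )
    return bases


classes = {
    "Reader": {"Read"},
    "Stream": {"Read", "Write"},
    "File": {"Read", "Write", "Seek"},
    "Logger": {"Log"},
}

result = infer_interfaces(classes)
for interface, derived_from in result.items():
    print(interface, "derives from", derived_from)
```

Note that the comparison is quadratic in the number of classes, which may matter once a large framework is imported.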

[quote name='ApochPiQ' timestamp='1336186833' post='4937532']
For instance, a couple of examples from Rosetta Code would be cool, just to see how Tangent shapes up against other syntactic philosophies.
[/quote]

Yup, I hope to use Rosetta Code to get some examples done (rather, use them as goals for getting features done). Though it is likely some time off to have them done and working. Tangent as of two (three?) iterations ago had a number of [url="http://www.assembla.com/wiki/show/tangent-lang/ExampleCode"]working examples[/url] including Rosetta Code and a small DSL. The DSL will be shorter and probably designed differently now, and the syntax has changed a fair bit; I might go through tomorrow and do some Rosetta Code instances to get a better feel for writing the code (and to provide more/better examples).

[quote name='ApochPiQ' timestamp='1336186833' post='4937532']
I'd also love to see an example of literate programming or a good DSL implemented in Tangent, even if it's simple and theoretical, just to get an idea for how it works in real code.
[/quote]

Me too! An update to the simple dice DSL seems warranted.

[quote]
How much of this is implemented, by the way?
[/quote]

The current iteration is implemented through to code generation, but has a bug in code gen that is beyond me due to the wild complexity in it. .Net importing does not work, nor do a lot of the built-in operations. You can basically declare types and invoke phrases (though passing in parameters doesn't work).

Now that the requirements are done and stabilized in a form I can refer back to in a more reliable way than memory, I'm hoping to salvage bits and start in on another iteration with greater focus on automated testing and 'release-able' steps of work.

Basically, not much.

[quote name='Nypyren' timestamp='1336188212' post='4937536']
Do your IEnumerable functions allow both 'yield' and simple forms?
[/quote]

Yes, it's not in the table of contents, but there's a section 'Enumerable Return Context' that specifies that. For methods that return IEnumerable<T> (or whatever sequence type the base framework provides) you can do:[list=1]
[*][font=courier new,courier,monospace]yield T[/font] - which behaves as yield return T in C#
[*][font=courier new,courier,monospace]return[/font] - which behaves as yield break in C#
[*][font=courier new,courier,monospace]return IEnumerable<T>[/font] - which basically does foreach element in param { yield element } and then yield break.
[/list]
The document is written in such a way that methods that return IEnumerable will always be lazy/yielding and I might have to fix that.
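
As a rough analogy for the three forms listed above, here is how they might look with Python generators. This is not Tangent or C# code, and the function names are invented for the example; it only illustrates yielding one element, ending the sequence early, and forwarding every element of another sequence.

```python
def countdown(n):
    while True:
        if n == 0:
            return   # like a bare 'return': ends the sequence (C# yield break)
        yield n      # like 'yield T': produce one element (C# yield return)
        n -= 1


def prefix_then(prefix, rest):
    for item in prefix:
        yield item
    # Rough analog of 'return IEnumerable<T>': yield every element of the
    # given sequence, then end the sequence.
    yield from rest
    return


print(list(countdown(3)))                    # [3, 2, 1]
print(list(prefix_then([1, 2], [3, 4])))     # [1, 2, 3, 4]
```

Both functions are lazy: nothing in their bodies runs until the caller starts iterating, which matches the always-lazy behaviour described above.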

[quote]
Have you considered implementing the structural type system by using multiple interface inference?
[/quote]

Sure. I've even seen some libraries that do reflection to automagically generate new classes that suit the required interface. That is likely the route I'd look for (and might yet go). Right now I'd be thrilled to get it into a state where vaguely-non-trivial things could be written in it to see where the phrase + order of operation inference idea falls down. .Net interop with Tangent code is [i]way[/i] down there in the nice to haves. Edited by Telastyn

[quote name='ApochPiQ' timestamp='1336186833' post='4937532']
For instance, a couple of examples from Rosetta Code would be cool, just to see how Tangent shapes up against other syntactic philosophies.
[/quote]

Got some of the Rosetta Code basics done, even if they're not particularly thrilling (though they did help point out that I forgot/ignored generic methods):

[url="https://docs.google.com/document/d/1AA2EtqlQtwa4Nxw0GxNlIaXtUquBxTYTvUqdvLj6qFw/edit"]Basics[/url]
[url="https://docs.google.com/document/d/12XfNGS_mj-j_86Ns9ePqSHYodDqET_4eyRDXy-GaHT8/edit"]Functions and SubRoutines[/url]

Will aim to do either the dice DSL from the example above or perhaps a quick port of the parser stuff I have floating around. Might be later tonight, but more likely sometime in the week.

[quote]
I'd also love to see an example of literate programming or a good DSL implemented in Tangent, even if it's simple and theoretical, just to get an idea for how it works in real code.
[/quote]

I had a little time over the weekend, so I threw together enough code to do the simple calculator parsing example that was running around the forums earlier. First, the actual calculator as it looks once the library is written. I've actually done two examples to show the relative ease with which the syntax can be adjusted to be more symbolic or more verbose while using the same code under the hood.

There's no compiler, so there might be typos/bugs; and since this is the first non-trivial code I've actually written in the new language, things here might turn out to be idiomatically bad once it's been used more.

[i]Symbolic example[/i]
[source]
namespace TangentParsingFramework.CalculatorExample {
    input: parsing rule = whitespace expression whitespace eof;

    expression: parsing rule =
        whitespace factor ((whitespace ("+"|"-") whitespace factor)*);

    factor: parsing rule =
        whitespace term ((whitespace ("*"|"/") whitespace term)*);

    term: parsing rule =
        "(" whitespace expression whitespace ")"
        | "-" whitespace term
        | digit+;

    digit: parsing rule =
        "0"
        | "1"
        | "2"
        | "3"
        | "4"
        | "5"
        | "6"
        | "7"
        | "8"
        | "9";

    whitespace: parsing rule = (" "|"\t"|"\r"|"\n")*;
}
[/source]

[i]Verbose example[/i]
[source]
namespace TangentParsingFramework.VerboseCalculatorExample {
    (a: ~> parsing rule) followed by (b: ~> parsing rule) => parsing rule {
        return (input: string)(index: int) => parsing rule {
            return a whitespace* b;
        };
    };

    input: parsing rule = whitespace?
        followed by expression
        followed by whitespace?
        followed by eof;

    expression: parsing rule = whitespace?
        followed by factor
        followed by zero or more
            (("+" or "-") followed by factor);

    factor: parsing rule = whitespace?
        followed by term
        followed by zero or more
            (("*" or "/") followed by term);

    term: parsing rule =
        "(" followed by expression followed by ")"
        or "-" followed by term
        or one or more digit;

    digit: parsing rule = "0" or "1" or "2" or "3" or "4" or "5" or "6" or "7" or "8" or "9";

    whitespace: parsing rule = " " or "\t" or "\r" or "\n";
}
[/source]

The code that defines this syntax comes in two basic parts: one defines a parsing result, and the other a parsing rule interface (and its implementations):

[i]parsing result[/i]
[source]
namespace TangentParsingFramework {
parsing result => abstract class {
(this).parsed text => string;
(this).start index => int;
(this).end index => int;
(this).parsing rule => ~> parsing rule;
(this) is successful => bool;
children of (this) => IEnumerable<parsing result>;
};

(rule: ~> parsing rule) parses (input: string)
from (start: int) to (end: int)
via (children: IEnumerable<parsing result>)
=> class {

(this).parsed text => string {
return input.Substring(start, end-start);
};
(this).start index => int { return start; };
(this).end index => int { return end; };
(this).parsing rule => ~> parsing rule { return rule; };
(this) is successful => bool { return true; };
children of (this) => IEnumerable<parsing result> { return children; };
};

(rule: ~> parsing rule) parses (input: string)
from (start: int) to (end: int)
=>
rule parses input from start to end
via Enumerable.Empty<parsing result>();

(rule: ~> parsing rule) does not parse (input: string)
at (start: int)
via (children: IEnumerable<parsing result>)
=> class {

(this).parsed text => string { return ""; };
(this).start index => int { return start; };
(this).end index => int { return start; };
(this).parsing rule => ~> parsing rule { return rule; };
(this) is successful => bool { return false; };
children of (this) => IEnumerable<parsing result> { return children; };
};

(rule: ~> parsing rule) does not parse (input: string)
at (start: int)
=> rule does not parse input
at start via Enumerable.Empty<parsing result>();
}
[/source]

[i]parser rules[/i]
[source]
namespace TangentParsingFramework {

parsing rule => abstract goose class {

parse (input: string) at (index: int) with (this) => parsing result;
parse (input: string) with (this) => parsing result {
return parse input at 0 with this;
};

(this) (following rule: parsing rule) => parsing rule {
return this followed by following rule;
};
};

(parser: string -> int -> parsing result) => parsing rule {
return parsing rule with class {
parse (input: string) at (index: int) with (this) => parsing result {
return parser(input)(index);
};
};
};

literal (target literal: string) => parsing rule {
return (input: string) (index: int) => parsing result {
if input.Substring(index, target literal.Length) = (target literal)
then return this parses input
from index to index + target literal.Length
else return this does not parse input at index;
};
};

(target literal: string) => parsing rule {
return literal(target literal);
};

(first: ~> parsing rule) followed by (second: ~> parsing rule) => parsing rule {
return (input: string)(index: int) => parsing result {
first result: parsing result = parse input at index with first;
if first result is not successful then return first result;
second result: parsing result =
parse input at (first result.end index) with second;

results: List<parsing result> = default List<parsing result>;
results.Add(first result);
results.Add(second result);

if second result is successful then
return this parses input from index to (second result.end index)
via results
else
return this does not parse input at index via results;
};
};

optional parser for (rule: ~> parsing rule) => parsing rule with class {
parse (input: string) at (index: int) with (this) => parsing result {
result: parsing result = parse input at index with rule;
if result is successful
then return result
else return this parses input from index to index;
};
};

(rule: ~> parsing rule)? => parsing rule {
return default optional parser for rule;
};

series parser for (rule: ~> parsing rule) => parsing rule with class {
parse (input: string) at (index: int) with (this) => parsing result {
results: List<parsing result> = default List<parsing result>;
working index: int = index;
result: parsing result = parse input at working index with rule;

while result is successful {
results.Add(result);
working index = result.end index;
result = parse input at working index with rule;
};

if results.Any()
then return this parses input
from index to (results.Last().end index) via results
else return this does not parse input at index;
};
};

(rule: ~> parsing rule)+ => parsing rule {
return default series parser for rule;
};

one or more (rule: ~> parsing rule) => parsing rule {
return rule+;
};

(rule: ~> parsing rule)* => parsing rule {
return (rule+)?;
};

zero or more (rule: ~> parsing rule) => parsing rule {
return rule*;
};

parser for either (a: ~> parsing rule) or (b: ~> parsing rule)
=> parsing rule with class {

parse (input: string) at (index: int) with (this) => parsing result {
first result: parsing result = parse input at index with a;
second result: parsing result = parse input at index with b;

if first result is not successful and second result is not successful
then {
results: List<parsing result> = default List<parsing result>;
results.Add(first result);
results.Add(second result);
return this does not parse input at index via results;
};

if first result is successful and second result is not successful
then return first result;

if first result is not successful and second result is successful
then return second result;

if first result.end index >= second result.end index
then return first result
else return second result;
};
};

(a: ~> parsing rule) or (b: ~> parsing rule) => parsing rule {
return default parser for either a or b;
};

(a: ~> parsing rule) | (b: ~> parsing rule) => parsing rule {
return a or b;
};

end of input parser => parsing rule with class {
parse (input: string) at (index: int) with (this) => parsing result {
if index >= input.Length
then return this parses input from index to index
else return this does not parse input at index;
};
};

eof => parsing rule {
return default end of input parser;
};
}
[/source] Edited by Telastyn
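
For readers who want to try the combinator idea today, here is a small runnable sketch of the same structure in Python. It is a loose translation, not the Tangent library: the names [font=courier new,courier,monospace]ParseResult[/font], [font=courier new,courier,monospace]literal[/font], [font=courier new,courier,monospace]followed_by[/font], [font=courier new,courier,monospace]either[/font] and [font=courier new,courier,monospace]one_or_more[/font] are chosen for the example, rules are plain functions instead of classes, and the zero-width guard in [font=courier new,courier,monospace]one_or_more[/font] is an addition that the code above does not have.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ParseResult:
    success: bool
    start: int
    end: int
    children: List["ParseResult"] = field(default_factory=list)


Rule = Callable[[str, int], ParseResult]


def literal(target: str) -> Rule:
    def parse(text: str, index: int) -> ParseResult:
        if text.startswith(target, index):
            return ParseResult(True, index, index + len(target))
        return ParseResult(False, index, index)
    return parse


def followed_by(first: Rule, second: Rule) -> Rule:
    def parse(text: str, index: int) -> ParseResult:
        a = first(text, index)
        if not a.success:
            return a
        b = second(text, a.end)
        if b.success:
            return ParseResult(True, index, b.end, [a, b])
        return ParseResult(False, index, index, [a, b])
    return parse


def either(a: Rule, b: Rule) -> Rule:
    def parse(text: str, index: int) -> ParseResult:
        ra, rb = a(text, index), b(text, index)
        if not ra.success and not rb.success:
            return ParseResult(False, index, index, [ra, rb])
        if not rb.success:
            return ra
        if not ra.success:
            return rb
        # Both succeeded: prefer the longer match, first rule wins ties.
        return ra if ra.end >= rb.end else rb
    return parse


def one_or_more(rule: Rule) -> Rule:
    def parse(text: str, index: int) -> ParseResult:
        results, position = [], index
        r = rule(text, position)
        # The 'r.end > position' guard stops a zero-width match from looping forever.
        while r.success and r.end > position:
            results.append(r)
            position = r.end
            r = rule(text, position)
        if results:
            return ParseResult(True, index, position, results)
        return ParseResult(False, index, index)
    return parse


# Build digit = "0" | "1" | ... | "9", then number = digit+.
digit = literal("0")
for d in "123456789":
    digit = either(digit, literal(d))
number = one_or_more(digit)
addition = followed_by(number, followed_by(literal("+"), number))

result = addition("12+345", 0)
print(result.success, result.start, result.end)  # True 0 6
print(addition("12+", 0).success)                # False
```

As in the Tangent version, [font=courier new,courier,monospace]either[/font] runs both alternatives and keeps the longer match, which costs more than ordered choice but avoids depending on the order in which alternatives are written.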

Out of curiosity - do you use the semicolons for making parsing simpler, or for aesthetic reasons? I might be a rare exception, but I dislike having them all over the place. I tried copying and pasting the code examples into notepad and removing the semicolons and IMHO it looks a lot cleaner.

Aside from that, I kind of like the look of it. Will be interested to continue following developments.

[quote name='ApochPiQ' timestamp='1336418620' post='4938154']
Out of curiosity - do you use the semicolons for making parsing simpler, or for aesthetic reasons?
[/quote]

For Tangent, it makes the parsing considerably simpler. The order of operation inference works on the whole statement, so it needs to find the 'correct' interpretation from all of the different permutations that the sequence of terms could form. It likely doesn't need to be a brute-force search, but as soon as you get into 'permutations of N', problems get intractable in a hurry. The semicolons act as a hard separator between statements, so the size of N is at least known.
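
A back-of-the-envelope illustration of why the separator helps, written in Python. The N! figure is only a worst-case model assumed for this sketch; it is not how Tangent's inference actually counts candidates, and [font=courier new,courier,monospace]worst_case_orderings[/font] and [font=courier new,courier,monospace]split_statements[/font] are hypothetical helper names.

```python
import math


def worst_case_orderings(term_count):
    """N! orderings for N terms: an upper-bound model, not Tangent's real search."""
    return math.factorial(term_count)


def split_statements(tokens, separator=";"):
    """Split a flat token list into statements at each separator."""
    statements, current = [], []
    for token in tokens:
        if token == separator:
            if current:
                statements.append(current)
            current = []
        else:
            current.append(token)
    if current:
        statements.append(current)
    return statements


tokens = "a b c d ; e f g h ; i j k l".split()
statements = split_statements(tokens)

# With separators: three independent searches over four terms each.
with_separators = sum(worst_case_orderings(len(s)) for s in statements)

# Without separators: one search over all twelve terms.
term_count = len([t for t in tokens if t != ";"])
without_separators = worst_case_orderings(term_count)

print(with_separators)     # 72
print(without_separators)  # 479001600
```

Even in this crude model, bounding each search at the statement boundary turns one enormous search into a few small ones.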

The most likely alternative is syntactically significant whitespace, which I dislike (to be kind). Plus, since the target audience is C# programmers, I expect the semi-colon requirement is the 'least surprising' option.
