Shunting Yard Algorithm

Started by
1 comment, last by StakFallT 12 years, 4 months ago
Quick question (But maybe not so quick an answer). I'm having some problems with an implementation I'm using of the shunting yard algorithm. It was basically copied and pasted from the wikipedia page, but -heavily- modified to fit the project I'm placing it in. Anyhow, the problem is basically it rolls along nicely until it reaches a double )) in an expression, that's when it seems to fall apart. There's no compile errors on it, and the function completes without any runtime exceptions; it's just some tokens seem to never get stored in the operand stacks. Here's the expression I'm parsing:

x = (5 + (3 * 8)) / y;



It seems to get up to the / ok, but the y never gets stored. I dunno... it seemed like the logic for the case block for ')' was "wonky" (Highly technical term :P ); however, the more and more I look at it, the more I can't help but to feel that it's the operator case block. Unfortunately, the code is too large to paste in so I used ideone (Like pastebin, only better; I think anyhow). Here's the code I'm using:



http://ideone.com/XEjwj - Shunting Yard code



There's another function that gets called after that, called execution_order, but the error is occuring during the shunting yard algorithm, so not much point in posting that; but if it looks like the operation isn't finished, it's because calculations and stuff are done in the execution_order function -- after the shunting yard function has been called and completed.



And yes I know, as a whole, I'm absolutely sure there's much more efficient, faster, and even possibly cleaner code to write. I'm not even at that stage though. I'm already planning on a multi-month optimization phase (Along with any optimizations that occur during the program's normal life cycle.). Right now, I'm looking to get the code logic right. I've been debugging this thing for hours; not counting the hours I spent on it prior to working on other sections (non-shunting yard related). It's utterly depressing how many hours one can spend on something and not get a single ounce closer. I'm sure it's something silly, but this algorithm (To me anyhow) is just weird, because it's not just a loop, not even a loop inside a loop. It's (sometimes) a loop inside a loop where one array is going up and another seems to go down, with two different storage containers, and the storage container elements are sometimes referenced -1 and sometimes not, it's being a real nightmare to track down. At least for me anyhow... Any help would be GREATLY appreciated! As I said I've spent hours trying to work out the problem. I'm sure it's a logic error, just not sure where.



-- StakFallT


EDIT: Changed the pastebin from pastebin.com to ideone.com; formatting is MUCH better (no ugly hard-to-read word-wrapping).Some of the indents seem a little extra but it's still easier to read than it being word-wrapped.
Advertisement
I've written a Shunting Yard algorithm implementation before, and if I had to guess, I'd say this is a bug:

StackElement = OperatorStack[OperatorStackIndex - 1].m_data;
if (StackElement == "(")
{
//InsideParenthesis = true;
ParenthesisDepth += 1; // here
break;
} else


Now, I haven't run it, but from looking at it, it looks like, based on the example, it'd hit the first left parenthesis, increment the depth to one,
hit the second, increment to two, hit the right parenthesis, decrement to one, then pop all the way back to the second left parenthesis, and increment to two again
for some reason. Then it hits the second right parenthesis, decrements to one, and pops until the first parenthesis, where it increments to two again.

Try running:
x = (5 + (3 * 8)))) / y;

And see if it throws any sort of error.

This may lead to solving your problem further down the road, but for now, correct me if I am mistaken in how this function works.




ok I think I finally have it nailed! knock on wood <Knocks on wood>

Here's the link to the new code:


http://ideone.com/LE8PK


I tried your suggesstion -- making the expression x = (5 + (3 * 8)))) / y;

I did get an error on that, a mismatched parenthesis error to be specific.

A new idea hit me as I was typing up a reply a few hours ago. I thought to myself, well, 'y' is never being stored, and yet I've seen the read-ahead character (Called peek; kinda an inconspicuous name which caused me to overlook the fact of it's purpose) be set to 'y' before. what kept throwing me off is that I neglected to keep in mind, that peek is somehwat of an insignificant class member. It's only purpose is to allow routines to flow in the direction they need to (I.e. it's not for data processing at all). Anyhow, so I knew it was reading it; my curiosity lied in whether or not 'y' was being skipped via math or a condition in the loop that would cause it to skip through multiple characters. Then I realized, "wait a minute."... The only way that can happen is if I issue a funciton call telling my scanner to read more characters from the actual file, which only occured in one location of that entire Shunting_Yard function, the very end (literally, not figuratively) of the loop. In other words, the switch/case stuff ONLY works with the arrays, not the files. It doesn't matter whether it's skipping the elements in the array. The arrays are filled from what is read from the file, and if the 'y' is not being read from the file, then the whole switch/case thing is off -- to be it even more simply: the buck starts with the file. Part of the problem too, was so many of the characters were being read in just fine, so I had no real obvious reason hitting me over the head that that could've been the case.

Then I realized, well if the entire loop is based off of tag, and every character (Whether it's an operand or an operator) comes into the loop as Tag at some point in time, then it stands to reason that tag MUST BE set to 'y' at some point. Combining this with the knowledge that the math of the loop, and the craziness of arrays and elements and plus this and minus that, blah blah blah is irrelevant for the time being. I set out to figure out if tag ever does equal y, and if it does, knowing it would be classified as an ID, break in the case for IDs and see what it does from there.

It turns out, the read ahead character, was being set to y, but the _MoveAndSet was never setting my more substiantied storage class to 'y'. After running through the loop several more times looking closely at the _MoveAndSet function, it exhibited some REALLY weird behavior (On par with a local function variable going out of scope and invalidating a pointer.). [size="1"]Which could very well have been the case since I was spinning up a temporary local variable using the object's class consutructor to set the data for it, and then setting my "super"-storage container to that object. Which, sure, yeah if I have a non-pointer variable and the stack frame changes, sure; it becomes invalidated. Problem was, was I could've sworn I had an operator custom defined to deal with this, by dereferncing the pointers and setting the data (not the memory locations) upon an equal operation. I guess not. It was this "weirdness" that I eventually figured out that it was possibly because of the whole non-pointer thing and going out of scope. Which again, I could've sworn I've had overloaded the = operator to deal with this. Anyhow, I fixed this up a little and ran it again, it was MUCH closer, now the super-container reflected the data being set, while I was inside the smaller loop at the end of the function.

This time, however, when the loop, that skips past things like \n and \t, etc, finished and the control went back up to the top of the main outer loop (That finishes when Tag is equal to ';'), the super-container had values as if it was never set! Fiddiling around with the function call some more and swapping around the function's local super-container object with the global super-container object I managed to get it to settle down some. At this point, tag was now starting to represent 'y'. Verifying first that the ID block was being "flowed into", my next batch of work began on determining why it wasn't now being stored, despite it showing up. As it so happends, I look above the block, while I'm in the block for the ID and find that when Tag is an integer or a float, it sets the output stack (In addition to the operand stack), yet the ID block wasn't doing that! Fixing it so it does, it ALL seems to be working now! (Well the Shunting_Yard function anyhow :P )

Only thing I gotta do now is do a fixup on the execution_order function so that it can handle just emitting the variable name if it doesn't equal anything and to halt calculation if it cannot go any further. Thanks again for the help!

This topic is closed to new replies.

Advertisement