
[.net] Parsing Utilities

This topic is 3579 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


Hello everyone,

Scenario: a formattable textbox (for a UI). My goal is to parse a string which contains a particular format:

I have just met a man called |a"journal:Jack"|fcFFFFFFJack|a|r.

Formatting rules:

|f   starts formatting
       c   color, 8 hex digits
       p   pressed color, 8 hex digits
       h   highlight color, 8 hex digits
|a   link start / link end
       string with link data, between double quotes
|r   restore normal text, ends formatting
|g   graphics start / end; includes a graphic element in the text
       the filename of the graphic element to draw
||   escaped | character

Examples:

I have just met a man called |a"journal:Jack"|fcFFFFFFJack|a|r.
|g"Icons\\mail"|gYou have |fc00FF00FFtwo|r new messages.

I was wondering if anybody knows of a tool (a "parser creator", for lack of a better word) or otherwise has some insights as to how to proceed with a similar feature.

Thanks in advance for any help.
Regards.

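Check out String.Split() to split the string up using a delimiter (in your case '|'), keeping in mind that a plain split also breaks up the escaped "||" sequence. Check out Convert.ToInt32("FFFFFF", 16) for converting your hex digits to a number; strip the "fc" code first, because 'f' and 'c' are themselves valid hex digits and "fcFFFFFF" would silently parse to the wrong value. String.Substring() can be used for extracting parts of a string.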
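As a rough sketch of the hand-rolled route, the scanner below walks the string one character at a time instead of splitting on '|', so that the "||" escape survives. The names `MarkupScanner` and `Tokenize` are hypothetical, only the `|fc`, `|r` and `||` rules are handled, and the color is assumed to be exactly 8 hex digits as the rules state (the 6-digit form in the first example is not supported).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of any library.
public static class MarkupScanner
{
    // Produces simple string tokens:
    //   "TEXT:<literal text>", "COLOR:<8 hex digits>", "RESET".
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var text = new StringBuilder();
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c != '|' || i + 1 >= input.Length)
            {
                text.Append(c); // ordinary character, or a trailing lone '|'
                i++;
                continue;
            }

            char code = input[i + 1];
            if (code == '|')
            {
                text.Append('|'); // "||" is an escaped pipe
                i += 2;
            }
            else if (code == 'r')
            {
                Flush(text, tokens);
                tokens.Add("RESET");
                i += 2;
            }
            else if (code == 'f' && i + 11 <= input.Length && input[i + 2] == 'c')
            {
                // Skip "|fc", then read the 8 digits that follow.
                // Convert.ToUInt32 throws FormatException on non-hex input.
                uint argb = Convert.ToUInt32(input.Substring(i + 3, 8), 16);
                Flush(text, tokens);
                tokens.Add("COLOR:" + argb.ToString("X8"));
                i += 11;
            }
            else
            {
                throw new FormatException("Unsupported code at index " + i);
            }
        }
        Flush(text, tokens);
        return tokens;
    }

    // Emits the pending literal text as one token, if there is any.
    private static void Flush(StringBuilder text, List<string> tokens)
    {
        if (text.Length == 0) return;
        tokens.Add("TEXT:" + text.ToString());
        text.Clear();
    }
}
```

Calling `MarkupScanner.Tokenize("You have |fc00FF00FFtwo|r new messages.")` yields a text token, a color token, another text token, a reset, and a final text token. The link and graphics codes would need their own branches that read the quoted argument.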

Also, check out XML.

I have just met a man called
<link to="journal:Jack"><font color="FFFFFF">Jack</font></link>

Thank you both for the prompt replies!

Yes, I've been considering both String.Split and XML; I'm not sure why I haven't made up my mind about them yet.
I'm starting to think that XML is the most reasonable choice for such a feature. I had that impression before as well, but seeing as you're recommending it now, it might be time to use it. I also guess that hand-editing a formatted string in XML is not too bad either.

Perhaps the real difficulty is not in the parsing itself, but rather in the data structures I will use to hold the formatting information.

Thanks again for the input so far. More comments are more than welcome, especially if you have previous experience with a similar feature.

I'm also interested in the internal representation. I think one solution might be to, after parsing the XML, put the contents into a list, like this:

|-TextNode
||-text="I have just met a man named "
|
|-LinkNode
||-text="Jack"
||-target="journal:Jack"
||-color="FFFFFF"

Then when it comes time to draw the textbox, you just draw the nodes sequentially, each according to its type. How would you detect the user clicking on the link node, though? I guess you could compute the position of the link node within the widget when drawing it, and check it against the mouse event. So each time you draw something, you would update the widget's internal list of clickable areas.
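The clickable-area idea could look roughly like the sketch below. `ClickableArea` and `HitTester` are hypothetical names, coordinates are plain integers relative to the widget, and the bounds are assumed to come from whatever text-measuring call the drawing code already uses.

```csharp
using System.Collections.Generic;

// One clickable rectangle recorded while drawing a link node.
public sealed class ClickableArea
{
    public int X, Y, Width, Height;
    public string Target;

    // Right and bottom edges are exclusive.
    public bool Contains(int px, int py)
    {
        return px >= X && px < X + Width && py >= Y && py < Y + Height;
    }
}

public sealed class HitTester
{
    private readonly List<ClickableArea> areas = new List<ClickableArea>();

    // Call at the start of every draw pass, since positions may have changed.
    public void BeginDraw()
    {
        areas.Clear();
    }

    // Call after drawing a link node, with the bounds it was drawn at.
    public void AddLink(int x, int y, int width, int height, string target)
    {
        areas.Add(new ClickableArea
        {
            X = x, Y = y, Width = width, Height = height, Target = target
        });
    }

    // Returns the link target under the mouse, or null if nothing was hit.
    public string HitTest(int mouseX, int mouseY)
    {
        foreach (ClickableArea area in areas)
        {
            if (area.Contains(mouseX, mouseY)) return area.Target;
        }
        return null;
    }
}
```

The mouse handler would then call `HitTest` with the click position and follow the returned target when it is not null. A link that wraps across lines would need one area per line fragment.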

I agree with going with XML; I didn't realise the format was a choice. The best way to represent XML internally is using classes. Consider each element a class and each attribute in the XML a value in the class. Use XmlTextReader to parse the XML and XmlTextWriter to write it.

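using System.Drawing;

// A plain run of text.
class TextNode
{
    public string Text = "I have just met a man named ";
}

// A clickable run of text that points at a link target.
class LinkNode
{
    public string Text = "Jack";
    public string Target = "journal:Jack";
    public Color Color = Color.White;
}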
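A minimal sketch of filling such classes from the XML, using `XmlReader` (the base class of `XmlTextReader`). The `NodeLoader` name and the shared `Node` base class are hypothetical, the color is kept as its raw hex string, and the markup is assumed to be wrapped in a single root element so that it is a well-formed document.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public abstract class Node { public string Text = ""; }

public sealed class TextNode : Node { }

public sealed class LinkNode : Node
{
    public string Target;
    public string Color; // raw hex string, e.g. "FFFFFF"
}

public static class NodeLoader
{
    public static List<Node> Load(string xml)
    {
        var nodes = new List<Node>();
        LinkNode link = null; // non-null while inside a <link> element
        string color = null;  // color of the enclosing <font>, if any

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Self-closing elements never produce an EndElement,
                        // so they are ignored here.
                        if (reader.IsEmptyElement) break;
                        if (reader.Name == "link")
                        {
                            link = new LinkNode { Target = reader.GetAttribute("to") };
                            nodes.Add(link);
                        }
                        else if (reader.Name == "font")
                        {
                            color = reader.GetAttribute("color");
                        }
                        break;

                    case XmlNodeType.Text:
                        if (link != null)
                        {
                            link.Text += reader.Value;
                            link.Color = color;
                        }
                        else
                        {
                            nodes.Add(new TextNode { Text = reader.Value });
                        }
                        break;

                    case XmlNodeType.EndElement:
                        if (reader.Name == "link") link = null;
                        else if (reader.Name == "font") color = null;
                        break;
                }
            }
        }
        return nodes;
    }
}
```

Colored text outside a link is stored as a plain `TextNode` here and loses its color; a fuller version would carry the color on every node type.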

The best way to represent XML internally is to not represent it at all. Use an XML parser to interpret the XML file directly, without storing the contents.
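One way to read that suggestion is a single forward pass that hands each piece of text straight to the drawing code. The sketch below assumes the `<link to="...">` and `<font color="...">` markup from earlier in the thread, wrapped in one root element; `StreamingRenderer` and the `drawRun` callback are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingRenderer
{
    // Reads the markup once and calls drawRun(text, color, linkTarget) for
    // each piece of text. color and linkTarget are null outside <font> and
    // <link>. Nothing is kept after the call returns.
    public static void Render(string xml, Action<string, string, string> drawRun)
    {
        string color = null;
        string target = null;

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && !reader.IsEmptyElement)
                {
                    if (reader.Name == "font") color = reader.GetAttribute("color");
                    else if (reader.Name == "link") target = reader.GetAttribute("to");
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    if (reader.Name == "font") color = null;
                    else if (reader.Name == "link") target = null;
                }
                else if (reader.NodeType == XmlNodeType.Text)
                {
                    drawRun(reader.Value, color, target);
                }
            }
        }
    }
}
```

The trade-off is that the XML is parsed again on every redraw, and nested `<font>` elements would need a stack rather than a single `color` variable.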

This might come across as a stupid choice, but I thought about using XML and I remembered why I wasn't sure about it.

Imagine inserting text into the textbox. Once you've determined which "entity" will get the new, updated text, it shouldn't be a problem. The real problem lies in removing text. Eventually an entity will end up with empty inner text, which makes it useless, so I'd have to remove the node.

My concern is that this might be overkill performance-wise. Perhaps a custom parser that looks for simple format delimiters would be better, and easier to maintain too.

I'll keep you updated about my findings while trying to come up with a working solution, should you still be interested.
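For what it's worth, the deletion case described above can be sketched against a flat list of runs rather than an XML tree. `Run` and `RunEditor` are hypothetical names, and offsets count characters across all runs as if they were one string.

```csharp
using System;
using System.Collections.Generic;

public sealed class Run
{
    public string Text;
    public string Link; // null for plain text

    public Run(string text, string link)
    {
        Text = text;
        Link = link;
    }
}

public static class RunEditor
{
    // Deletes 'count' characters starting at absolute offset 'start',
    // then removes every run whose text became empty.
    public static void Delete(List<Run> runs, int start, int count)
    {
        int offset = 0; // absolute position of the current run's first character
        foreach (Run run in runs)
        {
            int length = run.Text.Length;

            // Part of the deletion range that falls inside this run,
            // expressed as indices local to the run.
            int from = Math.Max(start, offset) - offset;
            int to = Math.Min(start + count, offset + length) - offset;
            if (to > from) run.Text = run.Text.Remove(from, to - from);

            offset += length; // advance by the original length
        }
        runs.RemoveAll(r => r.Text.Length == 0);
    }
}
```

Dropping the emptied runs is a single pass over the list, so for the amount of text a textbox holds the cost is likely small, though that is an assumption worth measuring.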

Quote:
Original post by ToohrVyk
The best way to represent XML internally is to not represent it at all. Use an XML parser to interpret the XML file directly, without storing the contents.


Depends on whether it will be modified or processed multiple times. If either of those applies, you want to load it up in a DOM or build up a game-specific tree structure. If you just process it once, or perhaps just infrequently, operate directly from the file as ToohrVyk says.
