lc_lumos

[.net] LINQ in C#

lc_lumos    122
using System;
 using System.Collections.Generic;
 using System.Linq;
 using System.Text;
 using System.IO;
 
 namespace ListDemo
 {
    class ListDemo
    {
       static void Main(string[] args)
       {
          string path = null;
          string fileName = @"ListOfStrings.txt";
 
          if (path == null)
          {
             path = fileName;
          }
          else
          {
             path = path
                + "\\"
                + fileName;
          }
 
          if (!File.Exists(path))
          {
             Console.WriteLine("Error: File is missing: " + path);
             return;
          }
 
          List<string> myList = new List<string>();
 
          // The using block disposes the reader even if ReadLine throws.
          using (StreamReader sr = new StreamReader(path))
          {
             string inputLine;
             while ((inputLine = sr.ReadLine()) != null)
             {
                myList.Add(inputLine);
             }
          }
 
 
          foreach (var item in myList)
          {
             Console.WriteLine("\n Element is {0}", item);
          }
       }
    } 
 }

I try to load this text file:
 987
 10
 12
 276
 My My
 21.5
 16.75
 30
 This
 11
 9
 7
 is
 1
 18
 bad data
 7
 12

My question is: how can I skip the bad data in the text file (My My, This, is, bad data) and store the remaining valid lines in a string list, using a LINQ query? Thanks.

kanato    568
Probably something like:

List<string> data = new List<string>();

data.AddRange(from someString in myList
              where IsValidData(someString)
              select someString);

and then write the function IsValidData to do your data validation.
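
One possible IsValidData, as a minimal sketch. It assumes "valid" means the line parses as a number with a period as the decimal separator, as in the sample file; the original question does not pin the format down. The class name LineFilter and the helper KeepValid are hypothetical.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class LineFilter
{
    // Assumption: a line is valid if it parses as a floating-point number
    // in the invariant culture (period as the decimal separator).
    // NumberStyles.Float also accepts surrounding whitespace, a leading
    // sign, and exponent notation such as "1e5".
    public static bool IsValidData(string s)
    {
        double ignored;
        return double.TryParse(s, NumberStyles.Float,
                               CultureInfo.InvariantCulture, out ignored);
    }

    // Keeps only the valid lines, preserving their order and original text.
    public static List<string> KeepValid(IEnumerable<string> lines)
    {
        return (from line in lines
                where IsValidData(line)
                select line).ToList();
    }
}
```

Calling LineFilter.KeepValid(myList) on the list from the original post would drop "My My", "This", "is", and "bad data" and keep the numeric lines as strings.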

capn_midnight    1707
regex ftw

var filePath = "<my file path>";
var regx = new System.Text.RegularExpressions.Regex(@"^[-]?[1-9]\d*\.?[0]*$");
var list = from line in System.IO.File.ReadAllLines(filePath)
where regx.IsMatch(line.Trim())
select line; // you could also do double.Parse() at this point, since the value is validated.


phresnel    953
Quote:
Original post by capn_midnight
@"^[-]?[1-9]\d*\.?[0]*$


Without really knowing the exact format the OP wants to parse (only what he does not want to parse), this is quite dangerous. It also looks buggy to me: it fails for the provided example 16.75, because only zero or more zeros are allowed before the end of the string. It is also strange to see an optional \. followed by zero or more zeros; why not make the whole part after the dot optional?

You allow 1. but disallow .1. Further, why does it have to start with 1-9? How would you parse 0.15 then? What about i18n? Scientific notation?
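
These objections are easy to check. The sketch below wraps the pattern exactly as it was posted; only the class name RegexCheck and the method name Matches are made up for illustration.

```csharp
using System.Text.RegularExpressions;

static class RegexCheck
{
    // The pattern exactly as posted earlier in the thread.
    static readonly Regex Posted = new Regex(@"^[-]?[1-9]\d*\.?[0]*$");

    // Mirrors the posted query: trim the line, then test it.
    public static bool Matches(string line)
    {
        return Posted.IsMatch(line.Trim());
    }
}
```

RegexCheck.Matches returns true for "987" and "1.", but false for "16.75", "21.5", "0.15", and ".1", so two of the numeric lines in the sample file would be discarded.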

All that said, and sorry to say it, you are a very good example of the story told in this article. Or you must love geek irony.

ranakor    439
Quote:
Original post by capn_midnight
regex ftw
*** Source Snippet Removed ***


If the goal is to end up with a list of doubles and to select double.Parse, then he might as well skip the regex and do this:


decimal d;
var values = File.ReadAllLines("myfile.txt")
    .Where(l => decimal.TryParse(l, out d))
    .Select(l => decimal.Parse(l));

davepermen    1047
Quote:
Original post by ranakor
Quote:
Original post by capn_midnight
regex ftw
*** Source Snippet Removed ***


If the goal is to end up with a list of doubles and to select double.Parse, then he might as well skip the regex and do this:


decimal d;
var values = File.ReadAllLines("myfile.txt")
    .Where(l => decimal.TryParse(l, out d))
    .Select(l => decimal.Parse(l));


that's cool, learned something new today :) an idea that sure can get useful..


double buffer;
var doubles = from line in File.ReadAllLines("TextFile1.txt")
              where double.TryParse(line, out buffer)
              select double.Parse(line);
foreach (var d in doubles)
{
    Console.WriteLine(d);
}



such a simple idea, and still very powerful.

if the op wants just that, it's perfect.
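
A variation on the same idea, as a hedged sketch: the query above parses every valid line twice and relies on the current culture, so "21.5" may be rejected on a machine whose decimal separator is a comma. The version below parses each line once and assumes the invariant culture is what the file uses. NumberLines, TryParseLine, and ParseAll are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class NumberLines
{
    // Returns the parsed value, or null when the line is not a number.
    static double? TryParseLine(string line)
    {
        double value;
        bool ok = double.TryParse(line, NumberStyles.Float,
                                  CultureInfo.InvariantCulture, out value);
        return ok ? value : (double?)null;
    }

    // Parses each line once and keeps only the successful results.
    public static List<double> ParseAll(IEnumerable<string> lines)
    {
        return lines.Select(l => TryParseLine(l))
                    .Where(v => v.HasValue)
                    .Select(v => v.Value)
                    .ToList();
    }
}
```

Usage would be NumberLines.ParseAll(File.ReadAllLines("TextFile1.txt")), which yields a List<double> with the non-numeric lines left out.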

capn_midnight    1707
Quote:
Original post by phresnel
Quote:
Original post by capn_midnight
@"^[-]?[1-9]\d*\.?[0]*$


Without really knowing the exact format the OP wants to parse (only what he does not want to parse), this is quite dangerous. It also looks buggy to me: it fails for the provided example 16.75, because only zero or more zeros are allowed before the end of the string. It is also strange to see an optional \. followed by zero or more zeros; why not make the whole part after the dot optional?

You allow 1. but disallow .1. Further, why does it have to start with 1-9? How would you parse 0.15 then? What about i18n? Scientific notation?

All that said, and sorry to say it, you are a very good example of the story told in this article. Or you must love geek irony.


My file path value won't really work either. The point is to demonstrate the availability of the tool. He can tweak the regex as he wishes. Also, Jeff Atwood is an idiot.

phresnel    953
Quote:
Original post by capn_midnight
Also, Jeff Atwood is an idiot.


Sure, we all are to some degree. He who is not an idiot is an asshole, and an idiot as well, with the difference that he doesn't know that.

But on that point he is right. Too many times I've seen credulous programmers cripple whole applications and introduce security holes with bad regexes like yours, which is not even a good starting point.

Admittedly, scientific notation and i18n are probably negligible in his case. If regexes had been mandatory (they are not, given the Parse() family of functions), something like (pseudo regex) "[+-]?(([0-9]+\.[0-9]*)|([0-9]*\.[0-9]+)|[0-9]+)", which allows .1, 1., 0.0, 42, +5, -5, et al., would have been a better starting point.
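
For illustration, here is that pseudo regex written as a concrete .NET pattern. This is a sketch, not a vetted validator: the alternatives are grouped so the optional sign applies to each of them, and the ^ and $ anchors (an addition here) force the whole line to match. DecimalPattern and IsNumber are hypothetical names.

```csharp
using System.Text.RegularExpressions;

static class DecimalPattern
{
    // Optional sign, then one of:
    //   digits, a dot, optional digits   ("1.", "0.0", "16.75")
    //   optional digits, a dot, digits   (".1")
    //   digits only                      ("42")
    static readonly Regex Number =
        new Regex(@"^[+-]?(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+)$");

    public static bool IsNumber(string s)
    {
        return Number.IsMatch(s);
    }
}
```

DecimalPattern.IsNumber accepts ".1", "1.", "0.0", "42", "+5", "-5", and "16.75", and rejects "My My", a lone ".", and scientific notation such as "1e5".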

Spodi    642
It's all about what you want. capn_midnight gave an example that would be great if you wanted to accept only certain kinds of values. From the OP's example, he just wanted anything formatted ###(.###), in which case Regex would be a great approach. If he wants anything that is an actual number, with localization support, then yeah, he should use TryParse().

So I see nothing wrong with capn_midnight's suggestion. This is far from an example of "Regex fetishism".

phresnel    953
Quote:
Original post by Spodi
So I see nothing wrong with capn_midnight's suggestion. This is far from an example of "Regex fetishism".


But also far away from correctness and with questionable content (e.g. as mentioned: optional dot, followed by non-optional zero or more numbers [totally]).

The only thing valid was to show that there is a tool called regexp (don't get me wrong, this is totally okay in itself, though preferably with a warning), but it was demonstrated as if a carpenter shows off a hammer to tighten screws, which is my point.
