I have a matrix in this format that I am trying to validate, and I want to remove its first row:

3 4 
0 0 0 
0 0 0
0 0 0
0 0 0

Where the first line is "Width Height" and the other lines are the actual data.

What is the best way to (A) remove the first row, and (B) validate that all rows meet the Width/Height criteria specified? I could do a simple for loop and copy them, but I am looking for a more elegant way to do it. Maybe with LINQ or one of the collection methods?

So far I have:

 // Split each line into its space-separated tokens
 string[][] lines = File.ReadAllLines(fileName).Select(x => x.Split(' ')).ToArray();
 // The first line is width/height
 int length = lines.Length;
 if (length == 0 || lines[0].Length != 2) {
     throw new InvalidDataException("File is not correctly formatted: " + fileName);
 }

 int width = int.Parse(lines[0][0]);
 int height = int.Parse(lines[0][1]);

 // Check the row count (the header line is not a data row)
 if (length - 1 != height) {
   throw new InvalidDataException("Invalid or missing rows in the Matrix definition");
 }

 // Make sure every row has the declared width
 if (lines.Any(x => x.Length != width)) {
     // I know this fails because of the first line
     throw new InvalidDataException("Invalid Width in a row in the Matrix");
 }

Any suggestions on a better way to validate input?

+1  A: 

sscanf would have been nice, but I have done it with a regular expression. It checks that width and height are integers, and it also checks that every following number is an integer:

static bool isValid(string path)
{
    var data = File.ReadAllText(path);

    // Alternation instead of a character class: [\r\n|\n] would also accept a literal '|'
    var first = Regex.Match(data, @"\A *(\d+) +(\d+) *(\r\n|\n|\Z)");

    if (!first.Success) return false;

    int width = int.Parse(first.Groups[1].Value);
    int height = int.Parse(first.Groups[2].Value);

    return Regex.Match(data, @"\A *\d+ +\d+ *((\r\n|\n)((^ *| +)\d+){" + width + @"} *){" + height + @"}\Z", RegexOptions.Multiline).Success;
}

I can make it more strict with regards to spaces.

Added A:

If you want to save all the lines, except the first, to a new path then this will do:

var lines = File.ReadLines(path);
File.WriteAllLines(path2, lines.Skip(1));

Or if you just want the lines except the first (as a lazily evaluated sequence; append .ToArray() if you need an array), use this:

var linesExceptFirst = File.ReadLines(path).Skip(1);
lasseespeholt
Don't you also need to add to the regex that the first line is size 2? (Or did I miss that?)
Nix
They will always be integers... 0 1 2 3
Nix
@Nix `\d+` is there twice, so it will only be accepted if there are two integers.
lasseespeholt
I just realized that there was an error in the last regex. It is fixed now, but I might not have made the simplest possible regex.
lasseespeholt
Updated it again: cut some lines and made the last part a single regex instead of `Each line->Regex`.
lasseespeholt
I used your first half for the validation, but you didn't show how to skip the first row. I actually ended up with a hybrid solution wrapping yours and @svick's.
Nix
@Nix I have updated it now :) I'm glad you could use it.
lasseespeholt
A: 

I would pull it in as two separate arrays.

using (StreamReader sr = new StreamReader(fileName))
{
  string[] header = sr.ReadLine().Split(' ');
  if (header.Length != 2) throw new InvalidDataException("yadda, yadda");

  List<string> lines = new List<string>();
  //you'll probably want to move that declaration outside the using statement...

  while (sr.Peek() != -1)
  {
    lines.Add(sr.ReadLine());
  }

  if (lines.Count != int.Parse(header[1])) //this is wrong so...
    throw new InvalidDataException("yadda, yadda");

  if (lines.AsQueryable().Any(x => x.Length != int.Parse(header[0]))) // this, too
    throw new InvalidDataException("yadda, yadda");
}

The problem with this is that your sample data has spaces, and this code assumes no spaces in the data. So we need to fix that...

List<string[]> separatedLines = new List<string[]>();

lines.ForEach(x => separatedLines.Add(x.Split(' ')));

if(separatedLines.AsQueryable().Any(s => s.Length != int.Parse(header[0])))
  throw new InvalidDataException("yadda, yadda");

Some of that will change if I've misunderstood your sample data, but that will take your header row first, and use its values to validate the rest of your data. Double-check me on the .AsQueryable() calls, though. I haven't gotten as much chance to use Linq as I'd like, so I'm going off relatively limited experience on that one. I do know that when I've tried, using the Linq extension methods on a List takes some minor acrobatics...
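
On the .AsQueryable() question: as far as I know it is not required here, because List<T> implements IEnumerable<T> and the System.Linq extension methods such as Any apply to it directly once the namespace is imported. A minimal sketch, where the RowCheck class and the HasBadWidth name are hypothetical and the rows are passed in rather than read from a file:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the answer's original code.
public static class RowCheck
{
    // Returns true if any row does not contain exactly `width`
    // space-separated tokens.
    public static bool HasBadWidth(List<string> rows, int width)
    {
        // Any() comes from System.Linq and works on List<T> directly,
        // with no AsQueryable() call in between.
        return rows.Any(r => r.Split(' ').Length != width);
    }
}
```

With this, the width check becomes something like `if (RowCheck.HasBadWidth(lines, int.Parse(header[0]))) throw ...`.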

AllenG
+2  A: 

string[][] lines = File.ReadAllLines(fileName)
  .Select(line => line.Split(' ')).ToArray();
if (lines[0].Length != 2)
  throw new SomeException();

int width = int.Parse(lines[0][0]);
int height = int.Parse(lines[0][1]);

int[][] matrix = lines.Skip(1)
  .Select(line => line.Select(n => int.Parse(n)).ToArray())
  .ToArray();

if (matrix.Length != height || matrix.Any(line => line.Length != width))
  throw new SomeException();
svick
Very elegant - but a little brittle - consider the case where two elements are separated by multiple spaces. You probably want a preprocessing step that gets rid of multi-space runs. Also, you can certainly drop the height initializer and just use the function in the last line - maybe the width too, but I don't know whether ilm would recognize it as constant or recalculate it on every item. But elegant nonetheless.
Mark Mullin
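
The multiple-space case raised in the comment above can probably be handled without a separate preprocessing pass, by splitting with StringSplitOptions.RemoveEmptyEntries so that repeated or trailing spaces produce no empty tokens. A sketch under that assumption; the MatrixParser class and Parse method are hypothetical names, and the input is an array of lines so the logic does not depend on a file:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical wrapper around the approach in the answer above.
public static class MatrixParser
{
    // Expects the first line to be "width height" and the remaining
    // lines to be the rows. Returns the rows as a jagged int array.
    public static int[][] Parse(string[] rawLines)
    {
        // RemoveEmptyEntries drops the empty tokens that runs of spaces
        // or a trailing space would otherwise create.
        string[][] lines = rawLines
            .Select(l => l.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .ToArray();

        if (lines.Length == 0 || lines[0].Length != 2)
            throw new InvalidDataException("Missing or malformed width/height line");

        int width = int.Parse(lines[0][0]);
        int height = int.Parse(lines[0][1]);

        // Skip(1) leaves out the header line before parsing the data rows.
        int[][] matrix = lines.Skip(1)
            .Select(l => l.Select(n => int.Parse(n)).ToArray())
            .ToArray();

        if (matrix.Length != height || matrix.Any(row => row.Length != width))
            throw new InvalidDataException("Matrix does not match the declared size");

        return matrix;
    }
}
```

It could be called as `int[][] matrix = MatrixParser.Parse(File.ReadAllLines(fileName));`. Note that a blank line in the file still counts as a row with zero elements, so it fails the size check rather than being silently ignored.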