_Sigma

[Perl] Output not as expected...

_Sigma    792
I need to parse a file in list format, such as:
1
2
3
4
5
1
2
3
4
5
1
2
3
4
5
into a tabular format, like:
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
I took a bit of perl last term so I wrote this:
#!/usr/bin/perl -w
use strict;

my $m_cols=5;
my $m_rows=3;

my $row_counter=0;
my $col_counter=0;

open(INFILE, "< dem.txt") or die "Can't open the infile";
     
open(OUTFILE,"> tempout.txt") or die "Can't open the outputfile";

#read in from file
while(<INFILE>)
{
    if($col_counter < $m_cols)
    {
        print OUTFILE $_, " ", # suppress new line
        $col_counter++; 
    }
    else
    {
        $col_counter = 0;
        $row_counter++;
        print OUTFILE '';
        print OUTFILE $_, " ",  #suppress new line
        
    }
}

if($m_rows != $row_counter)
{
    print "Row counter and specified rows do not match. Corrupt output\n";
}

I'm running it under ActivePerl on Windows. Anyways, this generates this gimpy output:
1
 02
 13
 24
 35
 41
 2
 03
 14
 25
 35
 44
 3
 02
 11 2
Which isn't quite what I had in mind! I've stepped through it with a debugger, and nothing jumped out at me (i.e. variables being reset to zero early, as I suspected was the case). Any help would be much appreciated! Cheers

tstrimp    1798
I'm not familiar enough with Perl to know how that is supposed to suppress the newline. I would try running them through a trim function to remove the newline character that was present in the original file, and manually specifying the newline ("\n") when appropriate.

tstrimp    1798
But you have a newline that was copied from the original document.

source.txt
1
2
3
4


is actually
source.txt
1\n
2\n
3\n
4\n


So when you get the first line, $_ == "1\n". The trailing , doesn't suppress anything: Perl's print never adds a newline of its own, and the comma won't remove the one already present in the string.
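The point about the newline riding along with each line is easy to check. The sketch below uses Python rather than Perl, purely as an illustration. The io.StringIO object is a hypothetical stand-in for source.txt, and rstrip("\n") only approximates what Perl's chomp does.

```python
import io

# Hypothetical stand-in for source.txt: four lines, each ending in "\n".
source = io.StringIO("1\n2\n3\n4\n")

# Reading line by line keeps the trailing newline on every line.
raw = [line for line in source]

# Stripping the newline is a separate, explicit step.
stripped = [line.rstrip("\n") for line in raw]

print(raw)                 # each element still ends in "\n"
print(" ".join(stripped))  # the values on one line, separated by spaces
```

Printing a separator after a line never removes the newline inside it; the newline has to be stripped first and then written back only where a row should end.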

rollo    366
As someone who has to take care of an ancient Perl system, I'm not too happy about your $_ usage... it's much, much better with named variables. Anyway, your problem is that you don't increment the column after each output. Also, to get rid of the newline confusion I chomped the string right away.


#!/usr/bin/perl -w
use strict;

my $m_cols=5;
my $m_rows=3;

my $row_counter=0;
my $col_counter=0;

open(INFILE, "< dem.txt") or die "Can't open the infile";

#read in from file
while(my $input = <INFILE>) {
    chomp($input);
    print $input, " ";
    $col_counter++;

    if ($col_counter >= $m_cols) {
        $col_counter = 0;
        $row_counter++;
        print "\n";
    }
}

if($m_rows != $row_counter)
{
    print "Row counter and specified rows do not match. Corrupt output\n";
}





EDIT: or maybe I just got confused hacking at the code. it might just have been the chomp that was missing.

_Sigma    792
@tstrimp: yes sorry, was being silly! Thanks for pointing that out.

Quote:

EDIT: or maybe I just got confused hacking at the code. it might just have been the chomp that was missing.

The missing chomp, plus the , I'd put at the end of the print line (which made print output $col_counter++ as well), was what made the '0's appear everywhere :S

Anyways, thanks for pointing me in the right direction! Works all nice and fancy now...

The file I have to parse is 16 MB and 2.5 million lines long... wonder how well Perl will handle that?

Cheers

rollo    366
Yeah, it's sorely missed. I think most Perl coders end up writing their own though ;)
chomp is a bit different though, it only kills trailing line breaks.

Anyways, Perl is a pestilence and will forever screw up your coding. It just makes it too easy to do bad things, and so hard to do good... do yourself a favor and write whatever it is you need in a better language like Ruby or Python... Perl makes me feel dirty just thinking about it. (sorry, couldn't resist a bit of Perl-bashing)

rollo    366
Quote:

The file I have to parse is 16mb and 2.5 million lines long...wonder how well perl will handle that?


Shouldn't be a problem since you only read a line at a time. Is it something time-critical that needs to be done many times, or just a one-off?
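As a rough illustration of why line-at-a-time reading keeps memory use flat whatever the file size, here is a minimal Python sketch. The function name reshape and its return value are assumptions made for this example, not anything from the thread, and io.StringIO stands in for the real files.

```python
import io

def reshape(src, dst, cols):
    """Write one-value-per-line input to dst with `cols` values per row.

    Returns (complete_rows, leftover_values) so the caller can check the shape.
    """
    count = 0
    for line in src:  # iterating a file object yields one line at a time
        count += 1
        end = "\n" if count % cols == 0 else " "
        dst.write(line.strip() + end)
    return count // cols, count % cols

# Hypothetical input: the 15-line sample from the first post.
src = io.StringIO("".join(f"{n}\n" for n in [1, 2, 3, 4, 5] * 3))
dst = io.StringIO()
rows, leftover = reshape(src, dst, cols=5)
print(dst.getvalue(), end="")
```

Only the current line is held in memory at any point, so 2.5 million lines cost no more than 15. With real files, the same function could be called with open('dem.txt') and open('tempout.txt', 'w') in place of the StringIO objects.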

Zahlman    1682
Me neither, rollo. :)


expected_rows = 3

out = open('tempout.txt', 'w')

for (item, line) in enumerate(open('dem.txt')):
    # every fifth item ends its row with a newline; the others are followed by a space
    out.write(line.strip() + [' ', '\n'][(item % 5) // 4])

out.close()

count = item + 1
assert count % 5 == 0 and count // 5 == expected_rows, "Corrupt output"
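The list-indexing expression in the write call is compact enough to be easy to misread, so here it is pulled apart. This is only a sketch: the helper name separator and the cols parameter are made up for illustration, and it assumes at least two columns.

```python
def separator(item, cols=5):
    # Position of this item within its row: 0 .. cols-1.
    position = item % cols
    # Integer division by cols-1 gives 0 for every position except the last,
    # where it gives 1 (assumes cols >= 2).
    is_last = position // (cols - 1)
    # Index 0 selects the space, index 1 selects the newline.
    return [' ', '\n'][is_last]

values = [1, 2, 3, 4, 5, 1, 2]
row = ''.join(str(n) + separator(i) for i, n in enumerate(values))
print(repr(row))  # prints '1 2 3 4 5\n1 2 '
```

A plain if/else on position == cols - 1 does the same job and is probably easier to maintain.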
