<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: Optimizing Searches In Perl</title>
	<atom:link href="http://pthree.org/2006/12/18/optimizing-searches-in-perl/feed/" rel="self" type="application/rss+xml" />
	<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/</link>
	<description>Linux.  GNU.  Freedom.</description>
	<pubDate>Tue, 02 Dec 2008 15:33:22 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7-RC1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Harley Pig</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22252</link>
		<dc:creator>Harley Pig</dc:creator>
		<pubDate>Tue, 19 Dec 2006 17:45:32 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22252</guid>
		<description>That first if block, where you're reading in the first file, would be much faster with this:

%phone = map { ( unpack $layout1, $_ ) , $_ } &#60;$IN1&#62;</description>
		<content:encoded><![CDATA[<p>That first if block, where you&#8217;re reading in the first file, would be much faster with this:</p>
<p>%phone = map { ( unpack $layout1, $_ ) , $_ } &lt;$IN1&gt;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Harley Pig</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22251</link>
		<dc:creator>Harley Pig</dc:creator>
		<pubDate>Tue, 19 Dec 2006 17:37:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22251</guid>
		<description>Jayce--arghh ... yeah ... I'm still trying to internalize that one.

&#62; &#60;</description>
		<content:encoded><![CDATA[<p>Jayce&#8211;arghh &#8230; yeah &#8230; I&#8217;m still trying to internalize that one.</p>
<p>&gt; &lt;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jayce^</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22249</link>
		<dc:creator>Jayce^</dc:creator>
		<pubDate>Tue, 19 Dec 2006 17:18:30 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22249</guid>
		<description>Matthew - With fixed-width data especially, unpack is much faster: it has no need to save state, load a complex engine, or do the other background work a regex would need.  He still *could* have used regular expressions, but this is much more efficient.

Harleypig - as for: print "Match: " . $counter++ . "\r"

just to be pedantic to you :)  don't use the concat operator (.) in a print.  If you think about it, print is built to take a list, so why not just give it one?  Benchmark the difference to verify, but using concat forces Perl to build a new string first and then print it, instead of printing the list directly.</description>
		<content:encoded><![CDATA[<p>Matthew - With fixed-width data especially, unpack is much faster: it has no need to save state, load a complex engine, or do the other background work a regex would need.  He still *could* have used regular expressions, but this is much more efficient.</p>
<p>Harleypig - as for: print "Match: " . $counter++ . "\r"</p>
<p>just to be pedantic to you <img src='http://pthree.org/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  don&#8217;t use the concat operator (.) in a print.  If you think about it, print is built to take a list, so why not just give it one?  Benchmark the difference to verify, but using concat forces Perl to build a new string first and then print it, instead of printing the list directly.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron Toponce &#187; Blog Archive &#187; Using GeSHi</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22236</link>
		<dc:creator>Aaron Toponce &#187; Blog Archive &#187; Using GeSHi</dc:creator>
		<pubDate>Tue, 19 Dec 2006 15:49:38 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22236</guid>
		<description>[...] One of the strong features in GeSHi is the ability to recognize keywords and functions built into the language, and provide a link to the official documentation about that keyword. You may have noticed this with my last post about optimizing searches in Perl. Clicking on one of the functions will take you to the Perl documentation site about that function. Try &#8216;print&#8217; or &#8216;open&#8217; to see what I am talking about. [...]</description>
		<content:encoded><![CDATA[<p>[...] One of the strong features in GeSHi is the ability to recognize keywords and functions built into the language, and provide a link to the official documentation about that keyword. You may have noticed this with my last post about optimizing searches in Perl. Clicking on one of the functions will take you to the Perl documentation site about that function. Try &#8216;print&#8217; or &#8216;open&#8217; to see what I am talking about. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22229</link>
		<dc:creator>Aaron</dc:creator>
		<pubDate>Tue, 19 Dec 2006 15:06:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22229</guid>
		<description>Matthew-

In this case, no.  The files that I am searching have no rhyme or reason to the layout.  Well, they do, but it differs from file to file, and I don't know beforehand what the layout will be.  I only know where certain variables are in the layout.  So, with that, I can just use substr() or unpack(), as seen here, to get right to where I need to be in the file.</description>
		<content:encoded><![CDATA[<p>Matthew-</p>
<p>In this case, no.  The files that I am searching have no rhyme or reason to the layout.  Well, they do, but it differs from file to file, and I don&#8217;t know beforehand what the layout will be.  I only know where certain variables are in the layout.  So, with that, I can just use substr() or unpack(), as seen here, to get right to where I need to be in the file.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matthew Kimber</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22227</link>
		<dc:creator>Matthew Kimber</dc:creator>
		<pubDate>Tue, 19 Dec 2006 15:01:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22227</guid>
		<description>Couldn't you have just used regular expressions?</description>
		<content:encoded><![CDATA[<p>Couldn&#8217;t you have just used regular expressions?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Aaron</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22149</link>
		<dc:creator>Aaron</dc:creator>
		<pubDate>Mon, 18 Dec 2006 23:04:45 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22149</guid>
		<description>harleypig- thx for your response. A couple of things:

1) WordPress likes to lowercase anything that sits between &#60; and &#62;. That is what happened to this code: it got lowercased, but I do use uppercase filehandles. Also, you’ll notice “# &#60;/in1&#62;”. This is because WordPress will automatically close my tags if I don’t, so I just put it in the code.
2) I store the line under its key in the hash because I actually use it later. I probably shouldn’t have put it in my example code.
3) I think some of your example comes down to programmer preference, but in any case, it goes to show that there is more than one way to skin a cat! But I agree that it could be a bit more efficient.</description>
		<content:encoded><![CDATA[<p>harleypig- thx for your response. A couple of things:</p>
<p>1) WordPress likes to lowercase anything that sits between &lt; and &gt;. That is what happened to this code: it got lowercased, but I do use uppercase filehandles. Also, you’ll notice “# &lt;/in1&gt;”. This is because WordPress will automatically close my tags if I don’t, so I just put it in the code.<br />
2) I store the line under its key in the hash because I actually use it later. I probably shouldn’t have put it in my example code.<br />
3) I think some of your example comes down to programmer preference, but in any case, it goes to show that there is more than one way to skin a cat! But I agree that it could be a bit more efficient.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Harley Pig</title>
		<link>http://pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22144</link>
		<dc:creator>Harley Pig</dc:creator>
		<pubDate>Mon, 18 Dec 2006 22:20:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.pthree.org/2006/12/18/optimizing-searches-in-perl/#comment-22144</guid>
		<description>#!/usr/bin/perl -w

use strict;
       
# I realize a lot of these are not
# specifically related to your problem
# but I'm pedantic.

# First, filehandles in perl are
# idiomatically uppercase, just to
# help distinguish them.
#
# Second, use variable filehandles. It
# makes error handling easier.
#
# Third, predeclare as much as possible.
# It'll save a little bit of time. Right
# now you're declaring a variable every
# time through both loops, but it's not
# changing.

# I don't use (un)pack that often so I
# can't comment on that.

my %phone;

my $layout1 = '@1384 A10';
my $layout2 = '@0 A10';

if ( open my $IN1, "P1514.FIN" ) {
       
  while ( &#60;$IN1&#62; ) {

    chomp;

    # You don't need to assign a variable
    # for temporary use, it just slows you
    # down and takes up memory.

    # Why are you saving the line if you're
    # not using it later?
    $phone{ unpack $layout1, $_ } = '';

  }

  # $IN1 loses scope here.  Perl will
  # automatically close the file.

} else {

  # It's confusing when you get bogus data,
  # or no data.

  die "Unable to open P1514.FIN: $!\n";

}

if ( open my $IN2, "p1514.afo" ) {
 
  my $counter = 0;
       
  while ( &#60;$IN2&#62; ) {

    # I'm not sure what you're trying to
    # accomplish here, but taking what you
    # have, you can increase the efficiency
    # quite a bit by using a little more
    # idiomatic perl.

    print "Match: " . $counter++ . "\r"
      if exists $phone{ unpack $layout2, $_ };

  }
} else {

  die "Unable to open p1514.afo: $!\n";

}
 
print "\n";

# Not having a copy of the data I can't test this
# but it passes perl -c</description>
		<content:encoded><![CDATA[<p>#!/usr/bin/perl -w</p>
<p>use strict;</p>
<p># I realize a lot of these are not<br />
# specifically related to your problem<br />
# but I&#8217;m pedantic.</p>
<p># First, filehandles in perl are<br />
# idiomatically uppercase, just to<br />
# help distinguish them.<br />
#<br />
# Second, use variable filehandles. It<br />
# makes error handling easier.<br />
#<br />
# Third, predeclare as much as possible.<br />
# It&#8217;ll save a little bit of time. Right<br />
# now you&#8217;re declaring a variable every<br />
# time through both loops, but it&#8217;s not<br />
# changing.</p>
<p># I don&#8217;t use (un)pack that often so I<br />
# can&#8217;t comment on that.</p>
<p>my %phone;</p>
<p>my $layout1 = '@1384 A10';<br />
my $layout2 = '@0 A10';</p>
<p>if ( open my $IN1, "P1514.FIN" ) {</p>
<p>  while ( &lt;$IN1&gt; ) {</p>
<p>    chomp;</p>
<p>    # You don&#8217;t need to assign a variable<br />
    # for temporary use, it just slows you<br />
    # down and takes up memory.</p>
<p>    # Why are you saving the line if you&#8217;re<br />
    # not using it later?<br />
    $phone{ unpack $layout1, $_ } = '';</p>
<p>  }</p>
<p>  # $IN1 loses scope here.  Perl will<br />
  # automatically close the file.</p>
<p>} else {</p>
<p>  # It&#8217;s confusing when you get bogus data,<br />
  # or no data.</p>
<p>  die "Unable to open P1514.FIN: $!\n";</p>
<p>}</p>
<p>if ( open my $IN2, "p1514.afo" ) {</p>
<p>  my $counter = 0;</p>
<p>  while ( &lt;$IN2&gt; ) {</p>
<p>    # I&#8217;m not sure what you&#8217;re trying to<br />
    # accomplish here, but taking what you<br />
    # have, you can increase the efficiency<br />
    # quite a bit by using a little more<br />
    # idiomatic perl.</p>
<p>    print "Match: " . $counter++ . "\r"<br />
      if exists $phone{ unpack $layout2, $_ };</p>
<p>  }<br />
} else {</p>
<p>  die "Unable to open p1514.afo: $!\n";</p>
<p>}</p>
<p>print "\n";</p>
<p># Not having a copy of the data I can&#8217;t test this<br />
# but it passes perl -c</p>
]]></content:encoded>
	</item>
</channel>
</rss>
