How to convert an HTML file into a hash in Perl?
Is there a simple way to convert an HTML file into a Perl hash? For example, a working Perl module or something similar?
I searched cpan.org but didn't find anything that does what I want. I want to do something like this:
    use example::module;
    $hashref = example::module->new('/path/to/mydoc.html');
After that, I want to refer to the second div element like this:
    my $second_div = $hashref->{'body'}->{'div'}[1];
    # or this:
    $second_div = $hashref->{'body'}->{'div'}->findbyclass('.myclassname');
    # or this:
    $second_div = $hashref->{'body'}->{'div'}->findbyid('#myid');
Is there a working solution for this?
HTML::TreeBuilder::XPath gives you a lot more power than a simple hash would.
From the synopsis:
    use HTML::TreeBuilder::XPath;

    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_file("mypage.html");

    my $nb = $tree->findvalue('/html/body//p[@class="section_title"]/span[@class="nb"]');
    my $id = $tree->findvalue('/html/body//p[@class="section_title"]/@id');

    my $p = $tree->findnodes('//p[@id="toto"]')->[0];
    my $link_texts = $p->findvalue('./a');   # text of the a elements in $p

    $tree->delete;   # to avoid memory leaks if you parse many HTML documents
More on XPath.
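To map this back to the three lookups in the question (second div, by class, by id), here is a rough sketch using the same module. The inline HTML string and the variable names are made up for illustration, and the class test is an exact match on the attribute value, so it would not match an element that carries several classes.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample document, used here instead of a file on disk.
my $html = '<html><body>'
         . '<div>first</div>'
         . '<div class="myclassname">second</div>'
         . '<div id="myid">third</div>'
         . '</body></html>';

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_content($html);   # use parse_file("mydoc.html") for a file
$tree->eof;

# Second div directly under body (XPath positions start at 1, not 0).
my ($second)   = $tree->findnodes('/html/body/div[2]');

# A div whose class attribute is exactly "myclassname".
my ($by_class) = $tree->findnodes('//div[@class="myclassname"]');

# The div with a given id.
my ($by_id)    = $tree->findnodes('//div[@id="myid"]');

# Each result is an HTML::Element, so as_text and attr are available.
print $second->as_text, "\n";
print $by_class->attr('class'), "\n";
print $by_id->as_text, "\n";
```

When you are done with the tree, call $tree->delete as in the synopsis, especially if you parse many documents in one process.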