This project forked from gisle/html-parser

The HTML-Parser distribution is a collection of modules that parse and extract information from HTML documents

Home Page: http://search.cpan.org/dist/HTML-Parser/

OVERVIEW

The HTML-Parser distribution is a collection of modules that parse
and extract information from HTML documents.  The modules present in
this collection are:

  HTML::Parser - The parser base class.  It receives arbitrarily sized
        chunks of the HTML text, recognizes markup elements, and
        separates them from the plain text.  As different kinds of markup
        and text are recognized, the corresponding event handlers are
        invoked.

  HTML::Entities - Provides functions to encode and decode text with
        embedded HTML <entities>.

  HTML::HeadParser - A lightweight HTML::Parser subclass that extracts
        information from the <HEAD> section of an HTML document.

  HTML::LinkExtor - An HTML::Parser subclass that extracts links from
        an HTML document.

  HTML::PullParser - An alternative interface to the basic parser
        that does not require event-driven programming.

  HTML::TokeParser - An HTML::PullParser subclass with fixed
        token setup and methods for extracting text.  Many simple
        parsing needs are probably best attacked with this module.
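
As a rough illustration of the two styles described above, the following
sketch first uses the event-driven HTML::Parser interface, feeding the
document in two chunks, and then uses HTML::TokeParser to pull links out
of a document.  The sample HTML strings and all variable names are made
up for this example; treat it as a minimal sketch rather than the
recommended way to use either module.

```perl
use strict;
use warnings;
use HTML::Parser ();
use HTML::TokeParser ();

# --- Event-driven style (HTML::Parser, version 3 API) ---------------
# Handlers are invoked as markup and text are recognized.  The argspec
# strings ('tagname', 'dtext') select what each handler receives.
my @tags;
my $text = '';
my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [ sub { push @tags, $_[0] }, 'tagname' ],
    text_h      => [ sub { $text .= $_[0] },    'dtext'   ],
);

# The document may arrive in arbitrary chunks; here the split falls
# in the middle of a start tag (hypothetical sample input).
$parser->parse($_) for '<p>Fish &amp; <b', '>chips</b></p>';
$parser->eof;    # signal that the document is complete

print "start tags: @tags\n";
print "text: $text\n";

# --- Pull style (HTML::TokeParser) ----------------------------------
# No handlers: the caller asks for the next tag or text as needed.
my $doc = '<title>Example</title>'
        . '<a href="http://example.com/">Home page</a>';
my $tp = HTML::TokeParser->new(\$doc)
    or die "Cannot create token parser";

my @links;
while (my $tag = $tp->get_tag('a')) {
    my $href = $tag->[1]{href};      # $tag->[1] is the attribute hash
    next unless defined $href;
    push @links, [ $href, $tp->get_trimmed_text('/a') ];
}

print "$_->[0] ($_->[1])\n" for @links;
```

The pull style tends to read more naturally for small extraction tasks,
while the handler style suits streaming input that arrives piece by
piece.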

In addition, take a look at the HTML-Tree package, which builds on
HTML::Parser to create and extract information from HTML syntax trees
(similar to the HTML DOM).


PREREQUISITES

In order to install and use this package you will need Perl version
5.8 or better.  The HTML::Tagset module should be installed.

If you intend to use HTML::HeadParser, you probably want to install
libwww-perl too.


INSTALLATION

Just follow the usual procedure:

   perl Makefile.PL
   make
   make test
   make install


REPORTING BUGS

Bug reports and issues for discussion about these modules can be sent
to the <[email protected]> mailing list.


COPYRIGHT

  © 1995-2016 Gisle Aas. All rights reserved.
  © 1999-2000 Michael A. Chase.  All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
