TagSoup

TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.
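
As a rough illustration of what the SAX interface allows, the sketch below collects link targets through a plain SAX ContentHandler. It relies on one fact about the library: TagSoup's parser class, org.ccil.cowan.tagsoup.Parser, implements org.xml.sax.XMLReader. Everything else (the LinkExtractor class, the extractLinks method, the sample markup) is hypothetical and exists only for this example. So that the sketch runs without the TagSoup jar, it uses the JDK's built-in SAX reader as a stand-in, which accepts only well-formed input.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of TagSoup.
class LinkExtractor {

    /** Collects the href value of every <a> element the reader reports. */
    public static List<String> extractLinks(XMLReader reader, String markup) throws Exception {
        final List<String> links = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // localName is empty when the reader is not namespace-aware, so fall back to qName.
                String name = (localName == null || localName.isEmpty()) ? qName : localName;
                if ("a".equalsIgnoreCase(name)) {
                    String href = atts.getValue("href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        });
        reader.parse(new InputSource(new StringReader(markup)));
        return links;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in reader from the JDK; it requires well-formed markup.
        // Assumption: with the TagSoup jar on the classpath, the reader would instead be
        //   XMLReader reader = new org.ccil.cowan.tagsoup.Parser();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        String xhtml = "<html><body><a href=\"http://example.org/\">x</a></body></html>";
        System.out.println(extractLinks(reader, xhtml)); // prints [http://example.org/]
    }
}
```

Because extractLinks depends only on the XMLReader interface, the handler stays the same whichever parser supplies the events; only the line that creates the reader changes.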

 
Category: HTML Parsers
License: GNU General Public License (GPL)
HomePage: http://mercury.ccil.org/~cowan/XML/tagsoup/
