
Introduction

A lot of information is available only as text and has to be parsed before programs can use it. There are several ways to do that:
  • use regular expressions and scripting
  • use a parser generator
  • use a language that has a parser and access the parse tree.
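As a rough illustration of the first and third options, here is a minimal Python sketch using only the standard library. The "key = value" line format and the function names `parse_settings` and `assigned_names` are assumptions made for this example, not something taken from an existing tool.

```python
import ast
import re

# Option 1: a regular expression plus a little scripting.
# The "key = value" line format is an assumption made for this example.
LINE_RE = re.compile(r"^\s*(?P<key>\w+)\s*=\s*(?P<value>.*?)\s*$")


def parse_settings(text):
    """Return a dict of key/value pairs, skipping lines that do not match."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            result[match.group("key")] = match.group("value")
    return result


# Option 3: reuse a language's own parser and walk the parse tree.
def assigned_names(source):
    """Return the plain names that are assigned to in a piece of Python source."""
    tree = ast.parse(source)
    return [
        target.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Assign)
        for target in node.targets
        if isinstance(target, ast.Name)
    ]
```

Note that the regular expression silently drops lines it does not understand, which is exactly the kind of unexplained failure the workshop idea below tries to make visible.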

Parsing workshop

The idea is to take a friendly, interactive parsing library such as pyparsing and add a GUI that makes it easy to develop parsers from examples of data. Parsers tend to be very binary: when some data is not parsed, there is no information on why it failed. With pyparsing it would be nice to show which rules match and where they match. For example, allocate a color to each rule and color the areas it recognizes. Or put markers on the side? Or color the text with an intensity that depends on the number of nested rule levels that recognize it (i.e. text recognized by a single rule is light green, and text recognized by a rule that contains the first rule is darker green).
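The data a GUI would need for such coloring could be computed along the following lines. This is only a sketch: it uses the standard `re` module as a stand-in for pyparsing rules, the rule names `number` and `date` are hypothetical, and it counts overlapping matches per character rather than tracking true rule nesting.

```python
import re

# Hypothetical rules for illustration: each name maps to a regular expression.
# Digits inside a date are recognized by both rules, so they get depth 2.
RULES = {
    "number": re.compile(r"\d+"),
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def rule_spans(text, rules=RULES):
    """Return (rule_name, start, end) for every match of every rule."""
    spans = []
    for name, pattern in rules.items():
        for match in pattern.finditer(text):
            spans.append((name, match.start(), match.end()))
    return spans


def coverage_depth(text, rules=RULES):
    """Return, for each character, how many rule matches cover it."""
    depth = [0] * len(text)
    for _name, start, end in rule_spans(text, rules):
        for i in range(start, end):
            depth[i] += 1
    return depth
```

A depth of 0 marks text that no rule recognizes, which is where a user would look first when a parse fails. Higher depths could map to darker shades of green.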

It should be easy to make some rules fuzzy. That is, they should accept many variants of text effortlessly, but without eating up all the text that comes afterwards.
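One common way to get this behaviour is a lazy match that is stopped by a lookahead for the next known delimiter. The sketch below shows it with the standard `re` module; pyparsing offers a `SkipTo` element for a comparable purpose. The "label: free text; label: free text" record format and the name `parse_fields` are assumptions made for this example.

```python
import re

# Hypothetical record format: "label: free text; label: free text".
# The value is the fuzzy part: any characters, matched lazily, and stopped by
# a lookahead at the next "; label:" delimiter or at the end of the input.
FIELD_RE = re.compile(
    r"(?P<label>\w+)\s*:\s*"      # strict part: the field label
    r"(?P<value>.*?)"             # fuzzy part: lazy, accepts any variant
    r"\s*(?=;\s*\w+\s*:|$)",      # boundary: next field or end of text
    re.DOTALL,
)


def parse_fields(text):
    """Return a dict mapping each label to its free-text value."""
    return {m.group("label"): m.group("value") for m in FIELD_RE.finditer(text)}
```

The lookahead checks the boundary without consuming it, so the fuzzy value can contain almost anything (including a stray semicolon) while the following field is still left intact for the next match.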

This particular version was published on 20-Sep-2011 13:56 by pgaillard.