NOTE: The latest VisualText is required for running and modifying TAIParse. Even the compiled version needs runtime libraries bundled with VisualText. DOWNLOAD HERE.
Download TAIParse here (geared to VisualText 2).
TAIParse also performs part-of-speech tagging at 94% accuracy on a blind test corpus of business articles.
Natural language processing (NLP) generally refers to the complete linguistic and conceptual processing of a text.
To facilitate the construction of natural language processing products, TAI is now making available some general text analysis prototypes that can be used as a starting point for a host of applications, such as information extraction, categorization, summarization, and question parsing.
TAIParse is a general analyzer that emphasizes the minimal use of knowledge ("just-in-time" knowledge) to perform part-of-speech tagging, entity extraction, and shallow parsing. TAIParse is an excellent starting point for customizing your own text analysis capabilities.
For one thing, TAIParse includes a full lexicon with part-of-speech information within its knowledge base. For another, it illustrates the latest features of the NLP++® language in action. TAIParse further illustrates the ease of implementation of NLP systems with the VisualText® IDE (SDK, tools, etc.). TAIParse includes these capabilities and more:
- Zoning and "parsing-per-line" to characterize regions and formats in text
- Dynamic and context-dependent part-of-speech assignment and parsing
- Successive segmentation of text in a "divide-and-conquer" strategy
- Treatment of unknown words
- Noun phrase extraction
- A semantic and discourse processing framework that ties into an ontology and dynamic representation of the analysis within the knowledge base
- Processing INDEPENDENT of capitalization, so that, for example, all-uppercase text regions can be analyzed
- Robust analysis in the face of errors, misspellings, and ungrammatical text
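To make two of the capabilities above concrete (capitalization-independent processing and treatment of unknown words), here is a minimal Python sketch. It is not TAIParse code and not NLP++: the tiny `LEXICON`, the `SUFFIX_GUESSES` table, and the function names are all hypothetical, chosen only to illustrate the general idea of a case-insensitive lexicon lookup with a fallback guess for words the lexicon does not contain.

```python
# Hypothetical toy lexicon; TAIParse's real lexicon lives in its knowledge base.
LEXICON = {
    "the": "det",
    "company": "noun",
    "reported": "verb",
    "strong": "adj",
    "earnings": "noun",
}

# Illustrative suffix heuristics for unknown words (an assumption, not TAIParse's method).
SUFFIX_GUESSES = [
    ("ly", "adv"),
    ("ing", "verb"),
    ("ed", "verb"),
    ("tion", "noun"),
    ("s", "noun"),
]


def tag_word(word):
    """Return a part-of-speech guess for one word, ignoring its capitalization."""
    key = word.lower()  # lookup does not depend on case
    if key in LEXICON:
        return LEXICON[key]
    # Unknown word: fall back to the first matching suffix heuristic.
    for suffix, pos in SUFFIX_GUESSES:
        if key.endswith(suffix) and len(key) > len(suffix):
            return pos
    return "noun"  # default guess when nothing else applies


def tag_text(text):
    """Tag each whitespace-separated token, keeping the original spelling."""
    return [(token, tag_word(token)) for token in text.split()]


if __name__ == "__main__":
    print(tag_text("THE COMPANY REPORTED STRONG EARNINGS"))
    print(tag_word("restructuring"))
```

Because the lookup key is lowercased, an all-uppercase headline is tagged the same way as ordinary text. A context-dependent analyzer would go further and revise these first guesses using the surrounding words.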