Analyzer project definition file. Used by the VisualText GUI to load an analyzer.
A file associated with a user pass in the analyzer sequence which holds NLP++ rules and code.
An NLP++ function specialized for a particular region in a pass file. For example, the single action only works in the @POST region, after a rule has matched.
A region related to rule matching. Includes the @PRE, @CHECK, and @POST regions.
In general programming, a method or procedure for accomplishing a task. In VisualText, sometimes short for a pass algorithm.
The part of the VisualText interface in which you control the text analyzer sequence, passes and pass files.
A text analyzer, i.e., a program that takes text as input and processes it in some way. Text analyzers typically transform, critique, or extract information from text.
The full set of folders and files associated with a text analyzer. Includes files that define the analyzer, input files, and other data files used by the analyzer.
The file that is used by the VisualText GUI to load an analyzer. Also called a .ana file.
A table that displays the decimal and hexadecimal numbers and their corresponding ASCII characters.
(1) Abbreviation for attribute. (2) A rule element modifier.
A knowledge base object attached to a concept and representing a property of the concept. Consists of a key (or name) and one or more values. A concept may have multiple attributes.
A VisualText tool that allows you to change, add and edit the attributes and/or values of a concept (or node) in the knowledge base.
Refers to rule elements that don't overrun an adjacent rule element. Only an _xWILD with no lists (match, fail, except) is "backup aware".
Refers to the absolute basic or minimal analyzer. The Bare template comes with the system passes tokenize and lines.
Refers to running a group of input files non-interactively.
GNU free software for writing grammars with associated code. Similar to YACC (Yet Another Compiler Compiler).
A single whitespace character.
When a matched rule keeps another rule from matching.
A C Programming Language code file.
Abbreviation for Conceptual Grammar.
A VisualText tool that shows a count of characters in a line of text and the ASCII characters for each line of text.
A function restricted to operate in the @CHECK region, i.e., after a rule's right-hand-side phrase has matched.
An action region of a pass file. NLP++ code in this region applies after the matcher has succeeded in matching a rule.
A concept that is the child of another concept.
A function that operates in the @CODE region. Some code actions are retained because they are still useful, but they may be overhauled in future VisualText releases.
The region delimited by @CODE and optionally @@CODE. Executes NLP++ code prior to any rule matching in the pass file.
The part of a pass file where NLP++ code that is independent of rule matching is written.
A batch file containing commands that edit or add to the knowledge base.
A knowledge base object. The VisualText knowledge base consists of a hierarchy of concepts.
A method of programming that deals with the concepts underlying a task.
A parse tree containing only nodes that have been built in the selected analyzer pass.
A knowledge representation framework consisting of concepts, attributes, and phrases combined into knowledge hierarchies and graphs.
The part of a pass file where methods for selecting nodes of the parse tree are specified.
Abbreviation for Concept Oriented Programming.
Region for user-defined NLP++ functions; delimited by @DECL and (optionally) @@DECL.
A concept in the dict hierarchy of the knowledge base. Also called word concept.
A VisualText tool that allows you to make, add and edit a dictionary database. [Currently unavailable.]
Remove a top-level node and move (or splice) its children to the top level.
An element modifier that specifies an action to be performed when the current rule has matched. For example, "rename=noun" specifies that the node matching the current element is to be renamed "noun."
A keyword or keyword and value pair that affects the matching or follow-on actions of a rule element. For example, "plus" specifies that one or more nodes must match the current element.
(1) To prefix a character with an escape character. (2) A character used as an escape character.
A character that indicates the succeeding characters should be taken literally rather than interpreted as special characters.
A list of elements that are exceptions to a match or fail list. For example, _xWILD [fail=( A B C ) except=( D )] will fail on A, B, or C unless the node also matches D. This could be used, for instance, to fail on nouns except for humanNouns.
Remove a sequence of nodes from a parse tree.
A list of elements that will cause a match to fail. For example, _xWILD [fail=( A B C )] will match nodes until it encounters one named A, B, or C.
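The interplay of the match, fail, and except lists can be sketched with a toy model. This is not the VisualText rule matcher: the function name `wild_span`, its parameters, and the idea of representing each node as a set of labels are all assumptions made for illustration.

```python
def wild_span(nodes, match=(), fail=(), except_=()):
    """Count how many leading nodes a toy wildcard would consume.

    Hypothetical model: each node is a set of labels it "matches".
    """
    match, fail, except_ = set(match), set(fail), set(except_)
    taken = 0
    for labels in nodes:
        labels = set(labels)
        excepted = bool(labels & except_)
        if match:
            # Match list: consume only listed nodes, unless excepted.
            ok = bool(labels & match) and not excepted
        elif fail:
            # Fail list: stop at a listed node, unless excepted.
            ok = not (labels & fail) or excepted
        else:
            # No lists: the wildcard accepts any node.
            ok = True
        if not ok:
            break
        taken += 1
    return taken


sentence = [{"det"}, {"adj"}, {"noun", "humanNoun"}, {"verb"}, {"noun"}]
print(wild_span(sentence, fail=["noun"]))
print(wild_span(sentence, fail=["noun"], except_=["humanNoun"]))
print(wild_span(sentence, match=["det", "adj"]))
```

With the fail list alone, the wildcard stops at the first noun. Adding the except list lets it pass over the node that is also a humanNoun and stop only at the plain noun.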
When a rule match causes another rule to succeed.
A concept to help organize related sets of samples in the Gram Tab. Folder concepts contain other folder concepts and/or rule concepts.
The hierarchy of concepts visible in the Gram Tab, and used to manage concepts for stubs, folders, rules, labels, and samples.
The part of the VisualText interface in which you manage samples taken from input texts.
The part of the pass file where rules and actions are written.
The part of a pass file where the main rules of the pass file are written.
Views text files as hexadecimal characters.
Abbreviation for integrated development environment. Refers to a user interface (GUI) program with tools that work in concert to support developers.
A user interface (GUI) program with tools that work in concert to support developers.
A parse tree as modified by a selected pass in the Ana Tab.
A node that is not a leaf of the parse tree. See nonterminal node.
Abbreviation for knowledge base.
A VisualText tool that allows you to edit the knowledge base.
Abbreviation for knowledge base management system.
(1) A hierarchical database. (2) A repository for concepts, relationships, and other knowledge, typically organized in a meaningful way.
A basic unit of knowledge in the Knowledge Base.
A software system for managing a knowledge base. Analogous to a database management system.
A member of a list of nodes called a phrase. Each concept in the knowledge base may own one phrase. A node is very similar to a concept, except that it is not attached to the hierarchy. A node is often treated as an instance of a concept in the hierarchy.
A concept in the Gram Tab hierarchy that holds subsamples of larger samples. Label concepts can occur only under rule concepts. For example, a label concept called "AreaCode" might be placed under a rule concept named "PhoneNumber".
Same as leaf node.
A token node of the parse tree. A leaf token has no children. Also called a terminal node.
Same as leaf node.
(1) The lefthand side element of a rule, also called the suggested element. (2) The suggested node built for the lefthand side of a rule.
A prebuilt pass that may be copied into the current analyzer sequence.
A parse tree node that collects a list of related nodes.
A node or token that represents a literal string in the input text.
A list of elements that must be matched. For example, _xWILD [match=( A B C )] will match nodes as long as they are named A, B, or C.
The part of a pass file where nested minipasses can be specified.
A selector used in the SELECT Region of a pass file which specifies a list of node names for the rule matcher. The whole subtree under the specified node is searched.
Principled method for constructing NLP systems.
Abbreviation for natural language engineering.
Abbreviation for natural language processing, a subfield of Artificial Intelligence concerned with human languages.
Proprietary programming language of Text Analysis International. A general programming language with specializations for natural language processing.
(1) A parse tree node. (2) A knowledge base node.
See parse tree node variable.
A selector used in the SELECT Region of a pass file which specifies a list of node names for the rule matcher. The phrase immediately under the specified node is searched.
A parse tree node that may represent more than one token. The name of a nonliteral always starts with an underscore character.
A node that dominates another node in a parse tree. That is, it has children.
A reduce action in the @POST region that does nothing. Used to override the default action of an empty @POST region. (When a @POST region is non-empty, noop becomes the default action.)
A number that indicates character position of a node in text.
A hierarchy of concepts, typically used to categorize the world.
Part of an NLP++ expression that performs a mathematical, logical, or other function. For example, "+" is an operator used to add two numbers or concatenate two strings.
Element modifier denoting an optional element.
Element modifier denoting an optional rule element.
An action that causes output to be written to a file. For example, ndump writes all the variables and values for a node out to a file.
(1) To ascribe structure to a linear sequence of words or symbols according to the rules of a grammar. (2) To create a parse tree representing a text. (3) A parse tree or interpretation of a text.
A data structure that tracks patterns matched in an input text.
A unit or data structure representing a piece of text or an idea that the text represents. Nodes are combined to form a parse tree.
A program that takes a text as input and produces a parse tree as output.
A discrete step in the text analyzer with its own pass algorithm.
The method to be executed for a pass in the analyzer. Examples are: tokenize, pattern, and recursive algorithms.
A file of NLP++ code and rules associated with a pass of the text analyzer. Pattern and Recursive pass algorithms make use of pass files.
See Pattern algorithm.
A selector used in the SELECT Region of a pass file which specifies a path of node names for the rule matcher.
A pass algorithm that executes the rules in a pass file by traversing the parse tree once.
Name for the parse tree node data type.
An action that occurs in the @POST region. For example, single tells the rule matcher to build a new node for a matched rule.
An action region of a pass file. NLP++ code in this region applies after a rule match has been accepted.
Action that occurs in the @PRE region. Further constrains the matching of a rule element.
An action region of a pass file. Actions in this region represent additional conditions on the matching of individual rule elements.
An action that prints to a file.
See Recursive algorithm.
A named region that is marked by an initial @RECURSE and specifies a minipass within a pass file. The minipass is invoked by a recurse element modifier in a rule element, e.g., in the main Grammar Zone.
An algorithm whereby the same action is taken repeatedly until some condition or goal is achieved.
A pass algorithm that executes the rules in a pass file by traversing the parse tree multiple times, till no rules match.
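The difference between a single traversal and repeated traversals can be sketched as follows. This is a simplified model, assuming a flat list of node names instead of a parse tree; the helper names `apply_once` and `recursive_pass` and the sample rules are hypothetical.

```python
def apply_once(nodes, rules):
    """One left-to-right traversal; returns (new_nodes, matched_any)."""
    out, i, matched = [], 0, False
    while i < len(nodes):
        for lhs, rhs in rules:
            if nodes[i:i + len(rhs)] == rhs:
                # Replace the matched phrase with the suggested name.
                out.append(lhs)
                i += len(rhs)
                matched = True
                break
        else:
            # No rule matched at this position; keep the node as is.
            out.append(nodes[i])
            i += 1
    return out, matched


def recursive_pass(nodes, rules):
    """Repeat traversals until one of them matches no rule."""
    traversals = 0
    while True:
        nodes, matched = apply_once(nodes, rules)
        traversals += 1
        if not matched:
            return nodes, traversals


rules = [
    ("_np", ["_det", "_noun"]),
    ("_np", ["_np", "_pp"]),
    ("_pp", ["_prep", "_np"]),
]
text = ["_det", "_noun", "_prep", "_det", "_noun"]
print(apply_once(text, rules)[0])   # single traversal
print(recursive_pass(text, rules))  # repeated until nothing matches
```

In this sketch a single traversal leaves `_np _prep _np`, while the repeated version keeps going until everything has been reduced to one `_np`.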
A set of rules that is executed repeatedly, till no rules match.
See Recursive algorithm.
(1) Place a sequence of one or more nodes under a new node. Often refers to placing the nodes that match the righthand side of a rule under a new node named with the lefthand side of a rule. (2) A reduce action.
An action in the @POST region that builds a new node for a matched list of nodes.
A reduce action.
Same as recursive grammar.
Rule File Analyzer. A VisualText analyzer that reads the pass files for a text analyzer. A bootstrapping parser that reads a simplified dialect of NLP++.
Similar to the RFA, but parses the full NLP++ language.
(1) The righthand side phrase of a rule. (2) The nodes that matched the righthand side.
The top-level node of the parse tree. A node with no parent.
Abbreviation for the automated Rule Generation machinery.
An NLP++ construct consisting of a phrase of elements and a suggested element. Actions for a matched rule are governed by preceding @PRE, @CHECK, and @POST regions, if present.
A specialized NLP++ function that knows about particular contexts.
A Gram hierarchy concept that owns a set of related samples, from which the rule generator will generalize, merge, and generate rules. A rule concept may have label concepts under it and may own a set of samples.
A literal or nonliteral unit in the phrase of a rule.
Synonym for pass file.
See RFA and RFB.
A region delimited by @RULES and optionally @@RULES, for holding NLP++ rules.
A piece of a larger text, representing a discrete idea. For example, "310-555-1212" is a sample telephone number.
A concept that stores a single sample or subsample in the Gram Tab. Sample concepts are placed under rule concepts and label concepts.
Same as Gram hierarchy.
A region optionally delimited by @SELECT and @@SELECT that specifies how context nodes are to be selected in the parse tree.
A marker such as @NODES which specifies which nodes will be subjected to rule matching.
The standard reduce action in the @POST region. Specifies that a new node is to be built, with the matched phrase of nodes to be placed underneath the new node.
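As a rough illustration of what such a reduce does to the tree, here is a minimal sketch. The `Node` class and the `single` and `splice` helpers are hypothetical stand-ins, not the NLP++ implementation; `splice` is included only as the inverse operation, lifting a node's children back up in its place.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def single(phrase, start, end, suggested):
    """Replace phrase[start:end] with one new node that owns those nodes."""
    new_node = Node(suggested, phrase[start:end])
    return phrase[:start] + [new_node] + phrase[end:]


def splice(phrase, index):
    """Remove the node at `index` and move its children up in its place."""
    return phrase[:index] + phrase[index].children + phrase[index + 1:]


tokens = [Node(t) for t in ["310", "-", "555", "-", "1212"]]
reduced = single(tokens, 0, 5, "_phoneNumber")
print([n.name for n in reduced])
print([n.name for n in reduced[0].children])
print([n.name for n in splice(reduced, 0)])
```

After the reduce, the phrase holds a single `_phoneNumber` node whose children are the five original tokens; splicing that node restores the original token sequence.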
A phrase element modifier.
A program that automatically searches the World Wide Web and retrieves documents and links to those documents.
The process of searching the World Wide Web and automatically retrieving documents and links to those documents.
Same as parse tree.
A placeholder for a sequence of passes that will be generated automatically.
A Gram Tab concept that is associated with a region of automatically generated passes in the analyzer sequence.
A sequence of automatically generated passes within the overall analyzer sequence.
A node to which matched elements of a rule are reduced.
An attribute assigned by the system to a concept in the knowledge base. System attributes include algo, type, active, passnum, etc. Altering system attributes of a concept is not recommended.
A pass in the analyzer sequence that is defined by the system, not the user. The tokenize pass is an example of a system pass.
A starter or skeletal analyzer that can be copied and modified.
A node that does not dominate another node in a parse tree. Also called a leaf node.
The part of the VisualText interface where you manage input and output files.
A node representing a literal text string.
(1) Pass algorithm for converting an input text to an initial parse tree. (2) The default system and first pass in an analyzer sequence.
A node being traversed by the rule matcher.
An element modifier that causes a rule to be matched by first matching the associated rule element.
See node variable.
An action that operates on a node variable.
Also written whitespace. Any of space, tab, carriage return, or newline.
See dictionary concept.
Yet Another Compiler Compiler. See Bison.