1 Introduction
Sav Associative Processor is a Java class library for direct data access. It provides developers with a low-level Java API (application programming interface) for building an electronic dictionary, knowledge base, or data indexer as part of a larger text, database, or voice processor. The Sav Processor makes it possible to associate quickly from one set of objects to another.
2 Operations of the Set Theory
The Sav Processor is easy to describe in terms of set theory. Its objects fall into two classes: 1. the Association class represents a set of elements; 2. the Concept class represents a single element of a set. Three base methods define the Association class: 1. set() forms the union of concept associations; 2. get() forms the intersection of associations; 3. clear() forms the difference of associations. The parameter of each method may be an association, a concept, or a string. The following Java program demonstrates the Processor's methods. Note that all of the Processor's classes are imported from the Sav.Processor package.
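The three operations can also be illustrated independently of the library. The sketch below is not the Processor's API: it uses java.util.TreeSet as a stand-in for an association, and the names SetOpsSketch, union, intersection, and difference are hypothetical. It only shows the set semantics that the text attributes to set(), get(), and clear().

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Stand-in for the set semantics only; this is NOT the Sav.Processor API.
class SetOpsSketch {
    // Union: the operation the text attributes to set().
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // Intersection: the operation the text attributes to get().
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Difference: the operation the text attributes to clear().
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = new TreeSet<>(List.of("JDK", "JBuilder"));
        Set<String> b = new TreeSet<>(List.of("JBuilder", "Delphi"));
        System.out.println("union        = " + union(a, b));
        System.out.println("intersection = " + intersection(a, b));
        System.out.println("difference   = " + difference(a, b));
    }
}
```

Each helper copies its first argument before modifying it, so the input sets are left unchanged. Whether the Processor's own methods modify the receiver is not something this sketch tries to answer.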
3 Extraction
One association can be extracted from another with the see() method. The see() method specifies the first symbols of the concept strings in the target association. It takes one parameter: a Concept, a String, or a constant of the PN (Processor's Notation) class. To visit every concept in an association, use the getFirst() and getNext() methods in a loop.
To extract concepts whose first character lies in a given code range, use the following Processor constants, which set the concept notation: PN.NATURAL for a lowercase English letter in the 96 - 127 code range, PN.CAPITAL for a letter in the 64 - 95 code range, and PN.ARITHMETICAL for a sign in the 32 - 63 code range. The PN.NUMBER constant denotes an integer in the numerical range 0 - 1073741823. See this fragment.
a.set("9").set("10").set("#10").set("#9"); // a = {10, 9, #9, #10}
a.see(PN.NUMBER);                          // a = {#9, #10}
c = a.getFirst();                          // c = #9, because 9 numeric < 10 numeric
a.see(PN.ARITHMETICAL);                    // a = {10, 9}
c = a.getFirst();                          // c = 10, because code of the '1' < code of the '9'
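The two orderings in the fragment above differ because one compares numeric values and the other compares character codes. The sketch below reproduces that difference with plain Java; it does not call the Processor, and the names OrderingSketch, numericLess, and lexicalLess are hypothetical.

```java
// Plain-Java illustration of numeric versus character-code ordering.
// This is NOT the Sav.Processor API.
class OrderingSketch {
    // Compares the integer values, as the fragment describes for #9 and #10.
    static boolean numericLess(String x, String y) {
        return Integer.parseInt(x) < Integer.parseInt(y);
    }

    // Compares character by character, as the fragment describes for "10" and "9".
    static boolean lexicalLess(String x, String y) {
        return x.compareTo(y) < 0;
    }

    public static void main(String[] args) {
        System.out.println("9 < 10 numerically:   " + numericLess("9", "10"));
        System.out.println("\"10\" < \"9\" by code: " + lexicalLess("10", "9"));
    }
}
```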
Concepts that begin with foreign letters can be extracted as well, for example Russian letters in the Unicode range 1072 - 1103 through PN.RUSSIAN_N and in the range 1040 - 1071 through PN.RUSSIAN_C.
Consider an example program that groups tokens by their first letters. A concept's string is obtained with the toString() method.
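As a rough illustration of grouping by first letter, the sketch below classifies tokens by the code ranges quoted in this section, using only java.util. It is not the Processor's API: the class TokenGroupingSketch, its methods, and the "OTHER" label are hypothetical, the group labels merely echo the names of the PN constants, and the number notation (PN.NUMBER) is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Groups tokens by the code range of their first character.
// Stand-in only; this is NOT the Sav.Processor API.
class TokenGroupingSketch {
    // Returns a label for the code range of the token's first character.
    // "OTHER" is a hypothetical label for anything outside the quoted ranges.
    static String notationOf(String token) {
        if (token.isEmpty()) return "OTHER";
        int code = token.charAt(0);
        if (code >= 96 && code <= 127) return "NATURAL";
        if (code >= 64 && code <= 95) return "CAPITAL";
        if (code >= 32 && code <= 63) return "ARITHMETICAL";
        if (code >= 1072 && code <= 1103) return "RUSSIAN_N";
        if (code >= 1040 && code <= 1071) return "RUSSIAN_C";
        return "OTHER";
    }

    // Builds a map from range label to the tokens in that range, in input order.
    static Map<String, List<String>> group(List<String> tokens) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String token : tokens) {
            groups.computeIfAbsent(notationOf(token), k -> new ArrayList<>()).add(token);
        }
        return groups;
    }

    public static void main(String[] args) {
        // The escapes spell two Russian words, one lowercase and one capitalized.
        List<String> tokens = List.of("java", "Java", "9", "+",
                "\u043c\u0438\u0440", "\u041c\u0438\u0440");
        group(tokens).forEach((label, members) ->
                System.out.println(label + " -> " + members));
    }
}
```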
4 Connection
The ability to connect concepts is essential for systems that handle grammar checking, translation, database work, and speech recognition. The Sav Processor works with connections of the types defined by the constants PN.NAME, PN.RELATION, and PN.VALUE. These constants are passed as the parameter of the con() method. A Concept or a String can also be passed to con().
Consider a complete example that uses con() in combination with the following methods: clone() clones an object, fix() fixes a connection pass, regain() regains a fixed connection pass, and store() stores an association in a file. The next program indexes a file if it contains "JDK", "VisualJ++", or "JBuilder", which are instances of a "Java tool" class.
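The decision described above, indexing a text only when it mentions an instance of a given class, can be sketched without the library. The code below does not use con(), fix(), regain(), or store(), and makes no claim about how they work. It assumes a plain map from a class name to its instances and checks a string rather than a file; the names ClassInstanceIndexSketch, instancesFound, and shouldIndex are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Decides whether a text mentions any instance of a named class.
// Stand-in only; this is NOT the Sav.Processor API.
class ClassInstanceIndexSketch {
    // Class name -> instance names, taken from the example in the text.
    static final Map<String, List<String>> INSTANCES =
            Map.of("Java tool", List.of("JDK", "VisualJ++", "JBuilder"));

    // Returns the instances of className that occur in the text.
    static Set<String> instancesFound(String className, String text) {
        Set<String> found = new TreeSet<>();
        for (String instance : INSTANCES.getOrDefault(className, List.of())) {
            if (text.contains(instance)) {
                found.add(instance);
            }
        }
        return found;
    }

    // A text qualifies for indexing if at least one instance occurs in it.
    static boolean shouldIndex(String className, String text) {
        return !instancesFound(className, text).isEmpty();
    }

    public static void main(String[] args) {
        String text = "A review of JBuilder and the JDK.";
        System.out.println(instancesFound("Java tool", text));
        System.out.println(shouldIndex("Java tool", text));
    }
}
```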
5 Database Indexing Sample
The Sav Processor can serve as a powerful core for database systems. Capability tests showed that the Processor is highly competitive with quality relational database systems, such as Microsoft Access, in access time, indexing speed, and required index space. The next example demonstrates indexing table data and then accessing the set of indexes.
After running the DBSession class, we see the following output:
READING ...
#  Program type  Program name  Article thema  Article file  Frequency
0  SQL tool      Delphi        comparison     c1.txt        011
1  PC OS         Merlin        review         r5.txt        005
2  Java tool     Visual Age    review         r19.txt       004
3  Java tool     Visual Cafe   review         r19.txt       005
INDEXING ...
STORING ...
SELECT * FROM Soft.Reference WHERE Program = Java tool AND Program name = review AND Article thema = r19.txt
RESULT SET
2  Java tool     Visual Age    review         r19.txt       004
3  Java tool     Visual Cafe   review         r19.txt       005
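The indexing and selection steps in this output can be approximated with a small inverted index. The sketch below is not the DBSession class and does not use the Processor: it maps each "column=value" pair to the set of row numbers that contain it and answers an AND query by intersecting those sets. The names TableIndexSketch, buildIndex, and select are hypothetical, and the column names are taken from the table header in the output.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Inverted index over the sample table; NOT the Sav.Processor API.
class TableIndexSketch {
    static final String[] COLUMNS =
            {"Program type", "Program name", "Article thema", "Article file", "Frequency"};

    static final String[][] ROWS = {
        {"SQL tool", "Delphi", "comparison", "c1.txt", "011"},
        {"PC OS", "Merlin", "review", "r5.txt", "005"},
        {"Java tool", "Visual Age", "review", "r19.txt", "004"},
        {"Java tool", "Visual Cafe", "review", "r19.txt", "005"},
    };

    // Maps every "column=value" pair to the row numbers where it occurs.
    static Map<String, Set<Integer>> buildIndex() {
        Map<String, Set<Integer>> index = new HashMap<>();
        for (int row = 0; row < ROWS.length; row++) {
            for (int col = 0; col < COLUMNS.length; col++) {
                String key = COLUMNS[col] + "=" + ROWS[row][col];
                index.computeIfAbsent(key, k -> new TreeSet<>()).add(row);
            }
        }
        return index;
    }

    // Returns the rows matching all conditions (each written "column=value").
    static Set<Integer> select(Map<String, Set<Integer>> index, String... conditions) {
        Set<Integer> result = null;
        for (String condition : conditions) {
            Set<Integer> rows = index.getOrDefault(condition, Set.of());
            if (result == null) {
                result = new TreeSet<>(rows);   // first condition seeds the result
            } else {
                result.retainAll(rows);         // later conditions narrow it
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex();
        Set<Integer> hits = select(index,
                "Program type=Java tool", "Article thema=review", "Article file=r19.txt");
        for (int row : hits) {
            System.out.println(row + " " + String.join(" ", ROWS[row]));
        }
    }
}
```

With these conditions the selection returns rows 2 and 3, the same rows as the result set above.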