Designing Reasoners

Like many aspects of developing applications with Prajna, developing a reasoner requires you to identify both the characteristics of the data you are enhancing and what the enhancement process will provide. Because reasoners might encounter such a wide range of information and goals, it is impossible to provide a complete set of reasoners. Instead, Prajna supplies a framework for developing your own reasoners, along with some examples.

There are (at least) two things you need to consider when designing your reasoner. The first is the information granularity: what level of information does your reasoner need to perform its reasoning? The second is the reasoner's position in the information pipeline: at what stage does it need to operate?

Information Granularity

Does the reasoner operate on a specific field within each record? Or does it work across multiple fields, correlating information among several data elements within each record? Or does it develop new information based upon the data structure (graph, tree, grid) or multiple different data records? Each of these types of reasoning might require a different approach.

The FieldHandler classes work on a specific field. Typically, field-level reasoning can be performed when data is ingested, so Prajna enables its DataAccessors to include a number of FieldHandlers which operate on incoming data. This is among the simplest approaches to enhancing data, though it is also the most limited.
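As an illustration of field-level reasoning, the following is a minimal sketch. It does not use Prajna's actual FieldHandler API: the class SimpleFieldHandler, its handle method, and the "country" field name are all hypothetical, and a record is modeled as a plain map.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: these types are illustrative stand-ins,
// not Prajna's FieldHandler classes.
class FieldLevelSketch {

    /** Applies one transformation to one named field of a record. */
    static final class SimpleFieldHandler {
        private final String fieldName;
        private final UnaryOperator<String> transform;

        SimpleFieldHandler(String fieldName, UnaryOperator<String> transform) {
            this.fieldName = fieldName;
            this.transform = transform;
        }

        /** Rewrites the field in place; records lacking the field are left unchanged. */
        void handle(Map<String, String> record) {
            String value = record.get(fieldName);
            if (value != null) {
                record.put(fieldName, transform.apply(value));
            }
        }
    }

    /** Example handler: trims whitespace and upper-cases the "country" field. */
    static SimpleFieldHandler countryCodeNormalizer() {
        return new SimpleFieldHandler("country", v -> v.trim().toUpperCase(Locale.ROOT));
    }
}
```

Because the handler sees only a single field, it cannot correlate values across fields, which is the limitation noted above.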

SemanticReasoners operate on the entire record. They can examine multiple values in different fields to determine how a particular record should be enhanced. Because the possible combinations of fields and enhancement rules are open-ended, reasoners of this type vary widely.
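A record-level reasoner might look something like the sketch below. The RecordReasoner interface, the OrderValueReasoner class, and the field names ("quantity", "unitPrice", "total", "highValue") are assumptions made for illustration; they are not taken from Prajna's SemanticReasoner interface.

```java
import java.util.Map;

// Hypothetical sketch: not Prajna's SemanticReasoner API.
class RecordLevelSketch {

    /** A reasoner that examines a whole record and may add derived fields. */
    interface RecordReasoner {
        /** Returns true if the record was enhanced. */
        boolean enhance(Map<String, Object> record);
    }

    /** Combines two fields to derive a total and a high-value flag. */
    static final class OrderValueReasoner implements RecordReasoner {
        private final double highValueThreshold;

        OrderValueReasoner(double highValueThreshold) {
            this.highValueThreshold = highValueThreshold;
        }

        @Override
        public boolean enhance(Map<String, Object> record) {
            Object quantity = record.get("quantity");
            Object unitPrice = record.get("unitPrice");
            // Skip records that lack either numeric input field.
            if (!(quantity instanceof Number) || !(unitPrice instanceof Number)) {
                return false;
            }
            double total = ((Number) quantity).doubleValue()
                    * ((Number) unitPrice).doubleValue();
            record.put("total", total);
            record.put("highValue", total >= highValueThreshold);
            return true;
        }
    }
}
```

Neither input field alone determines the result, which is what distinguishes this from field-level handling.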

Structure Reasoners provide some form of reasoning across a data structure. This could be a graph, tree, grid, or dataset of records. It could also be a document corpus. Prajna does not yet have any interfaces for structure reasoners, but they are in development. They will be included in a future release.
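Since Prajna does not yet define interfaces for structure reasoners, the following is only a hypothetical sketch of the idea. It derives a value (the degree of each node) that no single record contains, using an edge list as a stand-in for a graph; the class and method names are invented for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: Prajna has no structure reasoner interface yet.
class StructureLevelSketch {

    /**
     * Counts how many edge endpoints touch each node.
     * Each edge is a two-element array {from, to}; a self-loop counts twice.
     */
    static Map<String, Integer> degrees(List<String[]> edges) {
        Map<String, Integer> degree = new HashMap<>();
        for (String[] edge : edges) {
            degree.merge(edge[0], 1, Integer::sum);
            degree.merge(edge[1], 1, Integer::sum);
        }
        return degree;
    }
}
```

The result is only correct once every edge is available, so this kind of reasoning needs the complete structure rather than one record at a time.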

Information Pipeline

Within any application, information flows from the input, through some form of processing, and ultimately to an output. The input might include data files, streaming data, or databases. Similarly, the output stage might include sending information back to data stores, writing output files, generating summaries, or even providing a visual representation. In some applications, the information flow might cycle. For instance, a visualization application might load some data, perform some processing on it, and present it to the user for analysis. Then the user selects different data, or enters new information, and the cycle starts again. Reasoners can fit into this information flow at any stage, but it is important to determine the most effective place to add a reasoner.

If the reasoner operates at a field or record level, then it is possible to include it in the data ingestion process. Typically, these reasoners perform the same reasoning regardless of the state of the application. Reasoners which operate at the data ingestion step should also be relatively fast, to avoid degrading performance. A structure reasoner cannot operate at ingestion time, since the entire contents of the data structure may not have been loaded.

Most reasoners operate during the main processing phase of information flow. Typically, any additional information processing is performed in this step, including reasoning. Some of these reasoners might be configurable, or trigger optionally depending on the application's state.
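One way to make a processing-phase reasoner optional is to pair it with a condition that is checked against the application's state each time the phase runs. The sketch below assumes this design; ProcessingPhase, ConditionalStep, and their methods are hypothetical names and are not part of Prajna.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical sketch of optional reasoners in the main processing phase.
class ProcessingPhaseSketch {

    /** Pairs a reasoner with the condition under which it should run. */
    static final class ConditionalStep {
        final BooleanSupplier enabled;
        final Consumer<Map<String, Object>> reasoner;

        ConditionalStep(BooleanSupplier enabled, Consumer<Map<String, Object>> reasoner) {
            this.enabled = enabled;
            this.reasoner = reasoner;
        }
    }

    /** Runs each enabled reasoner, in registration order, over all records. */
    static final class ProcessingPhase {
        private final List<ConditionalStep> steps = new ArrayList<>();

        ProcessingPhase add(BooleanSupplier enabled, Consumer<Map<String, Object>> reasoner) {
            steps.add(new ConditionalStep(enabled, reasoner));
            return this;
        }

        void process(List<Map<String, Object>> records) {
            for (ConditionalStep step : steps) {
                // The condition is re-evaluated on every call to process().
                if (!step.enabled.getAsBoolean()) {
                    continue;
                }
                for (Map<String, Object> record : records) {
                    step.reasoner.accept(record);
                }
            }
        }
    }
}
```

A reasoner that should always run can be registered with a condition that simply returns true.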

The output process rarely incorporates any form of reasoning. During this stage, the only reasoners which should be included are those which enhance the data for the specific output format or medium. Visual displays might include some form of color coding or other representation, or an output format might need additional information to properly encode the data.
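Color coding for a visual display could be sketched as follows. This is an assumption-laden example rather than anything Prajna provides: the colorFor method, the [0, 1] score range, and the red-to-green ramp are all chosen purely for illustration.

```java
// Hypothetical sketch of an output-stage enhancement.
class OutputStageSketch {

    /**
     * Maps a score in [0, 1] to a hex color on a red-to-green ramp.
     * Out-of-range scores are clamped to the nearest end of the ramp.
     */
    static String colorFor(double score) {
        double clamped = Math.max(0.0, Math.min(1.0, score));
        int green = (int) Math.round(255 * clamped);
        int red = 255 - green;
        return String.format("#%02X%02X00", red, green);
    }
}
```

The color is derived only for display and is specific to the output medium, so it need not be written back to the underlying record.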