Represents a collection of document data elements. This class provides
utility methods for generating different data views of the corpus. It is
designed to be sub-classed, with subclasses implementing specific interfaces
to particular document sets.
Clears all data from the corpus. Use with care.
public DocCorpus createSubCorpus(java.util.Collection<java.lang.String> keys)
Derive a sub-corpus of documents which match the given set of keys. The
resulting DocCorpus will contain a subset of the documents within this
corpus. A document will be added to the sub-corpus if it matches any of
the keys provided.
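The "matches any of the keys" rule can be sketched as follows. This is only an illustration, not the DocCorpus implementation: the class SubCorpusSketch, the method subCorpus, and the representation of a corpus as a map from document id to key set are all hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a corpus is modelled as document id -> keys of that document.
final class SubCorpusSketch {
    static Map<String, Set<String>> subCorpus(Map<String, Set<String>> docs,
                                              Collection<String> keys) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : docs.entrySet()) {
            // Collections.disjoint returns false when the two collections share
            // at least one element, i.e. the document matches any requested key.
            if (!Collections.disjoint(entry.getValue(), keys)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

A document carrying several of the requested keys still appears only once in the result, and keys that match no document are simply ignored.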
Get the similarity graph. The similarity graph is a graph with a node
for each document. The edges between document nodes are NamedWeights.
The weights are equal to the number of keys that the two nodes have in
common. Returns the similarity graph.
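The edge weighting described above, a count of the keys two documents share, might be computed along these lines. The names SimilaritySketch, sharedKeyCount and similarityEdges are hypothetical, plain integers stand in for NamedWeights, and omitting zero-weight edges is an assumption the documentation does not confirm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SimilaritySketch {
    // Weight of the edge between two documents: the number of keys they share.
    static int sharedKeyCount(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // Edge weights keyed by "idA|idB" for every pair of documents.
    // Assumption: pairs with no shared key get no edge.
    static Map<String, Integer> similarityEdges(Map<String, Set<String>> docs) {
        List<String> ids = new ArrayList<>(docs.keySet());
        Map<String, Integer> edges = new LinkedHashMap<>();
        for (int i = 0; i < ids.size(); i++) {
            for (int j = i + 1; j < ids.size(); j++) {
                int weight = sharedKeyCount(docs.get(ids.get(i)), docs.get(ids.get(j)));
                if (weight > 0) {
                    edges.put(ids.get(i) + "|" + ids.get(j), weight);
                }
            }
        }
        return edges;
    }
}
```

Comparing every pair costs time quadratic in the number of documents, which is one reason building the graph at request time can be slow for a large corpus.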
public boolean hasMoreData()
Return whether the underlying data source has more data to be loaded
into this document corpus. The default behavior always returns false.
public boolean isGraphDynamic()
Get whether the graphs are dynamically created by this corpus.
Returns true if the graphs are dynamically created, false if they are
created at request time.
public void loadCorpus()
Load the document corpus from a data repository. The default
implementation does nothing. Specific implementations of a document
corpus should override this method.
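A subclass might override loadCorpus() and hasMoreData() together, as in the sketch below. CorpusBaseSketch only imitates the documented defaults and is not the real DocCorpus. BatchedCorpusSketch, its in-memory source, the size() helper, and the idea that each loadCorpus() call loads one fixed-size batch are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical base class that imitates the documented default behavior.
class CorpusBaseSketch {
    protected final List<String> documents = new ArrayList<>();

    public boolean hasMoreData() { return false; } // documented default
    public void loadCorpus() { }                   // documented default: does nothing
    public int size() { return documents.size(); }
}

// Hypothetical subclass that reads an in-memory source in batches.
class BatchedCorpusSketch extends CorpusBaseSketch {
    private final Iterator<String> source;
    private final int batchSize;

    BatchedCorpusSketch(List<String> source, int batchSize) {
        this.source = source.iterator();
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasMoreData() {
        // True while the underlying source still holds unread documents.
        return source.hasNext();
    }

    @Override
    public void loadCorpus() {
        // Load at most one batch per call; callers repeat while hasMoreData() is true.
        for (int i = 0; i < batchSize && source.hasNext(); i++) {
            documents.add(source.next());
        }
    }
}
```

A caller could then drive loading with a loop such as `while (corpus.hasMoreData()) corpus.loadCorpus();`.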
Set whether the graphs are dynamically created. If the dynamic flag is
set, the graphs are updated whenever a new document is added to the
corpus. Adding documents then takes longer and the corpus requires more
memory, but requests for the graphs return without further computation. If
the dynamic flag is not set, the corpus loads faster, but building the
graphs on request might take significant time. A particular corpus
can set the flag based upon expected usage patterns. By default, the
flag is true.
dynamic - the flag indicating whether to generate the key graph and
similarity graph dynamically.
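The trade-off behind the dynamic flag can be illustrated with a small sketch. DynamicGraphSketch, addDocument and getSimilarityEdges are hypothetical names, edges are reduced to shared-key counts keyed by "idA|idB", and document ids are assumed to be unique. It is not the DocCorpus implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DynamicGraphSketch {
    private final Map<String, Set<String>> docs = new LinkedHashMap<>();
    private final Map<String, Integer> cachedEdges = new LinkedHashMap<>();
    private final boolean dynamic;

    DynamicGraphSketch(boolean dynamic) { this.dynamic = dynamic; }

    void addDocument(String id, Set<String> keys) {
        if (dynamic) {
            // Dynamic mode: pay the cost now by linking the new document
            // to every existing document it shares a key with.
            for (Map.Entry<String, Set<String>> e : docs.entrySet()) {
                int weight = shared(e.getValue(), keys);
                if (weight > 0) cachedEdges.put(e.getKey() + "|" + id, weight);
            }
        }
        docs.put(id, keys);
    }

    Map<String, Integer> getSimilarityEdges() {
        if (dynamic) return cachedEdges; // already built, nothing to compute
        // Non-dynamic mode: build the whole graph at request time.
        Map<String, Integer> edges = new LinkedHashMap<>();
        List<String> ids = new ArrayList<>(docs.keySet());
        for (int i = 0; i < ids.size(); i++) {
            for (int j = i + 1; j < ids.size(); j++) {
                int weight = shared(docs.get(ids.get(i)), docs.get(ids.get(j)));
                if (weight > 0) edges.put(ids.get(i) + "|" + ids.get(j), weight);
            }
        }
        return edges;
    }

    private static int shared(Set<String> a, Set<String> b) {
        int count = 0;
        for (String key : a) if (b.contains(key)) count++;
        return count;
    }
}
```

Both modes produce the same edges. They differ only in when the work is done and in whether the cached graph occupies memory between requests.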