Microsoft

Microsoft products include Word, Excel, and many other desktop applications. Its email client, Outlook, is one of the most widely used. Documents in Microsoft formats are found throughout the internet.

The Prajna extensions include a Microsoft Word document reader, MSWordDocReader, which converts Word documents into DocData objects. At present this reader places the entire content of a document into a single text string. Future enhancements will retain some of the document structure.
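
To illustrate what flattening a Word document into a single text string involves, here is a minimal sketch using only the Java standard library. It is not the MSWordDocReader implementation, which relies on Apache POI. It assumes a .docx (Office Open XML) file, which is a zip archive whose body text lives in word/document.xml, and it does not handle the older binary .doc format. The class name DocxFlattener and the method flatten are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/** Hypothetical helper: flattens the body of a .docx file into one string. */
final class DocxFlattener {
    // Namespace used by WordprocessingML elements such as w:p and w:t.
    private static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns all body text, with one newline after each paragraph. */
    static String flatten(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"word/document.xml".equals(entry.getName())) {
                    continue;
                }
                // readAllBytes stops at the end of the current zip entry.
                byte[] xml = zip.readAllBytes();
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
                Document dom = factory.newDocumentBuilder()
                                      .parse(new ByteArrayInputStream(xml));
                StringBuilder text = new StringBuilder();
                walk(dom, text);
                return text.toString();
            }
        }
        throw new IOException("word/document.xml not found; not a .docx file?");
    }

    // Depth-first walk: append the text of each w:t run and end each w:p
    // paragraph with a newline. All other structure is discarded.
    private static void walk(Node parent, StringBuilder out) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            boolean inWordNs = W_NS.equals(c.getNamespaceURI());
            if (inWordNs && "t".equals(c.getLocalName())) {
                out.append(c.getTextContent());
            } else {
                walk(c, out);
            }
            if (inWordNs && "p".equals(c.getLocalName())) {
                out.append('\n');
            }
        }
    }
}
```

Calling DocxFlattener.flatten(new FileInputStream("report.docx")) would return the whole body as one string. Headings, tables, and lists all collapse into plain lines, which is the kind of structure loss described above.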

The Prajna extensions also include a PST document reader, PSTDocReader. Microsoft Outlook stores a local email cache in the PST file format, which Prajna can read. The PSTDocReader converts each email into a separate document and includes the email metadata in the DocData headers. It does not yet handle live receipt of email.
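
The one-email-per-document mapping can be sketched as follows. This is an illustration only: Email and Doc are hypothetical stand-ins for a message read from a PST file and for a DocData-like object, and the header names are assumptions made for the example. Neither the real DocData API nor the PST parsing itself is shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one document per email, metadata kept as headers. */
final class EmailToDocSketch {

    /** Stand-in for one message read from a PST file. */
    static final class Email {
        final String from, to, subject, date, body;
        Email(String from, String to, String subject, String date, String body) {
            this.from = from; this.to = to; this.subject = subject;
            this.date = date; this.body = body;
        }
    }

    /** Stand-in for a DocData-like object: headers plus body text. */
    static final class Doc {
        final Map<String, String> headers;
        final String text;
        Doc(Map<String, String> headers, String text) {
            this.headers = Collections.unmodifiableMap(headers);
            this.text = text;
        }
    }

    /** Copies the email metadata into headers and the body into the text. */
    static Doc toDoc(Email e) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("From", e.from);
        headers.put("To", e.to);
        headers.put("Subject", e.subject);
        headers.put("Date", e.date);
        return new Doc(headers, e.body);
    }

    /** Converts each email into its own document, preserving order. */
    static List<Doc> convert(List<Email> emails) {
        List<Doc> docs = new ArrayList<>();
        for (Email e : emails) {
            docs.add(toDoc(e));
        }
        return docs;
    }
}
```

Keeping the metadata in headers rather than in the text means that downstream processing can filter or group documents by sender or date without parsing the message body.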

The MSWordDocReader relies on the Apache POI jars, while the PSTDocReader relies on the libpst.jar file from Google. More functionality will be added in the future.