Fix Search and Indexing of PDF Attachments

Revision as of 09:22, 15 September 2021 by Heera Singh Koranga

How to fix search and indexing of PDF attachments?


KB 24354, last updated on 2021-09-15

Problem

Searching and indexing of PDF attachments fail on Ubuntu because Convertd is unable to parse PDF files there, so their content is never indexed.

This problem does not occur on RHEL and CentOS.

The following log entries are generated when parsing a PDF attachment fails.

Entries in Convertd Log:

# tail -f /opt/zimbra/log/convertd.log.2021-09-09 

[Thu Sep 09 16:43:28.269358 2021] [:info] [pid 496] mod_convert: clean dir=/opt/zimbra/convertd/convert/convert
[Thu Sep 09 16:43:28.269713 2021] [:info] [pid 496] mod_convert: clean dir=/opt/zimbra/data/tmp/convert
/opt/zimbra/common/bin/httpd: symbol lookup error: /opt/zimbra/keyview/FilterSDK/bin/pdfsr.so: undefined symbol: atan
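
To check whether a server is hitting this specific failure, the Convertd logs can be searched for the symbol lookup error shown above. This is a minimal sketch: the function name `find_symbol_errors` is hypothetical, and the `/opt/zimbra/log/convertd.log*` pattern is an assumption based on the log name used in this article.

```shell
#!/bin/sh
# Hypothetical helper: print every "symbol lookup error" line that mentions
# pdfsr.so in the given log files. The exit status is 0 only if a match is found.
find_symbol_errors() {
    grep -h 'symbol lookup error.*pdfsr\.so' "$@"
}

# Example call (log path pattern is an assumption):
# find_symbol_errors /opt/zimbra/log/convertd.log*
```

If the function prints nothing, the PDF indexing failure probably has a different cause than the one described here.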


Entries in Mailbox Log:

# tail -f /opt/zimbra/log/mailbox.log 

2021-09-09 16:43:28,294 WARN  [qtp231977479-116] [name=testuser6@example.com;mid=26;oip=192.168.2.148;port=41628;ua=ZimbraWebClient - FF91 (Win)/8.8.15_GA_4059;soapId=5d4a7c3;] ParsedMessage
- Unable to parse part=2 filename=download1.pdf content-type=application/pdf message-id=<1813334151.23.1631205655825.JavaMail.zimbra@example.com>
com.zimbra.cs.mime.MimeHandlerException: extraction failed
        at com.zimbra.cs.mime.handler.ConverterHandler.getContentImpl(ConverterHandler.java:122)
        at com.zimbra.cs.mime.MimeHandler.getContent(MimeHandler.java:180)
        at com.zimbra.cs.mime.ParsedMessage.analyzePart(ParsedMessage.java:1152)
        at com.zimbra.cs.mime.ParsedMessage.analyzeNonBodyParts(ParsedMessage.java:442)
        at com.zimbra.cs.mime.ParsedMessage.analyzeFully(ParsedMessage.java:473)
        at com.zimbra.cs.mailbox.Message.generateIndexData(Message.java:1405)
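
The mailbox log also shows which attachments could not be parsed. The sketch below lists the distinct PDF file names from "Unable to parse" entries. It assumes the entry layout shown above (`part=... filename=... content-type=application/pdf`), and the function name `failed_pdf_names` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the filename= value from every
# "Unable to parse" entry for a PDF part and print each name once.
failed_pdf_names() {
    sed -n 's/.*Unable to parse part=[^ ]* filename=\(.*\) content-type=application\/pdf.*/\1/p' "$@" | sort -u
}

# Example call:
# failed_pdf_names /opt/zimbra/log/mailbox.log
```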


Solution

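The Convertd log shows that pdfsr.so cannot resolve the symbol "atan", a function provided by the C math library (libm). Loading libm explicitly in the Convertd httpd configuration makes the symbol available. Follow these steps to fix Convertd.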

1) Back up the file "/opt/zimbra/convertd/conf/httpd.conf".

2) Open the file "/opt/zimbra/convertd/conf/httpd.conf" in vi or any other editor, find the section "<IfModule !mpm_winnt_module>", add the line "LoadFile /lib/x86_64-linux-gnu/libm.so.6" as the first line of that section, and save the file.

Original entries:

 <IfModule !mpm_winnt_module>
     LoadModule convert_module /opt/zimbra/convertd/lib/libmod_convert.so
     ConvertHTMLDir      /opt/zimbra/keyview/ExportSDK/bin
     ConvertTextDir      /opt/zimbra/keyview/FilterSDK/bin 
     ConvertTempDir      /opt/zimbra/data/tmp
     #LockFile           /opt/zimbra/convertd/tmp/accept.lock
 </IfModule>


Entries after modification:

 <IfModule !mpm_winnt_module>
     LoadFile /lib/x86_64-linux-gnu/libm.so.6 
     LoadModule convert_module /opt/zimbra/convertd/lib/libmod_convert.so
     ConvertHTMLDir      /opt/zimbra/keyview/ExportSDK/bin
     ConvertTextDir      /opt/zimbra/keyview/FilterSDK/bin
     ConvertTempDir      /opt/zimbra/data/tmp
     #LockFile           /opt/zimbra/convertd/tmp/accept.lock
 </IfModule>
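
Steps 1 and 2 can also be scripted. The sketch below copies the file to a `.bak` backup and inserts the LoadFile line directly above the `LoadModule convert_module` line, unless the line is already present. The function name `add_libm_loadfile` and the `.bak` suffix are hypothetical. The sketch also assumes that the configuration contains a single `LoadModule convert_module` line, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: back up an httpd.conf and add the libm LoadFile
# directive above the convert_module LoadModule line (idempotent).
add_libm_loadfile() {
    conf=$1
    libm=/lib/x86_64-linux-gnu/libm.so.6   # path taken from this article

    # Nothing to do if the directive is already present.
    if grep -q "LoadFile[[:space:]]*$libm" "$conf"; then
        return 0
    fi

    # Step 1: keep a copy of the original file.
    cp -p "$conf" "$conf.bak" || return 1

    # Step 2: print the LoadFile line before the LoadModule line,
    # reusing that line's indentation.
    awk -v libm="$libm" '
        /LoadModule[ \t]+convert_module/ {
            indent = $0
            sub(/LoadModule.*/, "", indent)
            print indent "LoadFile " libm
        }
        { print }
    ' "$conf.bak" > "$conf"
}

# Example call:
# add_libm_loadfile /opt/zimbra/convertd/conf/httpd.conf
```

Running the function a second time leaves the file unchanged, so it is safe to repeat. Comparing the result with the backup, for example with diff, before restarting Convertd is a reasonable precaution.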


3) Restart the Convertd service as the zimbra user.

su - zimbra 
zmconvertctl restart 
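
To confirm the fix, one option is to note the length of the Convertd log before the restart, send a test message with a PDF attachment afterwards, and check only the newly written lines for the symbol lookup error. This is a sketch under assumptions: the function name `new_symbol_errors` is hypothetical, and the name of the active log file (`/opt/zimbra/log/convertd.log` in the commented example) is assumed rather than taken from this article.

```shell
#!/bin/sh
# Hypothetical helper: print "symbol lookup error" lines that were written
# to a log after the first <seen> lines. The exit status is 0 only if such
# lines exist.
new_symbol_errors() {
    log=$1
    seen=$2
    tail -n "+$((seen + 1))" "$log" | grep 'symbol lookup error'
}

# Example flow (log path is an assumption):
# log=/opt/zimbra/log/convertd.log
# before=$(wc -l < "$log")
# zmconvertctl restart
# ... send a test message with a PDF attachment ...
# new_symbol_errors "$log" "$before" || echo "no new symbol lookup errors"
```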


Submitted by: Heera Singh Koranga
Verified Against:
Date Created:
Article ID: https://wiki.zimbra.com/index.php?title=Fix_Search_and_Indexing_of_PDF_Attachments
Date Modified: 2021-09-15