All questions about EDocman extension
Empty pdf's
- bestcons
- Topic Author
- Senior Member
2 days 13 hours ago #175565
by bestcons
Empty pdf's was created by bestcons
We have a large number of PDFs, managed via Edocman. They are indexed, and they appear in Joomla Smart Search when they contain the searched phrase.
However, when we view the search results, a small number of PDFs show 'empty pages': the page count is correct, but the page content is not displayed. Searching within these 'empty pages' only vaguely indicates where the results are located. The download is OK.
I hope you have an explanation and can offer a solution.
- Dang Thuc Dam
- Administrator
2 days 3 hours ago #175569
by Dang Thuc Dam
Replied by Dang Thuc Dam on topic Empty pdf's
Hi,
We would like to clarify that if a PDF file contains only images and does not have any embedded text, the system is unable to read or index the content of the file for search purposes. As a result, these files may appear as “empty pages” in the search results because there is no searchable text available.
Additionally, if the PDF files are secured, restricted from reading, or encrypted, the system may also be unable to access and display their content. This can further result in blank or empty pages when viewing search results, even though the files can still be downloaded.
We recommend checking whether the affected PDF files contain selectable text or are protected in any way.
Thanks
Dam
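The check recommended above (selectable text, protection) can be sketched in Python for anyone who wants to test many files at once. This is only a rough sketch, not part of Edocman: it assumes the third-party pypdf package is installed, and the function names `pages_without_text` and `inspect_pdf` are made up for illustration.

```python
from pathlib import Path


def pages_without_text(page_texts):
    """Return 1-based numbers of pages whose extracted text is empty or whitespace."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]


def inspect_pdf(path):
    """Report encryption status and pages without a text layer for one PDF.

    Assumption: the third-party pypdf package is available (pip install pypdf).
    """
    # Imported lazily so pages_without_text() stays usable without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(str(path))
    if reader.is_encrypted:
        # An encrypted file may not be readable without a password, so stop here.
        return {"file": Path(path).name, "encrypted": True, "empty_pages": None}

    texts = [page.extract_text() for page in reader.pages]
    return {
        "file": Path(path).name,
        "encrypted": False,
        "empty_pages": pages_without_text(texts),
    }
```

Calling `inspect_pdf("example.pdf")` (a placeholder file name) returns a small dictionary; a long `empty_pages` list suggests an image-only scan, while `encrypted: True` points to a protected file.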
- bestcons
- Topic Author
- Senior Member
1 day 23 hours ago #175575
by bestcons
Replied by bestcons on topic Empty pdf's
They certainly are. These are all scanned newspapers, containing both text and images. All of them are processed the same way after scanning, according to a strict protocol. The pages are not empty; it seems as if they contain white-coloured text. Searching within these pages works, as the result jumps to the right page.
- Dang Thuc Dam
- Administrator
1 day 2 hours ago - 1 day 2 hours ago #175617
by Dang Thuc Dam
Replied by Dang Thuc Dam on topic Empty pdf's
Hi,
If the PDF files contain scanned images, it is not possible to read or extract the text content directly, as the text is part of the image. Currently, there is no tool available in our system that can read the contents of scanned image files.
Solution: Converting Scanned PDFs to Searchable Text PDFs
Before uploading your scanned PDF files to Edocman, you will need to use Optical Character Recognition (OCR) software to convert the images into searchable text. This process creates a text layer within the PDF, allowing Edocman’s PDF Indexer to read and index the content.
Recommended OCR Tools:
- Adobe Acrobat (Paid, highly accurate)
- ABBYY FineReader (Paid, professional-grade OCR)
- Online OCR services (Free or paid, such as onlineocr.net)
Steps:
- Open your scanned PDF file in your chosen OCR software.
- Run the OCR function to recognize and convert the images into text.
- Save the resulting PDF file. It should now contain selectable and searchable text.
- Upload the processed PDF to Edocman as usual.
Thanks
Dam
Last edit: 1 day 2 hours ago by Dang Thuc Dam.
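For large batches, the OCR steps above could also be scripted. The sketch below uses the open-source OCRmyPDF command-line tool, which is not one of the tools recommended in this thread, so treat it as a swapped-in alternative. It assumes `ocrmypdf` is installed and on the PATH; the helper names `build_ocr_command` and `run_ocr` are hypothetical.

```python
import shutil
import subprocess


def build_ocr_command(src, dst, language="eng", skip_pages_with_text=True):
    """Build the ocrmypdf command line that adds a text layer to a scanned PDF."""
    cmd = ["ocrmypdf", "-l", language]
    if skip_pages_with_text:
        # Leave pages that already carry a text layer untouched.
        cmd.append("--skip-text")
    cmd += [str(src), str(dst)]
    return cmd


def run_ocr(src, dst, **options):
    """Run OCR on one file; raises if ocrmypdf is missing or exits with an error."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(build_ocr_command(src, dst, **options), check=True)
```

For example, `run_ocr("scan.pdf", "scan_ocr.pdf", language="nld")` would write a searchable copy of a placeholder file `scan.pdf`, which can then be uploaded to Edocman as usual.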
- bestcons
- Topic Author
- Senior Member
22 hours 49 minutes ago #175623
by bestcons
Replied by bestcons on topic Empty pdf's
They are all text-searchable. We used ABBYY FineReader. All documents followed the same process.
You are familiar with our website: www.dexxxxxxxx.yy. If you search for CVO07, you will find 2 results. Clicking the green View button shows the empty/blank results. If you download the documents, they are searchable.
Searching for CVO01 delivers numerous correct examples.
- Dang Thuc Dam
- Administrator
21 hours 23 minutes ago #175628
by Dang Thuc Dam
Replied by Dang Thuc Dam on topic Empty pdf's
Hi,
Please submit a ticket in the Edocman category and send us some PDF files for debugging.
Thanks
Dam
Moderators: Dang Thuc Dam
Copyright © 2025 Joomla Extensions by Joomdonation. All Rights Reserved.
joomdonation.com is not affiliated with or endorsed by the Joomla! Project or Open Source Matters.
The Joomla! name and logo is used under a limited license granted by Open Source Matters the trademark holder in the United States and other countries.