Download the SysInfo PDF Extractor Tool to your system and run it as an administrator.
Click the File icon or Folder icon at the top and browse the PDFs on your system. Click Open to add them.
All the selected PDF files are displayed in the panel on the screen. There, you can select any one of the three Preview options.
After selecting an option, double-click the file whose data you want to view and check it in the Preview Pane. The pane also displays information such as File Name, Path, Size, and Password.
Additionally, you can remove files individually with the Remove icon or all at once with the Remove All icon. There is also an option to extract data from password-protected PDF files, discussed at the end.
Furthermore, the tool offers several Data Extraction filters. To extract email addresses from a PDF, choose Email Address and the format in which to save the results.
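To illustrate what an Email Address filter does conceptually, here is a minimal Python sketch that pulls email-like strings out of text that has already been extracted from a PDF. It is not the tool's implementation: the function name `extract_emails` and the simplified pattern are assumptions made for illustration, and real address syntax is broader than this pattern covers.

```python
import re

# Simplified, assumed pattern: it catches common addresses, not every valid one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> list[str]:
    """Return unique email-like strings in order of first appearance."""
    seen = set()
    result = []
    for match in EMAIL_RE.findall(text):
        key = match.lower()  # treat addresses as case-insensitive for de-duplication
        if key not in seen:
            seen.add(key)
            result.append(match)
    return result


if __name__ == "__main__":
    sample = "Contact alice@example.com or Bob@Example.org; alice@example.com again."
    print(extract_emails(sample))
```

De-duplicating while preserving order keeps the result readable when the same address appears on many pages.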
Next, click the Text filter to extract textual content from your PDF files and save it in a desired format, for example TEXT.
Additionally, the Page Filter feature lets you choose the specific pages from which to extract data. The Page Selection options are: All Pages, Even Pages, Odd Pages, Page Range, and Page Number. The same filter is available in Text and the other five components.
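The five Page Filter options can be thought of as a simple selection rule over page numbers. The sketch below shows one way to express that rule in Python. The function `select_pages`, its mode strings, and its parameters are hypothetical names chosen for illustration, and pages are assumed to be numbered from 1.

```python
def select_pages(total_pages, mode, start=None, end=None, number=None):
    """Return the 1-based page numbers chosen by a page filter mode."""
    pages = range(1, total_pages + 1)
    if mode == "all":
        return list(pages)
    if mode == "even":
        return [p for p in pages if p % 2 == 0]
    if mode == "odd":
        return [p for p in pages if p % 2 == 1]
    if mode == "range":
        if start is None or end is None or start > end:
            raise ValueError("range mode needs start <= end")
        # Pages outside the document are silently ignored (an assumption).
        return [p for p in pages if start <= p <= end]
    if mode == "number":
        if number is None or not 1 <= number <= total_pages:
            raise ValueError("number mode needs a page inside the document")
        return [number]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    print(select_pages(6, "even"))
    print(select_pages(10, "range", start=2, end=4))
```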
Now, to extract the embedded attachments from the PDF, choose the Attachment filter and select the File Size based on two options: Upto or More Than. In addition, you can choose to Create a Folder for Each PDF File by checking this option.
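As a rough illustration of the two File Size options, the following Python sketch filters a list of attachments by size. The `Attachment` class, the `filter_by_size` function, and the choice to treat Upto as inclusive are all assumptions for this example, since the tool's exact boundary behavior is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    name: str
    size_bytes: int


def filter_by_size(attachments, mode, limit_bytes):
    """Keep attachments at or below the limit ("upto") or above it ("more_than")."""
    if mode == "upto":
        # Assumption: the limit itself counts as "up to".
        return [a for a in attachments if a.size_bytes <= limit_bytes]
    if mode == "more_than":
        return [a for a in attachments if a.size_bytes > limit_bytes]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    files = [Attachment("notes.txt", 2_000), Attachment("scan.tif", 5_000_000)]
    print([a.name for a in filter_by_size(files, "upto", 1_000_000)])
    print([a.name for a in filter_by_size(files, "more_than", 1_000_000)])
```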
Select the Images filter and choose the desired format to save into, for example JPEG. Also, select the Page Filter if you need to extract data from specific pages.
Afterward, click the Bookmark filter to retrieve the bookmarks from the PDF file and save them in a file format of your choice. You can also select the Page Filter; for example, choose Page Range and specify a particular range of pages.
To extract the comments present in your PDF file, click the Comment filter and select the format (say, HTML) and the Page Filter.
Click the Hyperlink filter to get the links embedded in your PDF file and save them in a desired format, for example DOCX. Along with that, select the Page Filter for selective link extraction.
Finally, to retrieve only the properties of the PDF file, click the MetaData filter and select a format to save them in, for example PDF.
If you do not need the previously extracted data again, check the box beside Skip Previously Processed Data, and the tool will filter and extract only the new data from the PDF. Hit Extract.
If you need to save to a specific location, uncheck the Skip filter and press the Extract button. A dialog opens in which you can select the Destination for the extracted components. Press Open to save them in the specified formats from the selected PDF files.
Finally, when the process completes, a dialog appears. Click OK, and you will be taken to the location where the extraction report is saved.
As stated above, you can also add protected PDFs to the tool to extract their data. To manage the passwords for multiple PDFs with Password Pooling, follow the steps below, then add the files and proceed with the steps above.
Press the Lock icon in the tool window, and the Password Pooling dialog will appear. There, click Upload CSV >> Select CSV to browse and select a CSV file containing all the passwords for your protected PDFs. Hit Open to add them.
Alternatively, if you do not have a CSV, select the Input Password radio option to enter the password and click Add to include it in the list.
Finally, select the passwords for your PDFs from the list and click Apply. You can remove any passwords you do not need, and then proceed to Add PDF Files and Extract the data.
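The idea behind password pooling can be sketched in a few lines of Python: load candidate passwords from a CSV, then try each one against a protected file until one works. The CSV layout assumed here (one password per row, in the first column) is a guess and may differ from what the tool expects. The names `load_password_pool`, `find_password`, and the `can_open` callback are hypothetical stand-ins for whatever actually opens the PDF.

```python
import csv
import io


def load_password_pool(csv_file):
    """Read passwords from the first column of each row, skipping blanks and duplicates."""
    pool = []
    for row in csv.reader(csv_file):
        if not row or not row[0]:
            continue  # ignore empty lines and empty first cells
        if row[0] not in pool:
            pool.append(row[0])
    return pool


def find_password(pool, can_open):
    """Return the first password accepted by can_open, or None if none works."""
    for password in pool:
        if can_open(password):
            return password
    return None


if __name__ == "__main__":
    # In-memory CSV used in place of a real file for this demonstration.
    sample = io.StringIO("alpha123\n\nbeta456\nalpha123\n")
    pool = load_password_pool(sample)
    print(pool)
    # Stand-in check: pretend the PDF opens only with "beta456".
    print(find_password(pool, lambda pw: pw == "beta456"))
```

Passing the unlock check as a callback keeps the pooling logic independent of any particular PDF library.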