How to Extract Images from PDF in Java

This how-to article gives step-by-step instructions for extracting images from a PDF in Java, including how to configure the required library. It also includes a working Java example that shows the implementation. The whole extraction takes only a few lines of code made up of simple API calls.

Steps to Extract Images from PDF in Java

Set up GroupDocs.Parser for Java from the Maven repository in your Java project to extract images from a PDF document
Import the classes needed to extract images from a PDF document
Initialize the Parser class to load the input PDF document
Call the getImages method of the Parser class to obtain a collection of image objects
Finally, iterate through the collection of image objects to get the size, type, and contents of each image

By following the steps above, you can easily create a Java application that extracts images from PDF files. Start by installing the required library and importing the necessary classes. The Parser class then loads the input PDF file, and its getImages method returns the image objects for further use in your project.

Code to Extract Images from PDF in Java

	import com.groupdocs.parser.Parser;
	import com.groupdocs.parser.data.PageImageArea;

	public class ExtractImagesFromPdfInJava {
		public static void main(String[] args) { // Main function to extract images from PDF in Java
			// Create an instance of the Parser class; try-with-resources closes it automatically
			try (Parser parser = new Parser("sample.pdf")) {
				// Extract images
				Iterable<PageImageArea> images = parser.getImages();
				// Check if image extraction is supported
				if (images == null) {
					System.out.println("Image extraction isn't supported");
					return;
				}

				// Iterate over images
				for (PageImageArea image : images) {
					// Print the page index, rectangle, and image type
					System.out.println(String.format("Page: %d, R: %s, Type: %s", image.getPage().getIndex(), image.getRectangle(), image.getFileType()));
				}
			}
		}
	}


We used only a few API calls to build this Java application for extracting images from a PDF. You can use the same sample code to extract images from other document formats, including DOC, DOCX, XLSX, PPTX, and many more. You can also run this example on any operating system, such as MS Windows, Linux, or macOS, without setting up any third-party software.
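
The sample above prints the page, rectangle, and type of each image, but does not write the image contents anywhere. The following is a minimal sketch of that last part using only the Java standard library. The class `ImageContentSaver` and its `saveImage` method are hypothetical helpers written for this illustration and are not part of GroupDocs.Parser. In the loop above, the stream would presumably come from a call such as `image.getImageStream()` and the extension from `image.getFileType().getExtension()`; both are assumptions here, so check them against the API reference for your library version.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ImageContentSaver {

    // Hypothetical helper (not a GroupDocs.Parser API): writes one image stream to
    // "<dir>/image-<index><extension>" and returns the path of the written file.
    public static Path saveImage(InputStream content, Path dir, int index, String extension)
            throws IOException {
        Files.createDirectories(dir); // make sure the output folder exists
        Path target = dir.resolve("image-" + index + extension);
        // Copy every byte of the stream, overwriting a file left by an earlier run
        Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in bytes for the demo; a real run would pass the stream of an extracted image
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G'};
        Path dir = Files.createTempDirectory("extracted-images");
        try (InputStream in = new ByteArrayInputStream(fakeImage)) {
            Path saved = saveImage(in, dir, 0, ".png");
            System.out.println("Saved " + Files.size(saved) + " bytes to " + saved);
        }
    }
}
```

Inside the image loop, you would call `saveImage` once per image with a running counter as the index, so that each image ends up in its own file.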

We have covered the detailed process for getting images from a PDF in Java and provided sample code for it. We also recently published an article on extracting text from PDF using Java. See the how to Extract Text from PDF in Java guide for more information.
