Read Metadata from PPTX using Java

Metadata is the descriptive information stored inside a file alongside its visible content, such as the author, title, and creation date. PPTX, the widely used presentation format, carries a rich set of these properties. This article shows how to read metadata from PPTX using Java. The steps and a code example are presented below.

Steps to Read Metadata from PPTX using Java

  1. Configure your development environment to use GroupDocs.Metadata for Java for reading information from PPTX files
  2. Create an instance of the Metadata class, passing the path of the PPTX file to its constructor
  3. Define the search criteria (a specification) that select the metadata properties you want to inspect
  4. Pass that specification to the Metadata.findProperties method
  5. Iterate through the returned properties and read the name and value of each one
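
Steps 3 to 5 amount to a filter-then-iterate pattern: describe a condition, hand it to a search method, and loop over whatever comes back. The sketch below imitates that pattern using only the JDK. Everything in it (`Prop`, `Spec`, `findProperties`, and the sample property names) is a hypothetical stand-in and not a GroupDocs.Metadata type; it is meant only to show how two conditions can be combined with `and` before the search runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpecificationSketch {
    // Hypothetical stand-in for a metadata property (not a GroupDocs type).
    static final class Prop {
        final String name;
        final Object value;

        Prop(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Hypothetical stand-in for a search condition.
    interface Spec {
        boolean isSatisfiedBy(Prop candidate);

        // Combine two conditions: both must hold for a property to be selected.
        default Spec and(Spec other) {
            return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
        }
    }

    // Return every property that satisfies the given condition.
    static List<Prop> findProperties(List<Prop> all, Spec spec) {
        List<Prop> result = new ArrayList<>();
        for (Prop candidate : all) {
            if (spec.isSatisfiedBy(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    // Names of the sample properties that are strings AND start with "T".
    static List<String> matchingNames() {
        List<Prop> all = Arrays.asList(
                new Prop("Title", "Quarterly review"),
                new Prop("Slides", 12),
                new Prop("Template", "Default"),
                new Prop("Author", "Jane"));

        Spec isString = p -> p.value instanceof String;
        Spec startsWithT = p -> p.name.startsWith("T");

        List<String> names = new ArrayList<>();
        for (Prop p : findProperties(all, isString.and(startsWithT))) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : matchingNames()) {
            System.out.println("Property name: " + name);
        }
    }
}
```

The full example later in this article follows the same shape, with the library's own Specification class taking the place of the hypothetical `Spec` interface.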

Metadata is information about other data, such as authorship and creation and modification dates. For PPTX files, it describes where a presentation came from and how it has been revised. To extract it, we use Java together with GroupDocs.Metadata, a library that simplifies accessing and managing metadata across various file formats, including PPTX. The following code example demonstrates how to extract metadata of PPTX in Java.

Code to Read Metadata from PPTX using Java

import com.groupdocs.metadata.Metadata;
import com.groupdocs.metadata.core.FileFormat;
import com.groupdocs.metadata.core.IReadOnlyList;
import com.groupdocs.metadata.core.MetadataProperty;
import com.groupdocs.metadata.core.MetadataPropertyType;
import com.groupdocs.metadata.licensing.License;
import com.groupdocs.metadata.search.FallsIntoCategorySpecification;
import com.groupdocs.metadata.search.OfTypeSpecification;
import com.groupdocs.metadata.search.Specification;
import com.groupdocs.metadata.tagging.Tags;
import java.util.Calendar;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadMetadataFromPPTXUsingJava {
    public static void main(String[] args) {
        // Set License to avoid the limitations of Metadata library
        License license = new License();
        license.setLicense("GroupDocs.Metadata.lic");

        // try-with-resources releases the file when the block exits
        try (Metadata metadata = new Metadata("input.pptx")) {
            // Skip files of unknown format and password-protected files
            if (metadata.getFileFormat() != FileFormat.Unknown && !metadata.getDocumentInfo().isEncrypted()) {
                System.out.println();

                // Fetch all metadata properties that fall into a particular category
                IReadOnlyList<MetadataProperty> properties = metadata.findProperties(new FallsIntoCategorySpecification(Tags.getContent()));
                System.out.println("The metadata properties describing some characteristics of the file content: title, keywords, language, etc.");
                for (MetadataProperty property : properties) {
                    System.out.println(String.format("Property name: %s, Property value: %s", property.getName(), property.getValue()));
                }

                // Fetch all properties having a specific type and value
                int year = Calendar.getInstance().get(Calendar.YEAR);
                properties = metadata.findProperties(new OfTypeSpecification(MetadataPropertyType.DateTime).and(new YearMatchSpecification(year)));
                System.out.println("All datetime properties with the year value equal to the current year");
                for (MetadataProperty property : properties) {
                    System.out.println(String.format("Property name: %s, Property value: %s", property.getName(), property.getValue()));
                }

                // Fetch all properties whose names match the specified regex
                Pattern pattern = Pattern.compile("^author|company|(.+date.*)$", Pattern.CASE_INSENSITIVE);
                properties = metadata.findProperties(new RegexSpecification(pattern));
                System.out.println(String.format("All properties whose names match the following regex: %s", pattern.pattern()));
                for (MetadataProperty property : properties) {
                    System.out.println(String.format("Property name: %s, Property value: %s", property.getName(), property.getValue()));
                }
            }
        }
    }

    // Define your own specifications to filter metadata properties.
    // The nested classes are static so that main can create them without
    // first creating an instance of the outer class.

    // Selects date properties whose year equals the given year
    public static class YearMatchSpecification extends Specification {
        private final int year;

        public YearMatchSpecification(int year) {
            this.year = year;
        }

        public final int getValue() {
            return year;
        }

        @Override
        public boolean isSatisfiedBy(MetadataProperty candidate) {
            // Defensive check in case a property carries no value
            if (candidate.getValue() == null) {
                return false;
            }
            Date date = candidate.getValue().toClass(Date.class);
            if (date != null) {
                Calendar calendar = Calendar.getInstance();
                calendar.setTime(date);
                return getValue() == calendar.get(Calendar.YEAR);
            }
            return false;
        }
    }

    // Selects properties whose name contains a match for the given pattern
    public static class RegexSpecification extends Specification {
        private final Pattern pattern;

        public RegexSpecification(Pattern pattern) {
            this.pattern = pattern;
        }

        @Override
        public boolean isSatisfiedBy(MetadataProperty metadataProperty) {
            Matcher matcher = pattern.matcher(metadataProperty.getName());
            return matcher.find();
        }
    }
}
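
The regex in the code above deserves a closer look, because `|` has the lowest precedence in a Java pattern: `^` applies only to `author` and `$` applies only to the `date` alternative. Combined with `Matcher.find()`, the pattern therefore selects names that start with "author", contain "company" anywhere, or end with a part that has "date" preceded by at least one character. The sketch below checks this with plain `java.util.regex`. The property names and the `GROUPED` variant are illustrative assumptions, not taken from the library or from a real PPTX file.

```java
import java.util.regex.Pattern;

public class RegexNameFilterDemo {
    // The same pattern as in the example above.
    static final Pattern ORIGINAL =
            Pattern.compile("^author|company|(.+date.*)$", Pattern.CASE_INSENSITIVE);

    // Hypothetical stricter variant: the anchors now wrap all three alternatives,
    // so the whole name must match one of them.
    static final Pattern GROUPED =
            Pattern.compile("^(author|company|.+date.*)$", Pattern.CASE_INSENSITIVE);

    // Mirrors RegexSpecification.isSatisfiedBy: find() looks for a match anywhere.
    static boolean matches(Pattern pattern, String name) {
        return pattern.matcher(name).find();
    }

    public static void main(String[] args) {
        // Made-up property names chosen to exercise each alternative.
        String[] names = {"Author", "AuthorNotes", "MyCompanyName", "CreatedDate", "Date", "Title"};
        for (String name : names) {
            System.out.printf("%-14s original=%b grouped=%b%n",
                    name, matches(ORIGINAL, name), matches(GROUPED, name));
        }
    }
}
```

With the original pattern, "AuthorNotes" and "MyCompanyName" are selected while a property named exactly "Date" is not, since `.+` requires at least one character before "date". If only exact names such as "author" or "company" should be selected, a grouped variant like the one sketched here may be closer to the intent.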

By following the given instructions, you can get metadata of PPTX in Java on widely used systems like Windows, macOS, and Linux, assuming Java is installed. Apart from the recommended library, no additional software installations are necessary. Once you have configured the library and adjusted the file paths, the code above should integrate into your projects with little extra effort.

In a previous article, we provided an in-depth guide on extracting metadata from XLSX files using Java. For more details, refer to our tutorial on how to read metadata from XLSX using Java.
