How to Extract Metadata from PDF using C#

This tutorial describes, step by step, how to extract metadata from a PDF in C# using a .NET API for document data extraction. It also explains how to configure the required library and provides sample code to get metadata from a PDF in C#. The key steps and working code for extracting document metadata follow.

Steps to Extract Metadata from PDF using C#

Install the GroupDocs.Parser for .NET package from the NuGet package manager in your .NET application to extract metadata from PDF
Add references to the namespaces required for extracting metadata from a PDF document
Create an instance of the Parser class and load the input PDF document
Call the GetMetadata method and obtain a collection of document metadata objects
Finally, iterate through the collection and get metadata names and values

To read PDF metadata in C#, follow the steps above in order. After installing the required package from NuGet and importing the necessary namespaces, load the source PDF by instantiating the Parser class. Then call the GetMetadata method to obtain the metadata objects for the document, and iterate over the collection to display each metadata name and value.

Code to Extract Metadata from PDF using C#

	using System;
	using System.Collections.Generic;
	using GroupDocs.Parser;
	using GroupDocs.Parser.Data;

	namespace ExtractMetadataFromPdfUsingCSharp
	{
		class Program
		{
			// Main function to extract metadata from PDF using C#
			public static void Main(string[] args)
			{
				// Apply the GroupDocs.Parser license from the given file
				string licensePath = "GroupDocs.Parser.lic";
				GroupDocs.Parser.License lic = new GroupDocs.Parser.License();
				lic.SetLicense(licensePath);

				// Create an instance of Parser class and load the PDF
				using (Parser parser = new Parser("sample.pdf"))
				{
					// Extract metadata from the document
					IEnumerable<MetadataItem> metadata = parser.GetMetadata();

					// Check if metadata extraction is supported
					if (metadata == null)
					{
						Console.WriteLine("Metadata extraction isn't supported");
						// Stop here; iterating over null would throw an exception
						return;
					}

					// Iterate over metadata items
					foreach (MetadataItem item in metadata)
					{
						// Print an item name and value
						Console.WriteLine("{0}: {1}", item.Name, item.Value);
					}
				}
			}
		}
	}


This C# application demonstrates how to read metadata from a PDF document. It relies on only a couple of API calls from the document data extraction library and requires no additional software. The sample code can be executed on any operating system that supports a .NET environment, such as MS Windows, Linux, and macOS.
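As a follow-up, here is a minimal sketch of looking up a single metadata value by name instead of printing every item. It relies only on the Parser, GetMetadata, and MetadataItem members used in the sample code; the MetadataLookupSketch namespace, the ToDictionary helper, and the "author" key are assumptions made for illustration. Which names are actually present depends on the document, so the lookup is written to tolerate a missing key.

```csharp
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;

namespace MetadataLookupSketch
{
    static class Program
    {
        // Hypothetical helper (not part of the library): copies metadata items
        // into a dictionary keyed by item name, ignoring case.
        static Dictionary<string, string> ToDictionary(IEnumerable<MetadataItem> items)
        {
            var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

            // GetMetadata returns null when extraction isn't supported,
            // so an empty dictionary is returned in that case.
            if (items == null)
            {
                return result;
            }

            foreach (MetadataItem item in items)
            {
                // If a name occurs more than once, the last value wins.
                result[item.Name] = item.Value;
            }

            return result;
        }

        static void Main()
        {
            // "sample.pdf" is the same placeholder path used in the sample code.
            using (Parser parser = new Parser("sample.pdf"))
            {
                Dictionary<string, string> metadata = ToDictionary(parser.GetMetadata());

                // "author" is an assumed item name; adjust it to a name
                // that the full listing prints for your document.
                string author;
                if (metadata.TryGetValue("author", out author))
                {
                    Console.WriteLine("Author: {0}", author);
                }
                else
                {
                    Console.WriteLine("No 'author' item found among {0} metadata items", metadata.Count);
                }
            }
        }
    }
}
```

Running the full listing first is a practical way to see which names a given PDF exposes before hard-coding a key like the one above.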
