Read Metadata from DOCX using C#

Document metadata, such as the author, creation date, and revision history, describes where a document came from and how it has changed, which makes it valuable for document management and analysis. In this article, we'll explore how to read metadata from DOCX using C# with the GroupDocs.Metadata for .NET library, so that developers can inspect their document repositories and improve their document processing workflows. The following steps and code example show how to read the metadata of a DOCX file using C#.

Steps to Read Metadata from DOCX using C#

  1. Set up your IDE to use GroupDocs.Metadata for .NET to retrieve DOCX metadata
  2. Create an instance of the Metadata class, passing the path of the DOCX file to the constructor
  3. Create a predicate that tests each metadata property and selects the ones you want
  4. Pass the predicate to the Metadata.FindProperties method
  5. Iterate through the returned properties
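
Steps 2 to 5 can be sketched in a few lines. This is a minimal illustration rather than a definitive implementation: it assumes the GroupDocs.Metadata package is already referenced, that a file named input.docx exists next to the executable, and that MetadataProperty lives in the GroupDocs.Metadata.Common namespace. The class name MinimalMetadataReader and the helper IsContentProperty are hypothetical names chosen for this example.

```csharp
using System;
using System.Linq;
using GroupDocs.Metadata;
using GroupDocs.Metadata.Common;
using GroupDocs.Metadata.Tagging;

internal static class MinimalMetadataReader
{
    // Step 3: a predicate (hypothetical helper) that keeps only properties
    // tagged as describing the file content.
    private static bool IsContentProperty(MetadataProperty property)
    {
        return property.Tags.Any(tag => tag.Category == Tags.Content);
    }

    private static void Main()
    {
        // Step 2: open the document; "input.docx" is a placeholder path.
        using (var metadata = new Metadata("input.docx"))
        {
            // Step 4: pass the predicate to FindProperties.
            var properties = metadata.FindProperties(IsContentProperty);

            // Step 5: iterate through the returned properties.
            foreach (var property in properties)
            {
                Console.WriteLine("{0} = {1}", property.Name, property.Value);
            }
        }
    }
}
```

Writing the predicate as a named method keeps the filtering rule separate from the code that opens the file, so the rule can be swapped without touching the rest of the program.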

Whether you are automating document categorization, implementing search within document repositories, or checking compliance with organizational standards, the ability to extract metadata of DOCX in C# is useful in many scenarios. The steps above work on widely used operating systems such as Windows, macOS, and Linux, as long as .NET is installed. No additional software installations are required. Once you have set up the recommended library and adjusted the file paths, the following code should work in your projects.

Code to Read Metadata from DOCX using C#

using System;
using System.Linq;
using System.Text.RegularExpressions;
using GroupDocs.Metadata;
using GroupDocs.Metadata.Common;
using GroupDocs.Metadata.Tagging;

namespace ReadMetadataFromDOCXUsingCSharp
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Set the license to avoid the limitations of the Metadata library
            License lic = new License();
            lic.SetLicense(@"GroupDocs.Metadata.lic");

            // Pass the absolute or relative path of the document to Metadata's constructor
            using (Metadata metadata = new Metadata(@"input.docx"))
            {
                // Skip files of unknown format and encrypted documents
                if (metadata.FileFormat != FileFormat.Unknown && !metadata.GetDocumentInfo().IsEncrypted)
                {
                    Console.WriteLine();

                    // Fetch all metadata properties that fall into a particular category
                    var properties = metadata.FindProperties(p => p.Tags.Any(t => t.Category == Tags.Content));
                    Console.WriteLine("The metadata properties describing some characteristics of the file content: title, keywords, language, etc.");
                    foreach (var property in properties)
                    {
                        Console.WriteLine("{0} = {1}", property.Name, property.Value);
                    }

                    // Fetch all properties having a specific type and value
                    var year = DateTime.Today.Year;
                    properties = metadata.FindProperties(p => p.Value.Type == MetadataPropertyType.DateTime &&
                        p.Value.ToStruct(DateTime.MinValue).Year == year);
                    Console.WriteLine("All datetime properties with the year value equal to the current year");
                    foreach (var property in properties)
                    {
                        Console.WriteLine("{0} = {1}", property.Name, property.Value);
                    }

                    // Fetch all properties whose names match the specified regex.
                    // Note: "^" applies only to "author" and "$" only to the last alternative.
                    const string pattern = "^author|company|(.+date.*)$";
                    Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
                    properties = metadata.FindProperties(p => regex.IsMatch(p.Name));
                    Console.WriteLine("All properties whose names match the following regex: {0}", pattern);
                    foreach (var property in properties)
                    {
                        Console.WriteLine("{0} = {1}", property.Name, property.Value);
                    }
                }
            }
        }
    }
}
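
The regex in the listing above relies on alternation precedence. In the pattern ^author|company|(.+date.*)$ the ^ anchor binds only to author and the $ anchor binds only to the last alternative, so company matches anywhere in a property name. The standalone sketch below compares that pattern with a grouped variant in case whole-name matching is what you need. The grouped pattern, the class name RegexAnchoringDemo, and the sample names are assumptions made for illustration and are not taken from a real document or from the library.

```csharp
using System;
using System.Text.RegularExpressions;

internal static class RegexAnchoringDemo
{
    private static void Main()
    {
        // Pattern from the article: "^" binds only to "author",
        // "$" binds only to the last alternative.
        var original = new Regex("^author|company|(.+date.*)$", RegexOptions.IgnoreCase);

        // Hypothetical stricter variant: the group makes both anchors
        // apply to every alternative, so the whole name must match.
        var anchored = new Regex("^(author|company|.+date.*)$", RegexOptions.IgnoreCase);

        // Illustrative property names only.
        string[] names = { "Author", "AuthorNotes", "MyCompanyName", "CreatedDate", "Date" };

        foreach (var name in names)
        {
            Console.WriteLine("{0,-14} original={1,-5} anchored={2}",
                name, original.IsMatch(name), anchored.IsMatch(name));
        }
    }
}
```

Author and CreatedDate match both patterns. AuthorNotes and MyCompanyName match only the original pattern, because one starts with author and the other contains company. Date matches neither, because .+date requires at least one character before date.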

In conclusion, reading metadata from DOCX gives developers deeper insight into their document collections. By following the steps outlined in this article and using the provided code example, you can get metadata of DOCX in C# and use it to better organize, analyze, and manage your documents. Experiment with different metadata properties and explore the library to extend your document processing capabilities in C#.

In an earlier article, we provided an extensive tutorial on retrieving document information using C#. For a deeper understanding of this subject, see our detailed guide on how to get document information using C#.
