Extract Text from XLS using C#

Extracting text from Excel (XLS) files is a common requirement for developers working on data processing, report generation, or information management tasks. Excel files often hold large amounts of data, and you may need to pull specific information out of them to generate reports, perform data analysis, or migrate data to other formats. Automating the extraction saves time and reduces the risk of manual errors. In this article, we’ll explore how to extract text from XLS using C#. We’ll walk through the necessary steps and provide sample code that you can integrate into your C# projects.

Steps to Extract Text from XLS using C#

  1. Prepare your development environment by adding GroupDocs.Parser for .NET, which enables text extraction from XLS files
  2. Create a Parser instance and pass the path of your XLS file to its constructor
  3. Call the GetText method on the Parser instance to obtain a TextReader, or call GetFormattedText, as the sample code in this article does, to receive the text as HTML
  4. Check that the TextReader is not null (null means extraction is not supported for the document), then call its ReadToEnd method to read the entire text content of the XLS file
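
The steps above can be sketched as a minimal plain-text variant. This is an illustration only: it assumes the GroupDocs.Parser package is already referenced by the project, "input.xls" is a placeholder path, and the namespace name is hypothetical. No license is applied here, so the library's unlicensed limitations would apply.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

namespace ExtractTextSketch // hypothetical namespace, for illustration only
{
    internal static class Program
    {
        private static void Main()
        {
            // Step 2: open the workbook; "input.xls" is a placeholder path
            using (Parser parser = new Parser("input.xls"))
            // Step 3: request the document's plain text as a TextReader
            using (TextReader reader = parser.GetText())
            {
                // A null reader means text extraction isn't supported for this document
                if (reader == null)
                {
                    Console.WriteLine("Text extraction isn't supported");
                    return;
                }

                // Step 4: read all of the extracted text at once and print it
                Console.WriteLine(reader.ReadToEnd());
            }
        }
    }
}
```

The null check comes before ReadToEnd so that an unsupported document produces a message rather than a NullReferenceException.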

The steps described above work on Windows, macOS, and Linux without any extra software beyond what is typically included with these platforms, so the same code behaves consistently across environments and keeps your application portable. You can adapt the code to your project's requirements, whether you are dealing with large data sets or automating routine tasks. The following code example shows how to read text from an XLS file.

Code to Extract Text from XLS using C#

using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options;

namespace ExtractTextfromXLSusingCSharp
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Apply the license to remove the limitations of the Parser library
            License lic = new License();
            lic.SetLicense(@"GroupDocs.Parser.lic");

            // Instantiate the Parser class
            using (Parser parser = new Parser("input.xls"))
            {
                // Retrieve formatted text into the reader
                using (TextReader reader = parser.GetFormattedText(
                    new FormattedTextOptions(FormattedTextMode.Html)))
                {
                    // Output the formatted text from the document
                    // If formatted text extraction is not supported,
                    // the reader will be null
                    Console.WriteLine(reader == null ?
                        "Formatted text extraction isn't supported"
                        : reader.ReadToEnd());
                    Console.ReadLine();
                }
            }
        }
    }
}
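
When the workbook is large, it may be preferable to consume the TextReader line by line instead of calling ReadToEnd. The sketch below is one possible adaptation, not part of the original sample: it assumes plain text is wanted (GetText), uses placeholder file names ("input.xls", "output.txt") and a hypothetical namespace, and skips blank lines as an arbitrary example of per-line processing. Whether this actually lowers memory use depends on how the library buffers the extracted text internally, which is not covered here.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;

namespace ExtractTextToFileSketch // hypothetical namespace, for illustration only
{
    internal static class Program
    {
        private static void Main()
        {
            // "input.xls" and "output.txt" are placeholder paths
            using (Parser parser = new Parser("input.xls"))
            using (TextReader reader = parser.GetText())
            {
                // A null reader means text extraction isn't supported for this document
                if (reader == null)
                {
                    Console.WriteLine("Text extraction isn't supported");
                    return;
                }

                using (StreamWriter writer = new StreamWriter("output.txt"))
                {
                    int written = 0;
                    string line;

                    // Read one line at a time until the reader is exhausted
                    while ((line = reader.ReadLine()) != null)
                    {
                        // Skip lines that contain only whitespace
                        if (line.Trim().Length == 0)
                        {
                            continue;
                        }

                        writer.WriteLine(line);
                        written++;
                    }

                    Console.WriteLine($"Wrote {written} non-empty lines to output.txt");
                }
            }
        }
    }
}
```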

After setting up the recommended library and configuring the license and input file paths, you can add the provided code to your project with little or no change. It gives your application access to the text stored in Excel files so that you can manage and process the data further. Whether you’re building a data analysis tool or automating report generation, this approach saves the time and effort of manual extraction.

Previously, we published a detailed guide on extracting text from PPT files using C#. For more information, see our tutorial on how to extract text from PPT using C#.
