An easy way to extract text from various file formats in .NET


Gears.IFilterHelper is a .NET component that uses IFilters to extract text from file formats such as Adobe PDF, Microsoft Office, ZIP, and CHM.

With Gears.IFilterHelper you can add text extraction capabilities to your .NET applications in a couple of lines of code.

The following C# and VB.NET samples demonstrate how to use the Gears.IFilterHelper API to extract text from the "HelloWorld.doc" document and print it to the console.

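using System;
using NineRays.Gears;

namespace NineRays.IFilters.Samples {
  class HelloWorld {
    static void Main() {
      // Extract the text from the document and print it to the console.
      String text = IFilterHelper.GetText("HelloWorld.doc");
      Console.WriteLine(text);
    }
  }
}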

Imports System
Imports NineRays.Gears

Namespace NineRays.IFilters.Samples

  Class HelloWorld

    Public Shared Sub Main()
      ' Extract the text from the document and print it to the console.
      Dim text As String = IFilterHelper.GetText("HelloWorld.doc")
      Console.WriteLine(text)
    End Sub

  End Class
End Namespace

Key Features


  • Extracts text from files
  • Supports .NET 1.1, 2.0, 3.5, and 4.0
  • Works with C#, VB.NET, and other CLS-compliant languages
  • Works with any well-written IFilter
    • Note: 64-bit applications can use only 64-bit IFilters, and 32-bit applications can use only 32-bit IFilters
  • Supports more than 50 popular document formats
  • Supports Adobe PDF IFilter 4.*, 5.*, 6.*, 7.*, 8.*, and 9.* for extracting text from PDF files in .NET
  • Supports Office 2000, Office 2003, Office 2007, and Office 2010 document formats
  • Automatically detects installed IFilters
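
Because an application can only load IFilters that match its own bitness, it can help to check at startup whether the process is running as 32-bit or 64-bit. The sketch below is not part of the Gears.IFilterHelper API. It uses only the standard IntPtr.Size property, and the class and method names (BitnessCheck, GetProcessBitness) are hypothetical names chosen for illustration.

```csharp
using System;

public class BitnessCheck
{
    // Returns 64 in a 64-bit process and 32 in a 32-bit process.
    // IntPtr.Size is the pointer size in bytes (8 or 4).
    public static int GetProcessBitness()
    {
        return IntPtr.Size * 8;
    }

    public static void Main()
    {
        int bits = GetProcessBitness();
        // Report which family of installed IFilters this process can use.
        Console.WriteLine("Process is {0}-bit; only {0}-bit IFilters can be loaded.", bits);
    }
}
```

If an expected format yields no text, a mismatch between the process bitness reported here and the bitness of the installed IFilter is one likely cause worth ruling out.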

Current Version