HiQPdf Documentation

Search Text in PDF

Quickly Create High Quality PDFs

With HiQPdf Library you can search a text in a PDF document using the PdfTextExtractSearchText(String, String, Int32, Int32, Boolean, Boolean) method HiQPdfPdfTextExtract class. You can choose to match the case or to match the whole word only when searching using this method parameters.

Search Text in PDF Demo

In this demo you can learn how to search a text in a PDF document. You can choose the text to search, you can choose to match or not the case or whole word only and you can choose the range of PDF pages where to search the text. The found text is highlighted in the original PDF.

Demo Source Code

C#
// get the PDF file
string pdfFile = Server.MapPath("~") + @"\DemoFiles\Pdf\InputPdf.pdf";

// get the text to search
string textToSearch = textBoxTextToSearch.Text;

// create the PDF text extractor
PdfTextExtract pdfTextExtract = new PdfTextExtract();

int fromPdfPageNumber = int.Parse(textBoxFromPage.Text);
int toPdfPageNumber = textBoxToPage.Text.Length > 0 ? int.Parse(textBoxToPage.Text) : 0;

// search the text in PDF document
PdfTextSearchItem[] searchTextInstances = pdfTextExtract.SearchText(pdfFile, textToSearch,
            fromPdfPageNumber, toPdfPageNumber, checkBoxMatchCase.Checked, checkBoxMatchWholeWord.Checked);

// load the PDF file to highlight the searched text
PdfDocument pdfDocument = PdfDocument.FromFile(pdfFile);

// highlight the searched text in PDF document
foreach (PdfTextSearchItem searchTextInstance in searchTextInstances)
{
    PdfRectangle pdfRectangle = new PdfRectangle(searchTextInstance.BoundingRectangle);

    // set rectangle color and opacity
    pdfRectangle.BackColor = Color.Yellow;
    pdfRectangle.Opacity = 30;

    // highlight the text
    pdfDocument.Pages[searchTextInstance.PdfPageNumber - 1].Layout(pdfRectangle);
}

// write the modified PDF document
try
{
    // write the PDF document to a memory buffer
    byte[] pdfBuffer = pdfDocument.WriteToMemory();

    // inform the browser about the binary data format
    HttpContext.Current.Response.AddHeader("Content-Type", "application/pdf");

    // let the browser know how to open the PDF document and the file name
    HttpContext.Current.Response.AddHeader("Content-Disposition", String.Format("attachment; filename=SearchText.pdf; size={0}",
                pdfBuffer.Length.ToString()));

    // write the PDF buffer to HTTP response
    HttpContext.Current.Response.BinaryWrite(pdfBuffer);

    // call End() method of HTTP response to stop ASP.NET page processing
    HttpContext.Current.Response.End();
}
finally
{
    pdfDocument.Close();
}
See Also

Other Resources