string dataDir = RunExamples.GetDataDir_AsposePdfFacades_TechnicalArticles();
// Instantiate a MemoryStream object to hold the extracted text from the document
PdfExtractor extractor = new PdfExtractor();
// Bind the input PDF document to the extractor
extractor.BindPdf(dataDir + "FilledForm.pdf");
// Extract text from the input PDF document
// Save the extracted text to a text file
// Check if the MemoryStream length is greater than or equal to 1
// Extract images from the input PDF document
// Call the HasNextImage method in a while loop
// Now find out whether this PDF is text only or image only
if (containsText && !containsImage)
    Console.WriteLine("PDF contains text only");
else if (!containsText && containsImage)
    Console.WriteLine("PDF contains image only");
else if (containsText && containsImage)
    Console.WriteLine("PDF contains both text and image");
else if (!containsText && !containsImage)
    Console.WriteLine("PDF contains neither text nor image");

// Create an instance of the PdfExtractor class
// Bind the PDF file with the extractor object
extractor.BindPdf(dataDir + "inFile.pdf");
// Specify start and end pages of the PDF
extractor.GetText(dataDir + "PdfExtractorFeatures_text_out_.txt");
// Text of individual pages can also be saved to separate text files
extractor.GetNextPageText(dataDir + () + "_out_.txt");
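The four-way check on containsText and containsImage can be isolated from the extraction calls, which makes it easy to verify on its own. The following is only a minimal sketch: PdfContentClassifier, HasText, and Describe are hypothetical names that are not part of Aspose.PDF, and the two flags are assumed to come from the text and image extraction steps described in the comments above.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of Aspose.PDF: it only encodes the decision
// logic and assumes the caller has already run the extraction steps.
public static class PdfContentClassifier
{
    // Mirrors the "MemoryStream length is greater than or equal to 1" check:
    // a non-empty stream of extracted text means the PDF contains text.
    public static bool HasText(Stream extractedText)
    {
        return extractedText.Length >= 1;
    }

    // Maps the two flags to the same four messages used in the listing.
    public static string Describe(bool containsText, bool containsImage)
    {
        if (containsText && containsImage)
            return "PDF contains both text and image";
        if (containsText)
            return "PDF contains text only";
        if (containsImage)
            return "PDF contains image only";
        return "PDF contains neither text nor image";
    }
}
```

A caller would pass the stream that received the extracted text to HasText, take the image flag from the extractor's image check, and print the result of Describe with Console.WriteLine.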