visual studio 2010 - How to extract multiple lines from an image using Tesseract OCR?
We passed an image containing a single line of text, "hello world", and Tesseract OCR shows the result 'hello world'.
But when we pass an image with multiple lines of text,
hello world
how you
it doesn't show anything. What is the problem? Can you please help me with this? We need urgent help. Please answer. Thanks in advance :)
Here is our code:
    #include "stdafx.h"
    #include <iostream>
    #include <baseapi.h>
    #include <allheaders.h>
    #include <fstream>
    using namespace std;

    int _tmain(int argc, _TCHAR* argv[])
    {
        tesseract::TessBaseAPI api;
        api.Init("", "eng", tesseract::OEM_DEFAULT);
        api.SetPageSegMode(static_cast<tesseract::PageSegMode>(7));
        api.SetOutputName("out");

        // Ask for the path of the image to recognize.
        cout << "file name:";
        char image[256];
        cin >> image;
        PIX *pixs = pixRead(image);

        // Run OCR on the file and collect the recognized text.
        STRING text_out;
        api.ProcessPages(image, NULL, 0, &text_out);
        cout << text_out.string();

        // Also save the recognized text to out.txt.
        ofstream files;
        files.open("out.txt");
        files << text_out.string() << endl;
        files.close();

        // Wait for input so the console window stays open.
        cin >> image;
        return 0;
    }
Page segmentation mode 7 treats the image as a single text line. Try 3 instead, which is fully automatic page segmentation with no OSD (the default).
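To make the difference between the two modes concrete, here is a minimal sketch that needs no Tesseract headers. The enum only mirrors the numeric values 3 and 7 mentioned above (in Tesseract they correspond to tesseract::PSM_AUTO and tesseract::PSM_SINGLE_LINE), and the helpers choose_seg_mode and describe_seg_mode are hypothetical names made up for illustration, not part of the Tesseract API.

```cpp
#include <string>

// Local mirror of the two numeric modes discussed above. The real constants
// live in tesseract::PageSegMode (PSM_AUTO and PSM_SINGLE_LINE).
enum SegMode {
    kAuto = 3,        // fully automatic page segmentation, no OSD
    kSingleLine = 7   // treat the image as a single text line
};

// Hypothetical helper: pick a mode from the number of text lines expected.
SegMode choose_seg_mode(int expected_lines) {
    return expected_lines == 1 ? kSingleLine : kAuto;
}

// Hypothetical helper: short description of a mode, e.g. for log output.
std::string describe_seg_mode(SegMode mode) {
    switch (mode) {
        case kSingleLine: return "single text line";
        case kAuto:       return "automatic page segmentation, no OSD";
    }
    return "unknown";
}

// In the question's program the result would replace the hard-coded 7:
//   api.SetPageSegMode(
//       static_cast<tesseract::PageSegMode>(choose_seg_mode(2)));
```

For the two-line image from the question, choose_seg_mode(2) yields 3, so the whole page is segmented instead of being treated as a single line.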