ocr - How to make Tesseract recognize only numbers when they are mixed with letters?

- September 15, 2015

I want to use Tesseract to recognize only numbers. The problem is that I have a
mixture of numbers and letters, and when I use SetVariable("tessedit_char_whitelist", "0123456789"),
Tesseract returns a wrong digit for every symbol that is actually a letter.

Can I set a threshold value so that Tesseract omits symbols with low resemblance?

Note: I set Tesseract to recognize only digits so that there is no confusion between O and 0.

Recognizing only numbers is answered on the Tesseract FAQ page. See that page for more info, but if you have the version 3 package, the config files are already set up. You just specify on the command line:

tesseract image.tif outputbase nobatch digits
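
If you would rather drive that command from a script, a minimal Python sketch might look like the following. The function names are made up for illustration, and it assumes the tesseract executable is on your PATH and that your version still accepts the nobatch and digits config names used above.

```python
import shutil
import subprocess


def build_digits_command(image_path, output_base):
    """Return the argument list for the digits-only command line shown above."""
    # "nobatch" and "digits" are the config names taken from the answer;
    # whether both exist depends on the installed Tesseract version.
    return ["tesseract", image_path, output_base, "nobatch", "digits"]


def recognize_digits(image_path, output_base):
    """Run Tesseract and return the text it wrote to <output_base>.txt."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if Tesseract exits with an error.
    subprocess.run(build_digits_command(image_path, output_base), check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

Calling recognize_digits("image.tif", "outputbase") would then return whatever Tesseract wrote to outputbase.txt.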

As for the threshold value, I'm not sure what you mean. If your input is in an unusual font, perhaps you might retrain Tesseract using a sample of your input. An alternative is to change Tesseract's pruning threshold. Both options are mentioned in the FAQ as well.
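
If by threshold you mean "drop anything Tesseract is unsure about", one possible workaround (not something from the FAQ) is to filter the results after recognition. The sketch below assumes a Tesseract version that can write TSV output with a conf column, which reports confidence per word rather than per symbol. The function name and the cutoff of 60 are arbitrary choices for illustration.

```python
import csv
import io


def filter_by_confidence(tsv_text, min_conf=60.0):
    """Keep only recognized words whose confidence is at least min_conf.

    tsv_text is assumed to be the contents of a Tesseract TSV output file,
    with a header row that includes "conf" and "text" columns.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    kept = []
    for row in reader:
        text = (row.get("text") or "").strip()
        try:
            conf = float(row.get("conf") or -1)
        except ValueError:
            continue  # skip rows whose confidence field is not numeric
        # Layout rows (page, block, line) carry conf -1 and no text.
        if text and conf >= min_conf:
            kept.append(text)
    return kept
```

Letters that the whitelist has forced into digits tend to come back with low confidence, so a cutoff like this may remove many of them, but the right value depends on your images and has to be found by experiment.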
