What is OCR?
Optical character recognition (OCR) is the automatic conversion of typed, handwritten, or printed text in an image into machine-encoded text that a program can access and manipulate, for example as a string variable.
Tesseract
Tesseract is a free, open-source optical character recognition (OCR) engine that recognizes text in images. It is widely used for extracting text from images and runs on macOS, Windows, and Linux.
- Install Tesseract
- Validate that Tesseract has been installed
- https://pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
brew install tesseract
# If you’re using Ubuntu, use:
sudo apt-get install tesseract-ocr
tesseract -v
# bash: tesseract: command not found
# This means that Tesseract is not installed or that you need to
# configure your PATH variable
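The validation step above can also be done from Python, which is handy if the OCR code will run from a script anyway. The following is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are made up for illustration and are not part of Tesseract or any wrapper library.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(executable):
    """Run `<executable> -v` and return the first line of whatever it prints."""
    result = subprocess.run(
        [executable, "-v"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture the banner even if it goes to stderr
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


path = find_tesseract()
if path is None:
    # Same situation as "command not found" in the shell
    status = "not found"
    print("tesseract not found: install it or add its directory to PATH")
else:
    status = "found"
    print(f"{path}: {tesseract_version(path)}")
```

If this prints the "not found" message even though the install command succeeded, the likely cause is the same as in the shell: the directory containing the `tesseract` binary is not on `PATH` for the process running Python.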