Text_Extract
Text extraction from images, OCR, Tesseract, and basic image manipulation are all important yet very basic scripting tasks.
This script uses pytesseract to extract text from images. Because pytesseract only recognizes the text and returns it for printing,
this script additionally adds functionality to write the text to a txt and/or csv file.
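
The extract-then-save flow described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names `extract_text` and `save_text`, the `line_number`/`text` CSV columns, and the one-row-per-line layout are all assumptions. Only `pytesseract.image_to_string` and `PIL.Image.open` are real library calls.

```python
import csv
from pathlib import Path


def extract_text(image_path):
    """Return the text Tesseract recognizes in one image (hypothetical helper)."""
    # Imported lazily so save_text below can be used without OCR installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def save_text(text, out_dir, stem, as_txt=True, as_csv=True):
    """Write text to <stem>.txt and/or <stem>.csv in out_dir; return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    if as_txt:
        txt_path = out_dir / f"{stem}.txt"
        txt_path.write_text(text, encoding="utf-8")
        written.append(txt_path)
    if as_csv:
        # Assumed layout: a header row, then one row per recognized line.
        csv_path = out_dir / f"{stem}.csv"
        with csv_path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(["line_number", "text"])
            for number, line in enumerate(text.splitlines(), start=1):
                writer.writerow([number, line])
        written.append(csv_path)
    return written
```

Keeping the OCR call and the file writing in separate functions means the output format can be changed without touching the extraction step.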
Setup instructions
- Set up a python 3.x virtual environment.
- Activate the environment.
- Install the dependencies using pip3 install -r requirements.txt
- You are all set and the script is ready to run.
- Carefully follow the instructions.
Further Readings
Newcomers often struggle with installing Tesseract for the first time; this is a direct link to the installer.
Instructions for setting up OCR can be found here.
Adding Tesseract to the PATH environment variable lets the code find it without a hard-coded path. This and this link will help you achieve that.
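
As a rough sketch of how the PATH lookup could work in practice: the helper names `find_tesseract` and `configure_pytesseract` and the `fallback` parameter are hypothetical and not part of this script. `shutil.which` and the `pytesseract.pytesseract.tesseract_cmd` setting are the real pieces.

```python
import shutil
from pathlib import Path


def find_tesseract(fallback=None):
    """Return the Tesseract executable path from PATH, else the fallback, else None."""
    found = shutil.which("tesseract")
    if found:
        return found
    # The fallback is an optional, user-supplied install location (assumption).
    if fallback and Path(fallback).is_file():
        return str(fallback)
    return None


def configure_pytesseract(fallback=None):
    """Point pytesseract at the located executable, or raise if none is found."""
    import pytesseract  # imported lazily; requires the pytesseract package

    path = find_tesseract(fallback)
    if path is None:
        raise FileNotFoundError(
            "Tesseract executable not found on PATH or at the fallback location"
        )
    pytesseract.pytesseract.tesseract_cmd = path
    return path
```

If Tesseract is already on PATH, pytesseract usually finds it without any configuration, so the fallback only matters for installs in non-standard locations.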
Usage
Just make sure that Tesseract is in the proper directory, then run the code according to the comments and guidelines.
Sample -
Enter the Folder name containing Images: <Name of Folder>
Enter your desired output location: <Name of Folder>
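
The two prompts above take an input folder and an output folder. A minimal sketch of the folder handling behind them might look like this, assuming one .txt file per image and an illustrative set of image extensions. `list_images`, `process_folder`, and the `extract` callback are hypothetical names, not the script's own.

```python
from pathlib import Path

# Assumed set of extensions treated as images; adjust to what the script supports.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def list_images(folder):
    """Return the image files directly inside folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def process_folder(input_folder, output_folder, extract):
    """Run extract(image_path) on each image and save the result as <stem>.txt."""
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for image_path in list_images(input_folder):
        target = out_dir / f"{image_path.stem}.txt"
        target.write_text(extract(image_path), encoding="utf-8")
        results[image_path.name] = target
    return results
```

Here `extract` is any function that maps an image path to a string, for example a thin wrapper around `pytesseract.image_to_string`. The two folder arguments would come from the `input()` prompts shown in the sample.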
Output

Image containing Text

After Extraction

Author(s)
Made by Vybhav Chaturvedi