This repo contains an Arabic OCR app that extracts Arabic text from images. It is built on the EasyOCR library, which pairs a detection model with a recognition model to find and read characters and words. For detection, EasyOCR uses a pretrained model of the CRAFT algorithm; for recognition, it uses a CRNN model. This app uses both pretrained models for the Arabic language. The web app itself is built with the Streamlit library.
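As a rough illustration of the detection and recognition flow described above, here is a minimal sketch of calling EasyOCR directly for Arabic. It is not taken from this repo's `app.py`: the function names `join_detections` and `read_arabic_text`, the `min_confidence` threshold of 0.3, and the image path are assumptions made for the example.

```python
from typing import Iterable, List, Sequence, Tuple

# One EasyOCR detection: (bounding box, recognized text, confidence).
Detection = Tuple[Sequence, str, float]


def join_detections(detections: Iterable[Detection], min_confidence: float = 0.0) -> str:
    """Join recognized text fragments, skipping low-confidence ones."""
    kept: List[str] = [
        text for _bbox, text, confidence in detections if confidence >= min_confidence
    ]
    return "\n".join(kept)


def read_arabic_text(image_path: str, min_confidence: float = 0.3) -> str:
    """Run EasyOCR's pretrained Arabic models on one image.

    Hypothetical helper; requires `pip install easyocr`.
    """
    import easyocr  # imported lazily so join_detections works without it

    reader = easyocr.Reader(["ar"])  # loads the text detector and the Arabic recognizer
    return join_detections(reader.readtext(image_path), min_confidence)
```

Calling `read_arabic_text("sample.png")` (a placeholder path) would return the recognized lines as one string. Filtering by confidence is a design choice for the sketch, not something the app is known to do.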
There are several ways to run or install the app; three are shown here:
You can run the Colab notebook and open the ngrok link it generates to use the app.
This step assumes that conda is already installed on the machine. If it is not, follow the conda installation instructions first.
- First, clone the repo to the local machine.
git clone https://github.com/maidaly/Arabic_OCR.git
- Create a new conda environment to run the app in.
conda create --name arabic_ocr
conda activate arabic_ocr
- Install the required Python packages.
pip install -r requirements.txt
- Run the app
streamlit run app.py
Run this command from the folder that contains the repo files. Streamlit prints two links; open http://localhost:8501 to use the app on the local machine.
This step assumes that Docker is running on your machine.
- Clone the repo as in the conda installation.
- Change directory to the repo location.
- Build a Docker image.
docker image build -t arabic_ocr:app .
- Run the image
docker run -p 8501:8501 arabic_ocr:app
Once the container is running, open http://localhost:8501 to use the app.
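Whether the app was started with conda or with Docker, it is served on port 8501. If the page does not load, a quick check along these lines may help tell whether anything is answering on that address. This is only a sketch using the Python standard library; the helper name `app_is_reachable` and the 3-second timeout are assumptions for the example.

```python
import urllib.error
import urllib.request


def app_is_reachable(url: str = "http://localhost:8501", timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` gets a successful response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Running `print(app_is_reachable())` while the app is up should print `True`; `False` usually means the app has not started yet or the port is not published.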
- The first run may take a little while because the app downloads the pretrained models; how long depends on network speed. The models are then saved locally and reused on later runs.
- The app runs faster on a machine with an NVIDIA GPU. If no GPU is available, the app still runs, but more slowly.
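To see in advance whether the app is likely to use a GPU, one option is to ask PyTorch, which EasyOCR runs on. The snippet below is a hedged sketch rather than code from this repo; the helper name `gpu_available` is an assumption.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to run on a GPU.
        return False
    return bool(torch.cuda.is_available())
```

The result can be passed to EasyOCR explicitly, for example `easyocr.Reader(["ar"], gpu=gpu_available())`, so that the choice between GPU and CPU is visible in the code.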