ABU 25-mar-2019
The problem: extract text from images using a web-service.
Let's take a picture like this
and try to extract the text ....
We are going to use an OCR web-service provided by the "ocr.space" web-site.
WARNING: the following code should be run from a wish shell, since in a plain console the results, especially if they are in a non-latin alphabet, may look like gibberish.
We need a registration key (it's free, see https://ocr.space/ocrapi ) and a bunch of code (attached at the end of this page); then we can run:
    set MYAPIKEY "xxxxxx" ;# <-- insert your free API-key here
    set URL "http://api.ocr.space/Parse/Image"
    # prepare parameters for http::POST ...
    set header [list apikey $MYAPIKEY]
    set form {
        {file -file c:/tmp/myImage.png}
        {language eng}
        {scale true}
    }
    set token [http::POST $URL -header $header -post $form]
    set response [http::data $token]
    http::cleanup $token
    set txt [ocr.space.decodeResponse $response]
    puts "-------------------------"
    puts $txt
    puts "-------------------------"
You can see the result below; it's not 100% perfect, and I highlighted some *errrrors* ...
CHAPTER Software starts as an Idea. Let's assume It's a good *Ideaâan* idea that could make the world a better place, or at least make someone some money. *me* challenge of the software developer is to take the idea and make *It* real. into something that actually delivers that benefit. The original Idea is perfect, beautiful. *Ifthe* person who has the idea happens to be a talented software developer. then we might be in luck: the *Idea* could be turned Into working software without ever needing to be explained to anyone else. More often, though, the person with the original idea doesn't have the necessary programming skill to make It real. Now the idea has to travel from that person's mind *Into* other people's. It needs to be communicated.
The response is fast, less than one second (including the upload time), but if you don't want to wait for it, you can issue an asynchronous call.
Just add a -command option to http::POST and write a callback proc:
    # callback for the -command option
    proc onResponse {token} {
        # ... get the token,
        # ... extract the response
        # ... don't forget to cleanup/free the token
        # ... decode the response and store the result somewhere
    }
    set header { .. same as previous .. }
    set form { .. same as previous .. }
    http::POST http://api.ocr.space/Parse/Image -headers $header -form $form -command onResponse
    puts "OCR launched ... result will be saved later somewhere ..."
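The callback outlined above could be filled in roughly as follows. This is only a sketch: it assumes that the token handed to the callback can be inspected with the standard http package commands (http::status, http::ncode, http::data), as the synchronous example suggests. The helper classifyReply and the global variable ::ocrResult are hypothetical names introduced here for illustration.

```tcl
package require http

# Hypothetical helper: turn the outcome of a request into a {kind value} pair.
# kind is "ok" (value = raw response body) or "error" (value = a message).
proc classifyReply {status ncode body} {
    if {$status ne "ok"} {
        return [list error "transfer failed: $status"]
    }
    if {$ncode ne "200"} {
        return [list error "HTTP status $ncode"]
    }
    return [list ok $body]
}

# Callback for the -command option.
# ::ocrResult is an assumed global used to hand the result to the rest of the app.
proc onResponse {token} {
    set status [http::status $token]
    set ncode  [http::ncode $token]
    set body   [http::data $token]
    # free the token before doing anything else with the data
    http::cleanup $token
    set ::ocrResult [classifyReply $status $ncode $body]
}
```

In a wish shell the event loop is already running, so the callback fires by itself; in tclsh a "vwait ::ocrResult" would be needed after launching the request. The raw body stored in an "ok" result would then be passed to ocr.space.decodeResponse.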
Of course you need two special commands: http::POST and ocr.space.decodeResponse; you can download them later, but first, let's examine the above code more closely ...
The core of the above scripts consists of two commands: the generic command http::POST, and the very specific command ocr.space.decodeResponse.
ocr.space.decodeResponse is responsible for decoding the json response and for extracting the bare text.
Here we will neglect ocr.space.decodeResponse, since it's nothing more than a crafted 'json to Tcl' conversion;
http::POST is a new general-purpose command, and it is fully described at the following page: http::POST.
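To give an idea of what such a 'json to Tcl' conversion involves, here is a minimal sketch. It is not the actual ocr.space.decodeResponse: demo.decodeResponse is a hypothetical name, the json package from tcllib is an assumed dependency, and the field names (ParsedResults, ParsedText, IsErroredOnProcessing, ErrorMessage) are assumed from the ocr.space API documentation.

```tcl
package require json   ;# from tcllib (assumed to be installed)

# Hypothetical stand-in for ocr.space.decodeResponse:
# convert the json response to a Tcl dict and return the bare text.
proc demo.decodeResponse {jsonText} {
    set d [json::json2dict $jsonText]

    # The service is assumed to flag failures with a boolean field.
    if {[dict exists $d IsErroredOnProcessing]
            && [string is true -strict [dict get $d IsErroredOnProcessing]]} {
        set msg "unknown error"
        if {[dict exists $d ErrorMessage]} {
            set msg [dict get $d ErrorMessage]
        }
        error "OCR failed: $msg"
    }

    # One entry per parsed page; collect the text of each.
    set parts {}
    if {[dict exists $d ParsedResults]} {
        foreach page [dict get $d ParsedResults] {
            if {[dict exists $page ParsedText]} {
                lappend parts [dict get $page ParsedText]
            }
        }
    }
    return [join $parts \n]
}
```

With this sketch, "set txt [demo.decodeResponse $response]" would take the place of the call to ocr.space.decodeResponse in the first script; the real command may handle more fields and more error cases.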
Now, if you are still interested, you can download everything ...
In order to run the above scripts, a few non-standard packages are required. At this link [L1 ] you can find the full source code with the required packages, and a short demo with a primitive GUI.