Ocrad

This topic is closed.
    [DESKTOP] Ocrad

    I have downloaded and installed OCRAD (for optical character recognition) but it does not run, not even from the command line - could anyone suggest why not, please? Or suggest an alternative? Thank you.

    #2
    You say you "downloaded and installed" this application. Did you do so from the regular Ubuntu repositories, i.e., did you use Muon Package Manager or the CLI (sudo apt-get install ocrad)? Also, here is the link to the online manual: https://www.gnu.org/software/ocrad/m...ad_manual.html
    Using Kubuntu Linux since March 23, 2007
    "It is a capital mistake to theorize before one has data." - Sherlock Holmes


      #3
      I used the Muon Package Manager.

      Meanwhile I have checked the permissions. I found it was using Advanced Permissions (but why?). So I went to Konsole, ran 'ls -l' and found: Owner read and write; Group read; and Others read. I used chown and chgrp to change the owner and group to 'keith', which took me back from advanced to normal permissions. Back on the desktop: Owner Can Read and Write, Group Can Read, Others Can Read, and the 'is executable' box is ticked. I tried again to run it and it tried for a moment or two but then stopped. I seem to be still stuck ...

      That reference to the manual - true, but that helps only when you have got it up and running.


        #4
        Well, in my 20.04 installation, ocrad doesn't have a GUI; it's a CLI tool only. Given that, from the CLI, ocrad requires one or more options and one or more input files. In a Konsole, type: man ocrad
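
For anyone who would rather script it than type it, here is a minimal sketch of calling ocrad on one image from Python. The helper names `build_ocrad_command` and `run_ocrad` are made up for illustration; the only ocrad options used are `--charset` and `--format=utf8`, both described in the manual. As far as I know, ocrad versions of this era read only PNM images (pbm, pgm, ppm), so treat the file name `page.pgm` as an assumption too.

```python
import shutil
import subprocess


def build_ocrad_command(image_path, charset=None, utf8=True):
    """Build the ocrad argument list (hypothetical helper, not part of ocrad)."""
    cmd = ["ocrad"]
    if charset:
        # e.g. "iso-8859-15"; the manual describes the --charset option
        cmd.append(f"--charset={charset}")
    if utf8:
        # Ask for UTF-8 output instead of the default 'byte' format
        cmd.append("--format=utf8")
    cmd.append(image_path)
    return cmd


def run_ocrad(image_path, **options):
    """Run ocrad and return its text output, or None if ocrad is not installed."""
    if shutil.which("ocrad") is None:
        return None
    result = subprocess.run(
        build_ocrad_command(image_path, **options),
        capture_output=True,
        check=False,
    )
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # 'page.pgm' is a placeholder file name for a scanned page in PNM format
    print(build_ocrad_command("page.pgm", charset="iso-8859-15"))
    print(run_ocrad("page.pgm", charset="iso-8859-15"))
```

The point is simply that ocrad needs an input file on its command line; started with no arguments (for example from a menu entry) it has nothing to work on.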


          #5
          I gave ocrad a try, and I wonder what you meant by "does not run". For me, it ran but did nothing. If you just run "ocrad" with no arguments, it silently reads standard input and appears to do nothing. I gave it some screenshots, and even with very big 20 point font text it produced nothing.

          xsane, the scanner programme I use, uses gocr for optical character recognition. I gave it a try and it struggled on a letter from a bank, for example:
          Code:
          Changes _o help pratect you and your account - yauil start to notice extra checks when
          you're shoppìng and banking antine, and also when you_e paying someone new to hetp you
          It did better with the 20 point font screenshot, not perfect but only 4 errors in 800 bytes.
          Regards, John Little


            #6
            > Well, in my 20.04 installation

            Hmmmmm - is Kubuntu 20.04 now available? I had thought of waiting for that.

            > I wonder what you meant by "does not run".

            That when I click the icon I briefly get an icon down on the task manager with the blue ring circling, but that then stops and nothing more happens. If I use the K symbol at bottom left corner and search for OCRAD I am offered 'run OCRAD' but a click on that just drops the pop up list - no other reaction on screen.

            > it ran but did nothing

            How long did you have to wait before it ran?

            > xsane, the scanner programme I use

            I might have a look at that but as the original need has now passed I might just give up, but my thanks to everyone who has tried to help.


              #7
              20.04 has been available for a while. It hasn't been released yet, but you can get a pre-release ISO.
              I find it quite solid as it is. Better than 18.04 in just about every respect.

              I tried gocr (command line) a while ago with pretty dismal results.


                #8
                Maybe it's not recognizing the character set, or the image wasn't made at a resolution of at least 300 dpi?
                From the OCRAD manual:
                2 Character sets

                The character set internally used by ocrad is ISO 10646, also known as UCS (Universal Character Set), which can represent over two thousand million characters (2^31).
                As it is unpractical to try to recognize one among so many different characters, you can tell ocrad what character sets to recognize. You do this with the '--charset' option.
                If the input page contains characters from only one character set, say 'ISO-8859-15', you can use the default 'byte' output format. But in a page with 'ISO-8859-9' and 'ISO-8859-15' characters, you can't tell if a code of 0xFD represents a 'latin small letter i dotless' or a 'latin small letter y with acute'. You should use '--format=utf8' instead. Of course, you may request UTF-8 output in any case.
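
The 0xFD ambiguity in that manual excerpt is easy to check for yourself. This is just a small Python sketch using the standard codecs, nothing ocrad-specific; the variable names are mine.

```python
import unicodedata

# The single byte the manual mentions
AMBIGUOUS_BYTE = bytes([0xFD])

# Same byte, two different characters depending on the assumed charset
as_iso_8859_9 = AMBIGUOUS_BYTE.decode("iso8859_9")
as_iso_8859_15 = AMBIGUOUS_BYTE.decode("iso8859_15")

print(unicodedata.name(as_iso_8859_9))   # LATIN SMALL LETTER DOTLESS I
print(unicodedata.name(as_iso_8859_15))  # LATIN SMALL LETTER Y WITH ACUTE

# In UTF-8 the two characters get distinct byte sequences, so nothing is lost
print(as_iso_8859_9.encode("utf-8").hex())   # c4b1
print(as_iso_8859_15.encode("utf-8").hex())  # c3bd
```

That is why the manual suggests '--format=utf8' whenever a page may mix character sets.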
                Around 2006 I was asked to OCR about 800 images of legal documents and was given the opportunity to choose the OCR app that I wanted. I tried the one that came with SuSE 6.3 at the time, ocrad. My first few scans produced about one error per sentence, sometimes more. It was very sensitive and touchy about scan settings, and had no column or block recognition at the time. So, I got permission to purchase Abbyy FineReader, from Russia. It bound itself to the GPU of the machine I was using and couldn't be installed on a second machine. It was amazing. IIRC, the error rate was about one error per 10 pages. It included spell checking, which increased its accuracy significantly, and the output was laid out like the image, in blocks and columns, in formats which included .doc, .xls, html, pdf, csv and more. The process was so easy that they quickly gave the job to a clerk and I got a new monitor.

                The Abbyy site says that they are "out of stock".
                "A nation that is afraid to let its people judge the truth and falsehood in an open market is a nation that is afraid of its people."
                – John F. Kennedy, February 26, 1962.
