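I am using Tesseract 3 with Java 8 on 64-bit Windows to OCR a scanned PDF. I followed the instructions on the Tess4J page, used the 64-bit versions of the required DLLs, and installed 64-bit Ghostscript.
When I run the unit test with a plain @Test (no arguments), the code runs correctly, so I believe everything is installed properly.
When I run two threads in parallel (see below), I get exceptions.
I have read a related thread here, but it recommends using Tesseract1, which I am already using (I have tried both).
Any ideas?
This is the code: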
    // @Test // works
    @Test(invocationCount = 2, threadPoolSize = 2)
    public void testOcr() throws OcrException, TesseractException {
        File scannedPdf = new File(this.getClass().getClassLoader().getResource("scanned.pdf").getFile());
        // Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
        Tesseract1 instance = new Tesseract1(); // JNA Direct Mapping
        String str = instance.doOCR(scannedPdf);
        System.out.println("OCR Result: " + str);
    }
This is the exception:
log4j:WARN No appenders could be found for logger (org.ghost4j.Ghostscript).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Ιουλ 16, 2014 6:22:23 ΜΜ net.sourceforge.vietocr.PdfUtilities convertPdf2Png
SEVERE: Cannot initialize Ghostscript interpreter. Error code is -21
org.ghost4j.GhostscriptException: Cannot initialize Ghostscript interpreter. Error code is -21
at org.ghost4j.Ghostscript.initialize(Ghostscript.java:365)
at net.sourceforge.vietocr.PdfUtilities.convertPdf2Png(Unknown Source)
at net.sourceforge.vietocr.PdfUtilities.convertPdf2Tiff(Unknown Source)
at net.sourceforge.vietocr.ImageIOHelper.getIIOImageList(Unknown Source)
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at OcrUtilsTest.testOcr(OcrUtilsTest.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
java.lang.Error: Invalid memory access
at com.sun.jna.Native.invokeInt(Native Method)
at com.sun.jna.Function.invoke(Function.java:383)
at com.sun.jna.Function.invoke(Function.java:315)
at com.sun.jna.Library$Handler.invoke(Library.java:212)
at com.sun.proxy.$Proxy3.gsapi_init_with_args(Unknown Source)
at org.ghost4j.Ghostscript.initialize(Ghostscript.java:350)
at net.sourceforge.vietocr.PdfUtilities.convertPdf2Png(Unknown Source)
at net.sourceforge.vietocr.PdfUtilities.convertPdf2Tiff(Unknown Source)
at net.sourceforge.vietocr.ImageIOHelper.getIIOImageList(Unknown Source)
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at OcrUtilsTest.testOcr(OcrUtilsTest.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
net.sourceforge.tess4j.TesseractException: javax.imageio.IIOException: I/O error reading header!
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at net.sourceforge.tess4j.Tesseract1.doOCR(Unknown Source)
at OcrUtilsTest.testOcr(OcrUtilsTest.java:19)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:714)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: javax.imageio.IIOException: I/O error reading header!
at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.readHeader(TIFFImageReader.java:224)
at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.locateImage(TIFFImageReader.java:231)
at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.getNumImages(TIFFImageReader.java:279)
at net.sourceforge.vietocr.ImageIOHelper.getIIOImageList(Unknown Source)
... 18 more
Caused by: java.io.EOFException
at javax.imageio.stream.ImageInputStreamImpl.readShort(ImageInputStreamImpl.java:229)
at javax.imageio.stream.ImageInputStreamImpl.readUnsignedShort(ImageInputStreamImpl.java:242)
at com.sun.media.imageioimpl.plugins.tiff.TIFFImageReader.readHeader(TIFFImageReader.java:199)
... 21 more
Update: It seems to be related to this.
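
One workaround I am considering is to let only one thread at a time into the OCR call. This rests on an assumption I have not confirmed: that the native Ghostscript initialization (the gsapi_init_with_args frame in the trace) cannot safely run from two threads at once. Below is a minimal sketch of that idea. SerializedOcr, NATIVE_LOCK and maxOverlap are hypothetical names of my own and are not part of Tess4J or Ghost4J, and the engine here is a stand-in function, not a real OCR call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper (not part of Tess4J): lets only one thread at a time into the wrapped call. */
final class SerializedOcr<I, O> {
    // One lock for the whole JVM, on the assumption that the native library is shared per process.
    private static final Object NATIVE_LOCK = new Object();

    private final Function<I, O> engine;

    SerializedOcr(Function<I, O> engine) {
        this.engine = engine;
    }

    /** Runs the wrapped call while holding the shared lock. */
    O run(I input) {
        synchronized (NATIVE_LOCK) {
            return engine.apply(input);
        }
    }

    /** Calls a fake engine from several threads and returns the highest number of overlapping calls seen. */
    static int maxOverlap(int threads) {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        SerializedOcr<String, String> ocr = new SerializedOcr<>(name -> {
            int now = active.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(20); // stand-in for the slow native call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            active.decrementAndGet();
            return "text of " + name;
        });
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> ocr.run("scanned.pdf"));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }
}
```

In the test above, the wrapped function would be the doOCR call on the Tesseract1 instance, with the checked TesseractException wrapped in an unchecked exception so it fits Function. This gives up the parallelism for the OCR step itself, so it would only be a stopgap if the assumption holds.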