JHove Batch Processing
What is JHove?
JHove is a Java utility that identifies and validates file formats. It is used extensively in the digital preservation field.
The problem
At work I was given the task of getting JHove to batch process all the files in a folder. It handles individual documents correctly through the GUI, but we could not find any examples of it being run from the command line against a batch of files. After looking at it for a while, it seemed that we would have to build a separate application to call JHove for each file in a folder.
Eventually we worked out how to do batch processing using JHove alone.
Here is the batch file we created:
java -jar c:\jhove\bin\JHoveApp.jar -o c:\output.xml -h audit -c C:\jhove\conf\jhove.conf "C:\Documents and Settings\DigitalArchives\Desktop\testjhove"
Parameters:
Send the results to c:\output.xml
-o c:\output.xml
The type of output you want. You need to set it to 'audit' in order to do batch processing
-h audit
Use the configuration file c:\jhove\conf\jhove.conf (this is the default config file; it can be edited to specify which types of files you want to test for)
-c c:\jhove\conf\jhove.conf
The folder that contains the files to audit
"C:\Documents and Settings\DigitalArchives\Desktop\testjhove"
Improvements
Things that would improve the process:
- Allow multiple folders to be audited
- Create a front end for the process that will allow you to pick a folder to audit and choose an output file.
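As a rough starting point for the first improvement, here is a minimal Java sketch that runs the same command once per folder and writes one output file per folder. It reuses only the flags and paths from the batch file above. The class name JhoveBatch, the helper methods, and the audit-N.xml naming scheme are assumptions made up for illustration; they are not part of JHove.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs the JHove audit command once for each folder.
class JhoveBatch {
    // Paths copied from the batch file above; adjust them for your installation.
    static final String JAR = "c:\\jhove\\bin\\JHoveApp.jar";
    static final String CONF = "C:\\jhove\\conf\\jhove.conf";

    // Builds the argument list for one audit run over a single folder.
    // ProcessBuilder passes each element as a separate argument, so a folder
    // path containing spaces does not need surrounding quotes here.
    static List<String> buildCommand(String folder, String outputFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(JAR);
        cmd.add("-o");
        cmd.add(outputFile);
        cmd.add("-h");
        cmd.add("audit");
        cmd.add("-c");
        cmd.add(CONF);
        cmd.add(folder);
        return cmd;
    }

    // Assumed naming scheme: audit-1.xml, audit-2.xml, ... inside outputDir.
    static String outputFor(String outputDir, int index) {
        return outputDir + "\\audit-" + index + ".xml";
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java JhoveBatch <outputDir> <folder> [<folder> ...]");
            return;
        }
        // args[0] is the output directory; every remaining argument is a folder to audit.
        for (int i = 1; i < args.length; i++) {
            List<String> cmd = buildCommand(args[i], outputFor(args[0], i));
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            int exitCode = process.waitFor();
            System.out.println(args[i] + " finished with exit code " + exitCode);
        }
    }
}
```

After compiling, it could be run along the lines of: java JhoveBatch c:\out "C:\folder one" "C:\folder two". Running JHove once per folder keeps each folder's results in its own file, at the cost of starting a new Java process each time.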
Please let me know if you are using JHove and how you are using it.