What is JHove?
JHove is a Java utility that can be used to identify document types. It is used extensively in the digital preservation field.
At work I was given the task of getting JHove to batch process all the files in a folder. It handles individual documents correctly through the GUI; however, there were no examples of it being used from the command line on a batch of files. After looking at it for a while, it seemed that we would have to build a separate application to call JHove for each file in a folder.
Finally we worked out how to do batch processing by using JHove alone.
Here is the batch file we created:
java -jar c:\jhove\bin\JHoveApp.jar -o c:\output.xml -h audit -c C:\jhove\conf\jhove.conf "C:\Documents and Settings\DigitalArchives\Desktop\testjhove"
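For anyone who would rather launch the same command from Java than from a batch file, here is a minimal sketch. It only reuses the flags and paths from the command above; the class name JhoveBatchCommand and the method buildCommand are made up for illustration and are not part of JHove. It assumes java is on the PATH and that JHove is installed where the batch file expects it.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical helper: class and method names are illustrative, not part of JHove.
final class JhoveBatchCommand {

    // Builds the same argument list as the batch file in the post.
    static List<String> buildCommand(String jar, String output, String config, String folder) {
        return List.of(
                "java", "-jar", jar,
                "-o", output,   // send the results to this file
                "-h", "audit",  // 'audit' output, as used in the post for batch processing
                "-c", config,   // configuration file
                folder);        // folder that contains the files to audit
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> command = buildCommand(
                "c:\\jhove\\bin\\JHoveApp.jar",
                "c:\\output.xml",
                "C:\\jhove\\conf\\jhove.conf",
                "C:\\Documents and Settings\\DigitalArchives\\Desktop\\testjhove");
        // Each list element is passed as one argument, so the folder path needs no extra quotes.
        int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
        System.out.println("JHove exited with code " + exitCode);
    }
}
```

Passing the arguments as a list avoids the quoting that the batch file needs for the folder name with spaces in it.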
-o c:\output.xml : Send the results to c:\output.xml.
-h audit : The type of output you want to have. You need to set it to 'audit' in order to do batch processing.
-c C:\jhove\conf\jhove.conf : Use the configuration file c:\jhove\conf\jhove.conf. (This is the default config file, and it can be changed to specify which types of files you want to test for.)
"C:\Documents and Settings\DigitalArchives\Desktop\testjhove" : The folder that contains the files to audit.
Things to improve in the process are as follows:
- Allow multiple folders to be audited
- Create a front end for the process that will allow you to pick a folder to audit and choose an output file.
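As a rough idea of how the first improvement could look, here is a small Java sketch that runs the same audit once per folder and writes a separate output file for each. It is only a sketch under assumptions: the class JhoveMultiFolder, the method buildCommands and the output prefix c:\output- are invented for illustration, and the flags are simply the ones from the batch file in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: names are illustrative, not part of JHove.
final class JhoveMultiFolder {

    // Builds one JHove audit command per folder, numbering the output files from 1.
    static List<List<String>> buildCommands(String jar, String config,
                                            String outputPrefix, List<String> folders) {
        List<List<String>> commands = new ArrayList<>();
        for (int i = 0; i < folders.size(); i++) {
            String output = outputPrefix + (i + 1) + ".xml";
            commands.add(List.of(
                    "java", "-jar", jar,
                    "-o", output,
                    "-h", "audit",
                    "-c", config,
                    folders.get(i)));
        }
        return commands;
    }

    public static void main(String[] args) throws Exception {
        // The folders to audit are taken from the program arguments.
        List<List<String>> commands = buildCommands(
                "c:\\jhove\\bin\\JHoveApp.jar",
                "C:\\jhove\\conf\\jhove.conf",
                "c:\\output-",            // assumed prefix: gives c:\output-1.xml, c:\output-2.xml, ...
                List.of(args));
        for (List<String> command : commands) {
            // Run the audits one after another and wait for each to finish.
            new ProcessBuilder(command).inheritIO().start().waitFor();
        }
    }
}
```

With Java 11 or later this could be started as a single source file, passing each folder as a separate quoted argument. A front end for picking the folder and the output file would sit on top of something like buildCommands.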
Please let me know if you are using JHove and how you are using it.
You can also drag a folder to the JHOVE GUI window, and it will process all the files in the folder.
@gary thanks for your comment. I didn't know you could do that. I'll try it out tomorrow.
Thank you Gary, that helped a lot! Very convenient :-)