What is JHove?
JHove is a Java utility that can be used to identify and validate document formats. It is used extensively in the digital preservation field.
The problem
At work I was given the task of getting JHove to batch process all the files in a folder. It handles individual documents correctly through the GUI, but we could not find any examples of it being run from the command line on a batch of files. After looking at it for a while, it seemed that we would have to build a separate application to call JHove for each file in a folder.
In the end we worked out how to do batch processing using JHove alone.
Here is the batch file we created:
java -jar c:\jhove\bin\JHoveApp.jar -o c:\output.xml -h audit -c C:\jhove\conf\jhove.conf "C:\Documents and Settings\DigitalArchives\Desktop\testjhove"
Parameters:
Send the results to c:\output.xml
-o c:\output.xml
The type of output you want to have (the output handler). You need to set it to 'audit' in order to do batch processing
-h audit
Use the configuration file c:\jhove\conf\jhove.conf (This is the default config file and it can be changed to specify which types of files you want to test for)
-c c:\jhove\conf\jhove.conf
The folder that contains the files to audit
"C:\Documents and Settings\DigitalArchives\Desktop\testjhove"
Improvements
Things that would improve the process are as follows
- Allow multiple folders to be audited
- Create a front end for the process that will allow you to pick a folder to audit and choose an output file.
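As a starting point for the first improvement, here is a rough Python sketch that runs the same command once per folder. It is only an illustration and not something we have in production: the jar and config paths are copied from the batch file above, the function names are made up for this example, and the "one output file per folder, named after the folder" scheme is an assumption of mine, so two folders with the same name would overwrite each other.

```python
import subprocess
from pathlib import PureWindowsPath

# Paths copied from the batch file in this post; adjust them for your install.
JHOVE_JAR = r"c:\jhove\bin\JHoveApp.jar"
JHOVE_CONF = r"C:\jhove\conf\jhove.conf"


def build_command(folder, output_file):
    """Return the argument list for one audit run over a single folder."""
    return [
        "java", "-jar", JHOVE_JAR,
        "-o", output_file,  # where the results are written
        "-h", "audit",      # audit output, as in the batch file
        "-c", JHOVE_CONF,   # configuration file
        folder,             # passed as one list item, so spaces need no extra quotes
    ]


def output_name(folder, output_dir):
    """Hypothetical naming scheme: the folder's own name plus .xml, inside output_dir."""
    return str(PureWindowsPath(output_dir) / (PureWindowsPath(folder).name + ".xml"))


def audit_folders(folders, output_dir):
    """Run JHove once per folder, writing one output file per folder."""
    for folder in folders:
        cmd = build_command(folder, output_name(folder, output_dir))
        # check=True raises CalledProcessError if java exits with a non-zero code
        subprocess.run(cmd, check=True)
```

You would call it with something like audit_folders([r"C:\Documents and Settings\DigitalArchives\Desktop\testjhove"], r"c:\jhove\out"), where the output folder is just an example name and has to exist already. The same build_command function could later sit behind a simple front end that lets you pick the folder and the output file.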
Please let me know if you are using JHove and how you are using it.
3 comments:
You can also drag a folder to the JHOVE GUI window, and it will process all the files in the folder.
@gary thanks for your comment. I didn't know you could do that. I'll try it out tomorrow.
Thank you Gary, that helped a lot! Very convenient :-)
Best,
Yvonne