
The FITS installation is moving to Docker. A new article will describe how to build and deploy FITS for SIdora.

The File Information Tool Set (FITS) identifies, validates, and extracts technical metadata for various file formats. It wraps several third-party open source tools, normalizes and consolidates their output, and reports any errors. Note that our operational experience has shown FITS to be sensitive to Java and the other languages used by its individual tools, so caution is advised when performing updates.

The FITS Web Service (FITSservlet) is a project that allows FITS to be deployed as a service on either Tomcat or JBoss. The project has been built with Java 8. (It has been tested on Tomcat 7, Tomcat 8, and minimally tested on JBoss 7.1.)

FITS Install (Step-by-step guide)

  1. Install the raw FITS package where <fits-home> is the installation directory:

    cd <fits-home>
    sudo wget http://projects.iq.harvard.edu/files/fits/files/fits-1.2.0.zip
    sudo unzip fits-1.2.0.zip
    sudo chown -R fedora:fedora fits-1.2.0
    sudo ln -s /<fits-home>/fits-1.2.0 fits
    sudo ln -s /<fits-home>/fits/fits.sh /usr/bin/fits
    sudo chmod a+x /<fits-home>/fits/fits.sh
    sudo chown -h fedora:fedora fits
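
The result of step 1 can be checked before moving on. The following is a minimal sketch, not part of the FITS distribution: the function name check_fits_layout is hypothetical, and it assumes the fits symlink and fits.sh sit directly under the directory passed in.

```shell
#!/bin/sh
# Hypothetical helper: verify the layout created in step 1.
# $1 is the installation directory (<fits-home> in this guide).
check_fits_layout() {
    home="$1"
    # The "fits" entry should be a symlink that resolves to a directory.
    if [ ! -L "$home/fits" ] || [ ! -d "$home/fits" ]; then
        echo "missing or broken symlink: $home/fits" >&2
        return 1
    fi
    # fits.sh must be executable for the /usr/bin/fits link to work.
    if [ ! -x "$home/fits/fits.sh" ]; then
        echo "not executable: $home/fits/fits.sh" >&2
        return 1
    fi
    echo "layout ok: $home/fits"
}
```

Calling check_fits_layout with the installation directory as its only argument prints a single confirmation line, or an error on stderr with a non-zero exit status.
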
  2. Edit <fits-home>/fits/fits-env.sh:

    # Comment out FITS_HOME and set it to the install directory
    #FITS_HOME=`dirname $FITS_ENV_SCRIPT`
    FITS_HOME="/<fits-home>/fits"
     
    # OPTIONAL FOR SOME SYSTEMS (SEE FITS ERROR TIP BELOW)
    LD_LIBRARY_PATH="$FITS_HOME/tools/mediainfo/linux"
    export LD_LIBRARY_PATH

    ParallelGCThreads

    By default, the JVM allocates a set of garbage-collection threads at start-up, and the number of threads is calculated from the number of CPU cores/threads. This can push the running user past its maximum-process limit, especially when multiple JVMs are created in a short period and their threads are not released fast enough. We set the ParallelGCThreads Java option in fits.sh to limit the threads; an alternative is to raise the system limit on the maximum number of processes for the given user.

    There was an internal FITS configuration, primarily in the EXIF tools, for handling specific cameras for the Camera Trap project. Testing has indicated that it is no longer needed, so it has been dropped from this installation for FITS 1.0 and later. A copy of the old code has been archived at SI should there ever be a need to reference it.

    fits.sh
    cmd="java -classpath \"$APPCLASSPATH\" edu.harvard.hul.ois.fits.Fits $args"
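
The fits.sh command line shown above does not include the garbage-collection option itself. As a hedged sketch of where it could go: -XX:ParallelGCThreads is a standard HotSpot JVM flag, but the value 2 and the placeholder classpath and arguments below are assumptions, since the source does not state the values used.

```shell
#!/bin/sh
# Hypothetical value: the source does not state the thread count used.
GC_THREADS=2

# Placeholders standing in for values that fits.sh computes itself.
APPCLASSPATH="${APPCLASSPATH:-/opt/sidora/fits/lib/*}"
args="${args:--v}"

# Same command line as in fits.sh, with the GC thread cap added.
cmd="java -XX:ParallelGCThreads=${GC_THREADS} -classpath \"$APPCLASSPATH\" edu.harvard.hul.ois.fits.Fits $args"
echo "$cmd"
```

For the alternative approach, the current per-user process limit can be inspected in bash with ulimit -u.
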
  3. To better support metadata for RAW images (NEF, DNG), edit <fits-home>/xml/fits.xml and move the Exiftool line to the top of the tools list. This lets FITS give priority to Exiftool when tools conflict while generating metadata. Normally, especially for NEF images, the FITS output has multiple identification sections for the tools that were able to process the file. Some tools may not identify the mimetype correctly, and the first mimetype listed in the identification section takes priority when generating the metadata for the FITS output.

    <fits-home>/xml/fits.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <fits_configuration>
    	<!-- Order of the tools determines preference -->
    	<tools>
    		<!-- exclude-exts attribute is a comma delimited list of file extensions that the tool should not try to process -->
            <!-- include-exts attribute is a comma delimited list of file extensions that are the only ones the tool will process -->
            <!-- classpath-dirs attribute is a list of directories where any tool-specific Java JAR files and configuration files used solely by these JAR files -->
            <tool class="edu.harvard.hul.ois.fits.tools.exiftool.Exiftool" exclude-exts="txt,wps,vsd,jar,avi,mov,mpg,mpeg,mkv,mp4,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv,m2ts,mpeg4" classpath-dirs="lib/exiftool" />
            <tool class="edu.harvard.hul.ois.fits.tools.mediainfo.MediaInfo" include-exts="avi,mov,mpg,mpeg,mkv,mp4,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv,m2ts,mpeg4" classpath-dirs="lib/mediainfo" />
            <tool class="edu.harvard.hul.ois.fits.tools.oisfileinfo.AudioInfo" include-exts="wav" classpath-dirs="lib/audioinfo" />
            <tool class="edu.harvard.hul.ois.fits.tools.oisfileinfo.ADLTool" include-exts="adl" classpath-dirs="lib/adltool" />
            <tool class="edu.harvard.hul.ois.fits.tools.oisfileinfo.VTTTool" include-exts="vtt" />
            <tool class="edu.harvard.hul.ois.fits.tools.droid.Droid"  exclude-exts="odm" classpath-dirs="lib/droid" />
            <tool class="edu.harvard.hul.ois.fits.tools.jhove.Jhove" exclude-exts="dng,mbx,mbox,arw,adl,eml,java,doc,docx,docm,odt,rtf,pages,wpd,wp,epub,csv,avi,mov,mpg,mpeg,mkv,mp4,mpeg4,m2ts,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv,pcd" classpath-dirs="lib/jhove" />
            <tool class="edu.harvard.hul.ois.fits.tools.fileutility.FileUtility" exclude-exts="dng,wps,adl,jar,epub,csv" classpath-dirs="lib/fileutility" />
            <!--<tool class="edu.harvard.hul.ois.fits.tools.exiftool.Exiftool" exclude-exts="txt,wps,vsd,jar,avi,mov,mpg,mpeg,mkv,mp4,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv,m2ts,mpeg4" classpath-dirs="lib/exiftool" />-->
            <tool class="edu.harvard.hul.ois.fits.tools.nlnz.MetadataExtractor" include-exts="bmp,gif,jpg,jpeg,wp,wpd,odt,doc,pdf,mp3,bfw,flac,html,xml,arc" classpath-dirs="lib/nzmetool,xml/nlnz"/>
            <tool class="edu.harvard.hul.ois.fits.tools.oisfileinfo.FileInfo" classpath-dirs="lib/fileinfo" />
            <tool class="edu.harvard.hul.ois.fits.tools.oisfileinfo.XmlMetadata" include-exts="xml" classpath-dirs="lib/xmlmetadata" />
            <tool class="edu.harvard.hul.ois.fits.tools.ffident.FFIdent" exclude-exts="dng,wps,vsd,jar,ppt,rtf" classpath-dirs="lib/ffident" />
            <tool class="edu.harvard.hul.ois.fits.tools.tika.TikaTool" exclude-exts="jar,avi,mov,mpg,mpeg,mkv,mp4,mpeg4,m2ts,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv" classpath-dirs="lib/tika"/>
    	</tools>
    	
    	<output>
    		<dataConsolidator class="edu.harvard.hul.ois.fits.consolidation.OISConsolidator"/>
    		<display-tool-output>false</display-tool-output>
    		<report-conflicts>true</report-conflicts>	
    		<validate-tool-output>false</validate-tool-output>
    		<internal-output-schema>xml/fits_output.xsd</internal-output-schema>
    		<external-output-schema>http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd</external-output-schema>
    		<fits-xml-namespace>http://hul.harvard.edu/ois/xml/ns/fits/fits_output</fits-xml-namespace>
    		<enable-statistics>true</enable-statistics>
    		<enable-checksum>true</enable-checksum>
    		<!-- The below controls the exclusion of the checksum for certain files, even if enable-checksum is true -->
    		<!-- Video Exclusions -->
    		<!-- <checksum-exclusions exclude-exts="avi,mov,mpg,mkv,mp4,mxf,ogv,mj2,divx,dv,m4v,m2v,ismv"/> -->
    		<!-- Audio Exclusions -->
    		<!-- <checksum-exclusions exclude-exts="wav,aif,mp3,mp4,m4a,ra,rm"/> -->
    	</output>
    	
    	<process>
    		<max-threads>20</max-threads>
    	</process>
    	
    	<!-- file name of the droid signature file to use in tools/droid/-->
    	<droid_sigfile>DROID_SignatureFile_V82.xml</droid_sigfile>
    	
    	<!-- the fits home is used by the MediaInfo tool to load the jna api libs  -->
    	<!-- in most cases you won't need to change -->
    	<!-- example for BB will be /fits -->
    	<fits_home>.</fits_home>	
    		
    </fits_configuration>
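
Because tool order determines preference, it can help to confirm which tool is actually first after editing. This is a minimal sketch with a hypothetical function name; it assumes every <tool> element and every comment sits on a single line, as in the listing above, and it is not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the class of the first active <tool> entry.
# $1 is the path to fits.xml. Lines containing a comment opener are
# skipped, so commented-out tools are ignored.
first_active_tool() {
    grep -v '<!--' "$1" \
        | grep -o '<tool class="[^"]*"' \
        | head -n 1 \
        | sed 's/^<tool class="//; s/"$//'
}
```

Run against the configuration above, the function should print the Exiftool class name if the edit was applied as described.
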
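  4. To provide consistent camera make and model fields in the FITS output, edit /opt/sidora/fits/xml/exiftool/exiftool_image_to_fits.xslt so that it matches the following section. The scanner <xsl:otherwise> branch is commented out and replaced with one that emits the digital camera fields: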

    vim /opt/sidora/fits/xml/exiftool/exiftool_image_to_fits.xslt
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    
    <xsl:import href="exiftool_common_to_fits.xslt"/>
    <xsl:template match="/">
    
        <fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">   
    
    		<xsl:apply-imports/>
    		
    		<metadata>
    		<image>
    		...
    		...
    		...	
    			<!--  make and model are ambiguous for identifying 
    			      a digital camera or scanner source
    			      so only use them as digital camera values
    			      if the file source or shutter speed is specified,
    			      else assume the device is a scanner -->
    			<xsl:choose>
    				<xsl:when test="exiftool/FileSource='Digital Camera'">
    					<digitalCameraManufacturer>
    					    <xsl:value-of select="exiftool/Make"/>
    					</digitalCameraManufacturer>
    					<digitalCameraModelName>
    					    <xsl:value-of select="exiftool/Model"/>
    					</digitalCameraModelName>
    				</xsl:when>
    				<xsl:when test="exiftool/ShutterSpeedValue">
    					<digitalCameraManufacturer>
    					    <xsl:value-of select="exiftool/Make"/>
    					</digitalCameraManufacturer>
    					<digitalCameraModelName>
    					    <xsl:value-of select="exiftool/Model"/>
    					</digitalCameraModelName>
    				</xsl:when>					
    				<!--<xsl:otherwise>
    					<scannerManufacturer>
    						<xsl:value-of select="exiftool/Make"/>
    					</scannerManufacturer>
    					<scannerModelName>
    						<xsl:value-of select="exiftool/Model"/>
    					</scannerModelName>
    				</xsl:otherwise>-->
                                    <xsl:otherwise>
                                            <digitalCameraManufacturer>
                                                <xsl:value-of select="exiftool/Make"/>
                                            </digitalCameraManufacturer>
                                            <digitalCameraModelName>
                                                <xsl:value-of select="exiftool/Model"/>
                                            </digitalCameraModelName>
                                    </xsl:otherwise>
    			</xsl:choose>
    			...
    			...
    			...
    </xsl:stylesheet>
  5. Make Exiftool executable (required for NEF image conversion to JPEG in derivative routes):

    Make Exiftool executable
    sudo chmod a+x /opt/sidora/fits/tools/exiftool/perl/exiftool
  6. Check the version:

    [user@host ~]# cd # Any neutral directory
    [user@host ~]# fits -v
    1.2.0
  7. Test the Installation:

    [user@host ~]# fits -i /<fits-home>/fits/README.md
     
    #Example FITS output for the above README.md
     
    Jan 24, 2017 3:11:17 PM edu.harvard.hul.ois.jhove.JhoveBase init
    SEVERE: Testing SEVERE level
    <?xml version="1.0" encoding="UTF-8"?>
    <fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://hul.harvard.edu/ois/xml/ns/fits/fits_output http://hul.harvard.edu/ois/xml/xsd/fits/fits_output.xsd" version="1.0.4" timestamp="1/24/17 3:11 PM">
     <identification>
      <identity format="Plain text" mimetype="text/plain" toolname="FITS" toolversion="1.0.4">
       <tool toolname="Jhove" toolversion="1.11" />
       <tool toolname="file utility" toolversion="5.11" />
      </identity>
     </identification>
     <fileinfo>
      <size toolname="Jhove" toolversion="1.11">5232</size>
      <filepath toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">/opt/sidora/fits-1.0.4/README.md</filepath>
      <filename toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">README.md</filename>
      <md5checksum toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">c1c9cf2a4be3845c2497f82fddb16e64</md5checksum>
      <fslastmodified toolname="OIS File Information" toolversion="0.2" status="SINGLE_RESULT">1481043440000</fslastmodified>
     </fileinfo>
     <filestatus>
      <well-formed toolname="Jhove" toolversion="1.11" status="SINGLE_RESULT">true</well-formed>
      <valid toolname="Jhove" toolversion="1.11" status="SINGLE_RESULT">true</valid>
     </filestatus>
     <metadata>
      <text>
       <linebreak toolname="Jhove" toolversion="1.11" status="SINGLE_RESULT">LF</linebreak>
       <charset toolname="Jhove" toolversion="1.11">US-ASCII</charset>
      </text>
     </metadata>
     <statistics fitsExecutionTime="1313">
      <tool toolname="MediaInfo" toolversion="0.7.75" status="did not run" />
      <tool toolname="OIS Audio Information" toolversion="0.1" status="did not run" />
      <tool toolname="ADL Tool" toolversion="0.1" status="did not run" />
      <tool toolname="VTT Tool" toolversion="0.1" status="did not run" />
      <tool toolname="Droid" toolversion="6.1.5" executionTime="361" />
      <tool toolname="Jhove" toolversion="1.11" executionTime="1263" />
      <tool toolname="file utility" toolversion="5.11" executionTime="1244" />
      <tool toolname="Exiftool" toolversion="10.00" executionTime="1240" />
      <tool toolname="NLNZ Metadata Extractor" toolversion="3.6GA" status="did not run" />
      <tool toolname="OIS File Information" toolversion="0.2" executionTime="295" />
      <tool toolname="OIS XML Metadata" toolversion="0.2" status="did not run" />
      <tool toolname="ffident" toolversion="0.2" executionTime="1080" />
      <tool toolname="Tika" toolversion="1.10" executionTime="590" />
     </statistics>
    </fits>
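
The test output can also be checked mechanically instead of by eye. The sketch below is an assumption-laden illustration: the function name fits_mimetype is hypothetical, it relies on the <identity> element fitting on one line as in the example output, and it uses text matching, not XML parsing.

```shell
#!/bin/sh
# Hypothetical helper: read FITS XML on stdin and print the mimetype
# attribute of the first <identity> element. Prints nothing if no
# mimetype attribute is found.
fits_mimetype() {
    grep -o '<identity [^>]*>' \
        | head -n 1 \
        | sed -n 's/.*mimetype="\([^"]*\)".*/\1/p'
}
```

Assuming FITS writes its XML to standard output, piping the output of the test command in step 7 into fits_mimetype should print text/plain for the README.md example.
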

    FITS ERROR

    If you get the following error running FITS:

    "Exception in thread "main" edu.harvard.hul.ois.fits.exceptions.FitsToolException: Error loading native library for MediaInfo please check that fits_home is properly set
    at edu.harvard.hul.ois.fits.tools.mediainfo.MediaInfo.<init>(MediaInfo.java:92)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at edu.harvard.hul.ois.fits.tools.ToolBelt.init(ToolBelt.java:136)
    at edu.harvard.hul.ois.fits.tools.ToolBelt.<init>(ToolBelt.java:72)
    at edu.harvard.hul.ois.fits.Fits.<init>(Fits.java:239)
    at edu.harvard.hul.ois.fits.Fits.<init>(Fits.java:121)
    at edu.harvard.hul.ois.fits.Fits.<init>(Fits.java:110)
    at edu.harvard.hul.ois.fits.Fits.main(Fits.java:290)"

    Edit <fits-home>/fits/fits-env.sh and add the following after "export FITS_HOME":

    LD_LIBRARY_PATH="$FITS_HOME/tools/mediainfo/linux"
    export LD_LIBRARY_PATH
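
The assignment above replaces any value LD_LIBRARY_PATH already holds. If other software on the host depends on an existing value, a variation that prepends the MediaInfo directory may be safer. This is a sketch of that variation, not what the source configuration does; the FITS_HOME fallback is only there so the snippet runs on its own.

```shell
#!/bin/sh
# Fallback for running this sketch standalone; fits-env.sh sets FITS_HOME itself.
FITS_HOME="${FITS_HOME:-/opt/sidora/fits}"

# Prepend the MediaInfo directory. The ${VAR:+...} expansion adds the
# ":" separator only when LD_LIBRARY_PATH already has a value.
LD_LIBRARY_PATH="$FITS_HOME/tools/mediainfo/linux${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```
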


    For more on this, see:

    http://projects.iq.harvard.edu/fits/fitsfaq (Video Support Details Section)

    https://github.com/galterlibrary/digital-repository/issues/480#issuecomment-247592570

    https://github.com/harvard-lts/fits/issues/120


FITS Web Service Install (Step-by-step guide)

  1. Stop Fedora

    # sudo service fedora stop
  2. Edit Fedora Tomcat catalina.properties

    1. Add Entries to <FEDORA_CATALINA_HOME>/conf/catalina.properties
      1. Add the fits.home property:

        fits.home=/opt/sidora/fits
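
To avoid adding the property twice when the step is repeated, the edit can be made idempotent. The following is a minimal sketch with a hypothetical function name; it assumes the properties file ends with a newline and that the property is written without spaces around the equals sign.

```shell
#!/bin/sh
# Hypothetical helper: append fits.home to a properties file only if no
# fits.home line is present yet.
# $1 = path to catalina.properties, $2 = FITS installation directory.
ensure_fits_home() {
    if grep -q '^fits\.home=' "$1"; then
        echo "fits.home already set in $1"
    else
        printf 'fits.home=%s\n' "$2" >> "$1"
        echo "added fits.home=$2 to $1"
    fi
}
```
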
  3. Deploy FITSservlet to Fedora's Tomcat

    # wget -P <FEDORA_CATALINA_HOME>/webapps/ http://projects.iq.harvard.edu/files/fits/files/fits-1.1.3.war
    # chown fedora:fedora <FEDORA_CATALINA_HOME>/webapps/fits-1.1.3.war
  4. Add all <fits.home>/lib/ JAR files to the /usr/local/fedora/tomcat/webapps/fits-1.1.3/WEB-INF/lib directory.

    cp <fits.home>/lib/*.jar /usr/local/fedora/tomcat/webapps/fits-1.1.3/WEB-INF/lib/

    Note: Do NOT add any JAR files that are contained in any of the FITS lib/ subdirectories to this classpath entry. They are added programmatically at runtime by the application. (Additional Information: https://github.com/harvard-lts/FITSservlet)

  5. Start Fedora

    # service fedora start
  6. Test FITS Web Service

    1. Obtain the version of FITS that the service uses to examine input files. It is returned as plain text.
      1. Using Curl:

        # curl --get http://localhost:8080/fits-1.1.3/version
        
        
        1.2.0
      2. Using Browser:

        http://localhost:8080/fits-1.1.3/version
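
The version reported by the web service can be compared with the one reported by the command-line install. This sketch uses hypothetical helper names and only handles the comparison; it strips all whitespace first because the curl output above contains blank lines before the version number.

```shell
#!/bin/sh
# Hypothetical helper: strip all whitespace, including the blank lines
# that precede the version number in the curl output above.
trim_version() {
    tr -d '[:space:]'
}

# Hypothetical helper: succeed only when both version strings match
# after trimming.
same_version() {
    [ "$(printf '%s' "$1" | trim_version)" = "$(printf '%s' "$2" | trim_version)" ]
}
```

One possible use, assuming both commands print nothing but the version on standard output: same_version "$(curl -s --get http://localhost:8080/fits-1.1.3/version)" "$(fits -v)" && echo "versions match".
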
