What is CoPhIR?

The CoPhIR (Content-based Photo Image Retrieval) Test Collection was developed to run large-scale tests of the scalability of the similarity search infrastructure of the SAPIR project (SAPIR: Search In Audio Visual Content Using Peer-to-peer IR).

CoPhIR is the result of a joint effort of the NMIS Lab and the HPC Lab of ISTI-CNR in Pisa, Italy.

We are extracting metadata from the Flickr archive, using the EGEE European GRID, through the DILIGENT project.

For each image, the standard MPEG-7 image features have been extracted. Each entry of the test-bed contains:

  • The link to the corresponding entry on the Flickr website
  • The photo image thumbnail
  • An XML structure with the Flickr user information in the corresponding Flickr entry: title, location, GPS, tags, comments, etc.
  • An XML structure with 5 extracted standard MPEG-7 image features:
    • Scalable Colour
    • Colour Structure
    • Colour Layout
    • Edge Histogram
    • Homogeneous Texture
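
The entry layout listed above can be pictured as a small record type. The following is only a minimal sketch in Python: the class name, field names and descriptor identifiers are hypothetical and do not reproduce the element names of the CoPhIR XML files, and the plain L1 distance stands in for the descriptor-specific metrics that MPEG-7 tools normally use. It is meant to show how five feature vectors per image could be combined into a single dissimilarity value for similarity search.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Identifiers for the five descriptors listed above. These names are
# hypothetical and are NOT the element names of the CoPhIR XML files.
DESCRIPTORS = (
    "scalable_colour",
    "colour_structure",
    "colour_layout",
    "edge_histogram",
    "homogeneous_texture",
)


@dataclass
class CophirEntry:
    """One test-bed entry (assumed in-memory layout, for illustration only)."""
    flickr_url: str                    # link to the entry on Flickr
    thumbnail_path: str                # where the thumbnail is stored
    user_info: Dict[str, str]          # title, location, tags, etc.
    features: Dict[str, List[float]]   # descriptor name -> feature vector


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def entry_distance(p: CophirEntry, q: CophirEntry,
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of per-descriptor L1 distances (weight 1.0 by default)."""
    weights = weights or {}
    total = 0.0
    for name in DESCRIPTORS:
        d = l1_distance(p.features[name], q.features[name])
        total += weights.get(name, 1.0) * d
    return total


if __name__ == "__main__":
    # Toy vectors; real descriptors are much longer.
    a = CophirEntry("url-a", "a.jpg", {"title": "A"},
                    {n: [1.0, 2.0] for n in DESCRIPTORS})
    b = CophirEntry("url-b", "b.jpg", {"title": "B"},
                    {n: [2.0, 4.0] for n in DESCRIPTORS})
    print(entry_distance(a, b))
```

A weighted sum keeps each descriptor's contribution adjustable, which is one common way to combine several visual features into one distance; the weights used here are placeholders.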

The data collected so far represent the world's largest multimedia metadata collection available for research on scalable similarity search techniques. CoPhIR consists of 106 million processed images.

CoPhIR is now available to the research community to try out and compare different indexing technologies for similarity search, with scalability being the key issue.

Our use of the Flickr image content is compliant with the Creative Commons license. The CoPhIR Test Collection is compliant with the European Recommendation 29/2001 CE, based on the WIPO (World Intellectual Property Organization) Copyright Treaty and Performances and Phonograms Treaty, and with the current Italian law 68/2003.

To access the CoPhIR distribution, organizations (universities, research labs, etc.) interested in running experiments on it must sign the enclosed CoPhIR Access Agreement and the CoPhIR Access Registration Form, and send the original signed documents to us by mail. Please follow the instructions in the section “How to get CoPhIR Test Collection”. You will then receive a login and password to download the required files.