A tool used to evaluate the output of retrieval algorithms. Written in Rust.

Overview

Rusteval

Build Status

A tool used to evaluate the output of retrieval algorithms. Written in Rust.

Building

Install Rust with

curl -sSf https://static.rust-lang.org/rustup.sh | sh

and build the release version with

git clone 
cd rusteval
cargo build --release

Testing

Before testing, unzip the test result file found in the fixtures/ folder with (unzipped this is >100Mb):

gunzip fixtures/G1_TRACK_I_Bentham.xml.gz

Run the test suite with

cargo test --release

or, for a more verbose output

cargo test --release -- --nocapture

Running

After building and testing, run rusteval with

target/release/rusteval  

The fixtures/ folder contains some examples of relevance and result files (see below for an explanation of what these files are). For example, in order to reproduce some of the results of the ICFHR'14 keyword spotting competition, you can run

target/release/rusteval fixtures/TRACK_I_Bentham_ICFHR2014.RelevanceJudgements.xml fixtures/G1_TRACK_I_Bentham.xml

This should produce the results of evaluation of method 'G1' for the 'Bentham' track of the competition. Results show up for each of the selected queries, and averaged over all queries. The last lines of the output should read something like

MEAN:  precAt5    precAt10   ap
=======================================================================
       0.73813    0.60268    0.52402

This output means that mean precision at 5 is 73.8%, mean precision at 10 is 60.2%, and mean average precision (MAP) is 52.4% for the submitted method.

The retrieval paradigm, relevance and result files

The retrieval paradigm typically presupposes a finite set of queries, each associated with a finite set of matching tokens.

A retrieval algorithm returns an ordered list for each query, representing all tokens from best to worst match.

This information is necessary for evaluation. Input to the tool is read from two distinct text files, the relevance file and the result file.

The relevance file tells us:

  • What and how many are our queries
  • With what matching tokens does each query actually match

The result file tells us:

  • What is the ordered list of matching tokens, from best to worst match, for each query

Supported input file formats

trec_eval format

This format has been originally introduced for use with the trec_eval evaluation software.

Relevance file

Relevance files follow the format

qid  0  docno  rel

for each text line.

The line above tells us that query with id qid matches with token docno. The degree that the query and each token match is encoded as the floating-point value rel, taking values in [0, 1]. A perfect match has rel = 1.

Sample relevance file:

cv1 0 tok1 1
cv1 0 tok2 1
cv1 0 tok3 0
cv2 0 tok1 0
cv2 0 tok2 0
cv2 0 tok3 1

This tells us that query cv1 matches with tokens tok1 and tok2 but not tok3; query cv2 matches with token tok3 only.

Results file

Result files follow the format

qid 0 docno rank sim run_id

for each text line.

rank is an integer that is ignored but required by the format, and has to be in the range [0, 1000] according the documentation. sim is a floating-point value. Higher sim corresponds to a better match. run_id is also required but ignored.

According to the docs, the file has to be sorted according to qid.

Sample result file:

cv1 0 April_d06-086-09 0 -0.960748 hws
cv1 0 April_d05-008-03 1 -1.307986 hws
cv1 0 April_p03-181-00 2 -1.372011 hws
cv1 0 April_d05-021-05 3 -1.394318 hws
cv1 0 April_e06-053-07 4 -1.404273 hws
cv1 0 April_g01-025-09 5 -1.447217 hws
cv1 0 April_g01-027-03 6 -1.453828 hws
cv1 0 April_p03-072-03 7 -1.556320 hws
cv1 0 April_g01-008-03 8 -1.584332 hws
cv1 0 April_n01-045-05 9 -1.682590 hws

This shows results for matches with query cv1. The best match is April_d06-086-09, the worst match is April_n01-045-05. Note again that is is the rank value that encodes the order of the matches, i.e. the penultimate floating-point number in each line.

icfhr'14 keyword spotting format

This format is adapted to be used with keyword spotting, a form of image retrieval where retrieved elemens are word images, typically cropped off a containing document image. It has been used for the ICFHR'14 keyword spotting competition.

Tokens are defined with an XML word tag, that must contain the following fields in this particular order:

  • document
  • x
  • y
  • width
  • height
  • Text (optional)
  • Relevance (optional; default value = 1)

Note also that rusteval requires that each line must contain at most one XML tag.

Relevance file

Sample relevance file:

">
xml version="1.0" encoding="utf-8"?>
<GroundTruthRelevanceJudgements xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <GTRel queryid="query1">
    <word document="027_029_001" x="159" y="1775" width="184" height="89" Text="possess" Relevance="1" />
    <word document="027_029_001" x="860" y="1774" width="180" height="89" Relevance="1" />
  GTRel>
  <GTRel queryid="query2">
    <word document="027_029_001" x="1490" y="1769" width="176" height="86" Relevance="1" />
    <word document="071_053_004" x="354" y="790" width="319" height="108" Text="possesst" Relevance="0.7" />
    <word document="027_029_001" x="1460" y="178" width="298" height="98" Relevance="0.6" />
  GTRel>
GroundTruthRelevanceJudgements>

Result file

The quality of the match is encoded by the order in which the token appears in the file.

Sample result file:

">
xml version="1.0" encoding="utf-8"?><RelevanceListings xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Rel queryid="query1">
    <word document="027_029_001" x="159" y="1775" width="184" height="89" />
    <word document="027_029_001" x="860" y="1774" width="180" height="89" />
    <word document="027_029_001" x="1490" y="1769" width="176" height="86" />
    <word document="027_029_001" x="1015" y="2182" width="189" height="87" />
    <word document="071_053_004" x="92" y="607" width="220" height="138" />
  Rel>
  <Rel queryid="query2">
    <word document="027_029_001" x="1015" y="2182" width="189" height="87" />
    <word document="071_053_004" x="92" y="607" width="220" height="138" />
    <word document="027_029_001" x="159" y="1775" width="184" height="89" />
    <word document="027_029_001" x="860" y="1774" width="180" height="89" />
    <word document="027_029_001" x="1490" y="1769" width="176" height="86" />
  Rel>
RelevanceListings>

In this example, for query2 the best match is document="027_029_001" x="1015" y="2182" width="189" height="87", and the worst match is document="027_029_001" x="1490" y="1769" width="176" height="86".

Metrics

Precision at 5

Precision at 5 is defined as the ratio of the number of instances, among the k closest matches, that are correctly retrieved, divided by k. For Precision at 5, k equals to 5, or the total number of possible matches if this number is less than 5 (the software provided with the keyword spotting competition of ICFHR 2014 also uses this convention). Precision at 10 is defined in an analogous manner.

Average Precision

Average precision is defined as the weighted average of 'Precisions at k' for all possible values of k. The weight depends on k and equals to one if the k-th retrieved instance is a match. Otherwise it equals to zero.

For more details, see

@ARTICLE{Giotis17,
    title = "A survey of document image word spotting techniques",
    author = "A. P. Giotis and G. Sfikas and B. Gatos and C. Nikou",
    journal = "Pattern Recognition",
    volume = "68",
    number = "",
    pages = "310 - 332",
    year = "2017",
    publisher = "Elsevier"
}
You might also like...
Radiate is a parallel genetic programming engine capable of evolving solutions to many problems as well as training learning algorithms.

Radiate Coming from Evolutionary Radiation. Evolutionary radiation is a rapid increase in the number of species with a common ancestor, characterized

Compare binary files using alignment algorithms

biodiff Compare binary files using alignment algorithms. What is this This is a tool for binary diffing. The tool is able to show two binary files sid

Genetic Algorithms Library

genx genx provides modular building blocks to run simulations of optimization and search problems using Genetic Algorithms (GA). The vision for genx i

DSP algorithms for embedded. Often integer math.

This crate contains some tuned DSP algorithms for general and especially embedded use.

ECFFT algorithms on the BN254 base field

ECFFT algorithms on the BN254 base field This crate implements structs and traits for the ECFFT algorithms from the paper Elliptic Curve Fast Fourier

This library provides implementations of many algorithms and data structures that are useful for bioinformatics.

This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.

A small collection of procedural dungeon generation algorithms.

mapgen A small collection of procedural dungeon generation algorithms. Built with Rust & macroquad. WebAssembly build deployed here. Animated version

AI plays a small escape room game, written in rust
AI plays a small escape room game, written in rust

Escape AI AI learns escape a room. This is a rust based implementation of a genetic algorithm and reinforcement learning simulation that trains an AI

Genetic Algorithm library in Rust

RsGenetic Summary and Features RsGenetic is a framework for executing genetic algorithms in Rust. It is designed to have a simple but modular API. Exa

Comments
  • Analyze result file

    Analyze result file

    Input is a result file and a ground truth file.

    The output is

    • Statistics of metrics; moments other than the mean
    • Which queries fare better, which queries fare worse
    • Print a z-score per query, based on the AP statistics of the whole trial.
    functionality 
    opened by sfikas 0
  • Compare two result files

    Compare two result files

    Input should be two result files and a ground truth file.

    The output should be a synopsis of differences of the output of the two results, per query . For example, method B is better than A by +1% better MAP, because query xx gives a significantly better AP.

    functionality 
    opened by sfikas 0
  • Implement functionality for the segmentation-free paradigm

    Implement functionality for the segmentation-free paradigm

    Related issues

    1. what happens if a detector returns 0 hits for a query ? p at X should be 0 -- we need to 'punish' this detector since it evidently failed .
    • what if a detector returns a set of bboxes B, where a subset S of B contains bboxes that correspond to hits for a single query ? Solution: Only one of the bboxes in S should count as a hit.
    • The reason is that if we count more than one bbox in S as a hit, (1) the effectiveness of this detector appears to be boosted (when it really isn't) (2) we open the possibility for a detector to cheat, just by submitting a S with large |S|.
    • we should keep only the 'best' hit in S, in some well-defined sense.
    functionality 
    opened by sfikas 0
Releases(v0.5)
Owner
Giorgos Sfikas
Bsc/Msc/PhD in Comp.Sci., BA in History&Archaeology. Currently visiting lecturer at the University of West Attica and researcher at NCSR "Demokritos".
Giorgos Sfikas
Fast, parallel, extensible and adaptable genetic algorithms framework written in Rust

oxigen Oxigen is a parallel genetic algorithm framework implemented in Rust. The name comes from the merge of OXIdación (Rust translated to Spanish) a

Martín Pozo 146 Dec 18, 2022
A browser app that evolves vehicles using genetic algorithms, written in Rust and Bevy

Vehicle Evolver Deluxe This is a simulation that uses AI (to be specific: genetic algorithms) to try to build better and better vehicles. The vehicles

null 95 Dec 26, 2022
darwin-rs, evolutionary algorithms with rust

darwin-rs This library allows you to write evolutionary algorithms (EA) using the Rust programming language. Written by Willi Kappler, License: MIT -

Willi Kappler 95 Jan 1, 2023
This crate implements fast route planning algorithms in Rust.

This crate implements fast route planning algorithms in Rust. Algorithms Currently implemented: Contraction Hierarchies: The implementat

Payas Rajan 4 Aug 15, 2021
"Algorithms for approximate string matching" in Rust, with Python bindings.

ukkonen Implementation of a bounded Levenshtein distance by Esko Ukkonen in "Algorithms for approximate string matching" in Rust, with Python bindings

Ethan Smith 1 Dec 1, 2021
Common data structures and algorithms for competitive programming in Rust

algorithm-rs algorithm-rs is common data structures and algorithms for competitive programming in Rust. Contents TBA How To Contribute Contributions a

Chris Ohk 16 Dec 21, 2022
zine/book about bitmap drawing algorithms and math with code examples in Rust

A Bitmapper's Companion - zine/book about bitmap drawing algorithms and math with code examples in Rust A small zine/book written in LaTeX. In progres

Manos Pitsidianakis 42 Nov 8, 2022
Algorithms implemented in Rust, explained.

Rusty Algorithms & Data Structures for Learners This repository presents Rust implementations of common algorithms and data structures, many of which

Tianyi Shi 327 Dec 19, 2022
Bandit Algorithms in Rust

Multi-armed bandit algorithms in Rust Cargo [dependencies] bandit = "0.12.3" Description and Scope This library currently only implements the annealin

Michael Bohn 30 Nov 24, 2022
A Rust implementation the HOTP and TOTP Algorithms

xotp A Rust implementation the HOTP and TOTP Algorithms. HOTP was implemented in accordance with RFC4226 TOTP was implemented in accordance with RFC62

Tejas Mehta 3 Jan 21, 2022