

o accuracy
This measures the proportion of all decisions that were correct. It is defined as (a+d)/(a+b+c+d). It falls in the range from 0 to 1, with 1 being the best score. Note that macro-accuracy and micro-accuracy will always give the same number.
o error
This measures the proportion of all decisions that were incorrect. It is defined as (b+c)/(a+b+c+d). It falls in the range from 0 to 1, with 0 being the best score. Note that macro-error and micro-error will always give the same number.
o precision
This measures the portion of the assigned categories that were correct. It is defined as a/(a+b). It falls in the range from 0 to 1, with 1 being the best score.
o recall
This measures the portion of the correct categories that were assigned. It is defined as a/(a+c). It falls in the range from 0 to 1, with 1 being the best score.
o F1
This measures an even combination of precision and recall. It is defined as 2*p*r/(p+r). In terms of a, b, and c, it may be expressed as 2a/(2a+b+c). It falls in the range from 0 to 1, with 1 being the best score.
Sometimes it’s worth trying to maximize the accuracy score, but accuracy (and its counterpart error) are considered fairly crude scores that don’t give much information about the performance of a categorizer.
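To see why accuracy alone can mislead, the formulas above can be evaluated by hand on a lopsided table. The following is a minimal stand-alone sketch, not the module's own code: the helper names `ratio` and `metrics` and the sample counts are invented for illustration, and the sketch returns undef for a zero denominator, which may differ from how Statistics::Contingency treats those cases.

```perl
use strict;
use warnings;

# Hypothetical helper: divide, or return undef when the denominator is zero.
sub ratio {
    my ($num, $den) = @_;
    return $den ? $num / $den : undef;
}

# Hypothetical helper: the five scores from the four table cells.
# $na..$nd stand for a, b, c, d in the formulas above
# (plain $a and $b are avoided because Perl reserves them for sort).
sub metrics {
    my ($na, $nb, $nc, $nd) = @_;
    my $total = $na + $nb + $nc + $nd;
    return {
        accuracy  => ratio($na + $nd, $total),
        error     => ratio($nb + $nc, $total),
        precision => ratio($na, $na + $nb),
        recall    => ratio($na, $na + $nc),
        F1        => ratio(2 * $na, 2 * $na + $nb + $nc),
    };
}

# Made-up counts: a rare category that the categorizer almost never assigns.
my $m = metrics(1, 0, 9, 990);
printf "%-9s %.3f\n", $_, $m->{$_} for qw(accuracy error precision recall F1);
```

With these counts the accuracy is 0.991 even though recall is only 0.1, because the large d cell dominates the accuracy formula.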
The general execution flow when using this class is to create a Statistics::Contingency object, add a bunch of results to it, and then report on the results.
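That flow might look like the sketch below. The category names and the three results are made up, and the sketch assumes the categories parameter is passed as a reference to an array of names; the methods it calls are the ones described in this section.

```perl
use strict;
use warnings;
use Statistics::Contingency;

# Illustrative data only: three made-up categories and documents.
my @categories = qw(sports politics finance);

# Assumption: categories is given as an array reference.
my $e = Statistics::Contingency->new(categories => \@categories);

# Each call takes the assigned categories, the correct categories,
# and an optional name used only in diagnostics.
$e->add_result(['sports'],              ['sports'],   'doc1');
$e->add_result(['politics', 'finance'], ['politics'], 'doc2');
$e->add_result('sports',                'finance',    'doc3');  # single-string form

# Report a single statistic, then several in table form.
print "Micro F1: ", $e->micro_F1, "\n";
print $e->stats_table;
```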
o $e = Statistics::Contingency->new() Returns a new Statistics::Contingency object. Expects a categories parameter specifying the entire set of categories that may be assigned during this experiment. Also accepts a verbose parameter; if true, some diagnostic status information will be displayed when certain actions are performed.
o $e->add_result($assigned_categories, $correct_categories, $name) Adds a new result to the experiment. The lists of assigned and correct categories can be given as an array of category names (strings), as a hash whose keys are the category names and whose values are anything logically true, or as a single string if there is only one category.
If you’ve already got the lists in hash form, this will be the fastest way to pass them. Otherwise, the current implementation will convert them to hash form internally in order to make its calculations efficient.
The $name parameter is an optional name for this result. It will only be used in error messages or debugging/progress output.
In the current implementation, we only store the contingency tables per category, as well as a table for the entire result set. This means that you can’t recover information about any particular single result from the Statistics::Contingency object.
o $e->set_entries($a, $b, $c, $d) If you don’t wish to use the add_result() interface, but still want to take advantage of the calculation methods and the various edge cases they handle, you can directly set the four elements of the contingency table with this method.
o $e->micro_accuracy Returns the microaveraged accuracy for the data set.
o $e->micro_error Returns the microaveraged error for the data set.
o $e->micro_precision Returns the microaveraged precision for the data set.
o $e->micro_recall Returns the microaveraged recall for the data set.
o $e->micro_F1 Returns the microaveraged F1 for the data set.
o $e->macro_accuracy Returns the macroaveraged accuracy for the data set.
o $e->macro_error Returns the macroaveraged error for the data set.
o $e->macro_precision Returns the macroaveraged precision for the data set.
o $e->macro_recall Returns the macroaveraged recall for the data set.
o $e->macro_F1 Returns the macroaveraged F1 for the data set.
o $e->stats_table Returns a string combining several statistics in one graphic table. Because accuracy is 1 minus error, only error is reported, which takes less space to print. An optional argument specifies the number of significant digits to show in the data; the default is 3 significant digits.
o $e->category_stats Returns a hash reference whose keys are the names of each category, and whose values contain the various statistical measures (accuracy, error, precision, recall, and F1) about each category as a hash reference. For example, to print a single statistic:
    print $e->category_stats->{sports}{recall}, "\n";
Or to print certain statistics for all categories:
    my $stats = $e->category_stats;
    while (my ($cat, $value) = each %$stats) {
        print "Category $cat: \n";
        print " Accuracy: $value->{accuracy}\n";
        print " Precision: $value->{precision}\n";
        print " F1: $value->{F1}\n";
    }
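The difference between the micro_* and macro_* methods can be illustrated without the module. The sketch below assumes the usual definitions (micro-averaging pools the per-category tables and scores the pooled table; macro-averaging scores each category and averages the scores). The tables and the helper names `precision`, `accuracy`, and `macro` are made up, and zero denominators are not handled.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Made-up per-category tables, each listed as [a, b, c, d].
my %table = (
    common => [90, 10, 10, 890],
    rare   => [ 1,  4,  9, 986],
);

# Scores for one table: precision = a/(a+b), accuracy = (a+d)/(a+b+c+d).
sub precision { my $t = shift; return $t->[0] / ($t->[0] + $t->[1]) }
sub accuracy  { my $t = shift; return ($t->[0] + $t->[3]) / sum(@$t) }

# Micro-averaging: add the tables cell by cell, then score the pooled table.
my @pooled = (0, 0, 0, 0);
for my $t (values %table) {
    $pooled[$_] += $t->[$_] for 0 .. 3;
}

# Macro-averaging: score each category separately, then average the scores.
sub macro {
    my $score = shift;
    my $count = scalar(keys %table);
    return sum(map { $score->($_) } values %table) / $count;
}

printf "micro precision %.4f  macro precision %.4f\n",
    precision(\@pooled), macro(\&precision);
printf "micro accuracy  %.4f  macro accuracy  %.4f\n",
    accuracy(\@pooled), macro(\&accuracy);
```

Here the micro-averaged precision is dominated by the large category while the macro-averaged precision weights both categories equally. The two accuracy figures agree because every category's table covers the same number of decisions.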
Ken Williams <kwilliams@cpan.org>
Copyright 2002-2008 Ken Williams. All rights reserved. This distribution is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.20.3  STATISTICS::CONTINGENCY (3)  2013-06-09