Manual Reference Pages  -  STATISTICS::DESCRIPTIVE::DISCRETE (3)

NAME

Statistics::Descriptive::Discrete - Compute descriptive statistics for discrete data sets.

SYNOPSIS



  use Statistics::Descriptive::Discrete;

  my $stats = Statistics::Descriptive::Discrete->new;
  $stats->add_data(1,10,2,0,1,4,5,1,10,8,7);
  print "count = ",$stats->count(),"\n";
  print "uniq  = ",$stats->uniq(),"\n";
  print "sum = ",$stats->sum(),"\n";
  print "min = ",$stats->min(),"\n";
  print "max = ",$stats->max(),"\n";
  print "mean = ",$stats->mean(),"\n";
  print "standard_deviation = ",$stats->standard_deviation(),"\n";
  print "variance = ",$stats->variance(),"\n";
  print "sample_range = ",$stats->sample_range(),"\n";
  print "mode = ",$stats->mode(),"\n";
  print "median = ",$stats->median(),"\n";



DESCRIPTION

This module provides basic functions used in descriptive statistics. It borrows very heavily from Statistics::Descriptive::Full (which is included with Statistics::Descriptive), with one major difference: this module is optimized for discretized data, e.g. data from an A/D conversion that has a discrete set of possible values. For example, if your data is produced by an 8-bit A/D converter, you’d have only 256 possible values in your data set. Even though you might have a million data points, you’d only have 256 different values in those million points. Instead of storing the entire data set as Statistics::Descriptive does, this module stores only the values it’s seen and the number of times it’s seen each value.
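The storage idea can be illustrated in plain Perl. The following is a minimal sketch, not the module's actual code: the tally() helper and the variable names are hypothetical, and it assumes the values are kept in a hash that maps each distinct value to its number of occurrences.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): tally values into a hash
# that maps each distinct value to the number of times it was seen.
sub tally {
    my %seen;
    $seen{$_}++ for @_;
    return \%seen;
}

my $seen = tally(1, 10, 2, 0, 1, 4, 5, 1, 10, 8, 7);

# Count and sum are derived from the tally alone,
# with one step per distinct value rather than per data point.
my ($count, $sum) = (0, 0);
for my $value (keys %$seen) {
    $count += $seen->{$value};
    $sum   += $value * $seen->{$value};
}
my $uniq = scalar keys %$seen;

print "count = $count\n";   # 11
print "uniq  = $uniq\n";    # 8
print "sum   = $sum\n";     # 49
```

With this layout the memory used depends on the number of distinct values (8 here), not on the number of data points (11 here).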

For very large data sets, this storage method results in significant speed and memory improvements. In a test case with 2.6 million data points from a real-world application, Statistics::Descriptive::Discrete took 40 seconds to calculate a set of statistics instead of the 561 seconds required by Statistics::Descriptive::Full. It also required only 4 MB of RAM instead of the 400 MB used by Statistics::Descriptive::Full for the same data set.
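One plausible reason for the saving is that order statistics such as the median can be read from value counts by sorting only the distinct values. The sketch below illustrates this under stated assumptions; median_from_tally() is a hypothetical function, not the module's implementation, and it takes a hash reference mapping each value to its count.

```perl
use strict;
use warnings;

# Hypothetical function: median of a data set given as value => count.
sub median_from_tally {
    my ($seen) = @_;

    my $count = 0;
    $count += $_ for values %$seen;
    return unless $count;    # no data, no median

    # 1-based ranks of the middle element(s); they coincide when count is odd.
    my $lo_rank = int(($count + 1) / 2);
    my $hi_rank = int($count / 2) + 1;

    # Walk the distinct values in ascending order, accumulating counts
    # until the running total reaches each middle rank.
    my ($running, $lo, $hi) = (0);
    for my $value (sort { $a <=> $b } keys %$seen) {
        $running += $seen->{$value};
        $lo = $value if !defined $lo && $running >= $lo_rank;
        if ($running >= $hi_rank) {
            $hi = $value;
            last;
        }
    }
    return ($lo + $hi) / 2;
}

# Stands in for the data set (1,1,7,7,7,7,42,42,42).
print median_from_tally({ 1 => 2, 7 => 4, 42 => 3 }), "\n";   # 7
```

The sort touches only the distinct values, so for 8-bit data it handles at most 256 keys no matter how many points were added.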

METHODS

$stat = Statistics::Descriptive::Discrete->new(); Create a new statistics object.
$stat->add_data(1,2,3,4,5); Adds data to the statistics object. Sets a flag so that the statistics will be recomputed the next time they’re needed.
$stat->add_data_tuple(1,2,42,3); Adds data to the statistics object, where every two elements are a value and a count (the number of times the value occurred). The above is equivalent to $stat->add_data(1,1,42,42,42); Use this when your data is already in the form of ($value, $occurrence) pairs.
$stat->max(); Returns the maximum value of the data set.
$stat->min(); Returns the minimum value of the data set.
$stat->count(); Returns the total number of elements in the data set.
$stat->uniq(); Returns the total number of unique elements in the data set. For example, if your data set is (1,2,2,3,3,3), uniq will return 3.
$stat->sum(); Returns the sum of all the values in the data set.
$stat->mean(); Returns the mean of the data.
$stat->median(); Returns the median value of the data.
$stat->mode(); Returns the mode of the data.
$stat->variance(); Returns the variance of the data.
$stat->standard_deviation(); Returns the standard deviation of the data.
$stat->sample_range(); Returns the sample range (max - min) of the data set.
$stat->get_data(); Returns a copy of the data array. Note: This array could be very large and would thus defeat the purpose of using this module. Make sure you really need it before using get_data().
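
As a short usage sketch of the methods above, the following loads the same data once with add_data_tuple() and once with add_data(), then compares a few results. It uses only methods documented here and assumes the module is installed; the variable names are illustrative.

```perl
use strict;
use warnings;
use Statistics::Descriptive::Discrete;

# Value 1 seen twice, value 42 seen three times, given as (value, count) pairs.
my $by_tuple = Statistics::Descriptive::Discrete->new();
$by_tuple->add_data_tuple(1, 2, 42, 3);

# The same data set spelled out value by value.
my $by_value = Statistics::Descriptive::Discrete->new();
$by_value->add_data(1, 1, 42, 42, 42);

# Print each statistic from both objects side by side.
for my $method (qw(count uniq sum min max)) {
    printf "%-5s tuple=%s value=%s\n",
        $method, $by_tuple->$method(), $by_value->$method();
}
```

Going by the method descriptions, both columns should agree on every line, since the two calls describe the same data set.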

NOTE

The interface for this module is almost identical to Statistics::Descriptive. This module is incomplete and not fully tested.

BUGS

o Code for calculating mode is not as robust as it should be.
o Other bugs are lurking I’m sure.

TODO

o Make test suite more robust
o Add rest of methods (at least ones that don’t depend on original order of data) from Statistics::Descriptive

AUTHOR

Rhet Turnbull, RhetTbull on perlmonks.org, rhettbull at hotmail.com

If you find this code useful, I would appreciate an email letting me know.

CREDIT

Thanks to the following individuals for finding bugs, providing feedback, and submitting changes:
o Peter Dienes for finding and fixing a bug in the variance calculation.
o Bill Dueber for suggesting the add_data_tuple method.

COPYRIGHT



  Copyright (c) 2002 Rhet Turnbull. All rights reserved.  This
  program is free software; you can redistribute it and/or modify it
  under the same terms as Perl itself.

  Portions of this code are from Statistics::Descriptive which is under
  the following copyrights:

  Copyright (c) 1997,1998 Colin Kuskie. All rights reserved.  This
  program is free software; you can redistribute it and/or modify it
  under the same terms as Perl itself.

  Copyright (c) 1998 Andrea Spinelli. All rights reserved.  This program
  is free software; you can redistribute it and/or modify it under the
  same terms as Perl itself.

  Copyright (c) 1994,1995 Jason Kastner. All rights
  reserved.  This program is free software; you can redistribute it
  and/or modify it under the same terms as Perl itself.



SEE ALSO

Statistics::Descriptive


perl v5.20.3 DISCRETE (3) 2002-06-13
