com.sparsity.sparksee.gdb
Class AttributeStatistics

java.lang.Object
  extended by com.sparsity.sparksee.gdb.AttributeStatistics

public class AttributeStatistics
extends java.lang.Object

Attribute statistics class.

It contains statistical data about an attribute.

Some fields are valid only for numerical attributes and others only for string attributes. In addition, some statistics are considered BASIC because computing them does not require traversing all the different values of the attribute. The documentation of each getter method states whether the statistic is BASIC. See the getAttributeStatistics method of the Graph class or the SPARKSEE User Manual for more details.

Author:
Sparsity Technologies http://www.sparsity-technologies.com

Method Summary
 double getAvgLengthString()
          Gets the average length.
 long getDistinct()
          Gets the number of distinct values (BASIC statistics).
 Value getMax()
          Gets the maximum existing value (BASIC statistics).
 int getMaxLengthString()
          Gets the maximum length.
 double getMean()
          Gets the mean or average.
 double getMedian()
          Gets the median.
 Value getMin()
          Gets the minimum existing value (BASIC statistics).
 int getMinLengthString()
          Gets the minimum length.
 Value getMode()
          Gets the mode.
 long getModeCount()
          Gets the number of objects with a Value equal to the mode.
 long getNull()
          Gets the number of objects with a NULL Value (BASIC statistics).
 long getTotal()
          Gets the number of objects with a non-NULL Value (BASIC statistics).
 double getVariance()
          Gets the variance.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getMin

public Value getMin()
Gets the minimum existing value (BASIC statistics).

Returns:
The minimum existing value.

getMinLengthString

public int getMinLengthString()
Gets the minimum length.

If the attribute is not a string attribute, it returns 0.

Returns:
The minimum length.

getVariance

public double getVariance()
Gets the variance.

It is computed only for numerical attributes.

Returns:
The variance.
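
The following is a minimal sketch of a variance computation over plain Java doubles, not Sparksee code. This page does not say whether the library reports the population variance or the sample variance; the sketch assumes the population variance (dividing by n), and the class name VarianceSketch is hypothetical.

```java
class VarianceSketch {
    // Population variance: mean of the squared deviations from the mean.
    // ASSUMPTION: division by n (population), not n - 1 (sample);
    // the documentation above does not state which one Sparksee uses.
    static double variance(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("variance of an empty list is undefined");
        }
        // First pass: compute the mean.
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;
        // Second pass: accumulate the squared deviations from the mean.
        double squared = 0.0;
        for (double v : values) {
            double d = v - mean;
            squared += d * d;
        }
        return squared / values.length;
    }
}
```

For the values {2, 4, 4, 4, 5, 5, 7, 9} the mean is 5 and the squared deviations sum to 32, so this sketch returns 4.0.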

getMode

public Value getMode()
Gets the mode.

Mode: Most frequent Value.

Returns:
The mode.
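
To illustrate how the mode relates to the count returned by getModeCount, here is a minimal sketch over a plain Java list, not Sparksee code. The names ModeSketch and Result are hypothetical, strings stand in for Value objects, and the sketch assumes that NULL values are ignored. How the library breaks ties between equally frequent values is not documented here; the sketch returns an arbitrary one of them.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ModeSketch {
    // Pairs the most frequent value with its number of occurrences.
    static final class Result {
        final String mode;
        final long count;

        Result(String mode, long count) {
            this.mode = mode;
            this.count = count;
        }
    }

    // Returns the most frequent non-null value and how often it occurs.
    // For an input without non-null values the result is (null, 0).
    static Result mode(List<String> values) {
        // Count the occurrences of every non-null value.
        Map<String, Long> frequency = new HashMap<>();
        for (String v : values) {
            if (v != null) {
                frequency.merge(v, 1L, Long::sum);
            }
        }
        // Keep the entry with the highest count (ties: arbitrary winner).
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> e : frequency.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return new Result(best, bestCount);
    }
}
```

For the list ("a", "b", "a", NULL, "c", "a") the sketch yields the mode "a" with a count of 3.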

getNull

public long getNull()
Gets the number of objects with a NULL Value (BASIC statistics).

Returns:
The number of objects with a NULL Value.

getDistinct

public long getDistinct()
Gets the number of distinct values (BASIC statistics).

Returns:
The number of distinct values.

getMean

public double getMean()
Gets the mean or average.

Mean or average: Sum of all Values divided by the number of observations.

It is computed only for numerical attributes.

Returns:
The mean.
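
The definition above can be sketched in plain Java as follows. This is an illustration of the formula only, not Sparksee code, and the class name MeanSketch is hypothetical.

```java
class MeanSketch {
    // Arithmetic mean: sum of all values divided by the number of observations.
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("mean of an empty list is undefined");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```

For the values {1, 2, 3, 4} the sum is 10 and there are 4 observations, so the sketch returns 2.5.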

getMax

public Value getMax()
Gets the maximum existing value (BASIC statistics).

Returns:
The maximum existing value.

getMedian

public double getMedian()
Gets the median.

Median: Middle value that separates the higher half from the lower.

If a < b < c, then the median of the list {a, b, c} is b. If a < b < c < d, then the median of the list {a, b, c, d} is the mean of b and c, that is, (b + c)/2.

It is computed only for numerical attributes.

Returns:
The median.
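
The odd and even cases described above can be sketched in plain Java as follows. This is an illustration of the definition, not Sparksee code, and the class name MedianSketch is hypothetical.

```java
import java.util.Arrays;

class MedianSketch {
    // Returns the median of the given values; the input array is not modified.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("median of an empty list is undefined");
        }
        // Sort a copy so that the middle element(s) can be read by index.
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            // Odd number of values: the single middle value.
            return sorted[mid];
        }
        // Even number of values: the mean of the two middle values.
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

For {3, 1, 2} the sorted list is {1, 2, 3} and the sketch returns 2.0; for {4, 1, 3, 2} it returns (2 + 3)/2 = 2.5.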

getTotal

public long getTotal()
Gets the number of objects with a non-NULL Value (BASIC statistics).

Returns:
The number of objects with a non-NULL Value.
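
To show how the counts returned by getTotal, getNull and getDistinct relate to each other, here is a minimal sketch over a plain Java list, not Sparksee code. The class name CountSketch is hypothetical, strings stand in for Value objects, and the sketch assumes that NULL is not counted as a distinct value, which this page does not state.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CountSketch {
    // Returns an array {total, nulls, distinct} where
    //   total    = number of entries with a non-null value,
    //   nulls    = number of entries with a null value,
    //   distinct = number of different non-null values (ASSUMPTION: null excluded).
    static long[] counts(List<String> values) {
        long total = 0;
        long nulls = 0;
        Set<String> distinct = new HashSet<>();
        for (String v : values) {
            if (v == null) {
                nulls++;
            } else {
                total++;
                distinct.add(v);
            }
        }
        return new long[] {total, nulls, distinct.size()};
    }
}
```

For the list ("x", NULL, "y", "x", NULL) the sketch reports a total of 3, 2 NULL entries and 2 distinct values.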

getMaxLengthString

public int getMaxLengthString()
Gets the maximum length.

If the attribute is not a string attribute, it returns 0.

Returns:
The maximum length.

getAvgLengthString

public double getAvgLengthString()
Gets the average length.

If the attribute is not a string attribute, it returns 0.

Returns:
The average length.
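
The three string-length statistics (minimum, maximum and average length) can be sketched together in plain Java. This is not Sparksee code: the names LengthSketch and LengthStats are hypothetical, lengths are measured with String.length() (UTF-16 code units), which may differ from the unit the library uses, and NULL values are assumed to be skipped.

```java
import java.util.List;

class LengthSketch {
    // Holds the minimum, maximum and average length of a set of strings.
    static final class LengthStats {
        final int min;
        final int max;
        final double avg;

        LengthStats(int min, int max, double avg) {
            this.min = min;
            this.max = max;
            this.avg = avg;
        }
    }

    // Computes the length statistics over the non-null strings.
    // Returns all zeros when there is no non-null string.
    static LengthStats lengths(List<String> values) {
        int min = Integer.MAX_VALUE;
        int max = 0;
        long sum = 0;
        long count = 0;
        for (String v : values) {
            if (v == null) {
                continue; // ASSUMPTION: null values do not contribute
            }
            int len = v.length();
            min = Math.min(min, len);
            max = Math.max(max, len);
            sum += len;
            count++;
        }
        if (count == 0) {
            return new LengthStats(0, 0, 0.0);
        }
        return new LengthStats(min, max, (double) sum / count);
    }
}
```

For the strings ("a", "abc", "ab") the sketch yields a minimum length of 1, a maximum length of 3 and an average length of 2.0.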

getModeCount

public long getModeCount()
Gets the number of objects with a Value equal to the mode.

Returns:
The number of objects with a Value equal to the mode.