Feature selection with the ChiSquare function.
Chi-Square(t,ci) = (N.(AD-CB)^2) / ((A+C).(B+D).(A+B).(C+D))
where t = term
ci = category i
N = number of documents in the collection
A = number of times t and ci co-occur
B = number of times t occurs without ci
C = number of times ci occurs without t
D = number of times neither ci nor t occurs
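
As an illustration, the score can be computed from the four counts with a
short Python sketch. It assumes the full statistic from Yang and Pedersen,
N.(AD-CB)^2 divided by (A+C).(B+D).(A+B).(C+D), and it assumes that A, B, C
and D are document counts, so that N = A + B + C + D. The function name
chi_square and the choice to return 0 for a degenerate table are
hypothetical and not part of the original code.

```python
def chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square score of a term t for a category ci.

    a: documents where t and ci co-occur
    b: documents where t occurs without ci
    c: documents where ci occurs without t
    d: documents where neither ci nor t occurs
    """
    # Assumption: the four cells partition the collection, so N is their sum.
    n = a + b + c + d
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # Assumption: an empty row or column carries no evidence, so score 0.
        return 0.0
    return n * (a * d - c * b) ** 2 / denominator
```

For example, chi_square(40, 10, 10, 40) evaluates to 36.0, while a term that
is independent of the category, such as chi_square(5, 5, 5, 5), scores 0.0.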
For more details, see:
Yiming Yang, Jan O. Pedersen, A Comparative Study on Feature Selection
in Text Categorization, in Proceedings of ICML-97, 14th International
Conference on Machine Learning, 1997.
(available on citeseer.nj.nec.com)