Comments on Geeking with Greg: Size matters? Or simplicity?

2008-05-08T07:18:00.000-07:00

To amend my last comment: I missed the last paragraph; the authors tried mixing techniques to reach 98% accuracy.

2008-05-08T06:33:00.000-07:00

This doesn't mean the other metrics are useless.
I suppose other metrics make different classification errors than the 2000-word predictor, so I expect a combination of several methods would reach a higher score.
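
To illustrate what combining methods could look like, here is a rough sketch of a majority vote over a few simple predictors. It assumes the word-count predictor labels an article as "good" once it passes 2000 words; the other two metrics (`references`, `edits`), their thresholds, and all function names are made up for the example and are not taken from the paper.

```python
from typing import Callable, Dict, List

Article = Dict[str, int]
Predictor = Callable[[Article], bool]


def long_enough(article: Article) -> bool:
    # Word-count predictor; the 2000-word cutoff is assumed to mean "good above it".
    return article["words"] >= 2000


def well_referenced(article: Article) -> bool:
    # Hypothetical metric and threshold, for illustration only.
    return article["references"] >= 20


def heavily_edited(article: Article) -> bool:
    # Hypothetical metric and threshold, for illustration only.
    return article["edits"] >= 500


def majority_vote(article: Article, predictors: List[Predictor]) -> bool:
    # An article is labeled "good" when more than half of the predictors agree.
    votes = sum(1 for predict in predictors if predict(article))
    return votes * 2 > len(predictors)


PREDICTORS = [long_enough, well_referenced, heavily_edited]

# A short article that the word-count predictor alone would reject.
short_but_solid = {"words": 1500, "references": 30, "edits": 800}
# A long article that only the word-count predictor would accept.
long_but_thin = {"words": 2500, "references": 2, "edits": 10}
```

If the predictors really do make different errors, the vote can correct cases where the word count alone gets it wrong, such as the two sample articles above.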

2008-05-04T14:07:00.000-07:00

We found similar correlations between word count and quality at Epinions. Eric.

2008-05-04T14:03:00.000-07:00

Reminds me of "Pivoted Document Length Normalization" (http://citeseer.ist.psu.edu/singhal96pivoted.html). I guess the average length of the featured articles could be a parameter for this technique.
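
A minimal sketch of the idea, assuming the usual pivoted form `(1 - slope) * pivot + slope * length` with the pivot set to an average featured-article length. The 2000-word pivot, the 0.75 slope, and the function names are placeholder assumptions for illustration, not values from the paper.

```python
def pivoted_norm(length: float, pivot: float, slope: float = 0.75) -> float:
    # Pivoted normalization factor: equals `length` exactly at the pivot,
    # is larger than `length` below the pivot and smaller above it.
    return (1.0 - slope) * pivot + slope * length


def normalized_score(raw_score: float, length: float, pivot: float,
                     slope: float = 0.75) -> float:
    # Divide the raw score by the pivoted factor instead of the plain length.
    return raw_score / pivoted_norm(length, pivot, slope)


# Assumed pivot: a hypothetical average featured-article length in words.
FEATURED_AVG_WORDS = 2000.0

at_pivot = pivoted_norm(2000.0, FEATURED_AVG_WORDS)   # same as plain length
short_doc = pivoted_norm(1000.0, FEATURED_AVG_WORDS)  # penalized more than plain length
long_doc = pivoted_norm(4000.0, FEATURED_AVG_WORDS)   # penalized less than plain length
```

Compared with dividing by the raw length, this tilts the normalization around the pivot, so documents longer than the typical featured article are penalized less and shorter ones more.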

I never noticed the Wikipedia featured articles, but it's great to have them: training a classifier to recognize other "good" articles is a fun task, and from the references it looks like this has already been explored.

-- Dave