Monday:

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (K-S) test measures how similar two data sets are by finding the largest distance between their cumulative distribution functions. The IDL function kstwo takes two data sets as input and returns the K-S statistic "D" and the corresponding probability "prob". If prob is small, the two data sets are unlikely to come from the same distribution. I ran this test on my data and a couple of sets of random data, taking the mean and standard deviation over numerous trials for comparison. I got the kinds of results I was expecting between the random sets, but got two differing results on the GOODS stars depending on whether I included my whole star catalog or limited it to the 27th magnitude.
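To make the procedure concrete, here is a minimal Python sketch of a two-sample K-S test and of the mean/standard-deviation summary over several trials. It is not the kstwo source: the function names (ks_probability, ks_two_sample, summarize_trials) are hypothetical, the probability uses the standard asymptotic series with the usual effective-sample-size correction, and I am assuming that "1 set vs 9 sets" means one reference set compared against each of nine other sets in turn.

```python
import math
import random
import statistics


def ks_probability(lam):
    """Asymptotic K-S significance: 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 k^2 lam^2)."""
    total = 0.0
    for k in range(1, 101):
        term = 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        total += term
        if abs(term) < 1e-12:
            return min(max(total, 0.0), 1.0)
    # The series did not converge (lam is tiny): the samples are indistinguishable.
    return 1.0


def ks_two_sample(data1, data2):
    """Return (d, prob): the largest gap between the two empirical CDFs
    and the probability of a gap at least that large arising by chance."""
    if not data1 or not data2:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(data1), sorted(data2)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every point equal to x (handles ties).
        while i < n1 and a[i] == x:
            i += 1
        while j < n2 and b[j] == x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    en = math.sqrt(n1 * n2 / (n1 + n2))  # effective sample size
    prob = ks_probability((en + 0.12 + 0.11 / en) * d)
    return d, prob


def summarize_trials(reference, trials):
    """Compare one reference set against each trial set (needs >= 2 trials)."""
    results = [ks_two_sample(reference, trial) for trial in trials]
    ds = [d for d, _ in results]
    probs = [p for _, p in results]
    return {
        "d-mean": statistics.mean(ds),
        "d-stdev": statistics.stdev(ds),  # sample standard deviation (N - 1)
        "prob-mean": statistics.mean(probs),
        "prob-stdev": statistics.stdev(probs),
    }


# Illustration only: uniform random numbers stand in for the real catalogs.
rng = random.Random(0)
reference = [rng.random() for _ in range(50)]
trials = [[rng.random() for _ in range(50)] for _ in range(9)]
summary = summarize_trials(reference, trials)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The demo at the bottom uses made-up uniform random numbers, so its output is not comparable to the values listed under Results; it only shows the shape of the calculation.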

Results

GOODS-N to 27th mag vs. Random 1 (Normalized*, 1 set vs 9 sets)

d-mean: 0.19797699

d-stdev: 0.04482917

prob-mean: 0.69592268

prob-stdev: 0.2092225

*The first time I ran it, I hadn't yet normalized the sets, so they had varying total populations, which resulted in even higher values of d and lower probabilities.

GOODS-N to 28th mag vs. Random 1 (Normalized, 1 set vs 9 sets)

d-mean: 0.089285724

d-stdev: 0.055698492

prob-mean: 0.99998375

prob-stdev: 0.18113585

GOODS-S to 27th mag vs. Random 1 (Normalized, 1 set vs 9 sets)

d-mean: 0.16683391

d-stdev: 0.027475102

prob-mean: 0.84035881

prob-stdev: 0.12022707

GOODS-S to 28th mag vs. Random 1 (Normalized, 1 set vs 9 sets)

d-mean: 0.12184878

d-stdev: 0.042824080

prob-mean: 0.99533697

prob-stdev: 0.19727988

Random 2 vs. Random 3 (1 set vs 9 sets)

d-mean: 0.14321605

d-stdev: 0.055629589

prob-mean: 0.92319757

prob-stdev: 0.20686308

Random 2 vs. Random 3 (9 sets vs 9 sets)

d-mean: 0.12757371

d-stdev: 0.032610029

prob-mean: 0.98843256

prob-stdev: 0.070053501

Random 4 vs. Random 5 (100 sets vs 100 sets)

d-mean: 0.1610636

d-stdev: 0.046950535

prob-mean: 0.8827874

prob-stdev: 0.18292440

Conclusions

I'm more comfortable going with the statistics from the GOODS data limited to the 27th magnitude, since in my earlier work, eliminating the dimmest data points gave me a more purely stellar sample. With that cut, both fields look less like the random sets. I think the South field is still well within range to call "close to random" at about 84%, given the averages and standard deviations for the sets known to be random. I can't say the same as confidently for the North field, at almost 70%, but it lies at the edge of what I'd call random.
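One rough way to put numbers on "within range" is to express each GOODS prob-mean as an offset from a random-vs-random prob-mean, in units of that baseline's standard deviation. The sketch below does this with the values listed under Results. Picking "Random 2 vs. Random 3 (1 set vs 9 sets)" as the baseline is my assumption, made because it matches the 1-vs-9 setup of the GOODS runs, and the helper name sigma_offset is hypothetical.

```python
# prob-mean and prob-stdev copied from "Random 2 vs. Random 3 (1 set vs 9 sets)".
# Using this run as the baseline is an assumption, not something the results dictate.
BASELINE_MEAN = 0.92319757
BASELINE_STDEV = 0.20686308

# prob-mean values copied from the 27th-magnitude GOODS results.
goods_prob_means = {
    "GOODS-N to 27th mag": 0.69592268,
    "GOODS-S to 27th mag": 0.84035881,
}


def sigma_offset(value, mean, stdev):
    """Signed distance of value from mean, in units of stdev."""
    return (value - mean) / stdev


offsets = {
    field: sigma_offset(prob, BASELINE_MEAN, BASELINE_STDEV)
    for field, prob in goods_prob_means.items()
}
for field, offset in offsets.items():
    print(f"{field}: {offset:+.2f} standard deviations from the random baseline")
```

Against this particular baseline, the South field sits roughly 0.4 standard deviations below the random mean and the North field roughly 1.1 below. This is only a rough comparison, since each GOODS value is itself an average over several trials.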
