Stark Labs Affordable, Powerful, and Easy to Use Astrophotography Software


Stacking accuracy

Q: How can I get the sharpest images in my stack using Nebulosity? How does Nebulosity compare to other stacking tools?

Nebulosity has several means of aligning the images prior to actually stacking them. We can use simple translation, translation + rotation, translation + rotation + scaling, and Drizzle. I've covered Drizzle in an article for Astrophoto Insight, so I'll focus on the more traditional methods here.

The big difference between "translation" and "translation + rotation (+ scaling)" is that when doing a translation-only alignment, Nebulosity does not resample the image. It does "whole pixel" registration. This sounds worse than "sub-pixel" registration. Isn't it better to shift by small fractions of a pixel? Well, it would be, except for the fact that when you do so, you need to know what the image looks like shifted a fraction of a pixel. That means you must interpolate the image, and interpolation does cause a loss of sharpness. So, you're faced with a trade-off: keep the image exactly as-is and shift it by whole pixels, or resample it and shift it by fractional pixels.
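To make the trade-off concrete, here is a minimal sketch on a single row of pixels containing one sharp "star". It assumes plain linear interpolation for the fractional shift; that is an illustrative choice, not a description of the resampler Nebulosity actually uses, and the function names are made up for the example.

```python
import math

def shift_whole(row, pixels):
    """Shift by an integer number of pixels; values are copied, never blended."""
    n = len(row)
    return [row[i - pixels] if 0 <= i - pixels < n else 0.0 for i in range(n)]

def shift_linear(row, shift):
    """Shift by a fractional amount using linear interpolation (an assumed,
    simple resampler chosen only for illustration)."""
    n = len(row)
    out = []
    for i in range(n):
        src = i - shift              # where this output pixel comes from
        lo = math.floor(src)         # left neighbor in the source row
        frac = src - lo              # how far we are toward the right neighbor
        a = row[lo] if 0 <= lo < n else 0.0
        b = row[lo + 1] if 0 <= lo + 1 < n else 0.0
        out.append((1 - frac) * a + frac * b)
    return out

star = [0, 0, 0, 100, 0, 0, 0]       # one perfectly sharp star

print(shift_whole(star, 1))          # peak is still 100, just moved over
print(shift_linear(star, 0.5))       # peak is split across two pixels
```

The whole-pixel shift keeps the star's peak at 100. The half-pixel shift spreads the same light over two neighboring pixels at 50 each, which is the loss of sharpness described above. Fancier interpolators blur less than linear interpolation does, but none of them are free.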

Now, toss into this the fact that our long-exposure shots are already blurred by the atmosphere (and to a varying degree from frame to frame) and you've got a mess if you try to determine which is better from just thinking about it. So, we have what we call an "empirical problem." Let's get some data and test it.

I took some data I had from M57 shot with an Atik 16IC at 1800 mm of focal length and some wider-field data of M101 shot on a QHY 2Pro at 800 mm. I ran the M57 data through a number of alignments and Michael Garvin ran the M101 data through several as well.

Here are the images from M57 (click here for full-res PNG file). All were processed identically, save for the alignment / stacking technique.

Here are the images from M101 (click here for full-res PNG version). Again, all were processed identically. Here, the image has been enlarged by 2x and a high-pass filter overlay used to sharpen each (all images were on the same layer in Photoshop so the same exact sharpening was applied).

So what do we take from all this? Well, first, there's not a whole lot of difference among the methods. All seem to do about the same thing. To my eye, adding the "starfield fine tune" flag in Nebulosity helps a touch and using the resampling (adding a rotation component) hurts a touch, but these aren't huge effects. Someday, I'll beef up the resampling algorithm used in the rotation + (scale) version. Comparing Nebulosity's results with those of other programs again seems pretty much a tie. I can't pick out anything in their stacks that I don't see as well in Nebulosity's. Overall, these images seem to be limited more by the actual sharpness of the original data than by the stacking method.

Combining images: means, medians, and standard deviations

Q: I hear medians are a good way to stack images as they can remove things like hot pixels, cosmic rays, or streaks from satellites. Does Nebulosity support this?

The short answer is no ... but... When combining images we want something that helps reduce the noise. We'd also like something that is tolerant of "outliers". The mean (average) is great at the first part but lousy at the second part. Medians are not so hot at the first part but great on the second part. What we'd like is something that is good at both parts. Nebulosity supports standard-deviation based filtering of your images to let you keep most of the frames and pitch just the bad ones.
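Here is a small simulation of that trade-off, as a sketch rather than anything taken from Nebulosity. It assumes Gaussian noise, 25 frames, and made-up pixel values; `spread_of` is a hypothetical helper that measures how much an estimator's answer wobbles from one stack to the next.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

def spread_of(estimator, n_frames=25, trials=4000, true_value=1000.0, sigma=20.0):
    """Build many simulated stacks of one pixel and return how much the
    estimator's result varies between them (lower means less noise)."""
    estimates = [
        estimator([rng.gauss(true_value, sigma) for _ in range(n_frames)])
        for _ in range(trials)
    ]
    return statistics.pstdev(estimates)

# Part 1: noise reduction on clean data (no outliers at all).
mean_noise = spread_of(statistics.fmean)
median_noise = spread_of(statistics.median)
print(f"noise left after mean:   {mean_noise:.2f}")
print(f"noise left after median: {median_noise:.2f}")

# Part 2: tolerance of an outlier (one frame caught a satellite trail).
pixel = [100, 102, 98, 101, 99, 500]
print(statistics.fmean(pixel))    # dragged far above 100 by the single 500
print(statistics.median(pixel))   # 100.5, barely notices it
```

On clean data the median should come out noticeably noisier than the mean, while the single 500 wrecks the mean and leaves the median essentially untouched.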

OK, so what is it and why is it better? What are these 1.5, 1.75, etc. thresholds I'm being asked about?

If you were to take a perfect image of a target, each pixel would have its "ideal" or "true" value - how much intensity there is from that part of the target. The trouble is, each time we sample the target (aka each image) we get that true value for a pixel but we also get some noise on top of it. We want, of course, the true value. How do we get rid of that noise?

In statistics, we have several typical ways of describing our data. Think right now just about a single pixel (after alignment). So, we have the same spot in the arm of M51 or something. The most common way is the mean (aka average) of all of our samples (aka images, light frames, etc.). It should tell us the central tendency and therefore estimate the truth. The more samples we have, the better the estimate is since we don't have to rely on just one sample (which has truth plus or minus some noise value) or a few samples. With more samples, the noise tends to cancel and we are left with a closer estimate of the truth (the noise left in the average, BTW, tends to follow a 1/sqrt(# samples) rule). We can quantify how much noise there is in our samples with a second way of describing our data. The variance (and its square root, the standard deviation) is the typical way we do this, telling us how much "spread" there is in our samples.
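The 1/sqrt(# samples) rule is easy to check with simulated data. The sketch below assumes Gaussian noise with an SD of 20 around a true value of 1000 (both numbers invented for the example) and compares the measured noise in the averaged pixel against the prediction.

```python
import random
import statistics

def noise_of_stack(n_frames, rng, trials=2000, true_value=1000.0, sigma=20.0):
    """Average n_frames noisy samples of one pixel, repeat many times,
    and return the SD of those averages (the noise left after stacking)."""
    stacked = [
        statistics.fmean(rng.gauss(true_value, sigma) for _ in range(n_frames))
        for _ in range(trials)
    ]
    return statistics.pstdev(stacked)

rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (1, 4, 16, 64):
    measured = noise_of_stack(n, rng)
    predicted = 20.0 / n ** 0.5      # sigma / sqrt(number of frames)
    print(f"{n:3d} frames: measured {measured:5.2f}, predicted {predicted:5.2f}")
```

Going from 4 frames to 64 frames is 16 times the data but only about a 4-fold drop in noise, which is why the gains from adding more frames taper off.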

If we assume the data are "normal", about 68% of all samples will lie within one standard deviation (SD) of the mean (that is, 68% are less than one standard deviation above or one standard deviation below the average). About 95% lie within 2 SD of the mean. Below, I show the histogram of 5000 normally-distributed random numbers (pretend you had 5000 light frames!). Samples in green lie within 1 SD of the mean. Yellow (and green) lie within 1.5 SD. Orange (and yellow and green) are within 2 SD and red are outside of 2 SD. Now, these are all real samples (nothing like an errant satellite) but we could safely pitch those samples in red or orange and still have a good estimate of the mean. We'd not lose too many samples and we'd take out those that are more likely to be outliers. If a sample comes in that is > 2 SD, odds are pretty good it's an errant value (like a hot pixel or satellite). Even if it's not, since we don't have 5000 samples - we have far fewer - filtering these out will help keep our estimate of the mean centered where it should be and not skewed by that outlier. Thinking about this diagram will help us a lot in the next step - understanding what happens during stacking. Just remember that with the standard deviation, we know what kind of values we might expect to find and what type of values are really abnormal (e.g., something 5 SD from the mean is very abnormal: a normally-distributed sample lands that far out only about 0.000057% of the time, so it is almost certainly the result of something else going on).
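These percentages come straight from the normal distribution, and you can compute them for any threshold. A short sketch using the error function from Python's standard library (the helper names are just for this example):

```python
import math

def fraction_within(k):
    """Fraction of normally-distributed samples within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

def fraction_outside(k):
    """Fraction beyond k SD (both tails); erfc stays accurate for large k."""
    return math.erfc(k / math.sqrt(2))

for k in (1.0, 1.5, 1.75, 2.0):
    print(f"within {k:4.2f} SD: {100 * fraction_within(k):6.2f}%")

print(f"outside 5 SD: {100 * fraction_outside(5.0):.6f}%")
```

This also shows roughly what fraction of perfectly good samples each threshold would discard if the data really were normal, which is worth keeping in mind when picking one.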

OK, given that background, here is what happens during the stack. For each (aligned) pixel, we calculate the mean and standard deviation across all of the images in the stack. If your SD threshold is at 1.5, any samples of that pixel that have an intensity beyond 1.5 SD from the mean are removed and a new average, excluding these samples, is calculated. This, BTW, is why hot pixels are often eliminated using SD stacking - those hot pixel values are very abnormal and lie far away from the mean.
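The steps above can be sketched in a few lines. This is an illustration of the idea, not Nebulosity's actual code: it assumes a single rejection pass, the population standard deviation, and frames stored as plain nested lists, and the function names are hypothetical.

```python
import statistics

def sd_filtered_mean(samples, threshold=1.5):
    """Average one pixel's samples after dropping those more than
    threshold SD from the mean (a single pass, as described above)."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)   # population SD; an assumption
    # For thresholds of 1.0 or more, at least one sample always survives.
    kept = [s for s in samples if abs(s - mean) <= threshold * sd]
    return statistics.fmean(kept)

def stack(frames, threshold=1.5):
    """Apply the filter independently at every pixel of aligned frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sd_filtered_mean([f[y][x] for f in frames], threshold)
         for x in range(width)]
        for y in range(height)
    ]

# Six tiny aligned frames, each 1 row by 3 pixels (made-up values).
frames = [
    [[100, 50, 200]],
    [[102, 51, 198]],
    [[98, 49, 201]],
    [[101, 500, 199]],   # a hot pixel or satellite hit at the middle pixel
    [[99, 50, 202]],
    [[100, 50, 200]],
]

print(stack(frames))     # [[100.0, 50.0, 200.0]]
```

A plain average of the middle pixel would give 125 because of the single 500. The filtered stack drops that one sample and returns 50, while the fourth frame still contributes its 101 and 199 at the other two pixels.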

With the filter set at 1.75, it takes a more extreme or "outlying" intensity value to be counted as "bad" than at 1.5. At 2.0, it takes even more abnormal a value to be excluded. Thus, more samples go into the final image using a higher threshold (and more things like semi-hot pixels as well). Typically, filtering values at 1.5 or 1.75 will yield the best results.

Standard-deviation based stacking therefore lets in more good samples than a median and takes out more (>0) bad samples than the mean (average). That's what makes it such a nice technique for filtering out bad samples. Note, you're not filtering out whole frames. This math is done on each pixel. So, frame #1 may have a bad value at pixel 109,231 but be great everywhere else. For all other pixels, this frame's data will be used but for pixel 109,231 it won't.

The technique isn't perfect. With a lot of outliers, the estimate of the standard deviation goes way up. So, we have a bigger "spread" that is considered normal and it takes something more aberrant to get filtered out. There are techniques to get around this as well, of course, but that's a topic for another day.
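Here is a small made-up example of that failure, again assuming a single pass with the population standard deviation (an assumption for illustration, not a statement about Nebulosity's internals). One outlier gets caught at a 1.75 threshold; add a second and both slip through.

```python
import statistics

def rejected(samples, threshold):
    """Return the samples lying more than threshold SD from the mean."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)   # population SD; an assumption
    return [s for s in samples if abs(s - mean) > threshold * sd]

one_outlier = [100, 102, 98, 101, 99, 500]
two_outliers = one_outlier + [520]

print(statistics.pstdev(one_outlier))    # SD already inflated by the 500
print(rejected(one_outlier, 1.75))       # [500]
print(statistics.pstdev(two_outliers))   # SD inflated even further
print(rejected(two_outliers, 1.75))      # [] (neither bad value is caught)
```

The second bad value raises both the mean and the standard deviation enough that 500 and 520 now count as "normal", so the stack falls back to a plain average for that pixel.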