- Over-observing issues
- Previous over-observing solutions
- Proposed new over-observing solution
- Computing Nobs
- Problems with proposed new over-observing solution
- (Separate page) 'Blunder management' (how to handle outliers in a reasonably mathematically defensible manner)
- Notes about precision vs. uncertainty issues

Find_Orb currently uses several techniques to model astrometric errors. As far as I know, it's the only software to implement "blunder management" and to handle random timing errors in a mathematically reasonable way. It has some decent algorithms for handling over-observing issues. However, some recent conversations (late June 2017), both private and in reply to a bug report I posted on the Minor Planet Mailing List, caused me to think about some ways in which I (and others) could do a better job.

I have not seen, nor did I really expect, a great improvement in actual results (accuracy of the calculations) from these changes. The earlier, somewhat naïve method of handling over-observing implemented in Find_Orb in 2012 (and implemented in simplified form in FCCT15, and in better form in VFCC17) had already made a big improvement; the new methods described here couldn't really add much. The real benefit of these new methods is that the error model is more mathematically defensible than it was, with fewer ad hoc elements.

The most important thing I could do to improve error handling in Find_Orb
would probably be to implement the estimated sigmas described in
VFCC17. This paper shows that the sigmas for
survey astrometry vary depending on date (as the surveys changed techniques
and equipment), magnitude, and rates of motion in ways that make some
observations a *lot* more precise than others.

Some parts of my error handling scheme before September 2017 were embarrassingly ad hoc, with little or no mathematical justification. (Though in my defense, so was everybody else's; none of us had exactly covered ourselves in glory here.) Following are some thoughts on a more robust system.

Until 2012, Find_Orb (and everyone else, as far as I'm aware) gave observations a "weight" in the orbit solution that depended solely on the estimated astrometric uncertainty. In other words, it assumed that observations really just had random Gaussian errors that were uncorrelated with each other, and that there were no such things as systematic errors in astrometric data.

In reality, a set of observations from a particular observatory made on a particular night usually are at least somewhat correlated. They probably share a common catalog error and a common timing error. To make matters worse, in 2012, I didn't have a way to "debias" astrometry to remove at least most of the catalog error. (We can now use FCCT15 to estimate and subtract most of the systematic catalog errors.)

The problem got really bad if one observatory submitted a large fraction of the observations for an object, especially if they had a noticeable systematic error. The orbit would then reflect that data; any errors made by that observatory would overwhelm the good efforts of the other observers.

To picture this, imagine three people weighing a rock. The first two each make three or four accurate measurements and call it good. The third takes a hundred measurements... but doesn't notice that the scale is incorrectly zeroed, so all the measurements are systematically wrong by the same amount. If you simply averaged all hundred-plus measurements, the third person's measurements would dominate the answer and "drown out" the good data.

FCCT15 applied a rather simple solution: find the number Nobs of observations made from that observatory on that night, and give each of those observations a weight of 1/sqrt(Nobs). This is equivalent to saying: "All observations made on a given night are perfectly correlated. There are no random errors, only systematic ones. Once you've gotten one observation on a given night, you might as well stop; further data will be of no value."

Phrased that way, the FCCT15 solution doesn't sound very good. But
it was still better than the default method of applying no re-weighting of
data. The default method corresponds to the opposite extreme: "All
observations on a night are completely *un*correlated. All errors are
random. Take Nobs observations on a night, and you've reduced your errors
by a factor of sqrt(Nobs)."

The FCCT15 method, as implemented by NEODyS/AstDyS in their .rwo format,
had some additional problems. The RMS errors given in the .rwo data came
already multiplied by sqrt(Nobs). In late June 2017, I realized that
this led to
a serious bug in Find_Orb: I was taking those RMS values, treating them
as "estimated astrometric uncertainties", and multiplying them by an
additional factor to correct for over-observing. In other words, the data
were being adjusted twice; the more data you got, the *less* weight it
had in the orbit! It also corrupted blunder management
and outlier detection, which thought they were dealing with "real"
uncertainties, not re-weighted ones.
Fortunately, I was able to compute Nobs, *divide* the RMS values by
sqrt(Nobs), and recover the original "astrometric uncertainties". It's a
strange thing to be doing, but no actual information was lost in the process,
and the resulting uncertainties are "reasonable".
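That recovery step is just the inverse of the pre-multiplication. As a minimal sketch (the numbers are hypothetical; Find_Orb's actual code differs):

```python
from math import sqrt

def recover_sigma(rwo_rms, n_obs):
    """Undo the sqrt(Nobs) pre-multiplication applied to RMS values
    in .rwo astrometry, recovering the per-observation uncertainty."""
    return rwo_rms / sqrt(n_obs)

# Hypothetical example: nine observations on a night, listed with an
# RMS of 0.6 arcsec, imply an underlying uncertainty of 0.6/3 = 0.2".
sigma = recover_sigma(0.6, 9)
```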

I strongly disagree with the decision to "pre-multiply" RMS values by sqrt(Nobs) or any other value, both for the above reasons and because it causes trouble if more observations are added. I've worked around these issues in Find_Orb, and I don't think anyone will come to grief as a result of them. But it was, in my opinion, a really bad idea.

I gather that some orbit computers have actually gone so far as to include a maximum of four observations from a given observatory on a given night. Get more than that, and your observations are simply ignored. This does mean you get some benefit from more observations, but only to a certain point. Unfortunately, it also means that you can't use more data to beat down the random component of the error.

When I got around to addressing the issue, none of these schemes appealed to me, but I didn't have a good grasp on how to handle it. After some thought and some discussions with Marco Micheli and Alan Harris, I decided that if Nobs <= Nmax, no re-weighting would be done. If Nobs > Nmax, observations would be given weights of sqrt(Nmax/Nobs). Somewhat arbitrarily, I set Nmax=5 by default, with the ability to reset it. (Nmax=1 corresponds closely to the FCCT15 scheme.) This method was arrived at (independently, I think) in VFCC17 with Nmax fixed at 4.

As stated above, I chose Nmax=5 without a solid theoretical basis.
Vereš et al. similarly give no mathematical rationale; as I did,
they simply reasoned that four observations ought to get at least
*some* greater weight than two, but that beyond a certain point,
added observations were presumably not helping anymore.

A discussion with Dave Tholen got me to thinking about the role of Nmax. Suppose you're observing an object that's at the faint end of what you can do; there are random errors due to noise of about half an arcsecond. But you're using Gaia-DR2 and have checked your timing against GPS, and have otherwise managed to reduce systematic error to (say) the level of 50 milliarcseconds.

In such a case, you'd need a hundred observations just to bring your random error down to the same level as your systematic error. It's pretty clear that in this situation, Nmax ought to be much greater than four.

To reverse this, suppose someone got high-quality, low-noise data with random errors at the 50 milliarcsecond level. But they did so in 1992, and had only GSC-1.1 to reduce against, resulting in systematic errors of about half an arcsecond. With almost all your error being systematic, getting more than one or two observations on a night would be almost pointless, and Nmax should be set to one (as in FCCT15) or maybe two.

Let's assume that there is a systematic error sigma_s and a random error sigma_r. In that case, the total error sigma_t after getting Nobs observations would be

(1) sigma_t^2 = sigma_s^2 + sigma_r^2 / Nobs

For a single observation, the total error is simply that from random and systematic errors added in quadrature. As you get more observations, they gradually approach being equal to a single observation with only systematic error.
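A quick numerical illustration of equation (1) (a sketch; the error values here are invented):

```python
from math import sqrt

def sigma_total(sigma_s, sigma_r, n_obs):
    # Equation (1): averaging beats down only the random component;
    # the systematic part is untouched no matter how much data you get.
    return sqrt(sigma_s**2 + sigma_r**2 / n_obs)

# With sigma_s = 0.1" and sigma_r = 0.3":
one_obs  = sigma_total(0.1, 0.3, 1)     # quadrature sum, about 0.316"
many_obs = sigma_total(0.1, 0.3, 1000)  # approaches the 0.1" systematic floor
```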

In the VFCC17 and Find_Orb methods, you get asymptotically similar behavior, in that as you get more observations, they gradually approach being equal to a single observation with error scaled by 1/sqrt(Nmax). To get matching behavior, one would set (these three expressions are equivalent)

(2) 1/sqrt(Nmax) = sigma_s / sqrt(sigma_s^2 + sigma_r^2)

(3) Nmax = (sigma_s^2 + sigma_r^2) / sigma_s^2 = sigma_r^2 / sigma_s^2 + 1

(4) sigma_r^2 = (Nmax - 1) sigma_s^2

My Nmax=5, therefore, basically boils down to assuming that the random errors have twice the magnitude of the systematic errors. VFCC17's value of Nmax=4 assumes random errors have sqrt(3) ≈ 1.732 times the magnitude of systematic errors. (As stated previously, FCCT15's value of Nmax=1 assumes no random error. The naïve method of not accounting for over-observing at all corresponds to Nmax=∞ and sigma_s = 0. At least, roughly speaking... all I've done here is to ensure that the asymptotic behavior is correct relative to the Nobs=1 case.)

You'll notice that my first hypothetical instance (random errors of half an arcsecond = 500 mas, but systematic errors down at the 50-milliarcsecond level) would correspond to 500^2 = (Nmax - 1)(50^2), i.e., Nmax = 101. If the second "terrible systematic errors, but careful measurement limiting random errors" case reversed this, so that sigma_r = 50 milliarcsec and sigma_s = 500 milliarcsec, one would have Nmax=1.01.
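The arithmetic in both examples can be checked with a one-liner based on equation (3) (a sketch; error values in milliarcseconds):

```python
def nmax_for(sigma_r, sigma_s):
    # Equation (3): Nmax = sigma_r^2 / sigma_s^2 + 1
    return (sigma_r / sigma_s) ** 2 + 1

faint_target   = nmax_for(500.0, 50.0)   # noisy data, good catalog: 101.0
old_gsc_astrom = nmax_for(50.0, 500.0)   # careful data, bad catalog: about 1.01
```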

Finally, it's convenient to note that you can take (1) and (4) to get

(5) sigma_t(Nobs)^2 / sigma_t(1)^2 = (1 + (Nmax - 1) / Nobs) / Nmax

This suggests a very straightforward modification of the 're-weighting' factor R(Nobs). In FCCT15, VFCC17, and (before September 2017) Find_Orb,

R(Nobs) = 1                 if Nobs <= Nmax
R(Nobs) = sqrt(Nmax/Nobs)   if Nobs > Nmax

(Nmax=1 for FCCT15, Nmax=4 for VFCC17, Nmax adjustable in Find_Orb, and Nmax = ∞ if no reweighting is done and all errors are assumed to be random/non-systematic.) Instead, we should be using (and Find_Orb does use, as of September 2017)

(6) R(Nobs) = sqrt(Nmax / (Nobs + Nmax - 1))

(this is arrived at by requiring that Nobs observations, each reweighted by R(Nobs), combine to give the total variance in (5); i.e., R(Nobs)^2 is the reciprocal of Nobs times (5).) You'll note that the "new" reweighting factor is exactly equivalent to the "old" factor for Nmax=1 or ∞, and has the same asymptotic behavior for large Nobs. However, since it more accurately reflects the idea that our errors result from a mix of systematic and random errors, and does so in a less ad hoc way, I'd say it's to be preferred.
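A small sketch comparing the piecewise factor with equation (6) (function names are my own; this is not Find_Orb's actual code):

```python
from math import sqrt, isclose

def r_old(n_obs, n_max):
    # Piecewise reweighting of FCCT15 / VFCC17 / pre-2017 Find_Orb
    return 1.0 if n_obs <= n_max else sqrt(n_max / n_obs)

def r_new(n_obs, n_max):
    # Equation (6): smooth reweighting from the mixed-error model
    return sqrt(n_max / (n_obs + n_max - 1))

# Exactly equivalent for Nmax = 1 (pure systematic error)...
assert all(isclose(r_old(n, 1), r_new(n, 1)) for n in range(1, 100))
# ...and asymptotically similar for large Nobs: with Nmax = 5,
# r_old(1000, 5) and r_new(1000, 5) are both about 0.07.  A lone
# observation keeps its nominal weight: r_new(1, n_max) is always 1.
```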

The assumption in all of this has been that observations made on a given "night" are perfectly correlated; those made the next or previous night are completely uncorrelated. This corresponds to taking Nobs for observation 'i' to be a sum over all 'n' observations,

Nobs = sum(j = 1...n) F(j),   where

(a) F(j) = 1 if observations i and j were made on the same night from the same observatory
    F(j) = 0 otherwise

My first effort at handling over-observing in Find_Orb, in November 2012, was a modified version of this scheme. It looked for the observation time to be within an amount 't_max', defaulting to half a day:

(b) F(j) = 1 if |obstime(i) - obstime(j)| < t_max and observations i and j were made from the same observatory
    F(j) = 0 otherwise

I did this because it was simpler, seemed reasonable, and allowed me to try out different ideas as to the time scales over which observations are correlated. (a) and (b) result in similar behavior, but my approach (b) assumes that observations made at 6:00 and 18:00 local time are just as correlated as those made at 18:00 and 6:00 the next morning. I have a hunch that FCCT15's assumption (a), under which "same night" observations twelve hours apart are better correlated than a morning observation paired with one the following evening, is a little closer to reality. In fact, a bit later, I modified my code to use the "same night" criterion.

However, I later decided that both methods are flawed. If nothing else, the assumption that F(j) is either zero or one, i.e., observations are perfectly correlated up to a certain point and then perfectly uncorrelated, seems rather silly. And observations made within minutes might have greater correlation than those made several hours apart.

With that in mind, F(j) = 1 if the time difference dt is nearly zero (observations taken one right after another ought to have nearly perfectly correlated systematic errors). F(j) should decrease gradually with increasing |dt|, going toward zero as |dt| approaches infinity. This sounds a lot like a Gaussian. As of September 2017, I've therefore set

(c) F(j) = exp(-(dt / t_max)^2 / 2),   where dt = obstime(i) - obstime(j)
    [if observations i and j were made at the same observatory; F(j) = 0 otherwise]

This is, I would submit, more defensible than either scheme (a) or (b). One would expect that there would be at least some night-to-night correlation. One would also expect that the correlation between observations made minutes apart should be high, and that between observations made at the beginning and end of the night ought to be not so high, and that the correlation between observations made a week or more apart ought to be pretty low. (c) reflects those assumptions.
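Scheme (c) and the Nobs sum above can be sketched in a few lines (an illustrative sketch, not Find_Orb's code; the times and observatory codes are made up):

```python
from math import exp

def effective_n_obs(i, times, codes, t_max=0.5):
    """Effective Nobs for observation i under scheme (c): a Gaussian
    fall-off of correlation with the time difference dt, summed over
    observations from the same observatory.  Times are in days."""
    total = 0.0
    for j, t in enumerate(times):
        if codes[j] != codes[i]:
            continue                # different observatories: uncorrelated
        dt = times[i] - t
        total += exp(-(dt / t_max) ** 2 / 2.0)   # F(j)
    return total

# Three observations minutes apart count as essentially three; a fourth
# from a week earlier adds almost nothing to the effective Nobs.
times = [0.000, 0.010, 0.020, -7.0]
codes = ["703", "703", "703", "703"]
n_eff = effective_n_obs(0, times, codes)   # just under 3.0
```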

(*Note* the warning below that correlation may be a function of where
we are in the sky, more than a function of how much time elapsed between
observations. Taking this into account would require some sort of
modification of F(j), though I've only vague ideas as to what that
would be.)

*Note* that Jim Baer has kindly pointed out that a similar method is
already in use, though less simplified than what we're talking about
here: an actual correlation matrix linking nearby observations is used.
Jim suggests that method is not suitable for real-time calculations.
The above scheme (c) *does* work quite well in real time, though.
Despite the shortcomings I've described, I'd argue that it is a much
more defensible method than either (a) or (b), and requires only
slightly more code.

• The above is reasonable mathematically. But how do we decide what the systematic and random errors might be? (I should note that observers can specify both types of error in the new ADES format. But even if you're the observer, with full details on the images and observing setup, figuring out how large either error source is can be daunting.)

If there are *many* observations on a given night, we can actually
compute the random error from the standard deviation of the residuals.
(Which would probably be the safest thing to do. If somebody got, say,
six observations, it's probably better to compute the random error,
rather than make assumptions that may not reflect what the sky, scope,
object, and camera were doing that night.)
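As a sketch of that "safest thing", with invented same-night residuals in arcseconds:

```python
from statistics import stdev

# Residuals (observed minus computed) for one night's observations from
# one observatory; these numbers are made up for illustration.
residuals = [0.21, -0.05, 0.13, -0.18, 0.02, -0.09]

# With the systematic part common to the whole night, the scatter about
# the night's mean residual estimates the random error directly.
sigma_random = stdev(residuals)   # sample standard deviation, about 0.14"
```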

You could then take the values of random errors computed for objects where you had lots of data, and apply them to objects that got fewer observations, but were otherwise similar.

Figuring out the systematic catalog error is more difficult. FCCT15 computed these biases and provided a way to subtract them, resulting in theoretically "unbiased" data. But we don't know how much systematic catalog bias remains. We can set it to be effectively zero for Gaia-DR2. But I don't see how we can say much about it for other catalogs.

• We've somewhat arbitrarily decided that errors are correlated over a time scale of one night. For timing errors, there's probably some truth to that. For systematic catalog errors, the correlation is really in area of the sky, not in time. If the object is moving quickly across the sky, the catalog bias will not be especially systematic. If it's a TNO spending years in one part of the sky, t_max may be closer to "years" than to "one night". (Especially because timing errors are much less significant for a TNO; systematic catalog errors will dominate.)

• What about photometry? Presumably, photometric errors should undergo a similar analysis for random vs. systematic errors, the latter being strongly catalog-dependent. At present (to my knowledge, at least), no over-observing adjustment is done by anybody in this area: go out and get lots of bad photometry, and it can overwhelm the good stuff.

• Davide Farnocchia points out that treating the "RMS errors" in
FCCT15
as "astrometric sigmas" multiplied by sqrt(Nobs) is not quite right.
Those "RMS errors" were arrived at under the assumption that they *would*
get multiplied by that factor. Use the Find_Orb/VFCC17
correction, multiplying them by a different factor, and you'll get different
results. That's a bit of a head-scratcher. Either we'd have to include an
additional "fudge factor" that brings the adjusted sigmas to a comparable
level, or the RMS errors would always have to be arrived at in combination
with a compatible over-observing scheme. (I think the "fudge factor" ought
to be possible.)

• VFCC17 makes some interesting points about the astrometric errors in surveys, and provides some general rules for assigning sigmas to astrometry that could be quite useful here. I'm going to take a guess, though, that their sigmas are slightly higher than they ought to be for the "better" surveys. The problem I see is that the sigmas are based on residuals, "observed minus computed", with the implicit assumption that you're trying to match the computed values. The better observers are probably getting results that are about as accurate as the computed orbits; if they disagree by 0.05", you don't know if the observer is wrong or if the orbit is wrong. The better your astrometry, the more it gets penalized by this.

Occasionally, one will encounter data whose precision implies a larger uncertainty than the stated sigma. For example, if I tell you that I've measured the weights of three people as 51 +/- 0.1 kg, 55 +/- 0.1 kg, and 60 +/- 0.1 kg, you might wonder why I claim the values are good to tenths of a kilogram when they've evidently been rounded to the nearest kilogram. You might also wonder if the values were just totally wrong.

MPC's punched-card format normally gives RAs to 0.01 second (0.15 arcsec at the celestial equator) and declinations to 0.1 arcsecond. If you tell Find_Orb that an observation is actually good to 0.14 arcsecond, you'll get a warning message of this sort:

At least one observation, made at 30 Jul 2022 14:56:21, has an uncertainty that is less than its precision. This doesn't make sense, and may indicate something's wrong with the data. See https://www.projectpluto.com/errors.htm#precision for further discussion.

The solution is either to say that the uncertainty is actually
larger than 0.14 arcsec, *or* to provide extra digits to get the
correct precision (the MPC format will allow for one more digit in both
RA and dec, and
ADES astrometry allows arbitrary precision).

Similar constraints apply to the observation time (which in the 80-column format can be to 10^-5 day = 0.864 second precision, or to a "microday" = 86.4 millisecond precision) and to the photometry (which can be given to whole magnitudes, 0.1 mag, or 0.01 mag precision). For each of those quantities, if the stated uncertainty is less than the precision, you get a warning.
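The check itself is simple. A sketch of the idea (not Find_Orb's actual code; the constant names are mine, and units must match on both sides):

```python
RA_PRECISION_ARCSEC  = 0.15   # 0.01 second of RA at the celestial equator
DEC_PRECISION_ARCSEC = 0.10   # standard dec precision in the MPC format

def needs_warning(uncertainty, precision):
    # Flag data whose stated uncertainty is below the precision to
    # which the quantity was actually reported.
    return uncertainty < precision

flag_ra  = needs_warning(0.14, RA_PRECISION_ARCSEC)   # True: warn, as above
flag_dec = needs_warning(0.14, DEC_PRECISION_ARCSEC)  # False: no warning
```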

In theory, Find_Orb need not be quite so strict. Rounding a quantity
to the nearest *x* introduces an error with root-mean-square value
(sqrt(3) / 6)*x*. So rounding a declination to, say, the nearest
0.1 arcsecond introduces an error with an RMS of roughly 0.029 arcseconds.
This would allow the above weights to be 51 +/- 0.3 kg, etc. But even in
that case, one would wonder if a blunder had been made. I think a warning
is needed if the uncertainty is less than the precision.
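The (sqrt(3)/6)*x* figure is just the RMS of a rounding error distributed uniformly on [-x/2, x/2], which a quick Monte Carlo sketch confirms:

```python
from math import sqrt
import random

random.seed(1)                     # reproducible sketch
x = 0.1                            # rounding step, arcseconds
errs = [random.uniform(-x / 2, x / 2) for _ in range(200_000)]
rms = sqrt(sum(e * e for e in errs) / len(errs))

# Both should come out near (sqrt(3) / 6) * 0.1, i.e. about 0.029 arcsec.
analytic = (sqrt(3) / 6) * x
```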