Some mathematical details of the method of Herget
The method of Herget is really a pretty simple one; that is the main reason I decided to use it to determine initial orbits in my FIND_ORB orbit determination software. If someone asked me to stand up at a chalkboard and explain how it works, or to sit down at a computer and write a program for it, I could do so without cracking open a book or having to wave my hands when I got to the hard parts... this is not true of the methods of Laplace, Olbers, or Gauss. Also, it can make use of many observations and can account for perturbed orbits; the others can only use three and assume unperturbed motion.
The ability to handle perturbations is especially important for "odd" cases. It's the reason FIND_ORB has found orbits for cases such as Cassini's flyby of Earth, some Earth-orbiting satellites, Shoemaker-Levy 9 (comet that crashed into Jupiter) and all five of the outer satellites of Uranus.
The C/C++ source code I use for FIND_ORB is posted on this site; click here for details. If you're a decent C programmer, this might be a reasonable way to learn about the method of Herget (and about least-squares corrections to orbits and several other basic concepts). However, the following mathematical description is probably of more interest, even to C programmers.
Almost everything described below was extracted from J M A Danby's excellent book, Fundamentals of Celestial Mechanics , published by Willmann-Bell.
The method works as follows. Suppose you have a set of observations for an object, in right ascension and declination. You start out by assuming that you know the distance to the object for two of those observations... these are the R1 and R2 that FIND_ORB asks for. FIND_ORB also chooses the first and last observation, which seems to help the method avoid strange divergences.
Anyway, since you know the distance and RA/dec for those two points in time, you also know exactly where the object was on those two dates. Call those positions P1 = (x1, y1, z1) and P2 = (x2, y2, z2). Given that, there is one and only one orbit that could take the object from P1 to P2 in the time delta_t = (t2 - t1). Danby supplies several ways of computing what that orbit would be. I found most of them a little confusing, so I came up with my own very simple method. (I'm sure this isn't at all original, though. I just haven't found a reference to it yet.)
(1) Start out with a "guessed" initial velocity V1 = (vx1, vy1, vz1),
vx1 = (x2 - x1) / delta_t vy1 = (y2 - y1) / delta_t vz1 = (z2 - z1) / delta_t
If there were no gravity from the Sun and planets, and the object just moved in a straight line, this velocity would be exactly correct. (As a matter of fact, Dave Tholen's TWOOBS code does essentially just that. If delta_t is small enough, say, a few days, it's a "good enough" approximation; the object doesn't fall all that far toward the sun over the intervening time. You might want to try this first, then add the following refinements later.)
(1b) (Completely optional) The next improvement over the above, "just go in a straight line" estimate is to include gravity to first order. You do that by computing the object's acceleration at the approximate midpoint of the orbit: (x1+x2)/2, (y1+y2)/2, (z1+z2)/2. If you assume this acceleration (ax, ay, az) to be close to the constant acceleration between t1 and t2, you can get a much better initial V1:
vx1 = (x2 - x1) / delta_t - ax1 * delta_t / 2 vy1 = (y2 - y1) / delta_t - ay1 * delta_t / 2 vz1 = (z2 - z1) / delta_t - az1 * delta_t / 2
This step is by no means necessary. Physically, you can think of it as resembling a rifleman taking a long shot and thinking, "The bullet will drop ten centimeters due to gravity over that far a distance; I'll aim ten centimeters high in order to compensate." In this case, the rifleman is ignoring the fact that gravity will (for this special sort of "bullet") change somewhat in intensity and direction while the bullet is in the air; but the changes are small enough that this provides a good estimate, and one that is a lot better than the estimate that ignored gravity completely. This is a useful analogy; it'll crop up again later on.
If you do include this rough compensation for gravity, you'll probably be all set for delta_t of up to a few weeks.
(2) Integrate the orbit numerically from the time t1 to the "end time" t2. (If you were only including the sun, then you could use an analytic solution. But you need to be able to numerically integrate orbits anyway to handle perturbations. FIND_ORB uses Fehlberg's modified version of the Runge-Kutta method, as described in Fundamentals of Celestial Mechanics, p. 297-300; but the algorithm you use to integrate the orbit is somewhat irrelevant, as long as it gives accurate results.)
(3) Your end result will, of course, not be (x2, y2, z2), because of the effects of gravity during the time... the object does not really move as simply as was suggested above. Suppose you are off by an amount (delta_x, delta_y, delta_z). Then you recompute a new "guess" for the velocity,
vx1 = vx1 - delta_x / delta_t vy1 = vy1 - delta_y / delta_t vz1 = vz1 - delta_z / delta_t
(4) You have to repeat steps (2) and (3) until those delta values become suitably small.
This method has the advantage of being simple and physically obvious. Step (3) corresponds to saying, "My first attempt was .13 AU 'high' and .08 AU to the 'left'; so my next attempt will be 'lower' by .13 AU and 'to the right' by .08 AU", again in much the way a rifleman might take one shot, then use the result to correct his aim on future shots. Also, the numerical integration can include the perturbing effects of planets; the more sophisticated methods described by Danby do not do this. I suppose you could say that this is a "physics oriented" approach, while the others are more "mathematically oriented".
The method has the disadvantage that, if more than about 1/8 of an orbit is involved, the delta_xyz values do not converge, and there is no solution! I recently modified my code to handle this. At this point, the math starts to get slightly hairier and the connection to the real physics, much less obvious. The reason is that, over longer periods of time, the relationship between changes in the initial velocity and where the object ends up is not so simple. In the above, we assumed that if we pushed the object faster, it would end up past the target point; aim above the target, and it would end up above the target; and so on. That ceases to be true for longer orbits. You could aim above and end up to the left, below and end up to the right, and so on.
Getting around this is ugly and usually unnecessary; you can work by starting out with a sub-chunk of the data covering less than about an eighth of an orbit, and get a good enough "introductory" orbit to use in a full six-parameter least-squares adjustment. Alternatively, you can note that the idea of using the method of Herget with the first and last observations as the "fixed" ones is by no means graven in stone; you can switch this to be two observations about an eighth of an orbit apart, near the middle of the arc.
Doing this becomes both ugly and necessary if you're dealing with heavily-perturbed objects or problems such as "a probe must leave Earth on date X and reach Mars on date Y, after completing half an orbit". (Most probes to Mars do exactly this.) If you don't expect to do this (and few people ever do), you can skip over the next few paragraphs. Otherwise, read on:
We'll suppose that your velocity guess, (vx1, vy1, vz1), leads you to be off by a vector 'delta' at your destination at time t2. Suppose that we adjust this velocity guess by a small amount to result in (vx1 + h, vy1, vz); this results in a new offset, delta + h * P. P is essentially a (vector) linear derivative, showing how your error at the far end of the trip varies with small changes in the x-component of the initial velocity. That is, "adjust the starting x-component of the velocity by a certain amount, and it changes your position at the far end of the orbit, time t2, by P times that amount." We're getting that derivative numerically.
Now we try again with velocity guesses (vx, vy + h, vz), resulting in an offset delta + h * Q, and (vx, vy, vz + h), resulting in an offset delta + h * R. We can now generalize by saying that, for a velocity (vx + eps_x, vy + eps_y, vz + eps_z), we'd end up at the offset (delta + eps_x * P + eps_y * Q + eps_z * R) at time t2.
Of course, what we really want to get is an offset of zero (that is, we want to hit the target location). Therefore,
offset = (delta + eps_x * P + eps_y * Q + eps_z * R) = 0
This is a not-too-tough exercise in linear algebra. Breaking it up into components in x, y, and z, we get
delta_x + eps_x * Px + eps_y * Qx + eps_z * Rx = 0 delta_y + eps_x * Py + eps_y * Qy + eps_z * Ry = 0 delta_z + eps_x * Pz + eps_y * Qz + eps_z * Rz = 0
...that is, we've got three unknowns (the 'epsilon' values) and three linear equations, and the solution becomes easy. Going back up a paragraph or two, we now know that, given velocity (vx + eps_x, vy + eps_y, vz + eps_z) at time t1, we'll be at a zero offset at time t2 (that is, we'll hit the target location.)
One nice aspect of this is the fact that our model for computing the offsets at time t2 can include perturbations if we wish. That means that we suddenly have a way to compute the "transfer orbit" that works anywhere in the solar system, a fact that was recently of great utility to me in dealing with irregular satellites of Saturn, Uranus, and Jupiter.
One caveat here: because the problem is not entirely linear, you'll find that if you numerically integrate with the new velocity, you'll be much closer to, but not exactly dead on, the target location. You just have to repeat the procedure until you're close enough to keep you happy (which usually doesn't take very many iterations.)
In any case... whether you used a simple but modest-accuracy or a more complex but highly accurate method, you have now determined an orbit for the object, starting at time t1, position (x1, y1, z1), velocity (vx1, vy1, vz1), that matches your observations at times t1 and t2. In between, there are errors in RA and dec, for observations 1, 2, 3, ... n.
Let's call these errors
delta_ra(i) = observed_ra(i) - computed_ra(i) delta_dec(i) = observed_dec(i) - computed_dec(i), i = 1, 2, ... n.
Observation 1 matches t1, and observation n matches time t2, where the errors are zero. So delta_ra(1) = delta_dec(1) = delta_ra(n) = delta_dec(n) = 0. But for all other i, we can assume that the observed position and the computed position will differ.
These 'delta_ra' and 'delta_dec' values are also known as 'residual errors' or 'O-C' (observed minus computed) values.
At this point, we've got a way to measure the error for each object, and we can compute a "total error" for our guess at the object's orbit:
total_error = sum( delta_ra(i) ^ 2 + delta_dec(i) ^ 2), i = 1 to n.
This is usually shown by FIND_ORB as a "root-mean-square", or RMS, error:
rms_error = sqrt( total_error / n)
Our goal is to minimize the total error (or, equivalently, the RMS error). Then we can say we have the orbit that gives a "best fit" to the observations.
The usual method for doing this is the method of least squares. In most cases, the method of least squares is quite horrifying for a programmer to contemplate... which is really too bad, because it is an enormously helpful mathematical tool. (I've "encapsulated" the method into a set of software tools that are used in my astrometry software, in orbit computation, and in analyzing errors in telescope mounts. All this is in LSQUARE.CPP in the Find_Orb source. You'll still need a good math textbook to figure out what I did.)
In this particular case, though, we only need solve for two variables, R1 and R2. That simplifies matters a little, though not as much as one would ideally like. I'll leave the full description to Danby.
Dave Tholen and Rob McNaught have each come up with two decent ways of getting R1 and R2. Dave Tholen's TWOOBS program (meaning "two observations", but pronounced as the single-syllable "twoobs") works by trying a wide range of R1 and R2 values. PCs are brisk enough that this is not necessarily a big deal. TWOOBS can immediately discard absurdly hyperbolic solutions and (slightly more risky) retrograde solutions.
TWOOBS makes no judgments about the remaining orbits. You wind up with a slew of possible solutions, which does give you a clue as to the ephemeris uncertainty. Rob McNaught came up with a scheme he calls "Pangloss", because it comes up with "the best of all possible orbits". In this method, you again try a whole slew of R1 and R2 values, but consider how "reasonable" each orbit is. Minor planets do fall into distinct groups and families, so you can say (for example) "this looks like a normal kind of main-belt orbit of the sort 90% of asteroids have" vs. "practically no known object has that sort of semimajor axis... this is probably nowhere near the correct solution."
At present, FIND_ORB just adjusts R1 and R2 to minimize the sum of the residuals. I hope to make it slightly Panglossian by having it minimize the sum of the residuals minus a "reasonableness factor". That factor would be zero for a really bizarre orbit, and high for a reasonable-looking orbit. If there were only two observations available (a case where, no matter what R1 and R2 are, the sum of the residuals will be zero), the result would be purely Panglossian. As more observations become available, it would become a mix of Pangloss and Herget. And with a long enough arc, it would basically be judging the orbit purely on the basis of the observations, and discarding its preconceived ideas about "reasonableness".
More generally, once you have "good" values of R1 and R2, and therefore have an orbit that describes your observed data pretty well, you can switch to the general method of least squares, the "full step" in FIND_ORB. With this method, FIND_ORB adjusts (x1, y1, z1) and (vx1, vy1, vz1), again trying to find the minimum value for the total_error. Because we are no longer insisting that the first and last observations have zero error (or, equivalently, because we can now adjust six values instead of only two, R1 and R2), we can usually get a lower total_error than we did before.