Creating your own .TDF (user-added dataset) files

  • What .TDFs are and why to use them
  • Basic theory behind .TDFs
  • Defining your own custom .TDF symbols
  • Providing limits in RA and declination
  • Adding proper motions to your dataset
  • Adding "More Info" and "click" data for your dataset
  • Adding note files for .TDF datasets
  • Handling languages other than English
  • Making lines and rectangles in .TDFs
  • Some additional .TDF keywords
    What .TDFs are and why to use them

    As delivered, Guide displays an extremely wide variety of datasets covering all kinds of celestial objects. However, some people will have the need or desire to add objects from completely separate datasets previously unknown to Guide. The user-added dataset capability lets you do this.

    The basic idea is a pretty simple one. Most databases are in plain ASCII text or FITS files, with data arranged in columns. Guide will absolutely need to know certain basic things about the database, such as which columns contain the RA hours data, which the declination minutes, the file name of the database, the epoch of the coordinates, and so on.

    All of this information is stored in a Text Definition File (.TDF). Three examples are provided in your Guide directory: CD_DATA.TDF, CD_DATA2.TDF, and RADIO.TDF. (Users of later versions of Guide will also have CD_DATA3.TDF and CD_DATA4.TDF, which describe datasets added in those versions.) Each contains definition data for several datasets on the Guide CD. Even if you have no real desire to add your own datasets, the pre-defined datasets in these .TDF files can be useful, giving information on radio objects, quasars, nearby stars, binary stars, and more. Still more datasets are available for download; you can display them and/or look at their .TDF files to learn more about how the system works.

    Most people will just use these built-in and downloadable datasets. If, instead, you'd like to add your own dataset, the following describes how to go about doing so.

    Be aware that this is not a completely trivial process! If you don't know something about the format of the dataset you plan to add, or are not familiar with the use of a text editor, you should probably not attempt to do this.

    Basic theory behind .TDFs

    A good starting point in understanding how .TDFs work is to look at the .TDF files mentioned in the previous section. Many users have examined these and used them as templates for putting together their own .TDF files, sidestepping the need to read documentation.

    As you can see in these files, each dataset in a .TDF file starts with two lines such as:

    file !:\radio\quasars\table1.dat
    title Quasars

    (The '!' stands for the CD-ROM drive letter; it probably will be of little use to you, since your datasets will come from someplace else.) These lines, of course, simply tell Guide where to find the data and what to call the dataset when it's listed in dialog boxes. Each dataset ends with the

    shown 0
    end

    commands (or "shown 1/end", if the dataset is turned on). The lines in between, however, will vary widely between datasets.

    All datasets will have a description of the format of coordinates. For example,

    RA H  20   2
    RA M  23   2
    RA S  26   4
    de d  30   3
    de m  34   2
    de s  37   2

    tells Guide that, in this dataset, the RA hours of an object are stored in columns 20-21 of each line; the RA minutes in columns 23-24; and the RA seconds in columns 26-29. Likewise, the declination degrees are in columns 30-32, the minutes in columns 34-35, and the seconds in columns 37-38. Quite a few datasets will omit the RA S, de s, and/or de m fields, because they use decimal degrees or minutes; this is not a problem for Guide.
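
    As a rough illustration of the "starting column, then length" convention, here is a small Python sketch that pulls the same fields out of one fixed-width line. The helper name field and the sample record are made up for this example, and the code is not taken from Guide; it only assumes that columns are counted from 1, as in the description above.

```python
def field(line, start, length):
    """Return the text in 1-based columns start .. start+length-1."""
    return line[start - 1:start - 1 + length]

# Hypothetical record laid out to match the RA/de lines shown above.
line = " " * 19 + "12 34 56.7+12 34 56"

# RA H 20 2, RA M 23 2, RA S 26 4
ra_hours = (float(field(line, 20, 2))
            + float(field(line, 23, 2)) / 60
            + float(field(line, 26, 4)) / 3600)

# de d 30 3, de m 34 2, de s 37 2; the sign applies to the whole angle
deg_text = field(line, 30, 3)
sign = -1.0 if deg_text.strip().startswith("-") else 1.0
dec_degrees = sign * (abs(float(deg_text))
                      + float(field(line, 34, 2)) / 60
                      + float(field(line, 37, 2)) / 3600)

print(ra_hours, dec_degrees)
```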

    The following lines may also appear in a dataset description:

    mag   40   5   # Magnitude is in columns 40 to 44
    sizs  33   5   # Size,  in decimal seconds,  in columns 33 to 37
    resize .5      # Multiply "size" by .5 to convert diameter to radius
    text   2  17   # Text for labelling this object,  in columns 2-18
    epoch 1950     # This dataset provides B1950.0 coordinates
    offset 23040   # The actual data starts 23,040 bytes into the file
    line size 102  # Each line in this dataset is 102 characters
    nlines 7437    # There are 7437 lines in this dataset
    sort 1         # The dataset is sorted in increasing RA
    type   4       # The dataset is shown with symbol 4 = galaxy
    type sc15;e0,0,32;c1;E20,20,12;E-20,-20,12;c2;m-45,0;l45,0;c14;
    type sc4;f3;-10,-5;10,-5;0,10;c2;
    # Above two lines are user-created symbols
    align 32       # The text labels are aligned at the bottom left

    The first line tells Guide that magnitudes are stored in columns 40-44 of each record. This will be used in, for example, determining the size at which stars are drawn.

    Along with the "sizs" (size in decimal seconds), one can use a "sizm" (size in minutes) or "size" (size in degrees). "resize" is useful for converting a diameter to a radius (as shown above), or for converting from arbitrary size units.

    The "text" line tells Guide what data to use in labelling objects (if any; some datasets don't have any designation to add to the object.)

    If the dataset doesn't start right at the beginning of the file, you will have to add the "offset" keyword to tell Guide how many bytes to skip. This will always be needed with FITS files.

    In general, if you are dealing with simple text files, you can ignore the "line size", "nlines", and "sort" keywords. But if every record is exactly the same length (as happens in many text files and in all FITS files), it can help to provide these fields. If the dataset has no carriage return or line feed at the end of each line, they are absolutely essential.

    If you've provided "line size" and "nlines", and the data is sorted in order of increasing RA, then it's a good idea to add the "sort 1" keyword. If Guide knows the dataset is already sorted, it can skip over large amounts of data and draw your dataset much faster! Fortunately, many datasets are provided in this order, or are small enough to make this improvement less important.
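
    To see why fixed-length, RA-sorted records help, consider the following Python sketch. It is a guess at the general idea, not Guide's actual code: with a known record length, the position of any record can be computed directly, and with sorted RA values, a binary search finds the relevant records without reading the rest. The function names are invented here, and the sketch assumes that "line size" counts every byte of a record, including any line terminator.

```python
from bisect import bisect_left, bisect_right

def record_offset(index, offset, line_size):
    """Byte position of record number `index` (counting from 0)."""
    return offset + index * line_size

def records_in_ra_range(ra_values, ra_min, ra_max):
    """Index range [lo, hi) of records with ra_min <= RA <= ra_max.

    ra_values must already be in increasing order (the "sort 1" case).
    """
    lo = bisect_left(ra_values, ra_min)
    hi = bisect_right(ra_values, ra_max)
    return lo, hi

# With "offset 23040" and "line size 102", record 10 starts here:
print(record_offset(10, 23040, 102))

# Only records 1 to 3 need to be examined for a chart spanning 2h to 8h.
print(records_in_ra_range([1.0, 2.5, 2.5, 7.0, 19.0], 2.0, 8.0))
```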

    Also, you'll need to provide a "type" keyword, to tell Guide how to display the object. The pre-defined values for "type" are...

    0 = open clus
    1 = globular
    2 = diffuse neb
    3 = planetary nebula
    4 = galaxy
    5 = OC & neb
    6 = star
    7 = circle/ellipse
    8 = radiation symbol (for X-ray or gamma-ray sources)
    9 = radio dish (already used for all catalogs in RADIO.TDF)

    For all types except 6 (star), the symbol size will be scaled by the "sizs" (or "sizm" or "size") data. Stars are sized by the "mag" data. For type 7, if a "siz2" field is supplied, it specifies one (usually minor) axis of an ellipse, with the "size" field supplying the other (usually major) axis. If no "siz2" is supplied, you just get a circle of the specified size. If both a "siz2" and a "pa" (position angle) field are given, then the ellipse is tilted at that angle. (Take a look at the UCAS Galaxies in CD_DATA2.TDF for an example of how this works.)

    Defining your own custom .TDF symbols

    Using a pre-defined type is easy. But you can also create your own symbols. To do this, you need to alter the 'type' line in a .TDF. An example is:

    type sc15;e0,0,32;c1;E-20,-20,12;f3;-10,-5;10,-5;0,10;c2;m-45,0;l45,0;c14;

    The 's' stands for 'symbol', and tells Guide you aren't using the usual pre-defined symbol types. Following are commands separated by semicolons:

    c15;                   means set color 15 (white)
    e0,0,32;               means draw a 32-unit circle centered at (0,0);
    c1;                    means set color 1 (green)
    E-20,-20,12;           means draw a 12-unit _filled_ circle at (-20,-20);
    f3;-10,-5;10,-5;0,10;  means draw a 3-point _filled_ object (a triangle)
                           connecting (-10,-5) to (10,-5) to (0,10)
    c2;                    means set color 2 (brown)
    m-45,0;                means move to (-45,0);
    l45,0;                 means draw a line connecting to (45,0);
    c14;                   means set color 14 (light gray)
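
    If you want to check a symbol definition before handing it to Guide, a few lines of Python can split it into its individual commands. This is only a sketch for inspecting the text of a 'type' line; the function name split_symbol is made up, and nothing here reflects how Guide itself parses symbols.

```python
def split_symbol(type_value):
    """Split the value of a custom 'type' line into its commands."""
    if not type_value.startswith("s"):
        raise ValueError("custom symbols start with 's'")
    # Drop the leading 's', then break at semicolons, ignoring the
    # empty piece left after the final semicolon.
    return [part for part in type_value[1:].split(";") if part]

example = ("sc15;e0,0,32;c1;E-20,-20,12;f3;-10,-5;10,-5;0,10;"
           "c2;m-45,0;l45,0;c14;")
for command in split_symbol(example):
    print(command)
```

    Note that the three points following 'f3' come out as separate pieces, since they are separated by semicolons just as commands are.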

    Examples of some additional commands are:

    Cff7e00;               set an RGB color (described below)
    b32;                   draw a square from (-32,-32) to (32,32)
    b32,20;                draw a rectangle from (-32,-20) to (32,20)
    b11,13,3;              draw a square six units on a side centered at (11,13)
    b11,13,3,2;            draw a 6x4 rectangle centered at (11,13)
    B(numbers);            same rectangle as lowercase 'b',  but filled
    s0;                    set a solid line style
    s1;                    set a dashed line style
    s2;                    set a dotted line style

    The following commands use the same 'box' definition (using one to four numbers) as the 'b' and 'B' commands:

    +(box area);           draw a '+' inside the defined box area
    x(box area);           draw an 'x' inside the defined box area
    d(box area);           draw a diamond inside the defined box area

    It used to be that color was always set with lowercase 'c' followed by a number from 0 to 15. That was a legacy of the old DOS days, in which objects could only have one of a palette of sixteen colors. This is still available for purposes of keeping Guide backward-compatible, but Guide 8.0 can set "full colors" with an uppercase 'C'.

    The easy way to set this is to create your symbol with either a 'c' or 'C'-type color selected basically at random. Then fire up Guide, click on an object in the dataset, and then on "Display", and then on the color button. Select a new color and click OK, and the symbol will be rewritten to use the new color.

    If you prefer to set the color directly from your text editor, though, you need to know that a command such as 'Cff7e01;' breaks down into "set color with blue component ff(hex)=255(decimal), green 7e(hex)=126(decimal), red 01(hex)=1(decimal), with component values running from 0 to 255." So in the above example, blue would be completely saturated, green about halfway, and red scarcely turned on at all. Believe me, setting the colors through Guide tends to be easier.
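
    If you do want to work out such a value by hand, the breakdown just described can be sketched in Python as follows. The function name decode_full_color is invented for this example; the only thing assumed is the blue, green, red order of the hex digits given above.

```python
def decode_full_color(command):
    """Split a 'Cbbggrr;' command into (red, green, blue), each 0-255."""
    digits = command.strip().rstrip(";")[1:]   # drop the 'C' and the ';'
    blue, green, red = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue

print(decode_full_color("Cff7e01;"))   # (1, 126, 255)
```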

    The symbols are scaled just like the standard, predefined types; large objects are drawn with scaled-up symbols, just as large (for example) globular clusters are drawn with larger circles. The unit of measurement in the above commands is 1/32 of an object radius. For example, the 32-unit circle drawn above would exactly match the size of the object.

    By setting the color to light gray at the end, we make sure that the label for this object is in light gray.

    Here's a more practical example. Suppose you want to show a catalog of gamma ray burst events with radiation symbols: three triangles in light blue, with orange dots in the center.

    type sc3;f6;-32,0;32,0;16,26;-16,-26;16,-26;-16,26;c2;E0,0,15;c14;

    The three triangles are drawn as one six-point fill (that's the 'f6;' part). Then the color is set to 2 (orange) and a dot is drawn. Finally, the color is reset to 14 (light gray) for the label.

    The 'align' keyword is the sum of a number for the horizontal alignment (0 for left, 1 for center, 2 for right) and a number for the vertical alignment (0 for top, 16 for center, 32 for bottom). By default, the alignment is zero, and text is shown to the upper left of an object.
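
    The sum can be worked out with a small lookup, sketched here in Python. The dictionary and function names are made up for the example; the numbers are the ones listed above.

```python
HORIZONTAL = {"left": 0, "center": 1, "right": 2}
VERTICAL = {"top": 0, "center": 16, "bottom": 32}

def align_value(horizontal, vertical):
    """Value to put after the 'align' keyword."""
    return HORIZONTAL[horizontal] + VERTICAL[vertical]

print(align_value("left", "bottom"))     # 32, as in the "align 32" line
print(align_value("center", "center"))   # 17
```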

    Providing limits in RA and declination

    By default, Guide will assume that your dataset covers the entire sky, and will always examine it to see if any objects fall on the screen. Often, this is a waste of time. In such cases, you can tell Guide what rectangle of sky your dataset covers.

    For example, if you look at RADIO.TDF, you'll see that the 6C Radio Sources II dataset has the following fields in it:

    declimit 30  51   # This catalog extends from N 30 to N 51
    ralimit 8.5 17.5  # This catalog extends from 8h30m to 17h30m in RA

    It just so happens that this particular dataset covers a particular "rectangle" in RA/dec. Since Guide knows this, it can (often) compare that rectangle to the one on the screen, find that there is no overlap, and deduce that there is no point in even considering this dataset any further. If you have a lot of datasets each covering a small area (such as the sections of the 6C radio survey), this can speed matters up substantially.
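
    The comparison amounts to an interval overlap test in each coordinate. The Python sketch below shows the idea under a simplifying assumption: neither the dataset nor the screen straddles the 0h/24h line in RA. The function names are invented here, and this is not Guide's own code.

```python
def intervals_overlap(a_min, a_max, b_min, b_max):
    """True if [a_min, a_max] and [b_min, b_max] share any point."""
    return a_min <= b_max and b_min <= a_max

def dataset_may_be_visible(declimit, ralimit, screen_dec, screen_ra):
    """False means the dataset can be skipped without reading it."""
    return (intervals_overlap(*declimit, *screen_dec)
            and intervals_overlap(*ralimit, *screen_ra))

# 6C Radio Sources II: declimit 30 51, ralimit 8.5 17.5
print(dataset_may_be_visible((30, 51), (8.5, 17.5), (-10, 10), (9, 11)))   # False
print(dataset_may_be_visible((30, 51), (8.5, 17.5), (40, 45), (10, 12)))   # True
```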

    The "ralimit" field is unusual, but the "declimit" one is not. For example, most datasets created in the Northern Hemisphere have a southern declination limit (except for neutrino-based observations).

    Adding proper motions to your dataset

    You've seen above how to tell Guide how to extract RA/dec data from each line, handling assorted formats, units, and precisions. If the dataset also gives proper motions, you may want to specify those as well, so that Guide will plot the stars exactly where they ought to be at the currently set time (and will give corrected positions in More Info.) A few keywords are provided to allow this to be done. For an example, look at the file gsc22.tdf and look at the section for the "UCAC-2 data (downloaded from VizieR)":

    file u2.dat
    title UCAC-2 data (downloaded from VizieR)
    RA H  11  11
    units0 -2   # RA 'hours' are really decimal degrees
    ra p  90   7
    pm_ra_unit .001   # proper motions are in .001 arcsec/year units...
    de d  24  11
    de p 100   7
    pm_de_unit .001   # ...for both RA and dec

    The proper motion in RA appears in columns 90-96, in units of milliarcseconds/year. ra p 90 7 and pm_ra_unit .001 handle this. Similarly, the proper motion in declination appears in columns 100-106, and in the same units, and the last two lines above specify that.

    One slight wrinkle: a few datasets (not very many) store the RA proper motion in "seconds of RA", a unit that becomes smaller closer to the celestial poles. This has a couple of unpleasant side effects. Measured in seconds of RA, the RA proper motion tends toward infinity at the poles, and doesn't bear any intuitive relationship to the actual motion of the star. If you run into such a case, you should still use ra p to specify the proper motion in RA, but should add the keyword pm_ra_sec somewhere in the dataset specification, so Guide knows to "adjust" the proper motion in RA. The only current example of this appears in ucac2.tdf, the TDF for showing UCAC-2 data from the "raw" files.

    In that case, the units used are particularly perverse: they are equal to .0001 arcsecond/year at the celestial equator. Since one second = 15 arcseconds, this means that the unit of proper motion is (.0001/15) = .0000066666 seconds/year, and the pm_ra_unit line in ucac2.tdf dutifully reflects this oddity.
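
    The arithmetic can be checked in a couple of lines of Python. The second function uses the standard relation between a proper motion in seconds of RA and its size on the sky (multiply by 15 and by the cosine of the declination); it is included only to show why such values blow up near the poles, and is not a description of what Guide does internally. All names are made up for the example.

```python
from math import cos, radians

ARCSEC_PER_SECOND_OF_RA = 15.0

# .0001 arcsec/year at the equator, expressed in seconds of RA per year
unit_seconds_of_ra = 0.0001 / ARCSEC_PER_SECOND_OF_RA
print(unit_seconds_of_ra)

def ra_seconds_to_sky_arcsec(pm_seconds, dec_degrees):
    """Convert an RA proper motion in seconds of RA to arcseconds on the sky."""
    return pm_seconds * ARCSEC_PER_SECOND_OF_RA * cos(radians(dec_degrees))

# The same "1 second of RA" covers less sky at higher declinations.
print(ra_seconds_to_sky_arcsec(1.0, 0.0), ra_seconds_to_sky_arcsec(1.0, 60.0))
```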

    Yet another way of representing proper motions appears at times: in units of total motion and position angle. Also, it's assumed in all of this that the base positions are in J2000; in some cases, each object may have its own epoch (and the proper motions should be added relative to that epoch), and there are older datasets where each object can have an RA and a dec given in different epochs. Let me know if you encounter such datasets and would like to display them. I haven't made that possible yet simply because such oddities seem most common in older datasets, and no one has (yet) expressed an interest in the capability.

    Adding "More Info" and "click" data for your dataset

    You will notice that each dataset also has a few lines starting with a tilde (~), followed by "c", "r", or "b". Each of these lines involves showing some data when the object is clicked on, when you get remarks ("more info"), or both, respectively.

    After the "~(letter)", two numbers are given: the starting column and length (as was true for most of the fields already discussed). Guide first checks to make sure that this field is not blank. If it is indeed not blank, Guide shows that field, using the remainder of the "~" line to decide what the format should be.

    For example, the following line (from the quasar dataset):

    ~r 46   5 ^Color index^ (B-V): %s\n

    tells Guide that the quasar dataset stores color index data in columns 46-50. If Guide finds data in those columns (this is not a "given", since not all quasars have had their color indices measured), then Guide will show "Color index (B-V):", the color index, and then skip to a new line (that's what the '\n' means). Because this is a "~r" line, Guide will only show this data in the Remarks, i.e., when you click for more info.
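
    The behavior described above can be mimicked in a few lines of Python, which may help when working out what a "~" line will show. This is a sketch only: the function name is made up, the sample record is invented, and trimming blanks from the value before display is an assumption rather than something stated here about Guide.

```python
def render_info_line(record, start, length, template):
    """Imitate a '~r start length template' line for one record."""
    value = record[start - 1:start - 1 + length]
    if not value.strip():
        return ""                      # blank field: show nothing
    text = template.replace("^", "")   # carets only mark glossary terms
    text = text.replace("\\n", "\n")   # '\n' in the .TDF means a new line
    return text.replace("%s", value.strip())

record = " " * 45 + " 0.35"            # color index in columns 46-50
print(render_info_line(record, 46, 5, r"^Color index^ (B-V): %s\n"))
```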

    Because "Color index" is in carets (^), Guide will show that text highlighted; when you click on it, you'll get a glossary definition.

    Some datasets store data as special flags. For example, in the quasar data, column 22 can contain an "A", "O", "R", or just a blank space. Each flag has a meaning. The following lines in the .TDF file translate:

    ~r 22 0 A Position is of low accuracy\n
    ~r 22 0 O Position was found optically,  and is good to 1" or better\n
    ~r 22 0 R Position was found by radio,  and is good to 1" or better\n

    As it stands, nothing is shown if column 22 is blank, but using

    ~r 22 0   Position is not of low accuracy\n

    (or something similar) would fix that problem.

    All this has limitations if you want to, for example, test two or more columns instead of just one. Some useful additions have been made to Guide 8 for this sort of work; to take advantage of them, you do need to have the current version of the Guide 8 software. These new features are used, for example, for the XZ80P occultation dataset. Consider, for example, this line from the TDF:

    ~b117 = 'H' HIP %[118,6]\n

    This was used because, in the XZ80P dataset, if column 117 is an 'H', then a Hipparcos number is given in columns 118-123. The above line simply says that if column 117 is indeed an 'H', the Hipparcos number should be displayed in the first "short info" box and in the "more info" section.

    One can also now test against more than just one column, and check for "greater than or less than" instead of just "is equal to". For example, this line from the same .TDF:

    ~r178 > '10'  XZ80N (see XZ80N documentation for detail)\n

    says that if bytes 178 and 179 form a value greater than 10, show the given text in the "more info" section.

    It wasn't needed in this dataset, but one can also do tests such as

    ~r178 >='10'  Columns 178-179 are greater than or equal to 10\n
    ~r178 <>'10'  Columns 178-179 aren't equal to 10\n
    ~r178 <>*1'*  Columns 178-179 aren't equal to 1'\n

    ...with the last illustrating how to handle situations where the test is against something containing a single-quote: Guide looks at the byte in column 9 and hunts ahead until it finds an identical byte.
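
    The delimiter rule can be sketched in Python as follows. The function name is invented, and the sketch assumes only what is stated above: the delimiter is whatever byte sits in column 9 of the "~" line, and the test value runs up to the next copy of that byte.

```python
def parse_test_value(tilde_line):
    """Return the text being compared against in a '~' test line."""
    delimiter = tilde_line[8]               # column 9, counting from 1
    end = tilde_line.index(delimiter, 9)    # next identical byte
    return tilde_line[9:end]

print(parse_test_value(r"~r178 >='10'  Columns 178-179 are greater than or equal to 10\n"))
print(parse_test_value(r"~r178 <>*1'*  Columns 178-179 aren't equal to 1'\n"))
```

    The first call gives 10 and the second gives 1' (with the single-quote intact), since the delimiter in the second line is an asterisk.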

    Adding note files for .TDF datasets

    It is also possible to add note files for a .TDF dataset. There is only one example available, for the Binary Star data (BINORBIT.DAT), the last dataset defined in CD_DATA.TDF.

    You'll notice that the format description for BINORBIT contains this line:

    ~n  2  15 binorbit.not

    In plain language, this means "Notes for this dataset are found in the file BINORBIT.NOT, and are indexed using the fifteen characters found starting at byte #2 in lines from BINORBIT.DAT."

    This example was chosen because the binary orbit dataset already provided notes for most of the stars, indexed by their RA/dec values (which are given in BINORBIT.DAT in columns 2-16). You will see that the .NOT file itself is of the sort common in Guide: the object is specified with a tilde (~) plus the object identifier, and then text is given for that object.

    Handling languages other than English

    There are a couple of ways of doing this. One is to ignore the problem; if you do this, Guide will still display the dataset when you switch to other languages... but it will always show it in English.

    The alternative is to provide, in addition to a .TDF file, a .*DF file. For example, you'll see that, in addition to CD_DATA.TDF, Guide has a CD_DATA.FDF (for French), CD_DATA.IDF (for Italian), and so forth. The letters for the various languages are listed under the file keyword in the section on additional .TDF keywords. When you toggle to French, Guide first notices that there is a CD_DATA.TDF file; but it then asks itself, "Is there a CD_DATA.FDF file? Yes, there is... we'll use that instead."
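
    In Python terms, the lookup just described might be sketched like this. It is only an illustration of the naming rule: the function name is made up, the use of an uppercase letter in the extension is an assumption, and the exists parameter is there so the example can be tried without any real files.

```python
import os

def localized_tdf(tdf_name, language_letter, exists=os.path.exists):
    """Pick the language-specific .?DF file if there is one."""
    base, ext = os.path.splitext(tdf_name)            # "CD_DATA", ".TDF"
    candidate = base + "." + language_letter.upper() + ext[2:]
    return candidate if exists(candidate) else tdf_name

# Pretend that only the French file exists.
only_french = lambda path: path == "CD_DATA.FDF"
print(localized_tdf("CD_DATA.TDF", "f", exists=only_french))   # CD_DATA.FDF
print(localized_tdf("CD_DATA.TDF", "i", exists=only_french))   # CD_DATA.TDF
```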

    In some cases, you can make use of a little trick with the file keyword to handle other languages; this is described under the file keyword in the section on additional .TDF keywords. Please note that this is a very limited method!

    Making lines and rectangles in .TDFs

    Occasionally, one may encounter a dataset where two RA/dec points are given on each line, and one would like to have Guide show them with a line connecting them. (For example, a list of observed meteors, where for each meteor, the RA/decs of the "start" and "end" of the trail are given.) Or, perhaps, the two RA/dec points describe opposite corners of a rectangle. (For example, a list of star atlas pages.) With the updated version of Guide 8, one can display such datasets.

    Some additional .TDF keywords

    de d, de m, de s: Used to indicate the columns where declination degrees, minutes, and seconds will be found. In geographic mode, it indicates where latitude degrees, minutes, and seconds will be found. You must specify at least a declination degrees field, and there are a few cases where this is the only one needed. More commonly, though, there will also be minutes and seconds to deal with.

    file: Normally, this just indicates the filename of the dataset to be shown. There are two subtleties. For an example of the first, look at CD_DATA.TDF. The first line indicates the filename for the quasar dataset:

    file !:\radio\quasars\table1.dat

    The initial '!:\' indicates that the data is to be found on the CD-ROM; Guide is bright enough to replace it, as needed, with the actual path to the CD.

    The second subtlety allows Guide to use a different file if one is in a particular language. For an example, take a look at the first few lines of the CONSTELL.TDF file, which lets you display full constellation names in Guide.

    file !:\text\constell.nam
    file d !:\text\constell.nad
    file i !:\text\constell.nai
    file s !:\text\constell.nas
    file c constell.nac
    title Constellation labels

    By default, Guide will use the CONSTELL.NAM file; but if you're running in German ('d'='Deutsch'), it will use CONSTELL.NAD; or CONSTELL.NAI in Italian, and so on. (The Russian CONSTELL.NAC appears in your local Guide directory.) The full list of letters for each language is:

    English        e
    German         d
    French         f
    Italian        i
    Spanish        s
    Japanese       a
    Dutch          b
    Russian        c

    is geo: Indicates that this dataset is to be drawn only on maps of the earth. (At present, that means only in eclipse mode... but at some point, I'd like to use terrestrial maps for more than just eclipses and occultations.) Examples are display of MPC observatories, the (very rough) locations of Guide users over the earth (used in the map of the distribution of Guide users on the main page of this site), and some US cities. Most recently, it's been used for the display of two million place names worldwide on eclipse/occultation charts.

    pa: Used when the object has a 'major axis' and a 'minor axis'; pa indicates the columns where the position angle of the major axis will be found. It's assumed that this will be in decimal degrees. For an example of its use, look at \TEXT\IRAS.TDF on the Guide CD-ROM, or at the "UCAS Galaxies" definition in CD_DATA2.TDF in your Guide directory. See also size, siz2.

    pref: This keyword stands for "prefix", and is used in CD_DATA2.TDF for the description of the pulsar file. In that case, the lines:

    pref PSR
    # Above line ensures that,  _for labelling_,  a 'PSR' is inserted
    # as a prefix

    do exactly what the comment suggests.

    resize: This is used to scale the result of the size field. It has two uses. First, remember that the size field assumes a radius rather than a diameter. If your dataset gives a diameter instead, the line

    resize .5

    will have the desired effect. (This is used in the description of the Caldwell data in CD_DATA2.TDF.)

    Second, it can be handy if you get a size in odd units such as radians, or (on the earth) kilometers or miles.

    RA H, RA M, RA S: Used to indicate the columns where RA hours, minutes, and seconds will be found. In geographic mode, it indicates where longitude degrees, minutes, and seconds will be found. You must specify at least an RA hours field, and there are a few cases where this is the only one needed. More commonly, though, there will also be minutes and seconds to deal with.

    siz2: Used when the object has a 'major axis' and a 'minor axis'; siz2 indicates the columns where the minor axis will be found. It's assumed that you have used a size, sizm, or sizs line to indicate a major axis in degrees, minutes, or seconds; it's also assumed that the minor axis is in the same units as the major axis. For an example of its use, look at \TEXT\IRAS.TDF on the Guide 7.0 CD-ROM, or at the "UCAS Galaxies" definition in CD_DATA2.TDF in your Guide directory. See also size, pa (position angle).

    size, sizm, sizs: Used to indicate the columns where the object radius in degrees, minutes, and seconds will be found. Usually, you'll only use one of the three; but if an object size is given in mixed units, you can use two or three of them. See also siz2, resize.

    sort: Omitted or set to zero, it indicates that the data isn't in any particular RA or declination order. The most common sort order is 1 (ascending order in RA), but -1 (descending order in RA), 2 (ascending order in declination) and -2 (descending order in dec) also happen on rare occasion.

    Use this keyword on a dataset that isn't actually sorted in that order, and Guide will usually be erratic in displaying it (i.e., it'll show up sometimes, but not always.)

    toler: The 'tolerance' keyword is most useful for extended objects, such as Palomar plates and atlas pages. Consider the description for the Millennium Star Atlas (in CD_DATA2.TDF), which contains this line:

    toler 4  # Page center must be within 4 degrees of a screen edge

    Without this keyword, if the center of a Millennium page was off the screen, Guide would skip it. With this keyword, though, any page within 4 degrees of the edge will be examined, and no pages are mysteriously omitted.

    units0... units5: These six keywords correspond to "RA H", "RA M", "RA S", "de d", "de m", "de s", respectively. Their first use was to indicate implicit decimal points. For example, look at the description for the Washington Double Star catalog in CD_DATA2.TDF. You'll notice the line

    units1 1    # RA minutes are in tenths,  decimal point omitted

    In truth, this is the only case so far where I've found that a dataset omitted a decimal point. But "units1 1" means "shift the decimal point over one place"; "units1 3" would mean that the RA minutes were in thousandths.
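
    The decimal-point shift is simple enough to show in one line of Python. The function name is made up, and the sample values are invented; only the "shift by N places" rule comes from the text above.

```python
def apply_implicit_decimal(raw_text, places):
    """Interpret digits whose decimal point was omitted, as with 'units1 1'."""
    return int(raw_text) / 10 ** places

print(apply_implicit_decimal("123", 1))     # 12.3 minutes (tenths)
print(apply_implicit_decimal("12345", 3))   # 12.345 minutes (thousandths)
```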

    Since then, the units keyword has been expanded to handle units in radians or decimal degrees. See the description for the pulsar data in CD_DATA2.TDF for an example of this; in this dataset, the RA is in decimal degrees, and the units0 -2 keyword indicates this.