About Gedcom Files

by Phil Albro


Those of us who have been doing our genealogy work on the computer for years often take it for granted that everyone knows what GEDCOM files are and how they can be used. Well, nobody was born knowing about gedcoms, and nobody ever will know anything about them unless somebody provides some explanation! This article is being written to provide a little practical, simple-minded information about these useful files.

The name "gedcom" stands for Genealogical Data Communication. It is a standardized format (a format is a set of rules) for writing computer files containing family tree information, so that the information can easily be communicated to others. The format was developed by the Church of Jesus Christ of Latter-day Saints (the LDS Church) to assist in their genealogical research, and it has been adopted by nearly every other group that produces genealogy software. As long as a computer (a Mac, a PC running Windows, DOS, or Unix, a server, and so on) has an up-to-date genealogy program installed, it should be able to read any gedcom file. So you can convert your family tree into a gedcom file on a PC running Windows 98, email it as a file attachment to someone with an iMac, and they will be able to read it in whatever genealogy software they use.

The gedcom file can contain all your names, dates, marriage linkages, notes about each person, source information, the whole collection. At present, the main limitation of gedcom files compared to the native file types of a given genealogy program is that they can't contain images (pictures). I predict that some day these will be possible inclusions in gedcoms too.
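For the curious, a gedcom is plain text with one item per line. Each line starts with a level number, then an optional cross-reference ID such as @I1@, then a short tag (NAME, BIRT, DATE and so on), then a value. The Python sketch below shows one way to split such lines apart. The sample people and dates are invented, and the function names parse_line and parse_gedcom are made up for illustration; real files have more wrinkles than this handles.

```python
# A tiny, invented gedcom fragment (the person and dates are made up).
SAMPLE = """\
0 HEAD
1 GEDC
2 VERS 5.5
0 @I1@ INDI
1 NAME John /Smith/
1 SEX M
1 BIRT
2 DATE 12 MAR 1844
2 PLAC Providence, Rhode Island
0 @F1@ FAM
1 HUSB @I1@
0 TRLR
"""


def parse_line(line):
    """Split one gedcom line into (level, xref, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record lines look like: 0 @I1@ INDI
        xref = parts[1]
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        # Ordinary lines look like: 2 DATE 12 MAR 1844
        xref = None
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


def parse_gedcom(text):
    """Parse every non-blank line of a gedcom text."""
    return [parse_line(line) for line in text.splitlines() if line.strip()]


# Show the names found in the sample.
for level, xref, tag, value in parse_gedcom(SAMPLE):
    if tag == "NAME":
        print(value)
```

The slashes around Smith are how a gedcom marks the surname.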

There are various utility programs, most available as freeware or shareware, to manipulate your gedcom file and do all sorts of useful things before you send it off. One utility program will split a big gedcom into two smaller ones, to make it easier to send in email. Another will strip out all information about people who are still living. Another will convert upper case names and places into lower case or vice versa. Of course, there are utility programs that simply read gedcom files and display them on the screen.
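To give a feel for how simple some of these utilities can be, here is a minimal Python sketch of the case-conversion idea. It is not the code of any program mentioned in this article, and the function name is made up. It assumes that names and places sit on lines tagged NAME and PLAC.

```python
def title_case_names_and_places(lines):
    """Return new gedcom lines with NAME and PLAC values in Title Case."""
    out = []
    for line in lines:
        parts = line.split(" ", 2)
        # Only touch lines shaped like "<level> NAME <value>" or "<level> PLAC <value>".
        if len(parts) == 3 and parts[1] in ("NAME", "PLAC"):
            parts[2] = parts[2].title()
        out.append(" ".join(parts))
    return out


converted = title_case_names_and_places([
    "0 @I1@ INDI",
    "1 NAME JOHN /SMITH/",
    "2 PLAC PROVIDENCE, RHODE ISLAND",
])
for line in converted:
    print(line)
```

A plain title-case rule is crude: it would turn MCDONALD into Mcdonald, so a real utility needs extra rules for names like that.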

I would like to recommend a few specific utility programs that you should own if you expect to do a lot of work with gedcoms. The easiest way to obtain these is to do a search for the file names on the internet. Some of the "hits" will be places from which you can download the files. Another option is to go to SurnameWeb and see a list of links to sites that have gedcom utilities available. I am only going to mention programs I've used - there are many others, but I don't want to recommend something I haven't tried.

GEDCLEAN and GEDCLEAN32 are programs that remove the information about living people from a gedcom file. You can use a variety of criteria to decide who is living; what I like about them is that they let you override the determination for any individual, so you could leave specific living individuals in there if you have their permission. GEDSPLIT is the program that makes multiple small gedcoms out of one big one. GEDCOM FAMILIES 1.1 is a free utility that runs on the Mac, to convert a gedcom file into an ordinary ASCII text file. GENVIEWER in its most recent incarnation lets people who don't have a family tree program read, search, display, and print the contents of a gedcom file. Moreover, it will attach part of itself to the gedcom so you can send it as a "self viewing" file to someone who has no family tree software. GEDCAPS is the utility that converts between upper and lower case names and places.
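The "who is living?" decision can be sketched in a few lines of Python. This is only an illustration of the idea, not how GEDCLEAN works. The rule used here is an assumption: a person counts as possibly living if the record has no DEAT (death) entry and the birth year is either missing or within the last 100 years. The names likely_living, max_age, and cleared are made up; cleared stands for living people who have given you permission to leave them in.

```python
import re


def likely_living(lines, current_year, max_age=100, cleared=()):
    """Return the IDs of individuals who may still be living (assumed rule)."""
    people = {}     # ID -> {"dead": bool, "born": year or None}
    current = None  # ID of the individual record being read
    event = None    # most recent level-1 tag, e.g. BIRT or DEAT
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if level == 0:
            event = None
            current = None
            if len(parts) == 3 and parts[2] == "INDI":
                current = parts[1]
                people[current] = {"dead": False, "born": None}
        elif current is None:
            continue  # inside a family, source, or header record
        elif level == 1:
            event = parts[1]
            if event == "DEAT":
                people[current]["dead"] = True
        elif level == 2 and event == "BIRT" and parts[1] == "DATE" and len(parts) == 3:
            year = re.search(r"\b(\d{4})\b", parts[2])
            if year:
                people[current]["born"] = int(year.group(1))
    living = []
    for xref, info in people.items():
        if xref in cleared or info["dead"]:
            continue  # manual override, or known to have died
        if info["born"] is None or current_year - info["born"] <= max_age:
            living.append(xref)
    return living


# Invented sample records.
SAMPLE_PEOPLE = """\
0 @I1@ INDI
1 NAME Old /Ancestor/
1 BIRT
2 DATE 12 MAR 1844
0 @I2@ INDI
1 NAME Young /Cousin/
1 BIRT
2 DATE 3 JUL 1950
0 @I3@ INDI
1 NAME Late /Uncle/
1 BIRT
2 DATE 1920
1 DEAT
2 DATE 1990
0 @I4@ INDI
1 NAME Unknown /Dates/
0 TRLR
""".splitlines()

print(likely_living(SAMPLE_PEOPLE, 2001))
print(likely_living(SAMPLE_PEOPLE, 2001, cleared={"@I2@"}))
```

Treating a missing birth date as "possibly living" errs on the side of privacy. Actually removing the flagged people from a file takes more work, because the family records that point to them must be cleaned up too.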

One possible problem you may run into is that if a gedcom file was made according to the most recent set of rules, and you have an older program to read it, there may be fields (categories of data) that your software doesn't understand. It will still read the file, and pull in everything it DOES understand, but it will skip the parts that are alien to it. If you find that happening a lot, you might consider downloading the latest free genealogy software provided by the Church of Jesus Christ of Latter-day Saints at their web site (www.familysearch.org).
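The skipping behavior described above can be pictured with a short Python sketch. It is an illustration only, not the logic of any particular genealogy program. The list of known tags is an arbitrary assumption, the function name read_known is made up, and the tag _XYZ is an invented example of a field an older program would not recognize. When a line has an unknown tag, the lines nested beneath it are skipped along with it.

```python
# Tags this imaginary older program understands (an assumed, partial list).
KNOWN_TAGS = {
    "HEAD", "INDI", "NAME", "SEX", "BIRT", "DEAT", "DATE", "PLAC",
    "FAM", "HUSB", "WIFE", "CHIL", "TRLR",
}


def read_known(lines, known=KNOWN_TAGS):
    """Split gedcom lines into (kept, skipped) by whether the tag is known."""
    kept, skipped = [], []
    skip_below = None  # level of the unknown line whose sub-lines we are skipping
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if skip_below is not None and level > skip_below:
            skipped.append(line)  # detail belonging to an unknown field
            continue
        skip_below = None
        if parts[1].startswith("@") and len(parts) > 2:
            tag = parts[2].split(" ", 1)[0]  # record line: 0 @I1@ INDI
        else:
            tag = parts[1]
        if tag in known:
            kept.append(line)
        else:
            skipped.append(line)
            skip_below = level
    return kept, skipped


kept, skipped = read_known([
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "1 _XYZ something newer software wrote",
    "2 DATE 1 JAN 1900",
    "1 SEX M",
])
print(len(kept), "lines kept,", len(skipped), "lines skipped")
```

Looking at the skipped list is a quick way to judge how much of a file your software is ignoring.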

Another problem is fairly specific - for some reason different genealogy programs accept approximate dates in different formats. For example some may use "Abt. 1844" only; others may accept "Bet. 1843 - 1845", and so on. In general, if you receive a gedcom file from a program that uses a different style for approximate dates than the program you're using, all those dates may get skipped. If that happens, you can take the drastic step of opening the gedcom file in an ordinary text editor like Notepad, searching for the skipped entries, and writing the dates into your genealogy program by hand. Or you could download the utility program FIXDATES (which only runs under DOS) and fix the dates.
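If you are comfortable with a little programming, the date repair can also be sketched in Python. This is not how FIXDATES works; it is a minimal illustration that rewrites a few common spellings into the forms the GEDCOM 5.5 standard itself defines for approximate dates, "ABT 1844" and "BET 1843 AND 1845". The function names and the list of spellings handled are assumptions, and your own program may prefer a different style.

```python
import re

# Spellings such as "Abt. 1844", "about 1844", "circa 1844", "ca. 1844".
ABOUT = re.compile(r"^(?:abt|about|circa|ca)\.?\s+(.+)$", re.IGNORECASE)
# Spellings such as "Bet. 1843 - 1845" or "between 1843 and 1845".
BETWEEN = re.compile(r"^(?:bet|between)\.?\s+(.+?)\s*(?:-|and)\s*(.+)$", re.IGNORECASE)


def normalize_approx_date(value):
    """Rewrite an approximate date into the standard ABT / BET ... AND ... form."""
    text = value.strip()
    match = ABOUT.match(text)
    if match:
        return "ABT " + match.group(1).strip().upper()
    match = BETWEEN.match(text)
    if match:
        first = match.group(1).strip().upper()
        second = match.group(2).strip().upper()
        return "BET %s AND %s" % (first, second)
    return text  # anything else is left alone


def fix_date_lines(lines):
    """Apply normalize_approx_date to every DATE line of a gedcom."""
    out = []
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[1] == "DATE":
            parts[2] = normalize_approx_date(parts[2])
        out.append(" ".join(parts))
    return out


print(normalize_approx_date("Abt. 1844"))         # ABT 1844
print(normalize_approx_date("Bet. 1843 - 1845"))  # BET 1843 AND 1845
```

Work on a copy of the gedcom file, never the original, so a bad rewrite can be thrown away.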

Your genealogy software probably has the ability to "import" a gedcom file in addition to (or instead of) merging it with your tree file. In this case it simply "translates" the information in the gedcom into the database format it uses for its own files and saves the results to disk.

As an example: If you happen to be running a genealogy program under Windows, you would open the File menu and select "load" or "open", whatever it calls loading in a new file. It will probably have a box for selecting the type of file to open, and one of those types should be files with .ged extensions in the file name. Select that, navigate to the directory that contains your gedcom file, and select the file. Your program may ask you which fields you want to include - if so, and you don't know what the different fields mean, just select all of them. Your program should do the rest.

Now for some advice, based on bad mistakes I've made in the past!

(1) Never merge a gedcom with your own database until you have a backup copy of the unmerged database file somewhere safe! There can be all sorts of garbage in gedcom files.

(2) Open and read through the gedcom file before you even consider a merge. The name format may be incompatible with what you've chosen for your data, there may be individuals listed that don't link to the others and just fill up space in the file, there may be multiple copies of the same individual that will be much harder to delete after a merge, and there may be individuals also in your database, but with slightly different information so your program won't recognize them as duplicates.

(3) The following has been my common experience - you may be a lucky exception. There are thousands of gedcom files available for downloading on the Internet. They are extremely tempting! You can add thousands of names to your family tree in one fell swoop! But do you know what is the commonest "source" listed for individuals in the on-line gedcoms? Somebody else's gedcom! And if you find that one, their main source was yet somebody else's gedcom! That is never a valid sole source listing. In short, you are importing claims, not data, into your database under these conditions, and the information needed to evaluate the claims won't be there. Obviously, there are exceptions. There are some excellent gedcom files with full, valid source information in them on the Internet. I wish you luck in finding them, but have done my duty in warning you that they are indeed the exception and not the rule.
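For those who like to automate such checks, the three pieces of advice above can be sketched in Python: make a dated backup of your own database, then scan the incoming gedcom for repeated names, for people not linked to any family, and for people with no source listed. Everything here is an assumption made for illustration, including the function names, the "backups" folder, and the simple matching rules. It supplements reading through the file yourself; it does not replace it.

```python
import shutil
import time
from collections import Counter
from pathlib import Path


def backup_before_merge(database_path, backup_dir="backups"):
    """Copy the database file to a time-stamped backup and return its path."""
    src = Path(database_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / (src.stem + "-" + stamp + src.suffix)
    shutil.copy2(src, dest)
    return dest


def inspect_gedcom(lines):
    """Report repeated names, unlinked people, and people with no SOUR entry."""
    names = {}       # ID -> first NAME value seen
    linked = set()   # IDs tied to a family in either direction
    sourced = set()  # IDs with at least one SOUR line in their record
    current = None
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level = int(parts[0])
        if level == 0:
            is_person = len(parts) == 3 and parts[2] == "INDI"
            current = parts[1] if is_person else None
            if current:
                names[current] = ""
            continue
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if tag in ("HUSB", "WIFE", "CHIL"):
            linked.add(value)  # a family record points at this person
        if current is None:
            continue
        if level == 1 and tag == "NAME" and not names[current]:
            names[current] = value
        elif level == 1 and tag in ("FAMC", "FAMS"):
            linked.add(current)
        elif tag == "SOUR":
            sourced.add(current)
    counts = Counter(name.lower() for name in names.values() if name)
    return {
        "duplicate_names": sorted(n for n, c in counts.items() if c > 1),
        "unlinked": sorted(x for x in names if x not in linked),
        "unsourced": sorted(x for x in names if x not in sourced),
    }


# Invented sample: two John Smiths, one of them linked to nothing.
CHECK_SAMPLE = """\
0 @I1@ INDI
1 NAME John /Smith/
1 FAMS @F1@
1 SOUR @S1@
0 @I2@ INDI
1 NAME John /SMITH/
0 @I3@ INDI
1 NAME Mary /Jones/
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I3@
0 TRLR
""".splitlines()

report = inspect_gedcom(CHECK_SAMPLE)
for heading, items in report.items():
    print(heading, items)
```

A matching name is only a hint that two entries may be the same person, and a SOUR line that merely points at somebody else's gedcom still counts as "sourced" here, so the report narrows down what to read rather than settling anything.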

Phil Albro, 28 June 2001.


Copyright © 2001 Phil Albro. Commercial use prohibited.