The Gigatrees GEDCOM 5 Validator (VGedX)

The Gigatrees GEDCOM 5 Validator (VGedX) is a standalone application similar to Gigatrees in configuration and operation, but limited to the validation of GEDCOM 5 files. It can be downloaded here. VGedX is effectively a small wrapper around The GEDCOM 5 Parser (TGP). It validates both the GEDCOM grammar and the GEDCOM dictionary, and displays the parser's validation results. It supports the following GEDCOM versions.

  • GEDCOM 5.5 Rev. 1 (January 2, 1996)
  • GEDCOM 5.5 Rev. 2 (January 10, 1996)
  • GEDCOM 5.5.1
  • GEDCOM 5.6 (Draft)
TGP, upon which Gigatrees and VGedX are based, validates the GEDCOM 5 grammar as defined in the specifications listed above. GEDCOM 5.5.5 made changes to the underlying format, and GEDCOM 7.0 uses another format entirely. As such, TGP cannot be used to validate those formats without significant changes. It is for this reason that I made the source code for TGP available, giving other developers the opportunity to extend its capabilities by providing additional GEDCOM grammar and GEDCOM dictionary support. GEDCOM 6.0 XML is another beast entirely, since it uses an XML format.

Command Line Options

VGedX supports several command line options. The input file and output path are optional here, since they can also be set via the <Main> configuration options shown below. The log is always optional.

Config File
-c config.xml
Loads an individual configuration file.
Input File
-i family.ged
Loads a GEDCOM file.
Output Path
-o web
Sets the output path.
Log
-l build.log
Generates a log file.
Export Test File
-b[v] test.ged
Creates and exports a GEDCOM test file based on the GEDCOM version specified ( -b55R1, -b55R2, -b551, -b56 ).

The distribution file contains all four GEDCOM test files ( gedcoms/55r1.ged, gedcoms/55r2.ged, gedcoms/551.ged and gedcoms/56.ged ) created using the '-b' option. These test files can be used to test VGedX's importing capabilities. They use typical values only, and are therefore not useful for testing boundary cases. The distribution includes an additional file ( gedcoms/test_grammar.ged ) that can be used to test some boundary conditions.
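
As an illustration of how the options listed above fit together, the following Python sketch assembles a command line from them. It assumes that the options can be combined in a single call and that the executable is named vgedx.exe; the helper name build_vgedx_command is hypothetical and is not part of VGedX.

```python
def build_vgedx_command(config=None, gedcom=None, output=None, log=None,
                        exe="vgedx.exe"):
    """Assemble an argument list using the -c, -i, -o and -l options.

    Assumption: the options may be combined in one call. Options left
    as None are simply omitted.
    """
    args = [exe]
    for flag, value in (("-c", config), ("-i", gedcom),
                        ("-o", output), ("-l", log)):
        if value is not None:
            args += [flag, value]
    return args


# Validate the boundary-condition test file shipped with the distribution.
command = build_vgedx_command(gedcom="gedcoms/test_grammar.ged",
                              output="vgedx", log="build.log")
print(" ".join(command))
# prints: vgedx.exe -i gedcoms/test_grammar.ged -o vgedx -l build.log
```

The resulting list can be passed to subprocess.run if you want to script validation runs over several files.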

Configuration

To configure VGedX you must provide the path to your GEDCOM file and the output folder, using either the command line options listed above or the <Main> options below. File and folder paths may be entered as absolute or relative paths. Relative paths are relative to the folder where the application file ( vgedx.exe ) resides, called the working directory.
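
The path rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not VGedX's own code; the function name resolve_path and the folder C:/vgedx are made up for the example.

```python
from pathlib import PureWindowsPath


def resolve_path(entered, app_dir):
    """Return `entered` unchanged if absolute, else relative to `app_dir`.

    `app_dir` stands for the folder holding vgedx.exe (hypothetical here).
    """
    path = PureWindowsPath(entered)
    if path.is_absolute():
        return path
    return PureWindowsPath(app_dir) / path


# A relative path is anchored at the application folder.
print(resolve_path("gedcoms/test_grammar.ged", "C:/vgedx"))
# An absolute path is left as entered.
print(resolve_path("D:/data/family.ged", "C:/vgedx"))
```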

<Main> Options

<GedcomFile>
[ ]
Expects an absolute or relative file path.
<OutputPath>
[ ]
Expects an absolute or relative folder path.

VGedX categorizes its validation results as Errors, Warnings and Alerts (see TGP). Errors cannot be hidden.

<Validation> Options

<ShowValidationWarnings>
[ true ]
Shows validation warnings. It is sometimes useful to turn off warnings temporarily to reduce the size of the table used to hold the validation results.
<ShowValidationAlerts>
[ true ]
Shows validation Alerts.
<ShowUnusedRecords>
[ false ]
Shows records that are not referenced elsewhere in your GEDCOM file, making them effectively unused.
<ShowUserDefinedRecords>
[ true ]
Shows records that start with an underscore ( e.g. _PRIV ). These are not defined by the GEDCOM standards and instead come from various vendors. As such, they may not be recognized by an importing application.
<ShowTrailingDelimiters>
[ true ]
Shows trailing delimiters. Trailing delimiters, such as spaces or tabs, are a violation of the GEDCOM standard, but cause no harm and were prevalent in my export testing of many genealogy applications.
<ShowTagWarningDuplicates>
[ true ]
Often when a particular tag causes a warning, thousands of occurrences of that same tag will as well. If you are only interested in the first occurrence, you can disable this option, reducing the validation table size significantly.
<ShowTagLists>
[ true ]
VGedX builds tag lists for each record ( e.g. INDI@.BIRT.@SOUR ), where an at sign ( @ ) appearing after a tag indicates a level 0 record containing a record id ( INDI, SOUR, OBJE, FAM, etc. ), and an at sign appearing before a tag indicates a reference to another record, containing an xref_id. When disabled, tag lists are not displayed, making it tricky to understand where an issue occurred without looking up the line number in the file. On the other hand, very long tag lists can cause the validation table to display less efficiently.
<ValidationDataWidth>
[ 100 ]
When text is displayed in the validation table, it is limited in length; if cropped, an ellipsis is appended to the text. You can control the width of this text here.
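
To make the tag list notation and the width limit more concrete, here is a small Python sketch. It is not taken from VGedX or TGP. The function names are hypothetical, the role labels are my own, and it is an assumption that the ellipsis is added on top of the configured width rather than counted within it.

```python
def describe_tag_list(tag_list):
    """Split a tag list such as INDI@.BIRT.@SOUR into (tag, role) pairs."""
    described = []
    for part in tag_list.split("."):
        if part.endswith("@"):
            # '@' after the tag: a level 0 record that carries a record id.
            described.append((part[:-1], "record"))
        elif part.startswith("@"):
            # '@' before the tag: a reference to another record (xref_id).
            described.append((part[1:], "reference"))
        else:
            described.append((part, "tag"))
    return described


def crop(text, width=100):
    """Crop text to `width` characters, appending an ellipsis if shortened.

    Assumption: the ellipsis is not counted as part of the width.
    """
    if len(text) <= width:
        return text
    return text[:width] + "..."


print(describe_tag_list("INDI@.BIRT.@SOUR"))
# prints: [('INDI', 'record'), ('BIRT', 'tag'), ('SOUR', 'reference')]
print(crop("Born at home", width=4))
# prints: Born...
```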

Example

In the following example, the GEDCOM file and the output path are both given relative to the working directory.

<Options>

  <Main>
    <GedcomFile> gedcoms/test_grammar.ged </GedcomFile>
    <OutputPath> vgedx                    </OutputPath>
  </Main>
  
  <Validation>
    <ShowValidationWarnings>      true  </ShowValidationWarnings>
    <ShowValidationAlerts>        true  </ShowValidationAlerts>
    <ShowUnusedRecords>          false  </ShowUnusedRecords>
    <ShowUserDefinedRecords>      true  </ShowUserDefinedRecords>
    <ShowTrailingDelimiters>      true  </ShowTrailingDelimiters>
    <ShowTagWarningDuplicates>    true  </ShowTagWarningDuplicates>
    <ShowTagLists>                true  </ShowTagLists>
    <ValidationDataWidth>          100  </ValidationDataWidth>
  </Validation>

</Options>
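
If you generate or edit configuration files by script, it can help to read one back and check its values before running VGedX. The sketch below uses Python's standard xml.etree.ElementTree module on a shortened copy of the example above. It is only a sanity check under the assumption that the file is well-formed XML; it says nothing about how VGedX itself reads its configuration, and the function name read_options is made up.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example configuration.
CONFIG = """
<Options>
  <Main>
    <GedcomFile> gedcoms/test_grammar.ged </GedcomFile>
    <OutputPath> vgedx                    </OutputPath>
  </Main>
  <Validation>
    <ShowValidationWarnings>      true  </ShowValidationWarnings>
    <ShowUnusedRecords>          false  </ShowUnusedRecords>
    <ValidationDataWidth>          100  </ValidationDataWidth>
  </Validation>
</Options>
"""


def read_options(xml_text):
    """Return a dict mapping each option name to its whitespace-trimmed value."""
    root = ET.fromstring(xml_text)
    options = {}
    for section in root:          # <Main>, <Validation>
        for option in section:    # <GedcomFile>, <OutputPath>, ...
            options[option.tag] = (option.text or "").strip()
    return options


options = read_options(CONFIG)
print(options["GedcomFile"])           # prints: gedcoms/test_grammar.ged
print(options["ValidationDataWidth"])  # prints: 100
```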

Sample

The following image shows a portion of the validation report. The GEDCOM file section lists details about the file itself. It is important to make sure that the File Line Count matches the number of lines in the actual file, because some unusual errors can be read as an end-of-file marker, preventing the file from being completely parsed by TGP. A GEDCOM revision is displayed only if detected. In the GEDCOM status section, status types are color coded, and the table can be re-sorted by clicking on any of the table headings.

VGedX - Report
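
To compare the reported File Line Count against the actual file, you can count the lines independently. The following Python sketch is one way to do so. It assumes that lines end in CR, LF or CR LF and that every line is counted, including blank ones, which may differ from how TGP counts. The function name count_lines is hypothetical.

```python
def count_lines(path):
    """Count the lines in a file, treating CR, LF and CR LF as line endings."""
    with open(path, "rb") as handle:
        data = handle.read()
    # bytes.splitlines() splits on \n, \r and \r\n only, and does not
    # produce a trailing empty entry when the file ends with a line break.
    return len(data.splitlines())
```

For example, count_lines("gedcoms/test_grammar.ged") returns the number to compare with the File Line Count shown in the report.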