SQLMatch(tm) Program Overview
A brief overview of the SQLMatch probabilistic linkage program
Record linkage is commonly carried out either by a large and complex program such as
AutoMatch or by a single person using SAS or some other statistical software program.
AutoMatch gives accurate results but is extremely expensive and uses proprietary
commands to accomplish tasks such as data cleaning. SAS is less expensive and
is also commonly used, but it is not a relational database, and the linkage program must be
written from scratch or purchased. So doing true probabilistic linkage with SAS can be
When I was faced with the task of doing probabilistic linkage on several large California Vital
Statistics datasets, neither of these options was feasible. Buying an existing linkage program
was too costly, as was hiring a group of consultants. I did not have time during the work day
to write a probabilistic linkage program, so I developed SQLMatch in my spare time, writing it in SQL
on MS SQL Server as a tool to increase my work output. SQLMatch is a stored procedure
generator program which creates a customized set of SQL scripts that can be run to link any
two datasets using the standard Fellegi and Sunter (1969) probabilistic model in tandem
with the Jaro-Winkler fuzzy matching algorithm. I have used it for several years now on large
datasets with very good results.
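
For readers unfamiliar with the approach, the combination of a Fellegi-Sunter weight with a Jaro-Winkler comparison can be sketched as follows. This is only an illustrative Python sketch, not the SQL that SQLMatch generates. The function names, the 0.9 agreement threshold, and the m and u probabilities are all assumptions chosen for the example.

```python
import math


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings, in the range 0.0 to 1.0."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def field_weight(agree: bool, m: float, u: float) -> float:
    """Fellegi-Sunter weight: log2(m/u) on agreement, log2((1-m)/(1-u)) otherwise."""
    return math.log2(m / u) if agree else math.log2((1 - m) / (1 - u))


def match_score(rec_a: dict, rec_b: dict, params: dict, threshold: float = 0.9) -> float:
    """Sum field weights; a field 'agrees' when Jaro-Winkler >= threshold (assumed rule)."""
    total = 0.0
    for field, (m, u) in params.items():
        agree = jaro_winkler(rec_a[field], rec_b[field]) >= threshold
        total += field_weight(agree, m, u)
    return total


# Hypothetical m/u probabilities, for illustration only.
params = {"first": (0.95, 0.01), "last": (0.95, 0.005)}
a = {"first": "MARTHA", "last": "JOHNSON"}
b = {"first": "MARHTA", "last": "JOHNSON"}
c = {"first": "ROBERT", "last": "SMITH"}
print(match_score(a, b, params), match_score(a, c, params))
```

In this sketch, a record pair with a small typing error in one name still earns the full agreement weight for that field, whereas an exact-match comparison would count it as a disagreement. Pairs are then classified as links, possible links, or non-links by comparing the total score to cutoff values.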
The user interface program is written in Microsoft Visual Basic .NET 2005 and is easy to
install and use.
Please download the user manual for a complete description of SQLMatch.