Record of Interview
Participants
State:
Dave Williams
Cathy Baskay
GAO:
Judy McCloskey
Jody Woods
Kate Brentzel
Gabrielle Anderson
Richard Hung
Architectural Concepts of Name Searching
Mr. Williams informed us that there are five basic questions that need to
be answered when constructing a namecheck system.
The first question is whether or not each namecheck query will consult all
the records in the system. In the case of CLASS, there would clearly not
be enough capacity for all 6 million records in the system to be scanned
during one namecheck, especially with many namechecks being
conducted each day.
The fifth question concerns the way in which the resulting hit list will be
ordered, e.g., with exact name matches first, or with CLASS I hits first?
Namechecking Techniques
Mr. Williams ran through some of the techniques that may be used in
order to run a namecheck system. He stressed that no system will use just
one of these techniques and that each technique should be considered as a
tool. Since each technique has its strengths and weaknesses, a good
namecheck system will combine a variety of them in order to achieve the
best possible results. He also stressed that visa adjudication involves a
good deal of subjective decision-making on the part of the consular
officer.
Synonym Association:
This technique can be used with several of the namecheck fields. For
example, synonym association can be used to establish a relationship
between the name "Joe" and its derivations such as Joey, Jose, Joseph,
Giuseppe, etc. Thus, a search for Joe would turn up not only persons with
this exact first name but also those whose name was one of these
derivatives. In the case of country, Russia has been equated with all of the
former Soviet republics, so that a search for "Russia" will result in initial
hits on any of the current independent republics, e.g., Azerbaijan, Belarus,
Estonia, Georgia, etc. Additionally, the synonym association technique
ensures that surname qualifiers (e.g., Van, De, Al, etc.) are separated out
when namechecks are performed.
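
The lookups described above can be sketched as simple table expansions. This is a minimal illustration only: the table contents are taken from the examples in this record, and the names GIVEN_NAME_SYNONYMS, COUNTRY_SYNONYMS, SURNAME_QUALIFIERS, expand, and split_qualifiers are hypothetical, not part of CLASS.

```python
# Hypothetical tables built from the examples in the interview record.
# The actual CLASS tables are not described in this document.
GIVEN_NAME_SYNONYMS = {
    "joe": {"joe", "joey", "jose", "joseph", "giuseppe"},
}
COUNTRY_SYNONYMS = {
    "russia": {"russia", "azerbaijan", "belarus", "estonia", "georgia"},
}
SURNAME_QUALIFIERS = {"van", "de", "al"}


def expand(term, table):
    """Return every value the table treats as equivalent to term.

    A term with no table entry expands only to itself.
    """
    key = term.strip().lower()
    return table.get(key, {key})


def split_qualifiers(surname):
    """Separate surname qualifiers (Van, De, Al, ...) from the core surname."""
    parts = surname.replace("-", " ").lower().split()
    qualifiers = [p for p in parts if p in SURNAME_QUALIFIERS]
    core = [p for p in parts if p not in SURNAME_QUALIFIERS]
    return qualifiers, " ".join(core)
```

In this sketch a search on "Joe" would be run against every name in expand("Joe", GIVEN_NAME_SYNONYMS), and a surname such as "Al-Jiddi" would be matched on its core component after the qualifier is split off.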
Position Discounting:
This technique allows you to determine how many of the bi-gram or tri-
gram hits fall into the same position as they do in the desired name. For
example, a namecheck on "Wilson," using a simple bi-gram analysis,
would return "Sonils" as a hit (since 4 of the 7 bi-grams in these names
match). However, when position discounting is used along with the bi-
gram analysis, "Sonils" is rejected as a hit, since none of the matching bi-
grams in "Sonils" occupy the same positions as they do in "Wilson."
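
The "Wilson"/"Sonils" example can be reproduced with a short sketch. It assumes each name is padded with one boundary marker on either side, which is one way to arrive at the 7 bi-grams cited above; the record does not say how CLASS actually forms its bi-grams. The function names are hypothetical, and the positional test here requires an exact position match, whereas a production system might only discount, rather than reject, nearby positions.

```python
def bigrams(name):
    """Return the bi-grams of a name, padded with '_' at both ends (an assumption)."""
    padded = "_" + name.lower() + "_"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def shared_bigrams(query, candidate):
    """Count bi-grams of the query that occur anywhere in the candidate."""
    candidate_grams = set(bigrams(candidate))
    return sum(1 for gram in bigrams(query) if gram in candidate_grams)


def positional_matches(query, candidate):
    """Count bi-grams that match and also sit at the same position in both names."""
    return sum(1 for a, b in zip(bigrams(query), bigrams(candidate)) if a == b)


# Simple bi-gram analysis: 4 of the 7 bi-grams in "Wilson" appear in "Sonils".
print(shared_bigrams("Wilson", "Sonils"), "of", len(bigrams("Wilson")))
# With position taken into account, none of them line up.
print(positional_matches("Wilson", "Sonils"))
```

Under these assumptions the shared bi-grams are "il", "ls", "so", and "on", and none of them occupies the same position in both names, so a positional check would reject "Sonils".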
Component Comparison:
This technique assigns a value to surname endings based on the likelihood
that a surname with a particular ending belongs to someone from a
particular country. For example, the Russian surname ending in "-ichna"
is assigned a value of 0.93, indicating that there is a 93% likelihood that a
person whose surname ends in "-ichna" is from a Russian-speaking or
Slavic country. In that case, the most appropriate algorithm to use
is the Russian/Slavic algorithm.
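
A minimal sketch of this selection step follows. Only the "-ichna" value of 0.93 comes from the interview; the table name, the function name, the 0.5 threshold, and the fallback to the generic algorithm are assumptions made for illustration.

```python
# Surname ending -> (algorithm, likelihood). Only the "-ichna" entry is from
# the interview record; a real table would hold many more endings.
SUFFIX_LIKELIHOOD = {
    "ichna": ("Russian/Slavic", 0.93),
}


def choose_algorithm(surname, threshold=0.5):
    """Pick the algorithm whose surname ending matches with the highest likelihood.

    Falls back to ("generic", 0.0) when no ending reaches the threshold.
    The threshold value is a hypothetical parameter, not a CLASS setting.
    """
    name = surname.lower()
    best = ("generic", 0.0)
    for suffix, (algorithm, likelihood) in SUFFIX_LIKELIHOOD.items():
        if name.endswith(suffix) and likelihood >= threshold and likelihood > best[1]:
            best = (algorithm, likelihood)
    return best
```

For a surname ending in "-ichna" this returns the Russian/Slavic algorithm with its 0.93 likelihood; a surname with no listed ending falls through to the generic algorithm.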
Cultural Regularization:
This technique involves transliterating a name from its foreign alphabet
spelling into the many forms it could take using the Roman alphabet.
(e.g., Qadafi, Khadafi, Cadhafi, etc.) This ensures that one spelling of
Al-Jiddi Namecheck
We asked Mr. Williams about the Al-Jiddi namecheck done earlier this
year by the U.S. Consulate in Montreal. They ran a namecheck on
Al-Jiddi, a known Al-Qaeda terrorist, entering in his known name, country of
birth, estimated date of birth, and current nationality. This did not result
in a hit. Only after country of birth and nationality were left blank did the
system return a CLASS II hit for Al-Jiddi.
Mr. Williams gave the likely reason for this. When setting up the
namecheck system as a whole, one of the first problems that must be
addressed is establishing the criteria that will determine which records
(out of 6 million) will be checked. This is Phase I of the search, i.e., when
CLASS establishes a searchable subset of the 6 million total names. One
of the most important criteria used in Phase I is the country field. In
Phase I, the country field is analyzed using country-relationship tables.
These tables indicate the likelihood that a person from the country
entered in the search will also possess biographical data from another
country. The country-relationship tables in CLASS do not indicate that a
person of Canadian citizenship is likely to have a Tunisian background.
Hence, Al-Jiddi's record was thrown out in Phase I, i.e., it was not
included in the subset of names that were then searched. Once the
country fields were left blank, the country-relationship tables were not
used to establish a subset and therefore Al-Jiddi's record was returned as
a hit.
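
The Phase I behavior Mr. Williams described can be sketched as a country-based pre-filter. Everything named here (COUNTRY_RELATIONS, phase_one_subset, the record layout) is hypothetical, and the table is reduced to a yes/no relationship rather than a likelihood. The sketch also ignores the other Phase I criteria, so a blank country simply keeps every record.

```python
# Hypothetical country-relationship table: for each country entered in a
# query, the record countries considered related to it. As in the case
# described above, Canada is not related to Tunisia.
COUNTRY_RELATIONS = {
    "canada": {"canada"},
    "tunisia": {"tunisia"},
}


def phase_one_subset(records, query_countries):
    """Return the records that survive the Phase I country filter."""
    entered = [c.lower() for c in query_countries if c]
    if not entered:
        # Country fields left blank: the relationship tables are not used.
        return list(records)
    allowed = set()
    for country in entered:
        allowed |= COUNTRY_RELATIONS.get(country, {country})
    return [r for r in records if r["country"].lower() in allowed]


records = [{"name": "Al-Jiddi", "country": "Tunisia"}]
print(phase_one_subset(records, ["Canada"]))  # record dropped before matching
print(phase_one_subset(records, [""]))        # record kept for matching
```

The point of the sketch is that a record excluded in Phase I never reaches the name-matching algorithms, no matter how closely the name matches.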
CLASS
There are about 4 major CLASS releases each year, e.g., screen changes,
table changes, or new algorithms. Posts have access to the same
algorithms that exist at headquarters. The algorithms currently running in
CLASS are: Russian/Slavic; Arabic; Hispanic; generic; date of birth; and
country of birth. Linguistics teams usually put together four groups of
names to test the various algorithms, but it is important to note that they
cannot test outliers.
Mr. Williams mentioned that on April 22nd, there would be a 4-day CLASS
course for mid-level and senior consular officers and visa managers,
though he admitted that the course might be of some interest to junior
officers as well. The focus of the course would be on the Arabic language
namecheck. Since this course was just starting up, there were still many
questions surrounding it.
The CLASS back-up system is known as BNS. When BNS is in use, posts
can make local updates on their local BNS system. But global changes to
BNS, i.e., incorporating the changes made at individual locations
Mr. Williams asserted that, despite vendor claims to the contrary, facial
recognition techniques are not especially successful. At present, both
facial recognition and fingerprinting run on very limited databases. If
either of these techniques were to become part of a standard identity
check, there would have to be a significant increase in resources to
accommodate the millions of new records. In checking fingerprints, for
example, a turn-around time of a few seconds would be needed. At
present, a fingerprint inquiry sent to the FBI takes 24-48 hours. The
introduction of biometrics would also have a significant impact on
operations at post. Consular officers want to be able to adjudicate a visa
application in the course of one day, or in as little time as possible.