
Advanced Deduplication & Matching

 

greenstone have developed a suite of Advanced Similarity Matching (ASM) software that finally resolves the traditional problems associated with this most complex area of data processing.

 

Our unique software brings more accurate deduplication and matching results than traditional bureau software. It is designed to handle UK consumer and business data and even mixed consumer / business files.

Crucially, and unlike most other advanced matching software, the software is proven to handle very high volumes of data at speeds comparable with some of the simplest software on the market, despite the highly complex techniques it uses to identify similarities between records. greenstone’s ASM software is currently available only to greenstone clients.

The software brings two main benefits: it keeps apart records that matchkey-based software would wrongly bring together as duplicates, and it brings together as duplicates records that most matchkey-based software would never connect.
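To illustrate the difference between the two approaches, here is a minimal sketch in Python. It is not greenstone's ASM software: the matchkey layout, the equal weighting and the placeholder postcode are all assumptions made for the example, and the similarity measure is the general-purpose `SequenceMatcher` from Python's standard library. The record pair is the first of the fabricated samples listed further down this page.

```python
from difflib import SequenceMatcher


def matchkey(surname: str, address: str, postcode: str) -> str:
    """Hypothetical matchkey: first 4 letters of surname, first 4
    characters of address, plus the postcode. Two records are treated
    as duplicates only if their keys are identical."""
    return f"{surname[:4]}|{address[:4]}|{postcode}".upper()


def similarity(a: str, b: str) -> float:
    """Generic string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def record_score(rec_a: dict, rec_b: dict) -> float:
    """Assumed scoring: equal-weight average of surname and address similarity."""
    name_sim = similarity(rec_a["surname"], rec_b["surname"])
    addr_sim = similarity(rec_a["address"], rec_b["address"])
    return 0.5 * name_sim + 0.5 * addr_sim


# "AB1 2CD" is a placeholder postcode, not taken from the sample data.
rec_a = {"surname": "BURCHMOORE", "address": "16A HOLBERN WAY", "postcode": "AB1 2CD"}
rec_b = {"surname": "BIRCHMAW", "address": "16 HOLBEIN WAY", "postcode": "AB1 2CD"}

key_a = matchkey(rec_a["surname"], rec_a["address"], rec_a["postcode"])
key_b = matchkey(rec_b["surname"], rec_b["address"], rec_b["postcode"])

keys_agree = key_a == key_b          # exact-key comparison
score = record_score(rec_a, rec_b)   # graded similarity comparison
```

In this sketch the two matchkeys differ, so an exact-key comparison treats the records as unrelated, while the similarity score stays high enough to flag the pair for consideration as a duplicate.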

Accurate matching and deduplication is probably the most difficult data management process. It is truly an art rather than a science, because different complexities in different files of data present a wide range of issues to overcome. Even with fuzzy matching logic rules applied within standard matching software, clients using such software are applying a technology that has been available for over 15 years and that simply mismatches many records.



 

In addition, different clients have very different views as to whether two records are a “match” or not. For those clients who wish to make use of the real power of greenstone’s matching technology and define their own rules as to what is and what is not a duplicate (at multiple match levels and even for multiple job types), greenstone recommend running our ASM_MatchLab process. Working closely with the client, we run a series of matching tests on client-supplied ‘real’ data, varying and amending the multiple similarity scores and matching rules until the optimum level of “accuracy of match” is achieved for that client, for that job type. These specific rules are then applied as standard to future client jobs of this type.
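As a rough illustration of what per-client, per-job-type rules with multiple match levels could look like, here is a minimal Python sketch. It does not describe the ASM_MatchLab process itself: the class, the job type names, the weights and the thresholds are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchRules:
    """Hypothetical rule set for one client and one job type."""
    name_weight: float
    address_weight: float
    # (level label, minimum combined score), ordered from strictest to loosest
    levels: tuple


def classify(name_sim: float, address_sim: float, rules: MatchRules) -> str:
    """Return the strictest match level whose threshold the pair reaches."""
    score = rules.name_weight * name_sim + rules.address_weight * address_sim
    for label, minimum in rules.levels:
        if score >= minimum:
            return label
    return "no match"


# Assumed example rules: a cautious set for cleaning a customer file and
# a more permissive set for building a campaign file.
RULES_BY_JOB = {
    "customer_file_clean": MatchRules(0.5, 0.5, (("tight", 0.90), ("medium", 0.75))),
    "campaign_file": MatchRules(0.4, 0.6, (("tight", 0.90), ("medium", 0.75), ("loose", 0.60))),
}

# The same pair of similarity scores judged under each rule set.
clean_result = classify(0.56, 0.90, RULES_BY_JOB["customer_file_clean"])
campaign_result = classify(0.56, 0.90, RULES_BY_JOB["campaign_file"])
```

With these assumed values the same pair of records is rejected for the customer file clean but accepted at the "medium" level for the campaign file, which is the kind of difference in view between clients and job types that the tuning exercise is meant to capture.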

In independent tests on real consumer data, greenstone’s software has demonstrated an uplift of more than 16% in matching accuracy on some of the largest, apparently "clean" UK consumer files. On significant volumes of data or in large database systems, improvements of this size can make a significant difference to the bottom line when cleaning customer files or generating campaign files.

Some examples of the records that this unique software will match as standard and that traditional bureau software will simply not find are listed below.

 



In all cases, the postcodes of these sample records are the same. For Data Protection reasons, the names and addresses listed are fabricated:

 

MRS P BURCHMOORE, 16A HOLBERN WAY
matches to
MRS PHYLLIS BIRCHMAW, 16 HOLBEIN WAY
MRS J BALLTREE, 41 FOYLE PARK
matches to
MRS JENNIFER BAWTREE, 41 FOYLE PARK
MR A YACKTREE, 22B ACACIA AVENUE
matches to
TONY YACKTREE, 228 ACACIA AVENUE
MISS HIPSCOMBE, 8 KINGSMILL STREET
matches to
MISS HELEN LIPSCOMBE, 8 KINGSMILL STREET
MRS P VMUNNS, 58 SHEPPARD STREET
matches to
MRS PATRICIA NUNNS, 58 SHEPPARD STREET
MRS L FLETCHER, 46 CULVER STREET
matches to
MRS ELIZABETH LFETCHER, 46c CULVER STREET
MISS K MERGER, 75 BROOK STREET
matches to
MISS KATY METZGER, 75 BROOK STREET
MR G ECKET, 19 SLYLVIA WAY
matches to
MR GRAHAM PECKET, 19 SYLVIA WAY
MRSBLAKE, 45 RUDKIN WAY
matches to
MR STEPHEN LAKE, 45 RUSKIN WAY
MRS BETTY HOARE, FOUR THE LANE
matches to
MRS ELIZABETH WHORE, 4 THE LANE
MRS L JOATS, FLAT 32 THATCH PARK
matches to
MRS LYNNE JOATS, 32 THATCH PARK
MR A OLIVIE, 12 HIGH STREET
matches to
MR BERT OGILVIE, 12 HIGH STREET
MRS CLAIRE DE WULF, 12 FOUR POOTER WAY
matches to
MRS CLARE DE NULF, 12 4 PEWTER WAY
MR M DBUNNING, 23 CROFT ROAD
matches to
MR MARK DUNNING, 23 CROFT RD
MR G CEPRINGTON, 1000 NURSERY WAY
matches to
MR GRAHAM CSPRINGTON, 10 NURSERY WAY
MRS KATE STIRLING, 17 MEADOWBANK
matches to
MRS CATHERINE STIRLING, 17 MEADOWBAVK