<Insert Lesson, Module, or Course Title>


Oracle Enterprise Data Quality Product Briefing for Sales Consultants
Matching

Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Objectives
  After completing this course, you should be able to:
    Describe the need for and uses of matching.
    Describe the essentials of matching in Oracle Enterprise Data Quality.
    Set up matching processes in Oracle Enterprise Data Quality to identify and, if necessary, consolidate matching data records.

Agenda
  Lesson 1: Matching Overview
  Lesson 2: Basic Matching Configuration
  Lesson 3: Match Rule Hierarchies
  Lesson 4: Clustering
  Lesson 5: Merging Records
  Lesson 6: Reviewing Possible Matches
  Lesson 7: Exporting Data (Optional)
  Lesson 8: Case Study (Optional)

Lesson 1: Matching Overview

What is Matching?
  Identifies multiple records relating to a single item, e.g.:
    Customers.
    Products.
    Employees.

Single View Of...
  Matching helps to give you a single view of your data, e.g.:
    Customers.
    Products.
    Employees.

Business Examples of Matching
  You have acquired another company.
    Both companies have Customer Relationship Management (CRM) systems.
    You need to match and merge the customer records.
  You have a number of product databases.
    The same products appear in multiple databases.
    You need to migrate to a single system.
  Your customer data has degraded over time:
    Due to house moves, name changes, deaths, etc.
    You need to validate your data against a trusted data source.

Uses of Matching
  De-duplication: find and remove duplicate records in a single system.
  Consolidation: combine multiple systems; create best records.
  Enhancement: improve data by comparison with trusted reference data.
  Linking: establish links between multiple systems.

Oracle Enterprise Data Quality Matching Processors
  Matching processors share code, and are partially pre-configured:
    Deduplicate.
    Consolidate.
    Enhance.
    Link.
    Advanced Match.
    Group and Merge.
  In this course we will use the Deduplicate match processor.

What Constitutes a Match?
  Should we treat these two customer records as a match?
  How many different companies are there?

Three Possible Outcomes of Matching
  Input: Record 1, Record 2, ... Record n.
  Match:
    Record 14 matches Record 251.
    Record 6 matches Record 378.
    etc.
  Review (human scrutiny required):
    Record 14 may match Record 462.
    Record 32 may match Record 834.
    etc.
  No Match:
    Record 2, Record 3, Record 4, Record 5, Record 7, Record 8, etc.

Matching Decision Road Map
  Where is my data? (Input: data stores, snapshots)
  Which attributes will I compare? (Identifiers)
  How will I ensure performance? (Clusters)
  How will I compare these attributes? (Comparisons)
  What will constitute a match and a review? (Match rules and review)
  What data should be in my merged records? (Merge)
  Where should I output my results? (Export)

Lesson 2: Basic Matching Configuration

Business Scenario
  Objective: remove duplicate records from a table of customer data.
  In this module: carry out simple configuration of the Deduplicate match processor.
  In the next module: refine our configuration to make it more effective.

Business Question: What are Your Inputs?
  The data you want to match could be held in:
    Databases.
      E.g. Oracle, PostgreSQL, MySQL, DB2, MS Access, etc.
    Files.

      E.g. Text, XML, MS Excel.
  Is the data within one data source?
    E.g. de-duplicate a database table.
  Is the data held in multiple data sources?
    E.g. compare a database table with a reference data file.

Enterprise Data Quality Data Stores
  Set up a data store for each input source.
  A data store is a connection to a source of data.
  A data store does not itself hold data.
  [Diagram: an Oracle database, a MySQL database, a text file and an Access database, each connected to EDQ through its own data store.]

Enterprise Data Quality Snapshots
  A snapshot is a staged copy of data taken from a data store and held within Enterprise Data Quality.
  One table or view only; select attributes.
  Streaming also possible.
  [Diagram: each data store feeds a snapshot held within EDQ.]

Enterprise Data Quality Reader
  A reader connects data from a single snapshot to a process.
  A process can have multiple readers.
  [Diagram: snapshots connected through readers to processes within EDQ.]

Practice 1 Overview
  Setting up your Process:
    Create a Project.
    Add a Data Store.
    Stage the Data by Adding a Snapshot.
    Create a Process.
    Import and Run a Published Process.

Inputs to Matching
  Matching processor must be connected to:
    One or more readers.
  In this course, we will connect to a single reader.
  [Diagram: readers feeding a matching processor inside an EDQ process.]

Practice 2 Overview
  Adding the Deduplicate Match Processor and Selecting its Inputs.

Before You Can Match...
  Examine your business requirements.
  You must determine:
    What information should you match on?
  Focus on this information alone.

Identifier Mapping
  Select the attributes (fields) from the data source that you want to use to determine a match.
  You can use Auto Map if you want to map all attributes.
  Or map attributes manually.

Practice 3 Overview
  Selecting Your Identifiers:
    Given Name.
    Family Name.
    State.
    Whole Address.
    Date of Birth.

Why Cluster for Performance?
  You have 10 records to match.
    How many pairwise comparisons? 45.
  Now imagine you have 5 million records to match.
    How many pairwise comparisons? 12,499,997,500,000 (about 12.5 trillion).
  Comparing each record with every other record is not efficient.
  Number of pairwise comparisons = (n^2 - n) / 2, where n = number of records.
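The arithmetic on this slide can be checked with a short Python sketch. The function names and the 50-cluster split are hypothetical and used only for illustration; they are not part of Enterprise Data Quality.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of distinct record pairs among n records: (n^2 - n) / 2."""
    return (n * n - n) // 2


def clustered_comparisons(cluster_sizes) -> int:
    """Total pairs when records are only compared within their own cluster."""
    return sum(pairwise_comparisons(size) for size in cluster_sizes)


print(pairwise_comparisons(10))          # 45
print(pairwise_comparisons(5_000_000))   # 12499997500000

# Hypothetical split: the same 5 million records in 50 equal clusters.
print(clustered_comparisons([100_000] * 50))
```

Under that assumed split the total falls to roughly 250 billion comparisons, about a fiftieth of the unclustered figure.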

Clustering: Definitions
  Cluster = a group of similar records within which matching occurs.
  Identifier(s) = data attribute(s) used to determine which records are placed in which cluster. Also used for matching.

Fundamentals of Clustering
  Clustering groups similar records before they are matched.
  Example: cluster on surname and date of birth:
    Smith / 8 April 72
    Smith / 10th May 83
    Wilson / 8 April 72
    Jones / 8 June 56
    etc.
  Matching only takes place within clusters, e.g. within the Smith / 10th May 83 cluster.
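As a rough illustration of matching only within clusters, the Python sketch below groups records by a cluster key built from surname and date of birth, then generates candidate pairs inside each cluster. The record layout, field names and function names are assumptions made for this example; they do not describe how Enterprise Data Quality stores or processes data.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "surname": "Smith", "dob": "1972-04-08"},
    {"id": 2, "surname": "Smith", "dob": "1983-05-10"},
    {"id": 3, "surname": "Wilson", "dob": "1972-04-08"},
    {"id": 4, "surname": "Smith", "dob": "1983-05-10"},
    {"id": 5, "surname": "Jones", "dob": "1956-06-08"},
]


def cluster_key(record):
    """Cluster on surname and date of birth."""
    return (record["surname"].lower(), record["dob"])


def build_clusters(records):
    """Group records that share the same cluster key."""
    clusters = defaultdict(list)
    for record in records:
        clusters[cluster_key(record)].append(record)
    return clusters


def candidate_pairs(clusters):
    """Yield id pairs to compare; pairs never cross cluster boundaries."""
    for members in clusters.values():
        for a, b in combinations(members, 2):
            yield a["id"], b["id"]


clusters = build_clusters(records)
pairs = list(candidate_pairs(clusters))
print(pairs)  # [(2, 4)]
```

Only the two Smith / 10th May 83 records are compared; a cluster holding a single record produces no pairs at all.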

Practice 4 Overview
  Configure a Simple Cluster Scheme:
    Cluster on State.

Comparisons and Match Rules
  Comparison:
    Used to evaluate the similarity of the value in a single identifier across multiple records.
  Match Rule:
    Based on the outcome of one or more comparisons, determines whether a pair of records:
      Is a match.
      Is marked for review.
      Does not match.

Comparisons and Match Rules: Example
  Comparison: exact string match on Name.
  Match Rule: Name exact string match.
  Decision: Match.

Browsing Matching Results
  You can see information including:
    Number of matching records.
    Number of possible matches (those to be reviewed).
    Number of records matching against each match rule.
  You can drill down to see the data records that match.

Practice 5 Overview
  Adding Comparisons:
    Given Name.
    Family Name.
  Adding a Match Rule:
    Name.
  Run the matching process and study the results.

The Results Browser
  Tabs include:
    Matching.
    Review Status.
    Rules.
    A tab for each cluster scheme.
    Match Groups.
    Groups Output.
    Relationships Output.

Lesson 3: Match Rule Hierarchies

Matching on Multiple Identifiers
  A Match Rule can be based on a combination of comparisons, e.g.:
    If exact string match on name = true and exact string match on address1 = true, then match.
  Within the Match Rule, "and" logic always applies.

Match Rule Using Multiple Comparisons: Example
  Comparisons: exact string matches on Name and Title.
  Match Rule: Name exact string match + Title exact string match.
  Decision: Match.

Match Rule Hierarchies
  Indicates likelihood of a match.
  Used to prioritise work.
  Matching works from top of hierarchy downwards.
    When a decision is made, it stops.
  Individual match rules can be disabled.

Practices 7, 8 and 9 Overview
  Add a Match Rule Based on Several Attributes:
    Add a comparison for Whole Address.
    Add a comparison for Date of Birth.
    Create a Match Rule for Name, Whole Address, DoB.
    Run the process and study the results.
  Manipulating Your Match Rule Hierarchy:
    Change the Order of Execution.
    Change a Decision.
    Run the process and study the results.
    Switch a Match Rule Off.
  Add a Further Match Rule:
    Name, Building Number, ZIP, DoB.

Fuzzy Match Examples
  If metaphone on name = true and word match count on address1 > 1, then review.
  If Date Edit Difference on Date of Birth < 2 and character edit distance on email address < 2, then review.
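To make the "word match count on address1 > 1" condition concrete, here is a minimal Python sketch. It assumes that a word match count is the number of distinct words two values share, ignoring case; the product's own comparison may tokenise or count differently, and the function names are hypothetical.

```python
def word_match_count(a: str, b: str) -> int:
    """Count distinct words shared by two strings (assumed definition)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    return len(words_a & words_b)


def address_suggests_review(address_a: str, address_b: str) -> bool:
    """Address half of the first fuzzy rule: word match count > 1."""
    return word_match_count(address_a, address_b) > 1


print(word_match_count("12 High Street Cambridge", "12 High St Cambridge"))  # 3
print(address_suggests_review("12 High Street", "14 Low Street"))            # False
```

The first pair shares "12", "high" and "cambridge"; the second shares only "street", so it falls below the threshold.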

Comparison Types
  Examples of comparison types:

    String                    Number                  Date
    Exact String Match        Equals                  Exact Date Match
    Character Edit Distance   Absolute Difference     Date Difference
    Word Match Count          Percentage Difference   Date Edit Difference

  Some comparison types have configurable results bands.

Comparisons
  Decide:
    Which identifiers you want to compare across records.
    What type of comparison you want to perform.
    Whether you want to transform the identifiers.
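A character edit distance comparison can be sketched with the standard Levenshtein algorithm, as below. This is an illustration only: whether the product uses exactly this variant is not stated in these slides, and the band names and thresholds in `result_band` are hypothetical stand-ins for configurable results bands.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete from a
                current[j - 1] + 1,      # insert into a
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def result_band(distance: int) -> str:
    """Map a distance to a hypothetical results band."""
    if distance == 0:
        return "exact"
    if distance <= 2:
        return "close"
    return "different"


print(edit_distance("Smith", "Smyth"))               # 1
print(result_band(edit_distance("Jon", "Jonathan"))) # different
```

A match rule could then test the band rather than the raw number, which keeps the threshold in one place when tuning.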

Matching Continuum
  From fuzzy match to tight match:
    Family name typos, Address 1, Postcode close.
    Standardised name, Gender, Address 1, Postcode.
    Exact name, Address, Postcode.
  Considerations:
    What is your organisation's risk appetite?
    What are the consequences of a missed match or false positive?
    How much human effort can you dedicate to review?
    Where are the lines between match, review and no match?

Match Rule Hierarchy: Simple Example
  Elimination rule:
    If Year Too Different (Year of Birth) = true, then No Match.
  Tight match:
    If exact string match on name = true and exact string match on address1 = true, then match.
  Fuzzy match:
    If metaphone on name = true and word match count on address1 > 1, then review.
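The way a hierarchy works from the top down and stops at the first decision can be sketched in Python as follows. Several parts are assumptions made to keep the example self-contained: the one-year threshold for "Year Too Different", the record fields, and a shared three-letter name prefix standing in for the metaphone comparison (a real phonetic algorithm is not implemented here).

```python
def year_too_different(a, b, max_gap=1):
    """Elimination test; the one-year threshold is an assumed value."""
    return abs(a["birth_year"] - b["birth_year"]) > max_gap


def names_sound_alike(a, b):
    """Crude stand-in for metaphone: same first three letters, ignoring case."""
    return a["name"][:3].lower() == b["name"][:3].lower()


def word_match_count(x, y):
    """Number of distinct words shared by two strings (assumed definition)."""
    return len(set(x.lower().split()) & set(y.lower().split()))


# Rules are evaluated in order; the first rule that fires makes the decision.
RULES = [
    ("Elimination", year_too_different, "No Match"),
    ("Tight match",
     lambda a, b: a["name"] == b["name"] and a["address1"] == b["address1"],
     "Match"),
    ("Fuzzy match",
     lambda a, b: names_sound_alike(a, b)
     and word_match_count(a["address1"], b["address1"]) > 1,
     "Review"),
]


def decide(a, b, rules=RULES):
    """Return (decision, rule name); default to No Match when nothing fires."""
    for name, fires, decision in rules:
        if fires(a, b):
            return decision, name
    return "No Match", None


william = {"name": "William Jones", "address1": "12 High Street", "birth_year": 1972}
will = {"name": "Will Jones", "address1": "12 High St", "birth_year": 1972}
younger = {"name": "William Jones", "address1": "12 High Street", "birth_year": 1990}

print(decide(william, dict(william)))  # ('Match', 'Tight match')
print(decide(william, will))           # ('Review', 'Fuzzy match')
print(decide(william, younger))        # ('No Match', 'Elimination')
```

Because the elimination rule sits at the top, the third pair is rejected even though its name and address are identical; moving that rule lower in the list would change the outcome.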

Practice 10 Overview
  Add Fuzzy Match Rules:
    Add character edit distance comparisons for Given and Family Names.
    Add a fuzzy match rule for Name Typos, ZIP, DoB.
    Run the process and study the results.
    Add a character match percentage comparison for the whole address.
    Create a fuzzy match rule for Name Typos, Address Similar, DoB.
    Run the process and study the results.
    Add a date edit distance comparison for Date of Birth.

Transformation Within Matching
  You can add simple transformations to comparisons and clusters. These include:
    Denoise.
    Trim or Normalize Whitespace.
    First or Last N Characters or Words.
    Replace (use for name standardization).
    Soundex or Metaphone.
  For example:
    Replace short versions of names with long versions (e.g. replace Bill or Will or Billy with William), then match.
  The original value will be output in results.

Practices 11, 12 and 13 Overview
  Transform the Name:
    Standardize the name.
    Amend Name, Whole Address, DoB rule to use the standardized name.
    Run the process and study the results.
  Transform Names with Metaphone:
    Add a Match Rule for Name Meta, Address Similar, DoB Typos.
    Run the process and study the results.
  Add Priority Scores for Reviewers.
  Find Customers whose Family Names have Changed (optional).

Tuning Match Rules
  Start with a basic matching strategy.
  Then repeat the cycle: run matching, study results, change match rule configuration.

Match Rule Groups
  Set of matching rules performing similar functions.
  Can be managed as a unit:
    Position of group in decision hierarchy can be moved.
    All rules in group can be enabled / disabled en masse.
    Decisions can be changed en masse.
    Comparisons used can be changed en masse.
  Make it easier to manage long lists of match rules.

Match Rules - Recap
  Each match rule is associated with a decision:
    Match.
    Review.
    No Match.
  Decision is based on the outcome of 1 or more comparisons:
    True.
    False.
    Results bands.
  Arranged into hierarchies.

Lesson 4: Clustering

Why Cluster for Performance?
  You have 10 records to match.
    How many pairwise comparisons? 45.
  Now imagine you have 5 million records to match.
    How many pairwise comparisons? 12,499,997,500,000 (about 12.5 trillion).
  Comparing each record with every other record is not efficient.
  Number of pairwise comparisons = (n^2 - n) / 2, where n = number of records.

Clustering: Definitions
  Cluster = a group of similar records within which matching occurs.
  Identifier(s) = data attribute(s) used to determine which records are placed in which cluster. Also used in matching.

  Transformation = alteration of identifier values before clustering.
    E.g. use the first 3 characters of the identifier only.

Fundamentals of Clustering
  Clustering groups similar records before they are matched.
  Example: cluster on surname and date of birth:
    Smith / 8 April 72
    Smith / 10th May 83
    Wilson / 8 April 72
    Jones / 8 June 56
    etc.
  Matching only takes place within clusters, e.g. within the Smith / 10th May 83 cluster.

Cluster Size Strategies
  You can tune cluster sizes by deciding:
    Which identifier(s) to cluster on.
      These yield the cluster key.
    Whether and how to transform your identifier(s).
  Clusters that contain only a single record are not considered for matching.

Cluster Size Strategies - Examples
  E.g. customer records. Clustering on:
    Gender will produce (inadvisably) large clusters.
    Last name will produce smaller clusters.
    First 3 letters of last name increases cluster size.
    Adding a metaphone transformation increases cluster size further.
    First 3 letters of last name and first 3 letters of postcode decreases cluster size.

Cluster Size Continuum
  Large clusters: matching slower, but may be more likely to find matches.
  Small clusters: matching faster, but there may be a greater possibility of missed matches.
  Considerations:
    What is your business context?
    How critical is it to find all matches?
    How powerful is your hardware?

Tuning Clusters
  An art, not a science.
  A possible strategy:
    Profile your candidate attributes:
      Evaluate their consistency and completeness.
    Begin with:
      Large, loose clusters.
        E.g. cluster on family names beginning with letter M.
      Relatively small sample of data.
    Set up your match rules.
    Iteratively:
      Tighten your clusters.
      Stop when you begin to lose hits.

Multiple Cluster Schemes: Why?
  You can have:
    Small clusters = high performance, but possible missed matches.
    Large clusters = fewer missed matches, but slower performance.
  Or:
    Multiple cluster schemes: several layers of smaller cluster groups.
    Better performance and less chance of missed matches.

Multiple Cluster Schemes - Concept
  Scheme 1: surname and date of birth
    Smith / 8 April 72
    Wilson / 8 April 72
    Smith / 10th May 83
    Jones / 8 June 56
    etc.
  Scheme 2: surname and postcode
    Smith / CB3
    Wilson / CB3
    Smith / BN25
    Jones / MK45
    etc.
  Scheme 3: postcode and date of birth
    CB3 / 8 April 72
    BN25 / 10th May 83
    MK45 / 8 June 56
    etc.

Multiple Cluster Schemes: Advantages
  Mitigates effects of missing or partial data.
  E.g. missing date of birth:
    Record will not be included in the right cluster in a last name + DoB scheme.
    But it will be included in the right cluster in a last name + postcode scheme.
  Multiple cluster schemes:
    Provide a safety net, and
    Since clusters can be small, matching is efficient.
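The safety-net effect can be illustrated with a small Python sketch that runs three cluster schemes side by side and takes the union of their candidate pairs. The records, field names and scheme names are hypothetical, and the code is a simplified model rather than a description of the product's internals.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; record 2 has a missing date of birth.
records = [
    {"id": 1, "surname": "Smith", "dob": "1972-04-08", "postcode": "CB3"},
    {"id": 2, "surname": "Smith", "dob": None, "postcode": "CB3"},
    {"id": 3, "surname": "Wilson", "dob": "1972-04-08", "postcode": "CB3"},
    {"id": 4, "surname": "Jones", "dob": "1956-06-08", "postcode": "MK45"},
]

# Each scheme lists the identifiers that make up its cluster key.
SCHEMES = {
    "surname+dob": ("surname", "dob"),
    "surname+postcode": ("surname", "postcode"),
    "postcode+dob": ("postcode", "dob"),
}


def pairs_for_scheme(records, fields):
    """Candidate id pairs produced by one cluster scheme."""
    clusters = defaultdict(list)
    for record in records:
        clusters[tuple(record[f] for f in fields)].append(record["id"])
    return {
        pair
        for ids in clusters.values()
        for pair in combinations(sorted(ids), 2)
    }


def candidate_pairs(records, schemes=SCHEMES):
    """Union of candidate pairs across all schemes."""
    pairs = set()
    for fields in schemes.values():
        pairs |= pairs_for_scheme(records, fields)
    return pairs


print(pairs_for_scheme(records, SCHEMES["surname+dob"]))  # set()
print(sorted(candidate_pairs(records)))                   # [(1, 2), (1, 3)]
```

The surname and date of birth scheme never brings records 1 and 2 together because of the missing value, but the surname and postcode scheme does, so the pair still reaches the match rules.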

Practices 14 and 15 Overview
  Configure a Second Cluster Scheme:
    Cluster on customers' telephone numbers.
    Run the process and study the results.
  Configure a Third Cluster Scheme:
    Cluster on customers' family names: First 4 Characters Only with a metaphone transformation.
    Run the process and study the results.

Clustering Defaults: NULL Groups
  NULL Groups are allowed.
    Records with a NULL cluster key will be included in their own cluster.
    Example: clustering with email as the cluster key:
      All records with NULL email values put in same cluster.
    Large NULL groups can slow performance.
  If you deselect Allow NULLs:
    Records with NULL cluster keys:
      Not put into cluster.
      Disregarded for matching.
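The two behaviours for NULL cluster keys can be modelled with the short Python sketch below. The `allow_nulls` parameter mirrors the Allow NULLs option only by name; the function, the data and the use of `None` for NULL are assumptions for illustration.

```python
from collections import defaultdict


def build_clusters(records, key_field, allow_nulls=True):
    """Group record ids by cluster key, optionally dropping NULL (None) keys."""
    clusters = defaultdict(list)
    for record in records:
        key = record.get(key_field)
        if key is None and not allow_nulls:
            continue  # record is left out of clustering, so it is never matched
        clusters[key].append(record["id"])
    return dict(clusters)


# Hypothetical records clustered on email.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": None},
    {"id": 4, "email": "a@example.com"},
]

print(build_clusters(records, "email"))
print(build_clusters(records, "email", allow_nulls=False))
```

With NULLs allowed, records 2 and 3 share one NULL cluster and would be compared with each other; with the option off they drop out entirely.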

Clustering Defaults: Group and Comparison Limits
  Cluster Group Limit set to 500 by default.
    Clusters with more than 500 records disregarded for matching.
  Cluster Comparison Limit not set by default.
    Triggered by number of pairwise comparisons within clusters.
    Can be valuable when dealing with multiple input sources.
    E.g. 499 records from data set A; 1 record from data set B:
      Only 499 comparisons carried out (not (n^2 - n) / 2).
  You can override defaults at cluster scheme level.

Clustering: a Summary
  Clustering groups similar records.
    E.g. all records where first 3 letters of surname sound similar.
  Matching only occurs within the clusters.
    Major performance benefit.
  You must define the criteria for determining clusters.
  You can configure multiple cluster schemes.
    A fail-safe in case of missing data.
    These work in parallel.

Lesson 5: Merging Records

Merging Defaults and Options
  The Deduplicate match processor automatically generates merged records.
    One merged record for each match group.
    Does not overwrite original records.
  You can turn off "generate merged output".
  You can output:
    Related records only (i.e. merged matches).
    Unrelated records only (i.e. records that don't match).
    Both.

How are Attributes Merged?
  By default the most common value is output for each attribute.
  You can change how each attribute is merged. Examples of other options:
    Sum.
    Highest value.
    Average.
    Longest String.
    First non-empty value.
    See F1 Help for more options.
  You can decide what to do next with the merged records.

Practices 16 and 17 Overview
  Merging Records:
    Interrogating Merged Records.
    Remove an Attribute from Merged Records.
    Manipulate the Merge Sub-Processor.

Lesson 6: Reviewing Possible Matches

Review Groups
  Review Groups:
    Contain two or more related records.
    Are displayed on a single screen in Match Review module.
    May be linked by single or multiple match rules.

Match Review Module
  Match Review module:
    Provides controlled manual review process with audit trail.
    A useful tool during development.
  You can filter review groups, for example on match rule.
    Default filter is Review Status = Awaiting Review.

Review Process
  Two part process:
    Relationship review:
      Change state of a possible match to match, no match or pending.
    Merged review:
      Check and change merged records.

Practice 18 Overview
  Review Possible Matches in Match Review.

Lesson 7: Exporting Data

What Can Be Output?
  Merged (deduplicated) records.
    Either related records, unrelated records or both.
  Groups.
  Relationships.
  Clustered.
  Decisions.

Outputting Data
  Step 1: Write to staged data.
  Step 2: Export from staged data via a data store.
  [Diagram: within EDQ, a process runs matching and a writer writes to staged data; an export then sends the staged data through a data store to a file or database.]

Optional Practice 19 Overview
  Exporting Data:
    Write to Staged Data.
    Export your Staged Data to a File.
    Confirm that your File has been Created.

Lesson 8: Case Study

Challenges in Matching
  Free text fields: data entered in different formats and conventions.
  Source data incomplete / incorrect: data in the wrong place.
  Matching software must capture the user's knowledge and experience together with the business rules surrounding the data.
  Context is critical: knowledge that these 4 fields represent an address allows us to recognise a potential match.

Examples of Transformation
  Source data incomplete / incorrect: data in the wrong place.
  Name variants: Bill Jones, Billy Jones, William Jones, Will Jones.

Common Pre-Matching Transformations
  Concatenate.
  Convert data type.
  Denoise.
  Convert case.
  Make array from string.
  Trim or Normalise Whitespace.
  Sounds like: Metaphone and Soundex.
  Merge attributes.

Business Decision Road Map
  What is my business challenge?
  Where is my data?
  Which attributes will determine a match?
  Do I need to transform these before matching?
  How will I cluster for performance?
  What will result in a definite match?
  What will result in a human review?
  What output do I want?

Case Study
  Rationalizing the Parts table.

Summary
  After completing this course, you should be able to:
    Describe the need for and uses of matching.
    Describe the essentials of matching in Oracle Enterprise Data Quality.
    Set up matching processes in Oracle Enterprise Data Quality to identify and, if necessary, consolidate matching data records.
