1 Week 11 November 7 Data Normalization and

1 Week 11 November 7  Data Normalization and

1 Week 11 November 7 Data Normalization and ERD Conceptual, Logical and Physical Database Design R. Ching, Ph.D. MIS California State University, Sacramento Data Normalization 2 The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise. Achieve a design that is highly flexible Reduce redundancy Ensure that the design is free of certain update, insertion and deletion anomalies Catherine Richardo, 1990 R. Ching, Ph.D. MIS California State University, Sacramento Normalization 3 1NF

1NF Flat file 2NF 2NF Partial dependencies removed 3NF 3NF Transitive dependencies removed BCNF BCNF Every determinant is a candidate key 4NF 4NF Non-tivial multi-valued dependencies removed R. Ching, Ph.D. MIS California State University, Sacramento Order No. Date:

Stereos To Go 6 / 15 / 99 Address: Invoice John Smith 2036-26 Street Sacramento CA 95819 City Date Shipped: Item Number Product Code State 0000 000 0000 0 John Smith 1/05

Zip Code 6 / 18 / 99 Product Description/Manufacturer 1 SAGX730 Pioneer Remote A/V Receiver 2 AT10 CDPC725 Cervwin Vega Loudspeakers Sony Disc-Jockey CD Changer 3 Go, Hogs Stereos To Go 0000-000-0000-0 Account No.

Customer: 10001 Qty Price 1 56995 1 35995 1 39995 4 5 Subtotal Shipping & Handling Sales Tax Total R. Ching, Ph.D. MIS California State University, Sacramento 132985 10000 10306 153291 4 Unnormalized Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code, Item1 Item1_descrip Item1_qty Item1_price, Item2 Item2_descrip Item2_qty Item2_price, . . . , Item7 Item7_descrip Item7_qty Item7_price) How would a program process the data to recreate the invoice? R. Ching, Ph.D. MIS California State University, Sacramento 5 Unnormalized to 1NF 6 (Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code, Item1, Item1_descrip, Item1_qty, Item1_price, Item2, Item2_descrip, Item2_qty, Item2_price, . . . , Repeating groups Item7, Item7_descrip, Item7_qty, Item7_price) A flat file places all the data of a transaction into a single record. This is reminiscent of a COBOL or BASIC program processing a single transaction with one read statement. R. Ching, Ph.D. MIS California State University, Sacramento Unnormalized to 1NF

7 (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price) Nominated group of attributes to serve as the key (form a unique combination) Eliminate the repeating groups. Each row retains data for one item. If a person bought 5 items, we would have five tuples R. Ching, Ph.D. MIS California State University, Sacramento 1NF ce i vo n I

8 r e be m m a Flat File u n n r e nt m u o to c s Ac Cu Item Description r be m

nu Item Item Quantity Price 10001 10001 123456 123456 John John Smith Smith SAGX730 SAGX730 Pioneer Pioneer Remote Remote A/V Rec 10001 10001 123456 123456 John John Smith Smith AT10 AT10 11 569.95 569.95

Cerwin Cerwin Vega Vega Loudspeakers Loudspeakers 11 359.95 359.95 10001 10001 123456 123456 John John Smith Smith CDPC725 CDPC725 Sony Sony Disc Disc Jockey Jockey CD CD 11 399.95 399.95 10001 10001 123456 123456 John

John Smith Smith S/H S/H Shipping Shipping 11 100.00 100.00 10001 10001 123456 123456 John John Smith Smith Tax Tax Sales Sales Tax Tax 11 103.06 103.06

R. Ching, Ph.D. MIS California State University, Sacramento From 1NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price) Functional dependencies and determinants Example: item_descrip is functionally dependent on item, such that item is the determinant of item_descript. R. Ching, Ph.D. MIS California State University, Sacramento 9 From 1NF to 2NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Item, Item_descrip, Item_qty, Item_price) Is this unique by itself? What happens if the item is purchased more than once? R. Ching, Ph.D. MIS California State University, Sacramento

10 From 1NF to 2NF 11 (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) Partial dependency (Invoice_number, Item, Item_descrip, Item_qty, Item_price) Composite key (forms a unique combination) R. Ching, Ph.D. MIS California State University, Sacramento From 1NF to 2NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Invoice_number, Item, Item_qty, Item_price) (Item, Item_descrip) R. Ching, Ph.D. MIS California State University, Sacramento 12

From 2NF to 3NF (Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code) (Invoice_number, Item, Item_qty, Item_price) (Item, Item_descrip) Which attributes are dependent on others? Is there a problem? R. Ching, Ph.D. MIS California State University, Sacramento 13 Transitive Dependencies and Anomalies Insertion anomalies To add a new row, all customer (name, address, city, state, zip code, phone) and products (description) must be consistent with previous entries Deletion anomalies By deleting a row, a customer or product may cease to exist Modification anomalies To modify a customers or products data in one row, all modifications must be carried out to all others R. Ching, Ph.D. MIS California State University, Sacramento 14

Insertion and Modification Anomalies For example 15 Insert a new Panasonic product Product_code Manufacturer_name DVD-A110 PV-4210 PV-4250 Panasonic Panasonic Panasonic CT-32S35 PAN Inconsistency DVD-A110 PV-4210 PV-4250 CT-32S35 Panasonic

PanaSonic Pana Sonic PAN R. Ching, Ph.D. MIS California State University, Sacramento Change all Panasonic products manufacturer name to Panasonic USA Deletion Anomaly For Example 4377182 4398711 4578461 4873179 John Smith Arnold S Gray Davis Lisa Carr 16

Sacramento Davis Sacramento Reno CA CA CA NV 95831 95691 95831 89557 By deleting customer Arnold S, we would also be deleting Davis, California. R. Ching, Ph.D. MIS California State University, Sacramento Transitive Dependencies A condition where A, B, C are attributes of a relation such that if A B and B C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B

or C). R. Ching, Ph.D. MIS California State University, Sacramento Invoice_number Invoice_date 17 Date_delivered Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code Item Item_descrip Invoice_number+Item Item_qty Item_price Why Should City and State Be Separated from Customer Relation? City and state are dependent on zip code for their values and not the customers identifier (i.e., key). Zip_code City, State Otherwise, Cust_account Cust_addr, Zip_code City, State In which case, you have transitive dependency.

R. Ching, Ph.D. MIS California State University, Sacramento 18 3NF Invoice Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account) Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation (Item, Item_descrip) R. Ching, Ph.D. MIS California State University, Sacramento 19 3NF 20 Invoice Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account) Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation Manufacturers Relation (Item, Item_descrip) (Manuf_code, Manuf_name) Since the Items relation contains the manufacturers name in the description, a separate Manufacturers relation can be R.created Ching, Ph.D. MIS California State University, Sacramento 21 R. Ching, Ph.D. MIS California State University, Sacramento First to Third Normal Form

(1NF - 3NF) 1NF: A relation is in first normal form if and only if every attribute is single-valued for each tuple (remove the repeating or multi-value attributes and create a flat file) 2NF: A relation is in second normal form if and only if it is in first normal form and the nonkey attributes are fully functionally dependent on the key (remove partial dependencies) 3NF: A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key (remove transitive dependencies) R. Ching, Ph.D. MIS California State University, Sacramento 22 23 Putting It Together ERD of the Normalized Data Model R. Ching, Ph.D. MIS California State University, Sacramento 3NF 24 Invoice Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation Manufacturers Relation (Item, Item_descrip, Manuf_code) (Manuf_code, Manuf_name) R. Ching, Ph.D. MIS California State University, Sacramento ERD Invoices Invoice_number Invoice_date Date_delivered Cust_account Invoice_items Invoice_number

Item Item_qty Item_price 25 Customers Cust_account Cust_name Cust_addr Zip_code Zip_Codes Zip_code City State Items Item Item_descrip Manuf_code Manufacturers R. Ching, Ph.D. MIS California State University, Sacramento Manuf_code Manuf_name ERD

Invoices Invoice_number Invoice_date Date_delivered Cust_account 26 Order (0..*) (1..1) Customers Cust_account Cust_name Cust_addr Zip_code Locate (0..*) (1..1) Zip_Codes Zip_code City State

(1..1) Have (1..*) Invoice_items Items Invoice_number Appear on Item (0..*) (1..1) Item_descrip Item Manuf_code Item_qty Item_price R. Ching, Ph.D. MIS California State University, Sacramento Manufacturers Produce (0..*) (1..1) Manuf_code Manuf_name ERD Invoices

Invoice_number Invoice_date Date_delivered Cust_account (1..1) Have (1..*) Invoice_items 27 Order (0..*) (1..1) Customers Cust_account Cust_name Cust_addr Zip_code Locate (0..*) (1..1)

Zip_Codes Zip_code City State Partial Zip codes locate Customers. () A zip code can be related to a minimum of zero and a maximum of Manufacturers Items many customers. Produce () A customer can be related to a Item Manuf_code (1..1) Item_descrip minimum and(0..*) maximum of Manuf_name one zip code. Invoice_number Appear on (0..*) (1..1)

Item Manuf_code Item_qty Item_price R. Ching, Ph.D. MIS California State University, Sacramento ERD Invoices Invoice_number Invoice_date Date_delivered Cust_account 28 Order (0..*) (1..1) Partial Customers Cust_account Cust_name Cust_addr Zip_code

Locate (0..*) (1..1) Zip_Codes Zip_code City State (1..1) Customers order (items) on invoices. Have () A customer can be related to a (1..*) minimum of zero and a maximum of Invoice_items Items many invoices. () An invoiceAppear can beon related Item to a Invoice_number (0..*) (1..1)ofItem_descrip Item

minimum and maximum one Manuf_code Item_qty customer. Item_price R. Ching, Ph.D. MIS California State University, Sacramento Manufacturers Produce (0..*) (1..1) Manuf_code Manuf_name ERD Invoices Invoice_number Invoice_date Date_delivered Cust_account 29 Customers Zip_Codes Order

Cust_account Locate Zip_code (0..*) (1..1) Cust_name (0..*) (1..1) City Cust_addr State Invoices possess invoice items Zip_code () An invoice can be related to a minimum of one and a maximum of Have Mandatory many invoice items. (1..*) () An invoice item can be related to a minimum invoice. Invoice_items Items and maximum of oneManufacturers (1..1) Invoice_number Appear on Item (0..*) (1..1) Item_descrip

Item Manuf_code Item_qty Item_price R. Ching, Ph.D. MIS California State University, Sacramento Produce (0..*) (1..1) Manuf_code Manuf_name ERD 30 Invoices Customers Items are sold on invoice items. Invoice_number () An item canOrder be relatedCust_account to a Invoice_date Cust_name (0..*)

(1..1) minimum of zero and a maximum of Date_delivered Cust_addr many invoice items. Cust_account Zip_code Locate (0..*) (1..1) Zip_Codes Zip_code City State ((1..1) ) An invoice item can be related to a minimum and maximum of one item. Have (1..*)

Partial Invoice_items Items Invoice_number Appear on Item (0..*) (1..1) Item_descrip Item Manuf_code Item_qty Item_price R. Ching, Ph.D. MIS California State University, Sacramento Manufacturers Produce (0..*) (1..1) Manuf_code Manuf_name ERD Invoices Invoice_number Invoice_date

Date_delivered Cust_account (1..1) Have Order (0..*) (1..1) Manufacturers produce items. (Customers ) A manufacturer can be Zip_Codes related to a Cust_account Zip_code minimum of zero Locate and a maximum of Cust_name (0..*) (1..1) City many items. Cust_addr State (Zip_code

) An item can be related to a minimum and maximum of one manufacturer. (1..*) Partial Invoice_items Items Invoice_number Appear on Item (0..*) (1..1) Item_descrip Item Manuf_code Item_qty Item_price R. Ching, Ph.D. MIS California State University, Sacramento 31 Manufacturers Produce (0..*) (1..1)

Manuf_code Manuf_name 32 Higher Forms of Data Normalization Boyce-Codd Normal Form (BCNF) Fourth Normal Form (4NF) Fifth Normal Form (5NF) Domain Key Normal Form (DKNF) R. Ching, Ph.D. MIS California State University, Sacramento Boyce-Codd Normal Form (BCNF) A relation is in Boyce-Codd normal form if and only if every determinant is a candidate key A determines B Attribute Attribute A A Attribute Attribute B

B Determinant (B is functionally dependent on A) For a relation with only one candidate key, 3NF and BCNF are equivalent. Usually occurs when keys of different relations overlap R. Ching, Ph.D. MIS California State University, Sacramento 33 BCNF Example User (UserID, Dept, Name, ComputerID, EmpClassification) ComputerID Dept (a department issues a computer) UserID, Dept ComputerID, Name, EmpCassification (Employees may have the same name and UserIDs are unique within the department only) UserID, ComputerID Dept, Name, EmpClassification BCNF UserComputer (ComputerID, Dept) User (UserID, ComputerID, Name, EmpClassification) R. Ching, Ph.D. MIS California State University, Sacramento 34 From 3NF to BCNF

35 Invoice Relation (Invoice_number, Invoice_date, Date_delivered, Cust_account) Customer Relation (Cust_account, Cust_name, Cust_addr, Zip_code) Zip_code Relation (Zip_code, City, State) Invoice_items Relation (Invoice_number, Item, Item_qty, Item_price) Items Relation (Item, Item_descrip) Manufacturers Relation (Manuf_code, Manuf_name) R. Ching, Ph.D. MIS California State University, Sacramento Candidate keys? Fourth Normal Form (4NF) A relation is in fourth normal form if and only if it is in Boyce-Codd normal form and there are no nontrivial dependencies. Identify all determinants and make sure they are

candidate keys R. Ching, Ph.D. MIS California State University, Sacramento 36 4NF Example 37 Employee (EmployeeID, Dept, Project) Matrix management 100 100 100 102 102 102 Finance Marketing Finance Finance Marketing Finance 4NF

F177-99 F177-99 F288-00 F288-00 F177-99 F177-99 Employee (EmployeeID, Dept) Projects (EmployeeID, Project) R. Ching, Ph.D. MIS California State University, Sacramento Multivalued dependencies 4NF Example Matrix management 100 100 100 102 102 102 100 100

102 102 38 Finance Marketing Finance Finance Marketing Finance 4NF Finance Marketing Finance Marketing R. Ching, Ph.D. MIS California State University, Sacramento F177-99 F177-99 F288-00 F288-00 F177-99 F177-99 100 100 102

102 Multivalued dependencies F177-99 F288-00 F288-00 F177-99 Fifth Normal Form (5NF) aka Project-Join NF 39 A relation is in fifth normal form if no remaining nonloss projections (i.e., all projects preserve all information contained in the original relation)are possible, except the trivial one in which the key appears in each project. The join of all projects will result in the original relation No systematic method exists for obtaining 5NF or for ensuring that a set of relations is indeed 5NF Ricardo, 1990 R. Ching, Ph.D. MIS California State University, Sacramento Domain-Key Normal Form (DKNF) A relation is in domain-key normal form if every constraint

is a logical consequence of domain constraints or key constraints (i.e., all possible values are a result of an imposed constraint) There is no proven method of converting a design to DKNF, so it remains an ideal rather than a state that can readily be achieved Ricardo, 1990 R. Ching, Ph.D. MIS California State University, Sacramento 40 DKNF 41 For example: Emp_ID, Emp_name, Classification, Position, Salary Domain for Classification: Executive Manager Staff R. Ching, Ph.D. MIS California State University, Sacramento Domain for Position Strategic Planner CIO

Vice President Domain for Position Programmer/Analyst I Programmer/Analyst II Database/Analyst I Database Design Methodology Conceptual database design Build conceptual representation of the database Logical database design Translate conceptual representation to logical structure of the database Physical database design Operatioanlize logical structure in a physical implementation R. Ching, Ph.D. MIS California State University, Sacramento 42 Conceptual Database Design The process of constructing a model of the data used in an

enterprise, independent of all physical considerations Whats involved Identify entity types, relationship types Identify and associate attributes with entity or relationship types Determine attribute domains Determine candidate, primary and alternate key attributes Consider use of enhanced modeling concepts Check model for redundancies Validate conceptual model against user transactions Review conceptual data model with the users R. Ching, Ph.D. MIS California State University, Sacramento 43 Logical Database Design The process of constructing a model of the data used in an enterprise based on a specific data model, but independent of a particular DBMS and other physical considerations Whats involved

Derive relations for logical data model Validate relations using data normalization Validate relations against user transactions Check integrity constraints Review logical data model (ERD) with the users Merge logical data models into global data model Check for future growth R. Ching, Ph.D. MIS California State University, Sacramento 44 Gather Information Meet with the users to get gather information Interviews Documents R. Ching, Ph.D. MIS California State University, Sacramento 45

Derive Relations 46 Invoices have invoice items Invoice Items One-to-many relationship Have 1..1 Invoice number (pk) 1..* Product code (pk) Manufacture code Quantity Mandatory (all Sales Price Weak entity type (Invoice number is part of key) invoices must have at least one invoice item Invoice Invoice number (pk)

Invoice date Delivery date Sales type Customer account Strong entity type Strong and weak entity types Relationship types (cardinality) Participation (mandatory vs. partial) R. Ching, Ph.D. MIS California State University, Sacramento Validate Relations Normalize relations Validate against transactions - Can a transaction be recreated given the data retained in the relations?) Check integrity constraints Required data (not null) Domain constraints (in, references) Multiplicity Entity integrity (primary key) Referential integrity (foreign key) General constraints (business rules) R. Ching, Ph.D. MIS California State University, Sacramento 47 Review Data Model with the Users Be pleasant and professional, not arrogant, challenging or

condescending Not everyone is receptive to change Your role is to facilitate change The user is always right Its his/her data Document all change requests (CYA) Listen, listen, listen (Even if you dont agree) R. Ching, Ph.D. MIS California State University, Sacramento 48 Logical Global Data Model 49 Invoice Records Transactions Inventory Counts and retail prices Local Data Models Cust Accounts Customer credit accounts Cust Billing

Customer credit sales Vendor History Vendor performance Product Sales Sales history R. Ching, Ph.D. MIS California State University, Sacramento Global Data Model Local data models are merged to create a (near) normalized global data model Physical Database Design The process of producing a description of the implementation of the database on secondary storage It describes the base relations, fle organizations and indexes used to achieve efficient access to the data and nay associated integrity constraints and security measures Whats involved Translate logical data model for target DBMS: Design base relations, representation of derived data and general constraints Design file organizations and indexes: Analyze transactions, choose file organizations, choose indexes, Dictated by the DB product estimate disk space requirements Design user views and security mechanisms

Consider the introduction of controlled redundancy Monitor and tune the operational system R. Ching, Ph.D. MIS California State University, Sacramento 50 Logical vs. Physical Database Design Logical The process of constructing a model of the information use the enterprise based on one model of data, BUT independent of a particular DBMS and other physical aspects. Physical The process of producing a description of the implementation of the database on secondary storage; it describes the storage structures and access methods used to gain access effectively. Whereas the logical database design is concerned with the what, physical database design is concerned with the how. R. Ching, Ph.D. MIS California State University, Sacramento 51 Physical Database Design Five steps: Translate the global (enterprise) logical data model for the target DBMS Design files organizations and indexes, estimate database

space (disk space requirements) Design and implement user views and security mechanisms Consider the introduction of controlled redundancy (denormalization) Monitor and tune the operational system R. Ching, Ph.D. MIS California State University, Sacramento 52 Translate the Global Logical Database Model for the Target DBMS Design the relations for the target DBMS Decide how to represent the base relations in the global logical data model in the target DBMS Specify keys (primary, foreign), default values, integrity constraints (table, column), and indexes Design integrity rules for the target DBMS Design the enterprise constraints for the target DBMS Applies to updates and inserts R. Ching, Ph.D. MIS California State University, Sacramento 53 Design and Implement the Physical Representation Determine the file organizations and access methods that will be used to store the base relations (i.e., the way in which

relations and tuples will be held in secondary storage) Depends on the vendor! Understand the system resources Understand the capabilities of the hardware (CPU, memory, disk I-O) Analyze the softwares performance and limitations on the network (client/server) and Internet R. Ching, Ph.D. MIS California State University, Sacramento 54 Design and Implement the Physical Representation Analyze the transactions - understand the functionality of the transactions that will run on the database, and analyze the import transactions Choose file organization Choose secondary indexes - determine whether secondary indexes will enhance performance Index the primary key (if it is not the key of the file organization) Do not index small relations Add a secondary index to a heavily used secondary key Add a secondary index to a frequently used foreign key R. Ching, Ph.D. MIS California State University, Sacramento

55 Design and Implement the Physical Representation (cont.) AVOID INDEXING AN ATTRIBUTE OR RELATION THAT IS FREQUENTLY UPDATED Avoid indexing an attribute if the query will retrieve a large portion of the tuples in a relation Avoid indexing attributes that consist of long character strings R. Ching, Ph.D. MIS California State University, Sacramento 56 Design and Implement the Physical Representation Consider the introduction of controlled redundancy Determine whether introducing redundancy in a controlled manner by relaxing the normalization rules will enhance performance Denormalize only when necessary However, denormalizing Makes implementation more complex Sacrifices flexibility May slow down updates (although retrievals may be increased)

R. Ching, Ph.D. MIS California State University, Sacramento 57 3NF (Logical Database Design) 58 Invoice Relation Invoice_number Invoice_date Date_delivered Cust_account Customer Relation Cust_account Cust_name Cust_addr Zip_code Zip_code Relation Zip_code City State Invoice_items Relation Invoice_number Item Item_qty Item_price Items Relation

Manufacturers Relation Item Item_descrip Manuf_code Manuf_code R. Ching, Ph.D. MIS California State University, Sacramento Manuf_name Denormalization 59 Duplicating attributes or combining relations Combining 1:1 relationships Customers Relation Cust_account Cust_name Cust_addr Zip_code Customer_accounts Relation Cust_account Account_type Credit_limit Current_balance Pay_history Customers Relation Cust_account Cust_name Cust_addr Zip_code Credit_limit Current_balance Pay_history R. Ching, Ph.D. MIS California State University, Sacramento

Account_type Denomalization 60 Duplicating attributes or combining relations Duplicating nonkey attributes in 1:M relationships to reduce joins (creating partial or transitive dependencies) Customers Relation Cust_account Cust_name Cust_addr Zip_code Credit_limit Current_balance Pay_history Account_type Zip_codes Relation Zip_code City State Customers Relation Cust_account Cust_name Cust_addr City Account_type Credit_limit Current_balance

R. Ching, Ph.D. MIS California State University, Sacramento State Zip_code Pay_history Denomalization 61 (cont.) Reference tables (introducing transitive dependencies) Invoice_items Relation Invoice_number Item Item_qty Item_price Items Relation Manufacturers Relation Item Item_descrip Manuf_code Manuf_code Manuf_name Problem: In order to know the manufacturers name of a customers purchased item, a join between Items and Manufacturers must be performed R. Ching, Ph.D. MIS California State University, Sacramento

Denomalization 62 (cont.) Reference tables (introducing transitive dependencies) Invoice_items Relation Invoice_number Item Item_qty Item_price Items Relation Manufacturers Relation Item Item_descrip Manuf_code Manuf_code Manuf_name Invoice_number Item Manuf_code Item_price Manuf_name Item_qty R. Ching, Ph.D. MIS California State University, Sacramento Denomalization

63 (cont.) Duplicating foreign key attributes in 1:M relationships to reduce joins Invoice_items Relation Invoice_number Item Item_qty Item_price Items Relation Manufacturers Relation Item Item_descrip Manuf_code Manuf_code Manuf_name Problem: To find the manufacturers name of a product (e.g., Sony CDP-525) from line_items (relation), two joins must be made: manufacturers to products, and products to manufacturers. R. Ching, Ph.D. MIS California State University, Sacramento Denomalization 64 (cont.)

Invoice_items Relation Invoice_number Item Item_qty Item_price Items Relation Manufacturers Relation Item Item_descrip Manuf_code Manuf_code Invoice_number Item Manuf_code R. Ching, Ph.D. MIS California State University, Sacramento Manuf_name Item_qty Item_price Denomalization 65 (cont.) Duplicating attributes in M:N relationships to reduce joins If joint accounts are allowed and different types of accounts (i.e., long term, revolving) are available:

Customers Relation Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num M:N Customer_accounts RelationM:N Cust_account Account_type Credit_limit Current_balance Pay_history Soc_Sec_Num R. Ching, Ph.D. MIS California State University, Sacramento Denormalization 66 123456789 John Smith 123456789 Jane Smith 112233445 John Doe ... 123-45-6789 987-65-4321 567-32-1234 A customer can have

several accounts... 123456789 123456789 543219876 678901234 548794133 123-45-6789 987-65-7321 123-45-6789 987-65-7321 567-32-1234 R. Ching, Ph.D. MIS California State University, Sacramento An account can have several owners... Denormalization 67 (cont.)

Customers Relation Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num M:N Customer_accounts RelationM:N Cust_account Account_type Credit_limit Current_balance Pay_history Soc_Sec_Num R. Ching, Ph.D. MIS California State University, Sacramento Denormalization 68 (cont.) Duplicating attributes in M:N relationships to reduce joins Customers Relation Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num Customer_accounts Relation Cust_account Account_type Credit_limit Current_balance Pay_history Soc_Sec_Num Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num Cust_name R. Ching, Ph.D. MIS California State University, Sacramento Denomalization 69 (cont.) Introducing repeating groups (if the number of occurrences is known and/or constant) Creating extract tables (in an extreme case, an unnormalized relation) - frees computing resources Cust_account Account_type Credit_limit Current_balance Pay_history Soc_Sec_Num1 Cust_name1 Soc_Sec_num2 Cust_name2 R. Ching, Ph.D. MIS California State University, Sacramento Denomalization 70 (cont.) Introduction of codes to Simplify the composite key Retain the original sequence Invoice_items Relation Invoice_number Item Manuf_code Item_qty Item_price Invoice_number Item_number Item Manuf_code Item_qty Item_price

R. Ching, Ph.D. MIS California State University, Sacramento Invoice No. Order No. Date: 6 / 15 / 93 1005-25941-2 Account No. Customer: Address: William Tell 2036-26 Street Sacramento CA 95819 City Date Shipped: Item Number 1 2 3

Stereos To Go 10001 State Zip Code 6 / 18 / 93 Product Code Product Description/Manufacturer Qty SAGX730 AT10 CDPC725 Pioneer Remote A/V Receiver Cervwin Vega Loudspeakers Sony Disc-Jockey CD Changer 1 1 1

4 5 R. Ching, Ph.D. MIS California State University, Sacramento Price 56995 35995 39995 71 10001 AT10 10001 CDPC725 10001 SAGX730 CV Loudspeakers 2 SON Disc-Jockey CD Changer 1 PIO Remote A/V Receiver 1 Re ta il_ pr i ce

72 Qt y M an uf _c od Ite e m_ de sc rip ti o n In vo ic e_ nu mb Ite er m

Denormalization 359.95 399.95 569.95 Key Problem: These items are not in the sequence as they appear on the original document when retrieved from the table. R. Ching, Ph.D. MIS California State University, Sacramento 10001 01 SAGX730 PIO Remote A/V Receiver 1 10001 02 AT10 CV Loudspeakers 2 10001 03 CDPC725 SON Disc-Jockey CD Changer 1 Key R. Ching, Ph.D. MIS California State University, Sacramento Re ta il_ pr i ce

73 Qt y M an uf _c od Ite e m_ de sc rip ti o n In vo ic e_ nu It e mb m_ er nu

mb er Ite m Denormalization 569.95 359.95 399.95 Denomalization 74 (cont.) Introducing calculated attributes Simplify processing Invoice_items Relation Invoice_number Item Manuf_code Item_qty Item_price Invoice_number Item_number Item Manuf_code Item_qty Item_price Extended_price Item_qty x Item_price R. Ching, Ph.D. MIS California State University, Sacramento 10001 01 SAGX730 PIO Remote A/V Receiver 1 10001 02 AT10

CV Loudspeakers 2 10001 03 CDPC725 SON Disc-Jockey CD Changer 1 Re tai l_ pr i ce Ex te n de d_ pr ic e Qt y 75 Ite m_ de sc r ip tio n

M an uf _ co de In vo ice _n Ite um m_ be nu r mb It e er m Denormalization 569.95 569.95 359.95 719.90 399.95 399.95 Calculation

R. Ching, Ph.D. MIS California State University, Sacramento 76 R. Ching, Ph.D. MIS California State University, Sacramento

Recently Viewed Presentations

  • Understanding Participles - Monroe Co Schools KY

    Understanding Participles - Monroe Co Schools KY

    Verbals. A verbal is a verb that is being used as another part of speech rather than a verb. Examples . The frightened cat scratched my eye.. Running. seems tiresome at first. Sometimes I like
  • Set Title in 40pt. No more than 2 lines

    Set Title in 40pt. No more than 2 lines

    Don't jump to these. Used to synchronize parallel query workers. Just means you have a parallel query. High wait times mean long running parallel queries. Look at the Tasks. You may not need to do anything. Find queries and tune...
  • Overview - Open Spaces Society

    Overview - Open Spaces Society

    CROW Act 2000. Public has the right to walk on. all land mapped as open country, all registered common land outside Inner London. all land over 600 metres . in England and Wales, subject to restrictions and stipulations.
  • ESD Control Experts HBM vs. CDM vs. CBE

    ESD Control Experts HBM vs. CDM vs. CBE

    CDM - Charge Device Model. Most relevant in today's electronic production. Rapidly discharge of energy from a charged device to ground. Dominant discharge on automated operation electronic assembly. Voltage induction to component and tribocharging is a must.
  • Connectivity - Wellcome Trust Centre for Neuroimaging

    Connectivity - Wellcome Trust Centre for Neuroimaging

    Cortical connectivity and cognition Functional segregation and integration Intrinsic and extrinsic connections ... Bilaminar - Bilaminar more laminar-specific with increasing distance G SG IG Hierarchies - parallel and serial pathways Reciprocal connections Feedback connections (quantitative and ...
  • Making Applications - Utah

    Making Applications - Utah

    Cubic Feet per Second (CFS) A unit of measure used to quantify the rate of flow from a water source. Height (feet) * Width (feet) * Velocity (feet per second) Acre-Feet (AF) A unit of volume defined as one acre...
  • Formátujeme pomocí kaskádových stylů

    Formátujeme pomocí kaskádových stylů

    Základní škola a mateřská škola Bzenec Číslo projektu: CZ.1.07/1.4.00/21.2769 Číslo a název šablony klíčové aktivity: III/2: využívání ICT - inovace
  • Comm. And Networking

    Comm. And Networking

    WAN. Wide Area Network. As the term implies, a WAN spans a large physical distance. The Internet is the largest WAN, spanning the Earth. A WAN is a geographically-dispersed collection of LANs. A network device called a router connects LANs...