CS 452: DATABASE MODELING CONCEPTS

Fall 2011


[Course Description] [Class Schedule] [Reference Material] [Miscellaneous Material] [Blackboard: HW Answers, Grades, ...]


Course Description

  • Instructor: David W. Embley
  • Teaching Assistant:
  • Prerequisite: CS236 (programming experience, sets, relations, relational algebra, functions, propositional logic, predicate logic); CS240 (programming maturity); CS340 (conceptual modeling); 360 (databases on the web)
  • Objectives
    • Expected Learning Outcomes
    • Objectives (for the students): (1) Become familiar with the terminology and fundamental concepts of relational databases and database management systems. (2) Learn SQL well. This includes relational algebra and relational calculus as the basis for SQL queries. It also includes embedding SQL in a high-level programming language, and it includes triggers and transaction processing. (3) Understand performance issues and optimization strategies. This includes query rewriting, secondary storage characteristics, and access strategies. (4) Be able to design and develop database applications. This includes conceptual modeling and normalization theory. (5) Become familiar with some of the current challenges facing database professionals (e.g. semistructured data management, XML databases, information extraction, and semantic-web technology).
    • Objectives (for the instructor): (1) Provide a learning environment in which students can achieve the course objectives. (2) Meet student expectations.
    • Objectives (for all): D&C 50:17–22, so that all "are edified and rejoice together."
  • Text:
    • Many database textbooks are available; commonly used textbooks include:
      • Database Systems: The Complete Book, by Garcia-Molina, Ullman, and Widom
      • A First Course in Database Systems, second edition by Ullman and Widom
      • Database System Concepts, sixth edition, by Silberschatz, Korth, and Sudarshan
      • Fundamentals of Database Systems, sixth edition, by Elmasri and Navathe
      All are expensive and cover a wide variety of topics beyond what we can possibly cover in a semester, and none cover topics currently challenging database professionals well enough to be useful.
    • Text material will come from four sources: (1) on-line reference manuals and tutorials, (2) selected chapters from textbooks (used by permission of publishers), (3) journal articles, and (4) Wikipedia. See the class schedule for links to this material.
  • Course Content: See the class schedule for a detailed course outline.
  • Grading: Final grades will be calculated as follows with these weights:
                     Homework Assignments	20%
                     Projects		30%
                     Midterm Examinations	25%
                     Final Examination	25%
    
    			A  93.3–100%	A- 90–93.2%		
    	B+ 86.7–89.9%	B  83.3–86.6%	B- 80–83.2%
    	C+ 76.7–79.9%	C  73.3–76.6%  	C- 70–73.2%
    	D+ 66.7–69.9%	D  63.3–66.6%	D- 60–63.2%
    			E  Below 60%
    	I & W: given according to university policies
    Note: Grades are non-competitive—everyone can get an A.
  • Class Policies:
    • Homework Assignments: Written homework assignments are due as specified in the schedule. Submit assignments by handing them in in class or by slipping them under your instructors door (3332 TMCB). Unless otherwise explicitly stated on the schedule, assignments are to be submitted on (or before) the due date before the Talmage Building closes for the night. (Late homework assignments receive zero credit. However, you may always turn in late homework assignments to be "graded" for feedback. Further, some late homework assignment may receive credit, but only if the circumstances warrant an exception and are discussed with the instructor in advance or within reasonable time soon after receiving the zero score.)
    • Projects: There will be several projects, designed to help meet the objectives of the class. All requirements specified in the project descriptions are to be met by the due date as specified on the schedule. Up to 90% credit may be received for late projects turned in within one school day of the due date, up to 80% for late projects turned in within two school days of the due date, and up to 70% for late projects turned in within three school days of the due date. Projects turned in more than three school days late may receive up to 60% credit.
    • Examinations: There will be two midterm exams. These tests will be given in the Testing Center as specified on the schedule. The final exam will be comprehensive, but weighted somewhat towards the material in the last third of the class. The final exam will be taken in the class room, as scheduled by the university. Examinations are closed book/notes/calculators/laptops. Midterm examinations have no time limit; the final has a time limit as scheduled by the university.
    • Handback Box: Homework, project write-ups, and examinations not picked up when returned in class will be in a CS452 Handback Box in the the instructors office. You can pick up items during office hours.
  • Personal Integrity:
    • Learning to Learn: In today's rapidly changing world, the ability to acquire knowledge and skills and then use them is more important than the ability to store knowledge and facts and then recall them. How do you learn to learn? You probably already know a lot about learning to learn, but here are some reminders.
      Reminders for Students: Learning to learn requires doing—not just thinking; and it requires being responsible—not expecting knowledge and skill to be attained without effort. Responsible learners
      • ask questions ("Ask and ye shall receive."),
      • pursue learning diligently,
      • realize that learning is largely under their own control,
      • know when they do and do not understand, and
      • have the ability to get help from learning material and from other people.
      Although taking responsibility for learning is largely internal, it does have external manifestations. Responsible learners engage in the learning process during class—they listen, ask questions, and help clarify points their classmates may not understand. They do not multi-task during class (e.g., they do not read a newspaper, surf the web, or do homework). Responsible CS learners start projects early enough to ensure that technical difficulties with tools can be resolved and that they can "sleep on" programming difficulties when a resolution is not immediate. Responsible learners start their homework early enough to leave themselves time to get any needed help from TAs, instructors, or classmates. Responsible learners, realizing that explaining or teaching a concept to someone else deepens learning, are willing to help others learn and understand class material.
      Reminders for Instructors: Recall the old maxim: "It is good to give a hungry person a fish to eat, but it is even better to teach a hungry person how to fish." Apply the maxim: Provide students with fish (subject knowledge), but also teach students how to fish (how to learn subject knowledge). Then give students opportunities to fish (to acquire knowledge and skills) and to enjoy catching fish (to use their newly acquired knowledge and skills in meaningful ways).
    • Working Together: You may work together with other members of the class; however, do not turn in other people's work. There is a fine line here that requires judgment on your part. Your objective when working with others should be to maximize your learning and the learning of others. You maximize everyone's learning when you help each other come to a greater understanding, but not when you accept or provide an unacceptably easy way to complete an assignment. (There are no shortcuts to learning.) Examples of acceptable collaboration: discussing ideas and concepts related to assignments and projects; developing high-level pseudo-code for a project; and working on homework problems in a study group in a way that enhances everyone's learning. Examples of unacceptable collaboration: copying homework answers; allowing someone to copy your homework; using a copy of the provided homework answers; using some of another person's code to complete an assignment; allowing someone else to use some of your code to complete an assignment; and accepting help from a TA or classmate without really internalizing the material yourself so that you cannot fully claim the work you turn in as your own.
  • Departmental/College/University Policies:
    • Disabilities: If you have a disability that may affect your performance in this course, you should get in touch with the office of Services for Students with Disabilities (1520 WSC). This office can evaluate your disability and assist the professor in arranging for reasonable accommodations.
    • Children in the Classroom: The serious study of the physical and mathematical sciences requires uninterrupted concentration and focus in the classroom. Having small children in class is often a distraction that degrades the educational experience for the entire class. Please make other arrangements for child care rather than bringing children to class with you. If there are extenuating circumstances, please talk with your instructor in advance.
    • Zero Tolerance: Be sure you know and comply with the Computer Science Department's systems abuse policy and university's honor code and sexual harrasement policies.
      • Honor Code Standards. In keeping with the principles of the BYU Honor Code, students are expected to be honest in all of their academic work. Academic honesty means, most fundamentally, that any work you present as your own must in fact be your own work and not that of another. Violations of this principle may result in a failing grade in the course and additional disciplinary action by the university. (Faculty members in the Department of Computer Science are encouraged to report academic dishonesty to the honor-code office.) Students are also expected to adhere to the Dress and Grooming Standards. Adherence demonstrates respect for yourself and others and ensures an effective learning and working environment. It is the university's expectation, and my own expectation in class, that each student will abide by all Honor Code standards. Please call the Honor Code Office at 422-2847 if you have questions about those standards.
      • Preventing Sexual Harassment. Title IX of the Education Amendments of 1972 prohibits sex discrimination against any participant in an educational program or activity that receives federal funds. The act is intended to eliminate sex discrimination in education and pertains to admissions, academic and athletic programs, and university-sponsored activities. Title IX also prohibits sexual harassment of students by university employees, other students, and visitors to campus. If you encounter sexual harassment or gender-based discrimination, please talk to your professor; contact the Equal Employment Office at 801-422-5895 or 1-888-238-1062 (24-hours), or http://www.ethicspoint.com; or contact the Honor Code Office at 801-422-2847.
  • Career Services:
    • The Computer Science Department is working to prepare students for influential positions in industry. In order to accomplish this goal, we encourage students to begin preparation for their future careers early in their educational experience.
    • The Department and University offer services and resources that will be invaluable in this preparation. Among these are the following.
      • The university's career and internship placement site eRecruiting (www.byu.erecruiting.com), which gives students access to all of the coporate recruiters who come to BYU.
      • For career advisement students may visit the "Counseling and Career Center" (www.byu.edu/ccc/placement), a site with information on careers, salary ranges, resume writing, and preparation for the work environment.
      • For personal help, visit career advisors in the College of Physical and Mathematical Sciences (N-179 ESC) and in the Computer Science Department (3361 TMCB).
  • Extenuating Circumstances: Extenuating circumstances may cause alterations to the course schedule and procedures. If extenuating circumstances arise in your life, see your instructor.

  • Class Schedule

    Date Topics Reading Supplements Assignments Due
    8/29 DB Introduction   Intro  
    8/31 Relational Data Model; Relational Algebra Wikipedia (Relational Algebra) Tables; BandB; RelAlg  
    9/2 Relational Algebra     Homework 1
    9/5 Labor Day Holiday      
    9/7 Relational Algebra      
    9/9 Basic SQL Queries; Embedded SQL Queries Wikipedia (SQL) SQL; EmbeddedSQL  
    9/12 Web DBs; php; Project 1; Recursive SQL Queries Wikipedia (php) php Example; The Elements of Style; Recursive Queries Homework 2
    9/14 SQL Queries continued; Relational Calculus Wikipedia (Relational Calculus); PredCalcRelCalc Review MoreSQL; RelCalc  
    9/16 Guest Speaker      
    9/19 Relational Calculus continued; Universal-quantification and SQL Queries      
    9/21 Relational-Division SQL Queries; Counting-All Queries; Skyline Queries; Project 2 Opt. SQL Design Patterns by Tropashko RelationalDivision; CountingAll; SkylineQueries  
    9/22       Project 1
    9/23 Secondary Storage; Files ScndryStor.pdf; FileOrg.pdf (Sections 1&2); Opt. Pathologies of Big Data; MagDisk FileOrg;  
    9/26 B+-Tree Indexing FileOrg.pdf (Section 3) B+Trees Homework 3
    9/28 Query Rewriting QueryOpt.pdf QueryRewriting  
    9/30 Cost Estimation   CostEst Homework 4 by 5:00 pm
    10/3 Database Modifications; View Update   DB Modifications; ViewUpdate  
    10/4       Project 2
    10/5 ER Data Modeling The Entity-Relationship Model, Pages 9–20; classic article—among the top 50 all time in CS ER; Keys  
    10/7 Conversion from ER to Relational DB The Entity-Relationship Model, Pages 25–29; ER to RDB Homework 5 by 5:00 pm
    10/10 Review   Interim Evaluation 1  
    10/10–11 Exam 1      
    10/12 Extended ER Data Modeling and Conversion to Relational DB Mapping Conceptual Models to Database Schemas, Sections 4.1–4.4; Extended Models & Mappings  
    10/13       Homework 6 by 5:00 pm
    10/14 Extended ER (cont.); UML; Data Modeling Tools; Project 3a   OSM & the Ontology Editor  
    10/17 Functional Depencencies FD Theory section 9.2 (Pages 431–432) and section 9.4 (Pages 434–438), Wikipedia (Functional Dependency) FDs Homework 7
    10/19 FDs (continued)      
    10/21 Normal Form Design: BCNF NormalForms; Wikipedia (BCNF) BCNF and 3NF Homework 8 by 5:00 pm
    10/24 Dependency Preservation & 3NF; 2NF & 1NF Wikipedia (3NF)    
    10/25       Project 3a
    10/26 MVDs and JDs; 4NF & PJNF   MVDs & JDs  
    10/28 Canonical ER Diagrams; Hypergraph Normalization; Project 3b—normalization Mapping Conceptual Models to Database Schemas, Section 4.5; Opt: Data Design—Reductions; Opt: Data Design—Synthesis Mappings and Normal Form Guarantees; Hypergraph Normalization Homework 9 by 5:00 pm
    10/31 Semantic Web; RDF & RDFS; SPARQL Wikipedia (Semantic Web); Opt. "The Semantic Web" in ScientificAmerican.com, May 2001 Semantic Web  
    11/2 Extraction Ontologies; Project 4 Opt. Web of Knowledge (WoK) Overview Opt. Semantic Web Information Management: A Model-Based Perspective Springer, 2010 WoK Presentation;  
    11/4 no class; instead, attend Michael Cafarella's colloquium on November 17th at 11:00 in TMCB 1170      
    11/7 Normalization Wrap-up Opt. "Is Abstraction the Key to Computing"    
    11/8       Homework 10 by 5:00 pm
    11/9 Cost Analysis; Project 3c; Constraints   CostAnalysis Constraints.doc  
    11/11 Triggers   TriggersTransactions.doc Project 3b by 5:00 pm
    11/14 Review   Interim Evaluation 2  
    11/14–15 Exam 2      
    11/16 Transactions TransProcessing (Intro) Transactions & Concurrency  
    11/17 Michael Cafarella colloquium, 11:00 in 1170 TMCB     Homework 13
    11/18 Crash Recovery; Concurrency Control TransProcessing (Sections 1 & 2)   Project 4
    11/21 Beyond RDBMSs; Semi-structured Data Management; Dataspaces; Presentation Project Managing Semi-structured Data; Beyond Relational Databases; Dataspaces Managing Semi-structured Data  
    11/22 Nested Schemes; Mappings to ORDB & XML Schema Mapping Conceptual Models to Database Schemas, Section 4.6; Opt: Generating Compact Redundancy-Free XML Documents from Conceptual-Model Hypergraphs NestedRelations; XNF; XNF Presentation; Mappings from ER to OR & XML Schema Homework 11 by 5:00 pm
    11/23–25 Thanksgiving Holiday      
    11/28 XML; XQuery; XML DTDs; XML Schema   XQuery; XQuery demo  
    11/29       Project 3c
    11/30 ACM-L Initiative; Web of Knowledge for Historical Documents; DB-like Initiatives on the Web Opt. WoK-HD; Opt. Freebase; Opt. Google Squared; Opt. Google Refine Opt. DBpedia; Opt. Fusion Tables From Information to Knowledge; ACM-L Initiative; WoK-HD  
    12/1       Homework 12
    12/2 Current/Future Data Engineering Challenges     student presentations
    12/5 Current/Future Data Engineering Challenges     student presentations
    12/7 Review   Expected Learning Outcomes  
    12/13 Final, 3:00-6:00      

    Class Bed & Breakfast Database

    You can download a copy of the Bed & Breakfast Database Instance used as an example in class.

    You can use the Bed & Breakfast Database on the Internet. You can also look at the source to see how to use SQLite with PHP to allow databases to be used on the Internet.


    Reference Material

    Miscellaneous

    Google Refine

    Google Fusion Tables Tutorial