Extensible Markup Language (XML) has become a standard for persistent data storage and data interchange over the Internet, owing to its openness, self-descriptiveness and flexibility. As a result, more and more data are converted into or handled in XML format, and software developers are increasingly likely to have to work with XML documents. Common ways of manipulating XML documents include:
1. extracting desired data from the documents for further processing,
2. translating their contents to generate XML documents that mostly have a different structure (or schema), or even documents in other formats, such as plain text or HTML, and
3. designing the necessary data structures, such as table schemas in a relational database, for storing their contents.

Given a small to medium-sized XML document, it is possible to view its contents in a text editor to gain the understanding needed to achieve these goals. However, an XML document can be so large that it cannot be loaded into computer memory, which makes it impossible to understand its contents by viewing it in a text editor. There is software that enables users to examine an XML document, for example with XPath queries, provided that the user has already gone through the entire document to learn its structure or has studied its schema document, which is itself hard to understand: a Document Type Definition (DTD) [1] or the more powerful but more complicated XML Schema Definition (XSD) [2]. The situation is even worse when a huge XML document must be handled and the corresponding schema is missing; the user then has no choice but to inspect the document manually. With this scenario in mind, we propose a systematic approach to reverse engineer arbitrary XML documents to their conceptual schema, DTD Graphs…
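To make the idea of recovering structure from a document without a schema concrete, the following is a minimal sketch and not the algorithm developed in this work. It assumes Python's standard `xml.etree.ElementTree.iterparse`, and the function name `summarize_structure` and the sample document are hypothetical. It streams through a document and records, for every parent/child element pair, the largest number of times the child appears under a single parent, which gives a rough hint of one-to-one versus one-to-many nesting.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict


def summarize_structure(source):
    """Return {(parent_tag, child_tag): max occurrences of child under one parent}.

    Hypothetical helper for illustration only; `source` is a filename or a
    binary file-like object containing an XML document.
    """
    max_occurs = defaultdict(int)
    stack = []  # one (tag, child_counts) entry per currently open element
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if stack:
                # Count this element as a child of the enclosing open element.
                stack[-1][1][elem.tag] += 1
            stack.append((elem.tag, defaultdict(int)))
        else:
            tag, child_counts = stack.pop()
            for child, count in child_counts.items():
                key = (tag, child)
                max_occurs[key] = max(max_occurs[key], count)
            # Discard the finished element's text and children to limit memory use.
            elem.clear()
    return dict(max_occurs)


# Made-up sample document, used only to exercise the sketch.
sample = (
    b"<library>"
    b"<book><title/><author/><author/></book>"
    b"<book><title/><author/></book>"
    b"</library>"
)
structure = summarize_structure(io.BytesIO(sample))
for (parent, child), count in sorted(structure.items()):
    print(parent, "->", child, "max occurrences:", count)
```

A maximum of 1 only suggests a one-to-one nesting in the sampled document, and a value above 1 suggests one-to-many; neither says anything about references between elements or about the other data semantics listed in the contents below.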
Contents
CHAPTER 1 INTRODUCTION
1.1 BACKGROUND
1.2 MOTIVATION
1.3 OBJECTIVES
1.4 INTRODUCTION OF THE PROPOSED SOLUTION
1.5 REPORT ORGANIZATION
CHAPTER 2 RELATED WORKS
2.1 COMPARISON OF RELATIONAL DATABASE AND XML TECHNOLOGIES
2.2 THE DETERMINATION OF DATA SEMANTICS FROM XML DOCUMENTS
2.2.1 Reverse engineering of schema from XML
2.3 THE APPROACHES OF GENERATING XML DOCUMENTS
2.4 THE APPROACHES OF STORING XML DOCUMENTS TO RELATIONAL DATABASE
2.5 THE IMPLEMENTATION OF INHERITANCE AMONG XML ELEMENT
CHAPTER 3 IMPLEMENTING AND DETERMINING VARIOUS DATA SEMANTICS
3.1 OVERVIEW OF XML
3.1.1 Referential Integrity in XML documents
3.1.2 Common XML document structures and their interpretations
3.2 XML SCHEMA OVERVIEW
3.2.1 XML Schema – DTD, XSD and alternatives
3.3 DTD GRAPH AND ALTERNATIVES
3.3.1 The DTD Graph to be used
3.4 THE ALGORITHM FOR SIMPLIFYING THE DTD
3.4.1 Constructing DTD graph based on XSD
3.4.2 Data Semantic Graph for XML
3.5 VARIOUS DATA SEMANTICS IN XML
3.5.1 Cardinalities – one-to-many/one-to-one
3.5.2 The Algorithm for Determining Cardinality Relationships
3.5.3 Cardinality – many-to-many and N-ary
3.5.4 The Algorithm for Determining Many-to-many and N-ary Relationships
3.5.5 Participation
3.5.6 Aggregation
3.5.7 The algorithm for Determining Aggregation Relationships
3.5.8 Weak-entity
3.5.9 Is-A relationship
3.5.10 The algorithm for Determining Is-a Relationships
3.5.11 Generalization relationships
3.5.12 Categorization relation
3.5.13 Unary relation
3.5.14 Data semantics natively supported by XML
3.6 VERIFYING AN XML DOCUMENT WITH THE ALGORITHMS
3.7 LIMITATIONS OF THE PROPOSED ALGORITHMS
CHAPTER 4 CASE STUDIES
4.1 EXAMPLE ONE
4.2 EXAMPLE TWO
CHAPTER 5 PROTOTYPE
5.1 THE DEVELOPMENT OF THE PROTOTYPE
5.2 SAMPLE OUTPUTS FOR THE CASE STUDIES
5.2.1 Execution of the prototype
5.2.2 Example One
5.2.3 Example Two
CHAPTER 6 EVALUATION
6.1 CORRECTNESS ANALYSIS
6.1.1 Group one – Cardinality and participation data semantics
6.1.2 Group two – is-a data semantic
6.1.3 Group three – aggregation data semantic
6.1.4 Group four – generalization, many-to-many/n-ary, unary data semantics
6.2 PERFORMANCE ANALYSIS
6.2.1 Performance testing of the prototype
6.3 PRACTICAL LIMITATIONS AND CONSIDERATIONS
CHAPTER 7 CONCLUSION
7.1 WHAT’S NEXT
REFERENCES
APPENDIX A LIST OF PUBLICATIONS
APPENDIX B SAMPLE XML AND DTD DOCUMENTS
Author: Shiu, Hoi Cheung
Source: City University of Hong Kong