Virginia Polytechnic Institute and State University
March 9, 1997 - Version 1.3
Copyright 1996, 1997, Neill A. Kipp and the Electronic Thesis and Dissertation Project
We introduce SGML and ETDs and explain how to build an ETD in SGML. We then explain how to run the formatting software and view the ETD as formatted HTML.
This work was made possible by grants from Southeastern Universities Research Association (SURA) and the Department of Education's Fund for the Improvement of Post-secondary Education (FIPSE).
Electronic Thesis and Dissertation Markup Language (ETD-ML):
The graduate school chose ETD-ML as the markup language for ETDs, mainly because HTML does not define ...
What does ETD-ML look like? This is an example from a bibliography.
<worktitle>Being Green Revisited</worktitle> <workauthor>Frog, K. T., Jr.</workauthor>
HTML and ETD-ML are similar. Both languages are constructed using the Standard Generalized Markup Language (SGML).
In SGML, all documents are structured data.
SGML lets you use markup to mark the data.
Markup tells what each data item means.
The system determines how to format the data and where the links go.
The building block of SGML is the element.
Each element has a start-tag, content, and an end-tag.
Inside the start-tag and end-tag is a generic identifier (GI) that gives the name of the element.
Elements can contain data or other elements.
Attributes are [name, value] pairs that help describe elements in an SGML document.
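As a sketch of these terms, the fragment below labels the parts of two elements that appear elsewhere in this guide. The comments are ours, and the attribute values are only illustrative.

```sgml
<!-- element: start-tag, content, end-tag; the GI is "worktitle" -->
<worktitle>Being Green Revisited</worktitle>

<!-- attribute: name "id", value "cat", written inside the start-tag -->
<!-- the float element contains other elements rather than plain data -->
<float id=cat>
  <mm entity=cat>color graphic</mm>
  <caption>Shakespeare's Cat</caption>
</float>
```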
We declare document structures in the Document Type Definition (DTD).
The DTD declares...
In short, all possible document structures are declared in the DTD.
Our document type is described in the DTD for ETDs.
NOTE: HTML has its own DTD, too.
QUESTION: So what do I type?
First, in your file ``thesis.etd'' you make the document type declaration:
<!DOCTYPE etd SYSTEM "etd.dtd" [ ]>
The "DOCTYPE etd" tells the computer what type your file is. The structure is defined in the DTD file etd.dtd.
Next you put the start-tag for the ETD element.
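Put together, the top of thesis.etd might look like the following sketch. We assume the start-tag of the ETD element is written with the same name as the document type in the declaration.

```sgml
<!DOCTYPE etd SYSTEM "etd.dtd" [ ]>
<etd>
```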
An ETD is made up of front matter, body matter, and back matter.
We will begin with the front matter.
Front matter contains information about the thesis or dissertation.
It includes title, author, submission school, degree, major, approvals, date, city, state, keywords, copyright, abstract, grant (optional), dedication (optional), and acknowledgments (optional).
Use the front tag to begin the front matter. Use appropriate markup to mark the title, author, etc.
<front> <title>Use of Metaphor in Shakespeare's Plays and its Potential Application in Twenty-first Century Literature <author> <given>Albert J. <surname>Kippleby
NOTE: To save typing, you may omit end-tags of elements that cannot contain the following element.
NOTE: You may not omit end-tags of items within paragraphs (more on this later); put empty end-tags (</>) instead.
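To illustrate the two notes above, here is the author from the earlier example written both ways, followed by in-paragraph markup closed with an empty end-tag. This is only a sketch: the fully tagged form assumes that the end-tags carry the same names as the start-tags already shown, and the comments are ours.

```sgml
<!-- minimized: end-tags omitted -->
<author> <given>Albert J. <surname>Kippleby

<!-- fully tagged: same structure -->
<author> <given>Albert J.</given> <surname>Kippleby</surname> </author>

<!-- inside a paragraph: close the item with an empty end-tag -->
<p>Authors struggle for <foreign>examplia concretes</> with wide applicability.
```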
Continue the front matter with submission, school, degree, and major.
<submission>Dissertation <school>Virginia Polytechnic Institute and State University <degree>Doctor of Philosophy <major>Literature and Technology
Continue by listing all the approvals.
Note that the approvals element contains name elements.
<approvals> <name>Laura Weiss <name>Emilio J. Arce <name>M. J. Bean
Continue by listing the date of your defense, city, and state.
<date>July 16, 1996 <city>Blacksburg <state>Virginia
Continue by listing the keywords. Keywords contain keyword elements.
NOTE: keywords is plural and keyword is singular.
<keywords> <keyword>Metaphysics <keyword>Information Retrieval <keyword>Spacecraft
Although copyright implicitly devolves to the author, you should explicitly claim the copyright of your work.
<copyright>Copyright 1996, Albert J. Kippleby
The abstract follows. It contains one or more paragraphs. Each paragraph contains data, or perhaps tagged data like foreign words (bien sur!), strong words, emphasized words, etc.
<p>The need for concrete examples increases when technology becomes difficult to explain. In documentation for computer systems especially, we see a wide audience of field experts attempting to understand documentation for computer software and hardware of which they should only require a cursory understanding. Additionally, as the pace of the information age quickens, we see document authors struggle for <foreign>examplia concretes</> with wide applicability, and consistently rely on excerpts from Shakespearean literature as a public-domain source for their various explications.
If your work was funded by a granting institution, list the grant information here.
<grant>This work received support from the Southeastern Universities Research Association (SURA) <q>Monticello Library Project.</q>
You may provide a dedication if you want:
<p>I dedicate this work to all the dolphins.
Acknowledge everyone you know here:
<p>I would like to thank my loving spouse, my parents, my siblings, my beautiful and patient children, ...
<p>... my major professor (who never sleeps), committee, teachers, loyal staff, administrators, ...
<p>... Mill Mountain, Bollo's, SubStation II, the Cellar, Old Dominion Brewery...
After the front matter, an ETD has body matter.
The body matter contains one or more chapters.
Use the <body> tag to begin the body.
NOTE: The <front> element closes automatically.
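As a sketch of the note above, the last paragraph of the front matter can be followed directly by the body start-tag. We assume here that writing the end-tag for the front matter explicitly would be equivalent; the comment is ours.

```sgml
<p>... Mill Mountain, Bollo's, SubStation II, the Cellar, Old Dominion Brewery...
<!-- no </front> needed: the body start-tag ends the front matter -->
<body>
```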
Chapters have a head (optional) and contain paragraphs followed by sections.
Each chapter in the ETD is numbered: 1, 2, 3, etc.
<p>William Shakespeare has profoundly affected the field of technology worldwide. In the United States there was a huge surge of Shakespearean resurgence literature starting in the early 1980s, beginning with the Shakespearean Festival in Montgomery, Alabama and spreading outward...
In the system,
<section><head>East Coast Revival
<p>After the uprising in Montgomery, the entire East Coast of the United States reveled in the motif. Inevitably, software developers would join, and inevitably, it would affect their writing in extreme ways.
Paragraphs contain the body text of the ETD, along with lists, quotations, and mathematical formulae.
They are the building blocks of any document, particularly an ETD.
Because there are so many paragraphs, the generic identifier is very short: ``p''.
The following elements are allowed anywhere text may occur.
The following are more elements that are allowed anywhere text may occur.
Paragraphs can contain any of the above, but can also contain:
Note that the processing system chews up leftover whitespace.
You may use spaces or tabs or newlines to indent the source file (thesis.etd) all you want. The output will be the same!
The above is not true for the pre element, however. Inside that element, all whitespace is preserved.
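The two fragments below sketch the point about indentation using the keywords from the front matter. Assuming the processing system treats whitespace as described above, both should produce the same output.

```sgml
<keywords> <keyword>Metaphysics <keyword>Information Retrieval <keyword>Spacecraft

<keywords>
    <keyword>Metaphysics
    <keyword>Information Retrieval
    <keyword>Spacecraft
```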
Ordered lists look like this:
<ol> <li>Veni <li>Vidi <li>Vici </ol>
Unordered lists look like this:
<ul> <li>Veni <li>Vidi <li>Vici </ul>
Description lists look like this:
<dl> <dt>Veni <dd>I came <dt>Vidi <dd>I saw <dt>Vici <dd>I conquered </dl>
The link element (link) connects the parts of your document.
Links connect chapters, sections, figures, tables, citations, terms, glossaries, ...
Creating a link has only four steps:
Footnotes can occur anywhere in the ETD. Use the link element to refer to each footnote.
You can include a multimedia object (picture, video, animation, soundclip) anywhere in the ETD.
In the preamble (between ``['' and ``]''),
And in your document:
<mm entity=lander>color graphic, gif, 12k</mm>
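The declaration that belongs in the preamble is not shown above. In SGML, an external data entity is usually declared as in the sketch below. The file name lander.gif and the notation name gif are assumptions made for illustration; the notation must be one that the ETD DTD actually declares.

```sgml
<!DOCTYPE etd SYSTEM "etd.dtd" [
  <!-- hypothetical file name and notation name -->
  <!ENTITY lander SYSTEM "lander.gif" NDATA gif>
]>
```

The entity name (lander) is what the entity attribute of the mm element refers to.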
You make a multimedia object like this:
Floating matter goes anywhere in the document. The formatter will determine where/how to display it.
You may float paragraphs, multimedia objects, or tables.
<float id=cat> <mm entity=cat>color graphic</> <caption>Shakespeare's Cat</> </float>
Encode a column-major table by listing the columns, one after the next.
Encode a row-major table by listing the rows, one after the next.
Each column can have a column head.
Columns can contain cells or row heads.
<table> <colhead>Word <column> <rowhead>hello <rowhead>goodbye <colhead>Spanish <column> <c>hola <c>adios <colhead>French <column> <c>bonjour <c>au revoir
Each row can have a row head.
Rows can contain cells or column heads.
<table> <rowhead>Word <row> <colhead>Spanish <colhead>French <rowhead>hello <row> <c>hola <c>bonjour <rowhead>goodbye <row> <c>adios <c>au revoir
In this part we discussed:
ETD back matter contains the bibliography, appendices, and vita (your life and career history).
The bibliography is simply a list of citations.
Citations may contain any of the following elements:
You refer to a citation like this:
Appendices are exactly like chapters except that they are numbered A, B, C, ...
Sections within an appendix are numbered like A.1, A.2, ...
Subsections within an appendix are numbered like A.1.1, A.1.2, ...
Your life story goes in the vita.
<p>Albert J. Kippleby was born on a sunny day...
<p>Later he attended Virginia Polytechnic Institute and State University where he completed this auspicious document.
Recall the system diagram, where the user writes ``thesis.etd,'' runs etd2html, then views ``thesis.html'' with a Web browser.
Just like software, documents may have bugs.
Debugging is part of SGML document development.
If the beginning of the output file (thesis.html) has a message that begins with: nsgmls:SGML error, then you need to fix one or more bugs.
HINT: Most bugs come from simple typos.
NOTE: We have designed ETD-ML to leave as few opportunities for bugs as possible.
The parts of a system error or warning message are as follows: