Electronic Thesis and Dissertation Markup Language (ETD-ML)
User's Guide

Neill A. Kipp

Virginia Polytechnic Institute and State University



March 9, 1997 - Version 1.3

Blacksburg, Virginia

KEYWORDS:

Copyright 1996, 1997, Neill A. Kipp and the Electronic Thesis and Dissertation Project

(ABSTRACT)

We introduce SGML and ETDs and explain how to build an ETD in SGML. We then explain how to run the formatting software and view the ETD formatted as HTML.

This work was made possible by grants from Southeastern Universities Research Association (SURA) and the Department of Education's Fund for the Improvement of Post-secondary Education (FIPSE).


1. Instructional Overview

The language
Introduce the ETD markup language (ETD-ML)
The encoding
Show how to encode the front matter, body matter, and back matter
The system
Show how to run the formatting system to get results

2. What is ETD-ML?

Electronic Thesis and Dissertation Markup Language (ETD-ML):


3. Looks like HTML, but isn't

The graduate school chose ETD-ML as the markup language for ETDs, mainly because HTML does not define ...


4. Example of ETD-ML

What does ETD-ML look like? This is an example from a bibliography.

      <worktitle>Being Green Revisited</worktitle>
      <workauthor>Frog, K. T., Jr.</workauthor>

HTML and ETD-ML are similar. Both languages are constructed using the Standard Generalized Markup Language (SGML).


5. How does SGML work?

In SGML, all documents are structured data.

SGML lets you use markup to mark the data.

Markup tells what each data item means.

The system determines how to format the data and where the links go.


6. How SGML works: Elements

The building block of SGML is the element.

Each element has a start-tag, content, and an end-tag.

Inside the start-tag and end-tag is a generic identifier (GI) that gives the name of the element.


Figure 6.1. Anatomy of an element.
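
As a minimal sketch of the anatomy described above, a single element can be taken apart as follows. The element name worktitle is borrowed from the bibliography example earlier in this guide; the labels on the left are ours and are not part of the markup.

    <worktitle>Being Green Revisited</worktitle>

    start-tag:  <worktitle>            (generic identifier: worktitle)
    content:    Being Green Revisited
    end-tag:    </worktitle>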

7. How SGML works: Nested Elements

Elements can contain data or other elements.


Figure 7.1. Nested elements.
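
For instance, in the front matter shown later in this guide, the author element contains a given element and a surname element, and each of those contains data. The end-tags are written out here for clarity.

    <author>
      <given>Albert J.</given>
      <surname>Kippleby</surname>
    </author>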

8. How SGML works: Attributes

Attributes are [name, value] pairs that help describe elements in an SGML document.

Attributes


Figure 8.1. Anatomy of an attribute.
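
For example, the float element used later in this guide carries an attribute whose name is id and whose value is cat. The pair is written inside the start-tag, with the name to the left of the equals sign and the value to the right. The quoted form on the second line is ordinary SGML and should be equivalent, although this guide itself only shows the unquoted form.

    <float id=cat>
    <float id="cat">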

9. Document Type Definition

We declare document structures in the Document Type Definition (DTD).

The DTD declares...

In short, all possible document structures are declared in the DTD.
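
To give a feel for what such declarations look like, here is a small sketch in SGML declaration syntax. It is illustrative only: it is not copied from etd.dtd, and the actual content models in the DTD for ETDs may differ. It says that an author contains a given name followed by a surname, and that each of those contains data. The ``- O'' means that the start-tag is required and the end-tag may be omitted.

    <!-- Hypothetical declarations, for illustration only -->
    <!ELEMENT author  - O (given, surname)>
    <!ELEMENT given   - O (#PCDATA)>
    <!ELEMENT surname - O (#PCDATA)>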


10. Document Type Definition, Formal Definitions

Our document type is described in the DTD for ETDs.

NOTE: HTML has its own DTD, too.


11. The ETD File

QUESTION: So what do I type?

First, in your file ``thesis.etd'', make the document type declaration:

    <!DOCTYPE etd SYSTEM "etd.dtd" [
    ]>

The "DOCTYPE etd" tells the computer what type your file is. The structure is defined in the DTD file etd.dtd.


12. Document Element

Next you put the start-tag for the ETD element.

Like this:

    <etd>

13. Parts of an ETD

An ETD is made up of front matter, body matter, and back matter.


Figure 13.1. Main structures of an ETD
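
Put together, the overall shape of ``thesis.etd'' can be sketched as below. The document type declaration, the etd start-tag, and the front and body tags all appear in this guide. The back tag and the closing </etd> are our assumptions, by analogy, so check the DTD for the actual names. The lines between ellipses stand for content that you supply.

    <!DOCTYPE etd SYSTEM "etd.dtd" [
    ]>
    <etd>
    <front>
      ... title, author, abstract, etc. ...
    <body>
      ... chapters ...
    <back>   <!-- assumed element name -->
      ... bibliography, appendices, vita ...
    </etd>   <!-- assumed; may be omissible -->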

We will begin with the front matter.


14. ETD Front Matter

Front matter contains information about the thesis or dissertation.

It includes title, author, submission school, degree, major, approvals, date, city, state, keywords, copyright, abstract, grant (optional), dedication (optional), and acknowledgments (optional).


Figure 14.1. ETD Front Matter

15. Front Matter: Title, Author

Use the front tag to begin the front matter. Use appropriate markup to mark the title, author, etc.

  <front>
    <title>Use of Metaphor in Shakespeare's Plays 
    and its Potential Application 
    in Twenty-first Century Literature
    <author>
      <given>Albert J. 
      <surname>Kippleby

NOTE: To save typing, you may omit end-tags of elements that cannot contain the following element.

NOTE: You may not omit end-tags of items within paragraphs (more on this later); put empty end-tags (</>) instead.
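
The two notes above can be illustrated with a short sketch. The first fragment writes out every end-tag. The second omits the end-tags that the parser can infer, which is the style used in this guide. In the third fragment, em occurs inside a paragraph, so its end-tag is not omitted; the empty end-tag (</>) closes it. The sentence in the third fragment is made up for illustration.

    <!-- fully tagged -->
    <author><given>Albert J.</given><surname>Kippleby</surname></author>

    <!-- end-tags omitted -->
    <author>
      <given>Albert J.
      <surname>Kippleby

    <!-- inside a paragraph: use the empty end-tag -->
    <p>Metaphor is <em>everywhere</> in the plays.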


16. Front Matter: Submission, School, Degree, Major

Continue the front matter with submission, school, degree, and major.

    <submission>Dissertation
    <school>Virginia Polytechnic Institute 
    and State University
    <degree>Doctor of Philosophy
    <major>Literature and Technology

17. Front Matter: Approvals

Continue by listing all the approvals.

Note that the approvals element contains name elements.

    <approvals>
      <name>Laura Weiss
      <name>Emilio J. Arce
      <name>M. J. Bean

18. Front Matter: Date, City, State

Continue by listing the date of your defense, city, and state.

    <date>July 16, 1996
    <city>Blacksburg
    <state>Virginia

19. Front Matter: Keywords

Continue by listing the keywords. Keywords contain keyword elements.

NOTE: keywords is plural and keyword is singular.

    <keywords>
    <keyword>Metaphysics
    <keyword>Information Retrieval
    <keyword>Spacecraft

20. Front Matter: Copyright

Although copyright implicitly devolves to the author, you should explicitly claim the copyright of your work.

    <copyright>Copyright 1996, Albert J. Kippleby

21. Front Matter: Abstract

The abstract follows. It contains one or more paragraphs. Each paragraph contains data, or perhaps tagged data like foreign words (bien sur!), strong words, emphasized words, etc.

<abstract>

<p>The need for concrete examples increases when technology becomes difficult to explain. In documentation for computer systems especially, we see a wide audience of field experts attempting to understand documentation for computer software and hardware of which they should only require a cursory understanding. Additionally, as the pace of the information age quickens, we see document authors struggle for <foreign>examplia concretes</> with wide applicability, and consistently rely on excerpts from Shakespearean literature as a public-domain source for their various explications.


22. Front Matter: Grant Information (optional)

If your work was funded by a granting institution, list the grant information here.

<grant>This work received support from the Southeastern Universities Research Association (SURA) <q>Monticello Library Project.</q>


23. Front Matter: Dedication (optional)

You may provide a dedication if you want:

<dedication>

<p>I dedicate this work to all the dolphins.


24. Front Matter: Acknowledgments (optional)

Acknowledge everyone you know here:

<acknowledgments>

<p>I would like to thank my loving spouse, my parents, my siblings, my beautiful and patient children, ...

<p>... my major professor (who never sleeps), committee, teachers, loyal staff, administrators, ...

<p>... Mill Mountain, Bollo's, SubStation II, the Cellar, Old Dominion Brewery...


25. Quick Review, Front Matter


26. Body Matter

After the front matter, an ETD has body matter.

The body matter contains one or more chapters.


Figure 26.1. ETD Body Matter

27. Body Matter: Start Tag

Use the <body> tag to begin the body.

NOTE: The <front> element closes automatically.

    <body>

28. Body Matter: Chapter

Chapters have a head (optional) and contain paragraphs followed by sections.

Each chapter in the ETD is numbered: 1, 2, 3, etc.

<chapter><head>Introduction

<p>William Shakespeare has profoundly affected the field of technology worldwide. In the United States there was a huge surge of Shakespearean resurgence literature starting in the early 1980s, beginning with the Shakespearean Festival in Montgomery, Alabama and spreading outward...


29. Sections, Subsections, Blocks, Subblocks

In the system,

<section><head>East Coast Revival

<p>After the uprising in Montgomery, the entire East Coast of the United States reveled in the motif. Inevitably, software developers would join, and inevitably, it would affect their writing in extreme ways.


30. Paragraphs

Paragraphs contain the body text of the ETD, along with lists, quotations, and mathematical formulae.

They are the building block of any document, particularly an ETD.

Because there are so many paragraphs, the generic identifier is very short: ``p''.


31. Text

The following elements are allowed anywhere text may occur.

#PCDATA
Literally, ``parsed character data;'' this is the data content of an element
em
Denotes emphasized text
strong
Denotes strong text
tt
Denotes typed text
q
Denotes ``quoted'' text
foreign
Denotes that le text est d'un autre langue
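
A short sketch of these elements in use within a paragraph follows. The sentences are made up for illustration, and each inline element is closed with the empty end-tag (</>).

    <p>The fool is <em>never</> foolish and <strong>always</> wise.
    Save your file as <tt>thesis.etd</> before formatting it.
    Hamlet asks <q>To be, or not to be</> and the critics answer
    <foreign>c'est la vie</>.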

32. More Text

The following are more elements that are allowed anywhere text may occur.

link
Makes a hypertext reference (hyperlink) to a citation, footnote, another chapter or section, etc. [more on hyperlinking later].
target
Targets the contained text as a potential anchor of a hyperlink (does not necessarily change the formatting)
a
Denotes that the contained text is an anchor of an HTML hyperlink reference to a URL.
sup
Denotes superscript (e.g., a^2 + b^2 = c^2, where each 2 is raised)
sub
Denotes subscript (e.g., He_2, where the 2 is lowered)
worktitle
Denotes that the content is Title of a Work
articletitle
Denotes that the content is ``Title of an Article''
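
As a sketch, the superscript and subscript examples in the list above would be encoded as follows, together with an article title and a work title. The article title is made up for illustration; the work title is taken from the bibliography example earlier in this guide.

    <p>Pythagoras: a<sup>2</> + b<sup>2</> = c<sup>2</>.
    The formula is He<sub>2</>.
    See <articletitle>Notes on Being Green</> and
    <worktitle>Being Green Revisited</>.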

33. Paragraphs

Paragraphs can contain any of the above, but can also contain:

head
Head. A paragraph head
ol
Ordered list (numbered)
ul
Unordered list (bullets)
dl
Description list (like this one)
pre
Preformatted text
verse
Verse from a poem or play
blockquote
Long quote from another source (indents on both sides)
attrib
To whom the verse or blockquote is attributed

34. Body Matter: Hints

Note that the processing system chews up leftover whitespace.

You may use spaces or tabs or newlines to indent the source file (thesis.etd) all you want. The output will be the same!

The above is not true for the element pre, however. Inside that element, all whitespace is preserved.
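
As a sketch of this behavior, the following two paragraphs should produce the same output, because runs of spaces, tabs, and newlines outside pre are treated alike. The text is made up for illustration.

    <p>All the world's a stage.

    <p>All   the
          world's
       a stage.

In contrast, the line breaks and indentation in the next fragment are kept exactly as typed:

    <pre>
    All the
        world's a stage.
    </pre>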


35. Ordered Lists

Ordered lists look like this:

  1. Veni
  2. Vidi
  3. Vici
    <ol>
    <li>Veni
    <li>Vidi
    <li>Vici
    </ol>

36. Unordered Lists

Unordered lists look like this:

    <ul>
    <li>Veni
    <li>Vidi
    <li>Vici
    </ul>

37. Description Lists

Description lists look like this:

Veni
I came
Vidi
I saw
Vici
I conquered
    <dl>
      <dt>Veni
        <dd>I came
      <dt>Vidi
        <dd>I saw
      <dt>Vici
        <dd>I conquered
    </dl>

38. Hyperlinks

The link element (link) connects your document together.

Links connect chapters, sections, figures, tables, citations, terms, glossaries, ...

Creating a link has only four steps:

  1. Create the link element.
  2. Put the unique identifier (ID) of the target as the value of the ``goesto'' attribute.
  3. Put the text that the reader will select inside the link element.
  4. Set the ID of the target.


Figure 38.1. Hyperlink to a citation.
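
The four steps can be sketched as follows, using the floating figure from the Floating Matter section of this guide as the target. The link element and its goesto attribute are described above, and the float element with id=cat comes from that later section; the sentence text is made up for illustration.

    <!-- steps 1-3: the link, its goesto value, and the selectable text -->
    <p>A portrait appears in <link goesto=cat>the figure of
    Shakespeare's cat</>.

    <!-- step 4: the target carries the matching ID -->
    <float id=cat>
    <mm entity=cat>color graphic</>
    <caption>Shakespeare's Cat</>
    </float>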

39. Footnotes

Footnotes can occur anywhere in the ETD. Use the link element to refer to each footnote.


Figure 39.1. Footnote example.

40. Multimedia

You can include a multimedia object (picture, video, animation, soundclip) anywhere in the ETD.

In the preamble (between ``['' and ``]''),

  1. Declare the object type (gif, jpeg, etc.)
    <!NOTATION gif SYSTEM >

  2. Declare the multimedia file name
    <!ENTITY lander SYSTEM "lander.gif" NDATA gif >

And in your document:

    <mm entity=lander>color graphic, gif, 12k</mm>

41. Multimedia Object Example

You make a multimedia object like this:


Figure 41.1. Reference to Multimedia
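
Assembled into one file, the declarations and the reference from the previous section can be sketched as follows. The names lander and lander.gif come from that section; the ellipsis stands for the rest of your document.

    <!DOCTYPE etd SYSTEM "etd.dtd" [
      <!NOTATION gif SYSTEM >
      <!ENTITY lander SYSTEM "lander.gif" NDATA gif >
    ]>
    <etd>
    ...
    <mm entity=lander>color graphic, gif, 12k</mm>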

42. Floating Matter

Floating matter can go anywhere in the document. The formatter determines where and how to display it.

You may float paragraphs, multimedia objects, or tables.

    <float id=cat>
    <mm entity=cat>color graphic</>
    <caption>Shakespeare's Cat</>
    </float>

43. Floating Matter, Results


Figure 43.1. Shakespeare's Cat (color graphic)
    <float id=cat>
    <mm entity=cat>color graphic</>
    <caption>Shakespeare's Cat</>
    </float>

44. Tables: Column-major

Encode a column-major table by listing the columns, one after the next.


Figure 44.1. Column-major table

45. Tables: Row-major

Encode a row-major table by listing the rows, one after the next.


Figure 45.1. Row-major table

46. Tables: Column major

Each column can have a column head.

Columns can contain cells or row heads.

<table>

<colhead>Word
<column>
<rowhead>hello
<rowhead>goodbye

<colhead>Spanish
<column>
<c>hola
<c>adios

<colhead>French
<column>
<c>bonjour
<c>au revoir

Word     Spanish  French
hello    hola     bonjour
goodbye  adios    au revoir


47. Tables: Row major

Each row can have a row head.

Rows can contain cells or column heads.

<table>

<rowhead>Word
<row>
<colhead>Spanish
<colhead>French

<rowhead>hello
<row>
<c>hola
<c>bonjour

<rowhead>goodbye
<row>
<c>adios
<c>au revoir

Word     Spanish  French
hello    hola     bonjour
goodbye  adios    au revoir


48. Quick Review, Body Matter

In this part we discussed:


49. ETD Back Matter

ETD back matter contains the bibliography, appendices, and vita (your life and career history).


Figure 49.1. ETD Back matter

50. Bibliography

The bibliography is simply a list of citations.


Figure 50.1. Bibliography section structure

51. Citations

Citations may contain any of the following elements:


52. Refer to a Citation

You refer to a citation like this:


Figure 52.1. Referring to a Citation

53. ETD Back Matter: Appendix

Appendices are exactly like chapters except that they are numbered A, B, C, ...

Sections within an appendix are numbered like A.1, A.2, ...

Subsections within an appendix are numbered like A.1.1, A.1.2, ...


54. ETD Back Matter: Vita

Your life story goes in the vita.

<vita>

<p>Albert J. Kippleby was born on a sunny day...

<p>Later he attended Virginia Polytechnic Institute and State University where he completed this auspicious document.


55. Quick Review, Back Matter


56. Making it all work

Recall the system diagram, where the user writes ``thesis.etd,'' runs etd2html, then views ``thesis.html'' with a Web browser.


Figure 56.1. ETD Format System.

57. Messages from etd2html

Just like software, documents may have bugs.

Debugging is part of SGML document development.

If the beginning of the output file (thesis.html) has a message that begins with: nsgmls:SGML error, then you need to fix one or more bugs.

HINT: Most bugs come from simple typos.

NOTE: We have designed ETD-ML to avoid as many possibilities for bugs as possible.


58. Format of the Message

The parts of a system error or warning message are as follows:


Figure 58.1.