Introduction
This article concludes our introduction to HTML with a
presentation of some valuable guidelines for working with HTML documents
and code that will help maximize their maintainability and reusability.
Of central importance is the need to understand HTML and its role in
Web applications, to plan ahead for maintainable and reusable code, and
to adopt a consistent policy on coding style.
Coding Style Guidelines
Consistency is absolutely a prerequisite for maximizing
maintainability and reusability. These general guidelines for coding
style can form the basis of a set of standards that will help ensure
that all developers in a project—or, better, in all projects across an organization—write code consistently.
- Use well-formed HTML.
- Pick good names and ID values.
- Indent consistently.
- Limit line length.
- Standardize character case.
- Use comments judiciously.
Use Well-formed HTML
Although Web browsers are generally forgiving and can ignore many
mistakes, rendering most HTML as the document author intended, it is
still a good idea to use well-formed HTML code, for a number of reasons.
Well-formed markup code is a concept that has gained importance
with increased implementation of XML. While browsers did not, in
general, enforce HTML language rules very closely, XML parsers do. Code
is considered well formed when it is structured according to the rules
for XML 1.0. These rules relate to character case, tags, nesting, and
attribute values.
In general, when most browsers encounter an unrecognized or
extraneous tag, they ignore it. However, different browsers might
deliver results in different—and unpredictable—ways. In addition, future
versions of browsers might adhere to standards more closely than do
current versions. Finally, code that includes such elements can be
harder to read and understand, making maintenance more difficult.
- Lowercase names: To be well-formed, element and attribute names must be in all lowercase. In versions through 4.01, HTML is not case-sensitive. However, XML is case-sensitive, and it follows that the XHTML 1.0 recommendation is also case-sensitive. So, to ensure that code keeps working and to maximize reusability, write all element and attribute names in lowercase from the start.
- Closing tags: All nonempty elements must have corresponding closing tags. Empty elements (those previously signified with a single tag, such as <br> and <hr>) must be followed immediately by a corresponding closing tag, or the tag must end with "/>". For example, <br></br> and <hr /> are both examples of well-formed code.
- Nested elements: All nested elements must be properly nested. For example:
    <p><b>Some text</b></p>
  Note that the <b> tag and its corresponding closing tag, </b>, are both nested inside the <p> and </p> tags.
  If elements overlap, then they are not properly nested, as illustrated in the following code:
    <p><b>Some text</p></b>
  While many browsers have accepted overlapping elements and given the expected results, they have always been, strictly speaking, illegal in HTML, and future versions of browsers might not support them.
- Attribute values: Attribute values, even numeric ones, should be quoted; for example, write width="100" rather than width=100.
- Code validation: Another step toward improving HTML code is to validate it against a formal published grammar and to declare that grammar at the beginning of the HTML document. For example, the following line declares the public HTML 3.2 Final grammar:
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  A list of formal published grammars is available from the W3C at http://validator.w3.org/sgml-lib/catalog. The W3C also has a public HTML validation service at http://validator.w3.org/.
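The well-formedness rules above can be pulled together in one small document. This is only a sketch: it assumes XHTML 1.0 Strict as the target grammar, and the title and table content are placeholders rather than anything from the original article.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Well-formed example</title>
  </head>
  <body>
    <!-- Nonempty elements are closed, and the b element nests inside p -->
    <p><b>Some text</b></p>
    <!-- Empty elements end with "/>" -->
    <hr />
    <!-- Attribute values are quoted, including numeric ones -->
    <table border="1">
      <tr>
        <td colspan="2">Quoted values</td>
      </tr>
    </table>
  </body>
</html>
```

Submitting a page like this to the W3C validation service mentioned above is a quick way to confirm that it matches the declared grammar.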
Assign Meaningful Names and ID Values
Use a consistent scheme for assigning the values of name and ID
properties. They should be as short as reasonably possible without
giving up descriptive power. Also, use mixed-case property values to
help readability (see Listing 2). In this code snippet, the check box
names express not only the purpose of each element but also its type.
Listing 2: Example of Good Element Names
Member?
Admin?
Owner?
Admin?
Owner?
HTML primarily refers to elements by their name property, while DHTML and client-side scripts use the ID
property. Although IDs must be unique within a document,
in general, there is no reason not to use the same value for an
element's name and ID properties. Using the same value for these
properties can reduce confusion that might arise when mixing HTML and
client-side scripting.
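The naming advice above might look like the following in practice. This is a minimal sketch, not the original listing: the names chkMember, chkAdmin, and chkOwner, the form action, and the label elements are all assumptions chosen to match the labels "Member?", "Admin?", and "Owner?".

```html
<form action="/members" method="post">
  <!-- "chk" marks each control as a check box; the rest names its purpose -->
  <input type="checkbox" name="chkMember" id="chkMember" />
  <label for="chkMember">Member?</label>

  <input type="checkbox" name="chkAdmin" id="chkAdmin" />
  <label for="chkAdmin">Admin?</label>

  <input type="checkbox" name="chkOwner" id="chkOwner" />
  <label for="chkOwner">Owner?</label>
</form>
```

Each control uses the same value for name and id, so the form data and any client-side script refer to it by one identifier.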
Indent Consistently
Use indentation consistently to enhance the readability of the
code. When elements carry over more than one line of code, indent the
contents of elements between the start tag and the end tag. This will
make it easy to see where the element begins and ends. Also, use
indentation to align code at attribute names (see Listing 3).
It is a good idea to use no more than two to four spaces for each
level in indentation, so as not to use up all the available line length
in indentation. If possible, set up the development tool to convert
tabs to spaces so that the indentation will be the same when the source
is viewed in different editors or as printed output.
Listing 3: Indent Code Consistently
To log into the system, enter your user name and password in the text boxes. Then click the "Login" button.
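One possible rendering of these indentation rules is sketched below. The table layout and the txtUserName control are assumptions for illustration; only the instruction text comes from the listing. Each nesting level is indented two spaces, and attributes that carry over to new lines are aligned at the attribute names.

```html
<table>
  <tr>
    <td>
      To log into the system, enter your user name and password
      in the text boxes. Then click the "Login" button.
    </td>
  </tr>
  <tr>
    <td>
      <input type="text"
             name="txtUserName"
             id="txtUserName" />
    </td>
  </tr>
</table>
```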
Limit Line Length
Break up lines when they run too long. It is much easier to read
and understand code when you can see the entire line at once. When lines
of code are so long that the reader must scroll right and left to read
them, it requires much more cognitive effort to understand what the code
is doing. Alternatively, in some applications, long lines might wrap to
the next line at the nearest word break. In either case, source code is
much easier to read and understand if the developer takes explicit
control of line length.
HTML is not sensitive to line breaks, so the developer can break
lines at will between keywords for readability. For example, Listing 4
illustrates a code snippet in which two elements have word-wrapped to
the next line because they were too long for the editor window.
Listing 4: HTML Source Code with Uncontrolled Line Breaks
<input type="text" name="txtName" language=
"JavaScript" onclick="return NameValid();">
<input type="text" name="txtAddress"
language="JavaScript" onclick="return AddrValid();">
Compare this with Listing 5, where the developer took explicit
control of line length. Here the code is much easier to read because the
developer used line breaks and indenting to visually organize the
source code.
Listing 5: HTML Source Code with Explicit Line Breaks
<input type="text"
       name="txtName"
       language="JavaScript"
       onclick="return NameValid();">
<input type="text"
       name="txtAddress"
       language="JavaScript"
       onclick="return AddrValid();">
Keep the limitations of printed output in mind as well. Lines
longer than 80 characters will often wrap in printed output without
consideration for word breaks, making source code very difficult to
read.
Standardize Character Case
Source code is easier to read if the developer has applied a
consistent set of rules for the use of character case—for example, the
use of lower case exclusively for HTML tags. When scanning source code,
the reader can unconsciously apply a visual filter, focusing attention
on the HTML keywords.
The approach taken in code that appears in this article is to use
all lowercase letters for HTML tags and the names of their attributes,
while using mixed case and a modified form of Hungarian Notation for
some attribute values (see the sidebar entitled "Hungarian Notation").
Hungarian Notation
Hungarian Notation is a convention for naming identifiers that
adds a prefix to the name to provide information about the type and
scope of the identifier. Dr. Charles Simonyi, a Microsoft Chief
Architect at the time, introduced Hungarian Notation in the early
1980s. Long an internal Microsoft standard, the convention has variants
that have been widely adopted outside of Microsoft as well.
As an example of a simplified Hungarian Notation scheme, variables that contain a string could be prefixed with the character s, and a variable with global scope could be indicated with a g prefix. In this case, then, the variables sTemp and gsName in source code would be immediately identifiable as string variables with local and global scope, respectively.
In general, HTML is not a typed language, and Hungarian Notation
plays a more important role in other types of Web development. However,
in some cases it can add to readability. For example, the names or IDs
of form elements are likely candidates for a modified form of Hungarian
Notation. The prefix "btn" or "cmd" might be used for an input button.
Text boxes might be prefixed with "txt," and check boxes might be
prefixed with "chk" or "cb."
Use Comments Judiciously
Good comments can be invaluable for understanding and maintaining
code. However, the unique nature of HTML introduces a trade-off between
the value of thorough comments and the efficiency of the Web
application.
The Web server reads in the HTML code and sends it as a stream of
text over the network to the browser. Only after arriving at the client
does the browser parse and interpret the HTML code, displaying the
visible elements and ignoring the comments. The obvious implication is
that the comments add nothing to the document as the browser displays
it, yet they add to the processing overhead on both the server and
client computers, and they increase the amount of data transferred. With
almost 50 percent comments, Listing 6 illustrates what is probably
excessively commented code.
Listing 6: Heavily Commented HTML Code
The trick is to find an appropriate level of commenting that
balances these two issues. It is a good idea to comment the major
logical flow and document sections to help readers quickly gain an
overview of the code. Also comment dependencies and assumptions.
Consistently following the other design and coding guidelines as
suggested in this article—especially the ones related to naming and
metadata—will help create self-documenting code.
Listing 7 illustrates how fewer comment lines and more
descriptive element names can combine to provide effective documentation
with a lot less overhead.
Listing 7: Lightly Commented HTML Code
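As a minimal sketch of the lightly commented style, the fragment below relies on descriptive names and keeps comments for the section overview and one dependency. The form, its action URL, the element names, and the LoginValid() function are all hypothetical.

```html
<!-- Login form: collects credentials and validates them before submit -->
<form name="frmLogin" id="frmLogin" action="/login" method="post">
  <input type="text" name="txtUserName" id="txtUserName" />
  <input type="password" name="txtPassword" id="txtPassword" />
  <!-- Depends on a script that defines LoginValid() being loaded earlier -->
  <input type="submit" name="btnLogin" id="btnLogin" value="Login"
         onclick="return LoginValid();" />
</form>
```

Names such as txtUserName and btnLogin already say what each element is, so only the overall purpose and the script dependency need a comment.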
Checklist
Use Well-formed HTML
- Avoid style attributes in HTML.
- Give every nonempty element a corresponding closing tag.
- Use lowercase element and attribute names.
- Nest elements properly.
- Quote all attribute values, even numeric ones.
Pick Good Names and ID Values
- Use a consistent scheme for assigning the values of name and ID properties.
- Keep each ID unique within the document.
Indent Consistently
- Use indentation consistently to enhance the readability of the code.
Standardize Character Case
- Use a prefix in the style of Hungarian Notation to indicate an identifier's type and scope, e.g. txt for text boxes.
Use Comments Judiciously