HTML; Curious?

The Hypertext Index
An Absolute Beginner's Guide to Hypertext's Markup Language

In the beginning God, er, scientists spawned the interweb.

[Image: the first web server]

src: Wikimedia, GNU Free Documentation License

The sciencefolk (after much deliberation) decreed that the default page for a site would be an index. So what is an index?

An index, traditionally, is a table of contents; where the items refer to their destination (eg. with page numbers).

Books are printed in plain text, whilst websites are served in hypertext. Hypertext replicates printing press typography with italic and bold; but extends it with features not possible in print [warning: pleasant music autoplays].

If you've ever made your own site, you likely did it in one of two ways;

  1. You used plain old text, and a language of tags with special meanings (eg. <p>paragraph!</p>) and marked up your text with hyper powers; or,
  2. You used a WYSIWYG (what you see is what you get) editor, where you can type and edit the actual hypertext!

The languages of special tags are called markup languages. If you've used one, it's likely it was the official markup language for hypertext — HTML.

There are, however, many markup languages. For example, web forum frequenters may recognise [b]BBCode[/b], **Textile** or __Markdown__, each usually providing a simpler subset of HTML.
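
To make the "simpler subset" idea concrete, here is a minimal sketch in Python of how forum software might turn a little BBCode into HTML. The name bbcode_to_html is made up for this example and it only understands [b] and [i]; real forum software handles many more tags and edge cases.

```python
import html
import re

def bbcode_to_html(text):
    """Convert [b]...[/b] and [i]...[/i] to HTML (hypothetical helper, two tags only)."""
    # Escape any raw HTML first so only the tags we generate end up in the output.
    text = html.escape(text)
    for tag in ("b", "i"):
        pattern = rf"\[{tag}\](.*?)\[/{tag}\]"
        text = re.sub(pattern, rf"<{tag}>\1</{tag}>", text, flags=re.DOTALL)
    return text

print(bbcode_to_html("[b]BBCode[/b] is [i]simple[/i]"))
# prints: <b>BBCode</b> is <i>simple</i>
```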

Fancy WYSIWYG editors usually convert hypertext to plaintext HTML for storage. But they could use any language - you just don't have to worry about which one!

Where was I? Oh yeah, indexes. Indexes refer to pages. And in HTML when you want to refer to another page, you create what is called a hyperlink.

Hyperlinks are typically underlined and blue (purple once visited), but are often restyled to fit the theme of the site (care must be taken not to confuse visitors).

So, here's some really simple HTML.

 <a>I like turtles</a>

It's so simple, in fact, that it does nothing. "a" means anchor. An anchor not just to the page, but to that very spot. The <a>content</a> describes the destination in the user's native language.

It likely does nothing - it has no functionality - but it could be styled, or it could be used later (eg. with JavaScript). But let's learn something more useful.

 <a id="bees">Keeping Bees</a>

An anchor, with its very own "id". Now this has a function! By the power of Grayskull we can now reference the anchor directly! But how, you may ask? We just say our magic words.

 I like to <a href="#bees">keep bees</a>

And voila! The text "keep bees" (itself an anchor, but one which can't be referenced) is now a hyperlink to the previous anchored text "Keeping Bees" via its id with #bees.

You're a wizard Harry.

Waaait a minute. What's this href business? Well, the latter "ref" part is for reference. Can you guess what "h" stands for? I'll give you three guesses. And it's not Harry. Or Hermione (or Hagrid for that matter).

It's h for hypertext. It's a hypertext reference.

Hypertext references, or "hyperlinks", come in various forms.

href="#bees" 

References starting with "#" are said to be "in page"; i.e. the referring anchor must be on the same page as the anchor referred to. They are similar in function to an asterisk in printed text; the id need only be unique to the page.
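
If you'd rather have a machine check this for you, here is a rough sketch in Python using the standard library's html.parser. It collects every id on a page and every href starting with "#", then reports ids used more than once and in-page links that point nowhere. The names AnchorCollector and check_in_page_links, and the sample page with its "#wasps" link, are invented for this example.

```python
from html.parser import HTMLParser

class AnchorCollector(HTMLParser):
    """Collect every id and every in-page (#...) href found on one page."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self.fragment_links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.append(attrs["id"])
        href = attrs.get("href")
        if tag == "a" and href and href.startswith("#"):
            # Store the target without the leading "#".
            self.fragment_links.append(href[1:])

def check_in_page_links(page_html):
    """Return (ids used more than once, in-page links with no matching id)."""
    collector = AnchorCollector()
    collector.feed(page_html)
    duplicates = sorted({i for i in collector.ids if collector.ids.count(i) > 1})
    broken = [f for f in collector.fragment_links if f not in collector.ids]
    return duplicates, broken

PAGE = (
    '<a id="bees">Keeping Bees</a>'
    '<p>I like to <a href="#bees">keep bees</a>, not <a href="#wasps">wasps</a>.</p>'
)
print(check_in_page_links(PAGE))
# prints: ([], ['wasps'])
```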

What if your site is more than one page? Ohgnoes! It'll clog the tubes! Have no fear, scientists are here.

href="/bees"

href="/bees#honey"

href="/dogs"

I present to you relative referencing! Relative to where? Relative to the "root" of your site (your "/" page). For example, http://google.com/ is a root page, http://google.com/anything-else is not! If a hyperlink doesn't have an anchor (it has no "#" part) it refers to the page as a whole, starting where the page begins.
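
As a small sketch of what "relative to the root" means in practice, Python's urllib.parse.urljoin resolves a reference the way a browser would. The page URLs below are made up for illustration; the point is that a reference starting with "/" lands in the same place no matter which page of the site it sits on.

```python
from urllib.parse import urldefrag, urljoin

# Hypothetical pages on one site that all contain the same link.
SITE_PAGES = [
    "http://example.com/",
    "http://example.com/frogs",
    "http://example.com/dogs/big",
]

def resolve(page_url, href):
    """Return the full URL that href points to when found on page_url."""
    return urljoin(page_url, href)

for page in SITE_PAGES:
    # Every line ends in http://example.com/bees#honey
    print(page, "->", resolve(page, "/bees#honey"))

# Split the result into the page and the anchor (the "#" part) within it.
target_page, anchor = urldefrag(resolve("http://example.com/frogs", "/bees#honey"))
print(target_page, anchor)
# prints: http://example.com/bees honey
```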

The opposite of relative? Absolute! For those times when the page referenced is on a completely different site.

http://example.com/over-there/be/bees#honey
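
To keep the different forms straight, here is a rough classifier in Python. The function name classify_href and its four labels are my own naming for this sketch, not official terms, and it only looks at the shape of the reference.

```python
from urllib.parse import urlsplit

def classify_href(href):
    """Label a hypertext reference by its shape (labels are informal)."""
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        # Names a scheme or host, so it spells out exactly which site it means.
        return "absolute"
    if href.startswith("#"):
        return "in-page"
    if href.startswith("/"):
        return "root-relative"
    # Anything else is resolved against the page the link sits on.
    return "page-relative"

for href in ("#bees", "/bees#honey", "bees#honey",
             "http://example.com/over-there/be/bees#honey"):
    print(href, "->", classify_href(href))
```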

Done! Now you can make an index (table of contents) for your site and completely understand some of it.

Consider this a "first principles" type approach to learning HTML. Feedback is welcome and encouraged.

- Glenn

WARNING : The Following Contains Opinion

re: "Best Practices" for relative and absolute hyperlinks

This part can get confusing, so don't worry about it until you start actually making sites yourself, with lots of pages, across multiple sites.

Relative hyperlinks can get stupid, eg. "link/to/../parent/././where/am/i/../know/virus.exe". Feel free to learn more, but I encourage as little use of these variants on websites as possible. It's also true that links need not be relative to the root of the site: "bees#honey" is a valid reference on its own. The problem with this type of link is that it's relative to the page the link is found on. If that link is on your home page "/" it'll link to "/bees#honey", probably what you expect. But if you put that link on the "/frogs/" page? It'll link to "/frogs/bees#honey" - chances are not what you wanted. (Without the trailing slash, on "/frogs", it resolves to "/bees#honey", which is exactly the kind of subtlety that makes these links fragile.) Reference from root wherever possible is my advice, and if you need a more complicated setup - get your software to do it for you.
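
Here is a small sketch in Python of how a page-relative reference shifts with the page it sits on, again using urllib.parse.urljoin. The page URLs are invented for illustration. Note that a trailing slash on the page URL changes the result.

```python
from urllib.parse import urljoin

HREF = "bees#honey"

# The same reference placed on three hypothetical pages.
for page in ("http://example.com/",
             "http://example.com/frogs",
             "http://example.com/frogs/"):
    print(page, "->", urljoin(page, HREF))
# prints:
# http://example.com/ -> http://example.com/bees#honey
# http://example.com/frogs -> http://example.com/bees#honey
# http://example.com/frogs/ -> http://example.com/frogs/bees#honey

# A path full of "." and ".." segments collapses to something much shorter.
messy = "link/to/../parent/././where/am/i/../know/virus.exe"
print(urljoin("http://example.com/", messy))
# prints: http://example.com/link/parent/where/am/know/virus.exe
```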

It's similar with absolute links: you can use them everywhere on your own site, even for in-page anchors. But what if you ever want to move your site from "myawesomesite.com" to "themostawesomesiteever.com"? Do you feel like going through and fixing all your links? If you have the need to prepend all of your links with your domain (say you want to export it) - again - get software to do it!
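
As a hedged sketch of what "get software to do it" could look like, the two Python helpers below rewrite a link's domain and prepend a domain to a root-relative link. The names move_site and to_absolute are hypothetical, and a real site would run something like this over every link in every page.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

def move_site(href, old_host, new_host):
    """Point an absolute link at new_host if it currently points at old_host."""
    parts = urlsplit(href)
    if parts.netloc == old_host:
        parts = parts._replace(netloc=new_host)
    # Relative links and links to other sites come back unchanged.
    return urlunsplit(parts)

def to_absolute(href, site_root):
    """Prepend the site's domain to a relative link (e.g. when exporting)."""
    return urljoin(site_root, href)

OLD, NEW = "myawesomesite.com", "themostawesomesiteever.com"
print(move_site("http://myawesomesite.com/bees#honey", OLD, NEW))
# prints: http://themostawesomesiteever.com/bees#honey
print(move_site("/bees#honey", OLD, NEW))
# prints: /bees#honey
print(to_absolute("/bees#honey", "http://themostawesomesiteever.com/"))
# prints: http://themostawesomesiteever.com/bees#honey
```

Notice the root-relative link needed no fixing at all when the site moved.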

Google is your friend. May the search be with you. Always.