Internet Tips and Tricks

Internet Addresses and URLs Explained

This article describes how Internet addresses (URLs) are constructed and regulated, and how they relate to numerical Internet Protocol (IP) addresses.

What are Internet Protocol (IP) Addresses?

Everyone is acquainted with the usual type of address or URL (Uniform Resource Locator) used for Web sites. Things like “www.microsoft.com” have become as familiar to us as street addresses. The word “dotcom” has even become part of the language (and not just in English). However, the common form of address that we use, which contains letters and recognizable names, is for the convenience of humans only and is not actually the kind of Web address used by computers.

Human memory being what it is, some form of mnemonic aid is necessary for us, but computers need no such help and use numbers called IP addresses. Ultimately, computers work with binary representations but there are several equivalent formats for IP (Internet Protocol) addresses. One that is often used is the so-called "dotted-decimal" form (also known as "dotted quad"). A dotted-decimal IP address has 4 numeric segments, each separated by a period. Each number must be in the range 0 to 255 (eight bits), so the whole address is 32 bits long. In this representation, Microsoft's real Web address is 64.4.11.37. (At least it was when I wrote this. Try it in your browser.) There are also other formats that you may encounter. For example, spammers may try to obfuscate an address with octal or hex formats. A good discussion of the multitude of ways to make addresses obscure is at http://www.pc-help.org/obscure.htm. If you encounter a long address with a number of % signs, you have likely run into one of these other formats.
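The equivalence between these formats can be checked with a few lines of Python. What follows is only a sketch using the standard ipaddress and urllib.parse modules; the variable names and the sample string "%77%77%77" are made up for illustration, and the address is the one quoted above.

```python
import ipaddress
from urllib.parse import unquote

# The dotted-decimal address quoted in the text above.
addr = ipaddress.IPv4Address("64.4.11.37")

# Each of the four segments is one byte, so the whole address is a 32-bit number.
as_int = int(addr)
as_hex = f"0x{as_int:08X}"

# The same four segments written in octal, each with a leading zero.
octets = [int(part) for part in str(addr).split(".")]
as_octal = ".".join(f"0{octet:o}" for octet in octets)

print(as_int)
print(as_hex)
print(as_octal)

# Percent signs are a separate trick: each %XX stands for one character
# written as a two-digit hexadecimal code. The sample string is illustrative.
decoded = unquote("%77%77%77")
print(decoded)
```

All of these are the same 32 bits written differently, which is why an obscured address can look nothing like the dotted-decimal form.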

The new IPv6 addresses

The address format using 32-bit representation discussed above has run out of new addresses to allocate. This system (designated IPv4) will be gradually replaced by a new 128-bit system known as IPv6. The number of addresses available under the new system is far larger (2^128, or about 3.4 x 10^38) and is necessary to accommodate the huge increase in connected devices that is occurring.
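To get a feeling for the difference in size, the two address spaces can be computed directly. This is a small arithmetic sketch; the constant and variable names are chosen here for illustration.

```python
IPV4_BITS = 32
IPV6_BITS = 128

# Every extra bit doubles the number of possible addresses.
ipv4_total = 2 ** IPV4_BITS
ipv6_total = 2 ** IPV6_BITS

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:,}")
print(f"IPv6 has 2^{IPV6_BITS - IPV4_BITS} times as many addresses as IPv4")
```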

Syntax of IPv6 addresses

The 128-bit IPv6 address consists of eight 16-bit blocks. Hexadecimal notation is used, so each 16-bit block becomes a hexadecimal number of up to four digits (leading zeros within a block may be omitted). The blocks are separated by colons. An example address is 2001:0:4137:9e76:8f0:202c:b9c0:2f05.
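The block structure is easier to see when every block is written out in full. The sketch below uses Python's standard ipaddress module on the example address given above; the variable names are illustrative.

```python
import ipaddress

# The example address from the text, with leading zeros omitted in some blocks.
addr = ipaddress.IPv6Address("2001:0:4137:9e76:8f0:202c:b9c0:2f05")

# "exploded" pads every block to four hexadecimal digits.
full_form = addr.exploded
blocks = full_form.split(":")

# Each block is a 16-bit number, so its value lies between 0 and 65535.
block_values = [int(block, 16) for block in blocks]

print(full_form)
print(len(blocks), "blocks")
print(block_values)
```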

Coordinating Internet Addresses

Obviously, there must be some consistent system to the form that both IPs and URLs have. Also, every computer on the Internet has to have a unique address assigned to it. Keeping order and coordinating addresses on the Internet was originally done under the auspices of the US Federal government through organizations like the Internet Assigned Numbers Authority (IANA) and the Internet Network Information Center (InterNIC). The coordination has been privatized and is now administered by the Internet Corporation for Assigned Names and Numbers (ICANN).

The Structure of an URL

The formats that are used for URLs come from standards proposed and agreed to in the beginnings of the Internet. Those with a historical and technical bent can read the paper where Tim Berners-Lee and others set forth the form for URLs.

In schematic form, the prescription for an URL is:

<protocol:>//<user>:<password>@<host>:<port>/<url-path>

In this expression, the brackets indicate particular individual components and are not part of the actual URL. Let us consider each of these parts in turn. “Protocol” designates how the information is transmitted and retrieved. A colon and two forward slashes always follow the designation of the protocol. The colon is an integral part of the protocol name and denotes a device or process (similar to the colon in drive designations). The two slashes help label what we are seeking as a source that is not a local file. There are 10 different protocols mentioned in the document by Berners-Lee cited above as well as others, but the typical PC user will almost always be dealing with the familiar Hypertext Transfer Protocol (http or, in its more secure form, https). This is the protocol used to deliver pages containing hyperlinks, graphics, etc. coded in various versions of hypertext markup language (HTML) that our browsers know how to download and resolve into pages to display on our computers. This protocol is so overwhelmingly used that the http:// part of URLs is often omitted and need not be entered into browsers anymore. The only other protocol that the typical user may encounter is File Transfer Protocol (ftp), whose name is self-explanatory.

The “user:password@” section is rarely encountered by the average user but allows you to enter your user name and password for sites where that is required. It can also be used to obscure an URL. For example, http://myname:mypassword@www.microsoft.com works the same as http://www.microsoft.com. Any fake name and password you want can be put in. Try it.

We almost always encounter the “host:port” segment without the port specification. The average user need never think about “ports” in this context and discussing exactly what they are would lead us too far afield. Suffice it to say that certain ports are assigned by default to each protocol when none is specifically designated. For example, HTTP is assigned port number 80. More information on ports is on another page. The “host” is the computer where the information that we want resides. As is the case for the example www.microsoft.com, there are often three parts to the host name separated by periods or “dots”. The first part refers to a name for a specific computer. It is often called “www” for World Wide Web but other names are common. Also, the practice is growing to omit this part entirely. The second part (“microsoft”) is the local network where the computer resides, and the third (“com”) is one of the so-called “generic top-level domains.” Top-level domains are discussed in the next section.
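The parts of the schematic URL described so far can be pulled apart with Python's standard urllib.parse module. This is a minimal sketch, not a full URL validator: the function name describe_url and the small DEFAULT_PORTS table are assumptions made for this illustration, and the table lists only three well-known default ports.

```python
from urllib.parse import urlsplit

# Illustrative table of default ports, used when the URL names none.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url):
    """Split a URL into the components named in the schematic form."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    host = parts.hostname
    return {
        "protocol": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": host,
        # The dot-separated pieces of the host name, e.g. www / microsoft / com.
        "host_labels": host.split(".") if host else [],
        "port": port,
        "url_path": parts.path,
    }


info = describe_url("http://myname:mypassword@www.microsoft.com")
for name, value in info.items():
    print(f"{name}: {value}")
```

Because the example URL gives no port, the sketch falls back on the default for http, which is 80.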

Top-level domains

For some time, these domains were limited to com, edu, org, net, mil, gov, and int. The com, net, and org domains are available to persons and businesses worldwide (after proper registration). Originally, they were supposed to be descriptive with com, net, and org meaning “commercial”, “network”, and “(non-profit) organization”, respectively. However, these meanings have become blurred. Anyone can apply for a domain containing org, for example. The edu (education) domain is for properly registered four-year institutions of higher learning, i.e. colleges and universities. The mil domain is reserved exclusively for the United States Military and gov is reserved exclusively for the US Government. Certain organizations established by international treaty use int (for example, the UN). In recent years ICANN has added or proposed many new domains. Check at http://www.icann.org/tlds/ for more discussion of top-level domains. Two of the more common of the new domains are .biz and .info.

In addition to the generic top-level domains, there are domains assigned to a specific country (some are very small countries) to use as they may see fit. France is fr, Germany is de, and so forth. A discussion of country domains is at Wikipedia. (Incidentally, the little Pacific island nation of Tuvalu decided to cash in on its domain tv and sold the rights to it for $50 million.) There is also a domain .eu for Europe.

URL-path

Finally we come to the last part of an Internet address, the part called “/url-path”. This is the path on the host computer to the particular page or file that we wish to download. This part will often have the name of a directory, then some sub-directories and then perhaps an html file or one of the server generated file-types such as asp. The naming will be similar to paths on your own computer with the big exception that forward slashes are used instead of back slashes. For example in the URL http://www.microsoft.com/windows98/usingwindows/maintaining/default.asp “default.asp” is a page located in the sub-directory “maintaining” which is in the higher sub-directory “usingwindows” located in the directory “windows98” stored on the computer called “www” at Microsoft.
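The directory structure just described can be read off the example URL programmatically. The following is only a sketch using the standard urllib.parse and pathlib modules; the variable names are illustrative.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "http://www.microsoft.com/windows98/usingwindows/maintaining/default.asp"

# URL paths use forward slashes, so a POSIX-style path object fits them.
path = PurePosixPath(urlsplit(url).path)

# Everything before the file name, outermost directory first (skip the leading "/").
directories = list(path.parent.parts[1:])
file_name = path.name
file_type = path.suffix

print("directories:", " > ".join(directories))
print("file:", file_name)
print("type:", file_type)
```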

Name or DNS Servers

When your browser sends off a request to your Internet service provider (ISP) asking to connect you to www.microsoft.com, the computer at Verizon or Comcast or wherever has to go look up what those letters mean. The ISP will make use of a special computer (or probably computers) called a "name server" or sometimes a "DNS server." DNS stands for "Domain Name System". These computers have databases that allow them to translate the human-friendly form into something that computers understand. These translation processes take a certain amount of time and if the name servers used by your ISP happen to be busy, the delay may be noticeable. If the DNS computer can't find the address you send it, then you will get an error message. An error message could also mean that the computer at the site you are addressing is busy, that the ISP server is busy, or come from a variety of other causes. If you are reasonably sure you have entered a correct URL, try again.
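You can ask for the same translation yourself. The sketch below uses Python's standard socket module, which hands the name to whatever name servers your system is configured to use; the function name resolve is made up for this illustration, and returning None on failure is simply one way to handle the error case described above.

```python
import socket


def resolve(hostname):
    """Return an IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # Covers the "name not found" error as well as network problems.
        return None


# "localhost" normally resolves without contacting any outside name server.
print(resolve("localhost"))
```

Calling resolve("www.microsoft.com") while connected to the Internet should return a dotted-decimal address, although the exact number depends on when and where you ask.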

Finding the best DNS servers

If you are not happy with the DNS servers provided by your ISP, there are a number of free alternative services from respected providers like Google. Finding free alternate DNS services is described in this article.

Find Your Computer's Internet Address (IP)

Note that your own computer must have an IP address when it is connected to the Internet. ISPs are assigned a block of these addresses and they temporarily provide your computer with one of them each time you sign on. Thus the address may vary each time you go online (but always from the ISP's assigned block). However, the address often stays the same for some days. If you are curious about which address you have when you are on the Internet, you can make use of the command-line accessory ipconfig. Its use is described here. Note that, if you are on a local network, your computer will have a local IP address. The Internet IP address for the network as a whole will be different and is often called the gateway address.
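Whether an address is a local one or an Internet one can usually be told from the number itself, because certain ranges are reserved for local networks. Here is a rough sketch using Python's standard ipaddress module; the function name address_kind and the sample address 192.168.1.10 are assumptions chosen for illustration, and the three labels are a simplification.

```python
import ipaddress


def address_kind(text):
    """Classify an IP address as 'loopback', 'local', or 'public'."""
    addr = ipaddress.ip_address(text)
    if addr.is_loopback:
        return "loopback"  # the computer talking to itself
    if addr.is_private:
        return "local"     # reserved for use inside local networks
    return "public"        # can appear on the Internet at large


# 192.168.1.10 is a typical home-network address (illustrative);
# 64.4.11.37 is the address quoted earlier in this article.
for sample in ("192.168.1.10", "127.0.0.1", "64.4.11.37"):
    print(sample, "is", address_kind(sample))
```

If the address reported on your own machine comes out as "local", the address that Web sites see for you will be a different, public one.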

Your browser broadcasts your IP to any Web site that you go to. (It has to. How else could you get information back?) There are many sites that will tell you what IP you have. An easy one to remember is http://whatismyip.com. Another site is http://www.ip-adress.com/.