Web - URL
About
A Uniform Resource Locator (URL) is a universal identifier for a resource.
Because the resource can be created dynamically, a URL is also logically a request.
It's the string that is understood by a browser when you put it in the address bar.
When the HTTP protocol is used as the scheme, it's an identifier for a Web resource.
A URL was originally created to provide a method for finding an item, in the same way that a street address is used to find a person.
On a format level, a URL is a subset of a URI.
Mr. Berners-Lee, the creator of the Web’s bedrock software standards, has said that, given the chance to start over, he would get rid of the double slash “//” after the “http:” in Web addresses. The double slash, though a programming convention at the time, turned out not to be really necessary, Mr. Berners-Lee explained.
Syntax
scheme://[user-info@]host[:port]/path?query#fragment
where:
- scheme defines the access protocol, which can be, for example:
  - http or https for the HTTP protocol (an HTTP request is made)
  - file for the local file system (a call is made to the local OS)
- user-info is optional and defines the credentials (not used with the file scheme)
- host[:port] defines the endpoint (not used with the file scheme)
- path is the path to the resource
- query is the query string
- fragment is the fragment that identifies a part of the resource.
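The components listed above can be seen by splitting a URL programmatically. The following is a minimal sketch using `urlsplit` from Python's standard `urllib.parse` module; the URL, including its credentials and port, is a made-up example and not taken from this page.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only so that every component of the syntax is present.
url = "https://user:secret@example.com:8443/docs/page.html?lang=en&sort=asc#section-2"

parts = urlsplit(url)

# Map the names used in the syntax description to the parsed values.
components = {
    "scheme": parts.scheme,
    "user-info": f"{parts.username}:{parts.password}",
    "host": parts.hostname,
    "port": parts.port,  # an int, or None when the URL has no port
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:10} {value}")
```

Note that the fragment is normally handled by the client only: a browser does not send it to the server as part of the HTTP request.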
Management
Rewrite
Encoding
Length
The lowest common denominator for the maximum URL length among popular web browsers is 2100 characters (Reference)
Shortener
A URL shortener is an application that creates a shorter URL.
Typically:
- the URL is added to a table with a numerical id
- optionally with a hash of the URL (to speed up the lookup by URL)
- the numerical id is used in the new, shortened version
http://do.com/id
The numerical id, expressed in the decimal system, is generally converted to a higher base (i.e. above 10, using letters as extra digits), which makes it shorter. The Hashids library is the best-known example.
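The steps above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of any particular library: it uses a plain base-62 conversion (digits, lowercase, uppercase), an in-memory dictionary in place of a database table, and the `http://do.com/` prefix from the example above. The names `encode_id`, `decode_id` and `Shortener` are hypothetical.

```python
import string

# 62 symbols: digits, then lowercase letters, then uppercase letters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_id(number):
    """Convert a non-negative decimal id to a base-62 string."""
    if number < 0:
        raise ValueError("id must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def decode_id(text):
    """Convert a base-62 string back to the decimal id."""
    number = 0
    for char in text:
        number = number * BASE + ALPHABET.index(char)
    return number


class Shortener:
    """In-memory stand-in for the URL table (assumption: no persistence)."""

    def __init__(self, prefix="http://do.com/"):
        self.prefix = prefix
        self.url_by_id = {}  # numerical id -> original URL
        self.id_by_url = {}  # original URL -> id (plays the role of the hash lookup)

    def shorten(self, url):
        # Reuse the existing id when the URL was already shortened.
        if url not in self.id_by_url:
            new_id = len(self.url_by_id) + 1
            self.url_by_id[new_id] = url
            self.id_by_url[url] = new_id
        return self.prefix + encode_id(self.id_by_url[url])

    def resolve(self, short_url):
        return self.url_by_id[decode_id(short_url[len(self.prefix):])]


shortener = Shortener()
short = shortener.shorten("https://example.com/a/very/long/path?x=1")
print(short)                     # first entry gets id 1
print(shortener.resolve(short))  # original URL
print(encode_id(1000000))        # 7 decimal digits become 4 characters
```

The gain grows with the size of the id: every base-62 character carries almost six bits, against a little over three for a decimal digit.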
How to create an identifier
You can create a practically unique, fixed-length identifier via hashing.
Note that versions 3 and 5 of the UUID specification define a URL namespace for deriving a UUID from a URL.
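Both approaches can be tried with Python's standard library. The sketch below assumes SHA-256 as the hash function, which is only one possible choice, and uses a made-up URL; `uuid.uuid3` and `uuid.uuid5` combine the predefined `uuid.NAMESPACE_URL` with the URL string.

```python
import hashlib
import uuid

# Hypothetical URL used only for illustration.
url = "https://example.com/docs/page.html"

# Fixed-length identifier: a SHA-256 hex digest always has 64 characters,
# whatever the length of the URL.
digest = hashlib.sha256(url.encode("utf-8")).hexdigest()

# Name-based UUIDs: version 3 hashes with MD5, version 5 with SHA-1.
# The same namespace and URL always give the same UUID.
uuid_v3 = uuid.uuid3(uuid.NAMESPACE_URL, url)
uuid_v5 = uuid.uuid5(uuid.NAMESPACE_URL, url)

print(digest)
print(uuid_v3)
print(uuid_v5)
```

Because the result is deterministic, the identifier can be recomputed from the URL at any time instead of being stored. Two URLs that differ only in letter case or a trailing slash produce different identifiers, so it is usually worth normalizing the URL first.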
How to get the URL of the current HTML page with JavaScript
See Browser URL
Canonical
Documentation / Reference
- The W3C URL specification defines the term URL, various algorithms for dealing with URLs, and an API for constructing, parsing, and resolving URLs.