Web - URL
About
A Uniform Resource Locator (URL) is a universal identifier for a resource.
Because the resource can be created dynamically, a URL is also logically a request.
It is the string that a browser understands when you enter it in the address bar.
When HTTP is used as the scheme, the URL identifies a Web resource.
A URL was originally designed as a way to locate an item, much as a street address locates a person.
In terms of format, a URL is a subset of a URI.
Mr. Berners-Lee, the creator of the Web’s bedrock software standards, would get rid of the double slash “//” after the “http:” in Web addresses. The double slash, though a programming convention at the time, turned out to not be really necessary, Mr. Berners-Lee explained.
Syntax
scheme://[user-info@]host[:port]/path?query#fragment
where:
- scheme defines the access protocol, for example:
  - http or https: the HTTP protocol (an HTTP request is made)
  - file: the local file system (a call is made to the local OS)
- user-info is optional and defines the credentials (not used with the file scheme)
- host[:port] defines the endpoint (not used with the file scheme)
- path is the path to the resource
- query is the query string
- fragment identifies a part of the resource
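To make the syntax concrete, here is a minimal sketch that splits a URL into the components listed above using `urlsplit` from Python's standard library. The helper name `split_url` and the example URL are illustrative assumptions, not part of any specification. Note that `username` covers only the user name; a password in the user-info part, if present, is exposed separately by the parser.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Return the syntax components of a URL as a plain dictionary.

    `split_url` is a hypothetical helper written for this illustration.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,   # None when there is no user-info part
        "host": parts.hostname,   # None when there is no host (e.g. file:///...)
        "port": parts.port,       # None when no explicit port is given
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    # Hypothetical URL chosen only to exercise every component.
    example = "https://alice@example.com:8443/docs/page.html?lang=en#syntax"
    for name, value in split_url(example).items():
        print(f"{name}: {value}")
```

With a `file` URL such as `file:///etc/hosts`, the same helper returns no user, host, or port, which matches the remark that these parts are not used with the file scheme.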
Management
Rewrite
Encoding
Length
The lowest common denominator for the maximum URL length among popular web browsers is about 2100 characters.
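As a rough illustration, a generated URL can be checked against that figure before it is published. This is only a sketch: the constant reuses the number quoted above as an assumption rather than a value taken from any standard, and `is_portable_length` is a hypothetical helper name.

```python
# Assumption: the 2100 figure quoted in the text, counted in characters.
MAX_PORTABLE_URL_LENGTH = 2100


def is_portable_length(url: str) -> bool:
    """Return True when the URL is no longer than the assumed browser limit."""
    return len(url) <= MAX_PORTABLE_URL_LENGTH
```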
How to create an identifier
You can create a fixed-length, practically unique identifier for a URL by hashing it.
Note that versions 3 and 5 of the UUID specification define a URL namespace for deriving a hash-based identifier from a URL.
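Both approaches can be sketched with Python's standard library. The function names and example URLs below are illustrative assumptions. SHA-256 is just one possible hash choice, and `uuid.uuid5` with `uuid.NAMESPACE_URL` is the standard-library form of the version 5 URL namespace (version 3 works the same way through `uuid.uuid3`).

```python
import hashlib
import uuid


def url_sha256(url: str) -> str:
    """Return a 64-character hexadecimal SHA-256 digest of the URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def url_uuid5(url: str) -> uuid.UUID:
    """Return the version 5 (SHA-1 based) UUID of the URL in the URL namespace."""
    return uuid.uuid5(uuid.NAMESPACE_URL, url)


if __name__ == "__main__":
    # Hypothetical URLs of different lengths yield identifiers of the same length.
    for url in ("https://example.com/", "https://example.com/a/much/longer/path?x=1"):
        print(url_sha256(url), url_uuid5(url))
```

Both functions are deterministic, so the same URL string always yields the same identifier. They hash the string as given: two spellings of the same resource (for example with and without a trailing slash) produce different identifiers unless the URL is normalized first.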
How to get the URL of the current HTML page with JavaScript
See Browser URL
Canonical
Documentation / Reference
- The W3C URL specification defines the term URL, various algorithms for dealing with URLs, and an API for constructing, parsing, and resolving URLs.