Media types
An Internet media type is,
generally speaking, a property of a data set, describing
both the general type of data (such as "text" or "image" or "application";
the last one refers to program-specific internal data formats) and,
as a subtype, a specific format for the data.
The concept was originally defined as
"MIME content types".
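The type/subtype structure described above can be illustrated with a minimal sketch. The function name split_media_type is hypothetical, and the handling of trailing parameters (anything after a semicolon) is an assumption added for robustness, not something this section discusses.

```python
def split_media_type(media_type):
    """Split a media type such as "text/html" into (type, subtype)."""
    # Assumption: drop any trailing parameters such as "; charset=..." first.
    essence = media_type.split(";", 1)[0].strip()
    top_level, sep, subtype = essence.partition("/")
    if not sep or not top_level or not subtype:
        raise ValueError(f"not a valid media type: {media_type!r}")
    # Type and subtype names are case-insensitive, so normalize to lowercase.
    return top_level.lower(), subtype.lower()


if __name__ == "__main__":
    for example in ("text/html", "image/jpeg", "application/zip"):
        print(example, "->", split_media_type(example))
```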
Media types relate to HTML as follows:
- When a Web server sends an HTML document, it should
specify the correct media type (text/html) in the HTTP
headers it sends along with the document. Normally
servers are configured to do this by default when the file name
ends with .html or .htm
(depending on the system; please consult local documentation).
- In a FORM element, the value of the ENCTYPE
attribute specifies the media type to be used when
encoding and sending the content of the form.
- When referring to various resources, such as embedding images
using IMG elements or linking to binary files
using an A element, there is no way to
specify the media type in the HTML document itself;
it must be handled by the server.
Typically, a Web server uses a mapping table to map
file name extensions to media types (e.g., mapping extension
.zip to media type application/zip),
and it may provide users with tools for overriding such
mappings or otherwise specifying the media type to be
associated with a file or set of files.
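As a rough illustration of the extension-to-media-type mapping mentioned in the last item, here is a minimal sketch. The table contents, the fallback type, and the names EXTENSION_TO_MEDIA_TYPE and media_type_for are all hypothetical; real servers use their own configuration files and much longer tables.

```python
import os

# Hypothetical mapping table; a real server ships a much longer one.
EXTENSION_TO_MEDIA_TYPE = {
    ".html": "text/html",
    ".htm": "text/html",
    ".jpg": "image/jpeg",
    ".zip": "application/zip",
}

# Fallback for unknown extensions (an assumption of this sketch).
DEFAULT_MEDIA_TYPE = "application/octet-stream"


def media_type_for(filename, overrides=None):
    """Return the media type a server might send for the given file name."""
    extension = os.path.splitext(filename)[1].lower()
    # User-supplied overrides take precedence over the default table.
    if overrides and extension in overrides:
        return overrides[extension]
    return EXTENSION_TO_MEDIA_TYPE.get(extension, DEFAULT_MEDIA_TYPE)


if __name__ == "__main__":
    print(media_type_for("archive.zip"))
    print(media_type_for("notes.txt", overrides={".txt": "text/plain"}))
```

The overrides argument stands in for whatever mechanism a particular server offers for changing the mapping for a file or set of files.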
The description of the A element contains some additional notes
related to
audio and video and
binary files in general.
The HTML 3.2 Reference Specification refers to RFC 1521,
but that specification was superseded by RFC 2046
(in November 1996).
The procedure for registering types is given in RFC 2048;
according to it, the registry is kept at
ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
For less authoritative but more readably presented information,
see the document MIME Types by Chris Herborth.
In addition to standardized media types, there are nonstandard media types which
are in practice supported by popular servers and browsers.
Appendix B of Special Edition Using CGI lists many of them.
You can check what media type information a server sends
as follows:
Assuming we are interested in the media type of the document at
URL http://host/path, establish
a Telnet connection to host,
using the port number in the URL if present, port 80 otherwise.
Then give the command
HEAD /path HTTP/1.0
and then an empty line.
Example (where the Telnet connection is established by starting the
telnet program from the Unix command line):
beta ~ 51 % telnet www.hut.fi 80
Trying 130.233.224.28...
Connected to info-e.hut.fi.
Escape character is '^]'.
HEAD /home/jkorpela/perhe.jpg HTTP/1.0
HTTP/1.1 200 OK
Date: Tue, 23 Sep 1997 12:37:05 GMT
Server: Apache/1.2.4
Last-Modified: Tue, 08 Aug 1995 08:29:53 GMT
ETag: "16391-9232-30272081"
Content-Length: 37426
Accept-Ranges: bytes
Connection: close
Content-Type: image/jpeg
Connection closed by foreign host.
beta ~ 52 % exit
Here the Content-Type: field shows that the media type
is image/jpeg.
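The same check can be scripted instead of typed into Telnet. The following is a minimal sketch, not a complete HTTP client: the function names are hypothetical, it sends the bare HTTP/1.0 request shown above over a plain TCP connection, and it assumes the server closes the connection after responding.

```python
import socket


def build_head_request(path):
    """Build the HEAD command followed by the empty line that ends the request."""
    return f"HEAD {path} HTTP/1.0\r\n\r\n".encode("ascii")


def content_type_from_headers(response_text):
    """Return the value of the Content-Type field, or None if it is absent."""
    for line in response_text.splitlines():
        if not line.strip():
            break  # an empty line ends the header section
        name, sep, value = line.partition(":")
        # Header field names are case-insensitive.
        if sep and name.strip().lower() == "content-type":
            return value.strip()
    return None


def fetch_media_type(host, path, port=80, timeout=10.0):
    """Send the HEAD request to host and return the reported media type."""
    with socket.create_connection((host, port), timeout=timeout) as connection:
        connection.sendall(build_head_request(path))
        chunks = []
        while True:
            chunk = connection.recv(4096)
            if not chunk:
                break  # the server closed the connection
            chunks.append(chunk)
    return content_type_from_headers(b"".join(chunks).decode("iso-8859-1"))
```

A call such as fetch_media_type("www.hut.fi", "/home/jkorpela/perhe.jpg") would repeat the session above, provided the host and document still exist. Servers that host several sites on one address may additionally require a Host: header, which this sketch does not send.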