[TOC]

0x00 Preface

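What is an ISO character set?
A: An ISO character set is a standard character set defined by the International Organization for Standardization (ISO) for a particular alphabet or language.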

HTML character entities: reserved characters in HTML must be replaced with character entities.

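HTML 4.01 supports the ISO 8859-1 (Latin-1) character set.
The lower part of ISO-8859-1 (codes 0 to 127) is the original 7-bit ASCII.
The upper part of ISO-8859-1 (codes 160 to 255) consists of characters that all have entity names.
Most of these symbols can be used without an entity reference, but entity names and entity numbers give you a way to write symbols that are hard to type on a keyboard.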
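The statement that every code from 160 to 255 has an entity name can be checked against the HTML 4 entity table that ships with Python's standard library. This is a minimal sketch and not part of the original article; the helper name `latin1_codes_without_name` is made up for illustration.

```python
from html.entities import codepoint2name

def latin1_codes_without_name():
    """Return the codes in 160..255 that have no HTML 4 entity name."""
    return [code for code in range(160, 256) if code not in codepoint2name]

# An empty list means every code in the upper Latin-1 range has a name.
print(latin1_codes_without_name())
print(codepoint2name[160], codepoint2name[255])
```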

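In HTML, some characters are reserved. You cannot write the less-than sign (<) or the greater-than sign (>) directly in text, because the browser may mistake them for tags.
To display reserved characters correctly, we must use character entities in the HTML source.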
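When text is generated by a program, the replacement is usually done by a library rather than by hand. The sketch below uses Python's standard `html.escape`; the wrapper name `escape_reserved` is an assumption introduced here for illustration.

```python
import html

def escape_reserved(text: str) -> str:
    """Replace &, <, > and quotes with character entities."""
    return html.escape(text, quote=True)

print(escape_reserved("if a < b && b > c"))
```

Note that `&` must be escaped as well, since it starts every entity reference.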

Demo: a character entity looks like this:

&entity_name;  == &#entity_number;
<!-- To display a less-than sign, write &lt; or &#60; -->
<p>&lt; == &#60;</p>
<!-- Non-breaking space:
a commonly used character entity in HTML is the non-breaking space (&nbsp;). -->
<p>Here are several spaces&nbsp;&nbsp;end of spaces</p>
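To confirm that the name form and the number form refer to the same character, both can be decoded and compared. This is a small sketch using Python's standard `html.unescape`; `same_character` is a hypothetical helper, and the hexadecimal form `&#x3C;` is added here as a third spelling of the same code.

```python
import html

def same_character(*references: str) -> bool:
    """True if all entity references decode to the same text."""
    decoded = {html.unescape(ref) for ref in references}
    return len(decoded) == 1

print(same_character("&lt;", "&#60;", "&#x3C;"))
print(repr(html.unescape("&nbsp;")))
```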

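Browsers always collapse runs of spaces in an HTML page. If you write 10 spaces in the text, the browser removes 9 of them before displaying the page. To add more spaces to a page, use the &nbsp; character entity.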
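The collapsing behaviour can be approximated in a few lines. This is only a rough model of default rendering (real browsers also take the CSS `white-space` property into account), and `approximate_rendered_text` is a name invented for this sketch.

```python
import html
import re

# ASCII whitespace only; \s would also match the non-breaking space U+00A0.
ASCII_WHITESPACE = re.compile(r"[ \t\n\r\f]+")

def approximate_rendered_text(source: str) -> str:
    """Collapse whitespace runs to one space, then decode entities."""
    collapsed = ASCII_WHITESPACE.sub(" ", source)
    return html.unescape(collapsed)

print(repr(approximate_rendered_text("a          b")))
print(repr(approximate_rendered_text("a&nbsp;&nbsp;&nbsp;b")))
```

The ten ordinary spaces shrink to one, while the three non-breaking spaces survive because they are not part of the collapsible whitespace set.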

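Tip: the advantage of using an entity name instead of a number is that the name is easier to remember. The drawback is that a browser may not support every entity name (support for entity numbers is good). Entity names are case sensitive.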
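Case sensitivity matters in practice because two names that differ only in case can denote different characters. The sketch below converts a name to its numeric form with Python's standard `html.entities.name2codepoint` table (HTML 4 names only); `to_entity_number` is a hypothetical helper.

```python
from html.entities import name2codepoint

def to_entity_number(name: str) -> str:
    """Convert an entity name such as 'dagger' to its numeric form."""
    # Raises KeyError for names that are not in the HTML 4 table.
    return f"&#{name2codepoint[name]};"

print(to_entity_number("dagger"))  # single dagger
print(to_entity_number("Dagger"))  # double dagger, a different character
```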

0x01 Entity character sets

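Description: character sets supported by modern browsers

  • ASCII character set
  • Standard ISO character sets
  • Math symbols, Greek letters, and other symbols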

(1) ASCII character set
Each of the first 128 ASCII codes (0 to 127) maps one-to-one to an entity number, from &#0; to &#127;.
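The one-to-one mapping is simply the character's code wrapped in `&#` and `;`. A minimal sketch, with `ascii_entity_number` as an illustrative name that does not come from the article:

```python
import html

def ascii_entity_number(ch: str) -> str:
    """Return the numeric entity for a single 7-bit ASCII character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return f"&#{code};"

for ch in "<>&A":
    reference = ascii_entity_number(ch)
    # Decoding the reference gives back the original character.
    print(ch, reference, html.unescape(reference) == ch)
```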

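7-bit ASCII device control codes: the ASCII device control codes were originally designed to control hardware devices such as printers and tape drives. They have no effect in HTML documents.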

#Result	Description	Entity number
NUL null character &#00;
SOH start of header &#01;
STX start of text &#02;
ETX end of text &#03;
EOT end of transmission &#04;
ENQ enquiry &#05;
ACK acknowledge &#06;
BEL bell (ring) &#07;
BS backspace &#08;
HT horizontal tab &#09;
LF line feed &#10;
VT vertical tab &#11;
FF form feed &#12;
CR carriage return &#13;
SO shift out &#14;
SI shift in &#15;
DLE data link escape &#16;
DC1 device control 1 &#17;
DC2 device control 2 &#18;
DC3 device control 3 &#19;
DC4 device control 4 &#20;
NAK negative acknowledge &#21;
SYN synchronize &#22;
ETB end transmission block &#23;
CAN cancel &#24;
EM end of medium &#25;
SUB substitute &#26;
ESC escape &#27;
FS file separator &#28;
GS group separator &#29;
RS record separator &#30;
US unit separator &#31;
DEL delete (rubout) &#127;
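The codes in this table (0 through 31, plus 127) are exactly the 7-bit code points that Unicode classifies as control characters. The sketch below checks this with Python's standard `unicodedata` module; `is_control_code` is a name chosen here for illustration.

```python
import unicodedata

def is_control_code(code: int) -> bool:
    """True if the code point is a control character (Unicode category Cc)."""
    return unicodedata.category(chr(code)) == "Cc"

control_codes = [code for code in range(128) if is_control_code(code)]
print(len(control_codes), control_codes[0], control_codes[-1])
```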


(2) ISO 8859-1 symbol entities

#Result	Description	Entity name	Entity number
non-breaking space &nbsp; &#160;
¡ inverted exclamation mark &iexcl; &#161;
¢ cent &cent; &#162;
£ pound &pound; &#163;
¤ currency &curren; &#164;
¥ yen &yen; &#165;
¦ broken vertical bar &brvbar; &#166;
§ section &sect; &#167;
¨ spacing diaeresis &uml; &#168;
© copyright &copy; &#169;
ª feminine ordinal indicator &ordf; &#170;
« angle quotation mark (left) &laquo; &#171;
¬ negation &not; &#172;
soft hyphen &shy; &#173;
® registered trademark &reg; &#174;
¯ spacing macron &macr; &#175;
° degree &deg; &#176;
± plus-or-minus &plusmn; &#177;
² superscript 2 &sup2; &#178;
³ superscript 3 &sup3; &#179;
´ spacing acute &acute; &#180;
µ micro &micro; &#181;
¶ paragraph &para; &#182;
· middle dot &middot; &#183;
¸ spacing cedilla &cedil; &#184;
¹ superscript 1 &sup1; &#185;
º masculine ordinal indicator &ordm; &#186;
» angle quotation mark (right) &raquo; &#187;
¼ fraction 1/4 &frac14; &#188;
½ fraction 1/2 &frac12; &#189;
¾ fraction 3/4 &frac34; &#190;
¿ inverted question mark &iquest; &#191;
× multiplication &times; &#215;
÷ division &divide; &#247;
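Going the other way, from a symbol to its entity name, is a table lookup. This sketch relies on Python's standard `html.entities.codepoint2name` (HTML 4 names); the function `named_entity` and its numeric fallback are assumptions made for the example.

```python
from html.entities import codepoint2name

def named_entity(ch: str) -> str:
    """Return the named entity for a character, or a numeric one as a fallback."""
    code = ord(ch)
    name = codepoint2name.get(code)
    return f"&{name};" if name else f"&#{code};"

for symbol in "\u00a9\u00ae\u00d7\u00f7":  # copyright, registered, times, divide
    print(symbol, named_entity(symbol))
```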


(3) ISO 8859-1 character entities

Result	Description	Entity name	Entity number
À capital a, grave accent &Agrave; &#192;
Á capital a, acute accent &Aacute; &#193;
 capital a, circumflex accent &Acirc; &#194;
à capital a, tilde &Atilde; &#195;
Ä capital a, umlaut mark &Auml; &#196;
Å capital a, ring &Aring; &#197;
Æ capital ae &AElig; &#198;
Ç capital c, cedilla &Ccedil; &#199;
È capital e, grave accent &Egrave; &#200;
É capital e, acute accent &Eacute; &#201;
Ê capital e, circumflex accent &Ecirc; &#202;
Ë capital e, umlaut mark &Euml; &#203;
Ì capital i, grave accent &Igrave; &#204;
Í capital i, acute accent &Iacute; &#205;
Î capital i, circumflex accent &Icirc; &#206;
Ï capital i, umlaut mark &Iuml; &#207;
Ð capital eth, Icelandic &ETH; &#208;
Ñ capital n, tilde &Ntilde; &#209;
Ò capital o, grave accent &Ograve; &#210;
Ó capital o, acute accent &Oacute; &#211;
Ô capital o, circumflex accent &Ocirc; &#212;
Õ capital o, tilde &Otilde; &#213;
Ö capital o, umlaut mark &Ouml; &#214;
Ø capital o, slash &Oslash; &#216;
Ù capital u, grave accent &Ugrave; &#217;
Ú capital u, acute accent &Uacute; &#218;
Û capital u, circumflex accent &Ucirc; &#219;
Ü capital u, umlaut mark &Uuml; &#220;
Ý capital y, acute accent &Yacute; &#221;
Þ capital THORN, Icelandic &THORN; &#222;
ß small sharp s, German &szlig; &#223;
à small a, grave accent &agrave; &#224;
á small a, acute accent &aacute; &#225;
â small a, circumflex accent &acirc; &#226;
ã small a, tilde &atilde; &#227;
ä small a, umlaut mark &auml; &#228;
å small a, ring &aring; &#229;
æ small ae &aelig; &#230;
ç small c, cedilla &ccedil; &#231;
è small e, grave accent &egrave; &#232;
é small e, acute accent &eacute; &#233;
ê small e, circumflex accent &ecirc; &#234;
ë small e, umlaut mark &euml; &#235;
ì small i, grave accent &igrave; &#236;
í small i, acute accent &iacute; &#237;
î small i, circumflex accent &icirc; &#238;
ï small i, umlaut mark &iuml; &#239;
ð small eth, Icelandic &eth; &#240;
ñ small n, tilde &ntilde; &#241;
ò small o, grave accent &ograve; &#242;
ó small o, acute accent &oacute; &#243;
ô small o, circumflex accent &ocirc; &#244;
õ small o, tilde &otilde; &#245;
ö small o, umlaut mark &ouml; &#246;
ø small o, slash &oslash; &#248;
ù small u, grave accent &ugrave; &#249;
ú small u, acute accent &uacute; &#250;
û small u, circumflex accent &ucirc; &#251;
ü small u, umlaut mark &uuml; &#252;
ý small y, acute accent &yacute; &#253;
þ small thorn, Icelandic &thorn; &#254;
ÿ small y, umlaut mark &yuml; &#255;
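When a page has to be stored as plain ASCII, accented letters like the ones above can be converted to entity numbers automatically. A minimal sketch using Python's built-in `xmlcharrefreplace` error handler; `to_ascii_html` is a hypothetical helper name.

```python
def to_ascii_html(text: str) -> str:
    """Encode text as ASCII, replacing other characters with &#N; references."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

# \u00e9 is "small e, acute accent" and \u00f6 is "small o, umlaut mark".
print(to_ascii_html("caf\u00e9"))
print(to_ascii_html("K\u00f6ln"))
```

This handler only produces numeric references; it does not escape `<`, `>`, or `&`, so it complements rather than replaces ordinary escaping.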


(4) HTML symbol entities
Description: these include math symbols, Greek letters, various arrows, technical symbols, and shapes.

Math symbols supported by HTML

Result	Description	Entity name	Entity number
∀ for all &forall; &#8704;
∂ part &part; &#8706;
∃ exists &exists; &#8707;
∅ empty &empty; &#8709;
∇ nabla &nabla; &#8711;
∈ isin &isin; &#8712;
∉ notin &notin; &#8713;
∋ ni &ni; &#8715;
∏ prod &prod; &#8719;
∑ sum &sum; &#8721;
− minus &minus; &#8722;
∗ lowast &lowast; &#8727;
√ square root &radic; &#8730;
∝ proportional to &prop; &#8733;
∞ infinity &infin; &#8734;
∠ angle &ang; &#8736;
∧ and &and; &#8743;
∨ or &or; &#8744;
∩ cap &cap; &#8745;
∪ cup &cup; &#8746;
∫ integral &int; &#8747;
∴ therefore &there4; &#8756;
∼ similar to &sim; &#8764;
≅ approximately equal &cong; &#8773;
≈ almost equal &asymp; &#8776;
≠ not equal &ne; &#8800;
≡ equivalent &equiv; &#8801;
≤ less or equal &le; &#8804;
≥ greater or equal &ge; &#8805;
⊂ subset of &sub; &#8834;
⊃ superset of &sup; &#8835;
⊄ not subset of &nsub; &#8836;
⊆ subset or equal &sube; &#8838;
⊇ superset or equal &supe; &#8839;
⊕ circled plus &oplus; &#8853;
⊗ circled times &otimes; &#8855;
⊥ perpendicular &perp; &#8869;
⋅ dot operator &sdot; &#8901;
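To see how these entities combine in running text, a formula written with entity names can be decoded to the symbols a browser would show. This is a sketch using Python's standard `html.unescape`; `render_math` and the sample formula are invented for illustration.

```python
import html

def render_math(source: str) -> str:
    """Decode entity names and numbers in a formula to Unicode symbols."""
    return html.unescape(source)

print(render_math("&forall;x &isin; A: x &le; &infin; &and; x &ne; 0"))
print(render_math("&sum; and &#8721; are the same symbol"))
```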



Greek letters supported by HTML
Result	Description	Entity name	Entity number
Α Alpha &Alpha; &#913;
Β Beta &Beta; &#914;
Γ Gamma &Gamma; &#915;
Δ Delta &Delta; &#916;
Ε Epsilon &Epsilon; &#917;
Ζ Zeta &Zeta; &#918;
Η Eta &Eta; &#919;
Θ Theta &Theta; &#920;
Ι Iota &Iota; &#921;
Κ Kappa &Kappa; &#922;
Λ Lambda &Lambda; &#923;
Μ Mu &Mu; &#924;
Ν Nu &Nu; &#925;
Ξ Xi &Xi; &#926;
Ο Omicron &Omicron; &#927;
Π Pi &Pi; &#928;
Ρ Rho &Rho; &#929;
Sigmaf undefined (there is no capital final sigma; code 930 is unassigned)
Σ Sigma &Sigma; &#931;
Τ Tau &Tau; &#932;
Υ Upsilon &Upsilon; &#933;
Φ Phi &Phi; &#934;
Χ Chi &Chi; &#935;
Ψ Psi &Psi; &#936;
Ω Omega &Omega; &#937;

α alpha &alpha; &#945;
β beta &beta; &#946;
γ gamma &gamma; &#947;
δ delta &delta; &#948;
ε epsilon &epsilon; &#949;
ζ zeta &zeta; &#950;
η eta &eta; &#951;
θ theta &theta; &#952;
ι iota &iota; &#953;
κ kappa &kappa; &#954;
λ lambda &lambda; &#955;
μ mu &mu; &#956;
ν nu &nu; &#957;
ξ xi &xi; &#958;
ο omicron &omicron; &#959;
π pi &pi; &#960;
ρ rho &rho; &#961;
ς sigmaf &sigmaf; &#962;
σ sigma &sigma; &#963;
τ tau &tau; &#964;
υ upsilon &upsilon; &#965;
φ phi &phi; &#966;
χ chi &chi; &#967;
ψ psi &psi; &#968;
ω omega &omega; &#969;

ϑ theta symbol &thetasym; &#977;
ϒ upsilon symbol &upsih; &#978;
ϖ pi symbol &piv; &#982;
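For every Greek letter that has both cases, the lowercase entity number is the uppercase number plus 32, which gives a quick way to sanity-check a row. The sketch below looks both numbers up in Python's standard `html.entities.name2codepoint`; `greek_pair` is a hypothetical helper.

```python
from html.entities import name2codepoint

def greek_pair(name: str):
    """Return (uppercase code, lowercase code) for a Greek letter name."""
    return name2codepoint[name.capitalize()], name2codepoint[name.lower()]

for letter in ("alpha", "lambda", "nu", "omega"):
    upper, lower = greek_pair(letter)
    print(letter, upper, lower, lower - upper)
```

The final sigma is the exception: `sigmaf` exists only in lowercase, so `greek_pair("sigmaf")` raises `KeyError`.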

Other entities supported by HTML

Result	Description	Entity name	Entity number
Πcapital ligature OE &OElig; &#338;
œ small ligature oe &oelig; &#339;
Š capital S with caron &Scaron; &#352;
š small S with caron &scaron; &#353;
Ÿ capital Y with diaeresis &Yuml; &#376;
ƒ f with hook &fnof; &#402;
ˆ modifier letter circumflex accent &circ; &#710;
˜ small tilde &tilde; &#732;
  en space &ensp; &#8194;
  em space &emsp; &#8195;
  thin space &thinsp; &#8201;
‌ zero width non-joiner &zwnj; &#8204;
‍ zero width joiner &zwj; &#8205;
‎ left-to-right mark &lrm; &#8206;
‏ right-to-left mark &rlm; &#8207;
– en dash &ndash; &#8211;
— em dash &mdash; &#8212;
‘ left single quotation mark &lsquo; &#8216;
’ right single quotation mark &rsquo; &#8217;
‚ single low-9 quotation mark &sbquo; &#8218;
“ left double quotation mark &ldquo; &#8220;
” right double quotation mark &rdquo; &#8221;
„ double low-9 quotation mark &bdquo; &#8222;
† dagger &dagger; &#8224;
‡ double dagger &Dagger; &#8225;
• bullet &bull; &#8226;
… horizontal ellipsis &hellip; &#8230;
‰ per mille &permil; &#8240;
′ minutes &prime; &#8242;
″ seconds &Prime; &#8243;
‹ single left angle quotation &lsaquo; &#8249;
› single right angle quotation &rsaquo; &#8250;
‾ overline &oline; &#8254;
€ euro &euro; &#8364;
™ trademark &trade; &#8482;
← left arrow &larr; &#8592;
↑ up arrow &uarr; &#8593;
→ right arrow &rarr; &#8594;
↓ down arrow &darr; &#8595;
↔ left right arrow &harr; &#8596;
↵ carriage return arrow &crarr; &#8629;
⌈ left ceiling &lceil; &#8968;
⌉ right ceiling &rceil; &#8969;
⌊ left floor &lfloor; &#8970;
⌋ right floor &rfloor; &#8971;
◊ lozenge &loz; &#9674;
♠ spade &spades; &#9824;
♣ club &clubs; &#9827;
♥ heart &hearts; &#9829;
♦ diamond &diams; &#9830;
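Several entries in this table (the spaces, joiners, and direction marks) are invisible when printed, so it helps to inspect them by code point and Unicode name instead. A minimal sketch using Python's standard `html` and `unicodedata` modules; `describe` is a name made up for the example and expects a reference that decodes to a single character.

```python
import html
import unicodedata

def describe(reference: str) -> str:
    """Decode one entity reference and return 'U+XXXX NAME' for it."""
    ch = html.unescape(reference)
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for ref in ("&ensp;", "&zwnj;", "&lrm;", "&ndash;"):
    print(ref, describe(ref))
```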


Appendix