PETSCII

From Wikipedia, the free encyclopedia

PETSCII (PET Standard Code of Information Interchange), also known as CBM ASCII, is the character set used in Commodore Business Machines' 8-bit home computers[aa 1].

This character set was first used by the PET from 1977, and was subsequently used by the CBM-II, VIC-20, Commodore 64, Commodore 16, Commodore 116, Plus/4, and Commodore 128. However, the Amiga personal computer family instead uses standard ISO/IEC 8859-1.

The character set was largely designed by Leonard Tramiel (the son of Commodore CEO Jack Tramiel) and PET designer Chuck Peddle. The graphic characters of PETSCII were one of the extensions Commodore specified for Commodore BASIC when laying out desired changes to Microsoft's existing 6502 BASIC to Microsoft's Ric Weiland in 1977. The VIC-20 used the same pixel-for-pixel font as the PET, although the characters appeared wider due to the VIC's 22-column screen. The Commodore 64, however, used a slightly re-designed, heavy upper-case font, essentially a thicker version of the PET's, in order to avoid color artifacts created by the machine's higher resolution screen. The C64's lowercase characters are identical to the lowercase characters in the Atari 8-bit computers font (released 2.75 years earlier).

Peddle claims the inclusion of card suit symbols was spurred by the demand that it should be easy to write card games on the PET (as part of the specification list he received).

Specifications

"Unshifted" PETSCII is based on the 1963 version of ASCII (rather than the 1967 version, which most if not all other computer character sets based on ASCII use). It has only uppercase letters, an up-arrow ⟨↑⟩ instead of caret ⟨^⟩ at 0x5E and a left-arrow ⟨←⟩ instead of an underscore ⟨_⟩ at 0x5F. In all versions except the original Commodore PET, it also has a British pound sign ⟨£⟩ instead of the backslash ⟨\⟩ at 0x5C. Other characters added in ASCII-1967 (lowercase letters, the grave accent, curly braces, vertical bar, and tilde) do not exist in PETSCII. Codes 0xA0–0xDF are allotted to CBM-specific block graphics characters: horizontal and vertical lines, hatches, shades, triangles, circles and card suits.

PETSCII also has a "shifted" mode (also called "business mode"), which changes the uppercase letters at 0x41–0x5A to lowercase, and changes the graphics at 0xC1–0xDA to uppercase letters. Upper- and lower-case letters are therefore swapped relative to their ASCII positions. The mode is toggled by holding one of the SHIFT keys and then pressing and releasing the Commodore key. On the PET, the switch can be made by POKEing location 59468 with the value 14 to select the alternative set or 12 to revert to the standard set. On the Commodore 64, the sets are alternated by flipping the second-lowest bit (value 2) of the byte at address 53272. On some models of PET, the switch can also be made with the special control code PRINT CHR$(14), which adjusts the line spacing as well as changing the character set; the POKE method is still available and does not alter the line spacing.
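The case swap described above can be illustrated with a minimal Python sketch. The function name `shifted_petscii_to_ascii` and the use of "?" as a placeholder for graphics and control codes are assumptions made for illustration; only the two letter ranges and the characters that shifted PETSCII shares with ASCII are taken from the description and tables in this article.

```python
def shifted_petscii_to_ascii(data):
    """Convert shifted-mode PETSCII bytes to an ASCII string (illustrative sketch)."""
    out = []
    for code in data:
        if 0x41 <= code <= 0x5A:
            # Shifted PETSCII keeps lowercase letters where ASCII has uppercase.
            out.append(chr(code + 0x20))
        elif 0xC1 <= code <= 0xDA:
            # Uppercase letters replace the graphics at 0xC1-0xDA.
            out.append(chr(code - 0x80))
        elif 0x20 <= code <= 0x40 or code in (0x5B, 0x5D):
            # Space, digits, punctuation, "@", "[" and "]" match ASCII.
            out.append(chr(code))
        else:
            # Graphics, control codes and variant-specific cells are not mapped here.
            out.append("?")
    return "".join(out)


print(shifted_petscii_to_ascii(bytes([0xC8, 0x45, 0x4C, 0x4C, 0x4F])))
```

The sketch deliberately ignores the duplicate ranges and the code points that differ between machines, so it should not be read as a complete converter.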

Included in PETSCII are cursor and screen control codes, such as {HOME}, {CLR}, {RVS ON}, and {RVS OFF} (the latter two activating/deactivating reverse-video character display). The control codes appeared in program listings as reverse-video graphic characters, although some computer magazines, in their efforts to provide more clearly readable listings, pretty-printed the codes using their actual names in curly braces, like the above examples. This is unambiguous as PETSCII has no curly brace characters.
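As an illustration of the magazine-style notation, the following Python sketch expands names in curly braces into PETSCII bytes. The code values for the four names are taken from the control character table in this article (HOME 0x13, CLEAR 0x93, REVERSE ON 0x12, REVERSE OFF 0x92); the function name `expand_listing`, the spelling of the names, and the restriction to unshifted characters that coincide with ASCII are assumptions, since magazines differed in the exact names they printed.

```python
import re

# Names as used in the examples above; values from the control character table.
CONTROL_CODES = {
    "HOME": 0x13,
    "CLR": 0x93,
    "RVS ON": 0x12,
    "RVS OFF": 0x92,
}

# Either a {NAME} token or a single ordinary character.
_TOKEN = re.compile(r"\{([^{}]+)\}|(.)", re.DOTALL)


def expand_listing(text):
    """Turn a pretty-printed listing string into unshifted PETSCII bytes."""
    out = bytearray()
    for match in _TOKEN.finditer(text):
        name, char = match.groups()
        if name is not None:
            if name not in CONTROL_CODES:
                raise ValueError("unknown control code name: " + name)
            out.append(CONTROL_CODES[name])
            continue
        code = ord(char)
        # Only characters that unshifted PETSCII shares with ASCII are accepted.
        if not (0x20 <= code <= 0x5B or code == 0x5D):
            raise ValueError("character not handled by this sketch: %r" % char)
        out.append(code)
    return bytes(out)


print(expand_listing("{CLR}{RVS ON}HI{RVS OFF}").hex())
```

Because PETSCII has no curly brace characters, a stray brace can only be a malformed token, so the sketch rejects it instead of guessing.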

Different mappings are used for storing characters (the "interchange" mapping, as used by CHR$()) and displaying characters (the "video" mapping). For example, to display the characters "@ABC" on screen by directly writing into the screen memory, one would POKE the decimal values 0, 1, 2, and 3 rather than 64, 65, 66, and 67.
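The relationship between the two mappings can be sketched in Python. Only the "@ABC" case (64–67 becoming 0–3) is stated above; the remaining ranges follow the commonly documented Commodore 64 convention and should be treated as an assumption here, as should the function name `petscii_to_screen_code`.

```python
def petscii_to_screen_code(code):
    """Map an interchange (CHR$) code to a video (screen memory) code.

    Ranges other than 0x40-0x5F follow the usual Commodore 64 convention
    and are an assumption, not something stated in the article text.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in one byte")
    if code < 0x20 or 0x80 <= code <= 0x9F:
        raise ValueError("control codes have no screen code in this sketch")
    if code <= 0x3F:
        return code          # space, digits, punctuation: unchanged
    if code <= 0x5F:
        return code - 0x40   # "@", letters, "[", "]": 64 -> 0, 65 -> 1, ...
    if code <= 0x7F:
        return code - 0x20   # duplicate graphics range
    if code <= 0xBF:
        return code - 0x40   # 0xA0-0xBF graphics
    if code <= 0xFE:
        return code - 0x80   # 0xC0-0xFE graphics
    return 0x5E              # 0xFF is conventionally displayed like 0xDE


print([petscii_to_screen_code(c) for c in b"@ABC"])
```

Writing the returned values into screen memory corresponds to the POKE example above, whereas PRINT CHR$() would take the original interchange codes.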

The keyboard by default provides access to the lower half of the code page. Pressing Shift and a key gives the corresponding upper half code point. Some PETSCII code points cannot be printed and are only used for keyboard input (e.g. F1, RUN/STOP).

Character set

The tables below represent the "interchange" PETSCII encoding, as used by CHR$().

Control characters are defined in the ranges 0x00–0x1F and 0x80–0x9F, although which control characters are defined and what they are defined as varies between systems. The tables below exclude control characters; their encoding is discussed in § Control characters.

The ranges 0x60–0x7F and 0xE0–0xFF are duplicate ranges, although what they duplicate varies between systems. On the Commodore PET, they duplicate 0x20–0x3F and 0xA0–0xBF, respectively; on the Commodore VIC-20, 64, 16, and 128 they duplicate 0xC0–0xDF and 0xA0–0xBF, respectively. While these characters are visually duplicates, they are semantically different; for example, on the Commodore PET, code points 0x2C and 0x6C both produce a comma character, but only 0x2C functions as a delimiter between input fields.
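The duplicate ranges can be folded onto the ranges they mirror, which is useful only for comparing how characters look, not how they behave. The Python sketch below follows the ranges given above; the function name `visual_equivalent`, the `system` labels, and the decision to leave 0xFF untouched (the tables list a separate glyph in that cell) are assumptions made for illustration.

```python
def visual_equivalent(code, system="C64"):
    """Fold a code from a duplicate range onto the range it visually mirrors.

    `system` is "PET" for the Commodore PET; any other value is treated as
    the VIC-20/64/16/128 layout. This compares appearance only: as noted
    above, 0x2C and 0x6C look alike on the PET but are not interchangeable.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in one byte")
    if 0x60 <= code <= 0x7F:
        if system == "PET":
            return code - 0x40   # mirrors 0x20-0x3F on the PET
        return code + 0x60       # mirrors 0xC0-0xDF elsewhere
    if 0xE0 <= code <= 0xFE:
        return code - 0x40       # mirrors 0xA0-0xBF on all systems
    # 0xFF is left alone: the tables show its own glyph there (assumption).
    return code


print(hex(visual_equivalent(0x6C, "PET")), hex(visual_equivalent(0x6C, "C64")))
```

A program that needs the semantic distinction, such as input parsing, should keep the original code and use the folded value only for display comparisons.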

Graphic characters are mostly identical across systems, with the exceptions of 0x5C (which is \ on the Commodore PET, and £ on other systems), 0xDE (which is U+1FB95 CHECKER BOARD FILL on the Commodore PET and VIC-20, and U+1FB96 INVERSE CHECKER BOARD FILL on other systems), and the range 0x60–0x7F (which duplicates a different range on the Commodore PET). Additionally, in the Commodore PET 2001's shifted character set, uppercase and lowercase letters are swapped relative to other systems.

Compatibility symbols for PETSCII characters were added to Unicode 13.0 as part of the Symbols for Legacy Computing block.

Standard

The following tables represent the PETSCII encoding used on the Commodore VIC-20, 64, 16, and 128.

Unshifted

Unshifted PETSCII
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ £ ]
6x 🭲 🭸 🭷 🭶 🭺 🭱 🭴 🭼 🭽
7x 🭾 🭻 🭰 🭵 🮌 π
8x
9x
Ax NBSP 🮏 🮇
Bx 🮈 🮂 🮃 🭿
Cx 🭲 🭸 🭷 🭶 🭺 🭱 🭴 🭼 🭽
Dx 🭾 🭻 🭰 🭵 🮌 π
Ex NBSP 🮏 🮇
Fx 🮈 🮂 🮃 🭿 π
 Differs between PETSCII variants.

Shifted

Shifted PETSCII
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ a b c d e f g h i j k l m n o
5x p q r s t u v w x y z [ £ ]
6x A B C D E F G H I J K L M N O
7x P Q R S T U V W X Y Z 🮌 🮕/🮖 🮘
8x
9x
Ax NBSP 🮏 🮙 🮇
Bx 🮈 🮂 🮃
Cx A B C D E F G H I J K L M N O
Dx P Q R S T U V W X Y Z 🮌 🮕/🮖 🮘
Ex NBSP 🮏 🮙 🮇
Fx 🮈 🮂 🮃 🮕/🮖
 Differs between PETSCII variants.
 Note: 🮕/🮖 is U+1FB95 CHECKER BOARD FILL on the VIC-20, and U+1FB96 INVERSE CHECKER BOARD FILL on the Commodore 64 and Commodore 128.

Commodore PET

Unshifted

Unshifted PETSCII (PET)
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ]
6x SP ! " # $ % & ' ( ) * + , - . /
7x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
8x
9x
Ax NBSP 🮏 🮇
Bx 🮈 🮂 🮃 🭿
Cx 🭲 🭸 🭷 🭶 🭺 🭱 🭴 🭼 🭽
Dx 🭾 🭻 🭰 🭵 🮌 π
Ex NBSP 🮏 🮇
Fx 🮈 🮂 🮃 🭿 π
 Differs from standard PETSCII.

Shifted

Shifted PETSCII (PET)
0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ a b c d e f g h i j k l m n o
5x p q r s t u v w x y z [ \ ]
6x SP ! " # $ % & ' ( ) * + , - . /
7x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
8x
9x
Ax NBSP 🮏 🮙 🮇
Bx 🮈 🮂 🮃
Cx A B C D E F G H I J K L M N O
Dx P Q R S T U V W X Y Z 🮌 🮕 🮘
Ex NBSP 🮏 🮙 🮇
Fx 🮈 🮂 🮃 🮕
 Displayed case matches the Commodore PET 8032. The opposite case is used on the Commodore PET 2001.

 Differs from standard PETSCII.

Control characters

While the graphic characters were mostly shared between Commodore systems, the control characters frequently varied. The following table describes what the control characters represent on the Commodore PET 2001, Commodore PET 8032, VIC-20, Commodore 64, Commodore 16, and Commodore 128 (40- and 80-column modes).

PETSCII control characters
Hex Decimal PET 2001 PET 8032 VIC-20 C64 C16 C128 (40 col) C128 (80 col)
00 0
01 1
02 2 UNDERLINE ON
03 3 STOP
04 4
05 5 WHITE
06 6
07 7 BELL BELL
08 8 LOCK CASE
09 9 TAB UNLOCK CASE TAB
0A 10 LINE FEED
0B 11 UNLOCK CASE
0C 12 LOCK CASE
0D 13 RETURN
0E 14 LOWER CASE
0F 15 SET WINDOW TOP FLASH ON
10 16
11 17 CURSOR DOWN
12 18 REVERSE ON
13 19 HOME
14 20 DEL
15 21 KILL LINE
16 22 ERASE TO RIGHT
17 23
18 24 TAB SET/CLEAR
19 25 SCROLL UP
1A 26
1B 27 ESC ESC
1C 28 RED
1D 29 CURSOR RIGHT
1E 30 GREEN
1F 31 BLUE
80 128
81 129 ORANGE DARK PURPLE
82 130 FLASH ON UNDERLINE OFF
83 131 RUN
84 132 FLASH OFF
85 133 F1
86 134 F3
87 135 DOUBLE BELL F5
88 136 F7
89 137 TAB SET/CLEAR F2
8A 138 F4
8B 139 F6
8C 140 F8 HELP F8
8D 141 SHIFT + RETURN
8E 142 UPPER CASE
8F 143 SET WINDOW END FLASH OFF
90 144 BLACK
91 145 CURSOR UP
92 146 REVERSE OFF
93 147 CLEAR
94 148 INST
95 149 INSERT LINE ABOVE BROWN DARK YELLOW
96 150 ERASE TO LEFT PINK YELLOW-GREEN PINK
97 151 DARK GRAY PINK DARK GRAY DARK CYAN
98 152 MEDIUM GRAY BLUE-GREEN MEDIUM GRAY
99 153 SCROLL DOWN LIGHT GREEN LIGHT BLUE LIGHT GREEN
9A 154 LIGHT BLUE DARK BLUE LIGHT BLUE
9B 155 LIGHT GRAY LIGHT GREEN LIGHT GRAY
9C 156 PURPLE
9D 157 CURSOR LEFT
9E 158 YELLOW
9F 159 CYAN

The colors of the VIC-20 and C64/128 are listed in the VIC-II article.

Base 128

Out of PETSCII's first 192 codes, there are 128 graphic characters: 32–127 and 160–191. This permits "base128"-style encodings in DATA statements, or perhaps between PETSCII-speaking machines. This can also include control characters, which are visible when quoted, although which control characters are defined varies between systems.
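A "base128"-style encoding of this kind can be sketched in Python. The article does not define a specific alphabet or bit order, so everything here is an assumption made for illustration: the function names, the big-endian bit packing, and the mapping of values 0–95 to codes 32–127 and values 96–127 to codes 160–191 (96 + 32 = 128 codes). The sketch also does not deal with code 34, the quote character, which would end a quoted string in a BASIC line and would need separate handling in practice.

```python
def value_to_code(value):
    """Map a 7-bit value to a graphic PETSCII code (hypothetical alphabet)."""
    if not 0 <= value <= 127:
        raise ValueError("value must be 7 bits")
    return 32 + value if value < 96 else 160 + (value - 96)


def code_to_value(code):
    """Inverse of value_to_code."""
    if 32 <= code <= 127:
        return code - 32
    if 160 <= code <= 191:
        return code - 160 + 96
    raise ValueError("code outside the 128-character alphabet")


def encode(data):
    """Pack bytes into 7-bit groups, most significant bits first."""
    bits, nbits, out = 0, 0, []
    for byte in data:
        bits = (bits << 8) | byte
        nbits += 8
        while nbits >= 7:
            nbits -= 7
            out.append(value_to_code((bits >> nbits) & 0x7F))
        bits &= (1 << nbits) - 1          # keep only the unconsumed bits
    if nbits:
        # Pad the final partial group with zero bits on the right.
        out.append(value_to_code((bits << (7 - nbits)) & 0x7F))
    return out


def decode(codes):
    """Unpack 7-bit groups back into bytes; trailing pad bits are dropped."""
    bits, nbits, out = 0, 0, bytearray()
    for code in codes:
        bits = (bits << 7) | code_to_value(code)
        nbits += 7
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1
    return bytes(out)


sample = bytes(range(16))
assert decode(encode(sample)) == sample
```

Each output code carries seven bits, which is the property the efficiency figures in this section rely on.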

The primary application for a "Base 128" encoding is in DATA statements in Commodore BASIC. Binary data can be stored with relatively low overhead, since each character encodes seven bits of data. On a standard 80-character line, typically four characters are used for the line number, and two characters for the tokenized DATA statement. Since the comma and colon are significant to BASIC, a quote character is also needed, leaving 73 characters for data. At seven bits per character, one DATA line could store 511 bits of binary data, for about 80% efficiency (511 of 640 bits). If three-digit line numbers are used, efficiency increases to about 81%. If two-digit line numbers are used, efficiency is about 82%.

Line numbers   Data chars per line   Bits per line   Efficiency   Max. lines   Max. total data bytes
1-9            76                    532             0.83125      9            598
10-99          75                    525             0.8203125    90           5,906
100-999        74                    518             0.809375     900          58,275
1000-9999      73                    511             0.7984375    9,000        574,875
10000-65535    72                    504             0.7875       55,536       3.5 MB (approx.)

Notes:
  1. Assume line 0 is a GOTO.
  2. Maximum line number is probably off-by-one.

For storing binary data in Commodore BASIC, it appears that two- or three-digit line numbers are typically the best choice.
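The figures in the table above can be reproduced with a short Python calculation. It assumes, as the text does, an 80-character line, two characters for the tokenized DATA statement, one quote character, and seven data bits per remaining character; the function name `data_line_stats` and the constant names are hypothetical.

```python
LINE_WIDTH = 80        # characters per BASIC line, as assumed in the text
FIXED_OVERHEAD = 2 + 1 # tokenized DATA statement (2) plus the quote (1)
BITS_PER_CHAR = 7


def data_line_stats(first_line, last_line):
    """Return (chars, bits, efficiency, lines, total_bytes) for a line range.

    All line numbers in the range are assumed to have the same digit count.
    """
    digits = len(str(last_line))
    chars = LINE_WIDTH - digits - FIXED_OVERHEAD
    bits = BITS_PER_CHAR * chars
    efficiency = bits / (LINE_WIDTH * 8)
    lines = last_line - first_line + 1
    total_bytes = lines * bits // 8      # whole bytes only
    return chars, bits, efficiency, lines, total_bytes


for first, last in [(1, 9), (10, 99), (100, 999), (1000, 9999), (10000, 65535)]:
    print(first, last, data_line_stats(first, last))
```

The last row depends on the assumed maximum line number of 65535, which the notes to the table already flag as uncertain.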

Base 164

164 PETSCII characters are representable in quoted strings; theoretically, then, Base 164 is possible. This adds in the color values, the function keys, and cursor controls.

See also

  • ATASCII
  • Atari ST character set
  • ZX Spectrum character set
  • Extended ASCII
  • Text semigraphics

External links

  • PETSCII to Unicode mapping and a TrueType font using that mapping
  • Typography in 8 bits: System fonts
  • Online PETSCII-art editor
  • PETSCII-art