SAMPA

Computer-readable phonetic script

The Speech Assessment Methods Phonetic Alphabet (SAMPA) is a computer-readable phonetic script using 7-bit printable ASCII characters, based on the International Phonetic Alphabet (IPA). It was originally developed in the late 1980s for six European languages by the EEC ESPRIT information technology research and development program. As many symbols as possible were taken over from the IPA; where an IPA symbol has no ASCII equivalent, another available ASCII character is used instead, e.g. [@] for schwa (IPA [ə]), [2] for the vowel sound found in French deux 'two' (IPA [ø]), and [9] for the vowel sound found in French neuf 'nine' (IPA [œ]).

Today, officially, SAMPA has been developed for all the sounds of the following languages:

The characters ["s{mp@] represent the pronunciation of the name SAMPA in English, with the initial symbol ["] indicating primary stress. Like IPA, SAMPA is usually enclosed in square brackets or slashes, which are not part of the alphabet proper and merely signify that it is phonetic as opposed to regular text.
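As a rough illustration of the transcription above, the following Python sketch converts a SAMPA string to IPA one character at a time. The table is a tiny hand-picked excerpt covering only the symbols mentioned in this article, and the names SAMPA_TO_IPA and sampa_to_ipa are hypothetical. It assumes that every other character (such as s, m and p) is identical in SAMPA and IPA; a real converter would need a complete language-specific table.

```python
# Tiny excerpt of a SAMPA-to-IPA table; not a complete inventory.
SAMPA_TO_IPA = {
    '"': "ˈ",  # primary stress mark
    "{": "æ",
    "@": "ə",  # schwa
    "2": "ø",  # vowel of French "deux"
    "9": "œ",  # vowel of French "neuf"
}


def sampa_to_ipa(text: str) -> str:
    """Replace each known SAMPA character; pass all others through unchanged."""
    return "".join(SAMPA_TO_IPA.get(ch, ch) for ch in text)


print(sampa_to_ipa('"s{mp@'))  # ˈsæmpə
```

A character-by-character lookup is enough for this example because every symbol involved is a single ASCII character.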

Features

SAMPA was developed in the late 1980s in the European Commission-funded ESPRIT project 2589 "Speech Assessment Methods" (SAM)—hence "SAM Phonetic Alphabet"—in order to facilitate email data exchange and computational processing of transcriptions in phonetics and speech technology.

SAMPA is a partial encoding of the IPA. The first version of SAMPA was the union of the sets of phoneme codes for Danish, Dutch, English, French, German and Italian; later versions extended SAMPA to cover other European languages. Since SAMPA is based on phoneme inventories, each SAMPA table is valid only in the language it was created for. In order to make this IPA encoding technique universally applicable, X-SAMPA was created, which provides one single table without language-specific differences.
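The language-specific nature of SAMPA tables can be sketched by keying the lookup on a language code. The example below is only an assumption about how such tables might be organised: the SAMPA_TABLES dictionary, the language codes and the lookup function are hypothetical, and each table holds just a few entries taken from this article, not a full phoneme inventory.

```python
# Illustrative excerpts only; real SAMPA tables are much larger.
SAMPA_TABLES = {
    "en": {"{": "æ", "@": "ə"},
    "fr": {"2": "ø", "9": "œ", "@": "ə"},
}


def lookup(symbol: str, language: str) -> str:
    """Return the IPA value of a SAMPA symbol in one language's table."""
    table = SAMPA_TABLES[language]
    if symbol not in table:
        # A symbol is only meaningful within the table that defines it.
        raise KeyError(f"{symbol!r} is not defined in the {language!r} table")
    return table[symbol]


print(lookup("2", "fr"))  # ø

try:
    lookup("2", "en")  # absent from the English excerpt above
except KeyError as err:
    print("lookup failed:", err)
```

X-SAMPA avoids the per-language key by using one table for all languages, which in this sketch would correspond to a single flat dictionary.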

SAMPA was devised as a hack to work around the inability of text encodings to represent IPA symbols. Consequently, as Unicode support for IPA symbols becomes more widespread, the necessity for a separate, computer-readable system for representing the IPA in ASCII decreases. However, text input relies on specific keyboard encodings or input devices. For this reason, SAMPA and X-SAMPA are still widely used[1] in computational phonetics and in speech technology.
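The encoding difference can be checked directly. This small Python sketch compares the SAMPA transcription of the name SAMPA with an IPA rendering of it (ˈsæmpə, assumed here as the Unicode equivalent); the variable names are illustrative only.

```python
sampa = '"s{mp@'
ipa = "ˈsæmpə"

# The SAMPA string stays within 7-bit ASCII; the IPA string does not.
print(sampa.isascii(), ipa.isascii())  # True False

# Encoded sizes: one byte per SAMPA character, more for the IPA symbols in UTF-8.
print(len(sampa.encode("ascii")))  # 6
print(len(ipa.encode("utf-8")))    # 9
```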

References

  1. "Project Euphonia's Personalized Speech Recognition for Non-Standard Speech". Google AI Blog. Retrieved 2019-08-16.
  • Ranchhod, Elisabeth & Mamede, Nuno J. (2002). Advances in Natural Language Processing: Third International Conference, PorTAL 2002, Faro, Portugal, June 23–26, 2002, Proceedings. Lecture Notes in Computer Science (1st ed.). Springer. ISBN 3-540-43829-7.
  • DeMiller, Anna L. & Rettig, James (2000). Linguistics: A Guide to the Reference Literature (2nd ed.). Libraries Unlimited. ISBN 1-56308-619-0.
  • Lamberts, Koen & Goldstone, Rob (2004). Handbook of Cognition. Sage Publications Ltd. ISBN 0-7619-7277-3.

External links

  • SAMPA computer readable phonetic alphabet
  • Phonemic notation of English in SAMPA
  • SAMPA for Scots Archived 2003-08-11 at the Wayback Machine
  • Converter from (German) written text to SAMPA and IPA (Ajax-application)
  • IPA-SAMPA Converter and IPA-SAMPA chart