Special Characters and SMS Character Counting
A single SMS can typically hold up to 160 characters. However, several rules affect this limit, including whether you send longer messages and whether the message contains special characters. This article covers everything that governs how many characters fit in your SMS messages.
SMS character encoding and length limits
SMS messages use two main encoding standards: GSM-7 and Unicode. The encoding determines your maximum character limit.
| Encoding Standard | Character Set | Max Characters (Single SMS) | Max Characters (Concatenated/Multi-part SMS) |
| --- | --- | --- | --- |
| GSM-7 | Standard European letters, digits, and common punctuation. | 160 | 153 per segment |
| Unicode (UCS-2) | Non-Western scripts, Chinese, Arabic, and emojis. | 70 | 67 per segment |
Key Rules for Length
- Longer Messages (GSM-7): If your message exceeds 160 characters, it is split into multiple segments. Each segment, including the first, gives up 7 characters to the concatenation metadata that lets the receiving phone reassemble the message, leaving 153 characters per segment. (e.g., a 161-character message becomes two SMS segments.)
- The Unicode Switch: The presence of even a single character not supported by the GSM-7 set (e.g., a Chinese character or most emojis) automatically switches the entire message to Unicode encoding. The limit drops immediately to 70 characters for a single SMS, or 67 characters per segment for longer messages.
- Two-Slot Characters in GSM-7: A few special characters (like the € sign, ^, |, or [) are technically supported by GSM-7 but consume two character slots each. (e.g., "Price is €5" is 11 characters, but consumes 12 slots.)
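The segment arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration rather than HelloSMS's actual implementation: the function name `segment_count` is hypothetical, and it assumes you already know how many character slots the message consumes and which encoding applies.

```python
import math

def segment_count(slots: int, unicode: bool = False) -> int:
    """Estimate the number of SMS segments for a message.

    slots   -- character slots consumed (two-slot characters already counted twice)
    unicode -- True if the message is sent as Unicode (UCS-2), False for GSM-7
    """
    # Limits taken from the table in this article.
    single_limit, per_segment = (70, 67) if unicode else (160, 153)
    if slots <= single_limit:
        return 1
    # Once the message is split, every segment carries concatenation metadata.
    return math.ceil(slots / per_segment)

print(segment_count(160))                # 1
print(segment_count(161))                # 2
print(segment_count(71, unicode=True))   # 2
```

The sketch treats the message as a plain run of slots. In practice a two-slot character cannot be divided between two segments, so a message that sits exactly on a segment boundary may need one more segment than this estimate.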
Unicode Character Alert and Auto-Replacement
To help you avoid unexpected costs and length reductions, HelloSMS includes a feature that detects non-GSM characters and offers an automatic replacement option.
- When Unicode is Detected: If you include a Unicode character, the system shows a warning, and the character count immediately drops to the Unicode limit (70 characters for a single SMS).
- Characters and Emojis: Emojis and characters from non-Western alphabets (like Arabic or Chinese) use the Unicode format. Most emojis consume two character slots within the 70-character limit, and some composite emojis consume more. (e.g., the message "Hi 😀" corresponds to 5 characters (3 regular + 2 for the emoji) in Unicode format.)
- Preventing Unexpected Costs: Certain characters, such as "smart quotes" or long dashes copied from editors like Word or ChatGPT, are not part of the GSM-7 set. They switch the message to Unicode and can unexpectedly increase your message cost.
- Automatic Replacement: The system can automatically identify and replace detected Unicode characters (such as a long dash) with their GSM-7 equivalents, converting the message back to the more efficient GSM-7 encoding.
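To illustrate what a replacement step of this kind might look like, here is a minimal Python sketch. The mapping and the function name `to_gsm_friendly` are assumptions chosen for the example; the replacement table HelloSMS actually uses is not described in this article.

```python
# Hypothetical mapping from common "smart" typography to GSM-7 equivalents.
REPLACEMENTS = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / typographic apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis (becomes three characters)
})

def to_gsm_friendly(text: str) -> str:
    """Replace the mapped characters; everything else is left untouched."""
    return text.translate(REPLACEMENTS)

print(to_gsm_friendly("\u201cSale\u201d \u2013 don\u2019t miss it\u2026"))
# "Sale" - don't miss it...
```

Characters that have no sensible GSM-7 equivalent, such as emojis or Chinese characters, are deliberately left alone in this sketch, so a message containing them would still be sent as Unicode.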
Variable Lengths with Custom Data Fields
When you use custom data fields to personalize a message, the final message length will vary for each recipient, since names or other data may be of different lengths.
- Uncertainty of Length: Since the system cannot know the final length of the field content for every recipient, the preview's estimated cost and length may not reflect the maximum possible message length.
- Risk of Extra Segments: If a recipient's personalized data (e.g., a very long name) causes the final message to exceed the character limit, the message will be split into an additional SMS segment, incurring an extra charge.
See Custom Data Fields for a Contact List to learn more.
To ensure your entire message fits within the intended number of segments, always build in a generous character margin to account for the longest possible data from a custom field.
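One way to size that margin is to compute the worst-case length of a template from the longest value each field can take. The sketch below is only an illustration: it assumes a hypothetical `{field_name}` placeholder syntax (not necessarily the one HelloSMS uses), and the helper `worst_case_length` and the field names are invented for the example.

```python
from string import Formatter

def worst_case_length(template: str, max_field_lengths: dict[str, int]) -> int:
    """Length of the template when every field holds its longest expected value."""
    total = 0
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    for literal_text, field_name, _spec, _conversion in Formatter().parse(template):
        total += len(literal_text)
        if field_name is not None:
            total += max_field_lengths[field_name]
    return total

template = "Hi {first_name}, your order {order_id} is ready."
print(worst_case_length(template, {"first_name": 20, "order_id": 8}))  # 54
```

The result is a plain character count. If the template or the field values can contain two-slot characters or Unicode characters, the slots consumed will be higher than this figure.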
Unsubscribe Text, Link Shortening
Be aware that certain elements automatically added to your message will count toward the total character limit.
- Unsubscribe Text: By default, marketing messages sent to a list include an opt-out message (e.g., "STOP to this number"). This text is included in the character count and segment calculation. See Unsubscribe Text Overview to learn more.
- Custom Unsubscribe Text: If you opt to provide your own custom opt-out text via the Account Settings, the text you define will be added instead, and its length will be factored into the character count. See Edit Opt-out Texts to learn more.
- Link Shortening: HelloSMS offers an optional link shortening feature that provides two main benefits:
- Reduces character count: It automatically shortens long URLs within your message, reducing the number of characters consumed by the link.
- Click tracking: It creates a unique, shortened link for each recipient to support campaign statistics.
Appendix: Compilation of Characters
The following characters make up the GSM-7 character set. The presence of any character not listed in this appendix will trigger Unicode encoding.
Standard GSM-7 Characters (1 character slot)
All of these characters correspond to one character each:
! " # $ % ' ( ) * + , - . / : ; < = > ? @ _ ¡ £ ¥ ¿ & ¤ 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z Ä Å Æ Ç É Ñ Ø ø Ü ß Ö à ä å æ è é ì ñ ò ö ù ü Δ Φ Γ Λ Ω Π Ψ Σ Θ Ξ
Extended GSM-7 Characters (2 character slots)
The following special characters take up two character slots each:
^ | € { } [ ] ~ \
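The two lists above are enough to sketch a simple slot counter in Python. This is an illustrative sketch, not the counter HelloSMS uses: the names `GSM7_BASIC`, `GSM7_EXTENDED`, and `count_slots` are hypothetical, and the space and line-break characters are added as an assumption because they are part of the standard GSM-7 alphabet even though they are not printed in the lists.

```python
# One-slot characters, copied from the appendix list.
GSM7_BASIC = set(
    "!\"#$%'()*+,-./:;<=>?@_¡£¥¿&¤"
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "ÄÅÆÇÉÑØøÜßÖàäåæèéìñòöùü"
    "ΔΦΓΛΩΠΨΣΘΞ"
    " \n\r"  # assumption: space and line breaks, not shown in the list above
)
# Two-slot characters, copied from the appendix list.
GSM7_EXTENDED = set("^|€{}[]~\\")

def count_slots(text: str) -> tuple[str, int]:
    """Return the encoding the message would need and the slots it consumes."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        slots = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        return "GSM-7", slots
    # In Unicode (UCS-2) messages each UTF-16 code unit takes one slot,
    # which is why most emojis count as two.
    return "Unicode", len(text.encode("utf-16-le")) // 2

print(count_slots("Price is €5"))  # ('GSM-7', 12)
print(count_slots("Hi 😀"))         # ('Unicode', 5)
```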
By understanding the impact of GSM-7 vs. Unicode encoding, managing custom data fields, and accounting for auto-added text, you can maintain complete control over your message length and minimize unexpected credit usage.
Learn More
See the following articles to learn more: