Convert::YText - Quotes strings suitably for an RFC 2822 local part
Version 0.2
use Convert::YText qw(encode_ytext decode_ytext);
$encoded=encode_ytext($string); $decoded=decode_ytext($encoded);
($decoded eq $string) || die "this should never happen!";
Convert::YText converts strings to and from "YText", a format inspired
by xtext (defined in RFC 1891) and by the MIME base64 and quoted-printable
encodings (RFC 2045). The main goal is to encode a UTF-8 string into something
safe for use as the local part of an internet email address (RFC 2822).
By default spaces are replaced with "+", "/" is replaced with
"~", the characters "A-Za-z0-9_.-" encode as themselves,
and everything else is written "=USTR=", where USTR is the base-64
representation (using "A-Za-z0-9_." as digits) of the Unicode character
code. The encoding is configurable (see below).
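The default mapping described above can be sketched in plain Perl without
loading the module. This is only an illustration: the subroutine names
sketch_encode and sketch_decode are made up for this example, and the digit
order (the range "A-Za-z0-9_." expanded left to right, most significant
digit first) is an assumption, so the output is not guaranteed to match what
Convert::YText itself produces.

```perl
use strict;
use warnings;

# Assumed digit alphabet: "A-Za-z0-9_." expanded in order (64 characters).
my @DIGITS = ('A' .. 'Z', 'a' .. 'z', '0' .. '9', '_', '.');
my %VALUE;
@VALUE{@DIGITS} = (0 .. $#DIGITS);

# Write a non-negative integer in base 64, most significant digit first.
sub to_base64_digits {
    my ($n) = @_;
    my $out = '';
    do {
        $out = $DIGITS[$n % 64] . $out;
        $n   = int($n / 64);
    } while ($n > 0);
    return $out;
}

# Inverse of to_base64_digits.
sub from_base64_digits {
    my ($s) = @_;
    my $n = 0;
    $n = $n * 64 + $VALUE{$_} for split //, $s;
    return $n;
}

# Hypothetical encoder following the default rules in the text.
sub sketch_encode {
    my ($text) = @_;
    my $out = '';
    for my $ch (split //, $text) {
        if    ($ch eq ' ')                   { $out .= '+' }
        elsif ($ch eq '/')                   { $out .= '~' }
        elsif ($ch =~ /\A[A-Za-z0-9_.\-]\z/) { $out .= $ch }
        else { $out .= '=' . to_base64_digits(ord $ch) . '=' }
    }
    return $out;
}

# Hypothetical decoder: undo escapes, "+" and "~" in a single pass.
sub sketch_decode {
    my ($ytext) = @_;
    $ytext =~ s{=([A-Za-z0-9_.]+)=|(\+)|(~)}
               {defined $1 ? chr(from_base64_digits($1)) : defined $2 ? ' ' : '/'}ge;
    return $ytext;
}

my $encoded = sketch_encode("caf\x{e9} au lait/2");
print "$encoded\n";
print "round trip ok\n" if sketch_decode($encoded) eq "caf\x{e9} au lait/2";
```

Because "+", "~" and "=" are not digits and are not left unencoded, a literal
occurrence of any of them in the input is escaped, which is what keeps the
decoding step unambiguous in this sketch.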
The module can export "encode_ytext", which converts an arbitrary
Unicode string into a "safe" form, and "decode_ytext",
which recovers the original text. "validate_ytext" is a heuristic
which returns 0 for bad input.
For more control, you will need to use the OO interface.
Create a new encoding object.
Arguments
Arguments are by name (i.e. a hash).
- DIGIT_STRING ("A-Za-z0-9_.") Must be 64
characters long
- ESCAPE_CHAR ('=') Must not be in digit string.
- SPACE_CHAR ('+') Non digit to replace space. Can be the
empty string.
- SLASH_CHAR ('~') Non digit to replace slash. Can be the
empty string.
- EXTRA_CHARS ('._\-') Other characters to leave
unencoded.
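The constraints in the list above can be checked before an encoding object is
built. The following is a minimal sketch, not part of the module: the
subroutine names expand_ranges and config_problems are hypothetical, and it
assumes that DIGIT_STRING is a range specification such as "A-Za-z0-9_." that
expands to the actual digits.

```perl
use strict;
use warnings;

# Expand a range specification such as "A-Za-z0-9_." into its characters.
sub expand_ranges {
    my ($spec) = @_;
    $spec =~ s{(.)-(.)}{ join '', map { chr($_) } ord($1) .. ord($2) }ge;
    return $spec;
}

# Return a list of problems found in the named arguments (empty if none).
sub config_problems {
    my %arg = (
        DIGIT_STRING => 'A-Za-z0-9_.',   # defaults as documented above
        ESCAPE_CHAR  => '=',
        SPACE_CHAR   => '+',
        SLASH_CHAR   => '~',
        @_,
    );
    my $digits = expand_ranges($arg{DIGIT_STRING});
    my @problems;

    push @problems, 'DIGIT_STRING must be 64 characters long'
        unless length($digits) == 64;
    push @problems, 'ESCAPE_CHAR must not be in the digit string'
        if index($digits, $arg{ESCAPE_CHAR}) >= 0;

    for my $name (qw(SPACE_CHAR SLASH_CHAR)) {
        next if $arg{$name} eq '';       # the empty string is allowed
        push @problems, "$name must not be a digit"
            if index($digits, $arg{$name}) >= 0;
    }
    return @problems;
}

my @problems = config_problems(ESCAPE_CHAR => 'A', SPACE_CHAR => '');
print "$_\n" for @problems;
```

With the documented defaults the list of problems is empty; the call shown
reports only the ESCAPE_CHAR clash, since an empty SPACE_CHAR is permitted.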
Arguments
a string to encode.
Returns
encoded string
Arguments
a string to decode.
Returns
decoded string
A simple test for validity; passing it is necessary but not sufficient.
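To illustrate what a "necessary but not sufficient" check can look like, here
is a rough sketch for the default configuration. It is not the module's own
test: the name looks_like_ytext and both rules (only permitted characters,
escape characters in pairs) are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical heuristic: returns 1 if the string could be YText under the
# default configuration, 0 if it certainly is not.
sub looks_like_ytext {
    my ($s) = @_;

    # Only digits, '-', and the '+', '~' and '=' markers may appear.
    return 0 if $s =~ /[^A-Za-z0-9_.\-+~=]/;

    # Escapes are written "=USTR=", so '=' must occur an even number of times.
    my $escapes = ($s =~ tr/=//);
    return 0 if $escapes % 2;

    return 1;
}

print looks_like_ytext('hello+world~x'), "\n";   # passes the heuristic
print looks_like_ytext('hello world'),   "\n";   # raw space, rejected
```

A string can pass this check and still fail to decode sensibly, for example
if the digits between a pair of escape characters do not form a valid
character code; that is the sense in which the test is not sufficient.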
According to RFC 2822, the following non-alphanumerics are OK for the local part
of an address: "!#$%&'*+-/=?^_`{|}~". On the other hand, it
seems common in practice to block addresses having "%!/|`#&?" in
the local part. The idea is to restrict ourselves to basic ASCII
alphanumerics, plus a small set of printable ASCII, namely "=_+-~.".
The characters '+' and '-' are pretty widely used to attach suffixes (although
usually only one of them works on a given mail host). It seems OK to use both,
since the first occurrence marks the beginning of a suffix and any later ones
are regular characters. The character '.' also seems mostly permissible.
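The choice of punctuation discussed above can be double-checked mechanically.
The sketch below uses only the character sets quoted in this section; the
subroutine names are made up for the example, and '.' is treated as permitted
on the assumption that it appears in dot-atom position (not first, last, or
doubled).

```perl
use strict;
use warnings;

my $RFC2822_SPECIALS = q{!#$%&'*+-/=?^_`{|}~};   # allowed by RFC 2822
my $OFTEN_BLOCKED    = q{%!/|`#&?};              # commonly rejected in practice
my $YTEXT_SPECIALS   = q{=_+-~.};                # punctuation YText relies on

# True if the character is acceptable in a local part per the sets above.
sub is_permitted {
    my ($c) = @_;
    return ($c eq '.' || index($RFC2822_SPECIALS, $c) >= 0) ? 1 : 0;
}

# True if the character is in the commonly blocked set.
sub is_often_blocked {
    my ($c) = @_;
    return index($OFTEN_BLOCKED, $c) >= 0 ? 1 : 0;
}

for my $c (split //, $YTEXT_SPECIALS) {
    printf "%s  permitted=%d  often_blocked=%d\n",
        $c, is_permitted($c), is_often_blocked($c);
}
```

Every character in the YText set comes out as permitted and none falls in the
commonly blocked set, which matches the reasoning given in the text.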
David Bremner, <[email protected]>
Copyright (C) 2011 David Bremner. All Rights Reserved.
This module is free software; you can redistribute it and/or modify it under the
same terms as Perl itself.
MIME::Base64, MIME::Decoder::Base64, MIME::Decoder::QuotedPrint.