Symbian Developer Library

SYMBIAN OS V6.1 EDITION FOR C++



Location: charconv.h
Link against: charconv.lib

Class CCnvCharacterSetConverter

Support

Supported from 5.1

Description

Converts text between Unicode and other character sets. The first stage of the conversion is to specify the non-Unicode character set being converted to or from. This is done by calling one of the overloads of PrepareToConvertToOrFromL().

The second stage is to convert the text — using one of the overloads of ConvertFromUnicode() or ConvertToUnicode().

Where possible the first documented overload of PrepareToConvertToOrFromL() should be used because the second overload panics if the specified character set is not available — the first overload simply returns whether the character set is available or not available. However if the conversions are to be performed often, or if the user must select the character set for the conversion from a list, the second overload may be more appropriate.

The first overload is less efficient than the second, because it searches through the file system for the selected character set every time it is invoked. The second overload searches through an array of all available character sets. In this method, the file system need only be searched once — when CreateArrayOfCharacterSetsAvailableLC() or CreateArrayOfCharacterSetsAvailableL() is used to create the array.

The conversion functions allow users of this class to perform partial conversions on an input descriptor, handling the situation where the input descriptor is truncated midway through a multi-byte character. This means that you do not have to guess how big to make the output descriptor for a given input descriptor: you can simply do the conversion in a loop using a small output descriptor. The ability to handle truncated descriptors also allows users of the class to convert information received in chunks from an external source.
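
The loop described above can be sketched in portable C++. This is an illustration of the pattern only, not Symbian code: ToyConvertFromUnicode() and ConvertInChunks() are hypothetical stand-ins, the toy target encoding is UTF-8 (BMP characters only), and std::string and std::u16string take the place of descriptors. The only behaviour borrowed from this page is the return convention: the number of input characters left unconverted, or a negative value on error.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ConvertFromUnicode(), NOT the Symbian API.
// Encodes UTF-16 (BMP only) as UTF-8 into `out`, writing at most `capacity`
// bytes. Returns the number of input units left unconverted at the end of
// `in`, or -1 if the input is ill-formed (here: any surrogate code unit).
int ToyConvertFromUnicode(std::string& out, std::size_t capacity,
                          const std::u16string& in)
{
    out.clear();
    std::size_t consumed = 0;
    for (; consumed < in.size(); ++consumed) {
        const unsigned c = in[consumed];
        if (c >= 0xD800 && c <= 0xDFFF) return -1;
        const std::size_t need = c < 0x80 ? 1 : (c < 0x800 ? 2 : 3);
        if (out.size() + need > capacity) break;  // output full: stop here
        if (need == 1) {
            out.push_back(static_cast<char>(c));
        } else if (need == 2) {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (c >> 12)));
            out.push_back(static_cast<char>(0x80 | ((c >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return static_cast<int>(in.size() - consumed);
}

// Converts all of `unicode` through a small, fixed-capacity output buffer.
std::string ConvertInChunks(const std::u16string& unicode, std::size_t capacity)
{
    assert(capacity >= 3);  // one 3-byte character must fit, or no progress
    std::string result, chunk;
    std::u16string remaining = unicode;
    for (;;) {
        const int left = ToyConvertFromUnicode(chunk, capacity, remaining);
        if (left < 0) throw std::runtime_error("ill-formed input");
        result += chunk;                // flush this chunk
        if (left == 0) break;           // everything consumed
        remaining = remaining.substr(remaining.size() - left);
    }
    return result;
}
```

The caller never needs to know the final output length: each pass fills the small buffer, flushes it, and retries with whatever input was left over.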

The class also provides a number of utility functions.

Derivation

CBase: Base class for all classes to be instantiated on the heap
CCnvCharacterSetConverter: Converts text between Unicode and other character sets

Defined in CCnvCharacterSetConverter:
Anonymous, AsciiConversionData(), AutoDetectCharacterSetL(), ConvertCharacterSetIdentifierToMibEnumL(), ConvertCharacterSetIdentifierToStandardNameL(), ConvertFromUnicode(), ConvertMibEnumOfCharacterSetToIdentifierL(), ConvertStandardNameOfCharacterSetToIdentifier(), ConvertStandardNameOfCharacterSetToIdentifierL(), ConvertToUnicode(), CreateArrayOfCharacterSetsAvailableL(), CreateArrayOfCharacterSetsAvailableLC(), DoConvertFromUnicode(), DoConvertToUnicode(), EAvailable, EBigEndian, EDowngradeExoticLineTerminatingCharactersToCarriageReturnLineFeed, EDowngradeExoticLineTerminatingCharactersToJustLineFeed, EErrorIllFormedInput, EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable, EInputConversionFlagAppend, EInputConversionFlagStopAtFirstUnconvertibleCharacter, ELittleEndian, ENotAvailable, EOutputConversionFlagInputIsTruncated, KStateDefault, NewL(), NewLC(), PrepareToConvertToOrFromL(), SCharacterSet, SetDefaultEndiannessOfForeignCharacters(), SetDowngradeForExoticLineTerminatingCharacters(), SetReplacementForUnconvertibleUnicodeCharactersL(), TArrayOfAscendingIndices, TAvailability, TDowngradeForExoticLineTerminatingCharacters, TEndianness, TError, ~CCnvCharacterSetConverter()

Inherited from CBase:
operator new()


Construction and destruction


NewL()

static CCnvCharacterSetConverter* NewL();

Description

Allocates and constructs a CCnvCharacterSetConverter object. If there is insufficient memory to create the object, the function leaves.

Since the memory is allocated on the heap, objects of this type should be destroyed using the delete operator when the required conversions are complete.

Return value

CCnvCharacterSetConverter*

The newly created object.


NewLC()

static CCnvCharacterSetConverter* NewLC();

Description

Allocates and constructs a CCnvCharacterSetConverter object, and leaves the object on the cleanup stack. If there is insufficient memory to create the object, the function leaves.

Since the memory is allocated on the heap, objects of this type should be destroyed using either the CleanupStack::Pop() function and then the delete operator, or the CleanupStack::PopAndDestroy() function.

Return value

CCnvCharacterSetConverter*

The newly created object.


~CCnvCharacterSetConverter()

virtual ~CCnvCharacterSetConverter();

Description

The destructor frees all resources owned by the object, prior to its destruction.


Preparing to convert


PrepareToConvertToOrFromL()

TAvailability PrepareToConvertToOrFromL(TUint aCharacterSetIdentifier, RFs& aFileServerSession);

Description

Specifies the character set to convert to or from. aCharacterSetIdentifier is a UID which identifies a character set. It can be one of the character sets for which conversion is built into EPOC, or it may be a character set for which conversion is implemented by a plug-in DLL. In the latter case, the function searches through the file system for the DLL which implements the character conversion.

Either this function or its overload must be called before using the conversion functions ConvertFromUnicode() or ConvertToUnicode().

This overload of the function is simpler to use than the other and does not panic if the character set with the specified UID is not available at run time — it simply returns ENotAvailable. It should be used when the conversion character set is specified within the text object being converted, e.g. an email message, or an HTML document. If the character set is not specified, the user must be presented with a list of all available sets, so it makes sense to use the other overload.

The function may need to search the file system each time it is called. If conversion takes place repeatedly over a short period, it may be more efficient to use the other overload.

Parameters

TUint aCharacterSetIdentifier

The UID of the non-Unicode character set from or to which to convert. Must not be zero, or a panic occurs.

RFs& aFileServerSession

A file server session.

Return value

TAvailability

The availability of the specified character set. If EAvailable is returned, then the conversion functions ConvertToUnicode() and ConvertFromUnicode() will use aCharacterSetIdentifier as the foreign character set. If ENotAvailable is returned, then the foreign character set will either be undefined (and trying to use the conversion functions will cause a panic), or if it has previously been set, it will remain unchanged.


PrepareToConvertToOrFromL()

void PrepareToConvertToOrFromL(TUint aCharacterSetIdentifier, const CArrayFix<SCharacterSet>& aArrayOfCharacterSetsAvailable, RFs& aFileServerSession);

Description

Specifies the character set to convert to or from. aCharacterSetIdentifier is a UID which identifies a character set. It can be one of the character sets for which conversion is built into EPOC, or it may be a character set for which the conversion is implemented by a plug-in DLL.

The function searches the character set array specified (aArrayOfCharacterSetsAvailable). This is an array containing all of the character sets for which conversion is available. It is created by calling CreateArrayOfCharacterSetsAvailableL() or CreateArrayOfCharacterSetsAvailableLC(). You should be sure that conversion is available for aCharacterSetIdentifier, because if not, a panic occurs. Otherwise, use the other overload of this function.

Either this function or its overload must be called before using the conversion functions ConvertFromUnicode() or ConvertToUnicode().

Unlike the other overload, this function does not search the file system for plug-in conversion DLLs (unless aArrayOfCharacterSetsAvailable is NULL). This function should be used if conversions are to be performed often, or if the conversion character set is to be selected by the user. Generating the array of all the available character sets once and searching through it is more efficient than the method used by the other overload, in which the file system may be searched every time it is invoked.
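
The cost difference between the two overloads can be modelled in portable C++. Everything in this sketch is a hypothetical stand-in rather than the Symbian API: ToyScanFileSystem() plays the role of the plug-in search (and of CreateArrayOfCharacterSetsAvailableL()), the identifiers 0x1001 and 0x1002 are invented, and an exception stands in for the panic raised when the character set is missing from the array.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// All names below are hypothetical stand-ins, not the Symbian API.
struct ToyCharacterSet { unsigned id; };

static int gFileSystemScans = 0;  // counts simulated plug-in searches

// Simulates the expensive search for built-in and plug-in converters.
std::vector<ToyCharacterSet> ToyScanFileSystem()
{
    ++gFileSystemScans;
    std::vector<ToyCharacterSet> sets;
    sets.push_back(ToyCharacterSet{0x1001u});
    sets.push_back(ToyCharacterSet{0x1002u});
    return sets;
}

static bool Contains(const std::vector<ToyCharacterSet>& sets, unsigned id)
{
    for (std::size_t i = 0; i < sets.size(); ++i)
        if (sets[i].id == id) return true;
    return false;
}

// Models the first overload: searches on every call, reports availability.
bool ToyPrepareByScan(unsigned id)
{
    return Contains(ToyScanFileSystem(), id);
}

// Models the second overload: searches only the array built beforehand.
// A missing identifier is a programming error (a panic in the real class).
void ToyPrepareFromArray(unsigned id,
                         const std::vector<ToyCharacterSet>& available)
{
    if (!Contains(available, id))
        throw std::logic_error("character set not in array");
}
```

In this model, any number of ToyPrepareFromArray() calls costs a single scan, whereas each ToyPrepareByScan() call pays for a scan of its own.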

Parameters

TUint aCharacterSetIdentifier

The UID of the non-Unicode character set from or to which to convert. Must not be zero, or a panic occurs.

const CArrayFix<SCharacterSet>& aArrayOfCharacterSetsAvailable

Array of all character sets for which conversion is available — created by either CreateArrayOfCharacterSetsAvailableLC() or CreateArrayOfCharacterSetsAvailableL().

RFs& aFileServerSession

A file server session.


CreateArrayOfCharacterSetsAvailableL()

static CArrayFix<SCharacterSet>* CreateArrayOfCharacterSetsAvailableL(RFs& aFileServerSession);

Description

Creates an array identifying all the character sets for which conversion is available. These can be character sets for which conversion is built into EPOC, or they may be character sets for which conversion is implemented by a plug-in DLL.

The array returned can be used by one of the PrepareToConvertToOrFromL() overloads to provide a list of all the character sets available for conversion. The caller of this function is responsible for deleting the array, and should not modify it.

Parameters

RFs& aFileServerSession

A file server session. This is used for searching for character conversion plug-in DLLs.

Return value

CArrayFix<SCharacterSet>*

An array identifying all supported character sets.


CreateArrayOfCharacterSetsAvailableLC()

static CArrayFix<SCharacterSet>* CreateArrayOfCharacterSetsAvailableLC(RFs& aFileServerSession);

Description

Creates an array identifying all the character sets for which conversion is available and pushes a pointer to it onto the cleanup stack. These can be character sets for which conversion is built into EPOC, or they may be character sets for which conversion is implemented by a plug-in DLL.

The array returned can be used by one of the PrepareToConvertToOrFromL() overloads to provide a list of all the character sets available for conversion. The caller of this function is responsible for deleting the array, and should not modify it.

Parameters

RFs& aFileServerSession

A file server session. This is used for searching for character conversion plug-in DLLs.

Return value

CArrayFix<SCharacterSet>*

An array identifying all supported character sets. A pointer to the array is left on the cleanup stack.


Conversion functions


ConvertFromUnicode()

TInt ConvertFromUnicode(TDes8& aForeign, const TDesC16& aUnicode) const;
TInt ConvertFromUnicode(TDes8& aForeign, const TDesC16& aUnicode, TInt& aNumberOfUnconvertibleCharacters) const;
TInt ConvertFromUnicode(TDes8& aForeign, const TDesC16& aUnicode, TInt& aNumberOfUnconvertibleCharacters, TInt& aIndexOfFirstUnconvertibleCharacter) const;

Description

Converts text encoded in the Unicode character set (UCS-2) into other character sets.

The first overload of the function simply performs the conversion. The second overload converts the text and gets the number of characters that could not be converted. The third overload converts the text, gets the number of characters that could not be converted, and also gets the index of the first character that could not be converted. A fourth overload was introduced in v6.0 — see below.

All overloads cause a panic if no target character set has been selected to convert to (i.e. either overload of PrepareToConvertToOrFromL() must have been successfully called beforehand). You may also need to call SetDefaultEndiannessOfForeignCharacters() to define the endian-ness of the output descriptor.

Parameters

TDes8& aForeign

On return, contains the converted text in a non-Unicode character set.

const TDesC16& aUnicode

The source Unicode text to be converted.

TInt& aNumberOfUnconvertibleCharacters

On return, contains the number of characters which could not be converted.

TInt& aIndexOfFirstUnconvertibleCharacter

On return, contains the index of the first character in the input text that could not be converted. The value is negative if all characters were converted.

Return value

TInt

The number of unconverted characters left at the end of the input descriptor (e.g. because the output descriptor is not long enough to hold all the text), or one of the error values defined in TError.
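
The bookkeeping performed by the third overload can be illustrated with a portable C++ model. ToyToAscii() is a hypothetical stand-in, not the Symbian function: it converts to 7-bit ASCII, treats every character above 0x7F as unconvertible, and takes the replacement character as an explicit argument. It mirrors the documented conventions that the count of unconvertible characters is reported separately and that the index is negative when every character was converted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the Symbian API: converts UTF-16 to 7-bit ASCII.
// Units above 0x7F are unconvertible and are written as `replacement`.
// Returns the number of input units left unconverted, which is always 0
// here because the toy output string grows as needed.
int ToyToAscii(std::string& out, const std::u16string& in, char replacement,
               int& numberOfUnconvertible, int& indexOfFirstUnconvertible)
{
    out.clear();
    numberOfUnconvertible = 0;
    indexOfFirstUnconvertible = -1;  // negative: everything was converted
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] < 0x80) {
            out.push_back(static_cast<char>(in[i]));
            continue;
        }
        out.push_back(replacement);
        if (numberOfUnconvertible++ == 0)
            indexOfFirstUnconvertible = static_cast<int>(i);
    }
    return 0;
}
```

Note that the return value and the unconvertible count answer different questions: the first says how much input was not reached, the second says how much of the input that was reached had no equivalent in the target character set.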


ConvertFromUnicode()

TInt ConvertFromUnicode(TDes8& aForeign, const TDesC16& aUnicode, TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters) const;

Support

Supported from 6.0

Description

Converts Unicode text into another character set.

Differs from the other overloads of this function by returning the indices of all of the characters in the source Unicode text which could not be converted.

Parameters

TDes8& aForeign

On return, contains the converted text in a non-Unicode character set.

const TDesC16& aUnicode

The source Unicode text to be converted.

TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters

On return, holds the indices of each Unicode character in the source text which could not be converted.

Return value

TInt

The number of unconverted characters left at the end of the input descriptor (e.g. because the output descriptor is not long enough to hold all the text), or one of the error values defined in TError.


ConvertToUnicode()

TInt ConvertToUnicode(TDes16& aUnicode, const TDesC8& aForeign, TInt& aState) const;
TInt ConvertToUnicode(TDes16& aUnicode, const TDesC8& aForeign, TInt& aState, TInt& aNumberOfUnconvertibleCharacters) const;
TInt ConvertToUnicode(TDes16& aUnicode, const TDesC8& aForeign, TInt& aState, TInt& aNumberOfUnconvertibleCharacters, TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter) const;

Description

Converts text encoded in a non-Unicode character set into the Unicode character set (UCS-2).

The first overload of the function simply performs the conversion. The second overload converts the text and gets the number of bytes in the input string that could not be converted. The third overload converts the text, gets the number of bytes that could not be converted, and also gets the index of the first byte that could not be converted.

All overloads cause a panic if no source character set has been selected to convert from (i.e. either overload of PrepareToConvertToOrFromL() must have been successfully called beforehand). You may also need to call SetDefaultEndiannessOfForeignCharacters() to define the endian-ness of the input descriptor.

Parameters

TDes16& aUnicode

On return, contains the converted text in the Unicode character set.

const TDesC8& aForeign

The non-Unicode source text to be converted.

TInt& aState

This is used to save state information across multiple calls to ConvertToUnicode(). Initialise the value to KStateDefault, then leave it unchanged throughout a series of related calls.

TInt& aNumberOfUnconvertibleCharacters

On return, contains the number of bytes which were not converted.

TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter

On return, the index of the first byte of the first unconvertible character. For instance if the first character in the input descriptor (aForeign) could not be converted, then this parameter is set to the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.

Return value

TInt

The number of unconverted bytes left at the end of the input descriptor (e.g. because the output descriptor is not long enough to hold all the text), or one of the error values defined in TError.
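
Because the return value reports how many bytes were left unconverted, input that arrives in chunks can be handled by carrying the unconverted tail over to the next chunk. The portable C++ sketch below models this; it is not Symbian code. ToyConvertToUnicode() and ToyDecodeChunks() are hypothetical names, the toy source encoding is UTF-8 restricted to sequences of one to three bytes, and the aState parameter is omitted because this toy encoding carries no state between calls.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for ConvertToUnicode(), NOT the Symbian API.
// Decodes UTF-8 (1 to 3 byte sequences) and appends the result to `out`.
// Returns the number of trailing bytes left unconverted because the last
// sequence is incomplete, or -1 if the input is ill-formed.
int ToyConvertToUnicode(std::u16string& out, const std::string& in)
{
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned b = static_cast<unsigned char>(in[i]);
        std::size_t len;
        unsigned cp;
        if (b < 0x80)                { len = 1; cp = b; }
        else if ((b & 0xE0) == 0xC0) { len = 2; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; }
        else return -1;                  // unsupported or stray byte
        if (i + len > in.size()) break;  // truncated mid-character
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned t = static_cast<unsigned char>(in[i + k]);
            if ((t & 0xC0) != 0x80) return -1;
            cp = (cp << 6) | (t & 0x3F);
        }
        out.push_back(static_cast<char16_t>(cp));
        i += len;
    }
    return static_cast<int>(in.size() - i);
}

// Decodes a sequence of chunks, carrying any unconverted tail forward.
std::u16string ToyDecodeChunks(const std::vector<std::string>& chunks)
{
    std::u16string result;
    std::string pending;  // bytes left over from the previous chunk
    for (std::size_t n = 0; n < chunks.size(); ++n) {
        const std::string input = pending + chunks[n];
        const int left = ToyConvertToUnicode(result, input);
        if (left < 0) throw std::runtime_error("ill-formed input");
        pending = input.substr(input.size() - left);
    }
    if (!pending.empty()) throw std::runtime_error("input ended mid-character");
    return result;
}
```

With a stateful encoding the same loop would also pass one state variable, initialised once, to every call in the series.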


DoConvertFromUnicode()

static TInt DoConvertFromUnicode(const SCnvConversionData& aConversionData, TEndianness aDefaultEndiannessOfForeignCharacters, const TDesC8& aReplacementForUnconvertibleUnicodeCharacters, TDes8& aForeign, const TDesC16& aUnicode, TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters);

Support

Supported from 6.0

Description

Converts Unicode text into another character set. The Unicode text specified in aUnicode is converted using the conversion data object (aConversionData) provided by the plug-in for the foreign character set, and the converted text is returned in aForeign.

Note

This is a utility function that should only be called from a plug-in conversion library's implementation of ConvertFromUnicode(). Users of the character conversion API should use one of the overloads of ConvertFromUnicode() instead.

Parameters

const SCnvConversionData& aConversionData

The conversion data object. Typically, you should specify conversionData, as declared in convgeneratedcpp.h. This is the SCnvConversionData object which is created in the cnvtool-generated .cpp file (although for some complex character sets you may want to pass other SCnvConversionData objects into this parameter).

TEndianness aDefaultEndiannessOfForeignCharacters

The default endian-ness to use when writing the characters in the foreign character set. If an endian-ness for foreign characters is specified in aConversionData (i.e. not SCnvConversionData::EUnspecified), then that value is used and the value of aDefaultEndiannessOfForeignCharacters is ignored.

const TDesC8& aReplacementForUnconvertibleUnicodeCharacters

The single character which is to be used to replace unconvertible characters.

TDes8& aForeign

On return, contains the converted text in a non-Unicode character set.

const TDesC16& aUnicode

The source Unicode text to be converted.

TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters

On return holds the indices of each Unicode character in the source text which could not be converted (because the target character set does not have an equivalent character).

Return value

TInt

The number of unconverted characters left at the end of the input descriptor (e.g. because aForeign was not long enough to hold all the text), or a negative error value, as defined in TError.


DoConvertFromUnicode()

static TInt DoConvertFromUnicode(const SCnvConversionData& aConversionData, TEndianness aDefaultEndiannessOfForeignCharacters, const TDesC8& aReplacementForUnconvertibleUnicodeCharacters, TDes8& aForeign, const TDesC16& aUnicode, TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters, TUint& aOutputConversionFlags, TUint aInputConversionFlags);

Support

Supported from 6.0

Description

Converts Unicode text into another character set. The Unicode text specified in aUnicode is converted using the conversion data object (aConversionData) provided by the plug-in for the foreign character set, and the converted text is returned in aForeign.

This overload differs from the previous one in that it allows the caller to specify flags which give more control over the conversion.

Note

This is a utility function that should only be called from a plug-in conversion library's implementation of ConvertFromUnicode(). Users of the character conversion API should use one of the overloads of ConvertFromUnicode() instead.

Parameters

const SCnvConversionData& aConversionData

The conversion data object. Typically, you should specify conversionData, as declared in convgeneratedcpp.h. This is the SCnvConversionData object which is created in the cnvtool-generated .cpp file (although for some complex character sets you may want to pass other SCnvConversionData objects into this parameter).

TEndianness aDefaultEndiannessOfForeignCharacters

The default endian-ness to use when writing the characters in the foreign character set. If an endian-ness for foreign characters is specified in aConversionData (i.e. not SCnvConversionData::EUnspecified), then that value is used and the value of aDefaultEndiannessOfForeignCharacters is ignored.

const TDesC8& aReplacementForUnconvertibleUnicodeCharacters

The single character which is to be used to replace unconvertible characters. If aInputConversionFlags is set to EInputConversionFlagStopAtFirstUnconvertibleCharacter, this replacement character is used to replace the first unconvertible character, then the conversion will stop.

TDes8& aForeign

On return, contains the converted text in a non-Unicode character set. This may already contain some text. If it does, and if aInputConversionFlags specifies EInputConversionFlagAppend, then the converted text is appended to this descriptor.

const TDesC16& aUnicode

The source Unicode text to be converted.

TArrayOfAscendingIndices& aIndicesOfUnconvertibleCharacters

On return holds the indices of each Unicode character in the source descriptor aUnicode which could not be converted (because the target character set does not have an equivalent character).

TUint& aOutputConversionFlags

If the input descriptor ended in a truncated sequence, e.g. the first half of a Unicode surrogate pair, aOutputConversionFlags returns with the EOutputConversionFlagInputIsTruncated flag set.

TUint aInputConversionFlags

Specify EInputConversionFlagAppend to append the text in aUnicode to aForeign. Specify EInputConversionFlagStopAtFirstUnconvertibleCharacter to stop converting when the first unconvertible character is reached. Specify EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable to prevent the function from returning the error-code EErrorIllFormedInput when the input descriptor consists of nothing but a truncated sequence.

Return value

TInt

The number of unconverted characters left at the end of the input descriptor (e.g. because aForeign was not long enough to hold all the text), or a negative error value, as defined in TError.
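
The effect of the append and stop-at-first-unconvertible flags can be modelled in portable C++. This sketch is an assumption-laden illustration, not the Symbian function: ToyDoConvertFromUnicode() is a hypothetical name, the flag values are invented (the real constants are declared in charconv.h), the target is 7-bit ASCII, and a std::vector<int> takes the place of TArrayOfAscendingIndices. It assumes that, after stopping, the return value counts the characters that were never reached.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flag values; the real constants are defined in charconv.h.
enum ToyInputFlag {
    EToyFlagAppend                   = 0x1,
    EToyFlagStopAtFirstUnconvertible = 0x2
};

// Hypothetical stand-in, not the Symbian API: converts UTF-16 to 7-bit ASCII.
// Returns the number of input units left unconverted at the end of `unicode`.
int ToyDoConvertFromUnicode(std::string& foreign, const std::u16string& unicode,
                            char replacement, std::vector<int>& indices,
                            unsigned inputFlags)
{
    if (!(inputFlags & EToyFlagAppend)) foreign.clear();  // overwrite by default
    indices.clear();
    std::size_t i = 0;
    while (i < unicode.size()) {
        const char16_t c = unicode[i];
        ++i;
        if (c < 0x80) {
            foreign.push_back(static_cast<char>(c));
            continue;
        }
        foreign.push_back(replacement);              // replace, then record
        indices.push_back(static_cast<int>(i - 1));
        if (inputFlags & EToyFlagStopAtFirstUnconvertible) break;
    }
    return static_cast<int>(unicode.size() - i);
}
```

In this model the replacement character is still written for the first unconvertible character before conversion stops, matching the description of aReplacementForUnconvertibleUnicodeCharacters above.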


DoConvertToUnicode()

static TInt DoConvertToUnicode(const SCnvConversionData& aConversionData, TEndianness aDefaultEndiannessOfForeignCharacters, TDes16& aUnicode, const TDesC8& aForeign, TInt& aNumberOfUnconvertibleCharacters, TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter);

Support

Supported from 6.0

Description

Converts non-Unicode text into Unicode. The non-Unicode text specified in aForeign is converted using the conversion data object (aConversionData) provided by the plug-in for the foreign character set, and the converted text is returned in aUnicode.

Parameters

const SCnvConversionData& aConversionData

The conversion data object. Typically, you should specify conversionData, as declared in convgeneratedcpp.h. This is the SCnvConversionData object which is created in the cnvtool-generated .cpp file (although for some complex character sets you may want to pass other SCnvConversionData objects into this parameter).

TEndianness aDefaultEndiannessOfForeignCharacters

The default endian-ness of the foreign characters. If an endian-ness for foreign characters is specified in aConversionData, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.

TDes16& aUnicode

On return, contains the text converted into Unicode.

const TDesC8& aForeign

The non-Unicode source text to be converted.

TInt& aNumberOfUnconvertibleCharacters

On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xFFFD).

TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter

On return, the index of the first byte of the first unconvertible character. For instance if the first character in the input descriptor (aForeign) could not be converted, then this parameter is set to the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.

Return value

TInt

The number of unconverted bytes left at the end of the input descriptor, or a negative error value, as defined in TError.


DoConvertToUnicode()

static TInt DoConvertToUnicode(const SCnvConversionData& aConversionData, TEndianness aDefaultEndiannessOfForeignCharacters, TDes16& aUnicode, const TDesC8& aForeign, TInt& aNumberOfUnconvertibleCharacters, TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter, TUint& aOutputConversionFlags, TUint aInputConversionFlags);

Support

Supported from 6.0

Description

Converts non-Unicode text into Unicode. The non-Unicode text specified in aForeign is converted using the conversion data object (aConversionData) provided by the plug-in for the foreign character set, and the converted text is returned in aUnicode.

This overload differs from the previous one in that it allows the caller to specify flags which give more control over the conversion.

Parameters

const SCnvConversionData& aConversionData

The conversion data object. Typically, you should specify conversionData, as declared in convgeneratedcpp.h. This is the SCnvConversionData object which is created in the cnvtool-generated .cpp file (although for some complex character sets you may want to pass other SCnvConversionData objects into this parameter).

TEndianness aDefaultEndiannessOfForeignCharacters

The default endian-ness of the foreign characters. If an endian-ness for foreign characters is specified in aConversionData, then that is used instead and the value of aDefaultEndiannessOfForeignCharacters is ignored.

TDes16& aUnicode

On return, contains the text converted into Unicode.

const TDesC8& aForeign

The non-Unicode source text to be converted.

TInt& aNumberOfUnconvertibleCharacters

On return, contains the number of characters in aForeign which were not converted. Characters which cannot be converted are output as Unicode replacement characters (0xFFFD).

TInt& aIndexOfFirstByteOfFirstUnconvertibleCharacter

On return, the index of the first byte of the first unconvertible character. For instance if the first character in the input descriptor (aForeign) could not be converted, then this parameter is set to the first byte of that character, i.e. zero. A negative value is returned if all the characters were converted.

TUint& aOutputConversionFlags

If the input descriptor ended in a truncated sequence, e.g. an incomplete multi-byte character, aOutputConversionFlags returns with the EOutputConversionFlagInputIsTruncated flag set.

TUint aInputConversionFlags

Specify EInputConversionFlagAppend to append the converted text to aUnicode, otherwise the contents of aUnicode are overwritten. Specify EInputConversionFlagStopAtFirstUnconvertibleCharacter to stop converting when the first unconvertible character is reached. Specify EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable to prevent the function from returning the error-code EErrorIllFormedInput when the input descriptor consists of nothing but a truncated sequence.

Return value

TInt

The number of unconverted bytes left at the end of the input descriptor, or a negative error value defined in TError.


AsciiConversionData()

static const SCnvConversionData& AsciiConversionData();

Support

Supported from 6.0

Description

Returns a ready-made SCnvConversionData object for converting between Unicode and ASCII. This can be passed into the aConversionData parameter to DoConvertFromUnicode() or DoConvertToUnicode().

Note

This utility function should only be called by a plug-in conversion library.

Return value

SCnvConversionData&

ASCII conversion data object.


Utility functions


AutoDetectCharacterSetL()

static void AutoDetectCharacterSetL(TInt& aConfidenceLevel, TUint& aCharacterSetIdentifier, const CArrayFix<SCharacterSet>& aArrayOfCharacterSetsAvailable, const TDesC8& aSample);

Support

Supported from 6.1

Description

Attempts to determine the non-Unicode character set of the sample text from those supported on the phone.

For each of the available character sets, its implementation of IsInThisCharacterSetL() is called. The character set which returns the highest confidence level (i.e. which generates the fewest 0xFFFD Unicode replacement characters) is returned in aCharacterSetIdentifier.

This function merely determines if the sample text is convertible with this converter: it does no textual analysis on the result. Therefore, this function is not capable of differentiating between very similar encodings (for example the different ISO 8859 variants).

Any code making use of this function should provide a way for the user to override the selection that this function makes.

Please note that the operation of this function is slow, and takes no account of the usual context that would be used in guessing a character set (for example, the language that is expected to be encoded or the transport used). For situations where such context is known, a faster, more accurate solution is advisable.
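
The selection rule described above (ask every available character set for a confidence level and keep the highest) can be modelled in portable C++. The sketch below is hypothetical throughout: ToyCandidate, the two scoring functions and the identifiers 1 and 2 are invented for illustration, and the scores are simple byte-range percentages rather than the result of a trial conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins, not the Symbian API. Each candidate scores a
// byte sample from 0 (no idea) to 100 (total confidence).
struct ToyCandidate {
    unsigned id;
    int (*confidence)(const std::string& sample);
};

// Percentage of bytes that are valid 7-bit ASCII.
int ToyAsciiConfidence(const std::string& s)
{
    if (s.empty()) return 0;
    std::size_t ok = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        if (static_cast<unsigned char>(s[i]) < 0x80) ++ok;
    return static_cast<int>(100 * ok / s.size());
}

// Percentage of bytes outside the 0x80..0x9F control range.
int ToyLatin1Confidence(const std::string& s)
{
    if (s.empty()) return 0;
    std::size_t ok = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (b < 0x80 || b > 0x9F) ++ok;
    }
    return static_cast<int>(100 * ok / s.size());
}

// Keeps the candidate with the highest confidence; on a tie the earlier
// candidate wins. `id` is only meaningful when `confidence` is above 0.
void ToyAutoDetect(int& confidence, unsigned& id,
                   const std::vector<ToyCandidate>& candidates,
                   const std::string& sample)
{
    confidence = 0;
    id = 0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const int c = candidates[i].confidence(sample);
        if (c > confidence) { confidence = c; id = candidates[i].id; }
    }
}
```

A pure ASCII sample scores 100 with both toy candidates, so the result depends only on list order. This is the same weakness noted above for closely related encodings, and a reason to let the user override the choice.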

Parameters

TInt& aConfidenceLevel

Set by the function to a value between 0 and 100. 0 indicates the function has no idea what character set aSample is encoded in. In this case, aCharacterSetIdentifier is undefined. 100 indicates total confidence that aCharacterSetIdentifier is the character set of aSample.

TUint& aCharacterSetIdentifier

On return, the UID of the best available character set for the sample text aSample. Character set UIDs are defined in charconv.h.

const CArrayFix<SCharacterSet>& aArrayOfCharacterSetsAvailable

The array of character sets available on the device. If this is not already available, it can be created using CreateArrayOfCharacterSetsAvailableL() or CreateArrayOfCharacterSetsAvailableLC().

const TDesC8& aSample

The non-Unicode sample text string.


ConvertCharacterSetIdentifierToStandardNameL()

HBufC8* ConvertCharacterSetIdentifierToStandardNameL(TUint aCharacterSetIdentifier) const;

Support

Withdrawn in 6.0

Description

Gets the Internet-standard name of a character set, which is identified in EPOC by its UID. The function can be called at any time in the CCnvCharacterSetConverter object’s lifetime.

Parameters

TUint aCharacterSetIdentifier

The UID of the character set.

Return value

HBufC8*

The Internet-standard name or MIME name of the character set, or NULL if the set is not known. The name is encoded in 8 bit ASCII.


ConvertCharacterSetIdentifierToStandardNameL()

HBufC8* ConvertCharacterSetIdentifierToStandardNameL(TUint aCharacterSetIdentifier, RFs& aFileServerSession);

Support

Supported from 6.0

Description

Returns the Internet-standard name of a character set identified in EPOC by a UID.

If the character set specified is not one for which EPOC provides built-in conversion, the file system is searched for plug-ins which implement the conversion, hence the need for a file server session.

Parameters

TUint aCharacterSetIdentifier

The UID of the character set.

RFs& aFileServerSession

A file server session.

Return value

HBufC8*

The Internet-standard name of the character set.


ConvertStandardNameOfCharacterSetToIdentifier()

TUint ConvertStandardNameOfCharacterSetToIdentifier(const TDesC8& aStandardNameOfCharacterSet) const;

Support

Withdrawn in 6.0

Description

Gets the UID of a character set for a given Internet-standard name.

Parameters

const TDesC8& aStandardNameOfCharacterSet

Internet-standard name of a character set.

Return value

TUint

The UID for the character set. If the name is not known, zero is returned.


ConvertStandardNameOfCharacterSetToIdentifierL()

TUint ConvertStandardNameOfCharacterSetToIdentifierL(const TDesC8& aStandardNameOfCharacterSet, RFs& aFileServerSession);

Support

Supported from 6.0

Description

Gets the UID of a character set identified by its Internet-standard name (the matching is case-insensitive).

If the character set specified is not one for which EPOC provides built-in conversion, the function searches the file system for plug-ins which implement the conversion and which provide the name-to-UID mapping information.
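
Case-insensitive name matching of this kind can be sketched in portable C++. The table and ToyNameToIdentifier() below are hypothetical: the three names are genuine Internet-standard character set names, but the identifier values are invented, and returning zero for an unknown name is an assumption borrowed from the withdrawn ConvertStandardNameOfCharacterSetToIdentifier() overload rather than documented behaviour of this function.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical table: real standard names, invented identifier values.
struct ToyNameEntry { const char* name; unsigned id; };

static const ToyNameEntry kToyNames[] = {
    { "US-ASCII",   0x2001u },
    { "ISO-8859-1", 0x2002u },
    { "UTF-8",      0x2003u }
};

// Compares two ASCII names without regard to letter case.
static bool EqualsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int x = std::tolower(static_cast<unsigned char>(a[i]));
        const int y = std::tolower(static_cast<unsigned char>(b[i]));
        if (x != y) return false;
    }
    return true;
}

// Returns the identifier for `standardName`, or 0 if it is not known
// (the zero result is an assumption of this sketch).
unsigned ToyNameToIdentifier(const std::string& standardName)
{
    const std::size_t count = sizeof(kToyNames) / sizeof(kToyNames[0]);
    for (std::size_t i = 0; i < count; ++i)
        if (EqualsIgnoreCase(standardName, kToyNames[i].name))
            return kToyNames[i].id;
    return 0;
}
```

Names taken from email headers or HTML documents vary in case, so a lookup of this kind has to accept "utf-8" and "UTF-8" alike.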

Parameters

const TDesC8& aStandardNameOfCharacterSet

Internet-standard name of a character set.

RFs& aFileServerSession

Connection to a file server session.

Return value

TUint

The UID for the character set.


ConvertMibEnumOfCharacterSetToIdentifierL()

TUint ConvertMibEnumOfCharacterSetToIdentifierL(TInt aMibEnumOfCharacterSet, RFs& aFileServerSession);

Support

Supported from 6.0

Description

Converts a MIB enum value to the UID value of the character set.

If the character set identified is not one for which EPOC provides built-in conversion, the function searches the file system for plug-ins which implement the conversion and which provide the MIB enum-to-UID mapping information.

Parameters

TInt aMibEnumOfCharacterSet

The MIB enum value of the character set.

RFs& aFileServerSession

Connection to a file server session.

Return value

TUint

The UID of the character set.


ConvertCharacterSetIdentifierToMibEnumL()

TInt ConvertCharacterSetIdentifierToMibEnumL(TUint aCharacterSetIdentifier, RFs& aFileServerSession);

Support

Supported from 6.0

Description

Converts the UID of a character set to its MIB enum value.

If the character set identified is not one for which EPOC provides built-in conversion, the function searches the file system for plug-ins which implement the conversion and which provide the UID-to-MIB enum mapping information.

Parameters

TUint aCharacterSetIdentifier

The UID of the character set.

RFs& aFileServerSession

Connection to a file server session.

Return value

TInt

The MIB enum value of the character set.
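
The two MIB enum conversions are inverses of each other over the same mapping data. The sketch below models that in standard C++. The MIB enum values are those of the IANA character set registry; the UIDs, the function names and the use of zero for "not found" are hypothetical and are not taken from charconv.h.

```cpp
#include <cstdint>

struct MibMapping {
    int mibEnum;        // IANA MIB enum value
    std::uint32_t uid;  // placeholder UID, not a real Symbian value
};

static const MibMapping kMibTable[] = {
    {3, 0x00000003},    // US-ASCII
    {4, 0x00000002},    // ISO-8859-1
    {106, 0x00000001},  // UTF-8
};

// Returns the placeholder UID for a MIB enum, or 0 if unknown (assumed).
std::uint32_t MibEnumToUid(int mibEnum)
{
    for (const MibMapping& m : kMibTable) {
        if (m.mibEnum == mibEnum)
            return m.uid;
    }
    return 0;
}

// Returns the MIB enum for a placeholder UID, or 0 if unknown (assumed).
int UidToMibEnum(std::uint32_t uid)
{
    for (const MibMapping& m : kMibTable) {
        if (m.uid == uid)
            return m.mibEnum;
    }
    return 0;
}
```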


SetDefaultEndiannessOfForeignCharacters()

void SetDefaultEndiannessOfForeignCharacters(TEndianness aEndianness);

Description

Sets the default endian-ness used by the ConvertFromUnicode() and ConvertToUnicode() functions to convert between Unicode and non-Unicode character sets.

The endian-ness of a multi-byte character set may be defined in the character set definition or, as in the case of UCS-2, be operating system dependent. If the endian-ness of the current character set is defined by the character set itself, then the default endian-ness specified by this function is ignored.

Parameters

TEndianness aEndianness

The default endian-ness of the current character set.
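
To illustrate what the default endian-ness controls, the following standard C++ sketch serialises 16-bit code units into bytes in either order. It is a model only: the enum and function names are hypothetical stand-ins and do not use the charconv API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

enum Endianness { kLittleEndian, kBigEndian };  // hypothetical stand-in

// Writes each 16-bit code unit as two bytes in the requested byte order.
std::vector<std::uint8_t> SerializeCodeUnits(const std::u16string& text,
                                             Endianness endianness)
{
    std::vector<std::uint8_t> out;
    out.reserve(text.size() * 2);
    for (char16_t unit : text) {
        const std::uint8_t low = static_cast<std::uint8_t>(unit & 0xFF);
        const std::uint8_t high = static_cast<std::uint8_t>((unit >> 8) & 0xFF);
        if (endianness == kLittleEndian) {
            out.push_back(low);   // least significant byte first
            out.push_back(high);
        } else {
            out.push_back(high);  // most significant byte first
            out.push_back(low);
        }
    }
    return out;
}
```

For the code unit 0x20AC, the little-endian output is the byte pair AC 20 and the big-endian output is 20 AC.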


SetDowngradeForExoticLineTerminatingCharacters()

void SetDowngradeForExoticLineTerminatingCharacters(TDowngradeForExoticLineTerminatingCharacters aDowngradeForExoticLineTerminatingCharacters);

Support

Supported from 6.0

Description

Sets how the Unicode 'line separator' and 'paragraph separator' characters (0x2028 and 0x2029 respectively) are converted when converting from Unicode into a foreign character set: either into a carriage return / line feed pair, or into a line feed only. This applies to all foreign character sets that do not contain a direct equivalent of these Unicode character codes.

By default, line and paragraph separators are converted into a CR/LF pair. This function should be called (if at all) after calling PrepareToConvertToOrFromL() and before calling ConvertFromUnicode() and/or ConvertToUnicode().

Parameters

TDowngradeForExoticLineTerminatingCharacters aDowngradeForExoticLineTerminatingCharacters

Specify EDowngradeExoticLineTerminatingCharactersToCarriageReturnLineFeed if line/paragraph separators should be converted into a carriage return and line feed combination and EDowngradeExoticLineTerminatingCharactersToJustLineFeed if they should be converted into line feeds only. Any other value causes the function to panic.
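
The effect of the two settings can be sketched in standard C++. This is an illustrative model, not the converter itself: the target is assumed to be plain 7-bit ASCII, the names are hypothetical, and '?' is an arbitrary stand-in for any other unconvertible character.

```cpp
#include <string>

enum LineDowngrade {               // hypothetical stand-in for the real enum
    kToCarriageReturnLineFeed,
    kToJustLineFeed
};

// Converts UTF-16 text to 7-bit ASCII, downgrading U+2028 and U+2029
// according to the chosen mode.
std::string DowngradeToAscii(const std::u16string& text, LineDowngrade mode)
{
    std::string out;
    for (char16_t unit : text) {
        if (unit == 0x2028 || unit == 0x2029) {
            // Line or paragraph separator: no ASCII equivalent exists.
            out += (mode == kToCarriageReturnLineFeed) ? "\r\n" : "\n";
        } else if (unit < 0x80) {
            out += static_cast<char>(unit);
        } else {
            out += '?';  // placeholder for other unconvertible characters
        }
    }
    return out;
}
```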


SetReplacementForUnconvertibleUnicodeCharactersL()

void SetReplacementForUnconvertibleUnicodeCharactersL(const TDesC8& aReplacementForUnconvertibleUnicodeCharacters);

Description

Sets the character used to replace unconvertible characters in the output descriptor, when converting from Unicode into another character set.

The default replacement for unconvertible Unicode characters is specified in the conversion data for the character set. The replacement text which is set using this function overrides the default value.

Parameters

const TDesC8& aReplacementForUnconvertibleUnicodeCharacters

The single character which is to be used to replace unconvertible characters.
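
As a rough model of the replacement behaviour, the standard C++ sketch below converts UTF-16 text to 7-bit ASCII and substitutes a caller-supplied replacement for every code unit that cannot be represented. The function name and the choice of ASCII as the target are assumptions made for illustration.

```cpp
#include <string>

// Converts UTF-16 text to 7-bit ASCII. Each code unit outside the ASCII
// range is replaced by the supplied replacement text. Note that this
// simple model treats each half of a surrogate pair separately.
std::string ConvertWithReplacement(const std::u16string& text,
                                   const std::string& replacement)
{
    std::string out;
    for (char16_t unit : text) {
        if (unit < 0x80)
            out += static_cast<char>(unit);
        else
            out += replacement;
    }
    return out;
}
```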

[Top]


Enumerations


Enum TEndianness

TEndianness

Description

Specifies the default endian-ness of the current character set. Used by SetDefaultEndiannessOfForeignCharacters().

ELittleEndian

The character set is little-endian.

EBigEndian

The character set is big-endian.


Enum TAvailability

TAvailability

Description

Indicates whether a character set is available or unavailable for conversion. Returned by the overload of PrepareToConvertToOrFromL() that reports availability instead of panicking when the character set is not available.

EAvailable

The requested character set can be converted.

ENotAvailable

The requested character set cannot be converted.


Enum TError

TError

Description

Flags conversion errors. At this stage there is only one error flag; others may be added in the future.

EErrorIllFormedInput

The input descriptor contains a single corrupt character. This error might be returned if the input descriptor only contains some of the bytes of a single multi-byte character.


Enum Anonymous

Anonymous

Description

Sets the initial state of the conversion variable in the ConvertToUnicode() function.

KStateDefault

Initial value for the state argument in a set of related calls to ConvertToUnicode().


Enum TDowngradeForExoticLineTerminatingCharacters

TDowngradeForExoticLineTerminatingCharacters

Support

Supported from 6.0

Description

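Specifies how the Unicode line and paragraph separators are downgraded when converting into a foreign character set that has no direct equivalent. Used by SetDowngradeForExoticLineTerminatingCharacters().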

EDowngradeExoticLineTerminatingCharactersToCarriageReturnLineFeed

Paragraph/line separators should be downgraded (if necessary) into carriage return and line feed pairs.

EDowngradeExoticLineTerminatingCharactersToJustLineFeed

Paragraph/line separators should be downgraded (if necessary) into a line feed only.


Enum Anonymous

Anonymous

Support

Supported from 6.0

Description

Input flags used to control a character set conversion.

Note

This enumeration can be used in the DoConvertToUnicode() and DoConvertFromUnicode() functions. These are part of the Character Conversion Plug-in Provider API and are for use by plug-in conversion libraries only.

EInputConversionFlagAppend

Appends the converted text to the output descriptor.

EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable

By default, when the input descriptor passed to DoConvertFromUnicode() or DoConvertToUnicode() consists of nothing but a truncated sequence, the error-code EErrorIllFormedInput is returned. If this behaviour is undesirable, the input flag EInputConversionFlagAllowTruncatedInputNotEvenPartlyConsumable should be set.

EInputConversionFlagStopAtFirstUnconvertibleCharacter

Stops converting when the first unconvertible character is reached.


Enum Anonymous

Anonymous

Support

Supported from 6.0

Description

Output flag used to indicate whether or not the source descriptor ends in a truncated sequence, for example when the last character to convert is the first half of a surrogate pair.

Note

This enumeration can be used in the DoConvertToUnicode() and DoConvertFromUnicode() functions. These are part of the Character Conversion Plug-in Provider API and are for use by plug-in conversion libraries only.

EOutputConversionFlagInputIsTruncated

Indicates whether or not the source descriptor ends in a truncated sequence, e.g. the first half only of a surrogate pair.
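
The surrogate-pair case mentioned above can be detected by inspecting the final code unit. The standard C++ sketch below shows one way to do this for UTF-16 input. The function names are hypothetical and the code does not use the plug-in provider API.

```cpp
#include <cstddef>
#include <string>

// Returns true if the text ends with a high (leading) surrogate,
// i.e. the first half of a surrogate pair whose second half is missing.
bool EndsWithTruncatedSurrogatePair(const std::u16string& text)
{
    if (text.empty())
        return false;
    const char16_t last = text.back();
    return last >= 0xD800 && last <= 0xDBFF;
}

// Returns how many code units can be converted now; a trailing high
// surrogate is held back until the next chunk of input arrives.
std::size_t ConsumableLength(const std::u16string& text)
{
    return EndsWithTruncatedSurrogatePair(text) ? text.size() - 1 : text.size();
}
```

A caller receiving text in chunks could convert the first ConsumableLength() code units and prepend the remainder to the next chunk.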

[Top]


Structs


Struct SCharacterSet

SCharacterSet

Description

Stores information about a non-Unicode character set. The information is used to locate the conversion information required by ConvertFromUnicode() and ConvertToUnicode().

An array of these structs, containing all available character sets, can be generated by CreateArrayOfCharacterSetsAvailableLC() and CreateArrayOfCharacterSetsAvailableL(), and is used by one of the overloads of PrepareToConvertToOrFromL().

Defined in CCnvCharacterSetConverter::SCharacterSet:
Identifier(), Name(), NameIsFileName()

Identifier()


TUint Identifier() const;

Description

Gets the character set’s UID.

Return value

TUint

The UID of the character set.

Name()


TPtrC Name() const;

Description

Gets the full path and filename of the DLL which implements conversion for the character set.

If the character set is one for which conversion is built into EPOC rather than implemented by a plug-in DLL, the function just returns the name of the character set. The NameIsFileName() function can be used to determine whether or not it is legal to create a TParsePtrC object over the descriptor returned by Name().

Return value

TPtrC

Full path and filename of the character set converter plug-in DLL, or just the name of the character set.

NameIsFileName()


TBool NameIsFileName() const;

Description

Tests whether the name returned by SCharacterSet::Name() is a real file name (i.e. conversion is provided by a plug-in DLL), or just the character set name (i.e. conversion is built into EPOC).

Return value

TBool

ETrue if the name is a real filename. EFalse if it is just the character set name.

[Top]


Classes


Class TArrayOfAscendingIndices

TArrayOfAscendingIndices

Support

Supported from 6.0

Description

Holds an ascending array of the indices of the characters in the source Unicode text which could not be converted by CCnvCharacterSetConverter::ConvertFromUnicode() into the foreign character set.

Defined in CCnvCharacterSetConverter::TArrayOfAscendingIndices:
AppendIndex(), EAppendFailed, EAppendSuccessful, NumberOfIndices(), Remove(), RemoveAll(), TAppendResult, TArrayOfAscendingIndices(), operator[]()

Construction

TArrayOfAscendingIndices()


TArrayOfAscendingIndices();

Description

C++ constructor. The array is initialised to be of length zero.

Append/delete

AppendIndex()


TAppendResult AppendIndex(TInt aIndex);

Description

Appends an index to the array of indices. The value of aIndex should be greater than that of the last index in the array, to maintain an ascending array. The return value should be tested to see whether the function succeeded or not.

Parameters

TInt aIndex

The index to append to the array.

Return value

TAppendResult

EAppendFailed if the append failed, or EAppendSuccessful if it succeeded.

Remove()


void Remove(TInt aIndexOfIndex);

Description

Deletes a single index from the array.

Parameters

TInt aIndexOfIndex

The index of the index to delete. Must not be negative and must not be greater than the length of the array, or a panic occurs.

RemoveAll()


void RemoveAll();

Support

Supported from 6.1

Description

Deletes all indices from the array.

Enquiry

NumberOfIndices()


TInt NumberOfIndices() const;

Description

Returns the number of indices in the array.

Return value

TInt

The number of indices in the array.

operator[]()


TInt operator[](TInt aIndexOfIndex) const;

Description

Gets the value of the specified index.

Parameters

TInt aIndexOfIndex

Index into the array.

Return value

TInt

The value of the index.

Enumerations

Enum TAppendResult


TAppendResult

Description

The return value of CCnvCharacterSetConverter::TArrayOfAscendingIndices::AppendIndex().

EAppendFailed

The append failed.

EAppendSuccessful

The append succeeded.
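
The behaviour of an ascending index array with a checked append can be modelled in standard C++ as below. This is a sketch under stated assumptions, not the Symbian class: the capacity of 4 is arbitrary, the sketch rejects out-of-order and negative indices by returning a failure code, and an out-of-range Remove() is silently ignored rather than causing a panic.

```cpp
// Hypothetical model of a fixed-capacity array of ascending indices.
class AscendingIndices {
public:
    enum AppendResult { kAppendFailed, kAppendSuccessful };
    enum { kCapacity = 4 };  // arbitrary capacity chosen for this sketch

    AscendingIndices() : count_(0) {}

    // Appends an index; fails if the array is full, the index is negative,
    // or the index is not greater than the last one stored.
    AppendResult AppendIndex(int index)
    {
        const bool full = (count_ >= kCapacity);
        const bool outOfOrder = (count_ > 0 && index <= indices_[count_ - 1]);
        if (full || index < 0 || outOfOrder)
            return kAppendFailed;
        indices_[count_++] = index;
        return kAppendSuccessful;
    }

    // Deletes one entry, shifting later entries down by one position.
    void Remove(int indexOfIndex)
    {
        if (indexOfIndex < 0 || indexOfIndex >= count_)
            return;  // the sketch ignores out-of-range requests
        for (int i = indexOfIndex; i + 1 < count_; ++i)
            indices_[i] = indices_[i + 1];
        --count_;
    }

    void RemoveAll() { count_ = 0; }
    int NumberOfIndices() const { return count_; }
    int operator[](int indexOfIndex) const { return indices_[indexOfIndex]; }

private:
    int indices_[kCapacity];
    int count_;
};
```

Because the capacity is fixed, a caller should always test the result of AppendIndex() before relying on the stored indices.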