StringHelper
String handling class for UTF-8 data wrapping the phputf8 library. All functions assume the validity of UTF-8 strings.
since | 1.3.0 |
---|---|
package | Joomla Framework |
Methods
compliant
Tests whether a string complies as UTF-8.
compliant(string str) : bool
This is much faster than StringHelper::valid(), but it accepts five- and six-octet UTF-8 sequences, which are not supported by Unicode and so cannot be displayed correctly in a browser. In other words, it is less strict than StringHelper::valid() but faster. If you use it to validate user input, attackers may be able to inject five- and six-byte sequences (which may or may not be a significant risk, depending on what you are doing).
see | StringHelper::valid |
---|---|
link | |
since | 1.3.0 |
Arguments
- str
string
UTF-8 string to check
Response
bool
TRUE if string is valid UTF-8
increment
Increments a trailing number in a string.
increment(string string, string|null style = 'default', int n) : string
Used to easily create distinct labels when copying objects. The method has the following styles:
- default: "Label" becomes "Label (2)"
- dash: "Label" becomes "Label-2"
since | 1.3.0 |
---|---|
Arguments
- string
string
The source string.
- style
string|null
The style (default|dash).
- n
int
If supplied, this number is used for the copy; otherwise the 'next' number is used.
Response
string
The incremented string.
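The two styles can be pictured with a small standalone sketch. The function name `incrementLabel`, its regular expressions, and the choice to treat `$n = 0` as "not supplied" are assumptions made for illustration; this is not the library's implementation.

```php
<?php
// Hypothetical helper illustrating the "default" and "dash" styles.
function incrementLabel(string $label, string $style = 'default', int $n = 0): string
{
    // Each style pairs a pattern that finds a trailing number with a
    // sprintf() format used to write the new number.
    $styles = [
        'default' => ['/ \((\d+)\)$/', ' (%d)'],
        'dash'    => ['/-(\d+)$/', '-%d'],
    ];
    [$pattern, $format] = $styles[$style] ?? $styles['default'];

    if (preg_match($pattern, $label, $m)) {
        // A trailing number exists: replace it with $n or the next number.
        $next = $n > 0 ? $n : ((int) $m[1]) + 1;
        return preg_replace($pattern, sprintf($format, $next), $label);
    }

    // No trailing number yet: the first copy is number 2 unless $n is given.
    return $label . sprintf($format, $n > 0 ? $n : 2);
}

echo incrementLabel('Label'), "\n";           // Label (2)
echo incrementLabel('Label (2)'), "\n";       // Label (3)
echo incrementLabel('Label', 'dash'), "\n";   // Label-2
```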
is_ascii
Tests whether a string contains only 7-bit ASCII bytes.
is_ascii(string str) : bool
You might use this to check whether a string needs handling as UTF-8 at all, potentially gaining performance by using the native PHP equivalent when it is just ASCII, e.g.:
if (StringHelper::is_ascii($someString))
{
    // It's just ASCII - use the native PHP version
    $someString = strtolower($someString);
}
else
{
    // Non-ASCII bytes present - use the UTF-8 aware version
    $someString = StringHelper::strtolower($someString);
}
since | 1.3.0 |
---|---|
Arguments
- str
string
The string to test.
Response
bool
True if the string is all ASCII
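As a rough picture of what "only 7-bit ASCII bytes" means, the check below looks for any byte outside the range 0x00 to 0x7F. The helper name `isSevenBitAscii` is hypothetical and the regular expression is one possible way to express the test, not necessarily what the library does.

```php
<?php
// Hypothetical helper: true when no byte has its high bit set.
function isSevenBitAscii(string $str): bool
{
    // No /u modifier, so the pattern is applied byte by byte.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(isSevenBitAscii('hello'));        // bool(true)
var_dump(isSevenBitAscii("h\u{E9}llo"));   // bool(false), U+00E9 needs two bytes
```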
ltrim
UTF-8 aware replacement for ltrim()
ltrim(string str, string|bool charlist = false) : string
Strip whitespace (or other characters) from the beginning of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native ltrim() works normally on a UTF-8 string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The string to be trimmed
- charlist
string|bool
The optional charlist of additional characters to trim
Response
string
The trimmed string
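The reason a multi-byte charlist matters is that the native ltrim() treats the charlist as a set of bytes, so it can strip half of a character. The sketch below contrasts that with a character-aware trim; `leftTrimChars` is a hypothetical helper written for this illustration, not the library's code.

```php
<?php
// Hypothetical helper: strip leading characters listed in $charlist,
// matching whole UTF-8 characters rather than single bytes.
function leftTrimChars(string $str, string $charlist): string
{
    if ($charlist === '') {
        return $str; // nothing to strip
    }

    // preg_quote() escapes regex metacharacters; the /u modifier makes the
    // character class match code points instead of bytes.
    $class = preg_quote($charlist, '/');

    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back
    // to the untouched input in that case.
    return preg_replace('/^[' . $class . ']+/u', '', $str) ?? $str;
}

$eAcute = "\u{E9}";          // bytes C3 A9
$input  = "\u{F1}abc";       // starts with U+00F1, bytes C3 B1

// Character-aware: U+00F1 is not in the charlist, so nothing is removed.
var_dump(leftTrimChars($input, $eAcute) === $input);   // bool(true)

// Byte-based: the shared lead byte C3 is stripped, leaving a stray B1.
var_dump(ltrim($input, $eAcute) === "\xB1abc");        // bool(true)
```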
ord
UTF-8 aware alternative to ord()
ord(string chr) : int
Returns the Unicode ordinal for a character.
link | |
---|---|
since | 1.4.0 |
Arguments
- chr
string
UTF-8 encoded character
Response
int
Unicode ordinal for the character
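To show how a Unicode ordinal differs from what the native ord() returns (the value of the first byte only), here is a minimal decoder for a single well-formed UTF-8 character. `codePointOf` is a hypothetical name, and the sketch assumes non-empty, valid input.

```php
<?php
// Hypothetical helper: decode the first UTF-8 character of $chr.
// Assumes $chr is non-empty and well-formed UTF-8.
function codePointOf(string $chr): int
{
    $b0 = ord($chr[0]);

    if ($b0 < 0x80) {                       // 0xxxxxxx: one byte
        return $b0;
    }
    if (($b0 & 0xE0) === 0xC0) {            // 110xxxxx: two bytes
        return (($b0 & 0x1F) << 6)
            | (ord($chr[1]) & 0x3F);
    }
    if (($b0 & 0xF0) === 0xE0) {            // 1110xxxx: three bytes
        return (($b0 & 0x0F) << 12)
            | ((ord($chr[1]) & 0x3F) << 6)
            | (ord($chr[2]) & 0x3F);
    }
    if (($b0 & 0xF8) === 0xF0) {            // 11110xxx: four bytes
        return (($b0 & 0x07) << 18)
            | ((ord($chr[1]) & 0x3F) << 12)
            | ((ord($chr[2]) & 0x3F) << 6)
            | (ord($chr[3]) & 0x3F);
    }

    throw new InvalidArgumentException('Not a valid UTF-8 lead byte');
}

var_dump(codePointOf("\u{E9}"));   // int(233), i.e. U+00E9
var_dump(ord("\u{E9}"));           // int(195), only the first byte 0xC3
```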
rtrim
UTF-8 aware replacement for rtrim()
rtrim(string str, string|bool charlist = false) : string
Strip whitespace (or other characters) from the end of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native rtrim() works normally on a UTF-8 string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The string to be trimmed
- charlist
string|bool
The optional charlist of additional characters to trim
Response
string
The trimmed string
str_ireplace
UTF-8 aware alternative to str_ireplace()
str_ireplace(string search, string replace, string str, int|null|bool count = null) : string
Case-insensitive version of str_replace()
link | |
---|---|
since | 1.3.0 |
Arguments
- search
string
String to search for
- replace
string
Replacement string
- str
string
String to search in
- count
int|null|bool
Optional count value to be passed by reference
Response
string
UTF-8 String
str_pad
UTF-8 aware alternative to str_pad()
str_pad(string input, int length, string padStr = ' ', int type = STR_PAD_RIGHT) : string
Pad a string to a certain length with another string. $padStr may contain multi-byte characters.
link | |
---|---|
since | 1.4.0 |
Arguments
- input
string
The input string.
- length
int
If the value is negative, less than, or equal to the length of the input string, no padding takes place.
- padStr
string
The string may be truncated if the number of padding characters can't be evenly divided by the string's length.
- type
int
The type of padding to apply
Response
string
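The point of a UTF-8 aware pad is that both the target length and the pad string are measured in characters. The following sketch covers right-padding only; `padRightUtf8` is a hypothetical helper written for illustration and does not reproduce the library's handling of the `type` argument.

```php
<?php
// Hypothetical helper: right-pad $input to $length characters,
// cycling through the characters of $padStr.
function padRightUtf8(string $input, int $length, string $padStr = ' '): string
{
    // Count characters, not bytes: /u matches whole code points.
    $inputLen = (int) preg_match_all('/./su', $input);

    // Split the pad string into individual characters.
    $padChars = preg_split('//u', $padStr, -1, PREG_SPLIT_NO_EMPTY);
    $missing  = $length - $inputLen;

    // Negative, shorter, or equal target length: no padding takes place.
    if ($missing <= 0 || !$padChars) {
        return $input;
    }

    $padding = '';
    for ($i = 0; $i < $missing; $i++) {
        // The pad string is truncated when it does not divide evenly.
        $padding .= $padChars[$i % count($padChars)];
    }

    return $input . $padding;
}

// Two characters padded to five with a two-character pad string.
echo padRightUtf8("a\u{F1}", 5, "\u{2192}\u{B7}"), "\n";
```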
str_split
UTF-8 aware alternative to str_split()
str_split(string str, int splitLen = 1) : array|string|bool
Convert a string to an array.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
UTF-8 encoded string to process
- splitLen
int
Number of characters to split the string by
Response
array|string|bool
strcasecmp
UTF-8/LOCALE aware alternative to strcasecmp()
strcasecmp(string str1, string str2, string|bool locale = false) : int
A case insensitive string comparison.
link | |
---|---|
since | 1.3.0 |
Arguments
- str1
string
First string to compare
- str2
string
Second string to compare
- locale
string|bool
The locale used by strcoll or false to use classical comparison
Response
int
< 0 if str1 is less than str2; > 0 if str1 is greater than str2, and 0 if they are equal.
strcmp
UTF-8/LOCALE aware alternative to strcmp()
strcmp(string str1, string str2, mixed locale = false) : int
A case sensitive string comparison.
link | |
---|---|
since | 1.3.0 |
Arguments
- str1
string
First string to compare
- str2
string
Second string to compare
- locale
mixed
The locale used by strcoll or false to use classical comparison
Response
int
< 0 if str1 is less than str2; > 0 if str1 is greater than str2, and 0 if they are equal.
strcspn
UTF-8 aware alternative to strcspn()
strcspn(string str, string mask, int|bool start = null, int|bool length = null) : int
Find length of initial segment not matching mask.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The string to process
- mask
string
The mask
- start
int|bool
Optional starting character position (in characters)
- length
int|bool
Optional length
Response
int
The length of the initial segment of str that does not contain any of the characters in mask
stristr
UTF-8 aware alternative to stristr()
stristr(string str, string search) : string|bool
Returns all of the haystack from the first occurrence of the needle to the end. The needle and haystack are compared case-insensitively.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The haystack
- search
string
The needle
Response
string|bool
strlen
UTF-8 aware alternative to strlen()
strlen(string str) : int
Returns the number of characters in the string (NOT THE NUMBER OF BYTES).
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
UTF-8 string.
Response
int
Number of UTF-8 characters in string.
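The difference between bytes and characters is easy to see with plain PHP. The sketch below uses the native strlen() for bytes and a PCRE match with the `u` modifier as one possible way to count characters; it does not call StringHelper itself.

```php
<?php
$word = "h\u{E9}llo";   // "h", U+00E9, "l", "l", "o"

// Native strlen() counts bytes: U+00E9 occupies two bytes in UTF-8.
$bytes = strlen($word);

// With /u each "." matches one whole code point, so the match count
// equals the number of characters.
$chars = preg_match_all('/./su', $word);

var_dump($bytes);   // int(6)
var_dump($chars);   // int(5)
```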
strpos
UTF-8 aware alternative to strpos()
strpos(string str, string search, int|null|bool offset = false) : int|bool
Find position of first occurrence of a string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String being examined
- search
string
String being searched for
- offset
int|null|bool
Optional, specifies the position from which the search should be performed
Response
int|bool
Number of characters before the first match or FALSE on failure
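"Number of characters before the first match" can differ from the byte offset that the native strpos() reports. One way to picture the conversion, assuming both strings are valid UTF-8, is to find the byte offset first and then count the characters in front of it. `charPosition` is a hypothetical helper and ignores the `offset` argument.

```php
<?php
// Hypothetical helper: character position of the first match, or false.
function charPosition(string $str, string $search)
{
    // Byte offset of the first match.
    $bytePos = strpos($str, $search);

    if ($bytePos === false) {
        return false;
    }

    // Count whole characters in the part of the string before the match.
    return (int) preg_match_all('/./su', substr($str, 0, $bytePos));
}

$text = "a\u{F1}b";   // "a", U+00F1 (two bytes), "b"

var_dump(strpos($text, 'b'));         // int(3), a byte offset
var_dump(charPosition($text, 'b'));   // int(2), a character offset
```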
strrev
UTF-8 aware alternative to strrev()
strrev(string str) : string
Reverse a string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String to be reversed
Response
string
The string in reverse character order
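Reversing in "character order" matters because the native strrev() reverses bytes and so scrambles multi-byte characters. A minimal sketch of the character-wise approach follows; `reverseChars` is a hypothetical helper, not the library's code.

```php
<?php
// Hypothetical helper: reverse a UTF-8 string one character at a time.
function reverseChars(string $str): string
{
    // Split into whole characters (code points), then reverse the list.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    return implode('', array_reverse($chars));
}

$text = "a\u{F1}b";   // U+00F1 is the byte pair C3 B1

var_dump(reverseChars($text) === "b\u{F1}a");   // bool(true)
var_dump(strrev($text) === "b\xB1\xC3a");       // bool(true): bytes swapped, invalid UTF-8
```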
strrpos
UTF-8 aware alternative to strrpos()
strrpos(string str, string search, int offset) : int|bool
Finds position of last occurrence of a string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String being examined.
- search
string
String being searched for.
- offset
int
Offset from the left of the string.
Response
int|bool
Number of characters before the last match or false on failure
strspn
UTF-8 aware alternative to strspn()
strspn(string str, string mask, int|null start = null, int|null length = null) : int
Find length of initial segment matching mask.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The haystack
- mask
string
The mask
- start
int|null
Optional start
- length
int|null
Optional length
Response
int
strtolower
UTF-8 aware alternative to strtolower()
strtolower(string str) : string|bool
Make a string lowercase
Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian - it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String being processed
Response
string|bool
Either the string in lowercase or FALSE if the UTF-8 is invalid
strtoupper
UTF-8 aware alternative to strtoupper()
strtoupper(string str) : string|bool
Make a string uppercase
Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian - it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String being processed
Response
string|bool
Either the string in uppercase or FALSE if the UTF-8 is invalid
substr
UTF-8 aware alternative to substr()
substr(string str, int offset, int|null|bool length = false) : string|bool
Return part of a string given character offset (and optionally length).
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String being processed
- offset
int
Number of UTF-8 characters offset (from left)
- length
int|null|bool
Optional length in UTF-8 characters from offset
Response
string|bool
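Because offset and length are counted in UTF-8 characters, the result can differ from the native substr(), which counts bytes and may cut a character in half. The sketch below shows the idea with a hypothetical helper, `charSlice`; it is an illustration rather than the library's implementation.

```php
<?php
// Hypothetical helper: substring by character offset and length.
function charSlice(string $str, int $offset, ?int $length = null): string
{
    // Split into whole characters, then slice the list.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // A null length means "to the end of the string".
    return implode('', array_slice($chars, $offset, $length));
}

$text = "a\u{F1}bc";   // "a", U+00F1 (bytes C3 B1), "b", "c"

var_dump(charSlice($text, 0, 2) === "a\u{F1}");   // bool(true): two characters
var_dump(substr($text, 0, 2) === "a\xC3");        // bool(true): two bytes, U+00F1 cut in half
```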
substr_replace
UTF-8 aware alternative to substr_replace()
substr_replace(string str, string repl, int start, int|bool|null length = null) : string
Replace text within a portion of a string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The haystack
- repl
string
The replacement string
- start
int
Start
- length
int|bool|null
Length (optional)
Response
string
transcode
Transcode a string.
transcode(string source, string fromEncoding, string toEncoding) : string|null
link | |
---|---|
since | 1.3.0 |
Arguments
- source
string
The string to transcode.
- fromEncoding
string
The source encoding.
- toEncoding
string
The target encoding.
Response
string|null
The transcoded string, or null if the source was not a string.
trim
UTF-8 aware replacement for trim()
trim(string str, string|bool charlist = false) : string
Strip whitespace (or other characters) from the beginning and end of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native trim() works normally on a UTF-8 string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
The string to be trimmed
- charlist
string|bool
The optional charlist of additional characters to trim
Response
string
The trimmed string
ucfirst
UTF-8 aware alternative to ucfirst()
ucfirst(string str, string|null delimiter = null, string|null newDelimiter = null) : string
Make a string's first character uppercase or all words' first character uppercase.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String to be processed
- delimiter
string|null
The words delimiter (null means do not split the string)
- newDelimiter
string|null
The new words delimiter (null means equal to $delimiter)
Response
string
If $delimiter is null, the string with its first character in upper case (if applicable); otherwise the string is treated as words separated by the delimiter, ucfirst is applied to each word, and the words are returned joined by the new delimiter.
ucwords
UTF-8 aware alternative to ucwords()
ucwords(string str) : string
Uppercase the first character of each word in a string.
link | |
---|---|
since | 1.3.0 |
Arguments
- str
string
String to be processed
Response
string
String with first char of each word uppercase
unicode_to_utf16
Converts Unicode sequences to UTF-16 string.
unicode_to_utf16(string str) : string
since | 1.3.0 |
---|---|
Arguments
- str
string
Unicode string to convert
Response
string
UTF-16 string
unicode_to_utf8
Converts Unicode sequences to UTF-8 string.
unicode_to_utf8(string str) : string
since | 1.3.0 |
---|---|
Arguments
- str
string
Unicode string to convert
Response
string
UTF-8 string
valid
Tests a string as to whether it's valid UTF-8 and supported by the Unicode standard.
valid(string str) : bool
Note: this function has been modified to simply return true or false.
author | |
---|---|
link | |
see | compliant |
since | 1.3.0 |
Arguments
- str
string
UTF-8 encoded string.
Response
bool
true if valid
Properties
incrementStyles
Increment styles.
since | 1.3.0 |
---|---|
Type(s)
array