StringHelper

String handling class for UTF-8 data wrapping the phputf8 library. All functions assume the validity of UTF-8 strings.

abstract
since

1.3.0

package

Joomla Framework

Methods

compliant

Tests whether a string complies as UTF-8.

compliant(string str) : bool
static

This is much faster than StringHelper::valid(), but it passes five- and six-octet UTF-8 sequences, which are not supported by Unicode and so cannot be displayed correctly in a browser. In other words, it is less strict than StringHelper::valid() but faster. If you use it to validate user input, you run the risk that attackers will be able to inject five- and six-byte sequences (which may or may not be a significant risk, depending on what you are doing).

see StringHelper::valid
link
since

1.3.0

Arguments

str

string UTF-8 string to check

Response

bool TRUE if string is valid UTF-8
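
To make the five-octet case concrete, here is a minimal sketch in plain PHP. It does not call StringHelper: the helper name isStrictUtf8() is hypothetical, and it assumes that PCRE's "u" modifier rejects byte sequences that are not legal Unicode UTF-8, which only approximates the strict check described for StringHelper::valid().

```php
<?php
// Hypothetical helper, not part of StringHelper: strict UTF-8 check via PCRE.
function isStrictUtf8(string $str): bool
{
    // With the "u" modifier, preg_match() fails on malformed UTF-8 input.
    return preg_match('//u', $str) === 1;
}

$valid      = "h\u{E9}llo";            // "héllo", contains one two-octet character
$fiveOctets = "\xF8\x88\x80\x80\x80";  // obsolete five-octet form, not legal Unicode

var_dump(isStrictUtf8($valid));       // bool(true)
var_dump(isStrictUtf8($fiveOctets));  // bool(false)
```

A looser check of the kind described for compliant() would, per the text above, let the second string through.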

increment

Increments a trailing number in a string.

increment(string string, string|null style = 'default', int n) : string
static

Used to easily create distinct labels when copying objects. The method has the following styles:

default: "Label" becomes "Label (2)"
dash: "Label" becomes "Label-2"

since

1.3.0

Arguments

string

string The source string.

style

string|null The style (default|dash).

n

int If supplied, this number is used for the copy; otherwise the next number is used.

Response

string The incremented string.
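
The two styles can be sketched in plain PHP as follows. This is an illustration under stated assumptions, not the library's code: incrementLabel() and its $styles table are hypothetical, a null $n stands in for "not supplied", and the "Label (2)" to "Label (3)" step is inferred from the summary "increments a trailing number".

```php
<?php
// Hypothetical re-creation of the documented behaviour, not StringHelper itself.
function incrementLabel(string $label, string $style = 'default', ?int $n = null): string
{
    // Per style: a pattern matching a trailing number, and a format to append one.
    $styles = [
        'default' => ['/ \((\d+)\)$/', ' (%d)'],
        'dash'    => ['/-(\d+)$/', '-%d'],
    ];
    [$pattern, $format] = $styles[$style] ?? $styles['default'];

    if (preg_match($pattern, $label, $m)) {
        // A trailing number exists: replace it with $n or with the next number.
        $next = $n ?? ((int) $m[1] + 1);

        return preg_replace($pattern, sprintf($format, $next), $label);
    }

    // No trailing number yet: the first copy is number 2 unless $n is given.
    return $label . sprintf($format, $n ?? 2);
}

echo incrementLabel('Label'), "\n";          // Label (2)
echo incrementLabel('Label (2)'), "\n";      // Label (3)
echo incrementLabel('Label', 'dash'), "\n";  // Label-2
```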

is_ascii

Tests whether a string contains only 7-bit ASCII bytes.

is_ascii(string str) : bool
static

You might use this to check whether a string needs handling as UTF-8 at all, potentially gaining performance by using the native PHP equivalent when the string is plain ASCII, e.g.:

    if (StringHelper::is_ascii($someString)) {
        // It's just ASCII - use the native PHP version
        $someString = strtolower($someString);
    } else {
        $someString = StringHelper::strtolower($someString);
    }
since

1.3.0

Arguments

str

string The string to test.

Response

bool True if the string is all ASCII

ltrim

UTF-8 aware replacement for ltrim()

ltrim(string str, string|bool charlist = false) : string
static

Strip whitespace (or other characters) from the beginning of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native ltrim() works normally on a UTF-8 string.

link
since

1.3.0

Arguments

str

string The string to be trimmed

charlist

string|bool The optional charlist of additional characters to trim

Response

string The trimmed string
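
The reason a multi-byte charlist matters can be shown with a small sketch in plain PHP. The helper utf8LtrimSketch() is hypothetical and uses a PCRE character class; it is one possible way to trim by character and is not claimed to be how StringHelper::ltrim() works internally.

```php
<?php
// Hypothetical helper, not part of StringHelper: strip leading characters
// listed in $charlist, treating both arguments as UTF-8.
function utf8LtrimSketch(string $str, string $charlist): string
{
    if ($charlist === '') {
        return ltrim($str); // nothing custom to strip, native behaviour is fine
    }

    // preg_quote() escapes regex metacharacters so the list is taken literally.
    $class = preg_quote($charlist, '/');

    return preg_replace('/^[' . $class . ']+/u', '', $str);
}

$str      = "\u{E9}a";  // "éa": bytes C3 A9 61
$charlist = "\u{E8}";   // "è":  bytes C3 A8

// Native ltrim() works on bytes: it strips the shared lead byte C3 and
// leaves a broken string behind.
var_dump(ltrim($str, $charlist) === "\xA9a");             // bool(true)

// The character-aware version leaves "éa" alone because "è" never occurs.
var_dump(utf8LtrimSketch($str, $charlist) === $str);      // bool(true)
var_dump(utf8LtrimSketch("\u{E8}\u{E8}abc", $charlist));  // string(3) "abc"
```

The same byte-versus-character issue applies to rtrim() and trim().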

ord

UTF-8 aware alternative to ord()

ord(string chr) : int
static

Returns the Unicode ordinal for a character.

link
since

1.4.0

Arguments

chr

string UTF-8 encoded character

Response

int Unicode ordinal for the character
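
As a rough illustration of what "Unicode ordinal" means here, the following plain PHP sketch decodes the code point from the UTF-8 bytes by hand. The function utf8OrdSketch() is hypothetical, assumes well-formed input, and is not taken from the library.

```php
<?php
// Hypothetical helper, not part of StringHelper: decode the first UTF-8
// character of $chr into its Unicode code point. Assumes well-formed input.
function utf8OrdSketch(string $chr): int
{
    $b0 = ord($chr[0]);

    if ($b0 < 0x80) {
        return $b0; // single byte, plain ASCII
    }

    if (($b0 & 0xE0) === 0xC0) {
        // two-byte sequence: 110xxxxx 10xxxxxx
        return (($b0 & 0x1F) << 6) | (ord($chr[1]) & 0x3F);
    }

    if (($b0 & 0xF0) === 0xE0) {
        // three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx
        return (($b0 & 0x0F) << 12) | ((ord($chr[1]) & 0x3F) << 6) | (ord($chr[2]) & 0x3F);
    }

    // four-byte sequence: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return (($b0 & 0x07) << 18) | ((ord($chr[1]) & 0x3F) << 12)
        | ((ord($chr[2]) & 0x3F) << 6) | (ord($chr[3]) & 0x3F);
}

var_dump(ord("\u{E9}"));              // int(195), only the first byte (0xC3)
var_dump(utf8OrdSketch("\u{E9}"));    // int(233), U+00E9
var_dump(utf8OrdSketch("\u{20AC}"));  // int(8364), U+20AC
```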

rtrim

UTF-8 aware replacement for rtrim()

rtrim(string str, string|bool charlist = false) : string
static

Strip whitespace (or other characters) from the end of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native rtrim() works normally on a UTF-8 string.

link
since

1.3.0

Arguments

str

string The string to be trimmed

charlist

string|bool The optional charlist of additional characters to trim

Response

string The trimmed string

str_ireplace

UTF-8 aware alternative to str_ireplace()

str_ireplace(string search, string replace, string str, int|null|bool count = null) : string
static

Case-insensitive version of str_replace()

link
since

1.3.0

Arguments

search

string String to search for

replace

string Replacement string

str

string String to search in

count

int|null|bool Optional count value to be passed by reference

Response

string UTF-8 String
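
The argument order mirrors PHP's native str_ireplace(): search, replace, subject. A minimal sketch of a case-insensitive replace that also folds non-ASCII letters, written in plain PHP with PCRE, might look like this. The helper utf8IreplaceSketch() is hypothetical and is not the library's implementation.

```php
<?php
// Hypothetical helper, not part of StringHelper: case-insensitive replace
// that treats the arguments as UTF-8.
function utf8IreplaceSketch(string $search, string $replace, string $str): string
{
    // "i" = caseless, "u" = UTF-8 mode, so "É" and "é" compare as equal.
    $pattern = '/' . preg_quote($search, '/') . '/iu';

    // A callback keeps "$1"-style sequences in $replace from being interpreted.
    return preg_replace_callback(
        $pattern,
        static function (array $match) use ($replace): string {
            return $replace;
        },
        $str
    );
}

// The search term is upper case, the subject is lower case.
echo utf8IreplaceSketch("\u{C9}COLE", 'maison', "Une \u{E9}cole"), "\n"; // Une maison
```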

str_pad

UTF-8 aware alternative to str_pad()

str_pad(string input, int length, string padStr = ' ', int type = STR_PAD_RIGHT) : string
static

Pad a string to a certain length with another string. $padStr may contain multi-byte characters.

link
since

1.4.0

Arguments

input

string The input string.

length

int The target length. If the value is negative, less than, or equal to the length of the input string, no padding takes place.

padStr

string The padding string. It may be truncated if the required number of padding characters can't be evenly divided by its length.

type

int The type of padding to apply (STR_PAD_RIGHT, STR_PAD_LEFT or STR_PAD_BOTH)

Response

string
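
To show why counting characters rather than bytes changes the result, here is a small right-padding sketch in plain PHP. The helpers utf8CharsSketch() and utf8PadRightSketch() are hypothetical and cover only the STR_PAD_RIGHT case.

```php
<?php
// Hypothetical helpers, not part of StringHelper.

// Split a UTF-8 string into an array of characters.
function utf8CharsSketch(string $str): array
{
    return preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
}

// Pad $input on the right until it is $length characters long.
function utf8PadRightSketch(string $input, int $length, string $padStr = ' '): string
{
    $missing  = $length - count(utf8CharsSketch($input));
    $padChars = utf8CharsSketch($padStr);

    if ($missing <= 0 || $padChars === []) {
        return $input; // already long enough, or nothing to pad with
    }

    $pad = '';

    // Cycle through the pad characters; the last repetition may be cut short.
    for ($i = 0; $i < $missing; $i++) {
        $pad .= $padChars[$i % count($padChars)];
    }

    return $input . $pad;
}

var_dump(str_pad("\u{E9}", 3, '*') === "\u{E9}*");             // bool(true): native pads to 3 bytes
var_dump(utf8PadRightSketch("\u{E9}", 3, '*') === "\u{E9}**"); // bool(true): padded to 3 characters
```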

str_split

UTF-8 aware alternative to str_split()

str_split(string str, int splitLen = 1) : array|string|bool
static

Convert a string to an array.

link
since

1.3.0

Arguments

str

string UTF-8 encoded string to process

splitLen

int Number of characters to split the string by

Response

array|string|bool
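
A character-based split can be sketched in plain PHP as below. The helper utf8SplitSketch() is hypothetical; it always returns an array, whereas the documented return type above also allows string and bool.

```php
<?php
// Hypothetical helper, not part of StringHelper: split a UTF-8 string into
// chunks of $splitLen characters.
function utf8SplitSketch(string $str, int $splitLen = 1): array
{
    // An empty pattern with the "u" modifier splits between code points.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // Regroup the characters into chunks and glue each chunk back together.
    return array_map(
        static function (array $chunk): string {
            return implode('', $chunk);
        },
        array_chunk($chars, max(1, $splitLen))
    );
}

$str = "h\u{E9}llo"; // "héllo"

// Native str_split() cuts the two-byte "é" in half.
var_dump(str_split($str, 2) === ["h\xC3", "\xA9l", 'lo']);        // bool(true)
var_dump(utf8SplitSketch($str, 2) === ["h\u{E9}", 'll', 'o']);    // bool(true)
```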

strcasecmp

UTF-8/LOCALE aware alternative to strcasecmp()

strcasecmp(string str1, string str2, string|bool locale = false) : int
static

A case insensitive string comparison.

link
since

1.3.0

Arguments

str1

string String 1 to compare

str2

string String 2 to compare

locale

string|bool The locale used by strcoll or false to use classical comparison

Response

int A negative value if str1 is less than str2, a positive value if str1 is greater than str2, and 0 if they are equal.

strcmp

UTF-8/LOCALE aware alternative to strcmp()

strcmp(string str1, string str2, mixed locale = false) : int
static

A case sensitive string comparison.

link
since

1.3.0

Arguments

str1

string String 1 to compare

str2

string String 2 to compare

locale

mixed The locale used by strcoll or false to use classical comparison

Response

int A negative value if str1 is less than str2, a positive value if str1 is greater than str2, and 0 if they are equal.

strcspn

UTF-8 aware alternative to strcspn()

strcspn(string str, string mask, int|bool start = null, int|bool length = null) : int
static

Find length of initial segment not matching mask.

link
since

1.3.0

Arguments

str

string The string to process

mask

string The mask

start

int|bool Optional starting character position (in characters)

length

int|bool Optional length

Response

int The length of the initial segment of str which does not contain any of the characters in mask

stristr

UTF-8 aware alternative to stristr()

stristr(string str, string search) : string|bool
static

Returns all of the haystack from the first occurrence of the needle to the end. The needle and haystack are compared case-insensitively.

link
since

1.3.0

Arguments

str

string The haystack

search

string The needle

Response

string|bool

strlen

UTF-8 aware alternative to strlen()

strlen(string str) : int
static

Returns the number of characters in the string (NOT THE NUMBER OF BYTES).

link
since

1.3.0

Arguments

str

string UTF-8 string.

Response

int Number of UTF-8 characters in string.
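
The difference between bytes and characters is easy to see in plain PHP. In this sketch, utf8LengthSketch() is a hypothetical stand-in that counts code points with PCRE; it is not the library's implementation.

```php
<?php
// Hypothetical helper, not part of StringHelper: count UTF-8 characters.
function utf8LengthSketch(string $str): int
{
    // "." with the "u" and "s" modifiers matches exactly one code point,
    // and preg_match_all() returns the number of matches.
    return (int) preg_match_all('/./us', $str);
}

$str = "h\u{E9}llo"; // "héllo"

var_dump(strlen($str));            // int(6): bytes, "é" takes two
var_dump(utf8LengthSketch($str));  // int(5): characters
```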

strpos

UTF-8 aware alternative to strpos()

strpos(string str, string search, int|null|bool offset = false) : int|bool
static

Find position of first occurrence of a string.

link
since

1.3.0

Arguments

str

string String being examined

search

string String being searched for

offset

int|null|bool Optional, specifies the position from which the search should be performed

Response

int|bool Number of characters before the first match or FALSE on failure
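
"Number of characters before the first match" differs from the byte offset that native strpos() reports as soon as a multi-byte character precedes the match. The plain PHP sketch below shows the conversion; utf8StrposSketch() is hypothetical, ignores the offset argument, and is not the library's code.

```php
<?php
// Hypothetical helper, not part of StringHelper.
// Returns the character position of $search in $str, or false if absent.
function utf8StrposSketch(string $str, string $search)
{
    // A byte search is safe: in valid UTF-8 a needle cannot match mid-character.
    $bytePos = strpos($str, $search);

    if ($bytePos === false) {
        return false;
    }

    // Count the code points in the bytes that precede the match.
    return (int) preg_match_all('/./us', substr($str, 0, $bytePos));
}

$str = "na\u{EF}ve caf\u{E9}"; // "naïve café"

var_dump(strpos($str, 'caf'));            // int(7): byte offset, "ï" counts twice
var_dump(utf8StrposSketch($str, 'caf'));  // int(6): character offset
```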

strrev

UTF-8 aware alternative to strrev()

strrev(string str) : string
static

Reverse a string.

link
since

1.3.0

Arguments

str

string String to be reversed

Response

string The string in reverse character order
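
Reversing by character rather than by byte can be sketched in plain PHP as follows. The helper utf8StrrevSketch() is hypothetical and only illustrates the idea.

```php
<?php
// Hypothetical helper, not part of StringHelper: reverse a UTF-8 string
// character by character.
function utf8StrrevSketch(string $str): string
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    return implode('', array_reverse($chars));
}

$str = "ab\u{E9}"; // "abé"

// Native strrev() reverses bytes, so the two bytes of "é" end up swapped.
var_dump(strrev($str) === "\xA9\xC3ba");           // bool(true)
var_dump(utf8StrrevSketch($str) === "\u{E9}ba");   // bool(true)
```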

strrpos

UTF-8 aware alternative to strrpos()

strrpos(string str, string search, int offset) : int|bool
static

Finds position of last occurrence of a string.

link
since

1.3.0

Arguments

str

string String being examined.

search

string String being searched for.

offset

int Offset from the left of the string.

Response

int|bool Number of characters before the last match or false on failure

strspn

UTF-8 aware alternative to strspn()

strspn(string str, string mask, int|null start = null, int|null length = null) : int
static

Find length of initial segment matching mask.

link
since

1.3.0

Arguments

str

string The haystack

mask

string The mask

start

int|null Start (optional)

length

int|null Length (optional)

Response

int

strtolower

UTF-8 aware alternative to strtolower()

strtolower(string str) : string|bool
static

Make a string lowercase.

Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.

link
since

1.3.0

Arguments

str

string String being processed

Response

string|bool Either the string in lowercase, or FALSE if the UTF-8 is invalid

strtoupper

UTF-8 aware alternative to strtoupper()

strtoupper(string str) : string|bool
static

Make a string uppercase.

Note: The concept of a character's "case" only exists in some alphabets such as Latin, Greek, Cyrillic, Armenian and archaic Georgian; it does not exist in the Chinese script, for example. See Unicode Standard Annex #21: Case Mappings.

link
since

1.3.0

Arguments

str

string String being processed

Response

string|bool Either the string in uppercase, or FALSE if the UTF-8 is invalid

substr

UTF-8 aware alternative to substr()

substr(string str, int offset, int|null|bool length = false) : string|bool
static

Return part of a string given character offset (and optionally length).

link
since

1.3.0

Arguments

str

string String being processed

offset

int Number of UTF-8 characters offset (from left)

length

int|null|bool Optional length in UTF-8 characters from offset

Response

string|bool
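
Offsets and lengths counted in characters behave differently from the byte-based native substr(). A plain PHP sketch follows; utf8SubstrSketch() is hypothetical and always returns a string, whereas the documented return type above also allows bool.

```php
<?php
// Hypothetical helper, not part of StringHelper: substring by character
// offset and length.
function utf8SubstrSketch(string $str, int $offset, ?int $length = null): string
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);

    // array_slice() accepts negative offsets and a null length, like substr().
    return implode('', array_slice($chars, $offset, $length));
}

$str = "na\u{EF}ve"; // "naïve"

// Native substr() counts bytes and returns only the first half of "ï".
var_dump(substr($str, 2, 1) === "\xC3");                 // bool(true)
var_dump(utf8SubstrSketch($str, 2, 1) === "\u{EF}");     // bool(true)
var_dump(utf8SubstrSketch($str, 3));                     // string(2) "ve"
```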

substr_replace

UTF-8 aware alternative to substr_replace()

substr_replace(string str, string repl, int start, int|bool|null length = null) : string
static

Replace text within a portion of a string.

link
since

1.3.0

Arguments

str

string The haystack

repl

string The replacement string

start

int Start

length

int|bool|null Length (optional)

Response

string

transcode

Transcode a string.

transcode(string source, string fromEncoding, string toEncoding) : string|null
static
link
since

1.3.0

Arguments

source

string The string to transcode.

fromEncoding

string The source encoding.

toEncoding

string The target encoding.

Response

string|null The transcoded string, or null if the source was not a string.

trim

UTF-8 aware replacement for trim()

trim(string str, string|bool charlist = false) : string
static

Strip whitespace (or other characters) from the beginning and end of a string. You only need to use this if you supply the optional charlist argument and it contains UTF-8 characters. Otherwise the native trim() works normally on a UTF-8 string.

link
since

1.3.0

Arguments

str

string The string to be trimmed

charlist

string|bool The optional charlist of additional characters to trim

Response

string The trimmed string

ucfirst

UTF-8 aware alternative to ucfirst()

ucfirst(string str, string|null delimiter = null, string|null newDelimiter = null) : string
static

Make a string's first character uppercase or all words' first character uppercase.

link
since

1.3.0

Arguments

str

string String to be processed

delimiter

string|null The word delimiter (null means do not split the string)

newDelimiter

string|null The new word delimiter (null means equal to $delimiter)

Response

string If $delimiter is null, the string with its first character in upper case (if applicable). Otherwise the string is treated as words separated by the delimiter, ucfirst is applied to each word, and the words are rejoined with the new delimiter.
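
The delimiter handling described above can be sketched in plain PHP. This illustration assumes the mbstring extension is available; utf8UcfirstSketch() is a hypothetical name and the code is not taken from the library.

```php
<?php
// Hypothetical helper, not part of StringHelper. Requires ext-mbstring.
function utf8UcfirstSketch(string $str, ?string $delimiter = null, ?string $newDelimiter = null): string
{
    // Upper-case the first character of a single word, leave the rest as is.
    $upperFirst = static function (string $word): string {
        return mb_strtoupper(mb_substr($word, 0, 1, 'UTF-8'), 'UTF-8')
            . mb_substr($word, 1, null, 'UTF-8');
    };

    if ($delimiter === null) {
        return $upperFirst($str); // no splitting requested
    }

    // A null new delimiter means "rejoin with the original delimiter".
    $newDelimiter = $newDelimiter ?? $delimiter;

    return implode($newDelimiter, array_map($upperFirst, explode($delimiter, $str)));
}

echo utf8UcfirstSketch("\u{E9}cole normale"), "\n";       // École normale
echo utf8UcfirstSketch("\u{E9}cole normale", ' '), "\n";  // École Normale
echo utf8UcfirstSketch('foo-bar', '-', ' '), "\n";        // Foo Bar
```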

ucwords

UTF-8 aware alternative to ucwords()

ucwords(string str) : string
static

Uppercase the first character of each word in a string.

link
since

1.3.0

Arguments

str

string String to be processed

Response

string String with first char of each word uppercase

unicode_to_utf16

Converts Unicode sequences to UTF-16 string.

unicode_to_utf16(string str) : string
static
since

1.3.0

Arguments

str

string Unicode string to convert

Response

string UTF-16 string

unicode_to_utf8

Converts Unicode sequences to UTF-8 string.

unicode_to_utf8(string str) : string
static
since

1.3.0

Arguments

str

string Unicode string to convert

Response

string UTF-8 string

valid

Tests a string as to whether it's valid UTF-8 and supported by the Unicode standard.

valid(string str) : bool
static

Note: this function has been modified to simply return true or false.

author

[email protected]

link
see compliant
since

1.3.0

Arguments

str

string UTF-8 encoded string.

Response

bool True if valid

Properties

incrementStyles

Increment styles.

static
since

1.3.0

Type(s)

array