Hudzilla.org - the homepage of Paul Hudson

4.7.4     Measuring strings: strlen(), count_chars(), and str_word_count()

This is NOT the latest copy of this book; a later version is available.

int strlen ( string source)

mixed count_chars ( string string [, int mode])

mixed str_word_count ( string string [, int format])

Measuring a string and its contents can be done in three separate ways. The easiest (and most "obvious") way to measure a string is to count the number of characters in the string, and this task is performed by the strlen() function, which takes just one parameter (the string), and returns the number of characters in it. It is so easy to use it barely merits an example, but just to make sure we're both reading from the same song sheet:

<?php
    print strlen("Foo") . "\n"; // 3
    print strlen("Goodbye, Perl!") . "\n"; // 14
?>

There is not much else to learn about strlen(): it is a very simple function, but a very useful one, and it is likely to crop up in many scripts that you write.
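One caveat is worth a quick sketch: strlen() counts bytes rather than characters, so the two numbers only agree for single-byte text. The example below is my own illustration and assumes the second string is UTF-8 encoded; the escaped bytes \xC3\xA9 are the UTF-8 encoding of "é".

```php
<?php
// Plain ASCII: one byte per character, so bytes and characters agree.
$ascii = "cafe";
print strlen($ascii) . "\n"; // 4

// Assumption: UTF-8 text. "\xC3\xA9" is the two-byte encoding of "é",
// so the four-character word "café" occupies five bytes.
$utf8 = "caf\xC3\xA9";
print strlen($utf8) . "\n"; // 5

// The empty string has a length of zero.
$empty = "";
print strlen($empty) . "\n"; // 0
```

If you need a character count for multibyte text, mb_strlen() from the mbstring extension is the usual tool, assuming that extension is installed.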

The other two functions, count_chars() and str_word_count(), measure the contents of a string in different ways: count_chars(), when given a string, returns an array containing the characters used in that string and how many times each one was used, whereas calling str_word_count() without any optional parameters returns the number of words used.

Using count_chars() is complicated somewhat by the fact that it actually returns an array of exactly 256 elements by default, one for every possible byte value from 0 to 255, with each key corresponding to an ASCII code. You can work around this by filtering through the array to remove items that have a value (frequency) of 0, or, alternatively, you can pass a second parameter to the function. If you pass 1, only characters with a frequency greater than 0 are listed; if you pass 2, only characters with a frequency equal to 0 are listed.
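As a minimal sketch of those modes, the following compares the default result with modes 1 and 2, and shows that filtering the default array by hand matches mode 1. The sample word "hello" and the variable names are my own choices for illustration.

```php
<?php
$word = "hello";

// Mode 0 (the default): one element for every byte value from 0 to 255.
$all = count_chars($word);
print count($all) . "\n"; // 256

// Mode 1: only the byte values that actually occur in the string.
$used = count_chars($word, 1);
print_r($used); // keys 101 (e), 104 (h), 108 (l), 111 (o); "l" has frequency 2

// Mode 2: only the byte values that never occur.
$unused = count_chars($word, 2);
print count($unused) . "\n"; // 252, i.e. 256 minus the 4 distinct bytes used

// Filtering the default array by hand gives the same result as mode 1.
// array_filter() without a callback drops elements equal to 0 and keeps keys.
$filtered = array_filter($all);
var_dump($filtered === $used); // bool(true)
```

Passing 1 is the simpler option in most scripts; the manual filter is shown only to make clear what that mode saves you from writing.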

Similarly, you can pass a second parameter to str_word_count() to make it do other things. By default, it just returns the number of words that were found in the string, counting a repeated word each time it appears. However, if you pass 1 as the second parameter it will return an array of the words found, and passing 2 does the same, except the key of each word will be set to the position at which that word was found inside the string.
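Here is a small sketch of the three return formats side by side. The sample sentence is my own, and the results assume the default locale, where only the letters A-Z and a-z count as alphabetic: apostrophes and hyphens inside a word are kept, while a standalone digit is not treated as a word.

```php
<?php
$text = "Don't count 2 well-known words twice";

// No second parameter: just the number of words.
$total = str_word_count($text);
print $total . "\n"; // 5 (the "2" is not counted)

// 1: a plain list of the words, in order.
$words = str_word_count($text, 1);
print_r($words); // Don't, count, well-known, words, twice

// 2: the same words, keyed by the offset at which each one starts.
$positions = str_word_count($text, 2);
print_r($positions); // keys 0, 6, 14, 25, 31
```

The offsets returned by the third call are zero-based, so they can be passed straight to functions such as substr() if you need to cut the original string at a word boundary.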

Here is an example of both functions in action:

<?php
    $str = "This is a test, only a test, and nothing but a test.";
    $a = count_chars($str, 1);
    $b = str_word_count($str, 1);
    $c = str_word_count($str, 2);
    $d = str_word_count($str);
    print_r($a);
    print_r($b);
    print_r($c);
    echo "There are $d words in the string\n";
?>

That should output the following (note that I have taken out much of the whitespace to save space):

Array ( [32] => 11 [44] => 2 [46] => 1 [84] => 1 [97] => 4 [98] => 1 [100] => 1 [101] => 3 [103] => 1 [104] => 2 [105] => 3 [108] => 1 [110] => 4 [111] => 2 [115] => 5 [116] => 8 [117] => 1 [121] => 1)
Array ( [0] => This [1] => is [2] => a [3] => test [4] => only [5] => a [6] => test [7] => and [8] => nothing [9] => but [10] => a [11] => test )
Array ( [0] => This [5] => is [8] => a [10] => test [16] => only [21] => a [23] => test [29] => and [33] => nothing [41] => but [45] => a [47] => test )
There are 12 words in the string

In the first printout, the numbers inside the square brackets (the array keys) are ASCII codes, and the other numbers (the array values) are the frequency of each character; for example, [32] => 11 means that the space character (ASCII 32) appears 11 times. In the second printout, the array keys are irrelevant, but the array values are the list of the words found. Note that the comma and full stop are not in there, as they are not considered words. In the third printout, the array keys mark where the first letter of the word in the value was found, thus "0" means "This" was found at the beginning of the string. The last printout shows the default word-counting behaviour of str_word_count().
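Because the ASCII codes in the first printout are not easy to read at a glance, one option is to re-key the array by the characters themselves using chr(). This is only a sketch of that idea; the $readable array is a name I have made up for the illustration.

```php
<?php
$str = "This is a test, only a test, and nothing but a test.";

// Re-key the mode 1 result by the character itself instead of its byte value.
$readable = array();
foreach (count_chars($str, 1) as $code => $frequency) {
    $readable[chr($code)] = $frequency;
}

print_r($readable);
print "Spaces: " . $readable[" "] . "\n"; // Spaces: 11
print "t: " . $readable["t"] . "\n";      // t: 8

// Every byte is counted exactly once, so the frequencies add up to strlen().
print array_sum($readable) . "\n"; // 52
```

One thing to watch for if you reuse this: PHP converts numeric string keys to integers, so a digit such as "5" in the input would end up under the integer key 5 rather than a string key.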






Comments from other readers
A PHP User - 06 Sep 2008

There are actually 8 t's in the string, as t = ASCII 116.
count_chars() does show 116 as having 8 occurrences.

deathgod - 06 Sep 2008

The PHP manual says that the
print_r($a) function prints the variable $a in a human-readable format,
i.e. it does nothing special except allow us to read the results more easily.

If you want a complete list of functions etc., I suggest downloading the PHP manual. It's only 2.9Mb.
Copy and paste this into your browser:
http://www.php.net/download-docs.php

It's quite useful for referencing, like the print_r() function here.

Gogo the Great - 06 Sep 2008

@A PHP User

quote:
"I am a little confused about the number count that count_chars() represents. for example, Why does it show that t appears 11 times. On manually counting the number of t's there are only 9. This just does not make sense to me..."

It doesn't show that "t" appears 11 times - it's the number of times space (ASCII value of 32) appears in the string. The "t" (ASCII value of 116) appears 8 times - just as is shown in the output array ;)

@Paul Hudson, the author
Paul, this is one of the best online manuals I have ever seen! Great job!

Unix Programmer - 06 Sep 2008

In the third paragraph, a part of a sentence goes "whereas calling str_word_count() without any parameters returns the number of words used.". I believe 'without any parameters' should be replaced with 'without any optional parameters'.

A PHP User - 06 Sep 2008

You should have explained the print_r function back when you were talking about superglobals such as $_SERVER.

A PHP User - 06 Sep 2008

I am a little confused about the number count that count_chars() represents. for example, Why does it show that t appears 11 times. On manually counting the number of t's there are only 9. This just does not make sense to me...

Semper Fi - 06 Sep 2008

Since this is a book, I would recommend explaining the "print_r" function. At first I was confused about its use, but it was a simple lookup in the PHP manual.

Great book by the way!

-D


