str_word_count

(PHP 4 >= 4.3.0, PHP 5, PHP 7, PHP 8)

str_word_count - Returns information about the words used in a string

Description

str_word_count(string $string, int $format = 0, ?string $characters = null): array|int

Counts the words in string. If the optional format argument is not specified, the return value is an integer giving the number of words found. If format is specified, the return value is an array whose contents depend on the chosen format. The accepted format values and their results are listed below.

For the purposes of this function, a 'word' is a locale-dependent string of alphabetic characters that may also contain, but not start with, the "'" and "-" characters. Note: multibyte locales are not supported.
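A minimal sketch of the word definition above: apostrophes and hyphens inside a word do not split it. Because multibyte locales are not supported, the second half shows one possible UTF-8 workaround built on PCRE Unicode properties. The helper name utf8_word_count() is hypothetical and not part of PHP, and its idea of a "word" (letters and digits, with inner apostrophes and hyphens) is an assumption made for this example.

```php
<?php
// Apostrophes and hyphens inside a word are kept as part of the word.
$words = str_word_count("you're well-known here", 1);
print_r($words);

// Hypothetical helper (not part of PHP): counts words in UTF-8 text.
function utf8_word_count(string $text): int
{
    // \p{L} matches any letter and \p{N} any number; the u modifier
    // makes PCRE treat the pattern and the subject as UTF-8.
    $found = preg_match_all('/[\p{L}\p{N}][\p{L}\p{N}\'-]*/u', $text, $matches);

    // preg_match_all() returns false on error (e.g. invalid UTF-8).
    return $found === false ? 0 : $found;
}

echo utf8_word_count("Not wôrking, naïve café"), "\n";
```

The pattern requires a letter or digit as the first character, mirroring the rule that a word may not start with "'" or "-".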

Parameters

string

The string whose words are to be examined.

format

Specifies what this function returns. Supported values:

  • 0 - Returns the number of words found.
  • 1 - Returns an array containing all the words found inside string.
  • 2 - Returns an associative array whose keys are the positions of the words inside string and whose values are the words themselves.

characters

A list of additional characters that are to be treated as word characters.
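As a small illustration of the third argument, the sketch below compares the default behaviour with a call that also accepts ASCII digits as word characters. The '0..9' range notation is the one described in the user notes on this page; the sample sentence and variable names are made up for the example.

```php
<?php
$text = "PHP 8 has 3 new words";

// Default: digits are not word characters, so "8" and "3" are skipped.
$default = str_word_count($text, 1);

// With '0..9' every ASCII digit is treated as a word character.
$withDigits = str_word_count($text, 1, '0..9');

print_r($default);
print_r($withDigits);
```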

Return Values

Returns an integer or an array, depending on the chosen format.
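As a rough illustration of how the return type follows the second argument, the sketch below calls the function once for each supported value. The sample sentence and variable names are made up for the example.

```php
<?php
$text = "Hello brave new world";

// 0 (the default): an int holding the number of words.
$count = str_word_count($text);

// 1: a list of the words in order of appearance.
$list = str_word_count($text, 1);

// 2: an array keyed by the byte offset at which each word starts.
$byOffset = str_word_count($text, 2);

var_dump($count);
print_r($list);
print_r($byOffset);

// Each key can be passed back to substr() to recover the word.
foreach ($byOffset as $offset => $word) {
    echo $offset, ': ', substr($text, $offset, strlen($word)), "\n";
}
```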

Changelog

Version: Description
8.0.0 characters is now nullable.

Examples

Example 1 - str_word_count() example

<?php

$str = "Hello fri3nd, you're
       looking          good today!";

print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));

echo str_word_count($str);

?>

The above example will output:

Array
(
    [0] => Hello
    [1] => fri
    [2] => nd
    [3] => you're
    [4] => looking
    [5] => good
    [6] => today
)

Array
(
    [0] => Hello
    [6] => fri
    [10] => nd
    [14] => you're
    [29] => looking
    [46] => good
    [51] => today
)

Array
(
    [0] => Hello
    [1] => fri3nd
    [2] => you're
    [3] => looking
    [4] => good
    [5] => today
)

7

See Also

  • explode() - Split a string by a separator into an array
  • preg_split() - Split a string by a regular expression
  • count_chars() - Return information about the characters used in a string
  • substr_count() - Count the number of occurrences of a substring within a string


User Contributed Notes 11 notes

cito at wikatu dot com
14 years ago
<?php

/***
 * This simple utf-8 word count function (it only counts)
 * is a bit faster than the one with preg_match_all
 * about 10x slower than the built-in str_word_count
 * 
 * If you need the hyphen or other code points as word-characters
 * just put them into the [brackets] like [^\p{L}\p{N}\'\-]
 * If the pattern contains utf-8, utf8_encode() the pattern,
 * as it is expected to be valid utf-8 (using the u modifier).
 **/

// Jonny 5's simple word splitter
function str_word_count_utf8($str) {
  return count(preg_split('~[^\p{L}\p{N}\']+~u',$str));
}
?>
splogamurugan at gmail dot com
17 years ago
We can also specify a range of values for charlist.

<?php
$str = "Hello fri3nd, you're
       looking          good today! 
       look1234ing";
print_r(str_word_count($str, 1, '0..3'));
?>

will give the result as 

Array ( [0] => Hello [1] => fri3nd [2] => you're [3] => looking [4] => good [5] => today [6] => look123 [7] => ing )
Adeel Khan
18 years ago
<?php

/**
 * Returns the number of words in a string.
 * As far as I have tested, it is very accurate.
 * The string can have HTML in it,
 * but you should do something like this first:
 *
 *    $search = array(
 *      '@<script[^>]*?>.*?</script>@si',
 *      '@<style[^>]*?>.*?</style>@siU',
 *      '@<![\s\S]*?--[ \t\n\r]*>@'
 *    );
 *    $html = preg_replace($search, '', $html);
 *
 */

function word_count($html) {

  # strip all html tags
  $wc = strip_tags($html);

  # remove 'words' that don't consist of alphanumerical characters or punctuation
  $pattern = "#[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]+#";
  $wc = trim(preg_replace($pattern, " ", $wc));

  # remove one-letter 'words' that consist only of punctuation
  $wc = trim(preg_replace("#\s*[(\'|\"|\.|\!|\?|;|,|\\|\/|\-|:|\&|@)]\s*#", " ", $wc));

  # remove superfluous whitespace
  $wc = preg_replace("/\s\s+/", " ", $wc);

  # split string into an array of words
  $wc = explode(" ", $wc);

  # remove empty elements
  $wc = array_filter($wc);

  # return the number of words
  return count($wc);

}

?>
manrash at gmail dot com
17 years ago
For Spanish speakers, a valid character map may be:

<?php
$characterMap = 'áéíóúüñ';

$count = str_word_count($text, 0, $characterMap);
?>
uri at speedy dot net
13 years ago
Here is a count words function which supports UTF-8 and Hebrew. I tried other functions but they don't work. Notice that in Hebrew, '"' and '\'' can be used in words, so they are not separators. This function is not perfect, I would prefer a function we are using in JavaScript which considers all characters except [a-zA-Zא-ת0-9_\'\"] as separators, but I don't know how to do it in PHP.

I removed some of the separators which don't work well with Hebrew ("\x20", "\xA0", "\x0A", "\x0D", "\x09", "\x0B", "\x2E"). I also removed the underline.

This is a fix to my previous post on this page - I found out that my function returned an incorrect result for an empty string. I corrected it and I'm also attaching another function - my_strlen.

<?php 

function count_words($string) {
    // Return the number of words in a string.
    $string= str_replace("&#039;", "'", $string);
    $t= array(' ', "\t", '=', '+', '-', '*', '/', '\\', ',', '.', ';', ':', '[', ']', '{', '}', '(', ')', '<', '>', '&', '%', '$', '@', '#', '^', '!', '?', '~'); // separators
    $string= str_replace($t, " ", $string);
    $string= trim(preg_replace("/\s+/", " ", $string));
    $num= 0;
    if (my_strlen($string)>0) {
        $word_array= explode(" ", $string);
        $num= count($word_array);
    }
    return $num;
}

function my_strlen($s) {
    // Return mb_strlen with encoding UTF-8.
    return mb_strlen($s, "UTF-8");
}

?>
brettNOSPAM at olwm dot NO_SPAM dot com
23 years ago
This example may not be pretty, but it proves accurate:

<?php
//count words
$words_to_count = strip_tags($body);
$pattern = "/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$words_to_count = preg_replace ($pattern, " ", $words_to_count);
$words_to_count = trim($words_to_count);
$total_words = count(explode(" ",$words_to_count));
?>

Hope I didn't miss any punctuation. ;-)
php dot net at salagir dot com
8 years ago
This function doesn't handle accents, even in a locale with accents.
<?php
echo str_word_count("Is working"); // =2

setlocale(LC_ALL, 'fr_FR.utf8');
echo str_word_count("Not wôrking"); // expects 2, got 3.
?>

Cito's solution treats punctuation as words and thus isn't a good workaround.
<?php
function str_word_count_utf8($str) {
      return count(preg_split('~[^\p{L}\p{N}\']+~u',$str));
}
echo str_word_count_utf8("Is wôrking"); //=2
echo str_word_count_utf8("Not wôrking."); //=3
?>

My solution:
<?php
function str_word_count_utf8($str) {
    $a = preg_split('/\W+/u', $str, -1, PREG_SPLIT_NO_EMPTY);
    return count($a);
}
echo str_word_count_utf8("Is wôrking"); // = 2
echo str_word_count_utf8("Is wôrking! :)"); // = 2
?>
dmVuY2lAc3RyYWhvdG5pLmNvbQ== (base64)
15 years ago
To count words after converting an MS Word document to plain text with antiword, you can use this function:

<?php
function count_words($text) {
    $text = str_replace(str_split('|'), '', $text); // remove these chars (you can specify more)
    $text = trim(preg_replace('/\s+/', ' ', $text)); // remove extra spaces
    $text = preg_replace('/-{2,}/', '', $text); // remove 2 or more dashes in a row
    $len = strlen($text);
    
    if (0 === $len) {
        return 0;
    }
    
    $words = 1;
    
    while ($len--) {
        if (' ' === $text[$len]) {
            ++$words;
        }
    }
    
    return $words;
}
?>

It strips the pipe "|" characters, which antiword uses to format tables in its plain-text output, removes runs of two or more dashes (also used in tables), then counts the words.

Counting words using explode() and then count() is not a good idea for huge texts, because it uses a lot of memory to store the text a second time as an array. This is why I'm using while() { .. } to walk the string.
brettz9 - see yahoo
15 years ago
Words also cannot end in a hyphen unless allowed by the charlist...
charliefrancis at gmail dot com
16 years ago
Hi this is the first time I have posted on the php manual, I hope some of you will like this little function I wrote.

It returns a string with a certain character limit, but still retaining whole words.
It breaks out of the foreach loop once it has found a string short enough to display, and the character list can be edited.

<?php
function word_limiter( $text, $limit = 30, $chars = '0123456789' ) {
    if( strlen( $text ) > $limit ) {
        $words = str_word_count( $text, 2, $chars );
        $words = array_reverse( $words, TRUE );
        foreach( $words as $length => $word ) {
            if( $length + strlen( $word ) >= $limit ) {
                array_shift( $words );
            } else {
                break;
            }
        }
        $words = array_reverse( $words );
        $text = implode( " ", $words ) . '&hellip;';
    }
    return $text;
}

$str = "Hello this is a list of words that is too long";
echo '1: ' . word_limiter( $str );
$str = "Hello this is a list of words";
echo '2: ' . word_limiter( $str );
?>

1: Hello this is a list of words&hellip;
2: Hello this is a list of words
MadCoder
20 years ago
Here's a function that will trim a $string down to a certain number of words and add "..." to the end of it.
(expansion of muz1's 1st 100 words code)

----------------------------------------------
<?php
function trim_text($text, $count) {
    // Collapse runs of whitespace so explode() yields no empty entries.
    $text = trim(preg_replace('/\s+/', ' ', $text));
    $words = explode(' ', $text);

    // Nothing to cut if the text already fits within $count words.
    if (count($words) <= $count) {
        return $text;
    }

    // Keep the first $count words and mark the cut with an ellipsis.
    return implode(' ', array_slice($words, 0, $count)) . '...';
}
?>

Usage
------------------------------------------------
<?php
$string = "one two three four";
echo trim_text($string, 3);
?>

returns:
one two three...