PHP English Word Counter

This article shares a PHP implementation of an English word-frequency counter for your reference. The details follow.

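Run the program, click the "Browse" button to select an English document, then click the "Statistics" button. The page then lists every word in alphabetical order together with the number of times it occurs.

Test data file: data.txt
Driver program: word.php
output.php and StringTokenizer.php must be in the same folder as word.php.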

1. words_statistics_PHP.png

2. word.php

<?php
require("StringTokenizer.php");
require("output.php");

// Run only after the form has been submitted.
if (isset($_POST['submit'])) {
    if ($_FILES["file"]["error"] > 0) {
        echo "Error: " . $_FILES["file"]["error"] . "<br />";
    } else {
        $myfile = fopen($_FILES["file"]["tmp_name"], "r") or die("Unable to open file!");
        $str = fread($myfile, filesize($_FILES["file"]["tmp_name"]));
        fclose($myfile);

        // Characters that separate words.
        $delim = "?\,. /:!\"()\t\n\r\f%";
        $st = new StringTokenizer($str, $delim);
        echo 'Strings found: ' . $st->countTokens();

        // Insert every token into the sorted list; a repeated word only
        // increments the counter of its existing node.
        $list = new LinkedList();
        while ($st->hasMoreTokens()) {
            $list->orderInsert($st->nextToken());
        }
        $list->words_count();
        $list->traversal();
    }
}
?>

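Statistics on English Words

Run the program, click the "Browse" button to select an English document, then click the "Statistics" button to get every word listed in alphabetical order with its number of occurrences.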

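<form action="word.php" method="post"
enctype="multipart/form-data">
<label for="file">File Name:</label>
<input type="file" name="file" id="file" />
<input type="submit" name="submit" value="Statistics" />
</form>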
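
As a cross-check for the linked-list approach, the same word/count listing can be sketched with PHP's built-in array functions. This is not the article's method: the function name `countWords` is hypothetical, and the regular expression is an assumption that simply mirrors the delimiter set passed to StringTokenizer in word.php.

```php
<?php
// Hypothetical helper (not part of the article's code): counts words with
// built-in functions instead of the linked list.
function countWords(string $text): array
{
    // Split on runs of the delimiter characters used in word.php:
    // ? \ , . space / : ! " ( ) tab LF CR FF %
    $words = preg_split('#[?\\\\,. /:!"()\t\n\r\f%]+#', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Map each distinct word to the number of times it occurs.
    $counts = array_count_values($words);

    // Sort by key as byte strings, the same ordering strcmp() gives.
    ksort($counts, SORT_STRING);
    return $counts;
}

print_r(countWords("the cat and the hat. The end!"));
```

For the sample sentence the result is The=1, and=1, cat=1, end=1, hat=1, the=2. Capitalised words sort first because the comparison is byte-wise, which is also how the list in output.php orders them.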

3. output.php

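<?php
class Node {
    public $data;       // the English word
    public $next;       // reference to the successor node
    public $frequency;  // number of times the word occurs

    function __construct($data, $next = null, $frequency = 1) {
        $this->data = $data;
        $this->next = $next;
        $this->frequency = $frequency;
    }
}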

class LinkedList {
    private $head; // head node of the singly linked list; it stores no data

    function __construct() { // constructor of the singly linked list
        // The head node holds a placeholder ("dummy") value that represents no data.
        $this->head = new Node("dummy");
    }

    function isEmpty() {
        return ($this->head->next == null);
    }
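    /* orderInsert($data)
     * Inserts the string $data at the position given by strcmp() order,
     * so that the strings stored in the list always stay sorted.
     * If the string is already present, only its frequency is incremented.
     */
    function orderInsert($data) {
        if ($this->isEmpty()) {
            $this->head->next = new Node($data);
            return;
        }
        $node = $this->find($data);
        if ($node) {
            $node->frequency++;
            return;
        }
        // Advance $q until its successor is the first node that does not sort before $data.
        $q = $this->head;
        while ($q->next != null && strcmp($data, $q->next->data) > 0) {
            $q = $q->next;
        }
        $p = new Node($data);
        $p->next = $q->next;
        $q->next = $p;
    }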

    function insertLast($data) { // append a string to the end of the list
        $p = new Node($data);
        if ($this->isEmpty()) {
            $this->head->next = $p;
        } else {
            $q = $this->head->next;
            while ($q->next != null) {
                $q = $q->next;
            }
            $q->next = $p;
        }
    }

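    function find($value) { // return the node holding $value, or null if it is absent
        for ($q = $this->head->next; $q !== null; $q = $q->next) {
            // strcmp() avoids the loose == comparison, which treats
            // numeric strings such as "1e1" and "10" as equal.
            if (strcmp($q->data, $value) == 0) {
                return $q;
            }
        }
        return null;
    }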

    function traversal() { // walk the list and print every word with its count
        if ($this->isEmpty()) {
            echo "The list is empty!";
            return;
        }
        echo "Results:<table><tr>";
        $n = 0;
        for ($p = $this->head->next; $p !== null; $p = $p->next) {
            // Escape the word because it comes from an uploaded file.
            echo "<td>" . htmlspecialchars($p->data) . "<br />Occurrences: " . $p->frequency . "</td>";
            $n++;
            if ($n % 11 == 0) echo "</tr><tr>"; // start a new row every 11 cells
        }
        echo "</tr></table>";
    }

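    function words_count() { // print the number of distinct words stored
        if ($this->isEmpty()) {
            echo "<br />No strings stored<br />";
            return;
        }
        // Count every node, starting with the first one.
        $counter = 0;
        for ($p = $this->head->next; $p !== null; $p = $p->next) {
            $counter++;
        }
        echo "Distinct words: " . $counter;
    }
}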
?>
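
The ordered insertion in output.php can also be written as a single pass, since the scan that locates the insert position already stops at the node where a duplicate would sit. The following is a minimal sketch under that assumption, not the article's code: `WordNode` and `insertSorted` are hypothetical names, and typed properties require PHP 7.4 or later.

```php
<?php
// Hypothetical, trimmed-down node type for illustration only.
final class WordNode
{
    public string $data;
    public int $frequency = 1;
    public ?WordNode $next = null;

    public function __construct(string $data)
    {
        $this->data = $data;
    }
}

// Inserts $word in strcmp() order, or increments the count if it is already
// present. $dummy is a head node that stores no word, as in LinkedList,
// which removes the empty-list special case.
function insertSorted(WordNode $dummy, string $word): void
{
    $q = $dummy;
    while ($q->next !== null && strcmp($word, $q->next->data) > 0) {
        $q = $q->next;
    }
    if ($q->next !== null && $q->next->data === $word) {
        $q->next->frequency++;   // duplicate: count it, do not insert
        return;
    }
    $node = new WordNode($word);
    $node->next = $q->next;      // splice in between $q and its successor
    $q->next = $node;
}

$dummy = new WordNode('dummy');
foreach (['pear', 'apple', 'pear', 'fig'] as $w) {
    insertSorted($dummy, $w);
}
for ($p = $dummy->next; $p !== null; $p = $p->next) {
    echo $p->data, ': ', $p->frequency, "\n";
}
// prints:
// apple: 1
// fig: 1
// pear: 2
```

Each insertion still walks the list from the head, so building the list costs time proportional to the square of the number of distinct words. That is acceptable for a short test document; an associative array scales better for large inputs.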

4. StringTokenizer.php

<?php

/**
 * The string tokenizer class allows an application to break a string into tokens.
 * @author Azeem Michael
 * @example The following is one example of the use of the tokenizer. The code:
 * <?php
 * $str = "this is:@\t\n a test!";
 * $delim = " !@:'\t\n\0"; // remove these chars
 * $st = new StringTokenizer($str,$delim);
 * echo 'Total tokens: '.$st->countTokens().'<br />';
 * while ($st->hasMoreTokens()) {
 *     echo $st->nextToken() . '<br />';
 * }
 * prints the following output:
 * Total tokens: 4
 * this
 * is
 * a
 * test
 * ?>
 */
class StringTokenizer {

    /** @var string */
    private $string;

    /** @var string */
    private $token;

    /** @var string */
    private $delim;

    /**
     * Constructs a string tokenizer for the specified string.
     * @param string $str String to tokenize
     * @param string $delim The set of delimiters (the characters that separate tokens)
     * specified at creation time, defaults to " \n\r\t\0"
     */
    public function __construct($str, $delim = " \n\r\t\0") {
        $this->string = $str;
        $this->delim = $delim;
        $this->token = strtok($str, $delim);
    }

    // No destructor is needed: PHP frees the object automatically, and
    // unset($this) is a compile-time fatal error as of PHP 7.1.

    /**
     * Calculates the number of times that this tokenizer's nextToken method can
     * be called before it generates an exception
     * @return int - number of tokens
     */
    public function countTokens() {
        $counter = 0;
        while ($this->hasMoreTokens()) {
            $counter++;
            $this->nextToken();
        }
        // Restart tokenization so the tokens can be read again after counting.
        $this->token = strtok($this->string, $this->delim);
        return $counter;
    }

    /**
     * Tests if there are more tokens available from this tokenizer's string. It
     * does not move the internal pointer in any way. To move the internal pointer
     * to the next element call nextToken()
     * @return boolean - true if has more tokens, false otherwise
     */
    public function hasMoreTokens() {
        return ($this->token !== false);
    }

    /**
     * Returns the next token from this string tokenizer and advances the internal
     * pointer by one.
     * @return string - next element in the tokenized string
     */
    public function nextToken() {
        $hold = $this->token;                // hold current pointer value
        $this->token = strtok($this->delim); // increment pointer
        return $hold;                        // return current pointer value
    }
}
?>
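
StringTokenizer is a thin wrapper around PHP's `strtok()`, which keeps one shared internal position for the whole script, so two tokenizers cannot safely be read in alternation. One way around that, sketched here with a hypothetical helper named `tokenize` (it is not part of the original class), is to drain `strtok()` into an array in a single call:

```php
<?php
// Hypothetical helper: collects all tokens at once so that no strtok()
// state is left in use after the function returns.
function tokenize(string $str, string $delim = " \n\r\t\0"): array
{
    $tokens = [];
    // The first call passes the string; later calls pass only the delimiters.
    for ($tok = strtok($str, $delim); $tok !== false; $tok = strtok($delim)) {
        $tokens[] = $tok;
    }
    return $tokens;
}

// Same input as the example in the class comment.
$words = tokenize("this is:@\t\n a test!", " !@:'\t\n\0");
echo 'Total tokens: ', count($words), "\n";
echo implode("\n", $words), "\n";
// prints:
// Total tokens: 4
// this
// is
// a
// test
```

With an array, the token count is simply `count($words)`, so there is no need to tokenize the string twice as `countTokens()` does.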

That is all for this article. We hope it helps with your studies, and we hope you will continue to support 编程之家.
