cí pín
  • word frequency
  1. 这个程序可给我们显示词频。

    The program can show us word frequency.

  2. 提出利用相对词频(RelativeWordFrequency,RWF)来评估词语之间搭配强度。

    The concept of relative word frequency (RWF) is presented for evaluating the strength of word collocation.

  3. 基于主题词频和g指数的研究热点分析方法

    Research Focus Analysis Based on the Frequency of Topic Words and G-index

  4. 基于域加权词频法的XML文档级检索实现与评价

    Implementation and Evaluation of XML Document-Level Retrieval Based on Field-Weighted Term Frequency

  5. 对自由性产出词汇量的测量,则使用Nation的词频统计软件。

    Free productive vocabulary size was measured with Nation's word frequency software.

  6. 关键词权值的计算除了词频、位置,也包括了HTML标签。

    Keyword weights are calculated from HTML tags as well as word frequency and position.

  7. 为了解决这个问题,将领域知识引入了Email的特征表示,并在此基础上提出了一种综合领域知识和词频的特征表示方法,用于Email分类。

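    To solve this problem, domain knowledge is introduced into Email feature representation, and on this basis a feature representation method combining domain knowledge and word frequency is proposed for Email classification.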

  8. 该系统采用了一种新的基于少量Web示例网页和语料库词频统计的特征抽取算法和过滤阈值设定方法。

    This system adopts a new feature extraction algorithm and a new method for setting the filtering threshold, both based on a small number of sample Web pages and corpus term-frequency statistics.

  9. 目前Web挖掘技术中,特别是Web文本的分类、聚类,采用的核心算法是基于词频统计的矢量空间模型算法。

    In current Web mining technology, especially the classification and clustering of Web text, the core algorithm is the Vector Space Model (VSM) algorithm based on word frequency statistics.

  10. 在统计词频方面,本文使用了改进的K均值方法对参数进行估计,并采用线性差值法对参数结果进行平滑处理。

    For word frequency statistics, this paper uses an improved K-means method to estimate the parameters and linear interpolation to smooth the parameter estimates.

  11. 该方法是在词频特征的基础上加入人工总结出的领域特征,从而更能准确地表示Email的主要内容,以提高Email分类的平均F-score。

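    This method adds manually summarized domain features to word frequency features, so that the main content of an Email is represented more accurately and the average F-score of Email classification improves.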

  12. 本文在对90年代以来的《科学文摘》C辑中有关计算机情报分类统计和主题词词频的统计分析基础上,对国外计算机情报检索的热点及部分前沿课题进行了探讨。

    Based on a statistical analysis of the classification counts and subject-term frequencies for computer information science in Science Abstracts, Series C, since the 1990s, this paper discusses hot topics and some frontier issues in computer information retrieval abroad.

  13. 词频统计是中国汉语水平考试(HSK)的一个重要特征。

    Word frequency statistics are an important feature of the Chinese Proficiency Test (HSK).

  14. 通过利用基于词频的权值计算,同时改进传统文本相似度计算概率模型,改进SVM算法实现邮件过滤系统。

    An email filtering system is implemented by using term-frequency-based weighting, improving the traditional probabilistic model for text similarity calculation, and improving the SVM algorithm.

  15. 本文将以具有确定分类标准的短文本分类为应用背景,利用基于统计学方法对短文本进行词频统计,同时利用基于支持向量机(SVM)的分类技术,评价短文本与类别的相关关系。

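    Taking the classification of short texts under well-defined classification criteria as the application background, this paper uses statistical methods to compute word frequencies for short texts and uses Support Vector Machine (SVM) classification to evaluate the correlation between short texts and categories.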

  16. 词频和年级对FOK判断的影响

    Influence of Word Frequency and Grade Level on FOK Judgments

  17. PageRank算法和Hits算法以及词频位置加权算法是研究的重点。

    This study focuses on the PageRank algorithm, the HITS algorithm, and the word frequency and position weighting algorithm.

  18. 二是鉴于词频的特征表示方法难以准确表示Email主要内容,因此将领域知识引入Email特征表示中,并在此基础上提出了一种综合领域知识和词频的特征表示方法,用于Email分类。

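    Second, because feature representation based on word frequency cannot accurately represent the main content of an Email, domain knowledge is introduced into Email feature representation, and on this basis a feature representation method combining domain knowledge and word frequency is proposed for Email classification.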

  19. 在系统中,通过提取文档集的词频、文档标引源位置特性及本体关系距离等初集本体特征,作为样本集,并采用BP神经网络预测出文档集内文档的分级排序。

    In the system, initial ontology features of the document set, such as word frequency, the position of document indexing sources, and ontology relation distance, are extracted as the sample set, and a Back-Propagation Neural Network (BPNN) is used to predict the graded ranking of documents in the set.

  20. 基于词频差异的特征选取及改进的TF-IDF公式

    Feature selection based on word frequency differences and an improved TF-IDF formula

  21. 同时对于提高信息检索的查全率和查准率问题,本文将基于用户行为的查询扩展方法和基于词频统计的查询扩展相结合,设计了一个基于Web语义的查询扩展结构。

    To improve the recall and precision of information retrieval, this paper combines query expansion based on user behavior with query expansion based on word frequency statistics, and designs a query expansion architecture based on Web semantics.

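  22. 基于HNC理论的一种词汇歧义消解规则3、在词汇歧义消解过程的第一个阶段(歧义词意义通达阶段),只有词频因素起作用,语境因素不起作用。

    A rule for lexical disambiguation based on HNC theory. 3. In the first stage of lexical disambiguation, the stage of accessing the meaning of the ambiguous word, only word frequency plays a role; context does not.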

  23. 此外,还用词频分析方法分析了中德合著SCI论文的选题,定位了双方共同的研究兴趣和研究热点。

    In addition, using WordSmith software, we analyzed the frequency of the words in the titles of 7985 papers and identified the common research interests and hot topics of both sides.

  24. 国内外信息检索研究热点分析&基于Z-Score标准化的词频

    Analysis of Research Hotspots in Domestic and Foreign Information Retrieval: Based on Z-Score Standardized Word Frequency

  25. 中文任务词频效应出现在左侧Broca区;英文任务在双侧Broca区。

    The word frequency effect appeared in the left Broca's area for the Chinese task and in the bilateral Broca's areas for the English task.

  26. 分析了邮件特征域在邮件主题表达力方面的重要作用,给出了基于特征域词频TF的权值计算方法,并改进了传统的文本相似度计算概率模型。

    By analyzing the important role of Email feature fields in expressing the topic of a message, this paper presents a weighting method based on the term frequency (TF) of feature fields and improves the traditional probabilistic model for text similarity calculation.

  27. 该方法利用平均信息量和词频-逆向文件频率(tf-idf)分析并重构字典中原子携带的区别性信息。

    The method uses information entropy and term frequency-inverse document frequency (tf-idf) to analyze and reconstruct the discriminative information carried by the atoms in the dictionary.

  28. 另外,为了提高词频计算的精度,SBGA采用了一种改进的词频计算方法TFS,将加权后词的同义词频率加到了原词频中。

    In addition, to improve the accuracy of term frequency calculation, SBGA uses an improved term frequency method, TFS, which adds the weighted frequencies of a word's synonyms to the word's original frequency.

  29. 本文就目前基于词频表示的中文文本分类实验结果缺乏可比性的情况,结合中文文本分类的字/词频特征向量和经典文本分类方法如TFIDF/Rocchio、Na(?)

    Because previous experimental results on Chinese text classification with word-frequency-based representation lack comparability, this paper proposes a controlled comparative experiment that combines character/word frequency feature vectors with classical classifiers such as TFIDF/Rocchio, Naive Bayes, Support.

  30. 该方法将文档中词频出现两次以上的词条作为文档的摘要信息,来表示节点文档内容,然后根据改进的STC算法为选出的词条建立了一个树状的层次结构。

    In this method, words that appear more than twice in a document are taken as its summary to represent the content of the node's documents, and an improved STC algorithm is then used to build a tree-like hierarchical structure from the selected words.