Object

org.nlp4l.lucene.stats

WordCounts

Related Doc: package stats

object WordCounts

Utility object for counting the frequency of words in the index under various criteria.

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def count(reader: RawReader, field: String, words: Set[String], docSet: Set[Int], maxWords: Int, analyzer: Analyzer): Map[String, Long]

    Counts the frequencies of words in the index using the given org.nlp4l.lucene.analysis.Analyzer.

    reader

    the RawReader instance

    field

    the field name for counting words

    words

    the set of words to be counted. All words are returned if an empty set is given.

    docSet

    the index subset (set of document ids) for counting. If an empty set is given, word frequencies are summed over the whole index.

    maxWords

    the maximum number of words to return. If a positive integer is given, only the most frequent words are returned; otherwise all words are returned. Default is -1 (all words are returned).

    analyzer

    the Analyzer used to re-analyze field values for counting. It is used when no term vector is available for the given documents or field.

    returns

    a Map from each word to its frequency.

  7. def count(reader: IReader, field: String, words: Set[String], docSet: Set[Int], maxWords: Int = -1): Map[String, Long]

    Counts the frequencies of words in the index using the org.nlp4l.lucene.Schema in the given IReader.

    reader

    the IReader instance

    field

    the field name for counting words

    words

    the set of words to be counted. All words are returned if an empty set is given.

    docSet

    the index subset (set of document ids) for counting.

    maxWords

    the maximum number of words to return. If a positive integer is given, only the most frequent words are returned; otherwise all words are returned. Default is -1 (all words are returned).

    returns

    a Map from each word to its frequency.

  8. def countDF(reader: RawReader, field: String, words: Set[String], maxWords: Int = -1): Map[String, Long]

    Counts the document frequencies of words in the whole index.

    reader

    the RawReader instance

    field

    the field name for counting words

    words

    the set of words to be counted. All words are returned if an empty set is given.

    maxWords

    the maximum number of words to return. If a positive integer is given, only the most frequent words are returned; otherwise all words are returned. Default is -1 (all words are returned).

  9. def countPrefix(reader: RawReader, field: String, prefix: String): Long

  10. def countWords(text: String, words: Set[String], analyzer: Analyzer): Map[String, Long]

  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  16. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  17. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. final def notify(): Unit

    Definition Classes
    AnyRef
  19. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  20. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  21. def toString(): String

    Definition Classes
    AnyRef → Any
  22. def totalCount(reader: RawReader, field: String, docSet: Set[Int], analyzer: Analyzer): Long

    Counts the total frequency of all words in the index using the given org.nlp4l.lucene.analysis.Analyzer.

    reader

    the RawReader instance

    field

    the field name for counting words

    docSet

    the index subset (set of document ids) for counting.

    analyzer

    the Analyzer used to re-analyze field values for counting. It is used when no term vector is available for the given documents or field.

    returns

    the total frequency of all words, as a Long.

  23. def totalCount(reader: IReader, field: String, docSet: Set[Int]): Long

    Counts the total frequency of all words in the index using the org.nlp4l.lucene.Schema in the given IReader.

    reader

    the IReader instance

    field

    the field name for counting words

    docSet

    the index subset (set of document ids) for counting.

    returns

    the total frequency of all words, as a Long.

  24. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
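
The parameter conventions documented for count, countDF and totalCount (an empty words set selects every word, an empty docSet selects the whole index, a non-positive maxWords returns all words) can be illustrated with a small self-contained sketch. It does not call NLP4L and is not the library's implementation: the object WordCountsSketch, the in-memory docs map and the helper names are hypothetical, introduced here only for illustration, and the alphabetical tie-break when truncating to maxWords is an assumption that the documentation above does not specify.

```scala
// Illustrative model only: a toy in-memory "index" standing in for a real
// reader. None of these names belong to NLP4L.
object WordCountsSketch {

  // Hypothetical data: document id -> tokens of a single field.
  val docs: Map[Int, Seq[String]] = Map(
    0 -> Seq("the", "cat", "sat", "the"),
    1 -> Seq("the", "dog"),
    2 -> Seq("cat")
  )

  // Token lists of the selected documents; an empty docSet means "all documents".
  private def selectedDocs(docSet: Set[Int]): Seq[Seq[String]] =
    docs.toSeq.collect { case (id, tokens) if docSet.isEmpty || docSet(id) => tokens }

  // Occurrence counts per word; an empty words set means "all words".
  private def frequencies(tokens: Seq[String], words: Set[String]): Map[String, Long] =
    tokens
      .filter(w => words.isEmpty || words(w))
      .groupBy(identity)
      .map { case (w, ws) => w -> ws.size.toLong }

  // Keep the n most frequent words if n is positive, otherwise keep everything.
  // Assumption: ties are broken alphabetically.
  private def topN(counts: Map[String, Long], n: Int): Map[String, Long] =
    if (n > 0) counts.toSeq.sortBy { case (w, c) => (-c, w) }.take(n).toMap
    else counts

  // Term frequency: total occurrences of each word in the selected documents.
  def count(words: Set[String], docSet: Set[Int], maxWords: Int = -1): Map[String, Long] =
    topN(frequencies(selectedDocs(docSet).flatten, words), maxWords)

  // Document frequency: number of documents in the whole index containing each word.
  def countDF(words: Set[String], maxWords: Int = -1): Map[String, Long] =
    topN(frequencies(docs.values.toSeq.flatMap(_.distinct), words), maxWords)

  // Total number of word occurrences in the selected documents, as a single Long.
  def totalCount(docSet: Set[Int]): Long =
    count(Set.empty, docSet).values.sum

  def main(args: Array[String]): Unit = {
    println(count(Set.empty, Set.empty))        // every word, whole index
    println(count(Set("cat", "dog"), Set(0, 1))) // two words, documents 0 and 1 only
    println(count(Set.empty, Set.empty, 2))     // the two most frequent words
    println(countDF(Set.empty))                 // documents per word
    println(totalCount(Set.empty))              // all occurrences in the index
  }
}
```

In this model "the" occurs three times but appears in only two documents, which is the difference between the values returned by count and by countDF. totalCount collapses the per-word map into one number, which matches its Long return type in the signatures above.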
