When I use the API to make a LexiStats request for frequency data, I get the response back as r.text. Typically there is one frequency result under results, but for certain words (e.g. 'the', 'today', 'New Zealand') there are multiple results in exactly the same format. Typically the first frequency is very low (between 2 and 15) and obviously incorrect, while the second result looks correct, in the thousands or millions. Is this an error in the data, or is there some significance to the multiple results? Thanks


    Hi @efleming582
    Please bear with me while I look into this!

    Hi @efleming582

    One of my colleagues has just got back to me about your question.
    Here is what he sent me:

    The API returns a separate frequency result for each combination of lexical category and grammatical features, so a word that can be tagged in more than one way gets more than one result.

    • ‘frequency’ gives the number of times the word appears in the corpus
    • ‘normalizedFrequency’ gives how many times the word occurs, on average, per 1 million words.

    For instance:

    Lexical Category: Adverb
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 1664, "normalizedFrequency": 0.182546247200115, "lexicalCategory": "adverb", "grammaticalFeatures": {}, "firstMention": "2012-05-01T00:00:00", "components": "N/A", "type": "word" },
    Lexical Category: Noun + Plural
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 1, "normalizedFrequency": 0.000109703273558, "lexicalCategory": "noun", "grammaticalFeatures": { "numberType": "plural" }, "firstMention": "2013-02-01T00:00:00", "components": "N/A", "type": "word" },
    Lexical Category: Noun + Singular
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 554237, "normalizedFrequency": 60.80161322683295, "lexicalCategory": "noun", "grammaticalFeatures": { "numberType": "singular" }, "firstMention": "2008-09-01T00:00:00", "components": "N/A", "type": "word" }, 
    Lexical Category: Adverb + Temporal
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 2316151, "normalizedFrequency": 254.08934675408236, "lexicalCategory": "adverb", "grammaticalFeatures": { "thematicRoleType": "temporal" }, "firstMention": "1960-12-01T00:00:00", "components": "N/A", "type": "word" }

    I hope that helps!
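
If you want a single figure per word rather than one per category, one option is to add the results up yourself. Below is a minimal sketch in Python using the "today" figures quoted above. It assumes the response body is JSON with the list under a top-level "results" key (as the question suggests); the function names are made up for illustration and are not part of the API.

```python
import json
from collections import defaultdict

# Trimmed copy of the "today" results quoted above. The top-level
# "results" key is an assumption; adjust it to match your real response.
response_text = json.dumps({"results": [
    {"wordform": "today", "frequency": 1664,
     "normalizedFrequency": 0.182546247200115,
     "lexicalCategory": "adverb", "grammaticalFeatures": {}},
    {"wordform": "today", "frequency": 1,
     "normalizedFrequency": 0.000109703273558,
     "lexicalCategory": "noun",
     "grammaticalFeatures": {"numberType": "plural"}},
    {"wordform": "today", "frequency": 554237,
     "normalizedFrequency": 60.80161322683295,
     "lexicalCategory": "noun",
     "grammaticalFeatures": {"numberType": "singular"}},
    {"wordform": "today", "frequency": 2316151,
     "normalizedFrequency": 254.08934675408236,
     "lexicalCategory": "adverb",
     "grammaticalFeatures": {"thematicRoleType": "temporal"}},
]})


def total_frequency(text):
    """Sum the raw and per-million frequencies over every result."""
    results = json.loads(text)["results"]
    total = sum(r["frequency"] for r in results)
    per_million = sum(r["normalizedFrequency"] for r in results)
    return total, per_million


def frequency_by_category(text):
    """Sum the raw frequency separately for each lexical category."""
    totals = defaultdict(int)
    for r in json.loads(text)["results"]:
        totals[r["lexicalCategory"]] += r["frequency"]
    return dict(totals)


total, per_million = total_frequency(response_text)
print(total)                                  # 2872053
print(frequency_by_category(response_text))   # {'adverb': 2317815, 'noun': 554238}
```

With a real request you would pass r.text in place of response_text. Very small counts such as the single plural-noun hit are then simply absorbed into the total instead of appearing as a suspicious first result.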

    Hi @efleming582
