What is the significance of multiple results for some LexiStats requests?

When I use the API to make a LexiStats request for frequency data, I read the response from r.text. Under results there is typically one frequency result, but for certain words (e.g. 'the', 'today', 'New Zealand') there are multiple results in exactly the same format. Typically the first frequency is very low (ranging from 2 to 15) and obviously incorrect, while the second result appears correct, in the thousands or millions. Is this an error in the data, or is there some significance to the multiple results? Thanks

Comments

  • Simone (Administrator)

    Hi @efleming582
    Let me check with my colleagues, and I'll get back to you - bear with me, please!

  • Simone (Administrator)

    Hi @efleming582

    One of my colleagues has just got back to me about your question.
    Here is what he sent me:

    The API returns frequencies grouped by lexical category and grammatical features, so a single wordform can produce several results, one per group.

    • ‘frequency’ gives the number of appearances in the corpus.
    • ‘normalizedFrequency’ gives how often that specific word occurs on average per 1 million words.

    For instance:

    https://od-api.oxforddictionaries.com/api/v2/stats/frequency/words/en/?corpus=nmc&wordform=today&limit=100
    
    Lexical Category: Adverb
    
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 1664, "normalizedFrequency": 0.182546247200115, "lexicalCategory": "adverb", "grammaticalFeatures": {}, "firstMention": "2012-05-01T00:00:00", "components": "N/A", "type": "word" },
    
    Lexical Category: Noun + Plural
    
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 1, "normalizedFrequency": 0.000109703273558, "lexicalCategory": "noun", "grammaticalFeatures": { "numberType": "plural" }, "firstMention": "2013-02-01T00:00:00", "components": "N/A", "type": "word" },
    
    Lexical Category: Noun + Singular
    
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 554237, "normalizedFrequency": 60.80161322683295, "lexicalCategory": "noun", "grammaticalFeatures": { "numberType": "singular" }, "firstMention": "2008-09-01T00:00:00", "components": "N/A", "type": "word" }, 
    
    Lexical Category: Adverb + Temporal
    
    { "wordform": "today", "lemma": "today", "normalizedLemma": "today", "trueCase": "today", "frequency": 2316151, "normalizedFrequency": 254.08934675408236, "lexicalCategory": "adverb", "grammaticalFeatures": { "thematicRoleType": "temporal" }, "firstMention": "1960-12-01T00:00:00", "components": "N/A", "type": "word" }
    

    I hope that helps!
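To make the grouping concrete, here is a minimal Python sketch that reads a response body, labels each result by its lexical category and grammatical features, and lists the groups from most to least frequent. It uses a trimmed copy of the four 'today' results quoted above. The top-level "results" wrapper is assumed from the question's description, and the summarise helper and its label format are hypothetical, not part of the API.

```python
import json

# Trimmed copy of the four "today" results quoted above. The top-level
# "results" list is an assumption based on the question's description.
SAMPLE = """
{"results": [
  {"wordform": "today", "frequency": 1664,
   "normalizedFrequency": 0.182546247200115,
   "lexicalCategory": "adverb", "grammaticalFeatures": {}},
  {"wordform": "today", "frequency": 1,
   "normalizedFrequency": 0.000109703273558,
   "lexicalCategory": "noun", "grammaticalFeatures": {"numberType": "plural"}},
  {"wordform": "today", "frequency": 554237,
   "normalizedFrequency": 60.80161322683295,
   "lexicalCategory": "noun", "grammaticalFeatures": {"numberType": "singular"}},
  {"wordform": "today", "frequency": 2316151,
   "normalizedFrequency": 254.08934675408236,
   "lexicalCategory": "adverb",
   "grammaticalFeatures": {"thematicRoleType": "temporal"}}
]}
"""


def summarise(response_text):
    """Return (rows, total) where rows are (label, frequency, normalizedFrequency).

    Hypothetical helper: one row per lexical category / grammatical feature
    group, sorted from most to least frequent.
    """
    results = json.loads(response_text)["results"]
    rows = []
    for result in results:
        features = result.get("grammaticalFeatures") or {}
        label = result["lexicalCategory"]
        if features:
            label += " + " + ", ".join(sorted(features.values()))
        rows.append((label, result["frequency"], result["normalizedFrequency"]))
    rows.sort(key=lambda row: row[1], reverse=True)
    total = sum(frequency for _, frequency, _ in rows)
    return rows, total


rows, total = summarise(SAMPLE)
for label, frequency, normalized in rows:
    print(f"{label:20} {frequency:>9} {normalized:.4f}")
print("all groups combined:", total)
```

With a live request, the same function could be called on r.text. Whether to sum the groups or keep only the largest one depends on what you need; the answer above does not say which is intended, so treat the combined figure as one possible reading.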

  • Simone (Administrator)

    Hi @efleming582
    Just a quick note to say we haven't forgotten about this question - my colleagues are still looking into it, so bear with us, please!
