Exact number of data set entries offered by the API

shahood Member

Hi @Simone

Can you please provide the exact number of each of the following offered by this API:

  • Headwords
  • Definition entries
  • Thesaurus entries
  • Example sentences
  • Phrases

I need to display this information on the introduction page of my Android app.

Thanks

Comments

  • AmosDuveen Member, Administrator, Moderator
    edited February 6

    Hi @shahood,

    I wish it were that simple!

    We probably ought to do an audit so that we can produce those stats, for our own purposes as much as anything else. The problem is that ODAPI extracts data from several datasets and also manipulates them to improve the user experience. So, for example, the wordlist endpoint tells me that there are:

    • GB English: 230,000 headwords
    • US English: 220,000 headwords

    However, that doesn't tell you how many of those entries overlap (which will be a very significant proportion). Furthermore, while I have access to the source data for all of these outputs, it bears very little relation to the way the data is actually delivered by the API. For illustration, my source data only has:

    • GB English: 110,000 headwords
    • US English: 100,000 headwords

    These are the same datasets, but ODAPI has reorganized the data to make content more discoverable. One simple example is that all the subentries in the source data have been promoted to full entries in ODAPI, which means that the content remains identical but, suddenly, you have a lot more headwords.
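
    As a rough illustration of the two effects described above (overlap between wordlists, and subentries promoted to full entries), here is a minimal Python sketch. The entry layout, the function names `count_headwords` and `distinct_across`, and the toy data are all hypothetical; they are not ODAPI's actual data model or output.

```python
def count_headwords(entries, promote_subentries=False):
    """Count distinct headwords in a list of entry dicts.

    Each entry is assumed (hypothetically) to look like
    {"headword": str, "subentries": [str, ...]}.
    """
    headwords = set()
    for entry in entries:
        headwords.add(entry["headword"])
        if promote_subentries:
            # Treat every subentry as a full entry in its own right.
            headwords.update(entry.get("subentries", []))
    return len(headwords)


def distinct_across(*wordlists):
    """Count headwords across several wordlists, counting overlaps once."""
    return len(set().union(*wordlists))


# Toy data only; not taken from any real dataset.
source_entries = [
    {"headword": "break", "subentries": ["break down", "break even"]},
    {"headword": "colour", "subentries": []},
]
gb_wordlist = {"colour", "break", "lorry"}
us_wordlist = {"color", "break", "truck"}

print(count_headwords(source_entries))                           # source view
print(count_headwords(source_entries, promote_subentries=True))  # promoted view
print(len(gb_wordlist) + len(us_wordlist))                       # naive sum
print(distinct_across(gb_wordlist, us_wordlist))                 # overlap counted once
```

    The same content yields a higher headword count once subentries are promoted, and adding per-language totals together overstates the number of distinct headwords wherever the lists overlap.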

    Unfortunately, the thesaurus and sentence endpoints do not output headword lists, so I cannot give you that information. Meanwhile, the source data I have available will not give you an accurate reflection of ODAPI outputs.

    All of which is a very long-winded way of saying 'sorry, but I don't know'.
