The Dictionary Endpoint is Filtered by Default for Pronunciations

Hello,

I noticed that requests to the Dictionary endpoint apply the filter "regions=gb" by default to 'pronunciations'. I can retrieve US pronunciations only by applying the filter "regions=us".

By default, the Dictionary endpoint returns both 'British' and 'North American' 'regions' data for 'senses', so I think it is a bug that 'British English' and 'American English' 'dialects' data are not both provided by default.

Thank you,
Adam

Comments

  • preimann (Member, Administrator, Moderator)

    Hi @AdamSteffanick,

    Many thanks for your question. As you correctly describe, a dictionary entry request currently returns sense-level data for different dialects. For example, 'pants' returns both the 'underpants and knickers' sense (British) and the 'trousers' sense (US), but by default only a British pronunciation and audio file are returned unless a filter is applied. This is not a bug as such, but a reflection of the way in which we have so far ingested the dictionary content into the API. The Oxford Dictionary of English, which is the UK title, includes both senses of 'pants' in its entry, which is why you see both senses in your response. The audio files, however, are linked to the entry by a separate mechanism, which is why you get British and US audio files separately.

    It is certainly possible for us to develop a feature that returns both British and US audio files in the same response, if this would be useful to you. Would you be able to describe your project in more detail so that I can understand the requirement better?

    I hope that helps, and I look forward to hearing from you.

    Phil

  • Hello @preimann,

    Thank you for this explanation. My intuition was that the endpoint would return all possible data unless filtered, and I had assumed that 'American English' pronunciation data were absent until I thought to apply a filter to my request.

    One of my goals is to compare the IPA transcriptions of 'British English' and 'American English' within the 'dialects' data. Currently, this requires a separate call to the API for each dialect.

    It would also be helpful if you could tell me whether the API's 'British English' data represent Received Pronunciation and 'American English' data represent General American pronunciation.

    Sincerely,
    Adam

  • AmosDuveen (Member, Administrator, Moderator)

    Hi @AdamSteffanick,

    (apologies for the delay - I've just found this sitting in my drafts when I thought I'd posted it)

    The UK/US split in our data is complicated. There are pros and cons to either approach, whether we expose the split as a separate endpoint or as a filter within a single endpoint.

    The English dictionary data is a single database, but the differences between UK and US run so deep that they cannot be accurately reflected in any single output. We cannot put both the 'underwear' sense and the 'trousers' sense of 'pants' at the top; it must be one or the other. What we have put together is therefore the best way forward we could manage, but I'm not sure there will ever be a way to satisfy both British and US users in a single output.

  • Hello @AmosDuveen,

    Thank you. Knowing this, I'll plan to make one API call per dialect and come up with something that merges the responses.

    Sincerely,
    Adam
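
For readers following the same plan, here is a minimal sketch of one call per dialect followed by a merge. It is not an official client. The `fetch` function is a hypothetical placeholder for however you issue the request with a `regions=gb` or `regions=us` filter, and the response shape (`results`, `lexicalEntries`, `pronunciations`, `phoneticNotation`, `phoneticSpelling`) is an assumption that should be checked against a real response.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical: a function that requests one word with one region filter
# (for example "gb" or "us") and returns the decoded JSON body as a dict.
Fetch = Callable[[str, str], dict]


def extract_ipa(response: dict) -> List[str]:
    """Collect unique IPA spellings from one response.

    Assumes the shape results -> lexicalEntries -> pronunciations,
    which this thread does not confirm.
    """
    spellings: List[str] = []
    for result in response.get("results", []):
        for lexical_entry in result.get("lexicalEntries", []):
            for pron in lexical_entry.get("pronunciations", []):
                if pron.get("phoneticNotation") != "IPA":
                    continue
                spelling = pron.get("phoneticSpelling")
                if spelling and spelling not in spellings:
                    spellings.append(spelling)
    return spellings


def pronunciations_by_region(
    word: str, fetch: Fetch, regions: Sequence[str] = ("gb", "us")
) -> Dict[str, List[str]]:
    """Make one call per region and merge the results into a single dict."""
    return {region: extract_ipa(fetch(word, region)) for region in regions}


def fake_fetch(word: str, region: str) -> dict:
    """Stand-in for a real request; the spelling is a placeholder, not real IPA."""
    return {
        "results": [
            {
                "lexicalEntries": [
                    {
                        "pronunciations": [
                            {
                                "phoneticNotation": "IPA",
                                "phoneticSpelling": f"{region}-placeholder",
                            }
                        ]
                    }
                ]
            }
        ]
    }


if __name__ == "__main__":
    print(pronunciations_by_region("pants", fake_fetch))
```

Passing the request function in as an argument keeps the merge logic independent of the HTTP library and credentials, so the comparison of 'British English' and 'American English' transcriptions can be tested without calling the API.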

  • AmosDuveen (Member, Administrator, Moderator)

    @AdamSteffanick,

    Sounds like a plan!
