How to get Phrases and Phrasal Verbs from API?

How can I get phrases and phrasal verbs from the Oxford API? I've tried:

GET /search/{source_lang}

but it seems that this is not what I am looking for. For instance, the entry for the word "eye" on the website lists these phrases:

"all eyes are on"
"be all eyes"
"before (or in front of or under) one's (very) eyes"

but the Search results make no mention of those phrases.

Is there any way to get these kinds of results?

Comments

  • AmosDuveen (Member, Administrator, Moderator)

    Hi @Developer,

    These would have originally been subentries in our source data but were converted to full entries in their own right in preparation for the conversion to the data format driving the API. Part of the thinking behind this was to increase the accuracy of searches as these lemmas could not otherwise be found directly.

    As a consequence, you now need to search each subentry lemma (phrase, phrasal verb, compound etc.) individually, so you can still successfully search ".../entries/en/all_eyes_are_on_%E2%80%94%E2%80%94" (note that the use of em dashes in the lemma confuses matters but this can be worked around by calling on the search endpoint to suggest appropriate targets) and ".../entries/en/be all eyes" etc.

    Note however that, depending on the program and/or language you are using, in some instances you may need to encode some of the more common non-letter characters e.g. using "%20" or "_" for a space and "%2D" for a hyphen.
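
A minimal sketch of the encoding described above, in Python for brevity (the same logic ports directly to Java). The helper name `encode_lemma` and its `underscores` flag are made up for illustration; only the standard-library `urllib.parse.quote` call is real, and the expected strings come from the URL quoted in the reply above.

```python
from urllib.parse import quote


def encode_lemma(lemma: str, underscores: bool = True) -> str:
    """Percent-encode a lemma so it can be used as one URL path segment.

    Hypothetical helper: spaces become "_" by default, or "%20" when
    underscores=False. Everything else that is not URL-safe is
    percent-encoded as UTF-8.
    """
    if underscores:
        lemma = lemma.replace(" ", "_")
    # safe="" also encodes "/", so the lemma cannot break out of its path segment.
    # quote() leaves letters, digits, "_", "." and "-" untouched.
    return quote(lemma, safe="")


print(encode_lemma("all eyes are on \u2014\u2014"))    # all_eyes_are_on_%E2%80%94%E2%80%94
print(encode_lemma("be all eyes", underscores=False))  # be%20all%20eyes
```

The result would then be appended to the ".../entries/en/" path mentioned above.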

  • Developer (Member)
    Thank you for the response.

    My problem is how to get those phrases in the first place. Requesting definitions will not return these phrases, only definitions. I have tried this with the word "eye" as an example. I need a request that takes the word "eye" as input and returns those phrases as output. For instance, what if the user wants to know which expressions use this word? I have tried lemmas, but those only cover inflections. I have tried sentences, which are mainly examples from an English corpus.

    Finally, I have tried search, which lists a lot of expressions and phrases containing the searched word. Nevertheless, some expressions are not listed (such as the ones mentioned in my post).

    On the Oxford website, searching for the word "eye" brings up a lot of phrases. How can I get those?

    Another note: where do the "More examples" come from? The Sentences returned from the corpus are plentiful, and some of them are the same as in the "More examples" lists on the website, but some of the website examples are not in the returned Sentences.
  • AmosDuveen (Member, Administrator, Moderator)

    Hi @Developer,

    There's a lot to unpick there. Firstly, it might be more helpful to describe what it is you want your app to do, and to what purpose; that way I might possibly be able to give better guidance or suggest a different approach.

    Regarding the search endpoint, it gave me the same results as the website (US-only results excluded). Even the expanded website results for "eye" didn't include the phrases you are looking for, so I'm not sure how you came to think they did. Is it possible you did a different search?

    Can you please give me an example of the sentence dictionary discrepancies you are referring to so that I can investigate further?

  • Developer (Member)
    edited July 2017
    My app is an Android one that will take the words selected in any other app (Internet browsers, e-readers, etc.), then automatically get the definitions from the API and output them in a beautiful design.

    Also, it could be used as a normal dictionary, where the user enters the word(s) with the keyboard and looks them up.

    So, if the user wants to know about the expressions that contain a certain word, e.g. eye, how can that be achieved?

    On the website en.oxforddictionaries.com, if you put the word eye in the search box and then click/touch the search button, the result will contain definitions of the word eye, plus a lot of phrases that contain this word. To see them, scroll to the bottom of the page, after the definitions.

    I have attached some screenshots.
  • Developer (Member)
    edited July 2017


    My app is an Android one that will take the words the user selects in any other app (Internet browsers, e-readers, etc.), then automatically get the definitions from the API, and finally output the result in a readable way. For this purpose, definitions and some examples are enough.

    I am also thinking of making the app work like any normal dictionary app, i.e. taking the input from the user typing in a search box. For this purpose, phrases and phrasal verbs are necessary.

    I have attached two screenshots of the result page of the website en.oxforddictionaries.com when searching for the word eye. These phrases appear after the definitions.

    Another big feature I am thinking of is one that was used by the Babylon software on PC: when the user selects some words, the software tries to read the word(s) before or after the selection to find out whether there is an expression related to it. For instance, when the user selects the word "look" in the sentence "I will look after him", the software reads the word "after" that follows "look" and displays the result for "look after", which has a completely different meaning from the word "look" alone.

  • AmosDuveen (Member, Administrator, Moderator)

    Hi @Developer,

    Those examples are part of the dictionary entry and can be found in the output of an entries call; the example sentences are a different dataset and are intended to supplement these core dictionary examples.

  • Developer (Member)
    The phrases at the bottom (all eyes are on, be all eyes, etc.) are not part of the dictionary entry.
  • AmosDuveen (Member, Administrator, Moderator)
    edited July 2017

    Hi @Developer,

    As per my first response, each of the lemmas (excuse the dictionary jargon) now has its own entry, so the subentry lemmas you quoted ('all eyes are on --' and 'be all eyes') are now full entries in their own right. There is one fairly rare exception, which concerns embedded lemmas (e.g. 'eye someone up' in the screenshots above): because they are embedded within the parent entry, they cannot easily be separated out from it, so they now appear as notes of type 'wordFormNote' and, as a consequence, are less discoverable.

    For the lemmas which do have their own independent entry, each can be found using the search endpoint, which will produce a fuzzy-matched list of likely results; however, a second call will be necessary to retrieve the entry content.

    The drop-down selection you saw on https://en.oxforddictionaries.com/ was developed specifically as part of the website functionality and is separate from the data provided through the API. You can get a similar result by setting the prefix of the wordlist endpoint, but remember that it relies on matching text strings: if you set the prefix to 'eye', it won't pick up 'all eyes are on --' because the lemma doesn't begin with 'eye'; it will, however, find it if you set the prefix to 'all' or 'all eyes'.
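
To make the prefix behaviour concrete, here is a small local sketch in Python. It does not call the API; `prefix_matches` and the sample lemma list are made up purely to illustrate why a prefix of 'eye' misses a lemma that starts with 'all'.

```python
def prefix_matches(lemmas, prefix):
    """Return the lemmas that begin with `prefix`, ignoring case."""
    wanted = prefix.casefold()
    return [lemma for lemma in lemmas if lemma.casefold().startswith(wanted)]


# Illustrative sample only, not real wordlist output.
SAMPLE_LEMMAS = ["all eyes are on --", "be all eyes", "eye", "eyeball"]

print(prefix_matches(SAMPLE_LEMMAS, "eye"))       # ['eye', 'eyeball']
print(prefix_matches(SAMPLE_LEMMAS, "all eyes"))  # ['all eyes are on --']
```

Note that 'be all eyes' is found by neither prefix, which is the gap a "contains" search would have to fill.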

    The difficulty you are going to face with your project is the automation aspect. I get that you want the smoothest possible experience for your users; however, it is difficult to see how you will be able to minimise the user effort further without being able to interrogate the full morphological dataset yourself (it is available for license, but probably well beyond the financial scope of the average individual programmer). You may go some way towards bridging the gap using other morphology datasets available online, but I cannot make any promises as to how well they would serve your purpose or integrate with our API data (probably no better than a string match with no guarantees about variant spellings etc).

    Ultimately, you have the ideas, the tools and the data to put together a decent and workable app, but you may need to rein in your ambition slightly in order to get it off the ground. If it really takes off in a big way then maybe that will be the time to revisit the idea of licensing our English morphology.

  • Developer (Member)
    edited July 2017

    Thank you for your reply, and also for your understanding. You guessed right: I am something of a perfectionist who is trying to make the best of what is possible.

    My app is mainly aimed at English learners who are trying to read a complete book or novel on Android devices. When the user finds an unknown word or expression, he/she simply copies it, and the app, which is already running, shows the meaning. It would be frustrating to have to "guess" the right headword, or what to "include" in the copied words if the expression is long (e.g. should I include the word "up" or not? What about the word "it"?). What about the expressions that have the word "something" or "someone" in the dictionary, but not in the real world? How could an app guess that "she has taken him up ten years ago" refers to the expression "take someone up"? You can imagine the difficulty I am facing in this kind of situation.

    My ambition could be even extended to another level: Let us suppose that the user copies ONLY the word: "thought" in a sentence that has the expression: "food for thought". I want my app to "read" the words before and after the selected word, and when it finds there is an expression it will display the full expression with its meaning(s), because that expression is what the user often wants. This idea by the way was originally implemented in a Windows dictionary software.
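
A rough sketch of that "read the neighbouring words" idea, in Python for brevity. It assumes the app already holds a local set of known expressions (`known_expressions` below is a hypothetical, hand-made set; nothing in this thread says the API supplies one) and it only matches exact wording, so inflected forms such as "looked after" would need a lemma step first.

```python
def expand_selection(words, index, known_expressions, max_len=4):
    """Return the longest known expression that covers words[index].

    Falls back to the selected word itself when no expression matches.
    `known_expressions` is assumed to be a set of lower-case strings.
    """
    best, best_len = words[index], 1
    for length in range(2, max_len + 1):
        # Try every window of `length` words that still contains the selection.
        first_start = max(0, index - length + 1)
        last_start = min(index, len(words) - length)
        for start in range(first_start, last_start + 1):
            candidate = " ".join(words[start:start + length]).lower()
            if candidate in known_expressions and length > best_len:
                best, best_len = candidate, length
    return best


KNOWN = {"food for thought", "look after"}  # illustrative data only

sentence = "That gave me food for thought today".split()
print(expand_selection(sentence, sentence.index("thought"), KNOWN))  # food for thought

sentence = "I will look after him".split()
print(expand_selection(sentence, sentence.index("look"), KNOWN))     # look after
```

On Android the surrounding words would have to come from the source app, which the clipboard alone does not provide, so this only covers the matching step.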

    Would you please explain a little about this "morphology dataset"? Googling "Oxford morphology dataset" returns a reference book about English morphology. I just want to understand how this might help my app.

    "Every book writer looks for compliments, except for dictionary authors: they look only for not to be blamed".

    The same quote applies to dictionary programmers as well.

    Regards.

  • AmosDuveen (Member, Administrator, Moderator)
    edited July 2017

    Hi @Developer,

    A morphology isn't a print product so you won't find it searching like that. In its most basic form, it is a list of lemmas (e.g. 'go') and their associated wordforms (i.e. variants and inflections e.g. 'go', 'goes', 'going', 'went', etc.); richer morphologies will also include grammatical information and other metadata, which users may find useful.
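
For illustration, the most basic form described above could be held in memory roughly like this (a Python sketch; the two-entry `MORPHOLOGY` dictionary is hand-written sample data, not taken from any Oxford dataset). Inverting it gives the wordform-to-lemma lookup an app would need to get from "went" to "go" offline.

```python
from collections import defaultdict

# Lemma -> wordforms, in the "most basic form" of a morphology. Sample data only.
MORPHOLOGY = {
    "go": ["go", "goes", "going", "went", "gone"],
    "take": ["take", "takes", "taking", "took", "taken"],
}


def build_reverse_index(morphology):
    """Map every wordform to the set of lemmas it can belong to."""
    index = defaultdict(set)
    for lemma, wordforms in morphology.items():
        for form in wordforms:
            index[form].add(lemma)
    return dict(index)


REVERSE = build_reverse_index(MORPHOLOGY)
print(REVERSE["went"])   # {'go'}
print(REVERSE["taken"])  # {'take'}
```

A set is used for the values because one wordform can belong to more than one lemma in a full dataset.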

    The reason why I suggested it was that, if you had offline access to an entire English morphology, you would be able to interrogate it in any which way you see fit (as opposed to being limited to the search tools we make available). I would suggest that, if this approach interests you, you look online to see if you can get hold of a morphology dataset at a more reasonable cost. The difficulty with that approach is that, while there will be a lot of crossover with our entry list, you may still find there are significant differences.

    Searching the wordlist endpoint is better than you might expect in that searching for the entry 'be two (or ten) a penny' will, in different circumstances, unearth results for 'be two a penny', 'be two (or ten) a penny' and 'be ten a penny'. This expansion ought to go some way to mitigating your concerns over the precise wording. Unfortunately there is no leeway with the word 'be' as the lemma form does not give any indication that it is optional and so it would not have been expanded in the same way.
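
As a guess at what that kind of expansion involves (this is not Oxford's implementation, just a Python sketch with a made-up helper name), a lemma containing one "X (or Y)" group can be unfolded into its plain variants like so. It handles only a single group whose first alternative is one word.

```python
import re

# One word followed by "(or alt)" or "(or alt1 or alt2)".
_ALTERNATIVES = re.compile(r"(\S+) \(or ([^)]+)\)")


def expand_alternatives(lemma):
    """Unfold the first "X (or Y)" group in a lemma into separate variants."""
    match = _ALTERNATIVES.search(lemma)
    if match is None:
        return [lemma]
    options = [match.group(1)] + match.group(2).split(" or ")
    # Rebuild the lemma once per option, dropping the bracketed group.
    return [lemma[:match.start()] + option + lemma[match.end():] for option in options]


print(expand_alternatives("be two (or ten) a penny"))
# ['be two a penny', 'be ten a penny']
```

Optional words in plain brackets, such as "(very)" in the first post, are left untouched by this sketch.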

    There is one last thing I may be able to do to help, though it is by no means guaranteed: I can suggest that we expand the prefix functionality in the advanced filters of the wordlist endpoint so that it can search for any given string of characters, whether or not they appear at the start of an entry. There is a difficulty with this at our end in that the wordlist endpoint is already our most complex output and there may be practical limitations as to what can be done, but I will at least ask the question for you.

  • Developer (Member)
    That is good.

    I would like to ask about a point that I forgot to mention before: how would the end user be able to use this app? I am referring here to the API key and app key. Could one key be used for every user?
  • Simone (Administrator)

    Hi @Developer, I've already alerted Amos, he should be able to get back to you soon.

  • AmosDuveen (Member, Administrator, Moderator)

    Hi @Developer,

    The 'app_id' and 'app_key' are individual to each application so you can have multiple installations using the same credentials and all the calls will be logged against your account. For that reason, you may wish to take some security precautions in your setup to ensure that, when your app is distributed, someone cannot easily access these credentials and use them for themselves.

  • Developer (Member)
    edited July 2017

    That means either I should have a Premium plan subscription, or I should notify the user to get his/her own app key.

    I think I have finished my app (Android, Java). The design is like this:

    1. The user will start the app.
    2. S/he will select one or more words, then copy them to the clipboard.
    3. The app will AUTOMATICALLY take whatever is in the clipboard, trim leading and trailing spaces, remove anything that is not a letter, a number, a between-words space or one of the following special characters:

    ' — _ -

    and finally analyse it using the following approach:

    a. First, it will request the Definitions from the British Oxford dictionary for the whole expression. If it is found, it will display the result in a readable way, taking into consideration most of the available fields: definitions in senses, definitions in subsenses, notes (at the lexicalCategory level and at the senses and subsenses levels), regions, registers, pronunciations, audio files, lexicalCategory, domains, variantForms, crossReferenceMarkers, examples.

    b. If the Definitions request finds nothing (a "java.io.FileNotFoundException"), it will take the first word of the expression and try to get its lemmas. If any are found, it will make a list of the base forms its inflections point to. I have used a HashSet, because a set automatically removes duplicates (many inflections return the same word with different additional information, but my app uses only the "text" value of the "inflectionOf" part). The list will also omit the base form if it is the same as the first word (so the list for the expression "took up" contains only one item, "take up", without "took up" itself, because that is not needed). Once the app has its list for the first word, it displays a clickable list of those base forms, each combined with the other words of the original expression. When any of those items is clicked, the app requests the Definitions for it and goes back to step a. above. For convenience, if the list contains only ONE item, it requests the definition automatically without showing any list (e.g. if the original expression is "went out", the list is just "go out", so the app requests the definition for "go out" and displays the result).

    c. If none of the above returns results, the app will make a Search request, build a list of all the possible headwords, then display it as a clickable list.
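
The steps above can be sketched as follows. This is a Python outline of the described flow, not the app's actual Java code: `fetch_entry`, `fetch_lemmas` and `search` are hypothetical callables standing in for the three HTTP requests, with `fetch_entry` returning None where the Java code would catch the not-found exception.

```python
import re

# Anything that is not a letter, digit, underscore, whitespace, apostrophe,
# em dash (U+2014) or hyphen is removed, as in step 3.
_NOT_ALLOWED = re.compile(r"[^\w\s'\u2014-]")


def clean_selection(text):
    """Strip disallowed characters, trim, and collapse runs of spaces."""
    return " ".join(_NOT_ALLOWED.sub("", text).split())


def look_up(selection, fetch_entry, fetch_lemmas, search):
    """Return ("entry", data) or ("choices", [headwords]) following steps a to c."""
    text = clean_selection(selection)

    # Step a: try the whole expression first (one request).
    entry = fetch_entry(text)
    if entry is not None:
        return ("entry", entry)

    # Step b: swap the first word for each of its base forms; the set removes duplicates.
    first, _, rest = text.partition(" ")
    candidates = sorted(
        {f"{lemma} {rest}".strip() for lemma in fetch_lemmas(first) if lemma != first}
    )
    if len(candidates) == 1:
        entry = fetch_entry(candidates[0])
        if entry is not None:
            return ("entry", entry)
    elif candidates:
        return ("choices", candidates)

    # Step c: fall back to search and let the user pick a headword.
    return ("choices", search(text))


# Stand-in data so the sketch runs without network access.
ENTRIES = {"go out": "definition of 'go out'"}
LEMMAS = {"went": ["go", "go"]}

result = look_up(
    '  "went   out!" ',
    fetch_entry=ENTRIES.get,
    fetch_lemmas=lambda word: LEMMAS.get(word, []),
    search=lambda text: [],
)
print(result)  # ('entry', "definition of 'go out'")
```

Passing the three requests in as functions keeps the flow testable without the API; in the real app they would wrap the network calls.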

    I have also added a simple plain-text field and a button for looking up meanings the traditional way, i.e. by typing manually.

    That's it!!

    Why haven't I gone with Search in the first place? Because I want the app to go straight to the Definitions if possible. Search is a two-step process, while Definitions is one step only.

    I know it's not 100% ideal, but it works for 99% of the cases when the user knows which words to include in the selection.

  • Developer (Member)

    Two points:

    1. The user can select the words from any app, i.e. e-readers, internet browsers, etc. In addition, the user can select any word(s) from the definition window of my app.

    2. The Search list takes into consideration the specific region of each item, whether it is us or gb.

  • AmosDuveen (Member, Administrator, Moderator)

    Hi @Developer,

    Sounds like an interesting app that can only get better with further tinkering to improve the look-ups. Do let us know when it goes live and how to get it: we are always keen to see what people have done with our data.

    Regarding the account, yes, if you are planning any significant release, you'll have to upgrade to a paid account.
