Apologies for not updating you on this sooner. I think the lemmatron is actually working as intended (which is to say, directing users to an appropriate entry, even if that isn't the root form of the word).
What you are suggesting (some kind of morphological analyser?) would represent a new use case, which we might be able to explore in future, but we would need your help to understand the use case a bit better and we'd also need to see whether there is wider demand for such an endpoint.
Thanks for spotting this. You are quite right that the 'metallurgy' label does not belong in any of the verb senses. I can only assume that this was a bit of a slip in the conversion process before the data was uploaded to the API.
You'll notice the website displays the same data a little differently: it groups the senses into entries along semantic lines, whereas the API data groups the senses into entries by part of speech (verbs, nouns, adjectives, etc.). The website display is actually a bit closer to the way the data is structured in-house, so what I think has happened is that an issue in the conversion transferred the label when it shouldn't have. This is actually an opportune time to raise it with the technical team, as they are currently working on another little issue in the same conversion process.
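To illustrate the kind of slip described above, here is a minimal sketch of regrouping senses from semantic entries into part-of-speech groups. Everything in it is hypothetical: the data shape, the field names (`pos`, `gloss`, `labels`), the placeholder glosses, and both functions are invented for illustration and are not the real in-house structure or conversion code. The "leaky" variant is only one plausible way a label could end up on the wrong senses, not a diagnosis of the actual bug.

```python
from collections import defaultdict

# Hypothetical in-house shape: entries grouped along semantic lines,
# each sense carrying its own part of speech and domain labels.
semantic_entries = [
    {
        "entry": 1,
        "senses": [
            {"pos": "noun", "gloss": "noun sense A", "labels": ["metallurgy"]},
            {"pos": "verb", "gloss": "verb sense A", "labels": []},
        ],
    },
    {
        "entry": 2,
        "senses": [
            {"pos": "verb", "gloss": "verb sense B", "labels": []},
        ],
    },
]


def regroup_by_pos(entries):
    """Regroup senses by part of speech, keeping each sense's own labels."""
    grouped = defaultdict(list)
    for entry in entries:
        for sense in entry["senses"]:
            grouped[sense["pos"]].append(
                {"gloss": sense["gloss"], "labels": list(sense["labels"])}
            )
    return dict(grouped)


def regroup_by_pos_leaky(entries):
    """Buggy variant: pools labels per semantic entry and copies them to every sense."""
    grouped = defaultdict(list)
    for entry in entries:
        pooled = sorted({label for s in entry["senses"] for label in s["labels"]})
        for sense in entry["senses"]:
            grouped[sense["pos"]].append({"gloss": sense["gloss"], "labels": pooled})
    return dict(grouped)


correct = regroup_by_pos(semantic_entries)
leaky = regroup_by_pos_leaky(semantic_entries)
```

In the correct version the verb senses keep their empty label lists; in the leaky version the first verb sense inherits the label from the noun sense that shared its semantic entry.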
(apologies for the delay - I've just found this sitting in my drafts when I thought I'd posted it)
The UK/US split in our data is complicated. There are pros and cons to either approach, whether we define the split as a separate endpoint or as a filter within a single endpoint.
The English dictionary data is a single database, but the differences between UK and US run so deep that they cannot be accurately reflected in any single output (we cannot put both the 'underwear' sense and the 'trousers' sense of 'pants' at the top; it must be one or the other). What we have put together is therefore the best way forward we could manage, but I'm not sure there will ever be a way to satisfy both British and US users in a single output.
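As a rough sketch of why one output cannot serve both audiences, the snippet below orders the two 'pants' senses by a requested region. The `primary_in` field, the `order_senses` function, and the region codes are assumptions made up for this illustration; they are not part of the actual API or data.

```python
def order_senses(senses, region):
    """Return senses with those primary in `region` first; other senses keep their order."""
    # False sorts before True, so senses matching the region come first.
    return sorted(senses, key=lambda sense: sense["primary_in"] != region)


# Hypothetical sense records for 'pants'.
pants = [
    {"gloss": "underwear", "primary_in": "UK"},
    {"gloss": "trousers", "primary_in": "US"},
]

uk_view = [sense["gloss"] for sense in order_senses(pants, "UK")]
us_view = [sense["gloss"] for sense in order_senses(pants, "US")]
```

Whether the region is chosen through a separate endpoint or a filter parameter, the caller still has to pick one ordering; the same list cannot lead with both senses.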
We don't currently offer such an output, but as the data is there and we have had quite a few requests for this type of endpoint, it is fairly high on the list of improvements. Unfortunately, I cannot give you a definitive date yet, but we are aware of the demand and are working on it.
Why the focus on concrete nouns? ...I only ask because we may find a way to be differently helpful.
I'm not a lexicographer by training but I do work with them and I can tell you that even concepts you think may be fairly straightforward can be picked apart to the nth degree (hint: do not ever ask a lexicographer to define what a 'word' is, if you know what's good for you!).
Given that potential complexity, you'll find that some linguistic concepts you may have heard about do not actually get referenced in a dictionary, because the editors do not consider them important enough to warrant a protracted debate. For example, in the Oxford Dictionary of English and the New Oxford American Dictionary (our premium single-volume dictionaries of current usage), the 'properness' of a noun is never referenced (FYI: just because the headword starts with a capital letter doesn't necessarily mean it is a 'proper' noun), and you won't find any labels indicating 'slang', which is too fickle to nail down. However, these priorities are evolving with the digital age and the changing way people use dictionary content, so the API data now makes an attempt to identify 'proper' nouns, for example.
Coming back around to the idea of concrete nouns, I can see that this may be just the type of concept which can be argued over ad nauseam to very little benefit in helping people to use the language.