Words with 4-digit homograph numbers?

shahood Member

Hi,

Can you please share a few words here that have a 4-digit homograph number? I want to test my code but haven't been able to find any such word yet.

Thanks

Comments

  • AmosDuveen Member, Administrator, Moderator admin

    Hi @shahood,

    I'm not quite sure I follow you. Are you asking for a 4-letter word with a homograph (e.g. bank, tear, wind, etc.), or are you asking whether any homograph numbers in the data have 4 digits? If the latter, I am not aware of any.

  • shahood Member
    edited November 15
    Hi @AmosDuveen
    Thanks for taking time to reply!

    I'm talking about homograph numbers that have 4 digits. The documentation actually points to the possibility of a homograph number having 4 digits. I haven't come across any myself, but then I have only used a small subset of the complete data. So I thought you guys might be able to dig one out for me.

    So is it safe to conclude that homograph numbers would always have 3 digits?
  • shahood Member

    Somehow, I can see only a small portion of my reply, not the complete reply. Is there something wrong with the website? I did post a complete reply.

  • AmosDuveen Member, Administrator, Moderator admin

    Hi @shahood,

    The display is something for @simone to look into, but I can see the whole text by using my admin super power to go in to edit your post (but not actually make changes in this case).

    Can you please point me to where in the documentation you read about 4-digit homograph codes?

    I can see how they might be theoretically possible but, practically, I'm not aware of any datasets that might cause them to be used.

    You may have noticed that the numbers are not always sequential. Currently, only the first digit applies to the homograph; the second and third digits number the sense blocks by lexicalCategory. So "homographNumber": "215" would be the 16th lexicalCategory (counts start at zero) of the 2nd homograph (counts start at 1), and that is already pretty unlikely. I imagine you'd probably need >9 homographs to trigger a 4th digit.

  • shahood Member

    Hi @AmosDuveen,

    Here is how homographNumber is described in the Model section:

    homographNumber (string, optional): Identifies the homograph grouping. The last two digits identify different entries of the same homograph. The first one/two digits identify the homograph number.

    So there will always be two trailing digits, but there can be one or two leading digits, and that's precisely why I asked.
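A minimal sketch of a parser that follows the description quoted above, assuming the value is a string of three or four ASCII digits and that the last two digits are always the entry part. The name `split_homograph_number` is hypothetical and not part of the API, and `"1203"` is a made-up value, since no real four-digit example is known in this thread.

```python
from typing import Tuple


def split_homograph_number(value: str) -> Tuple[int, int]:
    """Split a homographNumber string into (homograph, entry).

    Assumption: the last two digits are the entry part and the
    leading one or two digits are the homograph part.
    """
    if not (value.isascii() and value.isdigit() and len(value) in (3, 4)):
        raise ValueError(f"unexpected homographNumber: {value!r}")
    # Slice from the right so three- and four-digit values both work.
    return int(value[:-2]), int(value[-2:])


print(split_homograph_number("215"))   # (2, 15)
print(split_homograph_number("1203"))  # (12, 3)
```

Slicing from the right means the code never needs to know in advance how many leading digits there are, so the same path handles both widths.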

  • shahood Member

    @AmosDuveen

    The display is something for @simone to look into, but I can see the whole text by using my admin super power to go in to edit your post (but not actually make changes in this case).

    That's how it looks here, and that's true with many of my comments:

  • AmosDuveen Member, Administrator, Moderator admin

    Hi @shahood,

    I deliberately tagged Simone in to make her aware of the display issue.

    As for how to code it, the chances of two digits being required are pretty low but non-zero, and even if such a wordform were found (with >9 homographs), the consequences would presumably be limited to those specific entries. Ultimately, the choice is yours. How difficult would it be to cater for two homograph digits rather than one, and what is the risk of not doing so? If it is relatively straightforward, I would go ahead and code for 4 digits.

  • shahood Member
    @AmosDuveen Yeah, I've coded for 4 digits already but just wanted to test the code with some words. However, since you are not aware of any, I'll find some other way.
    Thanks anyway!
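Since no real word with a four-digit value turned up in this thread, one option is to exercise the four-digit branch with made-up values. A rough sketch, assuming the layout from the documentation quoted earlier (a homograph part followed by a two-digit entry part). `make_homograph_number` and `parse` are hypothetical helpers, with `parse` standing in for the code under test.

```python
def make_homograph_number(homograph: int, entry: int) -> str:
    """Build a synthetic homographNumber string (made-up data, not from the API)."""
    if not (1 <= homograph <= 99 and 0 <= entry <= 99):
        raise ValueError("homograph must be 1-99 and entry 0-99")
    return f"{homograph}{entry:02d}"


def parse(value: str):
    """Stand-in for the code under test: split off the last two digits."""
    return int(value[:-2]), int(value[-2:])


# Cover both widths: homographs 1-9 give 3 digits, 10 and above give 4.
cases = [(1, 0), (2, 15), (9, 99), (10, 0), (12, 3)]
for homograph, entry in cases:
    text = make_homograph_number(homograph, entry)
    assert parse(text) == (homograph, entry), text
    print(text, "->", parse(text))
```

Generating the strings from known (homograph, entry) pairs gives a round-trip check that does not depend on finding a matching word in the data.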
  • Simone Administrator admin

    @AmosDuveen and @shahood
    Thank you both for flagging this issue with the comment display - I will raise it with the forum support.
