Chinese bilingual - possible to query by pinyin?

The Chinese-English-Chinese bilingual data is now included in the API, and searching for translations/entries based on English words and Chinese characters works fine.
However, for a Chinese dictionary, it would be very important to be able to search based on pinyin as well. At the moment, this doesn't seem to work.

For example:
Searching for 花 works. But searching for "hua" or "huā" does not work.

When searching for "hua", I would expect to get "huā", "huà", "huá", "huàbái", "huābàn", etc. When searching for "huā", I would expect to get "huā", "huābái", "huābāo", "huāchá", etc.
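
To make the expected behaviour concrete, here is a minimal Python sketch of the matching I have in mind. It is only an illustration, not how the API works internally: the function names `strip_tones` and `search_pinyin` and the small headword list are hypothetical, and it assumes pinyin headwords are stored with tone diacritics. A query without tone marks matches any tone; a query with tone marks matches only that tone.

```python
import unicodedata

# The four pinyin tone marks as combining characters:
# macron (1st), acute (2nd), caron (3rd), grave (4th).
# The diaeresis in "ü" is deliberately NOT in this set, so "nǚ" -> "nü".
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text: str) -> str:
    """Remove pinyin tone marks but keep other diacritics such as the ü."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def search_pinyin(query: str, headwords: list[str]) -> list[str]:
    """Prefix search over pinyin headwords (hypothetical helper).

    A toneless query ("hua") ignores tones on the headwords;
    a toned query ("huā") must match the tones exactly.
    """
    q = unicodedata.normalize("NFC", query.lower())
    if strip_tones(q) == q:
        # No tone marks in the query: compare against toneless headwords.
        return [h for h in headwords if strip_tones(h.lower()).startswith(q)]
    # Tone marks present: compare against the headwords as written.
    return [
        h for h in headwords
        if unicodedata.normalize("NFC", h.lower()).startswith(q)
    ]


# Illustrative headword list only, not real API data.
headwords = ["huā", "huà", "huá", "huàbái", "huābàn", "huābái", "nǚ"]
toneless_hits = search_pinyin("hua", headwords)
toned_hits = search_pinyin("huā", headwords)
```

With this list, the "hua" query returns every entry beginning with any tone of "hua", while the "huā" query returns only the first-tone entries.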



  • TaisFukushima (Member, Administrator, Moderator)

    Hello @esoderblom

    At the moment, we process the diacritics in the headword in the same way for all languages, so we don't distinguish between tonal and non-tonal languages. However, we understand your point and have raised a ticket to work on this.

    Thanks for flagging this up!

  • Hello @TaisFukushima! Thank you for the answer.

    However, I think you might have misunderstood the biggest problem I have. Currently, you cannot query the Chinese-English dictionary based on pinyin at all.

    So it's not a question of "it would be nice if 'hua' and 'huā' were handled differently". Instead, searching with "hua" or "huā" doesn't return anything at all. For a Chinese-English dictionary, I think being able to search by pinyin is vital.

    Or, if there is a way to search based on pinyin, please let me know how - at least the normal /translations endpoint doesn't return anything.

    So this works (using the character 花): 花?strictMatch=false

    But this doesn't (using "hua" which is the pinyin for that character):

  • TaisFukushima (Member, Administrator, Moderator)

    Hi @esoderblom

    My apologies, I now realise the first reply was not very clear.

    Yes, you are correct that it is not possible to run a pinyin search at all at the moment. This is because of the way we currently process diacritics. My colleagues are looking into developing the ability to return results that recognise the tonal variations of a word, which would make pinyin-based searches possible. I can't promise that this will be possible, but we are certainly looking into it!
