Show all kanji headwords? #438

Closed
birtles opened this issue Nov 26, 2020 · 10 comments

Comments

@birtles
Member

birtles commented Nov 26, 2020

See discussion in #437

It might be more useful to show all the kanji headwords, as opposed to just the matching one. We could dim the non-matching ones, perhaps.

I'm concerned this might make things more cluttered but it's still probably worth a try.
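
As a rough illustration of the idea (not the extension's actual code), the sketch below flags each headword as matching or not so that a renderer could dim the non-matching ones. The `Headword` class, `mark_headwords`, and `render` are hypothetical names, and parentheses stand in for whatever dimming style the UI would really use.

```python
from dataclasses import dataclass


@dataclass
class Headword:
    text: str
    matched: bool  # True if this headword matched the looked-up text


def mark_headwords(headwords, matched):
    """Keep every headword, in its original order, and flag the matching ones."""
    return [Headword(text=h, matched=h in matched) for h in headwords]


def render(headwords):
    """Plain-text stand-in for the popup: parentheses mean "dimmed"."""
    return "、".join(h.text if h.matched else f"({h.text})" for h in headwords)


entry = ["干しぶどう", "干しブドウ", "干し葡萄"]
print(render(mark_headwords(entry, {"干し葡萄"})))
# prints: (干しぶどう)、(干しブドウ)、干し葡萄
```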

@SaltfishAmi
Contributor

I like this idea very much. But I wonder how many different kanji can match a kana entry at most? We may need a limit.

@birtles
Member Author

birtles commented Dec 2, 2020

I tried implementing this. How does this look?

image

@birtles
Member Author

birtles commented Dec 2, 2020

> But I wonder how many different kanji can match a kana entry at most? We may need a limit.

Good question, I haven't encountered too many yet. Let me check.

@SaltfishAmi
Contributor

[image]

How about moving the matching one to the beginning?
、道★、途、径 みち★

@birtles
Member Author

birtles commented Dec 2, 2020

Thanks! I thought about doing that but I assume there is some significance to the order of headwords in JMdict? e.g. common writing first? Maybe it doesn't matter?

@SaltfishAmi
Contributor

> Thanks! I thought about doing that but I assume there is some significance to the order of headwords in JMdict? e.g. common writing first? Maybe it doesn't matter?

Ah, that’s a very good point.
As a Chinese speaker I already know kanji well, so it is hard for me to look at this from a kanji learner’s perspective. I may have to reconsider this.

@birtles
Member Author

birtles commented Dec 2, 2020

It turns out the maximum number of kanji headwords is 12.

$ cat words.ljson | jq -n "[[inputs][].k | length] | max"
12

As for which entries have 12 kanji headwords...

$ cat words.ljson | jq -n "[inputs] | map(select(.k | length == 12)) | map(.k)"
[
  [
    "一ヶ月",
    "一ヵ月",
    "一カ月",
    "一か月",
    "一箇月",
    "一ケ月",
    "1ヶ月",
    "1ヵ月",
    "1カ月",
    "1か月",
    "1箇月",
    "1ケ月"
  ],
  [
    "一箇所",
    "一カ所",
    "一ヶ所",
    "一か所",
    "一ヵ所",
    "一ケ所",
    "1箇所",
    "1カ所",
    "1ヶ所",
    "1か所",
    "1ヵ所",
    "1ケ所"
  ],
  [
    "磨りガラス",
    "擦りガラス",
    "磨り硝子",
    "磨硝子",
    "擦り硝子",
    "磨ガラス",
    "擦硝子",
    "擦ガラス",
    "摺りガラス",
    "摺ガラス",
    "摺硝子",
    "摺り硝子"
  ],
  [
    "干しぶどう",
    "干しブドウ",
    "干し葡萄",
    "干葡萄",
    "干ぶどう",
    "干ブドウ",
    "乾しぶどう",
    "乾し葡萄",
    "乾しブドウ",
    "乾ぶどう",
    "乾葡萄",
    "乾ブドウ"
  ],
  [
    "二箇年",
    "二カ年",
    "二ヵ年",
    "二ヶ年",
    "二か年",
    "二ケ年",
    "2箇年",
    "2カ年",
    "2ヵ年",
    "2ヶ年",
    "2か年",
    "2ケ年"
  ],
  [
    "三箇年",
    "三カ年",
    "三ヵ年",
    "三ヶ年",
    "三か年",
    "三ケ年",
    "3箇年",
    "3カ年",
    "3ヵ年",
    "3ヶ年",
    "3か年",
    "3ケ年"
  ],
  [
    "四箇年",
    "四カ年",
    "四ヵ年",
    "四ヶ年",
    "四か年",
    "四ケ年",
    "4箇年",
    "4カ年",
    "4ヵ年",
    "4ヶ年",
    "4か年",
    "4ケ年"
  ],
  [
    "五箇年",
    "五カ年",
    "五ヵ年",
    "五ヶ年",
    "五か年",
    "五ケ年",
    "5箇年",
    "5カ年",
    "5ヵ年",
    "5ヶ年",
    "5か年",
    "5ケ年"
  ],
  [
    "六箇年",
    "六カ年",
    "六ヵ年",
    "六ヶ年",
    "六か年",
    "六ケ年",
    "6箇年",
    "6カ年",
    "6ヵ年",
    "6ヶ年",
    "6か年",
    "6ケ年"
  ],
  [
    "七箇年",
    "七カ年",
    "七ヵ年",
    "七ヶ年",
    "七か年",
    "七ケ年",
    "7箇年",
    "7カ年",
    "7ヵ年",
    "7ヶ年",
    "7か年",
    "7ケ年"
  ],
  [
    "八箇年",
    "八カ年",
    "八ヵ年",
    "八ヶ年",
    "八か年",
    "八ケ年",
    "8箇年",
    "8カ年",
    "8ヵ年",
    "8ヶ年",
    "8か年",
    "8ケ年"
  ],
  [
    "九箇年",
    "九カ年",
    "九ヵ年",
    "九ヶ年",
    "九か年",
    "九ケ年",
    "9箇年",
    "9カ年",
    "9ヵ年",
    "9ヶ年",
    "9か年",
    "9ケ年"
  ],
  [
    "十箇年",
    "十カ年",
    "十ヵ年",
    "十ヶ年",
    "十か年",
    "十ケ年",
    "10箇年",
    "10カ年",
    "10ヵ年",
    "10ヶ年",
    "10か年",
    "10ケ年"
  ]
]

Which ends up rendering something like:

image

It's not beautiful but it seems ok?
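
For anyone without jq at hand, the two queries above can be approximated in Python. This is a minimal sketch that assumes only what the jq commands imply: one JSON object per line, with kanji headwords in an array under `k`. The function names and the sample records (including the `r` field) are made up for illustration.

```python
import io
import json


def kanji_headword_counts(lines):
    """Yield (entry, count) per JSON line; entries without "k" count as 0."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        yield entry, len(entry.get("k") or [])


def entries_with_most_headwords(lines):
    """Return the largest headword count and the "k" arrays that reach it."""
    pairs = list(kanji_headword_counts(lines))
    most = max((count for _, count in pairs), default=0)
    return most, [e["k"] for e, count in pairs if count == most and count > 0]


# Hypothetical stand-in for words.ljson; the real file would be opened instead.
sample = io.StringIO(
    '{"k": ["干しぶどう", "干し葡萄"]}\n'
    '{"r": ["みち"]}\n'
    '{"k": ["磨りガラス"]}\n'
)
most, entries = entries_with_most_headwords(sample)
print(most, entries)
```

Against the real data, passing `open("words.ljson", encoding="utf-8")` in place of `sample` should report the same maximum as the jq query, provided the file has the shape assumed here.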

@SaltfishAmi
Contributor

[image]

Why are there two (rare)s following one headword?

@birtles
Member Author

birtles commented Dec 2, 2020

Sorting the match(es) first definitely looks better:

image
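
One way to sort the matches first without losing the JMdict ordering among the remaining headwords is a stable sort on a boolean key. The sketch below is illustrative only; `sort_matches_first` is a hypothetical name, not a function from the extension.

```python
def sort_matches_first(headwords, matched):
    """Move matching headwords to the front, keeping relative order otherwise.

    sorted() is stable and False sorts before True, so the key
    "h not in matched" puts matches first while both groups keep
    the order they had in the source entry.
    """
    return sorted(headwords, key=lambda h: h not in matched)


entry = ["干しぶどう", "干しブドウ", "干し葡萄", "干葡萄"]
print(sort_matches_first(entry, {"干し葡萄"}))
# prints: ['干し葡萄', '干しぶどう', '干しブドウ', '干葡萄']
```

Because the sort is stable, any significance in the original headword order (for example, a common writing listed first) still applies within the non-matching group.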

@birtles
Member Author

birtles commented Dec 2, 2020

> Why are there two (rare)s following one headword?

That headword has the "ik" (irregular kanji) and "io" (irregular okurigana) annotations, and the English localization renders head_info_label_ikanji and head_info_label_io (and head_info_label_kana) as "rare".
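
To make the cause concrete, here is a small sketch of how two different annotations can collapse into the same visible label. The message keys and the "rare" strings come from the comment above; the mapping from annotation to message key and the `labels_for` helper are assumptions for illustration. The `dedupe` option shows one possible way to avoid the repeat, not necessarily what the extension does.

```python
# English strings as described in the comment above.
MESSAGES = {
    "head_info_label_ikanji": "rare",
    "head_info_label_io": "rare",
    "head_info_label_kana": "rare",
}

# Assumed mapping from annotation to message key (not confirmed by the thread).
ANNOTATION_KEYS = {
    "ik": "head_info_label_ikanji",
    "io": "head_info_label_io",
}


def labels_for(annotations, dedupe=False):
    """Translate annotations to localized labels, optionally dropping repeats."""
    labels = [MESSAGES[ANNOTATION_KEYS[a]] for a in annotations if a in ANNOTATION_KEYS]
    if dedupe:
        # dict.fromkeys keeps the first occurrence of each label, in order.
        labels = list(dict.fromkeys(labels))
    return labels


print(labels_for(["ik", "io"]))               # prints: ['rare', 'rare']
print(labels_for(["ik", "io"], dedupe=True))  # prints: ['rare']
```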
