9 Commits

SHA1 Message Date
ba5fea6e0a Attach pitch accent indicators in a more reasonable way.
We give it a class so CSS styling can be used on it more easily.
2024-09-18 15:08:53 +02:00
adb58983a7 Add option to include pitch accent information with the furigana. 2024-09-18 12:10:22 +02:00
7361240e49 Add option to use hiragana instead of katakana for the generated furigana. 2024-09-17 08:32:51 +02:00
0266341f99 Fix stupid bug in furigana application.
It would sometimes result in characters getting swapped.
2024-09-16 08:28:11 +02:00
4b48f86824 Rework of various things.
This way the main `FuriganaGenerator` can be shared among multiple
threads.

This also adds substitutions for words where the tokenizer insists on
using the less common pronunciation.
2024-09-15 08:55:03 +02:00
d79cc60a48 Tweak the learning algorithm.
It was both too conservative and not conservative enough in different
circumstances.
2024-09-11 13:25:10 +02:00
44cb2b8bda Make building faster. 2024-09-11 11:22:14 +02:00
ecbac83e26 Add function to get word stats after processing. 2024-09-11 11:14:12 +02:00
1c3afed157 First commit.
A furigana generator that can do "spaced repetition" style reduction
of furigana over the course of a text.
2024-09-10 18:45:58 +02:00