furigana_gen

10 Commits 1 Branch 0 Tags 46 MiB

Author	SHA1	Message	Date
Nathan Vegdahl	9845da5a7e	Update learning code. It now tracks distance by character, and decides whether to show furigana based on how long it has been since a word was last shown with furigana, rather than since it was last shown at all. Also some minor performance improvements.	2024-09-20 07:38:02 +02:00
Nathan Vegdahl	ba5fea6e0a	Attach pitch accent indicators in a more reasonable way. We give it a class so CSS styling can be used on it more easily.	2024-09-18 15:08:53 +02:00
Nathan Vegdahl	adb58983a7	Add option to include pitch accent information with the furigana	2024-09-18 12:10:22 +02:00
Nathan Vegdahl	7361240e49	Add option to use hiragana instead of katakana for the generated furigana.	2024-09-17 08:32:51 +02:00
Nathan Vegdahl	0266341f99	Fix stupid bug in furigana application. It would sometimes result in characters getting swapped.	2024-09-16 08:28:11 +02:00
Nathan Vegdahl	4b48f86824	Rework of various things. This way the main `FuriganaGenerator` can be shared among multiple threads. This also adds substitutions for words where the tokenizer insists on using the less common pronunciation.	2024-09-15 08:55:03 +02:00
Nathan Vegdahl	d79cc60a48	Tweak the learning algorithm. It was both too conservative and not conservative enough in different circumstances.	2024-09-11 13:25:10 +02:00
Nathan Vegdahl	44cb2b8bda	Make building faster.	2024-09-11 11:22:14 +02:00
Nathan Vegdahl	ecbac83e26	Add function to get word stats after processing.	2024-09-11 11:14:12 +02:00
Nathan Vegdahl	1c3afed157	First commit. A furigana generator that can do "spaced repetition" style reduction of furigana over the course of a text.	2024-09-10 18:45:58 +02:00
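
The behavior described in commit 9845da5a7e (distance measured in characters, and the decision keyed to the last time a word was shown *with* furigana) can be illustrated with a rough sketch. This is not the repository's code: the `Learner` and `WordState` names, the `base_interval` and `growth` parameters, and the doubling rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class WordState:
    last_furigana_pos: int  # character offset where furigana was last shown
    interval: int           # characters to wait before showing furigana again


@dataclass
class Learner:
    # Both values are hypothetical; the real thresholds are not given in the log.
    base_interval: int = 100
    growth: int = 2
    words: dict = field(default_factory=dict)

    def should_show(self, word: str, pos: int) -> bool:
        """Decide whether `word`, seen at character offset `pos`, gets furigana."""
        state = self.words.get(word)
        if state is None:
            # First sighting: always show furigana and start tracking the word.
            self.words[word] = WordState(pos, self.base_interval)
            return True
        if pos - state.last_furigana_pos >= state.interval:
            # Enough text has passed since the last reminder: show it again
            # and lengthen the gap before the next one (assumed growth rule).
            state.last_furigana_pos = pos
            state.interval *= self.growth
            return True
        # Sightings without furigana deliberately do not reset the clock.
        return False
```

With `Learner(base_interval=10)`, a word first seen at offset 0 gets furigana, does not get it at offsets 5 or 9, and gets it again at offset 10. The unannotated sightings in between do not postpone the reminder, which is the distinction the commit message draws.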