EDINET-Bench: LLMs on Japanese Financial Tasks

Update: 2025-06-24
Description

The article introduces EDINET-Bench, a novel open-source Japanese financial benchmark designed to evaluate large language models (LLMs) on complex financial tasks such as accounting fraud detection, earnings forecasting, and industry prediction. The benchmark addresses the scarcity of challenging Japanese financial datasets for LLM evaluation. The EDINET-Bench dataset is automatically compiled from ten years of Japanese annual reports available through the Electronic Disclosure for Investors’ NETwork (EDINET). Initial evaluations indicate that even state-of-the-art LLMs perform only marginally better than logistic regression on some of these tasks, highlighting the need for domain-specific adaptation and further research. The project makes its dataset, benchmark construction code, and evaluation code publicly available to foster advances in LLM applications in the financial sector.

NotebookLM