FAQ
Tasks 1 and 2
Tasks 3 and 4
Q1: Do you plan to provide the data in other languages (e.g., English) for Tasks 3 and 4?
A1: Starting this year, data for Tasks 3 and 4 are provided in Japanese only. Participants may use any machine translation system to translate the civil law articles and test queries into their preferred target language. The target language does not need to be English; any language may be used.
Q2: Is it OK to use a closed-model machine translation system (e.g., commercial products and services such as Google Translate) for Tasks 3 and 4?
A2: To ensure high-quality translations, the openness rules do not apply to machine translation systems. Therefore, participants may use commercial or closed-model translation services such as Google Translate, DeepL, or other translation systems. However, the translation must be a faithful, literal rendering of the original text and must not introduce any additional information beyond what is explicitly stated in the source. In other words, any machine translation system may be used, provided that the output does not add interpretative or supplementary content that exceeds the original text.
Q3: Is it necessary to implement and execute the system locally?
A3: It is not necessary to implement or execute the system locally. Participants may use publicly available services, provided that those services comply with the task rules. However, when using external services, participants must verify how the service is implemented. Services based on closed models (e.g., GPT-4o) are not permitted. Participants are required to provide justification that the service they use relies on an open model, as defined in the rules.
Q4: Is it acceptable to submit multiple articles for a single question in Task 3?
A4: Yes. Participants may submit multiple articles for a single query. Some queries require multiple articles to correctly determine entailment.
Q5: What are the target documents for retrieval in Task 3?
A5: The target documents are the civil law articles provided in the training data.
Q6: Which document ID should we return for the retrieval results?
A6: Please use the value of the num attribute of each article in civil.xml as the document ID. For example, if the article is represented as:
<Article num="213-2">
...
</Article>
you should return 213-2 as the article ID (the value of the third column) in your retrieval results.
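As a rough illustration of A6, the sketch below collects the num values with Python's standard xml.etree.ElementTree module. It assumes only that each article is an Article element carrying a num attribute, as in the example above. The Law root element, the sample text, and the article_ids helper are hypothetical and are not part of the official data or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt: only the <Article num="..."> pattern comes from the FAQ;
# the <Law> root element and the article bodies are placeholders.
SAMPLE = """<Law>
  <Article num="213">...</Article>
  <Article num="213-2">...</Article>
</Law>"""


def article_ids(xml_text):
    """Return the num attribute of every Article element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        article.get("num")
        for article in root.iter("Article")
        if article.get("num") is not None
    ]


ids = article_ids(SAMPLE)
print(ids)
```

For the actual file, ET.parse("civil.xml").getroot() could replace ET.fromstring, assuming the file is well-formed XML. Note that the IDs are strings such as 213-2 and should be kept as strings, not converted to numbers.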
Q7: Is it necessary to use a machine translation system as part of the pipeline? In addition, is it acceptable to use a closed LLM for machine translation?
A7: No, it is not necessary to use a machine translation system as part of your pipeline. The use of machine translation is optional.
Our intention in allowing the use of machine translation systems is to encourage participants to apply their methods in languages other than Japanese.
Accordingly, translation may be incorporated into the system pipeline or used as a preprocessing step, for example to translate the civil law articles when constructing a target article database, or to translate test queries before they are processed by the participant’s system.
Furthermore, since the use of external services is permitted, we do not impose specific restrictions on machine translation systems, including the LLM-related rules, provided that the conditions described in Q2 and A2 are satisfied. Therefore, closed-model translation services, including those based on closed LLMs, may be used for machine translation.
Pilot Tasks
Q1: Do you plan to provide the data in other languages (e.g., English) for the Pilot Tasks?
A1: Only Japanese data is available.
Q2: Is it OK to use a closed-model machine translation system (e.g., commercial products and services such as Google Translate) for the Pilot Tasks?
A2: Please see the answer to Q2 in Tasks 3 and 4.