To What Extent Have LLMs Reshaped the Legal Domain So Far? A Scoping Literature Review
Abstract
1. Introduction
Research Questions
- RQ1: Which LLM tools are considered leading in the field, and which are best suited for legal applications according to the current open-access state-of-the-art research?
- RQ2: What are the primary sources for data extraction and the best strategies for dataset development within the legal domain?
- RQ3: What are the challenges of LLMs in addressing legal tasks?
- RQ4: What are the main strategies for increasing the performance of LLMs in addressing legal tasks?
- RQ5: What are the main limitations of current LLMs for the legal domain?
2. Method
2.1. Eligibility Criteria
2.2. Search Strategy
- TITLE-ABS-KEY("LLMs" OR "*gpt*" OR "llm" OR "machine learning")
- AND TITLE-ABS-KEY("law" OR "legal" OR "contract")
- AND PUBYEAR > 2021
- AND SUBJAREA(SOCI)
- AND (
- LIMIT-TO (SUBJAREA, "BUSI")
- OR LIMIT-TO (SUBJAREA, "COMP")
- OR EXCLUDE (SUBJAREA, "ENER")
- OR EXCLUDE (SUBJAREA, "PSYC")
- OR EXCLUDE (SUBJAREA, "HEAL")
- OR EXCLUDE (SUBJAREA, "MEDI")
- OR EXCLUDE (SUBJAREA, "CHEM")
- OR EXCLUDE (SUBJAREA, "CENG")
- OR EXCLUDE (SUBJAREA, "PHYS")
- OR EXCLUDE (SUBJAREA, "BIOC")
- OR EXCLUDE (SUBJAREA, "MATE")
- OR EXCLUDE (SUBJAREA, "EART")
- OR EXCLUDE (SUBJAREA, "ARTS")
- OR LIMIT-TO (SUBJAREA, "MATH")
- )
- AND (LIMIT-TO (LANGUAGE, "English"))
- AND (EXCLUDE (DOCTYPE, "ch") OR EXCLUDE (DOCTYPE, "bk"))
- (TI=("LLMs" OR "*gpt*" OR "llm" OR "machine learning" OR "LQA")
- OR AB=("LLMs" OR "*gpt*" OR "llm" OR "machine learning" OR "LQA"))
- AND (TI=("law" OR "legal" OR "contract") OR AB=("law" OR "legal" OR "contract"))
- AND LA=(English)
- NOT DT=(Book OR Book Chapter)
- AND Exclude (Research Areas): Arts Humanities Other Topics or Astronomy Astrophysics or Marine Freshwater Biology or Materials Science or Mathematical Computational Biology or Obstetrics Gynecology or Oceanography or Oncology or Ophthalmology or Optics or Otorhinolaryngology or Pathology or Pediatrics or Urology Nephrology or Toxicology or Physics or Pharmacology Pharmacy or Imaging Science Photographic Technology or Immunology or Infectious Diseases or Instruments Instrumentation or Integrative Complementary Medicine or Gastroenterology Hepatology or General Internal Medicine or Genetics Heredity or Geochemistry Geophysics or Geography or Geology or Geriatrics Gerontology or Electrochemistry or Cell Biology or Chemistry or Cardiovascular System Cardiology or Biotechnology Applied Microbiology or Biophysics or Biomedical Social Sciences or Biochemistry Molecular Biology or Automation Control Systems or Anesthesiology or Acoustics or Physical Geography or Physiology or Plant Sciences or Polymer Science or Psychology or Psychiatry or Radiology Nuclear Medicine Medical Imaging or Rehabilitation or Remote Sensing or Reproductive Biology or Research Experimental Medicine or Spectroscopy or Surgery or Telecommunications or Thermodynamics or Virology or Veterinary Sciences
- AND (Research Areas: Computer Science OR Mathematics)
- AND Open Access
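The selection process merges the records retrieved from the sources above and removes duplicates before screening, following the PRISMA flow. A minimal Python sketch of that deduplication step, using hypothetical record fields (`title`, `source`) rather than the authors' actual pipeline:

```python
def deduplicate(records):
    """Keep the first record seen for each whitespace/case-normalized title."""
    seen = set()
    unique = []
    for rec in records:
        key = " ".join(rec["title"].lower().split())  # normalize case and spacing
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Illustrative records from two of the sources in the review.
wos = [{"title": "GPT Takes the Bar Exam", "source": "Web of Science"}]
scopus = [
    {"title": "GPT  takes the bar exam", "source": "SCOPUS"},  # near-duplicate
    {"title": "Large Language Models in Law: A Survey", "source": "SCOPUS"},
]

merged = deduplicate(wos + scopus)
print(len(merged))  # 2: the near-duplicate title is collapsed
```

In practice, reference managers also match on DOI and fuzzy title similarity; normalized exact matching is only the simplest variant of this step.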
2.3. The Selection Process
3. Results
3.1. Bibliographical Analysis
3.2. Distribution Trends
3.3. Main Events and Publications
3.4. Main Legal Tasks
4. Discussion
- RQ1: Which LLM tools are considered leading in the field, and which are best suited for legal applications according to current open-access state-of-the-art research?
- RQ2: What are the primary sources for data extraction and the cutting-edge strategies for dataset development within the legal sector?
Dataset | Application | #Samples |
---|---|---|
ECHR [49] | Argument detection [48] | 1.9k |
CAIL2018 [50] | Judgment prediction | 2600k |
JEC-QA [51] | Multiple-choice QA | 26k |
JE-Q2EA [9] | Long-form QA | 42k |
JE-QA2E [9] | Long-form QA | 6k |
JE-EXPERT [9] | Long-form QA | 850 |
Legal consultation dataset [57] | Long-form QA [9] | 16k |
LLeQA [18] | Long-form QA | 1.8k |
BSARD [58] | Article prediction | 1.1k |
- RQ3: What are the challenges of LLMs in addressing legal tasks?
- RQ4: What are the main strategies for increasing the performance of LLMs in addressing legal tasks?
- RQ5: What are the main limitations of current LLMs for the legal domain?
- Whether the model fabricates one or more nonexistent legal articles.
- If the response mentions an existing legal article, whether it incorrectly quotes the title of the law or the article’s number.
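The two evaluation criteria above can be expressed as a lookup against a table of valid statutes. A hedged sketch, in which the reference table, law names, and article numbers are purely illustrative (not the authors' actual evaluation code):

```python
# Hypothetical reference table: (law name, article number) -> official title.
VALID_ARTICLES = {
    ("Civil Code", 1357): "Conditions of tort liability",
}

def check_citation(law, number, quoted_title=None):
    """Return (fabricated, misquoted) flags for one article cited by the model."""
    official = VALID_ARTICLES.get((law, number))
    if official is None:
        return True, False  # criterion 1: the cited article does not exist
    # criterion 2: the article exists, but its title is quoted incorrectly
    misquoted = quoted_title is not None and quoted_title != official
    return False, misquoted

print(check_citation("Civil Code", 1357))            # (False, False): valid citation
print(check_citation("Civil Code", 9999))            # (True, False): fabricated article
print(check_citation("Civil Code", 1357, "Wrong"))   # (False, True): misquoted title
```

A real evaluation would additionally need to extract (law, article, title) triples from free-form model output, which is itself a nontrivial parsing task.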
Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AAAI | Association for the Advancement of Artificial Intelligence |
AI | Artificial intelligence |
arXiv | An open-access repository of preprints |
ChatGPT | Chat Generative Pre-trained Transformer |
COLIEE | Competition on Legal Information Extraction and Entailment |
DOAJ | Directory of Open Access Journals |
ECHR | European Court of Human Rights |
GPT | Generative Pre-trained Transformer |
ICAIL | The International Conference on Artificial Intelligence and Law |
JURIX | The International Conference on Legal Knowledge and Information Systems |
JURISIN | The International Workshop on Juris-informatics |
JE-Q2EA | Judicial Examination-Question to Explanation + Answer |
JE-QA2E | Judicial Examination-Question + Answer to Explanation |
JE-EXPERT | Judicial Examination-Expert Corpus |
LLM | Large Language Model |
MDPI | Multidisciplinary Digital Publishing Institute |
PRISMA | Preferred Reporting Items for Systematic Reviews and Meta-Analyses |
RAG | Retrieval-Augmented Generation |
RLHF | Reinforcement Learning from Human Feedback |
SCL | Society for Computers and Law |
SOTA | State of the art |
SFT | Supervised fine-tuning |
Appendix A. Selected Articles
Title | Authors | URL (accessed on 28 March 2024) | Year |
1. Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models | Louis, Antoine; van Dijck, Gijs; Spanakis, Gerasimos | http://arxiv.org/abs/2309.17050 | 2023 |
2. Lawyer LLaMA Technical Report | Huang, Quzhe; Tao, Mingxu; Zhang, Chen; An, Zhenwei; Jiang, Cong; Chen, Zhibin; Wu, Zirui; Feng, Yansong | http://arxiv.org/abs/2305.15062 | 2023 |
3. Instruction Tuning with GPT-4 | Peng, Baolin; Li, Chunyuan; He, Pengcheng; Galley, Michel; Gao, Jianfeng | http://arxiv.org/abs/2304.03277 | 2023 |
4. Predicting Brazilian Court Decisions | Lage-Freitas, André; Allende-Cid, Héctor; Santana, Orivaldo; Oliveira-Lage, Lívia | https://peerj.com/articles/cs-904 | 2022 |
5. Hammering with the Telescope | Sobkowicz, Pawel | https://www.frontiersin.org/articles/10.3389/frai.2022.1010219/full | 2022 |
6. GiusBERTo: Italy’s AI-Based Judicial Transformation: A Teaching Case | Datta, Pratim; Zahn, Brian J.; Attias, Luca; Salierno, Giulio; Bertè, Rosamaria; Battisti, Daniela; Acton, Thomas | https://aisel.aisnet.org/cais/vol53/iss1/33/ | 2023 |
7. Regulating ChatGPT and other Large Generative AI Models | Hacker, Philipp; Engel, Andreas; Mauer, Marco | https://dl.acm.org/doi/10.1145/3593013.3594067 | 2023 |
8. Prediction Machine Learning Models on Propensity Convicts to Criminal Recidivism | Kovalchuk, Olha; Karpinski, Mikolaj; Banakh, Serhiy; Kasianchuk, Mykhailo; Shevchuk, Ruslan; Zagorodna, Nataliya | https://www.mdpi.com/2078-2489/14/3/161 | 2023 |
9. Machine Learning in Bail Decisions and Judges’ Trustworthiness | Morin-Martel, Alexis | https://link.springer.com/10.1007/s00146-023-01673-6 | 2023 |
10. Regression Applied to Legal Judgments to Predict Compensation for Immaterial Damage | Dal Pont, Thiago Raulino; Sabo, Isabela Cristina; Hübner, Jomi Fred; Rover, Aires José | https://peerj.com/articles/cs-1225 | 2023 |
11. How To Build The Ultimate Legal LLM Stack | Dominic Woolrych | https://www.linkedin.com/pulse/how-build-ultimate-legal-llm-stack-dominic-woolrych/ | 2023 |
12. Emerging Architectures for LLM Applications | Matt Bornstein, Rajko Radovanovic | https://a16z.com/emerging-architectures-for-llm-applications/?trk=article-ssr-frontend-pulse_little-text-block | 2023 |
13. LegalVis: Exploring and Inferring Precedent Citations in Legal Documents | Resck, Lucas E.; Ponciano, Jean R.; Nonato, Luis Gustavo; Poco, Jorge | https://ieeexplore.ieee.org/document/9716779/ | 2023 |
14. Emerging Trends: Smooth-talking Machines | Church, Kenneth Ward; Yue, Richard | https://www.cambridge.org/core/product/identifier/S1351324923000463/type/journal_article | 2023 |
15. The Unreasonable Effectiveness of Large Language Models in Zero-shot Semantic Annotation of Legal Texts | Savelka, Jaromir; Ashley, Kevin D. | https://www.frontiersin.org/articles/10.3389/frai.2023.1279794/full | 2023 |
16. Groups of Experts Often Differ in Their Decisions: What are the Implications for AI and Machine Learning? | Sleeman, Derek H.; Gilhooly, Ken | https://onlinelibrary.wiley.com/doi/10.1002/aaai.12135 | 2023 |
17. Predicting Critical Path of Labor Dispute Resolution in Legal Domain by Machine Learning Models Based on SHapley Additive ExPlanations and Soft Voting Strategy | Guan, Jianhua; Yu, Zuguo; Liao, Yongan; Tang, Runbin; Duan, Ming; Han, Guosheng | https://www.mdpi.com/2227-7390/12/2/272 | 2024 |
18. THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval | Li, Haitao; Su, Weihang; Wang, Changyue; Wu, Yueyue; Ai, Qingyao; Liu, Yiqun | http://arxiv.org/abs/2305.06812 | 2023 |
19. The Implications of ChatGPT for Legal Services and Society | Perlman, Andrew | https://www.ssrn.com/abstract=4294197 | 2022 |
20. GPT Takes the Bar Exam | Bommarito, Michael; Katz, Daniel Martin | http://arxiv.org/abs/2212.14402 | 2022 |
21. Unlocking Practical Applications in Legal Domain | Savelka, Jaromir | https://dl.acm.org/doi/10.1145/3594536.3595161 | 2023 |
22. Large Language Models in Law: A Survey | Lai, Jinqi; Gan, Wensheng; Wu, Jiayang; Qi, Zhenlian; Yu, Philip S. | http://arxiv.org/abs/2312.03718 | 2023 |
23. Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking | Zelikman, Eric; Harik, Georges; Shao, Yijia; Jayasiri, Varuna; Haber, Nick; Goodman, Noah D. | http://arxiv.org/abs/2403.09629 | 2024 |
24. FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Weller, Orion; Chang, Benjamin; MacAvaney, Sean; Lo, Kyle; Cohan, Arman; Van Durme, Benjamin; Lawrie, Dawn; Soldaini, Luca | http://arxiv.org/abs/2403.15246 | 2024 |
25. Performance Analysis of Large Language Models in the Domain of Legal Argument Mining | Al Zubaer, Abdullah; Granitzer, Michael; Mitrović, Jelena | https://www.frontiersin.org/articles/10.3389/frai.2023.1278796/full | 2023 |
26. A Dynamic Approach for Visualizing and Exploring Concept Hierarchies from Textbooks | Wehnert, Sabine; Chedella, Praneeth; Asche, Jonas; De Luca, Ernesto William | https://www.frontiersin.org/articles/10.3389/frai.2024.1285026/full | 2024 |
27. The Benefits and Dangers of Using Machine Learning to Support Making Legal Predictions | Zeleznikow, John | https://wires.onlinelibrary.wiley.com/doi/10.1002/widm.1505 | 2023 |
28. ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases | Cui, Jiaxi; Li, Zongjian; Yan, Yang; Chen, Bohua; Yuan, Li | http://arxiv.org/abs/2306.16092 | 2023 |
29. Survey of Hallucination in Natural Language Generation | Ji, Ziwei; Lee, Nayeon; Frieske, Rita; Yu, Tiezheng; Su, Dan; Xu, Yan; Ishii, Etsuko; Bang, Yejin; Chen, Delong; Chan, Ho Shu; Dai, Wenliang; Madotto, Andrea; Fung, Pascale | http://arxiv.org/abs/2202.03629 | 2022 |
30. Long-form Factuality in Large Language Models | Wei, Jerry; Yang, Chengrun; Song, Xinying; Lu, Yifeng; Hu, Nathan; Tran, Dustin; Peng, Daiyi; Liu, Ruibo; Huang, Da; Du, Cosmo; Le, Quoc V. | http://arxiv.org/abs/2403.18802 | 2024 |
31. Understanding the Planning of LLM Agents: A Survey | Huang, Xu; Liu, Weiwen; Chen, Xiaolong; Wang, Xingmei; Wang, Hao; Lian, Defu; Wang, Yasheng; Tang, Ruiming; Chen, Enhong | http://arxiv.org/abs/2402.02716 | 2024 |
32. LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models | Guha, Neel; Nyarko, Julian; Ho, Daniel E.; Ré, Christopher; Chilton, Adam; Narayana, Aditya; Chohlas-Wood, Alex; Peters, Austin; Waldon, Brandon; Rockmore, Daniel N.; Zambrano, Diego; Talisman, Dmitry; Hoque, Enam; Surani, Faiz; Fagan, Frank; Sarfaty, Galit; Dickinson, Gregory M.; Porat, Haggai; Hegland, Jason; Wu, Jessica; Nudell, Joe; Niklaus, Joel; Nay, John; Choi, Jonathan H.; Tobia, Kevin; Hagan, Margaret; Ma, Megan; Livermore, Michael; Rasumov-Rahe, Nikon; Holzenberger, Nils; Kolt, Noam; Henderson, Peter; Rehaag, Sean; Goel, Sharad; Gao, Shang; Williams, Spencer; Gandhi, Sunny; Zur, Tom; Iyer, Varun; Li, Zehua | http://arxiv.org/abs/2308.11462 | 2023 |
33. LawBench: Benchmarking Legal Knowledge of Large Language Models | Fei, Zhiwei; Shen, Xiaoyu; Zhu, Dawei; Zhou, Fengzhe; Han, Zhuo; Zhang, Songyang; Chen, Kai; Shen, Zongwen; Ge, Jidong | http://arxiv.org/abs/2309.16289 | 2023 |
References
- Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Sobkowicz, P. Hammering with the telescope. Front. Artif. Intell. 2022, 5, 1010219. [Google Scholar] [CrossRef] [PubMed]
- Villata, S.; Araszkiewicz, M.; Ashley, K.; Bench-Capon, T.; Branting, L.K.; Conrad, J.G.; Wyner, A. Thirty years of artificial intelligence and law: The third decade. Artif. Intell. Law 2022, 30, 561–591. [Google Scholar] [CrossRef]
- Ridnik, T.; Kredo, D.; Friedman, I. Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering. arXiv 2024, arXiv:2401.08500. [Google Scholar]
- Dhuliawala, S.; Komeili, M.; Xu, J.; Raileanu, R.; Li, X.; Celikyilmaz, A.; Weston, J. Chain-of-Verification Reduces Hallucination in Large Language Models. In Proceedings of the Findings of the Association for Computational Linguistics ACL 2024, Bangkok, Thailand, 8 August 2024; pp. 3563–3578. [Google Scholar]
- Wang, X.; Wei, J.; Schuurmans, D.; Le, Q.; Chi, E.; Narang, S.; Chowdhery, A.; Zhou, D. Self-Consistency Improves Chain of Thought Reasoning in Language Models. arXiv 2022, arXiv:2203.11171. [Google Scholar]
- Fei, Z.; Shen, X.; Zhu, D.; Zhou, F.; Han, Z.; Zhang, S.; Chen, K.; Shen, Z.; Ge, J. LawBench: Benchmarking Legal Knowledge of Large Language Models. arXiv 2023, arXiv:2309.16289. [Google Scholar]
- Lai, J.; Gan, W.; Wu, J.; Qi, Z.; Yu, P.S. Large Language Models in Law: A Survey. arXiv 2023, arXiv:2312.03718. [Google Scholar] [CrossRef]
- Huang, Q.; Tao, M.; Zhang, C.; An, Z.; Jiang, C.; Chen, Z.; Wu, Z.; Feng, Y. Lawyer LLaMA Technical Report. arXiv 2023, arXiv:2305.15062. [Google Scholar]
- Cui, J.; Li, Z.; Yan, Y.; Chen, B.; Yuan, L. ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases. arXiv 2023, arXiv:2306.16092. [Google Scholar]
- Re, R.M.; Solow-Niederman, A. Developing artificially intelligent justice. Stanf. Technol. Law Rev. 2019, 22, 242. [Google Scholar]
- Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473. [Google Scholar] [CrossRef] [PubMed]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef] [PubMed]
- Haddaway, N.R.; Page, M.J.; Pritchard, C.C.; McGuinness, L.A. PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst. Rev. 2022, 18, e1230. [Google Scholar] [CrossRef] [PubMed]
- Honnibal, M.; Montani, I.; Van Landeghem, S.; Boyd, A. spaCy: Industrial-strength Natural Language Processing in Python; Explosion: 2020. Available online: https://spacy.io (accessed on 28 March 2024).
- Grootendorst, M. MaartenGr/KeyBERT: BibTeX (Version v0.1.3); Zenodo: 2021. Available online: https://zenodo.org/records/4461265 (accessed on 28 March 2024). [CrossRef]
- Li, H.; Su, W.; Wang, C.; Wu, Y.; Ai, Q.; Liu, Y. THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval. arXiv 2023, arXiv:2305.06812. [Google Scholar]
- Louis, A.; van Dijck, G.; Spanakis, G. Interpretable long-form legal question answering with retrieval-augmented large language models. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 22266–22275. [Google Scholar]
- Guan, J.; Yu, Z.; Liao, Y.; Tang, R.; Duan, M.; Han, G. Predicting Critical Path of Labor Dispute Resolution in Legal Domain by Machine Learning Models Based on SHapley Additive exPlanations and Soft Voting Strategy. Mathematics 2024, 12, 272. [Google Scholar] [CrossRef]
- Sleeman, D.H.; Gilhooly, K. Groups of experts often differ in their decisions: What are the implications for AI and machine learning? A commentary on Noise: A Flaw in Human Judgment, by Kahneman, Sibony, and Sunstein (2021). AI Mag. 2023, 44, 555–567. [Google Scholar] [CrossRef]
- Lage-Freitas, A.; Allende-Cid, H.; Santana, O.; Oliveira-Lage, L. Predicting Brazilian Court Decisions. Peerj Comput. Sci. 2022, 8, e904. [Google Scholar] [CrossRef]
- Weller, O.; Chang, B.; MacAvaney, S.; Lo, K.; Cohan, A.; Durme, B.V.; Lawrie, D.; Soldaini, L. FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions. arXiv 2024, arXiv:2403.15246. [Google Scholar]
- Savelka, J.; Ashley, K.D. The unreasonable effectiveness of large language models in zero-shot semantic annotation of legal texts. Front. Artif. Intell. 2023, 6, 1279794. [Google Scholar] [CrossRef]
- Bornstein, M.; Radovanovic, R. Emerging Architectures for LLM Applications. 2023. Available online: https://a16z.com (accessed on 20 June 2023).
- Xu, Z. Human Judges in the Era of Artificial Intelligence: Challenges and Opportunities. Appl. Artif. Intell. 2022, 36, 2013652. [Google Scholar] [CrossRef]
- Etulle, R.D.; Moslares, F.; Pacad, E.; Odullo, J.; Nacionales, J.; Claridad, N. Investigating the Listening and Transcription Performance in Court: Experiences from Stenographers in Philippine Courtrooms. J. Lang. Pragmat. Stud. 2023, 2, 100–111. [Google Scholar] [CrossRef]
- Haitao, L. LexiLaw. 2023. Available online: https://github.com/CSHaitao/LexiLaw (accessed on 1 May 2024).
- Team GLM; Zeng, A.; Xu, B.; Wang, B.; Zhang, C.; Yin, D.; Rojas, D.; Feng, G.; Zhao, H.; Lai, H.; et al. ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools. arXiv 2024, arXiv:2406.12793. [Google Scholar]
- Wu, S.; Liu, Z.; Zhang, Z.; Chen, Z.; Deng, W.; Zhang, W.; Yang, J.; Yao, Z.; Lyu, Y.; Xin, X.; et al. fuzi.mingcha. 2023. Available online: https://github.com/irlab-sdu/fuzi.mingcha (accessed on 28 March 2024).
- Deng, W.; Pei, J.; Kong, K.; Chen, Z.; Wei, F.; Li, Y.; Ren, Z.; Chen, Z.; Ren, P. Syllogistic Reasoning for Legal Judgment Analysis. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, 6–10 December 2023; pp. 13997–14009. [Google Scholar] [CrossRef]
- Cui, Y.; Yang, Z.; Yao, X. Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca. arXiv 2023, arXiv:2304.08177. [Google Scholar]
- Huang, X.; Zhang, L.L.; Cheng, K.T.; Yang, F.; Yang, M. Fewer is More: Boosting LLM Reasoning with Reinforced Context Pruning. arXiv 2023, arXiv:2312.08901. [Google Scholar]
- JurisLMs. 2023. Available online: https://github.com/seudl/JurisLMs (accessed on 28 March 2024).
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models are Unsupervised Multitask Learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- He, W.; Wen, J.; Zhang, L.; Cheng, H.; Qin, B.; Li, Y.; Jiang, F.; Chen, J.; Wang, B.; Yang, M. HanFei-1.0. 2023. Available online: https://github.com/siat-nlp/HanFei (accessed on 28 March 2024).
- Muennighoff, N.; Wang, T.; Sutawika, L.; Roberts, A.; Biderman, S.; Scao, T.L.; Bari, M.S.; Shen, S.; Yong, Z.X.; Schoelkopf, H.; et al. Crosslingual generalization through multitask finetuning. arXiv 2022, arXiv:2211.01786. [Google Scholar]
- Zhang, J.; Gan, R.; Wang, J.; Zhang, Y.; Zhang, L.; Yang, P.; Gao, X.; Wu, Z.; Dong, X.; He, J.; et al. Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence. arXiv 2022, arXiv:2209.02970. [Google Scholar]
- Shen, X.; Zhu, D.; Fei, Z.; Li, Q.; Shen, Z.; Ge, J. Lychee. 2023. Available online: https://github.com/davidpig/lychee_law (accessed on 28 March 2024).
- Du, Z.; Qian, Y.; Liu, X.; Ding, M.; Qiu, J.; Yang, Z.; Tang, J. GLM: General Language Model Pretraining with Autoregressive Blank Infilling. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; Muresan, S., Nakov, P., Villavicencio, A., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2022; pp. 320–335. [Google Scholar] [CrossRef]
- Chiang, W.L.; Li, Z.; Lin, Z.; Sheng, Y.; Wu, Z.; Zhang, H.; Zheng, L.; Zhuang, S.; Zhuang, Y.; Gonzalez, J.E.; et al. Vicuna: An Open-Source Chatbot Impressing Gpt-4 with 90%* Chatgpt Quality. 2023. Available online: https://lmsys.org/blog/2023-03-30-vicuna (accessed on 28 March 2024).
- Xu, C.; Sun, Q.; Zheng, K.; Geng, X.; Zhao, P.; Feng, J.; Tao, C.; Jiang, D. Wizardlm: Empowering large language models to follow complex instructions. arXiv 2023, arXiv:2304.12244. [Google Scholar]
- Wang, Y.; Ivison, H.; Dasigi, P.; Hessel, J.; Khot, T.; Chandu, K.R.; Wadden, D.; MacMillan, K.; Smith, N.A.; Beltagy, I.; et al. How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources. In Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; pp. 74764–74786. [Google Scholar]
- Dettmers, T.; Pagnoni, A.; Holtzman, A.; Zettlemoyer, L. QLoRA: Efficient Finetuning of Quantized LLMs. arXiv 2023, arXiv:2305.14314. [Google Scholar]
- Jiang, A.Q.; Sablayrolles, A.; Mensch, A.; Bamford, C.; Chaplot, D.S.; Casas, D.d.l.; Bressand, F.; Lengyel, G.; Lample, G.; Saulnier, L.; et al. Mistral 7B. arXiv 2023, arXiv:2310.06825. [Google Scholar]
- Woolrych, D. How To Build The Ultimate Legal LLM Stack. 2023. Available online: https://lawpath.com.au/blog/how-to-build-the-ultimate-legal-llm-stack (accessed on 28 March 2024).
- OpenAI. GPT-3.5-turbo-16k. 2023. Available online: https://openai.com (accessed on 1 May 2024).
- Nguyen, H.T. A Brief Report on LawGPT 1.0: A Virtual Legal Assistant Based on GPT-3. arXiv 2023, arXiv:2302.05729. [Google Scholar]
- Moens, M.F.; Boiy, E.; Palau, R.M.; Reed, C. Automatic detection of arguments in legal texts. In Proceedings of the 11th International Conference on Artificial Intelligence and Law, ICAIL ’07, New York, NY, USA, 4–8 June 2007; pp. 225–230. [Google Scholar] [CrossRef]
- Zubaer, A.A.; Granitzer, M.; Mitrović, J. Performance analysis of large language models in the domain of legal argument mining. Front. Artif. Intell. 2023, 6, 1278796. [Google Scholar] [CrossRef] [PubMed]
- Xiao, C.; Zhong, H.; Guo, Z.; Tu, C.; Liu, Z.; Sun, M.; Feng, Y.; Han, X.; Hu, Z.; Wang, H.; et al. CAIL2018: A Large-Scale Legal Dataset for Judgment Prediction. arXiv 2018, arXiv:1807.02478. [Google Scholar]
- Zhong, H.; Zhou, J.; Qu, W.; Long, Y.; Gu, Y. An Element-aware Multi-representation Model for Law Article Prediction. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; Webber, B., Cohn, T., He, Y., Liu, Y., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 6663–6668. [Google Scholar] [CrossRef]
- Zhong, H.; Xiao, C.; Tu, C.; Zhang, T.; Liu, Z.; Sun, M. JEC-QA: A legal-domain question answering dataset. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 9701–9708. [Google Scholar]
- Yuan, S.; Zhao, H.; Du, Z.; Ding, M.; Liu, X.; Cen, Y.; Zou, X.; Yang, Z.; Tang, J. WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models. AI Open 2021, 2, 65–68. [Google Scholar] [CrossRef]
- Xu, L.; Zhang, X.; Dong, Q. CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model. arXiv 2020, arXiv:2003.01355. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Chen, S.; Hou, Y.; Cui, Y.; Che, W.; Liu, T.; Yu, X. Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 7870–7881. [Google Scholar]
- Chen, F. The Legal Consultation Data and Corpus of the Thesis from China Law Network (Version V1); Peking University Open Research Data Platform. 2018. Available online: https://opendata.pku.edu.cn/dataset.xhtml?persistentId=doi:10.18170/DVN/OLO4G8 (accessed on 28 March 2024).
- Louis, A.; Spanakis, G. A Statutory Article Retrieval Dataset in French. arXiv 2022, arXiv:2108.11792. [Google Scholar]
- Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of hallucination in natural language generation. ACM Comput. Surv. 2023, 55, 1–38. [Google Scholar] [CrossRef]
- Gou, Z.; Shao, Z.; Gong, Y.; Shen, Y.; Yang, Y.; Duan, N.; Chen, W. CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. arXiv 2023, arXiv:2305.11738. [Google Scholar]
- Hacker, P. The European AI liability directives—Critique of a half-hearted approach and lessons for the future. Comput. Law Secur. Rev. 2023, 51, 105871. [Google Scholar] [CrossRef]
- Wang, Y.; Kordi, Y.; Mishra, S.; Liu, A.; Smith, N.A.; Khashabi, D.; Hajishirzi, H. Self-Instruct: Aligning Language Models with Self-Generated Instructions. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 13484–13508. [Google Scholar]
- Peng, B.; Li, C.; He, P.; Galley, M.; Gao, J. Instruction Tuning with GPT-4. arXiv 2023, arXiv:2304.03277. [Google Scholar]
- Li, X.; Zhang, T.; Dubois, Y.; Taori, R.; Ishaan Gulrajani, C.G.; Liang, P.; Hashimoto, T.B. Alpacaeval: An Automatic Evaluator of Instruction-Following Models. 2023. Available online: https://github.com/tatsu-lab/alpaca_eval (accessed on 28 March 2024).
- Ng, A. The Batch Issue 242: Four Design Patterns for AI Agentic Workflows Blog Post. The Batch. Available online: https://www.deeplearning.ai/the-batch/issue-242/ (accessed on 28 March 2024).
- Huang, X.; Liu, W.; Chen, X.; Wang, X.; Wang, H.; Lian, D.; Wang, Y.; Tang, R.; Chen, E. Understanding the planning of LLM agents: A survey. arXiv 2024, arXiv:2402.02716. [Google Scholar]
- Wei, J.; Yang, C.; Song, X.; Lu, Y.; Hu, N.; Tran, D.; Peng, D.; Liu, R.; Huang, D.; Du, C.; et al. Long-form factuality in large language models. arXiv 2024, arXiv:2403.18802. [Google Scholar]
- Church, K.W.; Yue, R. Emerging trends: Smooth-talking machines. Nat. Lang. Eng. 2023, 29, 1402–1410. [Google Scholar] [CrossRef]
- Sierocka, H. Cultural Dimensions Of Legal Discourse. Stud. Log. 2014, 38, 189–196. [Google Scholar] [CrossRef]
- Schilling, T. Beyond Multilingualism: On Different Approaches to the Handling of Diverging Language Versions of a Community Law. Eur. Law J. 2010, 16, 47–66. [Google Scholar] [CrossRef]
- Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Yang, A.; Fan, A.; et al. The Llama 3 Herd of Models. arXiv 2024, arXiv:2407.21783. [Google Scholar]
- Boginskaya, O. Semantics of the verb shall in legal discourse. Jezikoslovlje 2017, 18, 305–317. [Google Scholar]
- Basmov, V.; Goldberg, Y.; Tsarfaty, R. LLMs’ Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements. arXiv 2024, arXiv:2404.06283. [Google Scholar]
- Zhong, H.; Wang, Y.; Tu, C.; Zhang, T.; Liu, Z.; Sun, M. Iteratively Questioning and Answering for Interpretable Legal Judgment Prediction. Proc. AAAI Conf. Artif. Intell. 2020, 34, 1250–1257. [Google Scholar] [CrossRef]
- Zhang, D.; Finckenberg-Broman, P.; Hoang, T.; Pan, S.; Xing, Z.; Staples, M.; Xu, X. Right to be Forgotten in the Era of Large Language Models: Implications, Challenges, and Solutions. arXiv 2024, arXiv:2307.03941. [Google Scholar] [CrossRef]
- Ali, A.; Al-rimy, B.A.S.; Alsubaei, F.S.; Almazroi, A.A.; Almazroi, A.A. HealthLock: Blockchain-Based Privacy Preservation Using Homomorphic Encryption in Internet of Things Healthcare Applications. Sensors 2023, 23, 6762. [Google Scholar] [CrossRef]
Source | Source Type | Date | Number of Studies |
---|---|---|---|
Web of Science | Database | 22 February 2024 | 224 |
SCOPUS | Database | 22 February 2024 | 177 |
Specific websites | Search | 28 March 2024 | 4 |
arxiv.org | Register | 28 March 2024 | 12 |
Total studies | | | 417 |
LLM | Foundation Model | # Params | RLHF | W/API | Origin |
---|---|---|---|---|---|
LexiLaw [27] | ChatGLM [28] | 6B | N | W | CN |
Fuzi.mingcha [29,30] | ChatGLM [28] | 6B | N | W | CN |
LaWGPT-7B-beta1.1 | Chinese LLaMA [31] | 7B | N | W | CN |
Lawyer LLaMA [32] | Chinese LLaMA [31] | 13B | N | W | CN |
JurisLMs [33] | GPT2 [34], Chinese LLaMA [31] | 0.77B/13B | N | W | CN |
HanFei [35] | BLOOMZ-7B1 [36] | 7B | N | W | CN |
ChatLaw [10] | Ziya-LLaMA-13B [37] | 13B | N | W | CN |
Lychee [38] | GLM-10B [39] | 10B | N | W | CN |
LLeQA [18] | vicuna-7b-v1.3 [40], wizardLM-7B [41], tulu-7B [42], guanaco-7B [43] | 7B | Y | W | EU |
FollowIR-7B [22] | Mistral 7B [44] | 7B | N | W | US |
Lawpath AI [45] | GPT-3.5-turbo-16k [46] | 175B | Y | API | AU |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Padiu, B.; Iacob, R.; Rebedea, T.; Dascalu, M. To What Extent Have LLMs Reshaped the Legal Domain So Far? A Scoping Literature Review. Information 2024, 15, 662. https://doi.org/10.3390/info15110662