Taiyi-LLM - A bilingual biomedical large language model based on Qwen-7B-base.

## Base Model of Taiyi-LLM

Taiyi-LLM is fine-tuned from **Qwen-7B-base**, a large language model from Alibaba Cloud's Tongyi Qianwen series. It has **7 billion parameters** and was pretrained on over **2 trillion tokens** of multilingual text, code, and mathematical content.

## Biomedical Applications of Taiyi-LLM

Taiyi-LLM supports:

- **Medical Q&A** (single/multi-turn dialogue)
- **Literature summarization**
- **Disease prediction**
- **Biomedical information extraction** (NER, relation/event extraction)
- **Medical report generation**
- **Bilingual translation** of biomedical texts
- **Text classification** and semantic similarity analysis

## Hosting Platforms for Taiyi-LLM

Key platforms include:

- **GitHub**: [Primary repository](https://github.com/DUTIR-BioNLP/Taiyi-LLM)
- **Hugging Face**: [Model card](https://huggingface.co/DUTIR-BioNLP/Taiyi-LLM)
- **ModelScope**: [Demo](https://modelscope.cn/studios/DUTIRbionlp/Taiyi2_Demo)
- **Academic paper**: [DOI:10.1093/jamia/ocae037](https://doi.org/10.1093/jamia/ocae037)

## Technical Specifications for Taiyi-LLM

**Environment requirements**:

- Python dependencies: `torch==1.13.0`, `transformers==4.30.2`, `accelerate==0.21.0`
- Recommended hardware: **NVIDIA 4090 GPU**
- Inference parameters: `max_new_tokens=500`, `top_p=0.9`, `temperature=0.3`

## Key Differentiators of Taiyi-LLM

**Distinctive features**:

1. **Bilingual capability**: Optimized for Chinese-English biomedical tasks
2. **Extensive training data**: 38 Chinese datasets (10 BioNLP tasks) + 102 English datasets (12 tasks)
3. **Instruction fine-tuning**: 1M+ instruction samples covering diverse biomedical tasks
4. **Open-source resources**: Releases model weights, datasets (CC BY-NC-SA 4.0), and deployment scripts

### Citation sources:

- [Taiyi-LLM](https://github.com/DUTIR-BioNLP/Taiyi-LLM) - Official URL

Updated: 2025-04-01
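
### Example: applying the listed inference parameters

The inference parameters listed under Technical Specifications (`max_new_tokens=500`, `top_p=0.9`, `temperature=0.3`) could be passed to the Hugging Face `generate` method roughly as sketched below. This is a minimal illustration, not the official loading script. The names `GenerationSettings` and `generate_reply` are hypothetical helpers introduced here. `do_sample=True` is an assumption, added because `transformers` ignores `top_p` and `temperature` unless sampling is enabled. `trust_remote_code=True` is likewise assumed, since Qwen-based checkpoints typically ship custom tokenizer code. The official repository may also wrap prompts in special tokens, which this sketch does not attempt.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Inference parameters quoted in the specifications above."""
    max_new_tokens: int = 500
    top_p: float = 0.9
    temperature: float = 0.3
    do_sample: bool = True  # assumption: needed for top_p/temperature to apply


def generate_reply(prompt: str,
                   model_id: str = "DUTIR-BioNLP/Taiyi-LLM",
                   settings: GenerationSettings = GenerationSettings()) -> str:
    """Load the model and generate one reply (hypothetical helper)."""
    # Heavy imports are kept local so the settings can be inspected
    # without torch or transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # assumption: the checkpoint requires trust_remote_code=True
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **asdict(settings))

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply("What are the common symptoms of type 2 diabetes?")` would download the weights from Hugging Face on first use, so expect a multi-gigabyte download and, per the hardware note above, a GPU in the class of an NVIDIA 4090 for reasonable latency.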