GitHub - 2404589803/hf-daily-paper-newsletter-multilingual: A multilingual translation tool that automatically converts Hugging Face's daily AI research papers into Japanese, Korean, Spanish, and French. Powered by InternLM API, it accurately translates paper titles and abstracts while maintaining technical accuracy. The system features robust error handling, progress tracking,
HuggingFace Daily Paper Newsletter Multilingual
Translating HuggingFace Daily Papers with InternLM
This project automatically downloads and processes HuggingFace daily paper data and translates it into multiple languages using the InternLM large language model. The project runs automatically every day to ensure timely retrieval and translation of the latest papers.
Model Used
- Translation Model: InternLM-3
- Developer: Shanghai AI Laboratory
- Version: internlm3-latest
- Features:
  - Powerful multilingual translation capabilities
  - Accurate understanding and translation of academic texts
  - Real-time translation via API
Features
- Automatic download of HuggingFace daily paper data
- Support for downloading historical data from specific dates
- Use of Beijing time as default timezone
- Complete activity logging
- JSON format paper metadata storage
- Translation of English papers to multiple languages using InternLM-3:
  - Japanese
  - Korean
  - Spanish
  - French
- Automated workflow:
  - Daily automatic download of latest papers
  - Automatic multilingual translation
  - Automatic repository updates
Installation
- Clone the repository:
git clone https://github.com/yourusername/hf-daily-paper-newsletter-multilingual.git
cd hf-daily-paper-newsletter-multilingual
- Install dependencies:
pip install -r requirements.txt
Usage
Manual Execution
Download Today's Papers
python download_papers.py
Download Papers from a Specific Date
python download_papers.py --date 2024-03-20
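The date handling described above (an optional --date flag, with Beijing time as the default) could look roughly like the sketch below. This is not the repository's actual code: the helper names resolve_date and parse_args are hypothetical, and only the --date flag and the YYYY-MM-DD format come from this README.

```python
import argparse
from datetime import datetime, timedelta, timezone

# Beijing time is a fixed UTC+8 offset with no daylight saving time.
BEIJING = timezone(timedelta(hours=8))


def resolve_date(date_arg=None, now=None):
    """Return the target date as a YYYY-MM-DD string (hypothetical helper).

    With no date argument, fall back to today's date in Beijing time.
    """
    if date_arg:
        # strptime raises ValueError on malformed input such as "2024-13-40".
        return datetime.strptime(date_arg, "%Y-%m-%d").strftime("%Y-%m-%d")
    now = now or datetime.now(timezone.utc)
    return now.astimezone(BEIJING).strftime("%Y-%m-%d")


def parse_args(argv=None):
    """Parse the --date flag shown in the usage examples."""
    parser = argparse.ArgumentParser(description="Download daily papers")
    parser.add_argument("--date", help="date in YYYY-MM-DD format")
    return parser.parse_args(argv)
```

Because the default is computed in UTC+8, a run at 17:00 UTC on March 19 already resolves to March 20.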
Translate Papers
First obtain an InternLM API key, then run:
python translate_papers.py --date 2024-03-20 --api_key your_api_key_here
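As a rough illustration of the translation step, the sketch below loops over the four target languages and translates selected fields of each paper. It makes several assumptions: the translate callable is a hypothetical stand-in for whatever wraps the InternLM API call, and the field names "title" and "summary" are guesses at the metadata keys, not taken from the repository.

```python
# Language codes follow the directory layout described in this README.
LANGUAGES = {"ja": "Japanese", "ko": "Korean", "es": "Spanish", "fr": "French"}


def translate_papers(papers, translate, fields=("title", "summary")):
    """Return {language_code: [translated paper dicts]}.

    `translate(text, language_name)` is a hypothetical callable; `fields`
    are assumed metadata keys. Input dicts are not modified.
    """
    result = {}
    for code, name in LANGUAGES.items():
        translated = []
        for paper in papers:
            copy = dict(paper)  # keep untranslated metadata (ids, URLs) intact
            for field in fields:
                if copy.get(field):  # skip missing or empty fields
                    copy[field] = translate(copy[field], name)
            translated.append(copy)
        result[code] = translated
    return result
```

Passing the translator in as an argument keeps the loop testable without network access or an API key.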
Automatic Execution
The project is configured with two GitHub Actions workflows:
- daily-paper-download.yml: Automatically downloads the latest papers at 9:00 AM Beijing time
- daily-paper-translate.yml: Automatically translates the papers after the download completes
To enable automatic translation, you need to set INTERNLM_API_KEY in the repository's Secrets.
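Two details of the scheduled setup are easy to get wrong, so here is a small sketch of both. GitHub Actions evaluates schedule cron expressions in UTC, so 9:00 AM Beijing time corresponds to 01:00 UTC. The helpers beijing_to_utc_cron and get_api_key are hypothetical; how the actual workflow hands the secret to the script (environment variable or the --api_key flag) is not stated here.

```python
import os
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))


def beijing_to_utc_cron(hour, minute=0):
    """Return a daily cron expression in UTC for a Beijing wall-clock time."""
    # The calendar date is arbitrary; only the time of day matters.
    local = datetime(2024, 1, 1, hour, minute, tzinfo=BEIJING)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


def get_api_key(env=None):
    """Read INTERNLM_API_KEY from the environment, failing loudly if unset."""
    env = os.environ if env is None else env
    key = env.get("INTERNLM_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INTERNLM_API_KEY is not set")
    return key


print(beijing_to_utc_cron(9))  # prints "0 1 * * *"
```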
Data Storage
- Original English paper data is stored in the Paper_metadata_download directory
- Translated papers are stored in the Translated_papers directory, organized by language code:
  - ja/: Japanese translations
  - ko/: Korean translations
  - es/: Spanish translations
  - fr/: French translations
- All files are saved in JSON format with names in YYYY-MM-DD.json format
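The layout above maps to file paths in a straightforward way. The sketch below builds the path for one language and date and writes the JSON file; the function names and the root parameter are illustrative and not taken from the repository.

```python
import json
from pathlib import Path


def translated_path(root, lang, date_str):
    """Build <root>/Translated_papers/<lang>/<YYYY-MM-DD>.json."""
    return Path(root) / "Translated_papers" / lang / f"{date_str}.json"


def save_translations(root, lang, date_str, papers):
    """Write one language's translations for one date and return the path."""
    path = translated_path(root, lang, date_str)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps Japanese, Korean, and accented text readable
        json.dump(papers, fh, ensure_ascii=False, indent=2)
    return path
```

Writing with ensure_ascii=False is a design choice for this sketch: it stores translated text as readable characters rather than \uXXXX escapes.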
Return Status
- Success: exit code 0
- Error: exit code 1
- No data: exit code 0 (with warning in log)
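The three outcomes listed above can be mapped to exit codes with a small wrapper. This is only a sketch of the convention: run and the fetch callable are hypothetical names, and the real scripts may structure their error handling differently.

```python
import logging

logger = logging.getLogger("papers")


def run(fetch):
    """Map the outcome of `fetch()` (a hypothetical callable) to an exit code."""
    try:
        papers = fetch()
    except Exception:
        # Any failure is logged with its traceback and reported as exit code 1.
        logger.exception("Download failed")
        return 1
    if not papers:
        # An empty result is not an error: warn in the log, exit with 0.
        logger.warning("No papers found; nothing to do")
        return 0
    logger.info("Fetched %d papers", len(papers))
    return 0
```

A script would typically end with sys.exit(run(some_fetch_function)), so that a scheduled job fails only on real errors and not on days without papers.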
Acknowledgments
- InternLM - For providing powerful translation capabilities
- HuggingFace - For providing daily paper data
Source: GitHub