Websites for learning NLP

BERT 

GPT-3

BERT fine-tuning strategies

  • The top layer of BERT is more useful for text classification;
  • With an appropriate layer-wise decreasing learning rate, BERT can overcome the catastrophic forgetting problem;
  • Within-task and in-domain further pre-training can significantly boost its performance;
  • A preceding multi-task fine-tuning is also helpful to the single-task fine-tuning, but its benefit is smaller than further pre-training;
  • BERT can still improve performance on tasks with only small amounts of training data.
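The layer-wise decreasing learning rate in the second bullet can be sketched as follows. This is a minimal sketch, not code from the notes: the function name, the decay factor 0.95, and the BERT-base parameter naming (`encoder.layer.<i>`, `embeddings`, a classifier head) are assumptions. Each encoder layer gets the learning rate of the layer above it multiplied by a decay factor, so lower (more general) layers change less during fine-tuning.

```python
import re

def layerwise_lrs(param_names, base_lr=2e-5, decay=0.95, num_layers=12):
    """Assign a per-parameter learning rate that shrinks toward the bottom layers.

    Assumes BERT-base-style parameter names: encoder layers appear as
    'encoder.layer.<i>.', embeddings contain 'embeddings', and anything
    else (pooler, classifier head) sits on top of the encoder.
    """
    lrs = {}
    for name in param_names:
        m = re.search(r"encoder\.layer\.(\d+)\.", name)
        if m:
            depth = int(m.group(1))   # 0 = bottom encoder layer
        elif "embeddings" in name:
            depth = -1                # embeddings sit below the first layer
        else:
            depth = num_layers        # head parameters get the full base_lr
        # Each step down from the top multiplies the learning rate by `decay`.
        lrs[name] = base_lr * (decay ** (num_layers - depth))
    return lrs
```

In practice the resulting rates would be passed to an optimizer as parameter groups (e.g. a list of `{"params": ..., "lr": ...}` dicts for `torch.optim.AdamW`), so the classifier head trains at `base_lr` while the embeddings train at the smallest rate.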

CTR (click-through-rate) prediction models