Title: huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline
Type
Software
Thomas Wolf, Lysandre Debut, Julien Chaumond, Victor SANH, Patrick von Platen, Aymeric Augustin, Rémi Louf, Funtowicz Morgan, Stefan Schweter, Denis, Sam Shleifer, erenup, Manuel Romero, Matt, Piero Molino, Grégory Châtel, Bram Vanroy, Tim Rault, Gunnlaugur Thor Briem, Anthony MOI, Malte Pietsch, Julien Plu, Catalin Voss, Bilal Khan, Fei Wang, Martin Malmsten, Louis Martin, Davide Fiocco, Clement, Ananya Harsh Jha (2020): huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline. Zenodo. Software. https://zenodo.org/record/3733180
Links
- Item record in Zenodo
- Digital object URL
Summary
T5 Model (@patrickvonplaten, @thomwolf)
T5 is a powerful encoder-decoder model that casts every NLP problem into a text-to-text format. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).
Five sets of pre-trained weights (pre-trained on a multi-task mixture of unsupervised and supervised tasks) are released. In ascending order of size, from 60 million to 11 billion parameters, they are:
t5-small, t5-base, t5-large, t5-3b, t5-11b
T5 can now be used with the translation and summarization pipelines.
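The text-to-text format described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `T5_PREFIXES` and `to_text_to_text` are hypothetical names introduced here, and the prefixes are the ones commonly used with the released T5 checkpoints. The second function shows one plausible way to call the translation pipeline, assuming transformers 2.7.0 or later is installed and the `t5-small` weights can be downloaded; it is defined but not called here.

```python
# Task prefixes in the style used with T5 checkpoints. This mapping is an
# illustrative subset chosen for this sketch, not a library API.
T5_PREFIXES = {
    "summarization": "summarize: ",
    "translation_en_to_de": "translate English to German: ",
    "translation_en_to_fr": "translate English to French: ",
}


def to_text_to_text(task: str, text: str) -> str:
    """Return `text` with the task prefix prepended (hypothetical helper)."""
    try:
        prefix = T5_PREFIXES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return prefix + text.strip()


def translate_en_to_de(text: str, model_name: str = "t5-small") -> str:
    """Translate English to German with the transformers pipeline.

    Assumption: transformers >= 2.7.0 is installed. Weights are downloaded
    on first use, so this function needs network access.
    """
    from transformers import pipeline  # imported lazily on purpose

    translator = pipeline(
        "translation_en_to_de", model=model_name, tokenizer=model_name
    )
    # Raw text is passed: the pipeline is expected to add the task prefix
    # itself, based on the model configuration.
    return translator(text)[0]["translation_text"]


# The prefixed string is what the model ultimately sees as input.
print(to_text_to_text("translation_en_to_de", "How are you?"))
```

Calling `translate_en_to_de("How are you?")` would then run the actual model. The smallest checkpoint is used as the default only to keep the download light; any of the five checkpoints listed above could be passed as `model_name`.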
Related:
- paper
- official code
- model available in Hugging Face's community models
- docs
Big thanks to the original authors, especially @craffel, who helped answer our questions, reviewed PRs and tested T5 extensively.
New BART checkpoint: bart-large-xsum (@sshleifer)
These weights are from BART fine-tuned on the XSum abstractive summarization challenge, which encourages shorter (more abstractive) summaries. It achieves state-of-the-art results.
BART summarization example with pytorch-lightning (@acarrera94)
New example: BART for summarization, using pytorch-lightning. It trains on CNN/DM and evaluates.
Translation pipeline (@patrickvonplaten)
A new translation pipeline is available, leveraging the T5 model. The T5 model was added to the summarization pipeline as well.
Memory improvements with BART (@sshleifer)
To bring down the memory footprint and compute required to run inference on BART, several improvements have been made to the model (approximate memory savings in parentheses):
- Remove the LM head and use the embedding matrix instead (~200MB)
- Call encoder before expanding input_ids (~1GB)
- SelfAttention only returns weights if config.output_attentions (~500MB)
- Two separate, smaller decoder attention masks (~500MB)
- Drop columns that are exclusively pad_token_id from input_ids in the evaluate_cnn example.
New model: XLMForTokenClassification (@sakares)
A new head was added to XLM: XLMForTokenClassification.
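The last BART memory item, dropping columns of `input_ids` that contain only `pad_token_id`, can be sketched in plain Python. This is an illustration of the idea rather than the code used in the evaluate_cnn example: `trim_pad_columns` is a hypothetical helper, the batch is a list of lists instead of a tensor, and the token ids (including pad id 1) are made up for the example.

```python
from typing import List


def trim_pad_columns(input_ids: List[List[int]], pad_token_id: int) -> List[List[int]]:
    """Drop every column whose entries are all `pad_token_id`.

    Hypothetical helper: operates on a rectangular list-of-lists batch.
    """
    if not input_ids:
        return []
    width = len(input_ids[0])
    if any(len(row) != width for row in input_ids):
        raise ValueError("all rows must have the same length")
    # Keep a column if at least one row holds a non-pad token there.
    keep = [
        col
        for col in range(width)
        if any(row[col] != pad_token_id for row in input_ids)
    ]
    return [[row[col] for col in keep] for row in input_ids]


# Two right-padded sequences; the ids are illustrative only.
batch = [
    [0, 713, 16, 2, 1, 1],
    [0, 713, 2, 1, 1, 1],
]
trimmed = trim_pad_columns(batch, pad_token_id=1)
print(trimmed)  # [[0, 713, 16, 2], [0, 713, 2, 1]]
```

With right-padded batches the all-pad columns are the trailing ones, so trimming them shortens every sequence the model has to process without changing any real token.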
More information
- DOI: 10.5281/zenodo.3733180
Dates
- Publication date: 2020
- Issued: March 30, 2020
Rights
- info:eu-repo/semantics/openAccess Open Access
Format
electronic resource
Related items
| Relationship | Uri |
|---|---|
| IsSupplementTo | https://github.com/huggingface/transformers/tree/v2.7.0 |
| IsVersionOf | https://doi.org/10.5281/zenodo.3385997 |
| IsPartOf | https://zenodo.org/communities/zenodo |