
Title: huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline

Type: Software

Citation: Thomas Wolf, Lysandre Debut, Julien Chaumond, Victor SANH, Patrick von Platen, Aymeric Augustin, Rémi Louf, Funtowicz Morgan, Stefan Schweter, Denis, Sam Shleifer, erenup, Manuel Romero, Matt, Piero Molino, Grégory Châtel, Bram Vanroy, Tim Rault, Gunnlaugur Thor Briem, Anthony MOI, Malte Pietsch, Julien Plu, Catalin Voss, Bilal Khan, Fei Wang, Martin Malmsten, Louis Martin, Davide Fiocco, Clement, Ananya Harsh Jha (2020): huggingface/transformers: T5 Model, BART summarization example and reduced memory, translation pipeline. Zenodo. Software. https://zenodo.org/record/3733180

Authors: Thomas Wolf (@huggingface) ; Lysandre Debut (Hugging Face) ; Julien Chaumond (Hugging Face) ; Victor SANH (@huggingface) ; Patrick von Platen ; Aymeric Augustin (@canalplus) ; Rémi Louf ; Funtowicz Morgan (HuggingFace) ; Stefan Schweter ; Denis ; Sam Shleifer (Huggingface) ; erenup ; Manuel Romero ; Matt ; Piero Molino ; Grégory Châtel (DisAItek & Intel AI Innovators) ; Bram Vanroy (@UGent) ; Tim Rault (@huggingface) ; Gunnlaugur Thor Briem (Qlik) ; Anthony MOI (Hugging Face) ; Malte Pietsch (deepset) ; Julien Plu (Leboncoin Lab) ; Catalin Voss (Stanford University) ; Bilal Khan ; Fei Wang (University of Southern California) ; Martin Malmsten ; Louis Martin ; Davide Fiocco ; Clement (@huggingface) ; Ananya Harsh Jha

Summary

T5 Model (@patrickvonplaten, @thomwolf)

T5 is a powerful encoder-decoder model that casts every NLP problem as a text-to-text task. It achieves state-of-the-art results on a variety of NLP tasks (summarization, question answering, ...).

Five sets of pre-trained weights (pre-trained on a multi-task mixture of unsupervised and supervised tasks) are released, in ascending order from 60 million to 11 billion parameters:

t5-small, t5-base, t5-large, t5-3b, t5-11b

T5 can now be used with the translation and summarization pipelines.
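
As a rough sketch of the text-to-text interface (using the class name T5WithLMHeadModel from the 2.x releases around this one; later versions rename it T5ForConditionalGeneration):

```python
from transformers import T5Tokenizer, T5WithLMHeadModel

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5WithLMHeadModel.from_pretrained("t5-small")

# The task is selected purely through a plain-text prefix ("summarize:",
# "translate English to German:", ...) because T5 treats every problem
# as text-to-text.
text = ("summarize: The tower is 324 metres tall, about the same height "
        "as an 81-storey building, and the tallest structure in Paris.")
input_ids = tokenizer.encode(text, return_tensors="pt")

output_ids = model.generate(input_ids, max_length=40, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```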

Related:

  • paper
  • official code
  • model available in Hugging Face's community models
  • docs

Big thanks to the original authors, especially @craffel who helped answer our questions, reviewed PRs and tested T5 extensively.

New BART checkpoint: bart-large-xsum (@sshleifer)

These weights are from BART fine-tuned on the XSum abstractive summarization dataset, which encourages shorter (more abstractive) summaries. It achieves state-of-the-art results on that benchmark.
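
A minimal sketch of loading the new checkpoint, assuming the shortcut name bart-large-xsum used around this release (later versions refer to it as facebook/bart-large-xsum):

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("bart-large-xsum")
model = BartForConditionalGeneration.from_pretrained("bart-large-xsum")

article = "..."  # any long news article
input_ids = tokenizer.encode(article, return_tensors="pt", max_length=1024)

# XSum-style summaries are short and abstractive, so a small max_length fits.
summary_ids = model.generate(input_ids, num_beams=4, max_length=60, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```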

BART summarization example with pytorch-lightning (@acarrera94)

New example: BART for summarization, using pytorch-lightning. It trains on CNN/DailyMail and runs evaluation.
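
The example lives in the repository's examples directory; the following is only a hypothetical outline of the shape of such a module, not the actual example code (the lm_labels argument name is the v2.7-era one, later renamed labels):

```python
import pytorch_lightning as pl
import torch
from transformers import BartForConditionalGeneration

class BartSummarizer(pl.LightningModule):
    """Hypothetical minimal module; the real example also handles data
    loading, generation and ROUGE evaluation on CNN/DailyMail."""

    def __init__(self, model_name="bart-large"):
        super().__init__()
        self.model = BartForConditionalGeneration.from_pretrained(model_name)

    def training_step(self, batch, batch_idx):
        # batch is assumed to contain pre-tokenized source/target ids.
        outputs = self.model(
            input_ids=batch["source_ids"],
            attention_mask=batch["source_mask"],
            lm_labels=batch["target_ids"],  # v2.7-era name; later `labels`
        )
        return {"loss": outputs[0]}

    def configure_optimizers(self):
        return torch.optim.Adam(self.model.parameters(), lr=3e-5)
```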

Translation pipeline (@patrickvonplaten)

A new translation pipeline is available, leveraging the T5 model. The T5 model was added to the summarization pipeline as well.
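
A quick sketch of using the pipelines (task names as registered around this release, with T5 assumed as the default translation model):

```python
from transformers import pipeline

# English-to-German translation, backed by T5.
translator = pipeline("translation_en_to_de")
print(translator("Hugging Face is a technology company based in New York."))
# [{'translation_text': '...'}]

# Summarization can now use T5 as well, instead of the default BART checkpoint.
summarizer = pipeline("summarization", model="t5-small", tokenizer="t5-small")
print(summarizer("A long news article ...", max_length=40))
# [{'summary_text': '...'}]
```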

Memory improvements with BART (@sshleifer)

In an effort to reduce the memory footprint and computing power necessary to run inference on BART, several improvements have been made to the model:

  • Remove the LM head and use the embedding matrix instead (~200MB)
  • Call the encoder before expanding input_ids (~1GB)
  • SelfAttention only returns weights if config.output_attentions (~500MB)
  • Two separate, smaller decoder attention masks (~500MB)
  • Drop columns that are exclusively pad_token_id from input_ids in the evaluate_cnn example (see the sketch after this list)
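
The last item can be sketched with a small helper in the spirit of the evaluate_cnn change (a hypothetical trim_batch function; the example's actual code may differ):

```python
import torch

def trim_batch(input_ids, pad_token_id, attention_mask=None):
    """Drop columns of input_ids that contain only pad_token_id."""
    keep = input_ids.ne(pad_token_id).any(dim=0)
    if attention_mask is None:
        return input_ids[:, keep]
    return input_ids[:, keep], attention_mask[:, keep]

batch = torch.tensor([[5, 6, 1, 1],
                      [7, 1, 1, 1]])  # 1 == pad_token_id
print(trim_batch(batch, pad_token_id=1))
# tensor([[5, 6],
#         [7, 1]])
```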

New model: XLMForTokenClassification (@sakares)

A new head was added to XLM: XLMForTokenClassification.
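
A minimal sketch of the new head (the checkpoint name and num_labels below are illustrative assumptions):

```python
import torch
from transformers import XLMForTokenClassification, XLMTokenizer

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-en-2048")
model = XLMForTokenClassification.from_pretrained("xlm-mlm-en-2048", num_labels=9)

input_ids = tokenizer.encode("Hugging Face is based in New York", return_tensors="pt")
logits = model(input_ids)[0]  # (batch, seq_len, num_labels): one score per token
predictions = torch.argmax(logits, dim=-1)
```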

More information

  • DOI: 10.5281/zenodo.3733180

Dates

  • Publication date: 2020
  • Issued: March 30, 2020

Rights

  • Open Access (info:eu-repo/semantics/openAccess)


Format

electronic resource

Related items

  • IsSupplementTo: https://github.com/huggingface/transformers/tree/v2.7.0
  • IsVersionOf: https://doi.org/10.5281/zenodo.3385997
  • IsPartOf: https://zenodo.org/communities/zenodo