Enhancement: Implement PyTorch's native Transformer model for reference
Although it has evolved since the original 2017 architecture, PyTorch's native Transformer is a good baseline for comparing speed and memory requirements, since its implementation is (most likely) optimized in C++/CUDA code.
The task is to implement a simple adapter inside optimus/models (probably named PyTorchTransformer), whose __init__() method initializes a PyTorch Transformer, and whose forward() method delegates to the wrapped module's forward().
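A minimal sketch of what such an adapter could look like. The class name and constructor parameters follow the naming suggested above, but everything else (hyperparameter defaults, argument names) is illustrative, not a finished design:

```python
import torch
import torch.nn as nn


class PyTorchTransformer(nn.Module):
    """Thin adapter around PyTorch's built-in nn.Transformer.

    Hypothetical sketch: the location (optimus/models) and signature
    are assumptions based on the proposal in this issue.
    """

    def __init__(self, d_model=512, nhead=8, num_layers=6,
                 dim_feedforward=2048, dropout=0.1):
        super().__init__()
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=nhead,
            num_encoder_layers=num_layers,
            num_decoder_layers=num_layers,
            dim_feedforward=dim_feedforward,
            dropout=dropout,
        )

    def forward(self, src, tgt):
        # nn.Transformer expects (seq_len, batch, d_model) inputs by default
        return self.transformer(src, tgt)
```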
Edit: This class may not be exactly what we want, since it contains both an encoder and a decoder (for translation). We only want the decoder part, so using only the TransformerDecoder is probably more appropriate. This, however, likely requires separate Embedding and positional encoding modules, plus a separate Linear layer at the end to project back to the vocabulary.
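One way the decoder-only variant could be assembled — a hedged sketch, not a final design. All names and hyperparameters are illustrative. Note that nn.TransformerDecoder expects a cross-attention memory input, so (as PyTorch's own language-modelling example does) a decoder-only stack can instead be built from nn.TransformerEncoder plus a causal mask, which is equivalent to a decoder without cross-attention:

```python
import math
import torch
import torch.nn as nn


class DecoderOnlyTransformer(nn.Module):
    """Decoder-only language model sketch: embedding + sinusoidal
    positional encoding + self-attention stack + output projection.

    Hypothetical: this is one plausible shape for the adapter
    discussed in this issue, not an existing implementation.
    """

    def __init__(self, vocab_size, d_model=512, nhead=8,
                 num_layers=6, dropout=0.1, max_len=5000):
        super().__init__()
        self.d_model = d_model
        self.embedding = nn.Embedding(vocab_size, d_model)

        # Fixed sinusoidal positional encodings, precomputed once
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len).unsqueeze(1).float()
        div_term = torch.exp(torch.arange(0, d_model, 2).float()
                             * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

        layer = nn.TransformerEncoderLayer(d_model, nhead, dropout=dropout)
        self.layers = nn.TransformerEncoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (seq_len, batch) integer ids
        seq_len = tokens.size(0)
        x = self.embedding(tokens) * math.sqrt(self.d_model)
        x = x + self.pe[:seq_len].unsqueeze(1)
        # Causal mask: each position may attend only to earlier positions
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf")), diagonal=1
        )
        x = self.layers(x, mask=mask)
        return self.out(x)  # (seq_len, batch, vocab_size) logits
```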
A good starting point is probably the PyTorch Transformer language-modelling example, which covers almost everything we need (there are a few things our framework does differently, so those parts will need to be adapted).