In essence, GPT-3 is a transformer model. Transformer models are sequence-to-sequence deep learning models that produce an output text sequence from a given input sequence. They are designed for text-generation tasks such as question answering, text summarization, and machine translation. The following image shows how a transformer model iteratively generates a French translation of an English input sequence.
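The iterative generation loop described above can be sketched as follows. This is a minimal illustration, not a real transformer: the `next_token` function is a hypothetical stand-in for the model's forward pass, with a hard-coded English-to-French example, but the loop structure (feed the growing output back in, stop at end-of-sequence) is the same one a transformer decoder uses.

```python
def next_token(input_tokens, generated):
    # Toy "model": a fixed English->French lookup, purely illustrative.
    # A real transformer would run attention layers over the input and
    # the tokens generated so far to predict the next token.
    translation = {("I", "am", "a", "student"): ["Je", "suis", "étudiant", "<eos>"]}
    target = translation[tuple(input_tokens)]
    return target[len(generated)]

def generate(input_tokens, max_len=10):
    generated = []
    for _ in range(max_len):
        token = next_token(input_tokens, generated)
        if token == "<eos>":  # stop when the model emits end-of-sequence
            break
        generated.append(token)  # the output grows one token per step
    return generated

print(generate(["I", "am", "a", "student"]))
# → ['Je', 'suis', 'étudiant']
```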
GPT is a Transformer-based architecture and training procedure for natural language processing tasks. First, a language modeling objective is applied to unlabeled data to learn the initial parameters of the neural network. These parameters are then adapted to a target task using the corresponding supervised objective.
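The two-phase procedure (unsupervised pretraining, then supervised adaptation) can be sketched with a deliberately tiny stand-in model. Everything here is an assumption for illustration: the "model" is a single scalar parameter, and the two objectives are toy squared-error losses rather than the real language-modeling and task losses; only the shape of the procedure matches GPT's.

```python
def sgd(param, grad_fn, data, lr=0.1, steps=100):
    # Plain stochastic gradient descent over the dataset.
    for _ in range(steps):
        for example in data:
            param -= lr * grad_fn(param, example)
    return param

# Phase 1: "language modeling" objective on unlabeled data.
# Toy stand-in: pull the parameter toward each unlabeled value,
# i.e. minimize (w - x)^2.
unlabeled = [1.0, 2.0, 3.0]
lm_grad = lambda w, x: 2 * (w - x)
w = sgd(0.0, lm_grad, unlabeled)  # pretrained initial parameters

# Phase 2: adapt the pretrained parameter with a supervised objective
# on (input, label) pairs, here minimizing (w * x - y)^2.
labeled = [(1.0, 2.0), (2.0, 4.0)]
sup_grad = lambda w, pair: 2 * (w * pair[0] - pair[1]) * pair[0]
w = sgd(w, sup_grad, labeled)

print(round(w, 2))  # → 2.0, the fine-tuned parameter
```

The key point the sketch preserves is that phase 2 starts from the parameters phase 1 produced, rather than from scratch.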