// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

XLNet

XLNet is an AI model that learns language by predicting words in a mixed-up order, which helps it understand context better than models that only look at words from left to right.

TECHNICAL DEFINITION

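XLNet is a generalized autoregressive pretraining method. Its permutation language modeling objective maximizes the expected log-likelihood of a sequence over permutations of the factorization order, so each token learns to use context from both sides. The input sequence itself is not reordered; only the order in which tokens are predicted changes. This avoids two limitations of BERT's masked language modeling: the artificial [MASK] tokens that appear during pretraining but not during fine-tuning, and the assumption that masked tokens are predicted independently of one another.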
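The idea of a factorization order can be made concrete with a small sketch. The Python below is illustrative only and is not XLNet's actual implementation: the function names `permutation_mask` and `sample_order` are hypothetical, and the sketch covers only the basic "who may see whom" mask. The real model additionally uses two-stream self-attention and predicts only a subset of tokens per sampled order.

```python
import random


def permutation_mask(order):
    """Build mask[i][j] = True if position i may attend to position j.

    `order` is a factorization order: a permutation of positions 0..n-1.
    The token at position order[t] is predicted from the tokens at the
    positions order[:t], wherever those positions sit in the sentence.
    """
    n = len(order)
    rank = [0] * n  # rank[pos] = step at which `pos` is predicted
    for step, pos in enumerate(order):
        rank[pos] = step
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


def sample_order(n, seed=None):
    """Sample one factorization order uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    return order


if __name__ == "__main__":
    tokens = ["New", "York", "is", "a", "city"]
    order = [2, 4, 0, 3, 1]  # one hand-picked order, for readability
    mask = permutation_mask(order)
    for step, pos in enumerate(order):
        context = [tokens[j] for j in range(len(tokens)) if mask[pos][j]]
        print(f"step {step}: predict {tokens[pos]!r} from {context}")
```

With the identity order `[0, 1, 2, ...]` the mask reduces to ordinary left-to-right language modeling. Averaging over many sampled orders is what lets each position see context on both sides.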

BACKGROUND

XLNet was introduced in 2019 by researchers at Carnegie Mellon University and Google Brain. It builds on the Transformer-XL architecture, adopting its segment-level recurrence mechanism and relative positional encoding, and was proposed as an autoregressive alternative to BERT-style masked pretraining.

SYNONYMS & ALIASES

  • Permutation Language Model
  • Google XLNet

USAGE NOTE

XLNet is typically fine-tuned for tasks that depend on strong contextual understanding, such as question answering and natural language inference.

DEVELOPERS

Organizations developing technology related to XLNet.

  • Google AI

    XLNet was introduced by researchers at Google Brain (part of Google AI) and Carnegie Mellon University. Google AI conducts research in natural language processing and foundational model development that influences how models like XLNet are designed and used.

  • Hugging Face

    Developers of the widely used Transformers library, Hugging Face provides open-source implementations and extensive tooling for applying models like XLNet in AI engineering and prompt design. They are crucial for the practical application and deployment of such models.

  • IBM Research

    IBM Research conducts fundamental and applied AI research, including advanced natural language processing. They develop enterprise AI platforms and methodologies that incorporate or are influenced by state-of-the-art models for real-world applications.

  • Microsoft Research

    A leading research arm developing foundational AI technologies and contributing to the understanding and application of large language models. Microsoft Research's work influences AI engineering practices across the industry and supports models used in Azure AI services.

  • Amazon Web Services (AWS)

    AWS provides cloud-based machine learning services and tools like Amazon SageMaker that enable AI engineers to train, deploy, and manage models such as XLNet, offering a comprehensive platform for related AI engineering tasks.

  • Salesforce AI Research

    Salesforce AI Research focuses on advanced AI research for enterprise applications, including natural language processing. They often develop new methods and models that leverage or build upon the architectures of prominent large language models.

  • NVIDIA

    NVIDIA develops specialized hardware and software platforms, such as NVIDIA NeMo and TensorRT, for optimizing and accelerating the training and deployment of large language models. Their work directly supports the efficient AI engineering of models like XLNet.

RELATED TERMS IN MODEL ARCHITECTURE