Systems and methods for using contrastive pre-training to generate text and code embeddings
Topics: LLMO / GEO, OpenAI / ChatGPT
This patent describes OpenAI’s systems and methods for generating text and code embeddings using contrastive pre-training. The technology trains machine learning models on positive example pairs (semantically related items) and negative example pairs (unrelated items) so that the models learn vector representations in which related items sit close together and unrelated items sit far apart. Comparing two of the resulting embeddings gives a measure of semantic similarity between pieces of text or code, which makes the embeddings useful for organizing and processing natural language and code in a form that machine learning models and algorithms can consume.
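To make the idea of positive and negative pairs concrete, here is a minimal sketch in plain Python. It is not taken from the patent: the summary above does not say which loss function is used, so the sketch assumes a common choice, an InfoNCE-style loss with in-batch negatives, where each query's own partner is the positive and every other item in the batch acts as a negative. The function names (`cosine_similarity`, `contrastive_loss`), the `temperature` value, and the toy two-dimensional vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(queries, candidates, temperature=0.1):
    """Assumed InfoNCE-style loss with in-batch negatives.

    queries[i] and candidates[i] form a positive pair; every
    candidates[j] with j != i is treated as a negative for queries[i].
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [cosine_similarity(query, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates in the batch.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Negative log-probability assigned to the positive candidate.
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy embeddings (hypothetical): two items pointing in orthogonal directions.
embeddings = [[1.0, 0.0], [0.0, 1.0]]

# Each query paired with its matching candidate: low loss.
loss_matched = contrastive_loss(embeddings, embeddings)

# Candidates swapped, so each "positive" is actually the unrelated item: high loss.
loss_mismatched = contrastive_loss(embeddings, embeddings[::-1])

print(loss_matched < loss_mismatched)
```

Under these assumptions, training would push the loss down by moving the embeddings of related items closer together and those of unrelated items further apart. Once a model is trained, the same `cosine_similarity` comparison is one plausible way to score how semantically close two pieces of text or code are.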