CuBERT is a collection of BERT-based models trained on source code, specifically five popular programming languages: Python, Java, JavaScript, PHP, and Ruby. Developed by Google Research, CuBERT leverages the BERT architecture to understand and process source code for a variety of downstream software engineering tasks. It is designed to support classification and sequence prediction tasks on code, enabling powerful static analysis and code intelligence capabilities.
Key Features
CuBERT is a collection of BERT-based models trained on source code, specifically five popular programming languages: Python, Java, JavaScript, PHP, and Ruby. Developed by Google Research, CuBERT leverages the BERT architecture to understand and process source code for a variety of downstream software engineering tasks. It is designed to support classification and sequence prediction tasks on code, enabling powerful static analysis and code intelligence capabilities.