| license: mit | |
| datasets: | |
| - WhereIsAI/github-issue-similarity | |
| language: | |
| - en | |
| library_name: sentence-transformers | |
| pipeline_tag: feature-extraction | |
| # WhereIsAI/UAE-Code-Large-V1 | |
| 📢 `WhereIsAI/UAE-Code-Large-V1` **is licensed under MIT. Feel free to use it in any scenario.** | |
| If you use it in academic work, please cite us. 👉 [citation info](#citation). | |
| This model builds on [WhereIsAI/UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1) and is fine-tuned on the [GIS: Github Issue Similarity](https://huggingface.co/datasets/WhereIsAI/github-issue-similarity) dataset using the [AnglE](https://github.com/SeanLee97/AnglE) loss ([paper](https://arxiv.org/abs/2309.12871)). | |
| It can be used to measure **code/issue similarity**. | |
| Results (test set): | |
| - Spearman correlation: 71.19 | |
| - Accuracy: 84.37 | |
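The card does not say how these two numbers were computed. As a rough illustration only, the sketch below assumes a common setup for pairwise similarity benchmarks: each test pair carries a binary label, the model scores the pair with cosine similarity, Spearman correlation is taken between scores and labels, and accuracy comes from thresholding the scores. The helper names, the toy data, and the 0.5 threshold are all hypothetical and are not taken from the model's evaluation code.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def spearman(scores, labels):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(scores), average_ranks(labels))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the binary label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Toy similarity scores and labels, invented purely for illustration.
scores = [0.91, 0.12, 0.75, 0.40, 0.66]
labels = [1, 0, 1, 0, 1]
print("spearman:", round(spearman(scores, labels), 4))
print("accuracy:", accuracy(scores, labels))
```

In practice the threshold would normally be tuned on a validation split rather than fixed in advance.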
| ## Usage | |
| ### 1. angle-emb | |
| You can use it via `angle-emb` as follows: | |
| install: | |
| ```bash | |
| python -m pip install -U angle-emb | |
| ``` | |
| example: | |
| ```python | |
| from scipy import spatial | |
| from angle_emb import AnglE | |
| model = AnglE.from_pretrained('WhereIsAI/UAE-Code-Large-V1').cuda() | |
| quick_sort = '''# Approach 2: Quicksort using list comprehension | |
| def quicksort(arr): | |
|     if len(arr) <= 1: | |
|         return arr | |
|     else: | |
|         pivot = arr[0] | |
|         left = [x for x in arr[1:] if x < pivot] | |
|         right = [x for x in arr[1:] if x >= pivot] | |
|         return quicksort(left) + [pivot] + quicksort(right) | |
| # Example usage | |
| arr = [1, 7, 4, 1, 10, 9, -2] | |
| sorted_arr = quicksort(arr) | |
| print("Sorted Array in Ascending Order:") | |
| print(sorted_arr)''' | |
| bubble_sort = '''def bubblesort(elements): | |
|     # Looping from size of array from last index[-1] to index [0] | |
|     for n in range(len(elements)-1, 0, -1): | |
|         swapped = False | |
|         for i in range(n): | |
|             if elements[i] > elements[i + 1]: | |
|                 swapped = True | |
|                 # swapping data if the element is less than next element in the array | |
|                 elements[i], elements[i + 1] = elements[i + 1], elements[i] | |
|         if not swapped: | |
|             # exiting the function if we didn't make a single swap | |
|             # meaning that the array is already sorted. | |
|             return | |
| elements = [39, 12, 18, 85, 72, 10, 2, 18] | |
| print("Unsorted list is,") | |
| print(elements) | |
| bubblesort(elements) | |
| print("Sorted Array is, ") | |
| print(elements)''' | |
| vecs = model.encode([ | |
|     'def echo(): print("hello world")', | |
|     quick_sort, | |
|     bubble_sort | |
| ]) | |
| print('cos sim (0, 1):', 1 - spatial.distance.cosine(vecs[0], vecs[1])) | |
| print('cos sim (0, 2):', 1 - spatial.distance.cosine(vecs[0], vecs[2])) | |
| print('cos sim (1, 2):', 1 - spatial.distance.cosine(vecs[1], vecs[2])) | |
| ``` | |
| output: | |
| ``` | |
| cos sim (0, 1): 0.34329649806022644 | |
| cos sim (0, 2): 0.3627094626426697 | |
| cos sim (1, 2): 0.6972219347953796 | |
| ``` | |
| ### 2. sentence-transformers | |
| You can also use it via `sentence-transformers`: | |
| ```python | |
| from scipy import spatial | |
| from sentence_transformers import SentenceTransformer | |
| model = SentenceTransformer('WhereIsAI/UAE-Code-Large-V1').cuda() | |
| quick_sort = '''# Approach 2: Quicksort using list comprehension | |
| def quicksort(arr): | |
|     if len(arr) <= 1: | |
|         return arr | |
|     else: | |
|         pivot = arr[0] | |
|         left = [x for x in arr[1:] if x < pivot] | |
|         right = [x for x in arr[1:] if x >= pivot] | |
|         return quicksort(left) + [pivot] + quicksort(right) | |
| # Example usage | |
| arr = [1, 7, 4, 1, 10, 9, -2] | |
| sorted_arr = quicksort(arr) | |
| print("Sorted Array in Ascending Order:") | |
| print(sorted_arr)''' | |
| bubble_sort = '''def bubblesort(elements): | |
|     # Looping from size of array from last index[-1] to index [0] | |
|     for n in range(len(elements)-1, 0, -1): | |
|         swapped = False | |
|         for i in range(n): | |
|             if elements[i] > elements[i + 1]: | |
|                 swapped = True | |
|                 # swapping data if the element is less than next element in the array | |
|                 elements[i], elements[i + 1] = elements[i + 1], elements[i] | |
|         if not swapped: | |
|             # exiting the function if we didn't make a single swap | |
|             # meaning that the array is already sorted. | |
|             return | |
| elements = [39, 12, 18, 85, 72, 10, 2, 18] | |
| print("Unsorted list is,") | |
| print(elements) | |
| bubblesort(elements) | |
| print("Sorted Array is, ") | |
| print(elements)''' | |
| vecs = model.encode([ | |
|     'def echo(): print("hello world")', | |
|     quick_sort, | |
|     bubble_sort | |
| ]) | |
| print('cos sim (0, 1):', 1 - spatial.distance.cosine(vecs[0], vecs[1])) | |
| print('cos sim (0, 2):', 1 - spatial.distance.cosine(vecs[0], vecs[2])) | |
| print('cos sim (1, 2):', 1 - spatial.distance.cosine(vecs[1], vecs[2])) | |
| ``` | |
| output: | |
| ``` | |
| cos sim (0, 1): 0.34329649806022644 | |
| cos sim (0, 2): 0.3627094626426697 | |
| cos sim (1, 2): 0.6972219347953796 | |
| ``` | |
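A common next step after encoding is to rank several candidates against one query, for example to find the snippet or issue closest to a given one. The sketch below shows that step using only the standard library, with plain Python lists standing in for the rows of `vecs`. `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, and the three-dimensional vectors are invented for illustration, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, similarity) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (assumption: real ones come from model.encode).
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],
    [1.0, 0.1, 0.9],
    [0.5, 0.5, 0.5],
]
for index, score in rank_by_similarity(query, candidates):
    print(index, round(score, 4))
```

With the real model, `query` and `candidates` would be rows of the array returned by `model.encode(...)`, assuming it returns array-like rows of floats; iterating over such a row works the same way as iterating over a list here.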
| ## Citation | |
| ```bibtex | |
| @article{li2023angle, | |
| title={AnglE-optimized Text Embeddings}, | |
| author={Li, Xianming and Li, Jing}, | |
| journal={arXiv preprint arXiv:2309.12871}, | |
| year={2023} | |
| } | |
| ``` |