ml6team/keyphrase-extraction-kbir-inspec

Token Classification

keyphrase-extraction

DeDeckerThomas committed on Apr 29, 2022

Commit 4f19e0a · 1 Parent(s): 4cebcbc

Update README.md

Files changed (1)

README.md +33 -6

@@ -54,20 +54,29 @@ def extract_keyphrases(example, predictions, tokenizer, index=0):
     )
     return np.unique([kp.strip() for kp in extracted_kps])
+```
+```python
 # Load model and tokenizer
 model_name = "DeDeckerThomas/keyphrase-extraction-kbir-inspec"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 model = AutoModelForTokenClassification.from_pretrained(model_name)
+```
+```python
 # Inference
 text = """
-""".replace(
-    "\n", ""
-)
+Keyword extraction is a technique in text analysis where you extract the important keywords
+from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics, are widely used
+for the extraction process. The fact that these methods have been widely used in the community has
+the advantage that there are many easy-to-use libraries. Now with the recent innovations in
+deep learning methods (such as recurrent neural networks and transformers, GANS, …),
+keyword extraction can be improved. These new methods also focus on the semantics
+and context of a document, which is quite an improvement.
+""".replace("\n", "")
 encoded_input = tokenizer(
-    text.split(" "),
-    is_split_into_words=True,
+    text,
     truncation=True,
     padding="max_length",
     max_length=max_length,
@@ -87,6 +96,24 @@ print("***** Prediction *****")
 print(extracted_kps)
 ```
+```
+***** Input Document *****
+Keyword extraction is a technique in text analysis where you extract the important keywords
+from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
+Currently, classical machine learning methods, that use statistics and linguistics, are widely used
+for the extraction process. The fact that these methods have been widely used in the community has
+the advantage that there are many easy-to-use libraries. Now with the recent innovations in
+deep learning methods (such as recurrent neural networks and transformers, GANS, …),
+keyword extraction can be improved. These new methods also focus on the semantics
+and context of a document, which is quite an improvement.
+***** Prediction *****
+['Artificial Intelligence' 'GANS' 'Keyword extraction'
+ 'classical machine learning methods' 'context' 'deep learning methods'
+ 'keyword extraction' 'linguistics' 'recurrent neural networks'
+ 'semantics' 'statistics' 'text analysis' 'transformers']
+```
 ## 📚 Trainig Dataset
 ## 👷‍♂️ Training procedure
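
The prediction shown in this diff lists capitalized phrases first and contains both `'Keyword extraction'` and `'keyword extraction'`. That is consistent with the `np.unique([kp.strip() for kp in extracted_kps])` line in the hunk context, since `np.unique` returns sorted, case-sensitive unique values. The snippet below is only a minimal sketch of that behavior using the standard library as a stand-in for NumPy; `dedupe_keyphrases` and the sample list are hypothetical and not part of the README.

```python
def dedupe_keyphrases(extracted_kps):
    """Strip whitespace, drop exact duplicates, and sort by code point.

    Assumption: this mirrors what np.unique does for a list of strings
    (sorted, case-sensitive unique values); it is not the README's code.
    """
    return sorted({kp.strip() for kp in extracted_kps})


# Hypothetical raw spans, as a token classifier might emit them.
raw_kps = [
    " keyword extraction",
    "Keyword extraction ",
    "GANS",
    "keyword extraction",
    "text analysis",
]

unique_kps = dedupe_keyphrases(raw_kps)
print(unique_kps)
```

Because the comparison is case-sensitive, differently cased variants of the same phrase survive as separate entries; lowercasing before deduplication would merge them, at the cost of losing the original casing of names such as `GANS`.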