Commit 4f19e0a
1 Parent(s): 4cebcbc
Update README.md
README.md CHANGED
|
@@ -54,20 +54,29 @@ def extract_keyphrases(example, predictions, tokenizer, index=0):
|
|
| 54 |
)
|
| 55 |
return np.unique([kp.strip() for kp in extracted_kps])
|
| 56 |
|
| 57 |
# Load model and tokenizer
|
| 58 |
model_name = "DeDeckerThomas/keyphrase-extraction-kbir-inspec"
|
| 59 |
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| 60 |
model = AutoModelForTokenClassification.from_pretrained(model_name)
|
| 61 |
-
|
|
|
|
| 62 |
# Inference
|
| 63 |
text = """
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
|
| 68 |
encoded_input = tokenizer(
|
| 69 |
-
text
|
| 70 |
-
is_split_into_words=True,
|
| 71 |
truncation=True,
|
| 72 |
padding="max_length",
|
| 73 |
max_length=max_length,
|
|
@@ -87,6 +96,24 @@ print("***** Prediction *****")
|
|
| 87 |
print(extracted_kps)
|
| 88 |
```
|
| 89 |
|
| 90 |
## 📚 Training Dataset
|
| 91 |
## 👷‍♂️ Training procedure
|
| 92 |
|
|
|
|
| 54 |
)
|
| 55 |
return np.unique([kp.strip() for kp in extracted_kps])
|
| 56 |
|
| 57 |
+
```
|
| 58 |
+
|
| 59 |
+
```python
|
| 60 |
# Load model and tokenizer
|
| 61 |
model_name = "DeDeckerThomas/keyphrase-extraction-kbir-inspec"
|
| 62 |
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
| 63 |
model = AutoModelForTokenClassification.from_pretrained(model_name)
|
| 64 |
+
```
|
| 65 |
+
```python
|
| 66 |
# Inference
|
| 67 |
text = """
|
| 68 |
+
Keyword extraction is a technique in text analysis where you extract the important keywords
|
| 69 |
+
from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
|
| 70 |
+
Currently, classical machine learning methods, that use statistics and linguistics, are widely used
|
| 71 |
+
for the extraction process. The fact that these methods have been widely used in the community has
|
| 72 |
+
the advantage that there are many easy-to-use libraries. Now with the recent innovations in
|
| 73 |
+
deep learning methods (such as recurrent neural networks and transformers, GANS, …),
|
| 74 |
+
keyword extraction can be improved. These new methods also focus on the semantics
|
| 75 |
+
and context of a document, which is quite an improvement.
|
| 76 |
+
""".replace("\n", "")
|
| 77 |
|
| 78 |
encoded_input = tokenizer(
|
| 79 |
+
text,
|
|
|
|
| 80 |
truncation=True,
|
| 81 |
padding="max_length",
|
| 82 |
max_length=max_length,
|
|
|
|
| 96 |
print(extracted_kps)
|
| 97 |
```
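A side note on the `.replace("\n", "")` call in the added snippet: deleting newlines this way joins adjacent lines with no separator, so words at line boundaries fuse unless each line ends in a trailing space (whitespace that may simply not have survived in this diff view). The sketch below uses a shortened, hypothetical two-line sample to show the difference. Joining on whitespace is one possible alternative, not necessarily what the author intended.

```python
# Hypothetical two-line sample; not the README's full text.
sample = """
Keyword extraction is a technique in text analysis where you extract the important keywords
from a text.
"""

# Same call as in the diff: newlines are deleted, so line ends fuse together.
fused = sample.replace("\n", "")

# One alternative (an assumption, not the author's code): split on any
# whitespace and re-join with single spaces.
spaced = " ".join(sample.split())

print(fused)
print(spaced)
```

If the source lines do carry trailing spaces, both variants produce readable text; the whitespace-based join simply does not depend on them.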
|
| 98 |
|
| 99 |
+
```
|
| 100 |
+
***** Input Document *****
|
| 101 |
+
Keyword extraction is a technique in text analysis where you extract the important keywords
|
| 102 |
+
from a text. Since this is a time-consuming process, Artificial Intelligence is used to automate it.
|
| 103 |
+
Currently, classical machine learning methods, that use statistics and linguistics, are widely used
|
| 104 |
+
for the extraction process. The fact that these methods have been widely used in the community has
|
| 105 |
+
the advantage that there are many easy-to-use libraries. Now with the recent innovations in
|
| 106 |
+
deep learning methods (such as recurrent neural networks and transformers, GANS, …),
|
| 107 |
+
keyword extraction can be improved. These new methods also focus on the semantics
|
| 108 |
+
and context of a document, which is quite an improvement.
|
| 109 |
+
|
| 110 |
+
***** Prediction *****
|
| 111 |
+
['Artificial Intelligence' 'GANS' 'Keyword extraction'
|
| 112 |
+
'classical machine learning methods' 'context' 'deep learning methods'
|
| 113 |
+
'keyword extraction' 'linguistics' 'recurrent neural networks'
|
| 114 |
+
'semantics' 'statistics' 'text analysis' 'transformers']
|
| 115 |
+
```
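The prediction above is sorted and free of exact duplicates because `extract_keyphrases` returns `np.unique(...)`, which sorts its input and drops repeated values. The comparison is case-sensitive, which is why 'Keyword extraction' and 'keyword extraction' both appear and why capitalized phrases come first. A minimal pure-Python sketch of that behavior, using `sorted(set(...))` as a stand-in for `np.unique` and a hypothetical list of raw phrases:

```python
# Hypothetical raw phrases, as they might come out of token decoding
# (with stray whitespace and repeats); not actual model output.
raw_kps = [
    " keyword extraction",
    "Keyword extraction ",
    "text analysis",
    "keyword extraction",
    "GANS",
]

# Stand-in for np.unique([kp.strip() for kp in extracted_kps]):
# strip, de-duplicate, then sort by code point (uppercase before lowercase).
unique_kps = sorted({kp.strip() for kp in raw_kps})

# Optional extra step (an assumption, not part of the README): merge
# phrases that differ only by case, keeping the first spelling seen.
seen = {}
for kp in unique_kps:
    seen.setdefault(kp.lower(), kp)
case_merged = list(seen.values())

print(unique_kps)
print(case_merged)
```

Whether case variants should be merged depends on the use case; the README's function keeps them separate.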
|
| 116 |
+
|
| 117 |
## 📚 Training Dataset
|
| 118 |
## 👷‍♂️ Training procedure
|
| 119 |
|