EarthSpeciesProject/NatureLM-audio

Milad Alizadeh committed on Apr 24

Commit b77c285 (unverified) · 1 parent: 8d6eb55

Tweaks to the model card

Files changed (1)

README.md +35 -34

@@ -36,49 +36,23 @@ NatureLM-audio is an audio-language model designed to address bioacoustic tasks
 ### Direct Use
 NatureLM-audio can be used directly for bioacoustic tasks such as species classification, detection, and captioning. It is particularly useful for biodiversity monitoring, conservation, and animal behavior studies.
-```python
-from NatureLM.models import NatureLM
-# Download the model from HuggingFace
-model = NatureLM.from_pretrained("EarthSpeciesProject/NatureLM-audio")
-model = model.eval().to("cuda")
-```
-```python
-from NatureLM.infer import Pipeline
-# pass your audio files in as file paths or as numpy arrays
-# NOTE: the Pipeline class will automatically load the audio and convert them to numpy arrays
-audio_paths = ["assets/nri-GreenTreeFrogEvergladesNP.mp3"]  # wav, mp3, ogg, flac are supported.
-# Create a list of queries. You may also pass a single query as a string for multiple audios.
-# The same query will be used for all audios.
-queries = ["What is the common name for the focal species in the audio? Answer:"]
-pipeline = Pipeline(model=model)
-# NOTE: you can also just do pipeline = Pipeline() which will download the model automatically
-# Run the model over the audio in sliding windows of 10 seconds with a hop length of 10 seconds
-results = pipeline(audio_paths, queries, window_length_seconds=10.0, hop_length_seconds=10.0)
-print(results)
-# ['#0.00s - 10.00s#: Green Treefrog\n']
-```
 Example prompts:
-Prompt: What is the common name for the focal species in the audio?
-Answer: Humpback Whale
-Prompt: Which of these, if any, are present in the audio recording? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.
 Answer: Gibbon duet
-Prompt: What is the common name for the focal species in the audio?
 Answer: Spectacled Tetraka
-Prompt: What is the life stage of the focal species in the audio?
 Answer: Juvenile
-Prompt: What type of vocalization is heard from the focal species in the audio? Answer with either 'call' or 'song'.
 Prompt: Caption the audio, using the common name for any animal species.
@@ -103,7 +77,34 @@ Users should be aware of the risks, biases, and limitations of the model. It is
 ## How to Get Started with the Model
-Refer to the GitHub [repository](https://github.com/earthspecies/naturelm-audio) for examples of model usage.
 ## Training Details

 ### Direct Use
 NatureLM-audio can be used directly for bioacoustic tasks such as species classification, detection, and captioning. It is particularly useful for biodiversity monitoring, conservation, and animal behavior studies.
 Example prompts:
+Prompt: What is the common name for the focal species in the audio?
+Answer: Humpback Whale
+Prompt: Which of these, if any, are present in the audio recording? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.
 Answer: Gibbon duet
+Prompt: What is the common name for the focal species in the audio?
 Answer: Spectacled Tetraka
+Prompt: What is the life stage of the focal species in the audio?
 Answer: Juvenile
+Prompt: What type of vocalization is heard from the focal species in the audio?
+Answer with either 'call' or 'song'.
 Prompt: Caption the audio, using the common name for any animal species.
 ## How to Get Started with the Model
+Instantiating the model:
+```python
+from NatureLM.models import NatureLM
+# Download the model from HuggingFace
+model = NatureLM.from_pretrained("EarthSpeciesProject/NatureLM-audio")
+model = model.eval().to("cuda")
+```
+Using the model:
+```python
+from NatureLM.infer import Pipeline
+audio_paths = ["assets/nri-GreenTreeFrogEvergladesNP.mp3"]
+queries = ["What is the common name for the focal species in the audio? Answer:"]
+pipeline = Pipeline(model=model)
+# Run the model over the audio in sliding windows of 10 seconds with a hop length of 10 seconds
+results = pipeline(audio_paths, queries, window_length_seconds=10.0, hop_length_seconds=10.0)
+print(results)
+# ['#0.00s - 10.00s#: Green Treefrog\n']
+```
+Refer to the GitHub [repository](https://github.com/earthspecies/naturelm-audio) for more details.
 ## Training Details
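
The usage example in the diff above runs the pipeline in sliding windows (`window_length_seconds=10.0`, `hop_length_seconds=10.0`) and prints results prefixed with a label such as `#0.00s - 10.00s#`. As a rough illustration of what those two parameters imply, the sketch below computes window boundaries for a clip of a given duration. It is not the `NatureLM.infer.Pipeline` implementation: the helper names `window_bounds` and `format_label` are hypothetical, and clipping the final window to the end of the clip (rather than padding it) is an assumption.

```python
def window_bounds(duration_s, window_length_s=10.0, hop_length_s=10.0):
    """Return (start, end) pairs in seconds that cover a clip of the given duration.

    Hypothetical helper for illustration only; not part of the NatureLM package.
    """
    if window_length_s <= 0 or hop_length_s <= 0:
        raise ValueError("window and hop lengths must be positive")
    bounds = []
    index = 0
    # Multiply instead of accumulating to avoid floating-point drift in the starts.
    while index * hop_length_s < duration_s:
        start = index * hop_length_s
        # Assumption: the last window is clipped to the clip end, not padded.
        end = min(start + window_length_s, duration_s)
        bounds.append((start, end))
        index += 1
    return bounds


def format_label(start, end):
    """Format a window the way the example output in the diff labels it."""
    return f"#{start:.2f}s - {end:.2f}s#"


if __name__ == "__main__":
    # A 25-second clip with non-overlapping 10-second windows.
    for start, end in window_bounds(25.0):
        print(format_label(start, end))
```

With the hop equal to the window length, as in the README example, the windows tile the clip without overlap. A hop smaller than the window length would produce overlapping windows and therefore more model calls per clip.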