Safetensors
EarthSpeciesProject
NatureLM
Milad Alizadeh committed
Commit b77c285 · unverified · 1 Parent(s): 8d6eb55

Tweaks to the model card

Files changed (1)
  1. README.md +35 -34
README.md CHANGED
@@ -36,49 +36,23 @@ NatureLM-audio is an audio-language model designed to address bioacoustic tasks
  ### Direct Use
 
  NatureLM-audio can be used directly for bioacoustic tasks such as species classification, detection, and captioning. It is particularly useful for biodiversity monitoring, conservation, and animal behavior studies.
- ```python
- from NatureLM.models import NatureLM
-
- # Download the model from HuggingFace
- model = NatureLM.from_pretrained("EarthSpeciesProject/NatureLM-audio")
- model = model.eval().to("cuda")
- ```
-
- ```python
- from NatureLM.infer import Pipeline
-
- # pass your audio files in as file paths or as numpy arrays
- # NOTE: the Pipeline class will automatically load the audio and convert them to numpy arrays
- audio_paths = ["assets/nri-GreenTreeFrogEvergladesNP.mp3"] # wav, mp3, ogg, flac are supported.
-
- # Create a list of queries. You may also pass a single query as a string for multiple audios.
- # The same query will be used for all audios.
- queries = ["What is the common name for the focal species in the audio? Answer:"]
-
- pipeline = Pipeline(model=model)
- # NOTE: you can also just do pipeline = Pipeline() which will download the model automatically
-
- # Run the model over the audio in sliding windows of 10 seconds with a hop length of 10 seconds
- results = pipeline(audio_paths, queries, window_length_seconds=10.0, hop_length_seconds=10.0)
- print(results)
- # ['#0.00s - 10.00s#: Green Treefrog\n']
- ```
 
  Example prompts:
 
- Prompt: What is the common name for the focal species in the audio?
- Answer: Humpback Whale
+ Prompt: What is the common name for the focal species in the audio?
+ Answer: Humpback Whale
 
- Prompt: Which of these, if any, are present in the audio recording? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.
+ Prompt: Which of these, if any, are present in the audio recording? Single pulse gibbon call, Multiple pulse gibbon call, Gibbon duet, None.
  Answer: Gibbon duet
 
- Prompt: What is the common name for the focal species in the audio?
+ Prompt: What is the common name for the focal species in the audio?
  Answer: Spectacled Tetraka
 
- Prompt: What is the life stage of the focal species in the audio?
+ Prompt: What is the life stage of the focal species in the audio?
  Answer: Juvenile
 
- Prompt: What type of vocalization is heard from the focal species in the audio? Answer with either 'call' or 'song'.
+ Prompt: What type of vocalization is heard from the focal species in the audio?
+ Answer with either 'call' or 'song'.
 
  Prompt: Caption the audio, using the common name for any animal species.
 
@@ -103,7 +77,34 @@ Users should be aware of the risks, biases, and limitations of the model. It is
 
  ## How to Get Started with the Model
 
- Refer to the GitHub [repository](https://github.com/earthspecies/naturelm-audio) for examples of model usage.
+ Instantiating the model:
+
+ ```python
+ from NatureLM.models import NatureLM
+
+ # Download the model from HuggingFace
+ model = NatureLM.from_pretrained("EarthSpeciesProject/NatureLM-audio")
+ model = model.eval().to("cuda")
+ ```
+
+ Using the model:
+
+ ```python
+ from NatureLM.infer import Pipeline
+
+ audio_paths = ["assets/nri-GreenTreeFrogEvergladesNP.mp3"]
+ queries = ["What is the common name for the focal species in the audio? Answer:"]
+
+ pipeline = Pipeline(model=model)
+
+ # Run the model over the audio in sliding windows of 10 seconds with a hop length of 10 seconds
+ results = pipeline(audio_paths, queries, window_length_seconds=10.0, hop_length_seconds=10.0)
+
+ print(results)
+ # ['#0.00s - 10.00s#: Green Treefrog\n']
+ ```
+
+ Refer to the GitHub [repository](https://github.com/earthspecies/naturelm-audio) for more details.
 
  ## Training Details
 
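
Note on the `results` format shown in the README example: the sample output `['#0.00s - 10.00s#: Green Treefrog\n']` suggests that each result is a string with one `#start - end#: answer` line per window. The following is a minimal sketch for turning such a string into structured records. It assumes the same line format holds when an audio file spans several windows (the README only shows a single-window output), and `WindowPrediction` and `parse_windows` are hypothetical helper names, not part of the NatureLM package.

```python
import re
from typing import List, NamedTuple


class WindowPrediction(NamedTuple):
    """One answer for one time window (hypothetical helper type)."""
    start_s: float
    end_s: float
    text: str


# Matches lines shaped like "#0.00s - 10.00s#: Green Treefrog".
# The format is inferred from the single example output in the README.
_WINDOW_RE = re.compile(r"#(\d+(?:\.\d+)?)s - (\d+(?:\.\d+)?)s#:\s*(.*)")


def parse_windows(result: str) -> List[WindowPrediction]:
    """Split one result string into per-window predictions.

    Lines that do not match the assumed format are skipped.
    """
    predictions = []
    for line in result.splitlines():
        match = _WINDOW_RE.match(line.strip())
        if match:
            start, end, text = match.groups()
            predictions.append(
                WindowPrediction(float(start), float(end), text.strip())
            )
    return predictions


if __name__ == "__main__":
    # The sample string is copied from the README example output.
    print(parse_windows("#0.00s - 10.00s#: Green Treefrog\n"))
```

Because the parser only touches strings, it can be checked without downloading the model; if the actual multi-window output differs from the assumed format, the regular expression is the only part that needs adjusting.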