astroPT
Commit 1f6f265 by Mike Smith · 1 Parent(s): 9ad161b

Added models

.gitattributes CHANGED
@@ -34,3 +34,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  assets/shoggoth_telescope_sticker_2.png filter=lfs diff=lfs merge=lfs -text
+ assets/scaling_law.png filter=lfs diff=lfs merge=lfs -text
+ astropt/* filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,25 +1,32 @@
- ---
- license: cc-by-sa-4.0
- datasets:
- - Smith42/galaxies
- tags:
- - astronomy
- ---
- <center>
- <img src="assets/shoggoth_telescope_sticker_2.png" alt="astroPT_shoggoth" width="300px"/>
- </center>
-
- # astroPT: a Large Observation Model for Astronomy
-
- Here we have the model files for the astroPT project, the code to run inference
- with these models is found here:
- [https://github.com/smith42/astropt](https://github.com/smith42/astropt)
-
- You will find the fully trained models (pretrained on 8.6 million galaxies) in
- folders labelled with the model parameter count in the `astropt` directory.
-
- The paper is here: https://arxiv.org/abs/2405.14930
-
- ## Updates and community
-
- AstroPT is an open-to-all UniverseTBD project. Please join the [UniverseTBD](https://universetbd.org) Discord for updates: [https://discord.gg/MNEVegvfJq](https://discord.gg/MNEVegvfJq)
+ ---
+ license: cc-by-sa-4.0
+ datasets:
+ - Smith42/galaxies
+ tags:
+ - astronomy
+ ---
+ <center>
+ <img src="assets/shoggoth_telescope_sticker_2.png" alt="astroPT_shoggoth" width="300px"/>
+ </center>
+
+ # astroPT: a Large Observation Model for Astronomy
+
+ Here we have the model files for the astroPT project, the code to run inference
+ with these models is found here:
+ [https://github.com/smith42/astropt](https://github.com/smith42/astropt)
+
+ You will find the fully trained models (pretrained on 8.6 million galaxies) in
+ folders labelled with the model parameter count in the `astropt` directory.
+
+ Unlike the older models, these models are trained on the "clipped" galaxies in
+ [this dataset](https://huggingface.co/datasets/Smith42/galaxies).
+
+ We get some promising scaling on this new dataset, see below:
+
+ <center>
+ <img src="assets/scaling_law.png" alt="scaling_law" width="300px"/>
+ </center>
+
+ ## Updates and community
+
+ AstroPT is an open-to-all UniverseTBD project. Please join the [UniverseTBD](https://universetbd.org) Discord for updates: [https://discord.gg/MNEVegvfJq](https://discord.gg/MNEVegvfJq)
assets/scaling_law.png ADDED

Git LFS Details

  • SHA256: 92862c28c529d290ee6fbdf54d966f0fbeb55826480e1279ab90a31cadf1f640
  • Pointer size: 131 Bytes
  • Size of remote file: 463 kB
astropt/015M/ckpt.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f43d67206eb36cdacb3c23fee4775dc786f9c816bc7f029c34a33f8f7e5e417
+ size 174696805
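The three lines above are a Git LFS pointer, not the checkpoint itself: the real `ckpt.pt` lives in LFS storage and is identified by its SHA-256 digest and byte size. As a minimal sketch (the helper names `parse_lfs_pointer` and `sha256_of_file` are made up for illustration and are not part of the astroPT code), a pointer can be parsed and a downloaded file checked against it like this:

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a (possibly multi-GB) file in chunks; compare with info['digest']."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Pointer text copied from the diff above (astropt/015M/ckpt.pt).
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8f43d67206eb36cdacb3c23fee4775dc786f9c816bc7f029c34a33f8f7e5e417
size 174696805
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size_bytes"])
```

After downloading the checkpoint, `sha256_of_file("ckpt.pt") == info["digest"]` should hold if the transfer was complete; the local path is an assumption.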
astropt/015M/hparams.txt ADDED
@@ -0,0 +1,41 @@
+ AstroPT-0014.6M
+ time: 1748713289
+ log_via_wandb: False
+ log_emissions: False
+ out_dir: logs/astropt010M
+ eval_interval: 1000
+ log_interval: 100
+ checkpoint_interval: 5000
+ eval_iters: 200
+ eval_only: False
+ always_save_checkpoint: False
+ init_from: resume
+ use_hf: True
+ stream_hf_dataset: True
+ gradient_accumulation_steps: 40
+ batch_size: 16
+ spiral: True
+ block_size: 1024
+ image_size: 256
+ num_workers: 32
+ n_layer: 6
+ n_head: 8
+ n_embd: 384
+ n_chan: 3
+ dropout: 0.0
+ bias: False
+ learning_rate: 0.001
+ max_iters: 30000
+ weight_decay: 0.1
+ beta1: 0.9
+ beta2: 0.95
+ grad_clip: 1.0
+ decay_lr: True
+ warmup_iters: 2000
+ lr_decay_iters: 29700.000000000004
+ min_lr: 0.0001
+ backend: nccl
+ device: cuda
+ dtype: bfloat16
+ attn_type: causal
+ compile: True
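`hparams.txt` is a plain `key: value` dump with the model name on its first line, so it is easy to read back programmatically. The sketch below is one possible way to do that; `parse_hparams` is a hypothetical helper, not a function from the astroPT repository, and the excerpt contains only a few of the 41 lines. It also derives the number of sequences per optimizer step, assuming `batch_size` and `gradient_accumulation_steps` apply per process (how they combine across GPUs depends on the training script, which this commit does not show).

```python
import ast


def parse_hparams(text: str):
    """Return (model_name, dict of hyperparameters) from an hparams.txt dump."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    name, params = lines[0], {}
    for line in lines[1:]:
        key, _, raw = line.partition(": ")
        try:
            # Handles ints, floats and True/False.
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Bare words such as "resume", "cuda" or a path stay strings.
            params[key] = raw
    return name, params


# Excerpt of astropt/015M/hparams.txt from the diff above.
excerpt = """\
AstroPT-0014.6M
out_dir: logs/astropt010M
init_from: resume
gradient_accumulation_steps: 40
batch_size: 16
block_size: 1024
n_layer: 6
n_embd: 384
lr_decay_iters: 29700.000000000004
compile: True
"""
name, hp = parse_hparams(excerpt)

# Sequences and sequence positions consumed per optimizer step (per process).
seqs_per_step = hp["gradient_accumulation_steps"] * hp["batch_size"]
positions_per_step = seqs_per_step * hp["block_size"]
print(name, seqs_per_step, positions_per_step)
```

`ast.literal_eval` is used instead of `eval` so that arbitrary text in the file is never executed.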
astropt/015M/loss.png ADDED
astropt/015M/loss.txt ADDED
@@ -0,0 +1,35 @@
+ iter_num,dummy_train,dummy_valid,lr,mfu
+ 0,0.4237290322780609,0.42409396171569824,0.0,-100.0
+ 0,0.4237290322780609,0.42409396171569824,0.0,-100.0
+ 1000,0.27635782957077026,0.2721666693687439,0.0005,42.03169684224443
+ 2000,0.25556641817092896,0.2559831738471985,0.001,53.8111819377485
+ 3000,0.2424052655696869,0.24332532286643982,0.0009971089396989417,42.273730156086295
+ 4000,0.23524266481399536,0.23419052362442017,0.0009884729064831644,46.234833811477216
+ 5000,0.23254640400409698,0.228200763463974,0.0009742028660983954,47.34795827719127
+ 6000,0.23014438152313232,0.22854839265346527,0.0009544821765324169,51.925385788583945
+ 7000,0.22834205627441406,0.22846633195877075,0.0009295642320195118,54.56592659089735
+ 8000,0.2249969244003296,0.22747699916362762,0.0008997692071381923,50.38375591285126
+ 9000,0.22227264940738678,0.2229512333869934,0.0008654799428378067,45.48437322076528
+ 10000,0.22202391922473907,0.2227519005537033,0.0008271370272551168,51.52436284793486
+ 11000,0.22325924038887024,0.22143419086933136,0.0007852331345282023,49.45741727055077
+ 12000,0.22197683155536652,0.22233669459819794,0.0007403066943491634,47.00793013755864
+ 13000,0.22199097275733948,0.22071799635887146,0.0006929349735965315,41.30985543340691
+ 14000,0.2206525057554245,0.22099420428276062,0.000643726658942576,50.253477640215586
+ 15000,0.2214205116033554,0.2191619873046875,0.0005933140357427553,45.30358553407221
+ 16000,0.22087925672531128,0.22044230997562408,0.0005423448637019813,44.33540486861174
+ 17000,0.2168184071779251,0.21850600838661194,0.0004914740537085424,38.140507482419274
+ 18000,0.2182859480381012,0.21953009068965912,0.0004413552527813483,42.46612986314087
+ 19000,0.21671955287456512,0.21851180493831635,0.0003926324452568313,39.06802761357582
+ 20000,0.21622294187545776,0.21841511130332947,0.00034593167813317023,43.40764235691585
+ 21000,0.21420088410377502,0.21559077501296997,0.00030185301689418824,43.2507601435863
+ 22000,0.21658644080162048,0.21614958345890045,0.00026096283517380403,40.05925280018733
+ 23000,0.21632450819015503,0.21784153580665588,0.00022378653733235094,42.713108580927525
+ 24000,0.21702900528907776,0.21488821506500244,0.00019080180745351615,35.38165183244133
+ 25000,0.21359431743621826,0.21773101389408112,0.00016243247150660568,41.33240036788472
+ 24000,0.2173842489719391,0.2158595621585846,0.00019080180745351615,-100.0
+ 25000,0.2192632257938385,0.2145005464553833,0.00016243247150660568,38.45229436333473
+ 26000,0.21564842760562897,0.21615932881832123,0.00013904305154016877,41.248306829630174
+ 27000,0.21481478214263916,0.215606227517128,0.00012093408188100388,40.50860466307151
+ 28000,0.2149866670370102,0.2138066440820694,0.00010833824752144302,44.98740255177508
+ 29000,0.21589569747447968,0.21135900914669037,0.00010141739431337799,48.94041488093347
+ 30000,0.21525105834007263,0.21357005834579468,0.0001,54.906110596442815
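The log above contains repeated `iter_num` values (two rows for 0, and a second pass over 24000 and 25000) and an `mfu` of `-100.0` on some rows. A plausible reading, stated here as an assumption rather than something the commit confirms, is that the repeats come from restarting or resuming the run and that `-100.0` is a placeholder logged before a throughput estimate is available. Under that assumption, a minimal sketch for loading the file keeps the last row seen for each iteration and maps negative `mfu` to `None`. The function name `load_loss_log` is hypothetical.

```python
import csv
import io


def load_loss_log(text: str) -> list:
    """Parse loss.txt, keeping only the most recent row for each iter_num."""
    latest = {}
    for row in csv.DictReader(io.StringIO(text)):
        mfu = float(row["mfu"])
        latest[int(row["iter_num"])] = {
            "iter_num": int(row["iter_num"]),
            "train": float(row["dummy_train"]),
            "valid": float(row["dummy_valid"]),
            "lr": float(row["lr"]),
            # Assumption: negative MFU is a "not measured" sentinel.
            "mfu": None if mfu < 0 else mfu,
        }
    return [latest[k] for k in sorted(latest)]


# A few rows copied from astropt/015M/loss.txt above.
sample = """\
iter_num,dummy_train,dummy_valid,lr,mfu
0,0.4237290322780609,0.42409396171569824,0.0,-100.0
0,0.4237290322780609,0.42409396171569824,0.0,-100.0
24000,0.21702900528907776,0.21488821506500244,0.00019080180745351615,35.38165183244133
25000,0.21359431743621826,0.21773101389408112,0.00016243247150660568,41.33240036788472
24000,0.2173842489719391,0.2158595621585846,0.00019080180745351615,-100.0
25000,0.2192632257938385,0.2145005464553833,0.00016243247150660568,38.45229436333473
30000,0.21525105834007263,0.21357005834579468,0.0001,54.906110596442815
"""
rows = load_loss_log(sample)
for r in rows:
    print(r["iter_num"], r["valid"], r["mfu"])
```

Keeping the last row per iteration means the curve reflects the run that actually produced the final checkpoint; keeping the first row instead would be an equally defensible choice for other analyses.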
astropt/095M/ckpt.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76e87234bdaced88dc626fa19239e35640013f99f7780650e4e600f67400c304
+ size 1142218389
astropt/095M/hparams.txt ADDED
@@ -0,0 +1,41 @@
+ AstroPT-0095.2M
+ time: 1748598669
+ log_via_wandb: False
+ log_emissions: False
+ out_dir: logs/astropt0100M
+ eval_interval: 1000
+ log_interval: 100
+ checkpoint_interval: 5000
+ eval_iters: 100
+ eval_only: False
+ always_save_checkpoint: False
+ init_from: scratch
+ use_hf: True
+ stream_hf_dataset: True
+ gradient_accumulation_steps: 40
+ batch_size: 16
+ spiral: True
+ block_size: 1024
+ image_size: 256
+ num_workers: 32
+ n_layer: 12
+ n_head: 12
+ n_embd: 768
+ n_chan: 3
+ dropout: 0.0
+ bias: False
+ learning_rate: 0.0006
+ max_iters: 30000
+ weight_decay: 0.1
+ beta1: 0.9
+ beta2: 0.95
+ grad_clip: 1.0
+ decay_lr: True
+ warmup_iters: 2000
+ lr_decay_iters: 29700.000000000004
+ min_lr: 5.9999999999999995e-05
+ backend: nccl
+ device: cuda
+ dtype: bfloat16
+ attn_type: causal
+ compile: True
astropt/095M/loss.png ADDED
astropt/095M/loss.txt ADDED
@@ -0,0 +1,39 @@
+ iter_num,dummy_train,dummy_valid,lr,mfu
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 0,0.42446669936180115,0.42516273260116577,0.0,-100.0
+ 1000,0.26434680819511414,0.2652167081832886,0.0003,130.42374748822635
+ 2000,0.24884033203125,0.24648204445838928,0.0005999999999999998,134.77852280097997
+ 3000,0.2311551719903946,0.23104743659496307,0.000598265363819365,118.54849647222775
+ 4000,0.22553430497646332,0.22761836647987366,0.0005930837438898986,132.32466628774816
+ 5000,0.22491337358951569,0.2263311892747879,0.0005845217196590372,113.3353341374296
+ 6000,0.2255629003047943,0.2228691428899765,0.0005726893059194501,135.8728686329724
+ 7000,0.2197704166173935,0.22422553598880768,0.0005577385392117071,110.84594030790453
+ 8000,0.22002235054969788,0.2201835960149765,0.0005398615242829152,115.25644654165464
+ 9000,0.21719926595687866,0.21702103316783905,0.0005192879657026838,98.85987858747258
+ 10000,0.21689319610595703,0.2164955884218216,0.00049628221635307,108.85766962629262
+ 11000,0.21829931437969208,0.21505264937877655,0.0004711398807169213,100.01043102661913
+ 12000,0.21640326082706451,0.21680274605751038,0.000444184016609498,124.61954545378147
+ 13000,0.21806730329990387,0.214841827750206,0.0004157609841579188,98.01014636532719
+ 14000,0.21179082989692688,0.21660491824150085,0.00038623599536554554,126.06610900106654
+ 15000,0.2128843367099762,0.21720975637435913,0.00035598842144565307,97.0177229415336
+ 16000,0.21521154046058655,0.21652516722679138,0.0003254069182211887,108.6942476335411
+ 17000,0.21329370141029358,0.21318931877613068,0.0002948844322251254,104.38925274302863
+ 18000,0.2106178253889084,0.2075473815202713,0.00026481315166880894,117.02891668058032
+ 19000,0.21140195429325104,0.21152400970458984,0.00023557946715409877,88.16451499677783
+ 20000,0.20974008738994598,0.21058639883995056,0.00020755900687990212,108.53359641025698
+ 21000,0.2121768444776535,0.20934320986270905,0.0001811118101365129,99.20960805406503
+ 22000,0.20585870742797852,0.20819951593875885,0.00015657770110428238,118.56344412893496
+ 23000,0.20535139739513397,0.20700927078723907,0.00013427192239941053,85.71626382757705
+ 24000,0.21157552301883698,0.20992308855056763,0.00011448108447210966,110.13061388772856
+ 25000,0.20912402868270874,0.20913390815258026,9.74594829039634e-05,89.34139182727239
+ 26000,0.2059999257326126,0.20747224986553192,8.342583092410125e-05,92.38481695782312
+ 27000,0.20863810181617737,0.20679494738578796,7.256044912860232e-05,83.93477246233097
+ 28000,0.20510248839855194,0.20686577260494232,6.50029485128658e-05,102.17075726214443
+ 29000,0.21028758585453033,0.20720404386520386,6.085043658802679e-05,124.95812865937248
+ 30000,0.2084340900182724,0.20667771995067596,5.9999999999999995e-05,142.05454962677416
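The `lr` column above is consistent with a linear warmup followed by cosine decay down to `min_lr`, using the `learning_rate`, `warmup_iters`, `lr_decay_iters` and `min_lr` values in the 095M `hparams.txt`. The training script itself is not part of this commit, so the function below is a reconstruction that happens to reproduce the logged numbers, not a copy of the project's code; `get_lr` is an assumed name.

```python
import math


def get_lr(it, learning_rate, min_lr, warmup_iters, lr_decay_iters):
    """Linear warmup, then cosine decay to min_lr, then constant min_lr."""
    if it < warmup_iters:
        return learning_rate * it / warmup_iters
    if it > lr_decay_iters:
        return min_lr
    decay_ratio = (it - warmup_iters) / (lr_decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))  # goes from 1 to 0
    return min_lr + coeff * (learning_rate - min_lr)


# Values taken from astropt/095M/hparams.txt.
cfg = dict(learning_rate=0.0006, min_lr=6e-5,
           warmup_iters=2000, lr_decay_iters=29700)

for it in (1000, 3000, 25000, 30000):
    print(it, get_lr(it, **cfg))
```

Comparing the printed values with the rows for the same iterations in `loss.txt` is a quick sanity check that a log belongs to the configuration stored next to it.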
astropt/850M/ckpt.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fb4dd1ff48ed56b713bd488714263a5e228e5d1ec1503309b98cc27fcc9b64b0
+ size 10243431725
astropt/850M/hparams.txt ADDED
@@ -0,0 +1,41 @@
+ AstroPT-0853.6M
+ time: 1748682624
+ log_via_wandb: False
+ log_emissions: False
+ out_dir: logs/astropt700M
+ eval_interval: 1000
+ log_interval: 100
+ checkpoint_interval: 5000
+ eval_iters: 200
+ eval_only: False
+ always_save_checkpoint: False
+ init_from: resume
+ use_hf: True
+ stream_hf_dataset: True
+ gradient_accumulation_steps: 40
+ batch_size: 16
+ spiral: True
+ block_size: 1024
+ image_size: 256
+ num_workers: 32
+ n_layer: 16
+ n_head: 8
+ n_embd: 2048
+ n_chan: 3
+ dropout: 0.0
+ bias: False
+ learning_rate: 0.0003
+ max_iters: 30000
+ weight_decay: 0.1
+ beta1: 0.9
+ beta2: 0.95
+ grad_clip: 1.0
+ decay_lr: True
+ warmup_iters: 2000
+ lr_decay_iters: 29700.000000000004
+ min_lr: 2.9999999999999997e-05
+ backend: nccl
+ device: cuda
+ dtype: bfloat16
+ attn_type: causal
+ compile: True
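The three `hparams.txt` files differ mainly in `n_layer`, `n_embd` and learning rate. A common rule of thumb for GPT-style decoders with a 4x-wide MLP is that the transformer blocks hold roughly `12 * n_layer * n_embd**2` parameters. The sketch below applies that rule to the three configurations; it is an approximation that ignores embeddings, output heads and normalization layers, and astroPT's exact architecture is not shown in this commit, so the totals in the file headers (14.6M, 95.2M, 853.6M) are expected to be somewhat larger than these estimates.

```python
def approx_block_params(n_layer: int, n_embd: int) -> int:
    """Rule-of-thumb parameter count for the transformer blocks only.

    Per layer: 4*d^2 for attention (q, k, v, output projection) plus
    8*d^2 for an MLP with hidden width 4*d, i.e. 12*d^2 in total.
    """
    return 12 * n_layer * n_embd ** 2


# (n_layer, n_embd) copied from the three hparams.txt files in this commit.
configs = {
    "015M": (6, 384),
    "095M": (12, 768),
    "850M": (16, 2048),
}
estimates = {name: approx_block_params(*cfg) for name, cfg in configs.items()}
for name, n in estimates.items():
    print(f"{name}: ~{n / 1e6:.1f}M parameters in transformer blocks")
```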
astropt/850M/loss.png ADDED
astropt/850M/loss.txt ADDED
@@ -0,0 +1,34 @@
+ iter_num,dummy_train,dummy_valid,lr,mfu
+ 0,0.43473708629608154,0.4350984990596771,0.0,-100.0
+ 1000,0.2604823708534241,0.25862133502960205,0.00015,227.7207437004567
+ 2000,0.2350350022315979,0.23761168122291565,0.0002999999999999999,184.89451864787884
+ 3000,0.22695167362689972,0.22630847990512848,0.0002991326819096825,142.68290431303575
+ 4000,0.22136694192886353,0.22075489163398743,0.0002965418719449493,164.69209073349614
+ 5000,0.2173878699541092,0.21584530174732208,0.0002922608598295186,194.3496002137911
+ 6000,0.21593311429023743,0.2161586880683899,0.00028634465295972503,189.65261805551407
+ 7000,0.21213401854038239,0.21626879274845123,0.00027886926960585354,161.81362475846606
+ 8000,0.21388362348079681,0.2156994789838791,0.0002699307621414576,194.83030549965258
+ 9000,0.2113541066646576,0.21075700223445892,0.0002596439828513419,206.62870435065764
+ 10000,0.2107718139886856,0.2101566344499588,0.000248141108176535,210.6128666058181
+ 11000,0.20981331169605255,0.20874501764774323,0.00023556994035846065,211.94408429561662
+ 12000,0.211272194981575,0.2096787840127945,0.000222092008304749,212.3979552702134
+ 13000,0.2077404260635376,0.20762614905834198,0.0002078804920789594,212.70865436099854
+ 14000,0.20752517879009247,0.20760786533355713,0.00019311799768277277,212.7474000568995
+ 15000,0.2082953006029129,0.2054262012243271,0.00017799421072282653,182.02349950180033
+ 16000,0.2058737725019455,0.20660598576068878,0.00016270345911059434,156.65936123606375
+ 17000,0.20293836295604706,0.20433051884174347,0.0001474422161125627,167.64408362875457
+ 18000,0.20452049374580383,0.20549240708351135,0.00013240657583440447,196.76315959510902
+ 19000,0.2050369679927826,0.2039024531841278,0.00011778973357704939,195.6227500412617
+ 20000,0.20305420458316803,0.20371705293655396,0.00010377950343995106,161.65082800564085
+ 21000,0.20195505023002625,0.20066562294960022,9.055590506825645e-05,169.07418074614085
+ 22000,0.1976902037858963,0.20103618502616882,7.828885055214119e-05,197.42834390898963
+ 23000,0.1975751519203186,0.20273856818675995,6.713596119970527e-05,207.8195611069779
+ 24000,0.19996100664138794,0.19962093234062195,5.724054223605483e-05,211.5022203199266
+ 25000,0.1991165578365326,0.2025628685951233,4.87297414519817e-05,212.6440862984037
+ 24000,0.20241282880306244,0.20077800750732422,5.724054223605483e-05,-100.0
+ 25000,0.20112377405166626,0.19944730401039124,4.87297414519817e-05,158.77700554746747
+ 26000,0.19800467789173126,0.20101504027843475,4.1712915462050624e-05,159.46568448810257
+ 27000,0.20109038054943085,0.20026886463165283,3.628022456430116e-05,152.1405778610438
+ 28000,0.1989879012107849,0.1982770413160324,3.25014742564329e-05,155.69581851680192
+ 29000,0.1973315179347992,0.19575488567352295,3.0425218294013393e-05,171.29352693692962
+ 30000,0.1978050172328949,0.19824472069740295,2.9999999999999997e-05,175.91480981143775
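With all three logs in hand, the final `dummy_valid` values at iteration 30000 can be set against the parameter counts in the `hparams.txt` headers to get a rough feel for how loss changes with model size. The sketch below fits a straight line in log-log space using plain least squares. It is an illustration only: three points cannot pin down a scaling law, the choice of the last logged row as "final loss" is an assumption, and this is not necessarily how `assets/scaling_law.png` was produced.

```python
import math


def fit_power_law(points):
    """Least-squares fit of log10(loss) = intercept + slope * log10(params)."""
    xs = [math.log10(n) for n, _ in points]
    ys = [math.log10(loss) for _, loss in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# (parameter count from the hparams.txt header, dummy_valid at iter 30000)
points = [
    (14.6e6, 0.21357005834579468),   # astropt/015M
    (95.2e6, 0.20667771995067596),   # astropt/095M
    (853.6e6, 0.19824472069740295),  # astropt/850M
]
slope, intercept = fit_power_law(points)
print(f"fitted exponent: {slope:.4f}")
```

A negative exponent means validation loss falls as parameter count grows, which matches the direction of the README's remark about scaling; the magnitude should not be over-interpreted from so few runs.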