Gyanateet Dutta
Ryukijano
AI & ML interests
Computer Graphics, General Artificial Intelligence, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.
Recent Activity
liked a model 11 days ago: vandijklab/C2S-Scale-Gemma-2-27B
upvoted a paper 12 days ago: Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training
liked a Space 28 days ago: zerogpu-aoti/wan2-2-fp8da-aoti-faster
Organizations
VILA
Diffusion models
Explore the capabilities of diffusion models for natural language processing. This collection features a diverse set of models trained using diffusion
- PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
  Paper • 2309.05793 • Published • 50
- 3D Gaussian Splatting for Real-Time Radiance Field Rendering
  Paper • 2308.04079 • Published • 191
- stabilityai/stable-diffusion-xl-base-1.0
  Text-to-Image • Updated • 2.48M • 7.07k
- Ryukijano/lora-trained-xl-kaggle-p100
  Text-to-Image • Updated • 14 • 1
Deep Reinforcement Learning
Features implementations and paces of popular RL algorithms and new paradigms on a variety of environments.
- Ryukijano/rl_course_vizdoom_health_gathering_supreme
  Reinforcement Learning • Updated
- Ryukijano/Mujoco_rl_halfcheetah_Decision_Trasformer
  Reinforcement Learning • Updated • 1
- Ryukijano/poca-SoccerTwos
  Reinforcement Learning • Updated • 20
- AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning
  Paper • 2308.03526 • Published • 28
Deep learning
- NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation
  Paper • 2311.12229 • Published • 27
- IP-Adapter-FaceID
  Running on Zero • 980
  Generate images with your face
- Design2Code: How Far Are We From Automating Front-End Engineering?
  Paper • 2403.03163 • Published • 97
Computer vision
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 20
- Denoising Vision Transformers
  Paper • 2401.02957 • Published • 31
- timm/ViT-B-16-SigLIP
  Zero-Shot Image Classification • Updated • 16k • 33
- Slimsam
  Runtime error • 19
  Small yet powerful mask generation application ⚡️
Multi modal foundational models
Vision_language_models
2D->3D
Segmentation
Vision_transformer_robotics
Midi-composer
Neural Rendering
This collection focuses on using neural networks for photorealistic rendering and image synthesis. It features models capable of text-to-image gen.
- NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection
  Paper • 2307.14620 • Published • 14
- LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs
  Paper • 2306.05410 • Published • 4
- ashawkey/nerf2mesh
  Updated • 14
- NeRF
  Build error • 25
Own Work
LLMs
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260
- 3D-LFM: Lifting Foundation Model
  Paper • 2312.11894 • Published • 15
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 60
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
Audio
- EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
  Paper • 2402.00892 • Published • 14
- MusicGen Streaming
  Running on Zero • 275
  Generate music from text prompts
- Whisper JAX
  Runtime error • 145
  Transcribe or translate audio from microphone, file, or YouTube
- Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
  Paper • 2406.03344 • Published • 21
Text_to_video diffusion
Text-3D
- Stable Fast 3D
  Running on L4 • 1.11k
  Generate a 3D mesh model from an image
- Roblox 3D Assets Generator v1
  Runtime error • 183
  Create a 3D model from an image in 10 seconds!
- LLaMA Mesh
  Running on Zero • 145
  Create 3D mesh by chatting.
- stabilityai/stable-point-aware-3d
  Image-to-3D • 2B • Updated • 545 • 315
Audio->3D
STEM