ICCV2023

community

AI & ML interests

None defined yet.

Recent Activity

tedlasai authored a paper 3 days ago

Multispectral Demosaicing via Dual Cameras

tedlasai authored a paper 3 days ago

Learning to Refocus with Video Diffusion Models

tedlasai authored a paper 3 days ago

Generating the Past, Present and Future from a Motion-Blurred Image

AdinaY

posted an update 10 days ago

Post

500

Following up on LLaDA 2.0, the paper is now out on Daily Papers🔥
It has sparked a lot of discussion in the community for showing how discrete diffusion LLMs can scale to 100B and run faster than traditional AR models.
LLaDA2.0: Scaling Up Diffusion Language Models to 100B (2512.15745)

Nymbo

posted an update 10 days ago

Post

1760

🚨 New tool for the Nymbo/Tools MCP server: The new Agent_Skills tool provides full support for Agent Skills (Claude Skills but open-source).

How it works: The tool exposes the standard discover/info/resources/validate actions. Skills live in /Skills under the same File_System root, and any bundled scripts run through Shell_Command, no new infrastructure required.

Agent_Skills(action="discover")  # List all available skills
Agent_Skills(action="info", skill_name="music-downloader")  # Full SKILL.md
Agent_Skills(action="resources", skill_name="music-downloader")  # Scripts, refs, assets
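As a rough illustration of what a "discover" action could do, here is a minimal sketch. It is not the Nymbo/Tools implementation: the helper names `parse_frontmatter` and `discover_skills` are hypothetical, and it assumes each skill is a folder under the Skills root containing a SKILL.md whose leading `---` block holds simple `name:` and `description:` lines.

```python
from pathlib import Path


def parse_frontmatter(text):
    """Parse simple 'key: value' lines between leading '---' markers.

    Assumption: flat key/value frontmatter only, no nested YAML.
    """
    lines = text.splitlines()
    meta = {}
    if not lines or lines[0].strip() != "---":
        return meta  # no frontmatter block at the top of the file
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root):
    """Return metadata for every direct subfolder of root that has a SKILL.md."""
    skills = []
    for skill_file in sorted(Path(root).glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
        skills.append({
            # Fall back to the folder name when the frontmatter has no name.
            "name": meta.get("name", skill_file.parent.name),
            "description": meta.get("description", ""),
            "path": str(skill_file.parent),
        })
    return skills
```

Calling `discover_skills("Skills")` would return one dictionary per skill folder. Folders without a SKILL.md are skipped rather than reported as errors, which is a design choice of this sketch.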

I've included a music-downloader skill as a working demo; it wraps yt-dlp for YouTube/SoundCloud audio extraction.

Caveat: On HF Spaces, Shell_Command works for most tasks, but some operations (like YouTube downloads) are restricted due to the container environment. For full functionality, run the server locally on your machine.

Try it out ~ https://www.nymbo.net/nymbot

YeolJoo

submitted a paper to Daily Papers 12 days ago

Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure

Paper • 2512.14336 • Published 13 days ago • 28

AdinaY

posted an update 13 days ago

Post

4506

Finch 💰: an enterprise-grade benchmark that measures whether AI agents can truly handle real-world finance & accounting work.

FinWorkBench/Finch

✨ Built from real enterprise data (Enron + financial institutions), not synthetic tasks
✨ Tests end-to-end finance workflows
✨ Multimodal & cross-file reasoning
✨ Expert annotated (700+ hours) and genuinely challenging

AdinaY

authored a paper 13 days ago

Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows

Paper • 2512.13168 • Published 14 days ago • 49

susunghong

submitted a paper to Daily Papers 13 days ago

DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders

Paper • 2512.13690 • Published 13 days ago • 2

anyirao

authored a paper 18 days ago

Composing Concepts from Images and Videos via Concept-prompt Binding

Paper • 2512.09824 • Published 19 days ago • 27

katanaxu

authored 10 papers 19 days ago

Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models

Paper • 2404.12104 • Published Apr 18, 2024 • 1

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Paper • 2412.00556 • Published Nov 30, 2024

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 31

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Paper • 2412.01027 • Published Dec 2, 2024

DirectorLLM for Human-Centric Video Generation

Paper • 2412.14484 • Published Dec 19, 2024

Transfer between Modalities with MetaQueries

Paper • 2504.06256 • Published Apr 8 • 2

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Paper • 2504.15585 • Published Apr 22 • 12

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Paper • 2504.17789 • Published Apr 24 • 23

Improving Chain-of-Thought Efficiency for Autoregressive Image Generation

Paper • 2510.05593 • Published Oct 7

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Paper • 2512.07843 • Published Nov 24 • 19

DavidVivancos

posted an update 23 days ago

Post

260

Need a new challenging dataset now that #NeurIPS2025 is almost over?

DavidVivancos/NeuraxonLife2-1M

1 Million #Neuraxon Artificial Lives, from almost 10000 Research Game runs, with more than 21 Million Neurons and almost 4 years of Simulated Life.

Read the preprint here https://www.researchgate.net/publication/397331336_Neuraxon

And here you have all the code: https://github.com/DavidVivancos/Neuraxon

kaanakan

authored a paper 27 days ago

Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

Paper • 2511.20649 • Published Nov 25 • 45

Nymbo

posted an update about 1 month ago

Post

5032

🚀 I've just shipped a major update to the Nymbo/Tools MCP server: the Agent_Terminal, a single "master tool" that cuts token usage by over 90%!

Anthropic found 98.7% context savings using code execution with MCP, and Cloudflare published similar findings. This is my open-source implementation of the same idea.

# The Problem

Traditional MCP exposes every tool definition directly to the model. With 12 tools, that's thousands of tokens consumed *before the conversation even starts*. Each tool call also passes intermediate results through the context window — a 10,000-row spreadsheet? That's all going into context just to sum a column.

# The Solution: One Tool to Rule Them All

Agent_Terminal wraps all 12 tools (Web_Search, Web_Fetch, File_System, Generate_Image, Generate_Speech, Generate_Video, Deep_Research, Memory_Manager, Obsidian_Vault, Shell_Command, Code_Interpreter) into a single Python code execution gateway.

Instead of the model making individual tool calls, it writes Python code that orchestrates the tools directly:

# Search for Bitcoin price
result = Web_Search("current price of bitcoin", max_results=3)
print(result)
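To make the gateway idea concrete, here is a minimal sketch of a code-execution entry point. It is an illustration only, not the Agent_Terminal source: `run_agent_code` and `load_rows` are hypothetical names, and plain `exec` is used for brevity even though it offers no sandboxing. The point it shows is that only printed output returns to the model, so the 10,000 rows never enter the context window.

```python
import contextlib
import io


def run_agent_code(code, tools):
    """Run model-written code with the tools in scope; return only what it prints.

    Assumption: a real gateway would sandbox this. exec() here is NOT safe
    for untrusted code.
    """
    namespace = dict(tools)  # copy so the caller's registry is not mutated
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    return buffer.getvalue()


def load_rows():
    """Hypothetical stand-in for a tool that returns a large spreadsheet."""
    return [{"amount": i} for i in range(10_000)]


# The intermediate rows stay inside the executed code; only the sum is printed.
agent_code = "rows = load_rows()\nprint(sum(r['amount'] for r in rows))"
result_text = run_agent_code(agent_code, {"load_rows": load_rows})
```

Here `result_text` holds a single short line with the column total, which is all the model would see.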

Don't know what tools are available? The agent can discover them at runtime:

print(search_tools('image'))  # Find tools by keyword
print(usage('Generate_Image'))  # Get full docs for a specific tool
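One way such runtime discovery could work is sketched below. The `search_tools` and `usage` names come from the post, but the bodies are assumptions: a plain dictionary registry, keyword matching on names and docstrings, and documentation built with the standard `inspect` module. The two tool functions are hypothetical stubs.

```python
import inspect


def Generate_Image(prompt: str, width: int = 512) -> str:
    """Hypothetical stub: create an image from a text prompt."""
    return f"image for {prompt!r} at width {width}"


def Web_Search(query: str, max_results: int = 3) -> str:
    """Hypothetical stub: search the web for a query."""
    return f"{max_results} results for {query!r}"


# Assumed registry layout: tool name -> callable.
TOOLS = {fn.__name__: fn for fn in (Generate_Image, Web_Search)}


def search_tools(keyword):
    """Return names of tools whose name or docstring contains the keyword."""
    keyword = keyword.lower()
    return [
        name
        for name, fn in TOOLS.items()
        if keyword in name.lower() or keyword in (fn.__doc__ or "").lower()
    ]


def usage(name):
    """Return the tool's signature followed by its docstring."""
    fn = TOOLS[name]
    return f"{name}{inspect.signature(fn)}\n{inspect.getdoc(fn)}"
```

With this layout the model pays for a tool's documentation only when it asks for it, instead of receiving every definition up front.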

The individual direct tool calls are all still there, but they can be disabled if using the Agent_Terminal. Try it now - https://www.nymbo.net/nymbot

Team members 212

ICCV2023's activity