Yunlong Lin @ling_yunlong

XMU | AI agent | Embodied AI | Multimodal learning

lyl1015.github.io

Xiamen, China

Joined January 2023

Tweets: 41

Followers: 72

Following: 204

Likes: 45

Aadit Sheth @aaditsh

2 weeks ago

This guy literally dropped the best visual guide to LLMs you’ll ever see

90 2K 12K 762K 15K

Alex Prompter @alex_prompter

3 weeks ago

I tested ChatGPT-5 and Gemini 2.5 Pro with the same critical prompts. The results will shock you. ChatGPT-5 vs. Gemini 2.5 Pro (video demos are included)

74 226 2K 437K 2K

orange.ai @oran_ge

2 months ago

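Woke up this morning and was surprised to find that Qwen3 Coder had been released. Qwen3 Coder is a code model with agent capabilities. It achieves SOTA among open-source models on Agentic Coding, Agentic Browser-Use, and Agentic Tool-Use. Put simply, its coding and agent abilities are comparable to Claude Sonnet4. The model has only 480B total parameters, with 35B active parameters.…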

13 50 303 87K 316

🧠Video Thinking Test for Reasoning LLMs🧠 *Video Thinking Test* (📽️Video-TT📽️) is a holistic benchmark to assess the advanced reasoning and understanding correctness/robustness between LLMs and humans #ICCV2025 - Project: zhangyuanhan-ai.github.io/video-tt/ - Data: huggingface.co/datasets/lmms-…

0 22 119 8K 52

Wenhao Chai @wenhaocha1

2 months ago

Dataset Distillation as Data Compression: A Rate-Utility Perspective arxiv.org/abs/2507.17221 Read this paper tonight and it gave me a thought: Dataset Distillation ≈ Visual Tokenization? Dataset Distillation: replace the full dataset with a few synthetic samples. Visual Tokenizer: Replace…

2 6 45 3K 17

OpenAI @OpenAI

2 months ago

ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths.

849 2K 14K 3.7M 5K

Yunlong Lin @ling_yunlong

2 months ago

thanks for sharing!

Crynet @crynetio

2 months ago

0 0 2 163 0

0 0 0 66 0

AK @_akhaliq

2 months ago

MoVieS Motion-Aware 4D Dynamic View Synthesis in One Second

3 15 128 29K 78

Yunlong Lin @ling_yunlong

2 months ago

Amazing!

Panwang Pan @paulpanwang

2 months ago

1 2 27 22K 11

0 0 0 85 0

Donghao Zhou @ CUHK @donghao_zhou

4 months ago

Check out our new work! 👇🏻

elvis @omarsar0

4 months ago

2 26 101 18K 58

0 1 3 205 0

Panwang Pan @paulpanwang

2 months ago

🎉 Big thanks to @_akhaliq for featuring our work! We’re excited to release the 💻 code & 🤗 Hugging Face checkpoints for PartCrafter: 👉 github.com/wgsxm/PartCraf… --- ⭐️ PartCrafter generates multiple parts from a single RGB image in one unified pass. Stay tuned for updates! 🚀

AK @_akhaliq

3 months ago

11 190 2K 121K 1K

2 20 92 8K 40

Zhengzhong Tu @_vztu

2 months ago

🤨Ever dream of a tool that can magically restore and upscale any (low-res) photo to crystal-clear 4K? 🔥Introducing "4KAgent: Agentic Any Image to 4K Super-Resolution", the most capable upscaling generalist designed to handle broad image types. 🔗4kagent.github.io 1/🧵

5 43 206 39K 122

Panwang Pan @paulpanwang

2 months ago

Thanks @_akhaliq for sharing AnyCoder with us! It's an AI-powered code generator specifically focused on creating applications. If you're into faster, smarter UI development, definitely keep an eye on AnyCoder! huggingface.co/spaces/akhaliq…

3 3 15 8K 8

AK @_akhaliq

2 months ago

JarvisArt Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

5 30 124 29K 70

Yunlong Lin @ling_yunlong

2 months ago

Thanks for sharing and promoting! @_akhaliq

AK @_akhaliq

2 months ago

5 30 124 29K 70

0 2 14 13K 4

Owen Tian Ye @tiny85114767

2 months ago

🚀 Introducing Kontext-Style-LoRAs! Turn any image into Ghibli, Jojo, Chibi, Chinese Ink & more — with ONE Kontext model. 🔥 Built on FLUX.1 Kontext ✨ Powered by GPT-4o paired data 🎨 10+ stylish LoRA adapters 💻 100% open-source [Apache-2.0] No more boring generations — remix…

5 11 80 5K 70

Wenhao Chai @wenhaocha1

3 months ago

We introduce LiveCodeBench Pro. Models like o3-high, o4-mini, and Gemini 2.5 Pro score 0% on hard competitive programming problems.

5 27 189 24K 55

Ziwei Liu @liuziwei7

2 months ago

📽️Expert-Level Cinematic Understanding in VLM📽️ #ShotBench: benchmark covering 8 core cinematography dimensions #ShotQA: 70k training dataset #ShotVL: 3B and 7B model surpassing GPT-4o on cinematic understanding - Project: vchitect.github.io/ShotBench-proj… - Code: github.com/Vchitect/ShotB…

0 16 105 9K 46

Zhiwen(Aaron) Fan @zhiwen_fan_

3 months ago

We present VLM-3R: a Vision-Language Model capable of 3D spatial reasoning from monocular video, grounding visual cues, geometry, and camera motion. ✅ No depth sensor ✅ No pre-built 3D maps ✅ End-to-end spatial + temporal reasoning 🔗 Code & benchmark: vlm-3r.github.io…