cs.AI updates on the arXiv.org e-print archive.
From Load Tests to Live Streams: Graph Embedding-Based Anomaly Detection in Microservice Architectures
1 week ago
KD-MARL: Resource-Aware Knowledge Distillation in Multi-Agent Reinforcement Learning
1 week ago
Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
1 week ago
cs.LG updates on the arXiv.org e-print archive.
Quantization Impact on the Accuracy and Communication Efficiency Trade-off in Federated Learning for Aerospace Predictive Maintenance
1 week ago
The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project
1 week ago
Fast Heterogeneous Serving: Scalable Mixed-Scale LLM Allocation for SLO-Constrained Inference
1 week ago
Community focused on running large language models locally. Covers llama.cpp, Ollama, quantization, and open-weight models.
Just bought a DGX Spark, what kind of VLMs are you guys running on this kind of hardware?
2 weeks ago
I put a transformer model on a stock Commodore 64
2 weeks ago
Strix Halo + eGPU RTX 5070 Ti via OCuLink in llama.cpp: Benchmarks and Conclusions (Part 2)
2 weeks ago
cs.CV updates on the arXiv.org e-print archive.
SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses
1 week ago
DP-DeGauss: Dynamic Probabilistic Gaussian Decomposition for Egocentric 4D Scene Reconstruction
1 week ago
HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models
1 week ago
AI Technology & Industry Review
Comment on DeepSeek-V3 New Paper is coming! Unveiling the Secrets of Low-Cost Large Model Training through Hardware-Aware Co-design by Video to Text
1 week ago
Comment on Microsoft’s Fully Pipelined Distributed Transformer Processes 16x Sequence Length with Extreme Hardware Efficiency by gin'gin li
2 weeks ago
Comment on Microsoft’s Fully Pipelined Distributed Transformer Processes 16x Sequence Length with Extreme Hardware Efficiency by Awesome Skills
2 weeks ago
Connecting The Global Electronics Industry
The Magic of Agentic AI Will Come From a Holistic Approach to Chip Design
1 week ago
Edge AI Is Forcing a Rethink of Predictive Maintenance Architecture
2 weeks ago
Silicon Choices Grow in Importance as Industrial AI Moves Closer to the Factory Floor
2 weeks ago
A leading provider of news and information on the AI industry
Quantum Method May Reduce Memory Needs for AI Systems
1 week ago
Amazon Deepens AI Infrastructure Push as Uber Expands AWS Deal to Adopt Graviton and Trainium Chips
2 weeks ago
Microsoft AI Launches Multimodal Foundation Models to Expand In-House AI Capabilities
2 weeks ago
Latest technology news, AI breakthroughs, and electric vehicle developments from China's innovative tech landscape
PolyU and OPPO Propose Vision-Only Super-Resolution Framework VOSR
1 week ago
NIO’s 2026 ONVO L90 to Feature In-House 5nm Autonomous Driving Chip
2 weeks ago
Alibaba DAMO Academy Launches XuanTie C950 CPU for Large AI Models
1 month ago
Enterprise technology leadership news covering IT strategy, digital transformation, and CIO decision-making.
Leveraging heterogeneous computing architecture to power AI solutions
1 week ago
Cisco Presents Blueprint for Next-Generation AI Infrastructure… “Strengthening Performance, Power, and Security Capabilities”
2 weeks ago
Intel bets on Terafab to help it reassert itself in the AI chip race
2 weeks ago
All the latest content from the Tom's Hardware team
Intel and SambaNova team up on heterogenous AI inference platform — different hardware performs different workloads
2 weeks ago
Intel introduces its own Neural Compression technology with a fallback mode that works on GPUs without dedicated AI cores — early performance is on the level of Nvidia NTC
2 weeks ago
Researchers train living rat neurons to perform real-time AI computations — experiments could pave the way for new brain-machine interfaces
2 weeks ago
Bloomberg Technology
Harvard’s Kreiman Seeks $100 Million to Build AI Memory Tech
1 week ago
Nvidia, Arm Return the CPU to Center Stage in the Age of AI
3 weeks ago
Meeting Surging Demand for AI Memory Chips Has a Climate Cost
1 month ago
The most recent home feed on DEV Community.
TGI - Text Generation Inference - Install, Config, Troubleshoot
1 week ago
Taking World Models From Eye-Catching Demos to Truly Interactive Experiences on Commodity GPUs
1 week ago
Building an ML-Powered Notification Router on AWS: A Production Architecture Guide
2 weeks ago
The latest technology news, reviews, gadgets, launches and products.
As AI race with US intensifies, China’s Alibaba launches 10,000-card computing cluster
2 weeks ago
AI infrastructure on the front line: Lessons for Asean from the Iran war
3 weeks ago
Alibaba debuts its latest RISC-V-based chip amid shift to AI agents
4 weeks ago
Making AI accessible to 100K+ learners. Find the most practical, hands-on and comprehensive AI Engineering and AI for Work certifications at academy.towardsai.net - we have pathways for any experience ...
Multimodal AI Systems: Real vs. Batch Processing
1 week ago
Breaking the Memory Wall: TurboQuant KV Cache Quantization on Apple Silicon
2 weeks ago
Inside LLM Inference: KV Cache, Prefill, and the Decode Bottleneck
2 weeks ago
Deep Insights For Chip Engineers
Rethinking Robotics Reinforcement Learning: A Practical Humanoid Training Workflow
2 weeks ago
Fast Isn’t Fast Enough: Redefining Metrics for Edge AI
2 weeks ago
Redefining AI Inference With New Silicon Architecture
2 weeks ago
News and Industry Trends
CEA-Leti, CEA-List and PSMC Collaborate to Integrate RISC-V and MicroLED Silicon Photonics into 3D Stacking and Interposer for Next-Generation AI
2 weeks ago
Hailo Demonstrates Groundbreaking Edge AI Processors for Intelligent Security Systems at ISC West 2026
4 weeks ago
New Computer Chip Material Inspired by the Human Brain Could Slash AI Energy Use
1 month ago
Technical deep dives from NVIDIA on GPU computing, CUDA, deep learning frameworks, and AI infrastructure.
Running AI Workloads on Rack-Scale Supercomputers: From Hardware to Topology-Aware Scheduling
2 weeks ago
Accelerate Token Production in AI Factories Using Unified Services and Real-Time AI
3 weeks ago
Maximize AI Infrastructure Throughput by Consolidating Underutilized GPU Workloads
4 weeks ago
Startup and Technology News
Cognichip wants AI to design the chips that power AI, and just raised $60M to try
3 weeks ago
Google unveils TurboQuant, a new AI memory compression algorithm — and yes, the internet is calling it ‘Pied Piper’
4 weeks ago
Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way
1 month ago
The Future of AI Is Open and Proprietary
4 weeks ago
Blowing Off Steam: How Power-Flexible AI Factories Can Stabilize the Global Energy Grid
4 weeks ago
Advancing Open Source AI, NVIDIA Donates Dynamic Resource Allocation Driver for GPUs to Kubernetes Community
4 weeks ago
Technology insight for the enterprise
Meta’s Muse Spark: a smaller, faster AI model for broad app deployment
1 week ago
AWS turns its S3 storage service into a file system for AI agents
2 weeks ago
Google gives enterprises new controls to manage AI inference costs and reliability
2 weeks ago
All News
NVIDIA AI Ecosystem Expands as Marvell Joins Forces Through NVLink Fusion
3 weeks ago
NVIDIA and Global Industrial Software Giants Bring Design, Engineering and Manufacturing Into the AI Era
1 month ago
Hyundai Motor, Kia and NVIDIA Expand Strategic Partnership for Next-Generation Autonomous Driving Technology
1 month ago
Artificial Intelligence: News, Business, Research
Apple gets full Gemini access and uses distillation to build lightweight on-device AI
3 weeks ago
Arm breaks from its licensing-only model with first in-house chip built for AI data centers
4 weeks ago
Qualcomm shrinks AI reasoning chains by 2.4x to fit thinking models on smartphones
1 month ago
ComfyUI LTX Lora Trainer for 16GB VRAM
2 weeks ago
Black Forest Labs just released FLUX.2 Small Decoder: a faster, drop-in replacement for their standard decoder. ~1.4x faster, Lower peak VRAM - Compatible with all open FLUX.2 models
2 weeks ago
Made a 4-minute video with a single 53-word prompt, using my new video pipeline tool that goes from a simple or complex single prompt to a full video. I haven't fully tested the maximum length based on the context window I have, but it's a revolutionary product on consumer hardware. RTX 4090 laptop
2 weeks ago
#1 blog in Germany focused on artificial intelligence and robotics
NVIDIA: New Security Threats to AI Infrastructure
2 weeks ago
Biological Computers: Cortical Labs Uses Human Neurons for New Systems
2 weeks ago
Apple's MacBook Pro M5: New Standards for AI Performance
2 weeks ago
No-Nvidia interconnect club delivers 2.0 spec before v1.0 silicon ships
2 weeks ago
How Nvidia learned to embrace the light in its quest for scale
2 weeks ago
PrismML debuts energy-sipping 1-bit LLM in bid to free AI from the cloud
2 weeks ago
Tech Funding News
SuperSeed’s £50M Physical AI fund: the next UK deeptech play?
1 month ago
Fidelity, Qualcomm back Frore’s $143M liquid cooling bet for NVIDIA GPUs
1 month ago
Standard Kernel wants to outsmart NVIDIA’s libraries with AI kernels. Here’s how!
1 month ago
cs.CL updates on the arXiv.org e-print archive.
Double: Breaking the Acceleration Limit via Double Retrieval Speculative Parallelism
1 week ago
Dual-Pool Token-Budget Routing for Cost-Efficient and Reliable LLM Serving
1 week ago
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
2 weeks ago
Faster Diffusion on Blackwell: MXFP8 and NVFP4 with Diffusers and TorchAO
2 weeks ago
Monarch: an API to your supercomputer
2 weeks ago
PyTorch Foundation Welcomes Helion as a Foundation-Hosted Project to Standardize Open, Portable, and Accessible AI Kernel Authoring
2 weeks ago
Tough Beyond the Limits of AI Chips: Meet the ‘Memristor’ Chip That Withstands Heat Above 700 Degrees! Researchers Found It by Accident While Testing Other Materials
2 weeks ago
Summary of Jensen Huang's 20 Iron Rules and Vision: From Building AI Chips and Organizational Management Philosophy to Views on AGI and Sending Consciousness Into Space
4 weeks ago
XPENG Steps Up to the ‘Physical AI’ Era, Making Cars That Can Think: Launches the World's First Right-Hand-Drive ‘New X9’ and Highlights the Turing AI Chip as the New Powerhouse for Smart Cars
1 month ago
Where Innovation Meets Imagination
Dexterity says its physical AI world model ‘unlocks full potential on Nvidia hardware’
1 month ago
Nebius teams with Nvidia to build cloud for robotics and physical AI
1 month ago
From industrial robot arms to humanoids: Nvidia tightens its grip on the future of robotics
1 month ago