For you Ai Security Dev Cloud Hardware Startups Releases General

From Hugging Face Blog · 29 stories

2 sources 2 reports 4h ago Updated 3h ago

Gemma 4 12B Boosts Multimodal AI Processing on Laptops

Google DeepMind introduced Gemma 4 12B, a new encoder-free multimodal AI model, enabling advanced processing on laptops with minimal memory. Gemma 4's architecture eliminates multimodal encoders, creating efficient audio and visual input processing. Collaboration with Cerebras and Hugging Face enhances real-time speech-to-speech capabilities, improving applications like voice assistants.

ai multimodal model huggingface cerebras
1 source 1 report 1d ago

ScarfBench Launches as New AI Benchmark for Java Framework Migration

ScarfBench provides a new open benchmark to evaluate AI agents on Enterprise Java framework migrations. It focuses on ensuring successful builds, deployments, and behavior preservation across major Java ecosystems like Spring and Jakarta EE, addressing gaps in existing AI-assisted modernization efforts.

general java ai software migration
1 source 1 report 1d ago

Hugging Face Integrates Every Eval Ever for Model Reporting

Hugging Face has integrated the Every Eval Ever (EEE) JSON schema into its Community Evals to standardize AI evaluation reporting. This collaboration aims to enhance trust and comparability in model performance, addressing inconsistencies in evaluation results reported across multiple formats.

ai evaluation huggingface ml
1 source 1 report 4d ago

Hybrid models outperform transformers in predicting meaning-rich tokens

Experiments revealed that hybrid models, like Olmo Hybrid, predict meaning-rich tokens better than transformers. However, on simple repetitive tokens, transformers maintain an edge, indicating differing strengths in architectural approaches.

ai hybrid language models transformers
1 source 1 report 4d ago

NVIDIA NeMo AutoModel Enhances Fine-Tuning for Generative AI Models

NVIDIA launched NeMo AutoModel, enhancing fine-tuning for generative AI models by enabling higher training performance. This tool achieves up to 3.7x faster training and reduces GPU memory use by up to 32%, making it easier for developers to implement advanced models without extensive code changes.

ai generative-ai neural-networks nvidia transformers
1 source 1 report 4d ago

Launch of FFASR Leaderboard to Benchmark ASR in Real-World Conditions

Treble Technologies and Hugging Face introduced the FFASR Leaderboard to evaluate Automatic Speech Recognition (ASR) models under far-field conditions. This community-driven benchmark aims to address the significant gap in performance between traditional clean-speech evaluations and real-world usage scenarios involving background noise and reverberation.

dev asr audio benchmarking voice
1 source 1 report 4d ago

IBM's CUGA Offers Lightweight Framework for Building Agentic Apps

IBM has released CUGA, a Configurable Generalist Agent harness that simplifies the development of agentic applications by automating the orchestration and state management. With two dozen example applications provided, developers can create functional agents quickly without extensive groundwork, increasing efficiency in building machine learning applications.

dev agentic cuga fastapi ibm
1 source 1 report 4d ago

Transformers.js improves browser-based AI model management with Cross-Origin Storage API

Transformers.js now integrates the proposed Cross-Origin Storage API to manage AI model resources more efficiently. This change reduces the redundant downloads of commonly used models across different web applications, addressing issues related to cache storage and data usage.

dev api models transformers web development
1 source 1 report 4d ago

PP-OCRv6: New OCR Model on Hugging Face with 50-Language Support

PaddleOCR has launched PP-OCRv6, a new OCR model with capabilities in 50 languages and scalability from 1.5M to 34.5M parameters. The model improves text detection and recognition accuracy compared to its predecessor, PP-OCRv5, making it suitable for a variety of real-world OCR applications.

dev ai model ocr paddleocr
1 source 1 report 4d ago

Local Models Triaged Issues in OpenClaw Repository for Free

In June 2026, local AI models were used to efficiently triage issues in the OpenClaw repository. This method allows for real-time notifications and reduces costs associated with cloud-based models, highlighting the growing importance of local AI implementation.

ai issue triage local models open source
1 source 1 report 4d ago

MosaicLeaks addresses privacy risks in deep research agents with new training method

MosaicLeaks reveals privacy vulnerabilities in deep research agents that combine private documents and web searches, leading to potential leakage of sensitive information. The proposed Privacy-Aware Deep Research (PA-DR) method improves task accuracy and decreases information leakage significantly, from 34.0% to 9.9% for full-information leakage.

ai privacy research security
1 source 1 report 4d ago

Benchmarking agent-driven software models with transformer tools

A new benchmarking approach evaluates the efficiency of coding agents in software development, focusing on task completion rather than just final output. This shift highlights the importance of designing libraries for effective agent interaction, emphasizing the need for clear APIs and documentation.

dev ai benchmarks coding transformers
1 source 1 report 4d ago

GLM-5.2 Launches with Advanced Long-Horizon Coding Capabilities

GLM-5.2 introduces a 1M-token context improving performance in long-horizon coding tasks. The model features enhanced coding capabilities and architecture improvements that significantly reduce computational costs while maintaining performance, marking it as a competitive player in the open-source sector.

dev ai coding model open-source
1 source 1 report 4d ago

New ARD Specification Enables Dynamic Agent Searches Across Tools

The Agentic Resource Discovery (ARD) specification has been developed collaboratively by major tech companies to allow agents to discover tools at runtime. This move shifts from a static model requiring pre-installed capabilities to dynamic, intent-based searches, enhancing the ability of agents to access and utilize a broader range of tools effectively.

general ard google huggingface microsoft
1 source 1 report 4d ago

Migrating CI from GitHub to Hugging Face Jobs for Enhanced Performance

Trackio has migrated its CI from GitHub Actions to Hugging Face Jobs, achieving a 30% reduction in CPU CI time and enabling GPU testing. This step is significant for improving efficiency and expanding testing capabilities in machine learning projects.

dev ci github huggingface jobs
1 source 1 report 4d ago

OpenEnv Gains Support from Major AI Organizations for Open Source Development

OpenEnv has transitioned to an open-source model coordinated by leading AI organizations such as Meta-PyTorch and Microsoft. This move aims to improve agent training efficiency across various AI harnesses and environments, fostering collaboration within the AI community.

ai openenv opensource reinforcementlearning
1 source 1 report 4d ago

Nemotron 3.5 Enhances Multimodal Content Safety with Custom Policies

Nemotron 3.5 introduces customizable multimodal safety integration, considering user prompts, images, and responses simultaneously. This update captures policy violations emerging from interaction, enhancing deployments across various global languages and industries.

ai general releases
1 source 1 report 4d ago

Direct Preference Optimization Reduces Text Degeneration in OCR Models

DharmaOCR introduces Direct Preference Optimization (DPO) to combat text degeneration in OCR models. The second training stage reduced degeneration rates by an average of 59.4%, addressing a significant limitation of supervised fine-tuning.

ai degeneration dpo ocr
1 source 1 report 4d ago

Holo3.1 Released with Local Execution and Enhanced Performance

Holo3.1 has been released, featuring enhanced robustness for local and mobile environments, quantized checkpoints for local inference, and improved performance across various deployment frameworks. This release addresses the challenges of deployment flexibility and performance consistency in diverse operational settings.

dev holo3 local mobile performance
1 source 1 report 4d ago

JetBrains Launches Mellum2: 12B Mixture-of-Experts AI Model

JetBrains has released Mellum2, a 12 billion-parameter Mixture-of-Experts model optimized for natural language and coding tasks. With efficient parameter activation and over 2x faster inference compared to similar models, Mellum2 is positioned for high-throughput AI applications.

ai jetbrains language-model mellum2
1 source 1 report 1d ago

Analysis of AI Specialization and Its Emergence as a Key Principle

A recent analysis highlights the inevitability of specialization in effective AI systems, drawing on various domains. It argues that focused AI systems outperform general models, correlating with findings in optimization theory and evolutionary biology.

ai machine learning optimization specialization
1 source 1 report 4d ago

Exploring Alternatives to LoRA in Parameter-Efficient Fine-Tuning

The article investigates alternatives to LoRA, the predominant technique in parameter-efficient fine-tuning (PEFT). It highlights the potential of PEFT techniques to reduce memory requirements for model fine-tuning and mentions the development of the PEFT library by Hugging Face, which supports various methods and improves accessibility.

dev finetuning huggingface lora peft
1 source 1 report 1d ago

DiScoFormer model estimates density and score for data distributions

The DiScoFormer model estimates both the density and score of data distributions in a single forward pass. This model improves upon existing methods by allowing for high-dimensional data analysis without the need for retraining, addressing challenges in density estimation and score matching.

ai ml transformers generative model
1 source 1 report 4d ago

Hugging Face simplifies vLLM server setup with single command

Hugging Face introduced a command to run a vLLM server easily, facilitating model testing and evaluation. This command allows users to quickly deploy models and interact with them via the OpenAI API using Hugging Face infrastructure.

dev api development huggingface vllm
1 source 2 reports 4d ago

Hugging Face Enhances CLI and Adopts Weekly Releases for Improved Efficiency

Hugging Face has updated their command-line interface (CLI) to cater to both human and artificial intelligence (AI) agents, optimizing token usage. Additionally, they have shifted to a weekly release schedule for the huggingface_hub Python client to accelerate the implementation of fixes and features. These changes enhance CLI efficiency and streamline the release process.

dev ai cli codingagents devops
1 source 1 report 4d ago

Strands Robots SDK integrates LeRobot for seamless robot task management

The Strands Robots SDK now integrates LeRobot hardware and simulations, streamlining task management for robots. Users can record, test, and deploy robot tasks with fewer tools, enhancing workflow efficiency across multiple robots.

dev automation integration robotics software
1 source 1 report 4d ago

Agent Creates 3D Paris Gallery Using Hugging Face Spaces

A coding agent utilized Hugging Face Spaces to create a web gallery featuring 3D Gaussian models of Paris monuments without manually engaging with image or 3D tools. This illustrates a shift towards modular software construction where AI integrates existing components easily.

ai 3dmodels huggingface softwaredevelopment
1 source 1 report 4d ago

Introduction of MCP Tools for Reachy Mini Enhances Remote Functionality

The Reachy Mini now supports remote tools through MCP canary Space, allowing the addition of external functionalities like weather queries. This update enhances the robot's interactivity and potential use cases without modifying the core app directly.

dev development reachy robotics tools
1 source 1 report 4d ago

Profiling in PyTorch: Expanding to Fused MLP with nn.Linear

The second part of the 'Profiling in PyTorch' series introduces the use of nn.Linear to create a Multilayer Perceptron (MLP) block. This change highlights how to efficiently profile and optimize deep learning models in PyTorch by leveraging GPU capabilities.

dev gpu mlp profiling pytorch