InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Abstract
InfiniDepth represents depth as neural implicit fields using a local implicit decoder, enabling continuous 2D coordinate querying for arbitrary-resolution depth estimation and superior performance in fine-detail regions.
Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict their scalability to arbitrary output resolutions and hinder the recovery of geometric detail. This paper introduces InfiniDepth, which represents depth as neural implicit fields. Through a simple yet effective local implicit decoder, we can query depth at continuous 2D coordinates, enabling arbitrary-resolution and fine-grained depth estimation. To better assess our method's capabilities, we curate a high-quality 4K synthetic benchmark from five different games, spanning diverse scenes with rich geometric and appearance details. Extensive experiments demonstrate that InfiniDepth achieves state-of-the-art performance on both synthetic and real-world benchmarks across relative and metric depth estimation tasks, particularly excelling in fine-detail regions. It also benefits the task of novel view synthesis under large viewpoint shifts, producing high-quality results with fewer holes and artifacts.
Community
Depth Beyond Pixels
We introduce InfiniDepth: casting monocular depth estimation as a neural implicit field.
- Arbitrary-Resolution
- Accurate Metric Depth
- Single-View NVS under large viewpoint shifts
arXiv: https://arxiv.org/abs/2601.03252
Project page: https://zju3dv.github.io/InfiniDepth
arXiv Explained breakdown of this paper: https://arxivexplained.com/papers/infinidepth-arbitrary-resolution-and-fine-grained-depth-estimation-with-neural-implicit-fields
InfiniDepth: Main Results and Key Findings
Overview
InfiniDepth introduces a new approach to monocular depth estimation by representing depth as neural implicit fields rather than discrete grids. This enables arbitrary-resolution and fine-grained depth prediction, addressing fundamental limitations of existing methods.
Key Innovations and Results
1. Neural Implicit Field Representation
Figure 1 showcases InfiniDepth's three main capabilities:
- (a) Arbitrary-resolution depth estimation - can query depth at any continuous coordinate
- (b) Fine-grained point clouds with geometric detail preservation
- (c) Enhanced novel view synthesis with fewer holes and artifacts
The core insight is modeling depth as a continuous function:
d_I(x, y) = N_θ(I, (x, y))
where any 2D coordinate can be mapped to a depth value, breaking free from grid constraints.
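A minimal sketch of what this interface looks like in practice, in PyTorch-style code (the `encoder` / `implicit_decoder` names and the [0, 1]² coordinate convention are illustrative assumptions, not the authors' actual API): the image is encoded once, and depth can then be decoded on a query grid of any resolution.

```python
import torch

def make_query_grid(height, width, device="cpu"):
    """Continuous (x, y) query coordinates in [0, 1]^2 for an arbitrary output grid."""
    ys = (torch.arange(height, device=device, dtype=torch.float32) + 0.5) / height
    xs = (torch.arange(width, device=device, dtype=torch.float32) + 0.5) / width
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    return torch.stack([grid_x, grid_y], dim=-1).reshape(-1, 2)  # (H*W, 2)

# Hypothetical usage: encode once, then query at a resolution 4x the input.
# feats  = encoder(image)                                # (1, C, h, w) feature maps
# coords = make_query_grid(4 * H, 4 * W)                 # denser than the input grid
# depth  = implicit_decoder(feats, coords).reshape(4 * H, 4 * W)
```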
2. Multi-Scale Local Implicit Decoder
Figure 2 illustrates the two-module architecture:
Feature Query (a):
- Extracts multi-scale features from ViT encoder layers
- Constructs feature pyramid with different spatial resolutions
- Uses bilinear interpolation to query features at continuous coordinates (sketched below)
Depth Decoding (b):
- Hierarchically fuses features from high-to-low resolution
- Employs residual gated fusion blocks
- Predicts depth through lightweight MLP head
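The feature-query step (a) can be sketched with standard bilinear sampling at continuous coordinates; the snippet below is a simplified illustration (the residual gated fusion blocks and MLP head of step (b) are omitted), not the paper's exact implementation:

```python
import torch
import torch.nn.functional as F

def query_pyramid_features(pyramid, coords):
    """Bilinearly sample every pyramid level at continuous query coordinates.

    pyramid: list of (1, C_l, H_l, W_l) feature maps at different resolutions
    coords:  (N, 2) xy coordinates in [0, 1]^2
    Returns a list of (N, C_l) per-level feature vectors for the decoder to fuse.
    """
    grid = (coords * 2.0 - 1.0).view(1, 1, -1, 2)  # grid_sample expects [-1, 1]
    sampled_levels = []
    for feat in pyramid:
        sampled = F.grid_sample(feat, grid, mode="bilinear", align_corners=False)
        sampled_levels.append(sampled.view(feat.shape[1], -1).t())  # (N, C_l)
    return sampled_levels
```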
3. Infinite Depth Query Strategy
Figure 3 illustrates a key observation: traditional per-pixel depth prediction creates density imbalance due to perspective projection and surface orientation effects. InfiniDepth's adaptive query strategy:
- Computes adaptive weights: w(x, y) = d_I(x, y)² / (|n(x, y) · v(x, y)| + ε) (sketched below)
- Allocates sub-pixel query budgets proportionally to 3D surface area
- Generates uniformly distributed 3D points on object surfaces
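A minimal sketch of how such adaptive weights and per-pixel query budgets could be computed (tensor shapes, the ε value, and the rounding of budgets are assumptions for illustration):

```python
import torch

def adaptive_query_budget(depth, normals, view_dirs, total_queries, eps=1e-4):
    """Allocate sub-pixel query budgets proportionally to the 3D surface area
    each pixel covers, using w = d^2 / (|n . v| + eps).

    depth:     (H, W) depth at pixel centers
    normals:   (H, W, 3) unit surface normals
    view_dirs: (H, W, 3) unit viewing directions
    """
    cos = (normals * view_dirs).sum(dim=-1).abs()   # |n . v|: foreshortening term
    weights = depth ** 2 / (cos + eps)              # adaptive per-pixel weight
    budget = weights / weights.sum() * total_queries
    return budget.round().long()                    # sub-pixel queries per pixel
```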
4. High-Quality Internal Geometry
Figure 4 shows that the model learns high-quality internal geometry, with normal maps computed through autograd revealing detailed surface structure.
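Because depth is a differentiable function of the query coordinates, its spatial gradients (which drive the normal maps in Figure 4) can be obtained directly with autograd. A minimal sketch, where `depth_fn` is a placeholder for the trained field and the intrinsics-dependent conversion from gradients to normals is omitted:

```python
import torch

def depth_and_gradients(depth_fn, coords):
    """Query the continuous depth field and its gradients w.r.t. the coordinates.

    coords: (N, 2) xy query coordinates
    Returns depth of shape (N,) and d(depth)/d(x, y) of shape (N, 2).
    """
    coords = coords.clone().requires_grad_(True)
    depth = depth_fn(coords)                              # (N,) depth values
    (grads,) = torch.autograd.grad(depth.sum(), coords)   # analytic spatial gradients
    return depth.detach(), grads
```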
Quantitative Results
Synthetic Benchmark (Synth4K)
The paper introduces Synth4K, a new 4K synthetic benchmark from five games with diverse scenes and geometric details.
Relative Depth Estimation (Table 1):
- InfiniDepth achieves state-of-the-art performance across all metrics
- Particularly strong in high-frequency (HF) masked regions
- δ₁ accuracy significantly outperforms baselines
Metric Depth Estimation (Table 2):
- Combined with sparse depth inputs ("Ours-Metric")
- Superior performance at stricter δ thresholds
Real-World Benchmarks
Relative Depth (Table 3):
- Competitive performance on KITTI, ETH3D, NYUv2, ScanNet, DIODE
- On par with current SOTA methods
Metric Depth (Table 4):
- Clear improvements over existing metric depth methods
- Outperforms Marigold-DC, Omni-DC, PriorDA, PromptDA
Qualitative Comparisons
Depth Map Quality
Figure 5 shows:
- First two rows: Synth4K predictions with superior detail preservation
- Bottom row: Real-world data with low-resolution input
- Highlighted boxes demonstrate fine-detail recovery capabilities
Metric Depth Results
Figure 6 highlights geometric detail recovery in high-frequency regions, showing cleaner edges and better preservation of fine structures.
Novel View Synthesis
Figure 8 demonstrates superior novel view synthesis under large viewpoint shifts:
- InfiniDepth produces complete, stable results
- ADGaussian baseline shows noticeable geometric holes and artifacts
- The infinite depth query strategy ensures uniform point distribution (unprojection sketched below)
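The queried (possibly sub-pixel) depths are turned into 3D points with a standard pinhole unprojection before rendering; a minimal sketch under the usual (fx, fy, cx, cy) intrinsics convention (an assumption about the pipeline, not the authors' exact code):

```python
import torch

def unproject(coords, depth, intrinsics):
    """Lift continuous pixel coordinates and queried depths to camera-space 3D points.

    coords:     (N, 2) pixel-space xy coordinates (may be sub-pixel)
    depth:      (N,) depth values queried from the implicit field
    intrinsics: (fx, fy, cx, cy) pinhole parameters
    """
    fx, fy, cx, cy = intrinsics
    x = (coords[:, 0] - cx) / fx * depth
    y = (coords[:, 1] - cy) / fy * depth
    return torch.stack([x, y, depth], dim=-1)  # (N, 3) points in the camera frame
```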
Key Ablation Studies
Depth Representation Effectiveness (Table 5)
- Neural implicit fields significantly outperform discrete grid representations
- Gains more pronounced in metric depth estimation with sparse inputs
Multi-Scale Feature Query
- Multi-scale mechanism brings substantial improvements
- Single-scale baseline performs considerably worse
Computational Efficiency (Table 6)
- Decoder has the lowest parameter count among compared methods
- Competitive computational efficiency despite superior detail preservation
Impact and Significance
- Resolution Independence: Breaks free from training resolution constraints
- Fine Detail Preservation: Excels in geometrically complex regions
- Multi-Task Versatility: Effective for both relative and metric depth estimation
- Downstream Applications: Benefits novel view synthesis, 3D reconstruction pipelines
- Benchmark Contribution: Synth4K enables better evaluation of high-resolution depth estimation
Limitations and Future Work
- No explicit temporal consistency for video applications
- Future work: extend to multi-view settings for improved temporal stability and 3D consistency
InfiniDepth represents a fundamental shift in depth estimation, moving from discrete grid representations to continuous neural implicit fields, enabling resolution scalability and fine-detail preservation beyond what grid-based methods offer.
This is an automated message from Librarian Bot. The following papers, similar to this one, were recommended by the Semantic Scholar API:
- Long-LRM++: Preserving Fine Details in Feed-Forward Wide-Coverage Reconstruction (2025)
- Depth Anything 3: Recovering the Visual Space from Any Views (2025)
- CloseUpShot: Close-up Novel View Synthesis from Sparse-views via Point-conditioned Diffusion Model (2025)
- 360-GeoGS: Geometrically Consistent Feed-Forward 3D Gaussian Splatting Reconstruction for 360 Images (2026)
- GVSynergy-Det: Synergistic Gaussian-Voxel Representations for Multi-View 3D Object Detection (2025)
- Re-Depth Anything: Test-Time Depth Refinement via Self-Supervised Re-lighting (2025)
- Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation (2025)