A Single-View 3D Model Reconstruction Method for Yangtze Finless Porpoise

HUANG Zhi-Yong, YANG Chen-Long, SHI Xiao-Tao, HUA Xi-Feng, TU Fa-Xian, DING Tuo-Jun, SHE Ya-Li, XIANG Meng-Li

Citation: HUANG Zhi-Yong, YANG Chen-Long, SHI Xiao-Tao, HUA Xi-Feng, TU Fa-Xian, DING Tuo-Jun, SHE Ya-Li, XIANG Meng-Li. A SINGLE-VIEW 3D MODEL RECONSTRUCTION METHOD FOR YANGTZE FINLESS PORPOISE[J]. ACTA HYDROBIOLOGICA SINICA, 2025, 49(4): 042510. DOI: 10.7541/2025.2024.0183. CSTR: 32229.14.SSSWXB.2024.0183

Funds: Supported by the National Natural Science Foundation of China (52279069)
Article Information

    About the authors:

    HUANG Zhi-Yong (1979—), male, Ph.D., associate professor, mainly engaged in computer vision research. E-mail: hzy@hzy.org.cn

    Corresponding author:

    SHI Xiao-Tao (1981—), male, Ph.D., professor, mainly engaged in eco-hydraulics research. E-mail: fishlab@163.com

  • CLC number: Q-334

  • Abstract:

    In the field of 3D reconstruction of Yangtze finless porpoises, challenges such as underwater image color distortion, limited datasets, and the difficulty of capturing multi-view images of the animals remain significant, and emerging methods have yet to be applied specifically to Yangtze finless porpoises. To tackle these challenges, this paper proposes a single-view 3D reconstruction method for Yangtze finless porpoises that combines diffusion models and neural radiance fields. First, an improved underwater image enhancement technique is developed to effectively address underwater color distortion. Second, a custom multi-view image dataset of Yangtze finless porpoises is created to fine-tune a view-conditioned diffusion model, enabling the synthesis of multi-view images from a single view and providing a new approach for reconstructing Yangtze finless porpoises from a single image. Finally, a neural radiance field is employed to reconstruct the 3D model of the porpoise. The reconstruction results were evaluated using the average chamfer distance (ACD) and normal consistency (NC). The proposed method achieved lower ACD and higher NC than existing methods, demonstrating its effectiveness in reconstructing 3D models that accurately capture the coloration and morphology of Yangtze finless porpoises. The synthesized novel views achieved PSNR, SSIM, and LPIPS values of 38.968, 0.972, and 0.294, respectively, surpassing existing methods. Additionally, reconstruction after underwater image enhancement yielded the lowest ACD of 0.428 and the highest NC of 0.882, further highlighting the superiority of the proposed approach.
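As background for the reconstruction stage, the neural radiance field of ref [24] renders a pixel by the standard quadrature along each camera ray r, where σ_i and c_i are the density and color predicted at the i-th sample and δ_i = t_{i+1} − t_i is the spacing between adjacent samples:

```latex
\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\!\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right)
```

Fitting the field to the synthesized multi-view images then reduces to minimizing the squared error between \hat{C}(\mathbf{r}) and the observed pixel colors.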

Figure 1. Base model of the finless porpoise obtained by scanning

Figure 2. Animation model of the finless porpoise after skeleton rigging and skinning

Figure 3. A set of views and camera poses from the dataset (12 views of the finless porpoise model from different angles and their corresponding camera poses)
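As an illustration of how such view/pose pairs can be generated, the sketch below samples 12 azimuths on a circle around the model and builds the corresponding camera-to-world matrices. The radius, elevation, and OpenGL-style look-at convention are assumptions made for illustration, not the paper's exact capture setup.

```python
import numpy as np

def look_at(eye, target=np.zeros(3), up=np.array([0.0, 0.0, 1.0])):
    """Camera-to-world matrix (OpenGL convention: the camera looks down its -z axis)."""
    f = target - eye
    f = f / np.linalg.norm(f)            # forward direction
    r = np.cross(f, up)
    r = r / np.linalg.norm(r)            # right direction
    u = np.cross(r, f)                   # recomputed up direction
    c2w = np.eye(4)
    c2w[:3, 0], c2w[:3, 1], c2w[:3, 2], c2w[:3, 3] = r, u, -f, eye
    return c2w

# 12 viewpoints evenly spaced in azimuth at a fixed elevation, all looking at the model.
radius, elevation = 2.0, np.deg2rad(20.0)
poses = [
    look_at(radius * np.array([np.cos(az) * np.cos(elevation),
                               np.sin(az) * np.cos(elevation),
                               np.sin(elevation)]))
    for az in np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
]
```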

Figure 4. Schematic diagram of novel view synthesis and 3D reconstruction

Figure 5. The ECA-Net channel attention module
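A minimal PyTorch rendition of the ECA-Net module of ref [18] shown in Figure 5: global average pooling produces a per-channel descriptor, a 1D convolution with an adaptively chosen kernel size models local cross-channel interaction, and a sigmoid gate rescales the input channels. The hyperparameters gamma and b follow the defaults in ref [18]; where the module sits inside the enhancement network is not reproduced here.

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention (ref [18])."""

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size: k = |log2(C)/gamma + b/gamma|, rounded to the nearest odd value.
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> per-channel descriptor (B, C, 1, 1) via global average pooling.
        y = self.pool(x)
        # 1D convolution across the channel axis captures local cross-channel interaction.
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        # Sigmoid gate rescales each input channel.
        return x * torch.sigmoid(y)
```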

Figure 6. Original image (a), enhanced image (b), segmented image (c), and depth map (d)
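A depth map like panel (d) can be produced with the monocular depth estimators the paper cites (refs [19], [25]); the following sketch uses the public MiDaS/DPT torch.hub entry point, where the DPT_Large variant and the file name porpoise_enhanced.png are assumptions rather than the paper's stated configuration.

```python
import cv2
import torch

# Load a DPT-based monocular depth model (refs [19], [25]) from torch.hub.
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Large")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")

img = cv2.cvtColor(cv2.imread("porpoise_enhanced.png"), cv2.COLOR_BGR2RGB)  # hypothetical input

with torch.no_grad():
    batch = transforms.dpt_transform(img)        # resize + normalize to the model's input
    pred = midas(batch)                          # relative inverse depth, shape (1, H', W')
    depth = torch.nn.functional.interpolate(     # upsample back to the image resolution
        pred.unsqueeze(1), size=img.shape[:2], mode="bicubic", align_corners=False
    ).squeeze().cpu().numpy()
```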

Figure 7. Schematic diagram of the view-conditioned diffusion model
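Fine-tuning such a view-conditioned diffusion model follows the latent-diffusion objective of refs [6, 10, 21]: the denoiser ε_θ learns to recover the noise added to the latent z_t of a target view, conditioned on the timestep t and an embedding c(x, R, T) of the input view x and the relative camera rotation and translation (R, T):

```latex
\min_{\theta}\;
\mathbb{E}_{z \sim \mathcal{E}(x),\, t,\, \epsilon \sim \mathcal{N}(0,1)}
\bigl\| \epsilon - \epsilon_{\theta}\bigl(z_t,\, t,\, c(x, R, T)\bigr) \bigr\|_{2}^{2}
```

Here \mathcal{E} is the latent encoder of ref [21]; the multi-view porpoise dataset of Figure 3 supplies the (view, pose) pairs used for fine-tuning.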

Figure 8. Fully connected neural network architecture used by the neural radiance field
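A compact PyTorch sketch of the architecture in Figure 8, following the original NeRF design (ref [24]): an 8-layer, 256-wide MLP over the positionally encoded sample location, a skip connection that re-injects the encoding midway, a view-independent density head, and a view-dependent color head. Layer counts and widths follow ref [24] and may differ from the paper's exact configuration.

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """gamma(p) = (p, sin(2^0*pi*p), cos(2^0*pi*p), ..., sin(2^{L-1}*pi*p), cos(2^{L-1}*pi*p))."""

    def __init__(self, num_freqs: int):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(num_freqs) * math.pi)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = [x]
        for f in self.freqs:
            out += [torch.sin(f * x), torch.cos(f * x)]
        return torch.cat(out, dim=-1)

class NeRF(nn.Module):
    def __init__(self, pos_freqs: int = 10, dir_freqs: int = 4, width: int = 256):
        super().__init__()
        self.pe_x, self.pe_d = PositionalEncoding(pos_freqs), PositionalEncoding(dir_freqs)
        in_x, in_d = 3 + 6 * pos_freqs, 3 + 6 * dir_freqs
        layer = lambda i, o: [nn.Linear(i, o), nn.ReLU()]
        self.block1 = nn.Sequential(*layer(in_x, width), *layer(width, width),
                                    *layer(width, width), *layer(width, width))
        # Skip connection: the encoded position is re-injected after four layers.
        self.block2 = nn.Sequential(*layer(width + in_x, width), *layer(width, width),
                                    *layer(width, width), *layer(width, width))
        self.sigma = nn.Linear(width, 1)      # view-independent volume density
        self.feat = nn.Linear(width, width)
        self.rgb = nn.Sequential(nn.Linear(width + in_d, width // 2), nn.ReLU(),
                                 nn.Linear(width // 2, 3), nn.Sigmoid())

    def forward(self, x: torch.Tensor, d: torch.Tensor):
        """x: (N, 3) sample positions; d: (N, 3) unit view directions."""
        hx = self.pe_x(x)
        h = self.block2(torch.cat([self.block1(hx), hx], dim=-1))
        return self.rgb(torch.cat([self.feat(h), self.pe_d(d)], dim=-1)), torch.relu(self.sigma(h))
```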

Figure 9. Novel view synthesis results of the view-conditioned diffusion model

Figure 10. Comparison of the reconstruction results of the proposed method with RealFusion and One-2-3-45

Figure 11. Comparison of reconstruction results with and without underwater image enhancement

Figure 12. Examples of models with poor reconstruction quality

Table 1. Key attribute statistics

Property              Value
Number of edges       2187315
Number of faces       1458210
Number of vertices    729107
Random render color   (0.94, 0.27, 0.73)

Table 2. Comparison of novel view synthesis metrics

Metric    RealFusion    One-2-3-45    Ours
PSNR↑     37.784        38.132        38.968
SSIM↑     0.943         0.955         0.972
LPIPS↓    0.305         0.298         0.294
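For reference, the three image metrics in Table 2 can be computed as in the sketch below, which uses scikit-image for PSNR/SSIM and the lpips package (ref [27]); the AlexNet backbone is an assumption about the configuration, not taken from the paper.

```python
import numpy as np
import torch
import lpips                                      # pip install lpips (ref [27])
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net="alex")                # AlexNet backbone (assumed)

def evaluate_view(pred: np.ndarray, gt: np.ndarray):
    """pred, gt: uint8 RGB images of shape (H, W, 3); returns (PSNR, SSIM, LPIPS)."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=255)
    # LPIPS expects NCHW float tensors scaled to [-1, 1].
    to_t = lambda im: torch.from_numpy(im).permute(2, 0, 1)[None].float() / 127.5 - 1.0
    return psnr, ssim, lpips_fn(to_t(pred), to_t(gt)).item()
```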

Table 3. Average chamfer distance evaluation of the reconstructed mesh models

Porpoise     RealFusion    One-2-3-45    Ours (w/o enhancement)    Ours (enhanced)
Porpoise 1   1.146         2.352         0.689                     0.503
Porpoise 2   0.871         2.341         0.649                     0.554
Porpoise 3   1.217         1.761         0.733                     0.535
Porpoise 4   1.473         3.576         0.675                     0.583
Porpoise 5   1.591         2.287         0.874                     0.759
Porpoise 6   1.748         1.237         0.462                     0.428
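A minimal sketch of the symmetric chamfer distance underlying Table 3, computed between point sets sampled from the reconstructed and reference meshes; conventions differ (sum vs. mean of the two directions, squared vs. unsquared distances), so absolute values depend on the normalization the paper adopts.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric chamfer distance between two (N, 3) point sets."""
    d_ab, _ = cKDTree(b).query(a)     # distance from each point of a to its nearest point in b
    d_ba, _ = cKDTree(a).query(b)     # and vice versa
    return float(d_ab.mean() + d_ba.mean())
```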

Table 4. Normal vector consistency evaluation

Porpoise     RealFusion    One-2-3-45    Ours (w/o enhancement)    Ours (enhanced)
Porpoise 1   0.624         0.602         0.853                     0.866
Porpoise 2   0.705         0.483         0.824                     0.841
Porpoise 3   0.718         0.472         0.819                     0.837
Porpoise 4   0.531         0.415         0.807                     0.849
Porpoise 5   0.683         0.585         0.763                     0.774
Porpoise 6   0.524         0.673         0.876                     0.882
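Similarly, a sketch of the normal consistency metric of Table 4: normals of nearest-neighbour point pairs are compared by absolute cosine similarity and averaged over both directions. This pairing and averaging scheme is one common convention, assumed here rather than taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_consistency(pts_a, nrm_a, pts_b, nrm_b) -> float:
    """Mean absolute cosine similarity between normals of nearest-neighbour pairs."""
    unit = lambda n: n / np.linalg.norm(n, axis=1, keepdims=True)
    nrm_a, nrm_b = unit(nrm_a), unit(nrm_b)
    _, ab = cKDTree(pts_b).query(pts_a)     # index of the nearest b-point for each a-point
    _, ba = cKDTree(pts_a).query(pts_b)     # and vice versa
    nc_ab = np.abs((nrm_a * nrm_b[ab]).sum(axis=1)).mean()
    nc_ba = np.abs((nrm_b * nrm_a[ba]).sum(axis=1)).mean()
    return float(0.5 * (nc_ab + nc_ba))
```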
[1] Cheng Z L, Li Y T, Zuo T, et al. Threats and conservation strategies of the East Asian finless porpoises in China [J]. Journal of Applied Oceanography, 2024, 43(3): 597-606. doi: 10.3969/J.ISSN.2095-4972.20230601002
[2] Wang K W, Zhou K Y, Chen M M, et al. Beware of several problems in ex-situ protection of Yangtze finless porpoise [J]. Journal of Nanjing Normal University (Natural Science Edition), 2024, 47(2): 91-98.
[3] Hao Y J, Tang B, Mei Z G, et al. Further suggestions on conservation of the Yangtze finless porpoise based on retrospective analysis of the current progress [J]. Acta Hydrobiologica Sinica, 2024, 48(6): 1065-1072. doi: 10.7541/2024.2024.0020

[4] Zuffi S, Kanazawa A, Jacobs D W, et al. 3D Menagerie: modeling the 3D shape and pose of animals [C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. IEEE, 2017: 5524-5532.
[5] Rüegg N, Zuffi S, Schindler K, et al. BARC: learning to regress 3D dog shape from images by exploiting breed information [C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-24, 2022, New Orleans, LA, USA. IEEE, 2022: 3866-3874.
[6] Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models [J]. Advances in Neural Information Processing Systems, 2020(33): 6840-6851.
[7] Chan E R, Nagano K, Chan M A, et al. Generative novel view synthesis with 3D-aware diffusion models [C]. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), October 1-6, 2023, Paris, France. IEEE, 2023: 4217-4229.
[8] Watson D, Chan W, Martin-Brualla R, et al. Novel view synthesis with diffusion models [EB/OL]. arXiv: 2210.04628, 2022. https://arxiv.org/abs/2210.04628v1
[9] Melas-Kyriazi L, Laina I, Rupprecht C, et al. RealFusion: 360° reconstruction of any object from a single image [C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 17-24, 2023, Vancouver, BC, Canada. IEEE, 2023: 8446-8455.
[10] Liu R, Wu R, Van Hoorick B, et al. Zero-1-to-3: zero-shot one image to 3D object [C]. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), October 1-6, 2023, Paris, France. IEEE, 2023: 9264-9275.
[11] Deitke M, Schwenk D, Salvador J, et al. Objaverse: a universe of annotated 3D objects [C]. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 17-24, 2023, Vancouver, BC, Canada. IEEE, 2023: 13142-13153.
[12] Arampatzakis V, Pavlidis G, Mitianoudis N, et al. Monocular depth estimation: a thorough review [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(4): 2396-2414. doi: 10.1109/TPAMI.2023.3330944
[13] Hu K, Weng C, Zhang Y, et al. An overview of underwater vision enhancement: from traditional methods to recent deep learning [J]. Journal of Marine Science and Engineering, 2022, 10(2): 241. doi: 10.3390/jmse10020241
[14] Jaffe J S. Computer modeling and the design of optimal underwater imaging systems [J]. IEEE Journal of Oceanic Engineering, 1990, 15(2): 101-111. doi: 10.1109/48.50695
[15] Mobley C D. Light and Water: Radiative Transfer in Natural Waters [M]. Academic Press, 1994.
[16] Anwar S, Li C, Porikli F. Deep underwater image enhancement [EB/OL]. arXiv: 1807.03528, 2018. https://arxiv.org/abs/1807.03528
[17] Fu Z, Wang W, Huang Y, et al. Uncertainty inspired underwater image enhancement [C]. 2022 European Conference on Computer Vision (ECCV). Cham: Springer Nature Switzerland, 2022: 465-482.
[18] Wang Q, Wu B, Zhu P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks [C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 13-19, 2020, Seattle, WA, USA. IEEE, 2020: 11531-11539.
[19] Ranftl R, Bochkovskiy A, Koltun V. Vision transformers for dense prediction [C]. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), October 10-17, 2021, Montreal, QC, Canada. IEEE, 2021: 12159-12168.
[20] Saharia C, Chan W, Saxena S, et al. Photorealistic text-to-image diffusion models with deep language understanding [J]. Advances in Neural Information Processing Systems, 2022(35): 36479-36494.
[21] Rombach R, Blattmann A, Lorenz D, et al. High-resolution image synthesis with latent diffusion models [C]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-24, 2022, New Orleans, LA, USA. IEEE, 2022: 10674-10685.
[22] Schuhmann C, Beaumont R, Vencu R, et al. LAION-5B: an open large-scale dataset for training next generation image-text models [J]. Advances in Neural Information Processing Systems, 2022(35): 25278-25294.
[23] Radford A, Kim J W, Hallacy C, et al. Learning transferable visual models from natural language supervision [C]. Proceedings of the 38th International Conference on Machine Learning (ICML), July 18-24, 2021. PMLR 139: 8748-8763.
[24] Mildenhall B, Srinivasan P P, Tancik M, et al. NeRF: representing scenes as neural radiance fields for view synthesis [J]. Communications of the ACM, 2021, 65(1): 99-106.
[25] Ranftl R, Lasinger K, Hafner D, et al. Towards robust monocular depth estimation: mixing datasets for zero-shot cross-dataset transfer [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 44(3): 1623-1637.
[26] Shen T, Gao J, Yin K, et al. Deep marching tetrahedra: a hybrid representation for high-resolution 3D shape synthesis [J]. Advances in Neural Information Processing Systems, 2021(34): 6087-6101.
[27] Zhang R, Isola P, Efros A A, et al. The unreasonable effectiveness of deep features as a perceptual metric [C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-23, 2018, Salt Lake City, UT, USA. IEEE, 2018: 586-595.
[28] Wang Z, Bovik A C, Sheikh H R, et al. Image quality assessment: from error visibility to structural similarity [J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612. doi: 10.1109/TIP.2003.819861
[29] Liu M, Xu C, Jin H, et al. One-2-3-45: any single image to 3D mesh in 45 seconds without per-shape optimization [J]. Advances in Neural Information Processing Systems, 2024(36).

Publication history
  • Received: 2024-05-01
  • Revised: 2024-09-17
  • Available online: 2024-10-23
  • Published in issue: 2025-04-14
