Monocular depth estimation with Keras 3

Author: Victor Basu
Date created: 2021/08/30
Last modified: 2024/08/13
Description: Implement a depth estimation model with a convnet.

(i) This example uses Keras 3

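Introduction

Depth estimation is a crucial step towards inferring scene geometry from 2D images. The goal of monocular depth estimation is to predict a depth value for each pixel, given only a single RGB image as input. This example shows one approach to building a depth estimation model with a convnet and simple loss functions.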

Setup

import os

os.environ["KERAS_BACKEND"] = "tensorflow"

import sys

import tensorflow as tf
import keras
from keras import layers
from keras import ops
import pandas as pd
import numpy as np
import cv2
import matplotlib.pyplot as plt

keras.utils.set_random_seed(123)

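Downloading the dataset

For this tutorial we use DIODE: A Dense Indoor and Outdoor Depth Dataset. Rather than the full training set, we use the dataset's validation set to generate the training and evaluation subsets for our model. The reason is that the original training set contains 81GB of data, which is challenging to download compared to the validation set, which is only 2.6GB. Other datasets you could use include NYU-v2 and KITTI.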

annotation_folder = "/dataset/"
if not os.path.exists(os.path.abspath(".") + annotation_folder):
    annotation_zip = keras.utils.get_file(
        "val.tar.gz",
        cache_subdir=os.path.abspath("."),
        origin="http://diode-dataset.s3.amazonaws.com/val.tar.gz",
        extract=True,
    )

Downloading data from http://diode-dataset.s3.amazonaws.com/val.tar.gz

2774625282/2774625282 ━━━━━━━━━━━━━━━━━━━━ 205s 0us/step

Preparing the dataset

We only use the indoor images to train our depth estimation model.

path = "val/indoors"

filelist = []

for root, dirs, files in os.walk(path):
    for file in files:
        filelist.append(os.path.join(root, file))

filelist.sort()
data = {
    "image": [x for x in filelist if x.endswith(".png")],
    "depth": [x for x in filelist if x.endswith("_depth.npy")],
    "mask": [x for x in filelist if x.endswith("_depth_mask.npy")],
}
df = pd.DataFrame(data)

df = df.sample(frac=1, random_state=42)
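
The list comprehensions above rely on sorted order to line up each image with its depth and mask files. As a sanity check, one could group the files by their shared stem instead. The sketch below is an illustration only and not part of the original example: `pair_by_stem` is a hypothetical helper, and the file names are made up.

```python
# Hypothetical helper: group files by shared stem so every row is a complete triple.
SUFFIXES = {"image": ".png", "depth": "_depth.npy", "mask": "_depth_mask.npy"}


def pair_by_stem(paths):
    groups = {}
    for p in paths:
        for key, suffix in SUFFIXES.items():
            if p.endswith(suffix):
                # Strip the suffix to recover the stem shared by the three files
                groups.setdefault(p[: -len(suffix)], {})[key] = p
                break
    # Keep only stems that have an image, a depth map, and a mask
    complete = [groups[s] for s in sorted(groups) if len(groups[s]) == len(SUFFIXES)]
    return {key: [g[key] for g in complete] for key in SUFFIXES}


# Made-up file names; the second scene is missing its mask and is dropped
example = [
    "scene/0001.png",
    "scene/0001_depth.npy",
    "scene/0001_depth_mask.npy",
    "scene/0002.png",
    "scene/0002_depth.npy",
]
paired = pair_by_stem(example)
print(paired["image"])  # ['scene/0001.png']
```

The returned dictionary has the same keys as `data`, so it could be passed to `pd.DataFrame` in the same way.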

Preparing hyperparameters

HEIGHT = 256
WIDTH = 256
LR = 0.00001
EPOCHS = 30
BATCH_SIZE = 32
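
To see how `BATCH_SIZE` interacts with the number of samples, the following sketch computes the number of batches per epoch and the size of the final, possibly shorter, batch. The sample count of 260 is an assumed figure chosen for illustration, not a property stated for the dataset.

```python
import math

BATCH_SIZE = 32
num_samples = 260  # hypothetical sample count, for illustration only

# Number of batches needed to cover every sample once
steps_per_epoch = math.ceil(num_samples / BATCH_SIZE)
# Size of the final batch, which is shorter when the count is not a multiple
last_batch_size = num_samples - (steps_per_epoch - 1) * BATCH_SIZE

print(steps_per_epoch, last_batch_size)  # 9 4
```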

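Building a data pipeline

The pipeline takes a DataFrame containing the paths of the RGB images, as well as the depth and depth mask files.
It reads and resizes the RGB images.
It reads the depth and depth mask files, processes them to generate the depth map image, and resizes it.
It returns the RGB images and the depth map images for a batch.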
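
To make the depth-processing step more concrete, here is a minimal NumPy-only sketch of turning a raw depth array and its validity mask into a resized target. It is an illustration under stated assumptions, not the pipeline's actual code: `make_depth_target` is a hypothetical helper, and the 99th-percentile cap, the hard cap of 300, the log scaling, and the nearest-neighbour resize are all assumed choices. It also assumes the mask marks at least one valid pixel.

```python
import numpy as np


def make_depth_target(depth, mask, out_hw=(4, 4), min_depth=0.1, max_depth_cap=300.0):
    """Hypothetical helper: raw depth + validity mask -> resized log-depth map."""
    depth = np.squeeze(depth).astype("float32")
    valid = np.squeeze(mask) > 0
    # Cap far values at the 99th percentile of valid depths (assumed choice)
    max_depth = min(max_depth_cap, float(np.percentile(depth[valid], 99)))
    depth = np.clip(depth, min_depth, max_depth)
    # Log-scale the depth; invalid pixels fall back to log(min_depth)
    log_depth = np.where(valid, np.log(depth), np.log(min_depth))
    # Nearest-neighbour resize by index sampling (a stand-in for an image library)
    rows = np.arange(out_hw[0]) * depth.shape[0] // out_hw[0]
    cols = np.arange(out_hw[1]) * depth.shape[1] // out_hw[1]
    resized = log_depth[np.ix_(rows, cols)]
    # Add a trailing channel axis so the target has shape (height, width, 1)
    return resized[..., np.newaxis].astype("float32")


# Synthetic inputs standing in for a "_depth.npy" and "_depth_mask.npy" pair
rng = np.random.default_rng(0)
raw_depth = rng.uniform(0.5, 10.0, size=(8, 8, 1))
raw_mask = np.ones((8, 8))
raw_mask[0, :] = 0  # mark the first row as invalid

target = make_depth_target(raw_depth, raw_mask)
print(target.shape)  # (4, 4, 1)
```

The `DataGenerator` class defines the processing this example actually uses; the sketch only shows the general shape of the transformation.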

class DataGenerator(keras.utils.PyDataset):
    def __init__(self, data, batch_size=6, dim=(768, 1024), n_channels=3, shuffle=True):
        """
        Initialization
        """
        super().__init__()
        self.data = data
        self.indices = self.data.index.tolist()
        self.dim = dim
        self.n_channels = n_channels
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.min_depth = 0.1
        self.on_epoch_end()

    def __len__(self):
        return int(np.ceil(len(self.data) / self.batch_size))

    def __getitem__(self, index):
        if (index + 1) * self.batch_size > len(self.indices):
            self.batch_size = len(self.indices) - index * self.batch_size
        # Generate one batch of data
        # Select this batch's positions from the (optionally shuffled) order in self.index
        index = self.index[index * self.batch_size : (index + 1) * self.batch_size]
        # Map positions to DataFrame index labels
        batch = [self.indices[k] for k in index]
        x, y = self.data_generation(batch)

        return x, y

    def on_epoch_end(self):
        """
        Updates indexes after each epoch
        """
        self.index = np.arange(len(self.indices))
        if self.shuffle:
            np.random.shuffle(