The project began with hours of recorded video of birds at the fishpond, captured by the Nobify team from various angles, cameras, times of day, and areas of the pond. To prepare the data for labeling, we extracted bird images from the footage and had them classified into categories such as swimming birds, distant birds, flying birds, standing birds, and birds in groups.

Once labeling was complete and we had a well-organized dataset, we began training our AI model, followed by rigorous testing to ensure it met the required performance and accuracy benchmarks. We tested different configurations and datasets using both YOLOv8 and R-CNN, but none of them met our benchmarks: small, distant birds occupy only a handful of pixels in a full frame, and full-frame detectors consistently missed them.

To overcome these challenges, especially given the edge device's limited computational power, we implemented SAHI (Slicing Aided Hyper Inference). This approach detects smaller objects by slicing the image into overlapping tiles, running inference on each slice, and merging the results back onto the full frame. SAHI significantly improved small-object detection where the traditional full-frame models had failed. With this breakthrough, we not only passed our benchmark with flying colors, but are also ready to move on to full implementation and integration of the AI model into the edge device, ensuring optimal performance in real-world scenarios.
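To make the slicing step concrete, here is a minimal sketch of sliced inference using the open-source sahi Python package with a YOLOv8 checkpoint. The weight and image file names, slice sizes, and confidence threshold are illustrative assumptions, not our production settings:

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap a trained YOLOv8 model so SAHI can run it on each slice.
# "bird_detector.pt" is a hypothetical path to trained weights.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="yolov8",
    model_path="bird_detector.pt",
    confidence_threshold=0.4,
    device="cpu",  # assumption: a CPU-only edge device
)

# Slice the frame into 512x512 tiles with 20% overlap, run inference
# on every tile, and merge the per-slice detections into full-frame
# coordinates. "pond_frame.jpg" is a hypothetical input frame.
result = get_sliced_prediction(
    "pond_frame.jpg",
    detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

# Each prediction carries a class label, a score, and a bounding box
# already mapped back onto the original image.
for pred in result.object_prediction_list:
    print(pred.category.name, pred.score.value, pred.bbox.to_xyxy())
```

The overlap between neighboring tiles matters here: a bird that falls on a tile border still appears whole in at least one slice, and SAHI's post-processing merges the duplicate boxes produced by adjacent slices.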