Abstract:
Thunnus alalunga (albacore) is a highly migratory oceanic fish, widely distributed in the Pacific, Atlantic, and Indian Oceans and the Mediterranean Sea. Improving the accuracy of fishery location and catch predictions for Pacific albacore not only enhances the efficiency of deep-sea fishing, but also helps measure the degree of saturation of the albacore fishery and provides theoretical support for sustainable fishery development. Based on longline fishing data for albacore in the Pacific (120°E to 80°W, 45°S to 45°N) from 2000 to 2021, 16 environmental factors of varying shape and size were chosen: month, longitude, latitude, sea water potential temperature (ST), sea water salinity (SS), chlorophyll a, dissolved iron (Fe), primary production (PP), dissolved molecular oxygen (DO), pH, surface partial pressure of carbon dioxide (SPCO2), phytoplankton expressed as carbon (PHYC), ocean mixed layer thickness (MLD), sea surface height (SSH), eastward wind (EW), and northward wind (NW). We propose a novel Multiple Channels Single Regression (MCSR) module built from stacked convolutional operators with residual structures and fully connected layers. The module is divided into three components: the “root” component derives feature maps from the various environmental factors, processing each factor in the sample differently; the “bulk” component concatenates the feature maps from the root and extracts features describing the whole sample; and the “head” component computes the likelihood of a
T. alalunga fishery location and the expected catch. To evaluate the module, we employed SHAP (SHapley Additive exPlanations) to calculate the contribution of each environmental factor, using the additive contributions of the factors to reveal the relationships between factors and catch. Compared with traditional fishery forecasting models, including Random Forest, XGBoost, Generalized Additive Model, SVM, Long Short-Term Memory network, and BP neural network, our module achieved the best performance, with MSE, RMSE, and MAE values of 0.00322, 0.0567, and 0.0272, respectively, outperforming the other models by 3.9% to 82.6%. The MCSR module thus outperformed statistical and ensemble learning models as well as other deep learning models. The SHAP summary, aggregated across factors and locations, revealed that the module effectively learns potential relationships between factors and fishery catches while minimizing redundancy and noise. From a modeling perspective, this approach adapts well to heterogeneous data inputs, enabling end-to-end learning and accurate forecasting of fishery locations and catches from various environmental factors. From a biological perspective, this study introduces a new method that combines deep learning modules with explainability approaches and can be used to study relationships between species and environmental factors.
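
For readers less familiar with the three error metrics reported above, the following is a minimal sketch of how MSE, RMSE, and MAE are conventionally computed. The sample values in `y_true` and `y_pred` are hypothetical and are not taken from the study; the function names are illustrative only. The final lines check that the reported RMSE (0.0567) agrees with the square root of the reported MSE (0.00322), as the definitions require.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical normalized catch values, for illustration only (not study data).
y_true = [0.10, 0.00, 0.25, 0.40]
y_pred = [0.12, 0.05, 0.20, 0.35]

scores = {
    "MSE": mse(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAE": mae(y_true, y_pred),
}
for name, value in scores.items():
    print(f"{name}: {value:.6f}")

# Consistency check on the figures reported in the abstract:
# RMSE should equal sqrt(MSE) up to rounding.
reported_mse, reported_rmse = 0.00322, 0.0567
rmse_from_mse = math.sqrt(reported_mse)
print(f"sqrt(reported MSE) = {rmse_from_mse:.4f}")
```

Because RMSE is expressed in the same units as the catch variable while MSE is in squared units, the two are usually reported together, with MAE added as a measure that is less sensitive to large individual errors.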