Looking for advice on a food weight estimation model

I’m trying to get a model that estimates the weight of food from a single 2D image. I’m focusing only on fruits and vegetables for now.
So far I have experimented with image classifiers for food detection, although they don’t output bounding boxes. I tried jazzmacedo/fruits-and-vegetables-detector-36.

I found this article: Vision-Based Approach for Food Weight Estimation from 2D Images


I would like to recreate the high-level architecture of the proposed solution, but I need some help with how to approach it. I’m looking for opinions, tips, and references from someone with more experience. Given my lack of experience I might be in over my head, but I would still like to try. :cowboy_hat_face:

It seems to me there is no public food weight estimation model, and maybe not even a public food recognition model that outputs bounding boxes. I’m probably wrong. I’m not even sure whether I could use existing models and/or datasets as pieces of the puzzle for the article’s approach, or whether I would need to do everything from scratch. If the latter, where should I start? If the former, what might be a good fit?