One of my clients has an image gallery with brief English and French descriptions for each picture.
How can I use this dataset to demonstrate my data science abilities? What models can I create? Which algorithms can I test out?
What do the images contain? What is your client's business model?
He has been collecting random images he finds on the street, like furniture, people, and objects, with no real theme. He wants people to pick and choose the ones they like, download them, and maybe even pay for higher-resolution versions. I think he needs a tool to help people find new images, such as an advanced search or recommendations based on the ones they have already chosen.
This is a strange suggestion, but I’d be interested in seeing a qualitative comparison of the two feature spaces: CLIP embeddings to retrieve photos close to a text query, and, for example, DINOv2 embeddings to find related images (rough sketch below).
But obviously, as others have said, the ideas should come from your client.
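Something like this, as a rough sketch rather than anything production-ready. It assumes the images sit in a local folder and uses the openai/clip-vit-base-patch32 and facebook/dinov2-base checkpoints from Hugging Face; the gallery/ path and the load_gallery_images helper are placeholders for whatever your client actually has:

```python
# Sketch: CLIP for text -> image search, DINOv2 for image -> image similarity.
from pathlib import Path

import torch
import torch.nn.functional as F
from PIL import Image
from transformers import AutoImageProcessor, AutoModel, CLIPModel, CLIPProcessor

def load_gallery_images(folder: str) -> tuple[list[str], list[Image.Image]]:
    """Load every JPEG in a folder (hypothetical gallery layout)."""
    paths = sorted(Path(folder).glob("*.jpg"))
    return [str(p) for p in paths], [Image.open(p).convert("RGB") for p in paths]

paths, images = load_gallery_images("gallery/")

# --- Text-to-image search with CLIP ---
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

with torch.no_grad():
    img_emb = clip.get_image_features(**clip_proc(images=images, return_tensors="pt"))
    txt_emb = clip.get_text_features(**clip_proc(text=["an old wooden chair"],
                                                 return_tensors="pt", padding=True))

img_emb = F.normalize(img_emb, dim=-1)
txt_emb = F.normalize(txt_emb, dim=-1)
scores = txt_emb @ img_emb.T                    # cosine similarity, shape (1, n_images)
print("Best CLIP match:", paths[scores.argmax().item()])

# --- Image-to-image similarity with DINOv2 ---
dino = AutoModel.from_pretrained("facebook/dinov2-base")
dino_proc = AutoImageProcessor.from_pretrained("facebook/dinov2-base")

with torch.no_grad():
    # CLS token as a global image descriptor
    feats = dino(**dino_proc(images=images, return_tensors="pt")).last_hidden_state[:, 0]

feats = F.normalize(feats, dim=-1)
sims = feats @ feats.T                          # pairwise cosine similarities
query = 0                                       # index of the image a visitor clicked
neighbours = sims[query].topk(4).indices[1:]    # drop the image itself
print("DINOv2 neighbours of", paths[query], "->", [paths[i.item()] for i in neighbours])
```

The same embeddings would also drive the search and recommendation features you mentioned: precompute them once, store them, and do nearest-neighbour lookups at query time.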
Honestly, with AI, anyone can upscale and enhance the low-res images they download these days.
As far as what you could train, you could use all of these inputs to fine-tune a Flux model; it would then generate photos like the ones he has taken.
You could also use it to fine-tune one of the image-captioning or chat-with-image (vision-language) models.
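If you want to try captioning before training anything, here is a minimal inference sketch, assuming the Salesforce/blip-image-captioning-base checkpoint and a made-up gallery/0001.jpg path. It just generates a caption you could compare against the English descriptions you already have; fine-tuning on your client's own captions would be the next step:

```python
# Sketch: off-the-shelf captioning with BLIP, for comparison against existing descriptions.
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("gallery/0001.jpg").convert("RGB")   # hypothetical file name
inputs = processor(images=image, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(out[0], skip_special_tokens=True))
```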
I am indeed interested in learning about image captioning. Do you think there is also something interesting to do with the text descriptions we already have?