ImageBind can apparently embed audio (and also thermal camera output, depth sensor output and IMU traces...?) into the same embedding space as Meme Search Engine's CLIP model. I could plausibly add a search-by-audio option.
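As a rough sketch of what search-by-audio would involve: if an audio clip's embedding really does land in the same space as the stored image embeddings, the query path is just nearest-neighbour search by cosine similarity, the same as for a text or image query. The code below assumes that is true, which only holds if the index was built with the CLIP variant ImageBind is aligned to. The names `search_by_embedding` and `index` are hypothetical, the vectors are toy values, and the step that actually produces the audio embedding is left out.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search_by_embedding(query, index, k=5):
    """Return the k (name, score) pairs from index most similar to query.

    `index` maps an item name to its precomputed embedding. `query` is
    assumed to come from an encoder sharing that embedding space
    (hypothetically, an audio encoder).
    """
    scored = [(cosine(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, score) for score, name in scored[:k]]
```

A brute-force scan like this is only for illustration; a real index would reuse whatever vector search the engine already has, with the audio embedding passed in as the query vector.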