The screenshots we use as reference when describing software contain a vast amount of contextual information. If Eddy could analyze the surrounding screenshots in addition to the text when generating responses, I think the quality of the AI output would improve greatly. Many textual descriptions could also become much shorter, or even unnecessary, when a screenshot already shows the context clearly.