Crowdsourced software testing has proven capable of detecting many bugs and simulating real usage scenarios, which makes it popular in mobile-application testing. However, in mobile testing, test reports often consist of only a few screenshots and a short text description. Inspecting and understanding the overwhelming number of mobile crowdsourced test reports therefore becomes a time-consuming yet unavoidable task. The paucity and potential inaccuracy of the textual information, together with the well-defined screenshots of activity views within mobile applications, motivate us to assist developers in understanding crowdsourced test reports by automatically describing the screenshots. To this end, we propose a fully automatic technique that generates descriptive words for these well-defined screenshots. We build language models from the test reports written by professional testers, and we use a computer-vision technique, Spatial Pyramid Matching (SPM), to extract features from the screenshot images and measure their similarities. Experimental results on more than 1,000 test reports from four industrial crowdsourced projects show that the proposed technique is a promising aid for developers in understanding mobile crowdsourced test reports.
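For concreteness, the sketch below illustrates one way an SPM-style similarity between two screenshots could be computed. It is not the implementation used in this work: it substitutes simple per-cell intensity histograms for the visual-word features commonly paired with SPM, and the function names and parameters are illustrative only.

```python
import numpy as np

def spm_features(image, levels=3, bins=16):
    """Spatial-pyramid descriptor from per-cell intensity histograms.

    image: 2-D numpy array (grayscale screenshot) with values in [0, 255].
    levels: pyramid depth; level l splits the image into 2**l x 2**l cells.
    """
    h, w = image.shape
    features = []
    for level in range(levels):
        cells = 2 ** level
        # Standard SPM weighting: coarser levels contribute less.
        if level == 0:
            weight = 1.0 / (2 ** (levels - 1))
        else:
            weight = 1.0 / (2 ** (levels - level - 1))
        for i in range(cells):
            for j in range(cells):
                cell = image[i * h // cells:(i + 1) * h // cells,
                             j * w // cells:(j + 1) * w // cells]
                hist, _ = np.histogram(cell, bins=bins, range=(0, 255))
                hist = hist / (hist.sum() + 1e-9)  # normalize each cell histogram
                features.append(weight * hist)
    return np.concatenate(features)

def spm_similarity(img_a, img_b):
    """Histogram-intersection kernel between two pyramid descriptors."""
    fa, fb = spm_features(img_a), spm_features(img_b)
    return float(np.minimum(fa, fb).sum())
```

In this sketch, a higher `spm_similarity` score would indicate that two screenshots share a more similar spatial layout, which is the property SPM exploits when matching a crowdsourced screenshot against screenshots whose descriptive words are already known.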