<p>Urban development models typically provide simulated building areas in aggregated form. When such outputs are used to parametrize pluvial flood risk simulations in an urban setting, we need ways to characterize imperviousness and flood exposure. We develop data-driven approaches for establishing this link, focusing on the data resolutions and spatial scales that should be considered. We use regression models that link aggregated building areas to total imperviousness, and models that link aggregated building areas and simulated flood areas to flood damages. The data resolution used for training the regression models is shown to have a strong impact on identifiability: overly fine resolutions prevent identification of the link between building areas and hydrology, while overly coarse resolutions lead to uncertain parameter estimates. In our case study, the optimal data resolution for modelling imperviousness was found to be 400 m, whereas the data must be aggregated to a resolution of at least 1000 m when modelling flood damages. In addition, regression models for flood damages are more robust when the building data are considered at a coarser resolution of 200 m than at finer resolutions. The results suggest that aggregated building data can be used to derive realistic estimates of flood risk in screening simulations. Future work needs to focus on training regression approaches that can account for different degrees of flood-awareness in land-use management.</p>
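<p>The general idea of regressing imperviousness on building area at several data resolutions can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the 100 m base cell size, the slope of 1.8, the noise level, and the helper names <code>aggregate</code> and <code>fit_linear</code> are all assumptions made for illustration only.</p>

```python
import numpy as np


def aggregate(grid, factor):
    """Sum fine cells into coarser blocks of factor x factor cells.

    Rows and columns that do not fill a complete block are trimmed.
    """
    rows, cols = grid.shape
    rows -= rows % factor
    cols -= cols % factor
    trimmed = grid[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))


def fit_linear(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b, r2)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (residuals ** 2).sum() / ss_tot
    return coef[0], coef[1], r2


if __name__ == "__main__":
    # Synthetic example data (assumption): 40 x 40 cells of 100 m each.
    rng = np.random.default_rng(0)
    cell_size_m = 100
    cell_area = cell_size_m ** 2
    building = rng.uniform(0.0, 0.5, size=(40, 40)) * cell_area
    # Hypothetical relation: impervious area = 1.8 * building area + noise.
    noise = rng.normal(0.0, 0.15 * cell_area, size=building.shape)
    impervious = np.clip(1.8 * building + noise, 0.0, cell_area)

    # Refit the same model after aggregating to 100, 200, 400 and 1000 m.
    for factor in (1, 2, 4, 10):
        x = aggregate(building, factor)
        y = aggregate(impervious, factor)
        a, b, r2 = fit_linear(x, y)
        print(f"{factor * cell_size_m:5d} m  n={x.size:5d}  "
              f"slope={b:.3f}  R2={r2:.3f}")
```

<p>Comparing the fitted slope, the number of samples, and the goodness of fit across resolutions gives a feel for the trade-off described above. Which resolution is appropriate depends on the real data and cannot be inferred from this synthetic example.</p>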