EnvyRState

EnvyRState is the combination of two main concepts.

  • The first one: “Enviousness Real Estate market”, which refers to a perfect and innovative market where you can get all the information you need.

  • The second one: “RStudio”, the development environment where our team works.

Our multidisciplinary team brings together members from different areas and backgrounds. Economics, Statistics and Computer Science are the main ones, and combining them makes it possible to understand their knowledge and adapt it to the real estate market.

The main problem is that information on the real estate market is either quite heterogeneous or missing. Moreover, the market's behaviour changes constantly, which makes these changes very hard for professionals to evaluate. Our first objective is therefore to provide a platform that collects and homogenises all this dispersed information and then transforms it into useful knowledge.

Our app explores a selected set of indicators that can potentially inform on the state of the housing market. To this end, we split the approach into five dimensions: Housing conditions, Housing prices, Construction, Mortgages and Macroeconomy. Each dimension involves several databases from different sources, most of them closely related to providing insights about a possible housing bubble.

The main objective of this application is to offer a complete tool for interactively visualising and analysing different effects on the housing market. The tool gives the general public, experts, investors and even policymakers the information they need to evaluate, analyse and draw conclusions about changes in house prices, rental prices, different macroeconomic aspects, the evolution of construction, or whether a country is more prone to owning than renting, among others. The aim is to answer different questions related to the housing market, giving users enough information to determine the cause of a phenomenon and helping them make predictions and take action.

If this is your first time here, please visit the support section in the sidebar menu.

Team

Alejandro Zornoza Martínez

Alejandro

Rocío Gutiérrez López

Rocio

Manuel Alfaro García

Manu

Emilio López Cano

Emilio

Supported by:

Powered by

Dimensions Overview

Dimensions

We have selected more than 40 databases from different open data sources. Instead of taking only conventional datasets concerning prices, we identified five dimensions related to the real estate market: prices, living conditions, construction sector, mortgages, and macroeconomy. The application lets users gather, combine, explore, visualise, and analyse relevant open data in order to create value for real estate market stakeholders.

Check the list of datasets and their metadata here.

41 Databases

5 Dimensions +1

8 Data sources

Databases summary

Housing conditions Dimension Explorer

Selection

Manu: select database, variables and factors; Alejandro: write in support; Rocío: write in support or db

Hit play button

Housing prices Dimension Explorer

Selection

Hit play button

Construction Dimension Explorer

Selection

Mortgages Dimension Explorer

Selection

Macroeconomy Dimension Explorer

Selection

Regions Dimension Explorer

Selection

EnvyRState Report

Select the contents for your report and download.

EnvyRState Technical details

Data gathering and preprocessing

  • Python to get the data and transform it into the consistent EnvyRState format

  • From the metadata prepared by experts
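
As an illustration of this step, here is a minimal Python sketch of how a metadata entry could drive the conversion of a raw table into one consistent long format. The metadata keys, the column names and the sample values are assumptions made up for this example; they are not the actual EnvyRState format or real figures.

```python
import csv
import io

# Hypothetical metadata entry, as an expert might prepare it for one source file.
METADATA = {
    "geo_column": "country",
    "indicator": "house_price_index",
    "missing_markers": {":", "", "NA"},
}

# Made-up raw table in wide layout: one column per year.
RAW_CSV = """country,2017,2018
ES,100.0,106.7
FR,100.0,:
"""


def to_long_format(raw_text, metadata):
    """Turn a wide table (one column per year) into long records."""
    records = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        geo = row[metadata["geo_column"]]
        for column, cell in row.items():
            if column == metadata["geo_column"]:
                continue
            cell = cell.strip()
            # Source-specific missing markers become None in the common format.
            value = None if cell in metadata["missing_markers"] else float(cell)
            records.append({
                "geo": geo,
                "time": int(column),
                "indicator": metadata["indicator"],
                "value": value,
            })
    return records


long_records = to_long_format(RAW_CSV, METADATA)
```

Keeping the source-specific details (which column holds the geography, how missing values are marked) in the metadata rather than in the code is what would let non-programmers describe a new source.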

Data structures

  • Consistent data structure across data sources.
  • Quite similar to EUROSTAT data
  • Allows the combination of different sources in order to provide richer insights
  • Metadata in Excel files, easy to use by non-computer scientists
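
To show why a consistent structure makes sources easy to combine, the following Python sketch merges long-format records from two sources on a shared (geo, time) key. The record layout, the indicator names and the numbers are illustrative assumptions, not the real EnvyRState schema or data.

```python
def combine_sources(*sources):
    """Merge long-format records from several sources by (geo, time)."""
    combined = {}
    for records in sources:
        for rec in records:
            key = (rec["geo"], rec["time"])
            # Each indicator becomes one entry under the shared key.
            combined.setdefault(key, {})[rec["indicator"]] = rec["value"]
    return combined


# Made-up records from two hypothetical sources.
prices = [
    {"geo": "ES", "time": 2018, "indicator": "house_price_index", "value": 106.7},
    {"geo": "FR", "time": 2018, "indicator": "house_price_index", "value": 103.0},
]
mortgages = [
    {"geo": "ES", "time": 2018, "indicator": "mortgage_rate", "value": 2.0},
]

merged = combine_sources(prices, mortgages)
```

Because every source uses the same keys, adding another dimension only means passing one more list of records.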

R & Shiny

  • Exploratory Data analysis
  • Model fitting
  • Data Science

We use the free version of Shiny Server. In a production setting, the Pro version should be used to improve performance and reliability.

Visualizations

  • Interactive, awesome maps with Leaflet
  • Interactive plots with Plotly
  • Dynamic visualizations with GoogleVis

Fog computing approach

  • Static insights given the analyst expertise

  • Dynamic data exploration, from local databases

  • Data update and time-consuming model fitting as different processes

Work in progress

  • Automate data gathering and preprocessing

  • Extend to other domains beyond Real Estate: Energy, Agrifood, Health, etc.

  • Add config options and (further) customise visualizations using them

EnvyRState configuration options


EnvyRState Frequently Asked Questions

This section aims to answer typical questions that users may ask themselves.

  • When did the project development begin?

We started the project the day it was announced that we had been selected as finalists in the EU Datathon 2019.

  • Which project management process has been followed?

Time was very limited, so we followed an agile development process based on the SCRUM philosophy, dividing the project into minimal increments. This has allowed us to keep the application in continuous build since the first version was deployed. For project management we use Microsoft Teams with our academic licenses, and Trello for task planning.

  • Why use R instead of other languages such as Python, which is the most widely used in the state of the art?

Most of the team members have advanced knowledge of programming in R, so we took advantage of it to reduce the time needed to program some parts. The other team member works in Python and used pickle to serialise objects and transfer them to the R environment.

  • What can I do if I cannot find the information I am looking for?

One of our goals is to put as much information as possible at your fingertips. You can easily contact us and describe what you are looking for. We will study your proposal and add it as soon as possible.

  • How is the data controlled?

For each database, we define and create the following information: database title and description, URLs for the source and the metadata, dimension of the database, type of missing values, and the descriptions of all variables with their standardisation codes. We also record the geographic level and the type of time period.
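
A per-database record of this kind could be sketched in Python as a small dataclass. The field names below are assumptions derived from the list above, and the example entry is invented; the actual metadata lives in Excel files whose layout is not shown here.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseMetadata:
    """Hypothetical metadata record for one database."""
    title: str
    description: str
    source_url: str
    metadata_url: str
    dimension: str
    missing_value_marker: str
    geography_level: str
    time_period_type: str
    variables: dict = field(default_factory=dict)  # variable code -> description

    def missing_fields(self):
        """Return the names of fields that were left empty."""
        return [name for name, value in vars(self).items()
                if value in ("", None, {})]


# Invented example entry, not a real EnvyRState record.
example = DatabaseMetadata(
    title="House price index",
    description="Illustrative entry",
    source_url="https://example.org/data",
    metadata_url="",
    dimension="Housing prices",
    missing_value_marker=":",
    geography_level="country",
    time_period_type="annual",
    variables={"HPI": "House price index"},
)

incomplete = example.missing_fields()
```

A check such as `missing_fields` is one simple way to flag entries that still need attention before a database is added to the system.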

  • What is the process followed to include new dimensions and databases?

We have built a data mining pipeline for obtaining knowledge about the real estate market, which follows these steps.

  1. We define a real estate dimension to address.

  2. We search for the relevant information about this dimension.

  3. We obtain the relevant databases.

  4. We carry out a descriptive analysis and a standardisation process on the data.

  5. We add the dimension to the system.

  • What is the status of the application?

For the EU Datathon 2019 we have created the first version of our application, based on R Shiny Server, as a minimum viable product that gives us feedback on the line of work we are following. The databases come from different platforms such as central banks or Eurostat, but we know that not all the information is there. Next, we will create triggers to keep the information from those sources up to date while we gather more complex information, especially for areas smaller than country level.

  • Could there be some problems in the future?

Obviously, there will be several problems with the curse of dimensionality at the server level. However, we have designed the system so that it can auto-scale in a multi-layer architecture, which lets us migrate it to other platforms. One of the main objectives after the feedback received from the EU Datathon experts is to migrate the server to Amazon Web Services, and the benefit of this type of architecture is that we can add new modules without problems. Load balancers are fundamental if we want to give our users good response times, especially when the data to load or process approaches big data scale.

  • What is the immediate future of the application?

In summary, and with the first proposed objectives in hand, we have planned the future of our application around these points:

  1. Evaluate the feedback of the experts of EU Datathon.

  2. Consider the possibility of collaborating with other teams, institutions or experts in the area.

  3. Seek partners and obtain European funding.

  4. Define a new line of work.

  5. Expand the team to include a member who advises on law and other aspects related to intellectual property.

  • What types of sources are used?

Only publicly accessible sources are used, in accordance with Directive 2013/37/EU of the European Parliament and of the Council of 26 June 2013 amending Directive 2003/98/EC on the re-use of public sector information (Text with EEA relevance). For the rest of the sources, we have the necessary permissions. Note that this app analyses only open data; a more refined app and better conclusions would require additional data, for example private surveys.

  • How can I download graphics and information?

All graphics can be downloaded as .png files from the corresponding graphic interface. In addition, each dimension has a report section where the user can download the insights and other relevant information in .docx format, which is useful for making results reproducible. In the lab, the user can choose different databases and parameters and download that work too.

  • Why analyse the real estate market?

Real estate is one of the most important sectors in the economy. Changes in house prices can have considerable effects on the rest of the economy because there is a positive correlation between fluctuations in house prices and in economic activity.

  • What could be the economic significance of the application?

Researchers and politicians increasingly consider that real estate markets play a decisive role in the transmission of monetary policy impulses, and house price indicators are among those most closely monitored by policymakers. This app could therefore be very useful for them to compare different housing market indicators across Europe interactively and draw useful conclusions.

  • What is our TRL?

We are at TRL 3: experimental proof of concept.

  • What kinds of graphs are used?

We use line graphs because they provide a chronological perspective: the state of the market can be characterised by the length of its upturns or downturns, in particular in comparison with the average duration of such phases. Moreover, we use rankings to highlight the indicators in some countries, and heat maps that provide a visual image to detect potential regions.

Data Science Discovery

Disclaimer: Make decisions at your own risk. This is a tool to provide insights. Consult a statistician for model validation.

Selection
