Monday, March 12, 2018

Mauritius Heat Map for Real Estate Prices per m2 using Python

Yesterday I decided to see whether it was possible to create a heat map of real estate prices per m2 in Mauritius, covering apartments, houses and villas.

Since there is no readily available dataset, I collected prices from popular online real estate sources so that I could plot them on a map of Mauritius.

I used only Python and a Jupyter notebook for the whole process of collecting, analysing and plotting the data.

Gathering data from web pages is called web scraping. I will not share the code for that part: if you are not careful, sending too many requests too quickly can amount to a DoS attack against the websites, and you don't want that.
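One common way to stay careful is to pause between requests. The snippet below is only a sketch of that idea, not the code I used: fetch_politely, its parameters and the delay values are hypothetical, and the actual download function is passed in by the caller so nothing here targets any website.

```python
import random
import time


def fetch_politely(urls, fetch, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing a random interval between calls.

    fetch is any callable that takes a URL and returns its content.
    The delay bounds (in seconds) are illustrative assumptions.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            # Pause before every request except the first one
            sleep(random.uniform(min_delay, max_delay))
        pages[url] = fetch(url)
    return pages
```

Spreading requests out like this keeps the load on the site close to that of a normal visitor.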

I used beautifulsoup4 to parse the retrieved HTML and extract just the data I was interested in, which I then loaded into pandas dataframes.

The first analysis of the data showed the following:



The price in Rs per m2 is lower in the center than in the North, and it is much more expensive than in the East of the island. Note, however, that these are averages per area, computed from 4000+ records.
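To make the averaging step concrete, here is a small sketch in plain Python. It assumes each record carries an area name, a price in Rs and a floor area in m2; the sample listings and the function name average_price_per_m2 are made up for illustration and are not my real data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample listings: (area, price in Rs, floor area in m2)
listings = [
    ("Area A", 4_000_000, 100),
    ("Area A", 6_000_000, 120),
    ("Area B", 9_000_000, 100),
]


def average_price_per_m2(listings):
    """Return the mean price per m2 for each area."""
    per_area = defaultdict(list)
    for area, price, size_m2 in listings:
        if size_m2 > 0:  # skip records with a missing or zero floor area
            per_area[area].append(price / size_m2)
    return {area: mean(values) for area, values in per_area.items()}


averages = average_price_per_m2(listings)
```

With the data already in a pandas dataframe, a groupby on the area column followed by mean() does the same job in one line.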

Below are the regions with the lowest prices per m2 in Mauritius:

And here are the regions with the highest prices per m2 in Mauritius:




Note that there are over 150 regions, so I cannot show the full list, but this gives a good enough indication of prices, with the coastal regions obviously being more expensive.
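Picking out the cheapest and most expensive regions from a table of averages can be sketched as follows. The region names and prices here are placeholders, not my results, and the helper names are hypothetical.

```python
import heapq

# Hypothetical averages in Rs per m2; the real table has over 150 regions
avg_price_m2 = {
    "Region A": 30000,
    "Region B": 95000,
    "Region C": 52000,
    "Region D": 120000,
}


def cheapest(avg, n):
    """Return the n (region, price) pairs with the lowest price per m2."""
    return heapq.nsmallest(n, avg.items(), key=lambda kv: kv[1])


def priciest(avg, n):
    """Return the n (region, price) pairs with the highest price per m2."""
    return heapq.nlargest(n, avg.items(), key=lambda kv: kv[1])
```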

Then I used the Google Maps API to get the coordinates for each region name. The code for this is the function geodataMapper below. It also handles a problem you run into when making many successive calls to the Google Maps API (over 150 in this case): at some point you get an exception saying something like "too many retries". When that happens, the function waits a few seconds and tries the request once more:
import requests
from random import randint
from time import sleep

def geodataMapper(location):
    GOOGLE_MAPS_API_URL = 'https://maps.googleapis.com/maps/api/geocode/json'
    params = {
        'address': location + ', Mauritius',
        'sensor': 'false',
        'region': 'mu',  # region bias expects a ccTLD code; 'mu' is Mauritius
        'key': 'YOUR_GOOGLE_API_KEY_HERE'
    }
    sleepTimeMin = 8
    sleepTimeMax = 15
    # Do the request and get the response data
    try:
        req = requests.get(GOOGLE_MAPS_API_URL, params=params)
    except Exception:
        # Wait a random 8 to 15 seconds, then retry once
        print("Exception occurred for :" + location + " retrying.... ")
        sleep(randint(sleepTimeMin, sleepTimeMax))
        req = requests.get(GOOGLE_MAPS_API_URL, params=params)
    res = req.json()
    geodata = dict()
    # Use the first result; geodata stays empty if nothing was found
    if len(res['results']):
        result = res['results'][0]
        geodata['lat'] = result['geometry']['location']['lat']
        geodata['lng'] = result['geometry']['location']['lng']
        geodata['address'] = result['formatted_address']
    return geodata
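Note that geodataMapper retries only once, so a second failure in a row still raises. If that happens often, one option is a bounded retry loop. The helper below is a sketch of that idea rather than part of my notebook: call_with_retries and its parameters are hypothetical, and the wait range simply reuses the 8 to 15 seconds from above.

```python
import random
import time


def call_with_retries(func, max_attempts=4, min_wait=8, max_wait=15, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    Waits a random number of seconds between attempts and re-raises
    the last exception if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            wait = random.randint(min_wait, max_wait)
            print(f"Attempt {attempt} failed ({exc}); retrying in {wait}s")
            sleep(wait)

# Possible use inside geodataMapper (not run here):
# req = call_with_retries(lambda: requests.get(GOOGLE_MAPS_API_URL, params=params))
```

Capping the number of attempts means a permanently failing region stops the run with a clear error instead of looping forever.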


To retrieve the latitude and longitude for all the regions, I wrote the following code. It iterates through a pandas dataframe containing the region and price per m2 and calls geodataMapper above for each region. At the end it creates a csv file containing region, price per m2, latitude and longitude:

import pandas as pd

# region_price_df is the dataframe built earlier, with "Region" and "Price per m2" columns
latitudes = []
longitudes = []
regions = []
prices_m2 = []
for index, row in region_price_df.iterrows():
    print(row["Region"], row["Price per m2"])
    # Do mapping of lat and long for specific region
    geodata_region = geodataMapper(row["Region"])
    # Only fill the lists if there is mapping data available
    if 'lat' in geodata_region:
        regions.append(row["Region"])
        prices_m2.append(row["Price per m2"])
        latitudes.append(geodata_region['lat'])
        longitudes.append(geodata_region['lng'])
heatmap_df = pd.DataFrame({'latitude': latitudes,
                           'longitude': longitudes,
                           'region': regions,
                           'price_m2': prices_m2})
heatmap_df.to_csv('heat_map.csv', encoding='utf-8-sig')

Once you have this data you can install gmaps, which is awesome for creating basic heat maps and even comes with an extension for Jupyter notebook.

Below is the code for it. All you need is a pandas dataframe loaded from the csv you saved:

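import gmaps
import pandas as pd

gmaps.configure(api_key='YOUR_GOOGLE_API_KEY_HERE')  # gmaps needs a Google API key
heatmap_df = pd.read_csv('heat_map.csv', encoding='utf-8-sig', index_col=0)
locations = heatmap_df[["latitude", "longitude"]]
weights = heatmap_df["price_m2"]
fig = gmaps.figure()
fig.add_layer(gmaps.heatmap_layer(locations, weights=weights))
fig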

And you should see something like this in your Jupyter notebook:


Which is pretty neat. I still need to work out how to draw the heat map based on Google Maps boundaries instead of a single point per region, but I will leave that for another time.

