Mauritius Heat Map for Real Estate Prices per m2 using Python
Yesterday I decided to see whether I could create a heatmap of real estate prices per m2 in Mauritius, covering apartments, houses and villas.
Since there is no ready-made dataset to work with, I collected information from popular online sources that list real estate prices, with the aim of plotting them on a map of Mauritius.
I used only Python and Jupyter Notebook for the whole process: collecting the data, analysing it and plotting it.
Gathering data from web pages is called web scraping. I will not share the code for that part: if you are not careful, you can send enough requests to amount to a DoS attack against the websites, and you don't want that.
I used beautifulsoup4 to parse the retrieved HTML and pull out just the data I was interested in, which I then loaded into pandas dataframes.
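To give an idea of the parsing step only, here is a minimal sketch that makes no web requests at all: it parses a small HTML string written by hand. The markup, the class names (listing, region, price, area) and the figures are all made up for illustration, and real listing pages will be structured differently.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical markup with made-up figures; no website is contacted here.
html = """
<div class="listing"><span class="region">Grand Baie</span>
  <span class="price">12000000</span><span class="area">150</span></div>
<div class="listing"><span class="region">Curepipe</span>
  <span class="price">4500000</span><span class="area">100</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for card in soup.find_all("div", class_="listing"):
    region = card.find("span", class_="region").get_text(strip=True)
    price = float(card.find("span", class_="price").get_text(strip=True))
    area = float(card.find("span", class_="area").get_text(strip=True))
    # Price per m2 is the listing price divided by the surface area
    rows.append({"Region": region, "Price per m2": price / area})

listings_df = pd.DataFrame(rows)
print(listings_df)
```

The resulting dataframe has one row per listing, which is the shape the later steps work from.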
The first analysis of the data showed the following:
The price in Rs per m2 is lower in the center than in the North, and it is much more expensive than in the East of the island. Note that these figures are averages per area, computed from 4000+ records.
Below are the regions where the prices per m2 are the lowest in Mauritius:
And here are the regions with the highest prices per m2 in Mauritius:
Note that there are over 150 regions, so I cannot show the full list, but this gives a good enough indication of prices, with the coastal regions obviously being more expensive.
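The per-region averaging and the two rankings above can be sketched with pandas roughly as follows. The sample rows and region names are made up, and the column names "Region" and "Price per m2" are assumed to match the dataframe used later on.

```python
import pandas as pd

# Made-up sample rows standing in for the 4000+ scraped listings
listings_df = pd.DataFrame({
    "Region": ["A", "A", "B", "C", "C"],
    "Price per m2": [40000.0, 50000.0, 90000.0, 20000.0, 30000.0],
})

# One average price per m2 for each region
region_price_df = (
    listings_df.groupby("Region", as_index=False)["Price per m2"].mean()
)

# The two cheapest and the two most expensive regions
cheapest = region_price_df.nsmallest(2, "Price per m2")
priciest = region_price_df.nlargest(2, "Price per m2")

print(cheapest)
print(priciest)
```

With the real data you would raise the 2 to however many regions you want to show in each list.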
I then used the Google Maps API to get the coordinates for each region name. When you make many consecutive calls to the Google Maps API (over 150 in this case), at some point you will get an exception saying something like "too many retries". The function below, named geodataMapper, handles this by waiting a few seconds and retrying once:
import requests
from random import randint
from time import sleep

def geodataMapper(location):
    """Return lat, lng and formatted address for a region, or an empty dict."""
    GOOGLE_MAPS_API_URL = 'https://maps.googleapis.com/maps/api/geocode/json'
    params = {
        'address': location + ', Mauritius',
        'sensor': 'false',
        'region': 'mu',  # region biasing expects a ccTLD code, "mu" for Mauritius
        'key': 'YOUR_GOOGLE_API_KEY_HERE'
    }
    # Bounds, in seconds, of the random pause before the retry
    sleepTimeMin = 8
    sleepTimeMax = 15
    # Do the request and get the response data
    try:
        req = requests.get(GOOGLE_MAPS_API_URL, params=params)
    except Exception:
        print("Exception occurred for: " + location + ", retrying...")
        sleep(randint(sleepTimeMin, sleepTimeMax))
        req = requests.get(GOOGLE_MAPS_API_URL, params=params)
    res = req.json()
    geodata = dict()
    # Use the first result, if there is one
    results = res.get('results', [])
    if len(results):
        result = results[0]
        geodata['lat'] = result['geometry']['location']['lat']
        geodata['lng'] = result['geometry']['location']['lng']
        geodata['address'] = result['formatted_address']
    return geodata
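The function above retries only once. If that turns out not to be enough, one option is a small generic helper that retries a bounded number of times with a random pause in between. This is only a sketch: the name call_with_retries and its parameters are hypothetical, and the pause bounds simply reuse the 8 to 15 seconds from above.

```python
import random
import time

def call_with_retries(func, max_attempts=4, min_pause=8, max_pause=15,
                      sleep=time.sleep):
    """Call func() until it succeeds or max_attempts calls have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up and re-raise once the last attempt has failed
            if attempt == max_attempts:
                raise
            pause = random.randint(min_pause, max_pause)
            print(f"Attempt {attempt} failed ({exc}), retrying in {pause}s")
            sleep(pause)

# Possible usage with any function that performs the request, e.g.:
# geodata = call_with_retries(lambda: some_request_function("Grand Baie"))
```

The sleep argument is there only so the pause can be swapped out when trying the helper without waiting.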
To retrieve the latitude and longitude for all the regions, I put together the following code. It iterates through a pandas dataframe containing the region and price per m2, calls the geodataMapper function above for each region, and at the end creates a CSV file containing the region, price per m2, latitude and longitude:
import pandas as pd

# region_price_df is the dataframe built earlier, with the columns
# "Region" and "Price per m2"
latitudes = []
longitudes = []
regions = []
prices_m2 = []
for index, row in region_price_df.iterrows():
    print(row["Region"], row["Price per m2"])
    # Do mapping of lat and long for specific region
    geodata_region = geodataMapper(row["Region"])
    # Only fill the lists if there is mapping data available
    if 'lat' in geodata_region:
        regions.append(row["Region"])
        prices_m2.append(row["Price per m2"])
        latitudes.append(geodata_region['lat'])
        longitudes.append(geodata_region['lng'])

heatmap_df = pd.DataFrame({'latitude': latitudes,
                           'longitude': longitudes,
                           'region': regions,
                           'price_m2': prices_m2})
heatmap_df.to_csv('heat_map.csv', encoding='utf-8-sig')
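Because the geocoder simply returns its first match, it can be worth checking that each coordinate pair actually falls on the island before plotting. Here is a minimal sketch of such a check. The bounding box values are rough approximations that I am assuming for illustration, and the function names are hypothetical.

```python
# Rough bounding box around the main island (approximate, assumed values)
LAT_MIN, LAT_MAX = -20.6, -19.9
LNG_MIN, LNG_MAX = 57.2, 57.9

def looks_like_mauritius(lat, lng):
    """Return True if the point falls inside the rough bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LNG_MIN <= lng <= LNG_MAX

def split_by_plausibility(rows):
    """Split (region, lat, lng) tuples into plausible and suspicious lists."""
    plausible, suspicious = [], []
    for region, lat, lng in rows:
        target = plausible if looks_like_mauritius(lat, lng) else suspicious
        target.append(region)
    return plausible, suspicious

# Approximate coordinates, for illustration only
sample = [("Port Louis", -20.16, 57.50), ("Paris", 48.86, 2.35)]
print(split_by_plausibility(sample))
```

Note that a box this tight also flags the outer islands such as Rodrigues, so widen it if your regions include them.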
Once you have this data, you can install gmaps, which is great for creating basic heatmaps and even has an extension for Jupyter Notebook.
Below is the code for it; all you need is a pandas dataframe loaded from the CSV you saved:
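import gmaps
import pandas as pd

gmaps.configure(api_key='YOUR_GOOGLE_API_KEY_HERE')
# Load the CSV saved in the previous step
heatmap_df = pd.read_csv('heat_map.csv')
locations = heatmap_df[["latitude", "longitude"]]
weights = heatmap_df["price_m2"]
fig = gmaps.figure()
fig.add_layer(gmaps.heatmap_layer(locations, weights=weights))
fig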
You should then see something like this in your Jupyter notebook:
Which is pretty neat. I still need to work out how to build the heat map from Google Maps boundaries instead of a single point per region, but I will leave that for another time.