Company
Company

About us

Delivery Model

Business Model

Career

Data Privacy

Contact us
Services
Services
Solutions
Solutions
Analytics
Analytics
Coverage
Coverage
API
API
Insights
Get in Touch
Get in Touch

How Analyzing Zomato Restaurant Data Can Help You Extract the Best Cuisines Data in Local Areas?

December 15, 2021

Zomato restaurant data scraping is among the most valuable tools for foodies looking to sample the finest cuisines from across the world while staying within their budget. This research is also helpful for people looking for low-cost eateries in various parts of the nation that serve a variety of cuisines. Additionally, restaurant data extraction responds to the demands of consumers who are looking for the best cuisine in a country and which locality in that country has the most restaurants serving that cuisine.

Dataset Details

Restaurant Id: Every restaurant’s unique identifier in numerous cities throughout the world

Restaurant Name: Name of the restaurant

Country Code: Location of the Country in which Restaurant is located

Address: Address of the restaurant

Locality: City location

Locality Verbose: Detailed explanation of the location – Longitude: The location of the restaurant’s longitude coordinates

Latitude: The restaurant’s location’s latitude coordinate

Cuisines: The restaurant’s menu includes a variety of cuisines.

Average cost for two: Cost of two individuals in different currencies

Currency: Currency of the country

Has Table Booking: yes/no

Has Online Delivery: Yes/No

Is Delivering: Yes/ No

Switch to order menu: Yes/No

Price range: range of price of food

Aggregate Rating: Average rating out of 5

Rating color: Depending upon the average rating color

Rating text: text depending on basis of rating

Votes: Number of ratings casted by people.

Dataset Importing

This project can be developed on google Colab. The dataset can be downloaded as follows:

from google.colab import files
files.upload()
! pip install opendatasets --upgrade
import opendatasets as od
dataset_url = 'https://www.foodspark.io/archives'
od.download(dataset_url)

Reading Data Using Pandas

import pandas as pd
df = pd.read_csv('/content/zomato-restaurants-data/zomato.csv', engine='python')
df.head(2)

	Restaurant ID	Restaurant Name	Country Code	City	Address	Locality	Locality Verbose	Longitude	Latitude	Cuisines	Average Cost for two	Currency	Has Table booking	Has Online delivery	Is delivering now	Switch to order menu	Price range	Aggregate rating	Rating color	Rating text	Votes
0	6317637	Le Petit Souffle	162	Makati City	Third Floor, Century City Mall, Kalayaan Avenu…	Century City Mall, Poblacion, Makati City	Century City Mall, Poblacion, Makati City, Mak…	121.027535	14.565443	French, Japanese, Desserts	1100	Botswana Pula(P)	Yes	No	No	No	3	4.8	Dark Green	Excellent	314
1	6304287	Izakaya Kikufuji	162	Makati City	Little Tokyo, 2277 Chino Roces Avenue, Legaspi…	Little Tokyo, Legaspi Village, Makati City	Little Tokyo, Legaspi Village, Makati City, Ma…	121.014101	14.553708	Japanese	1200	Botswana Pula(P)	Yes	No	No	No	3	4.5	Dark Green	Excellent	591

Checking if Dataset Contains any Null

## Checking if dataset contains any null
nan_values = df.isna()
nan_columns = nan_values.any()
columns_with_nan = df.columns[nan_columns].tolist()
print(columns_with_nan)
['Cuisines']

Null values appear to exist in cuisines. As a result, any additional study employing Cuisines must take the NaN values into account.

Along with this dataset, there is another file that is also available.

df1 = pd.read_excel('/content/zomato-restaurants-data/Country-Code.xlsx')
df1.head()

Let’s combine the two datasets. This will assist us in comprehending the dataset on a country-by-country basis.

	Restaurant ID	Restaurant Name	Country Code	City	Address	Locality	Locality Verbose	Longitude	Latitude	Cuisines	Average Cost for two	Currency	Has Table booking	Has Online delivery	Is delivering now	Switch to order menu	Price range	Aggregate rating	Rating color	Rating text	Votes	Country
0	6317637	Le Petit Souffle	162	Makati City	Third Floor, Century City Mall, Kalayaan Avenu…	Century City Mall, Poblacion, Makati City	Century City Mall, Poblacion, Makati City, Mak…	121.027535	14.565443	French, Japanese, Desserts	1100	Botswana Pula(P)	Yes	No	No	No	3	4.8	Dark Green	Excellent	314	Phillipines
1	6304287	Izakaya Kikufuji	162	Makati City	Little Tokyo, 2277 Chino Roces Avenue, Legaspi…	Little Tokyo, Legaspi Village, Makati City	Little Tokyo, Legaspi Village, Makati City, Ma…	121.014101	14.553708	Japanese	1200	Botswana Pula(P)	Yes	No	No	No	3	4.5	Dark Green	Excellent	591	Phillipines

df2 = pd.merge(df,df1,on='Country Code',how='left')
df2.head(2)

Exploratory Zomato Restaurant Analysis and Visualizations

Before we ask questions about the dataset, it’s important to understand the geographical spread of the restaurants, as well as the ratings, currency, online delivery, city coverage, and so on.

List of Countries the Survey is Done

print('List of counteris the survey is spread accross - ')
for x in pd.unique(df2.Country): print(x)
print()
print('Total number to country', len(pd.unique(df2.Country)))

List of countries the survey is spread across –

Philippines
Brazil
United States
Australia
Canada
Singapore
UAE
India
Indonesia
New Zealand
United Kingdom
Qatar
South Africa
Sri Lanka
Turkey

Total number to country 15

The study appears to have been conducted in 15 nations. This demonstrates that Zomato is an international corporation with operations in all of those nations.

from plotly.offline import init_notebook_mode, plot, iplot
labels = list(df2.Country.value_counts().index)
values = list(df2.Country.value_counts().values)
fig = {
    "data":[
        {
            "labels" : labels,
            "values" : values,
            "hoverinfo" : 'label+percent',
            "domain": {"x": [0, .9]},
            "hole" : 0.6,
            "type" : "pie",
            "rotation":120,
        },
    ],
    "layout": {
        "title" : "Zomato's Presence around the World",
        "annotations": [
            {
                "font": {"size":20},
                "showarrow": True,
                "text": "Countries",
                "x":0.2,
                "y":0.9,
            },
        ]
    }
}
iplot(fig)

Zomato is an Indian startup, thus it’s only natural that it has the most business among Indian restaurants.

Understanding the Rating aggregate, Color, and Text

df3 = df2.groupby(['Aggregate rating','Rating color', 'Rating text']).size().reset_index().rename(columns={0:'Rating Count'})
df3
df3

	Restaurant ID	Restaurant Name	Country Code	City	Address	Locality	Locality Verbose	Longitude	Latitude	Cuisines	Average Cost for two	Currency	Has Table booking	Has Online delivery	Is delivering now	Switch to order menu	Price range	Aggregate rating	Rating color	Rating text	Votes	Country
0	6317637	Le Petit Souffle	162	Makati City	Third Floor, Century City Mall, Kalayaan Avenu…	Century City Mall, Poblacion, Makati City	Century City Mall, Poblacion, Makati City, Mak…	121.027535	14.565443	French, Japanese, Desserts	1100	Botswana Pula(P)	Yes	No	No	No	3	4.8	Dark Green	Excellent	314	Phillipines
1	6304287	Izakaya Kikufuji	162	Makati City	Little Tokyo, 2277 Chino Roces Avenue, Legaspi…	Little Tokyo, Legaspi Village, Makati City	Little Tokyo, Legaspi Village, Makati City, Ma…	121.014101	14.553708	Japanese	1200	Botswana Pula(P)	Yes	No	No	No	3	4.5	Dark Green	Excellent	591	Phillipines

The preceding information clarifies the relationship between aggregate rating, color, and text. Finally, we assign the following hue to the ratings:

Rating 0 — White — Not rated

Rating 1.8 to 2.4 — Red — Poor

Rating 2.5 to 3.4 — Orange — Average

Rating 3.5 to 3.9 — Yellow — Good

Rating 4.0 to 4.4 — Green — Very Good

Rating 4.5 to 4.9 — Dark Green — Excellent

Let’s try to figure out why eateries have such a wide range of ratings.

import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
sns.set_style('darkgrid')
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (9, 5)
matplotlib.rcParams['figure.facecolor'] = '#00000000'
plt.figure(figsize=(12,6))
# plt.xticks(rotation=75)
plt.title('Rating Color')
sns.barplot(x=df3['Rating color'], y=df3['Rating Count']);

Surprisingly, the majority of establishments appear to have dropped their star ratings. Let’s see whether any of these restaurants are from a specific country.

No_rating = df2[df2['Rating color']=='White'].groupby('Country').size().reset_index().rename(columns={0:'Rating Count'})
No_rating

	Country	Rating Count
0	Brazil	5
1	India	2139
2	United Kingdom	1
3	United States	3

India appears to have the highest number of unrated restaurants. Because the habit of ordering food is still gaining traction in India, most eateries remain unrated on Zomato, as individuals may choose to visit the restaurant for a meal.

Country and Currency

	Country	Currency
0	Phillipines	Botswana Pula(P)
1	Brazil	Brazilian Real(R$)
2	Australia	Dollar($)
3	Canada	Dollar($)
4	Singapore	Dollar($)
5	United States	Dollar($)
6	UAE	Emirati Diram(AED)
7	India	Indian Rupees(Rs.)
8	Indonesia	Indonesian Rupiah(IDR)
9	New Zealand	NewZealand($)
10	United Kingdom	Pounds(��)
11	Qatar	Qatari Rial(QR)
12	South Africa	Rand(R)
13	Sri Lanka	Sri Lankan Rupee(LKR)
14	Turkey	Turkish Lira(TL)

The table above shows the countries and the currencies they accept. Surprisingly, four countries appear to accept dollars as a form of payment.

Online Delivery Distribution

plt.figure(figsize=(12,6))
plt.title('Online Delivery Distribution')
plt.pie(df2['Has Online delivery'].value_counts()/9551*100, labels=df2['Has Online delivery'].value_counts().index, autopct='%1.1f%%', startangle=180);

Only around a quarter of eateries accept internet orders. Because the majority of the restaurants listed here are from India, this data may be skewed. Perhaps a city-by-city examination would be more useful.

Understanding the Coverage of the City

from plotly.offline import init_notebook_mode, plot, iplot
import plotly.graph_objs as go
plt.figure(figsize=(12,6))
# import plotly.plotly as py

labels = list(df2.City.value_counts().head(20).index)
values = list(df2.City.value_counts().head(20).values)

fig = {
    "data":[
        {
            "labels" : labels,
            "values" : values,
            "hoverinfo" : 'label+percent',
            "domain": {"x": [0, .9]},
            "hole" : 0.6,
            "type" : "pie",
            "rotation":120,
        },
    ],
    "layout": {
        "title" : "Zomato's Presence Citywise",
        "annotations": [
            {
                "font": {"size":20},
                "showarrow": True,
                "text": "Cities",
                "x":0.2,
                "y":0.9,
            },
        ]
    }
}
iplot(fig);

Questions and Answers

Which locality has maximum hotels listed in Zomato?

Delhi = df2[(df2.City == 'New Delhi')]
plt.figure(figsize=(12,6))
sns.barplot(x=Delhi.Locality.value_counts().head(10), y=Delhi.Locality.value_counts().head(10).index)

plt.ylabel(None);
plt.xlabel('Number of Resturants')
plt.title('Resturants Listing on Zomato');

Connaught Place appears to have a large number of restaurants listed on Zomato. Let’s take a look at the cuisines that the highest eateries have to offer.

2. What Kind of the Cuisines Highly Rated Restaurants Offers?

# I achieve this by the following steps

## Fetching the resturants having 'Excellent' and 'Very Good' rating
ConnaughtPlace = Delhi[(Delhi.Locality.isin(['Connaught Place'])) & (Delhi['Rating text'].isin(['Excellent','Very Good']))]

ConnaughtPlace = ConnaughtPlace.Cuisines.value_counts().reset_index()

## Extracing all the cuisens in a single list
cuisien = []
for x in ConnaughtPlace['index']: 
  cuisien.append(x)

# cuisien = '[%s]'%', '.join(map(str, cuisien))
cuisien

['North Indian, Chinese, Italian, Continental',
 'North Indian',
 'North Indian, Italian, Asian, American',
 'Continental, North Indian, Chinese, Mediterranean',
 'Chinese',
 'Continental, American, Asian, North Indian',
 'North Indian, Continental',
 'Continental, Mediterranean, Italian, North Indian',
 'North Indian, European, Asian, Mediterranean',
 'Japanese',
 'Bakery, Desserts, Fast Food',
 'North Indian, Afghani, Mughlai',
 'Biryani, North Indian, Hyderabadi',
 'Ice Cream',
 'Continental, Mexican, Burger, American, Pizza, Tex-Mex',
 'North Indian, Chinese',
 'North Indian, Mediterranean, Asian, Fast Food',
 'North Indian, European',
 'Healthy Food, Continental, Italian',
 'Continental, North Indian, Italian, Asian',
 'Continental, Italian, Asian, Indian',
 'Biryani, Hyderabadi',
 'Bakery, Fast Food, Desserts',
 'Italian, Mexican, Continental, North Indian, Finger Food',
 'Cafe',
 'Asian, North Indian',
 'South Indian',
 'Modern Indian',
 'North Indian, Chinese, Continental, Italian',
 'North Indian, Chinese, Italian, American, Middle Eastern',
 'Fast Food, American, Burger']

from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt
import pandas as pd
  
comment_words = ''
stopwords = set(STOPWORDS)
  
# iterate through the csv file
for val in cuisien:
      
    # typecaste each val to string
    val = str(val)
  
    # split the value
    tokens = val.split()
      
    # Converts each token into lowercase
    for i in range(len(tokens)):
        tokens[i] = tokens[i].lower()
      
    comment_words += " ".join(tokens)+" "
  
wordcloud = WordCloud(width = 800, height = 800,
                background_color ='white',
                stopwords = stopwords,
                min_font_size = 10).generate(comment_words)
  
# plot the WordCloud image                       
plt.figure(figsize = (8, 8), facecolor = 'b', edgecolor='g')
plt.title('Resturants cuisien -  Top Resturants')
plt.imshow(wordcloud)
plt.axis("off")
plt.tight_layout(pad = 0)
  
plt.show()

The following cuisine appears to be performing well among top-rated restaurants.

North Indian
Chinese
Italian
American

3. How Many Restaurants Accept Online Food Delivery?

top_locality = Delhi.Locality.value_counts().head(10)
sns.set_theme(style="darkgrid")
plt.figure(figsize=(12,6))
ax = sns.countplot(y= "Locality", hue="Has Online delivery", data=Delhi[Delhi.Locality.isin(top_locality.index)])
plt.title('Resturants Online Delivery');

Apart from Shahdara, restaurants in other parts of the city accept online orders.

In Defense colony and Malviya Nagar, online delivery appears to be on the rise.

4. Understanding Restaurants Ratings Locality

Besides Malviya Nagar and Defense colony, the rest of the neighborhoods seemed to prefer going to restaurants than buying food online.

Now let us know how these restaurants in Malviya Nagar, Defense colony, rate when it comes to online delivery.

understanding-restaurants-ratings-locality

The number of highly rated restaurants appears to be high in Defense Colony, but Malviya Nagar appears to have done better as a result of Good and Average restaurants.

As eateries with a ‘Bad’ or ‘Not Rated’ rating are far fewer than those with a ‘Good’, ‘Very Good’, or ‘Excellent’ rating. As a result, residents in these areas prefer to order online.

5. Rating vs. Cost of Dining

plt.figure(figsize=(12,6))
sns.scatterplot(x="Average Cost for two", y="Aggregate rating", hue='Price range', data=Delhi)

plt.xlabel("Average Cost for two")
plt.ylabel("Aggregate rating")
plt.title('Rating vs Cost of Two');

It is found that there is no relation between pricing and rating. For example, restaurant having good rating (like 4-5) have eateries encompasses the entire X axis with a wide range of prices.

6. Location of Highly Rated Restaurants Across New Delhi

Delhi['Rating text'].value_counts()
Average      2495
Not rated    1425
Good         1128
Very Good     300
Poor           97
Excellent      28
Name: Rating text, dtype: int64

import plotly.express as px
Highly_rated = Delhi[Delhi['Rating text'].isin(['Excellent'])]

fig = px.scatter_mapbox(Highly_rated, lat="Latitude", lon="Longitude", hover_name="City", hover_data=["Aggregate rating", "Restaurant Name"],
                        color_discrete_sequence=["fuchsia"], zoom=10, height=300)
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.update_layout(title='Highle rated Resturants Location',
                  autosize=True,
                  hovermode='closest',
                  showlegend=False)
fig.update_layout(
    autosize=False,
    width=800,
    height=500,)

fig.show()

The four cities indicated above account for almost 65 percent of the total information in the dataset. Besides the highest rated local restaurants, it would be fascinating to learn about the well-known eateries in the area. These are the verticals over which they can be found:

Ice Creams
Shakes
and Desserts

7. Common Eateries

Breakfast and Coffee Locations

types = {
    "Breakfast and Coffee" : ["Cafe Coffee Day", "Starbucks", "Barista", "Costa Coffee", "Chaayos", "Dunkin' Donuts"],
    "American": ["Domino's Pizza", "McDonald's", "Burger King", "Subway", "Dunkin' Donuts", "Pizza Hut"],
    "Ice Creams and Shakes": ["Keventers", "Giani", "Giani's", "Starbucks", "Baskin Robbins", "Nirula's Ice Cream"]
}

breakfast = Delhi[Delhi['Restaurant Name'].isin(types['Breakfast and Coffee'])]
american = Delhi[Delhi['Restaurant Name'].isin(types['American'])]
ice_cream = Delhi[Delhi['Restaurant Name'].isin(types['Ice Creams and Shakes'])]

breakfast = breakfast[['Restaurant Name','Aggregate rating']].groupby('Restaurant Name').mean().reset_index().sort_values('Aggregate rating',ascending=False)
breakfast
https://www.foodspark.io/how-analyzing-zomato-restaurant-data-can-help-you-extract-the-best-cuisines-data-in-local-areas.php
import plotly.express as px

df= breakfast
fig = px.bar(df, y='Aggregate rating', x='Restaurant Name', text='Aggregate rating', title="Breakfast and Coffee locations")
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(
    autosize=False,
    width=800,
    height=500,)
fig.show()

The Chaayos stores are doing much better. More of those are needed in Delhi. Café Coffee Day appears to have a low average rating. They must increase the quality of their offerings.

Fast Food Restaurants

american = american[['Restaurant Name','Aggregate rating']].groupby('Restaurant Name').mean().reset_index().sort_values('Aggregate rating',ascending=False)
american

	Restaurant Name	Aggregate rating
0	Burger King	3.477778
3	McDonald’s	3.445455
2	Dunkin’ Donuts	3.300000
4	Pizza Hut	3.158333
5	Subway	3.047368
1	Domino’s Pizza	2.794545

import plotly.express as px

df= american
fig = px.bar(df, y='Aggregate rating', x='Restaurant Name', text='Aggregate rating', title="Fast Food Resturants")
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(
    autosize=False,
    width=800,
    height=500,)

fig.show()

Ice Cream Parlors

ice_cream = ice_cream[['Restaurant Name','Aggregate rating']].groupby('Restaurant Name').mean().reset_index().sort_values('Aggregate rating',ascending=False)
ice_cream

	Restaurant Name	Aggregate rating
5	Starbucks	3.750000
2	Giani’s	3.011765
3	Keventers	2.983333
0	Baskin Robbins	2.769231
1	Giani	2.675000
4	Nirula’s Ice Cream	2.400000

import plotly.express as px

df= ice_cream
fig = px.bar(df, y='Aggregate rating', x='Restaurant Name', text='Aggregate rating', title="Ice Cream Parlours")
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(
    autosize=False,
    width=800,
    height=500,)
fig.show()

Local brands appear to be doing better than foreign brands.

Conclusion

Here’s a quick rundown of a few:

The dataset is skewed toward India and does not include all restaurant data from around the world.

Restaurants are rated in six different categories.

Not a PG-13 film
Average
Good
Excellent
Excellent

Although Connaught Palace has the most restaurants listed on Zomato, it does not offer online deliveries. The situation in Defense Colony and Malviya Nagar appears to be improving.

The following cuisine appears to be receiving good ratings at the top-rated restaurants.

Italian-American Chinese-North Indian

There is no link between the price and the rating. Some of the highest-rated restaurants are also the least expensive, and vice versa.

In terms of common eateries, Indian restaurants appear to be better rated for breakfast and coffee, whereas American restaurants appear to be better rated for fast food chains and ice cream parlors.

If you want to know more about our other restaurant data scraping services or Zomato restaurant data scraping service then contact Foodspark now!!