One of the most common applications of web scraping is extracting restaurant listings from different websites. You might do it to build an aggregator, monitor prices, or offer a better UX on top of existing restaurant listing sites.
We will see how a simple script can do that. We will use BeautifulSoup to parse the page and retrieve restaurant data from Zomato.
To begin with, the code below is boilerplate: we fetch a Zomato search results page and set up BeautifulSoup so we can use CSS selectors to query the page for the data we want.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}

url = 'https://www.zomato.com/ncr/restaurants/pizza'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
#print(soup.select('[data-lid]'))
for item in soup.select('.search-result'):
    try:
        print('----------------------------------------')
        print(item)
    except Exception as e:
        #raise e
        print('')
We are passing a User-Agent header to simulate a browser call and avoid getting blocked.
Now, it's time to analyze the Zomato search results for the location we want. It works like this.
When we inspect the page, we find that each result card in the HTML is wrapped in a tag with the class search-result.
We can use that class to break the HTML document into the pieces that hold each individual item's data, like this.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}

url = 'https://www.zomato.com/ncr/restaurants/pizza'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
#print(soup.select('[data-lid]'))
for item in soup.select('.search-result'):
    try:
        print('----------------------------------------')
        print(item)
    except Exception as e:
        #raise e
        print('')
And once you run that…
python3 scrapeZomato.py
You can see that the code separates out the individual HTML cards.
Inspecting further, you can see that the restaurant's name always has the class result-title. So let's try to retrieve that.
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}

url = 'https://www.zomato.com/ncr/restaurants/pizza'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
#print(soup.select('[data-lid]'))
for item in soup.select('.search-result'):
    try:
        print('----------------------------------------')
        #print(item)
        print(item.select('.result-title')[0].get_text())
    except Exception as e:
        #raise e
        print('')
This will print the restaurant names…
Hurrah!
Now, it’s time to get other data…
# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}

url = 'https://www.zomato.com/ncr/restaurants/pizza'
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
#print(soup.select('[data-lid]'))
for item in soup.select('.search-result'):
    try:
        print('----------------------------------------')
        #print(item)
        print(item.select('.result-title')[0].get_text().strip())
        print(item.select('.search_result_subzone')[0].get_text().strip())
        print(item.select('.res-rating-nf')[0].get_text().strip())
        print(item.select('[class*=rating-votes-div]')[0].get_text().strip())
        print(item.select('.res-timing')[0].get_text().strip())
        print(item.select('.res-cost')[0].get_text().strip())
    except Exception as e:
        #raise e
        print('')
And once you run that…
This prints all the details we need, including ratings, votes, cost, timings, and locality.
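Once the fields are extracted, you will usually want to persist them rather than just print them. Here is a minimal sketch using Python's csv module, assuming each card has been reduced to a dict; the sample rows and field names below are purely illustrative, not values scraped from Zomato.

```python
import csv

# Sample rows as they might come out of the scraping loop above
# (illustrative data only; real values come from item.select(...)).
rows = [
    {'name': 'Sample Pizzeria', 'locality': 'Connaught Place', 'rating': '3.9',
     'votes': '1,344 votes', 'timing': '11am to 11pm', 'cost': 'Rs.700 for two'},
    {'name': 'Sample Trattoria', 'locality': 'Saket', 'rating': '4.1',
     'votes': '980 votes', 'timing': '10am to 11pm', 'cost': 'Rs.600 for two'},
]

fieldnames = ['name', 'locality', 'rating', 'votes', 'timing', 'cost']

# DictWriter handles quoting automatically, e.g. for the comma in '1,344 votes'.
with open('zomato_results.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

In the real script, you would append one dict per card inside the for loop and write the file once at the end.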
In more advanced implementations, you will want to rotate the User-Agent string so that Zomato cannot tell it is the same browser making every request!
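A sketch of what User-Agent rotation can look like: keep a small pool of real browser strings and pick one per request. The strings below are examples only; in practice you would use current strings for the browsers you want to mimic.

```python
import random

# A small pool of desktop browser User-Agent strings (examples only).
USER_AGENTS = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0',
]

def random_headers():
    # Each call picks a fresh User-Agent, so consecutive requests
    # do not all present themselves as the same browser.
    return {'User-Agent': random.choice(USER_AGENTS)}

# Usage with requests:
# response = requests.get(url, headers=random_headers())
```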
If you go a bit further, you will find that Zomato can simply block your IP address, ignoring all your other tricks. That is a letdown, and it is where most web scraping projects fail.
Investing in a private rotating proxy service such as Proxies API can often make the difference between a successful, headache-free scraping project that gets the job done consistently and one that never works at all.
In addition, with 1000 free API calls on offer, you have nothing to lose by trying our rotating proxy and comparing notes. Integrating it takes only a single line of code and is barely disruptive.
Our rotating proxy server, Proxies API, offers a simple API that can solve your IP blocking problems instantly.
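With requests, routing traffic through a rotating proxy endpoint is a small change: pass a proxies mapping to the call. The host, port, and credentials below are placeholders for whatever your proxy provider gives you, not real Proxies API values.

```python
# Placeholder endpoint; substitute the host, port, and credentials
# from your rotating-proxy provider.
PROXY = 'http://username:password@proxy.example.com:8080'

# requests routes http and https traffic through whichever entry
# matches the scheme of the URL being fetched.
proxies = {
    'http': PROXY,
    'https': PROXY,
}

# The same GET call as before, now routed through the proxy so each
# request can exit from a different IP (network call left commented):
# response = requests.get(url, headers=headers, proxies=proxies, timeout=30)
```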
Hundreds of clients have successfully solved the problem of IP blocking with our easy API.
The same data can also be accessed through a simple API from Foodspark.
To know more about our Zomato Listings Scraper, contact us or ask for a free quote!
We will get back to you as soon as we receive your message.
“We were searching for a web scraping partner for our restaurant data scraping requirements. We have chosen Foodspark and it was an amazing experience to work with them. They are complete professionals in their attitude towards data scraping. We would certainly recommend them to others for their food data scraping requirements.”
“Working with Foodspark was a completely exceptional experience for me. The Foodspark team is professional, calm, and works well with all my food data scraping requirements. 5 Stars to them for their web data scraping work.”
“We had a great time working with Foodspark for our restaurant food data scraping requirements, which no other service provider was able to handle competently. Foodspark is just amazing! They have done their work wonderfully well! Thank You Foodspark!”
“We were searching for a food data scraping service provider and we found Foodspark! It was a great experience working with this professional company. They are absolute professionals in their approach to web scraping. You can surely hire them for all your food data scraping service requirements.”
“We are a food aggregator app and we were searching for a food data aggregator app data scraping service provider that can satisfy our requirements of extracting food data from our competitor’s app. Team Foodspark has worked extremely hard as the task was very difficult. They have provided great results and we have become their permanent client!”
“We are very much impressed with Foodspark for their Food Data Scraping Services. Our requirements were quite unusual and hard to implement, but they were equal to the job and worked very hard to deliver the finest results. Thumbs Up to Foodspark!”