How Python Is Used to Scrape Store Locations from Target.com?

Web scraping is a faster and more efficient way to collect store location details from a website than gathering the information manually.

This tutorial will show you how to scrape store locations and contact information from Target.com. Target is one of the leading discount retailers in the United States.

In this blog, our scraper will retrieve store information for a given zip code.

Below are the data fields that will be extracted:

1. Store name
2. Contact details
3. Weekday
4. Address
5. Open Hours

[Screenshot: the data fields that will be retrieved as part of this blog]

We could collect a lot more information from Target’s store summary page, including pharmacy and grocery hours.

Scraping Logic

1. Build the URL of the search results page on Target.com. Let’s look at Clinton County, New York (zip code 12901) as an example. To scrape data from that page, we’ll have to build this URL ourselves: https://www.target.com/store-locator/find-stores?address=12901&capabilities=&concept=
2. Download the HTML of the results page using Python Requests. Once you know the URL, this is quite simple. The script then reads an access key from that HTML and uses it to request the store listing from Target’s API.
3. Save the information in a JSON file.
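As a minimal sketch of step 1, the search URL can be assembled from a zip code with the standard library. The helper name build_search_url is hypothetical and used only for illustration; the query parameters are copied from the example URL above.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in step 1.
STORE_LOCATOR_URL = "https://www.target.com/store-locator/find-stores"


def build_search_url(zipcode):
    """Return the search results URL for a zip code (hypothetical helper)."""
    # Empty values are kept, so the result matches the example URL exactly.
    query = urlencode({"address": zipcode, "capabilities": "", "concept": ""})
    return "{0}?{1}".format(STORE_LOCATOR_URL, query)


print(build_search_url("12901"))
```

Using urlencode rather than plain string formatting also escapes any characters that are not safe in a URL, which matters if the address is ever something other than a plain zip code.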

Installing Python 3 and Pip

• PIP, to install the following package in Python (https://pip.pypa.io/en/stable/installing/)
• Python Requests, to make requests and download the HTML content of the pages (http://docs.python-requests.org/en/master/user/install/)

The Script

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse
import json
import re
from time import time

import requests


def get_store(store):
    """Extract the fields of interest from one store record."""
    address = store['Address']
    # TelephoneNumber may be a list of entries or a single dict.
    try:
        contact = store['TelephoneNumber'][0]['PhoneNumber']
    except (KeyError, IndexError, TypeError):
        contact = store['TelephoneNumber'].get('PhoneNumber')

    open_timing = []
    stores_open = []
    for store_timing in store['OperatingHours']['Hours']:
        timing = store_timing['TimePeriod']['Summary']
        week_day = store_timing['FullName']
        stores_open.append(week_day)
        open_timing.append({"Week Day": week_day, "Open Hours": timing})

    return {
        'Store_Name': store['Name'],
        'Street': address['AddressLine1'],
        'City': address['City'],
        'County': address.get('County'),
        'Zipcode': address['PostalCode'],
        'State': address['Subdivision'],
        'Contact': contact,
        'Timings': open_timing,
        'Stores_Open': stores_open,
        'Country': address.get('CountryName'),
    }


def parse(zipcode):
    # Request the store listing page to get the access key for the API.
    stores_url = 'https://www.target.com/store-locator/find-stores?address={0}&capabilities=&concept='.format(zipcode)
    front_page_response = requests.get(stores_url)
    raw_access_key = re.findall(r'accesskey\s+?:"(.*)"', front_page_response.text)
    if not raw_access_key:
        print("Access key not found")
        return []
    accesskey = raw_access_key[0]

    access_time = int(time())
    stores_listing_url = 'https://api.target.com/v2/store?nearby={0}&range=100&locale=en-US&key={1}&callback=jQuery2140816666152355445_1500385885308&_={2}'.format(zipcode, accesskey, access_time)
    stores_response = requests.get(stores_listing_url)
    # The response is wrapped in a callback: keep only the JSON between the parentheses.
    content = re.findall(r"\((.*)\)", stores_response.text)
    locations = []
    try:
        json_data = json.loads(content[0])
    except (ValueError, IndexError):
        print("No json content found in response")
        return locations

    total_stores = int(json_data['Locations']['@count'])
    if total_stores > 1:
        # Multiple locations: "Location" is a list of stores.
        for store in json_data['Locations']['Location']:
            locations.append(get_store(store))
    elif total_stores == 1:
        # Single location: "Location" is one store record.
        locations.append(get_store(json_data['Locations']['Location']))
    return locations


if __name__ == "__main__":
    argparser = argparse.ArgumentParser()
    argparser.add_argument('zipcode', help='Zip code')
    args = argparser.parse_args()
    zipcode = args.zipcode
    print("Fetching Location details")
    scraped_data = parse(zipcode)
    print("Writing data to output file")
    with open('%s-locations.json' % zipcode, 'w') as fp:
        json.dump(scraped_data, fp, indent=4)
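Two details of the script may be easier to follow in isolation: the API response is wrapped in a callback, and the script treats "Location" as either a single record or a list depending on the store count. The sketch below shows both steps on a made-up sample string. The helper names unwrap_jsonp and as_store_list are hypothetical, and the sample only imitates the fields the script reads; it is not a real response from Target.

```python
import json
import re


def unwrap_jsonp(text):
    """Strip a callback wrapper such as name({...}) and parse the JSON inside."""
    match = re.search(r"\((.*)\)", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSONP payload found")
    return json.loads(match.group(1))


def as_store_list(locations):
    """Return the stores as a list, whether one record or a list was sent."""
    if int(locations["@count"]) == 0:
        return []
    stores = locations["Location"]
    return stores if isinstance(stores, list) else [stores]


# Made-up sample, shaped like the fields the script reads.
sample = 'callback_1({"Locations": {"@count": 1, "Location": {"Name": "Plattsburgh"}}})'
payload = unwrap_jsonp(sample)
print(as_store_list(payload["Locations"]))
```

Normalizing to a list up front means the rest of the code can always loop over the stores, instead of branching on the count in several places.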

Executing the Scraper

Let’s say the scraper is saved as target.py. In the command prompt, run the script with the -h option (python target.py -h) to see its usage:

usage: target.py [-h] zipcode

positional arguments:
  zipcode     Zip code

optional arguments:
  -h, --help  show this help message and exit

The zipcode argument is the zip code that will be used to locate stores in the vicinity of the specified location.
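The script accepts any string as the zip code. If you would like the command line to reject malformed input before any request is sent, one option is an argparse type function. This is a sketch rather than part of the original scraper: the zip_code and build_parser names are hypothetical, and the accepted format (five digits, optionally followed by a four-digit suffix) is an assumption.

```python
import argparse
import re


def zip_code(value):
    """Accept 5-digit ZIP codes, optionally with a -NNNN suffix (assumed format)."""
    if not re.fullmatch(r"\d{5}(-\d{4})?", value):
        raise argparse.ArgumentTypeError("invalid zip code: {0!r}".format(value))
    return value


def build_parser():
    parser = argparse.ArgumentParser()
    # argparse calls zip_code() on the raw argument and reports any error it raises.
    parser.add_argument("zipcode", type=zip_code, help="Zip code")
    return parser


args = build_parser().parse_args(["12901"])
print(args.zipcode)
```

With this in place, an argument such as abc makes argparse print a usage error and exit instead of running the scraper.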

To find all of the Target stores in and around Clinton County, New York, we would pass 12901 as the zip code:

python target.py 12901

This will create a JSON output file named 12901-locations.json in the same folder as the script. Each store in the output will look like this:

{
    "County": "Clinton",
    "Store_Name": "Plattsburgh",
    "State": "NY",
    "Street": "60 Smithfield Blvd",
    "Stores_Open": [
        "Monday-Friday",
        "Saturday",
        "Sunday"
    ],
    "Contact": "(518) 247-4961",
    "City": "Plattsburgh",
    "Country": "United States",
    "Zipcode": "12901-2151",
    "Timings": [
        {
            "Week Day": "Monday-Friday",
            "Open Hours": "8:00 a.m.-10:00 p.m."
        },
        {
            "Week Day": "Saturday",
            "Open Hours": "8:00 a.m.-10:00 p.m."
        },
        {
            "Week Day": "Sunday",
            "Open Hours": "8:00 a.m.-9:00 p.m."
        }
    ]
}
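Once the file exists, the records are easy to work with from Python. The sketch below turns one store record into a short text summary. The format_store helper is hypothetical, and the sample record is a trimmed copy of the output shown above so that the example runs without the file.

```python
def format_store(store):
    """Build a short, readable summary of one scraped store record."""
    lines = ["{0}: {1}, {2}, {3} {4}".format(
        store["Store_Name"], store["Street"], store["City"],
        store["State"], store["Zipcode"])]
    for entry in store["Timings"]:
        lines.append("  {0}: {1}".format(entry["Week Day"], entry["Open Hours"]))
    return "\n".join(lines)


# Trimmed copy of the sample output above.
sample_store = {
    "Store_Name": "Plattsburgh",
    "Street": "60 Smithfield Blvd",
    "City": "Plattsburgh",
    "State": "NY",
    "Zipcode": "12901-2151",
    "Timings": [
        {"Week Day": "Monday-Friday", "Open Hours": "8:00 a.m.-10:00 p.m."},
        {"Week Day": "Sunday", "Open Hours": "8:00 a.m.-9:00 p.m."},
    ],
}
print(format_store(sample_store))
```

To run this against real output, load the file with json.load(open('12901-locations.json')) and call format_store on each item of the resulting list.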

For any queries regarding scraping store location data from Target, contact Locationscloud today!
