Using Lambda & Selenium to buy a mountain bike in 2021

2021 has been a strange year. Supply chains are in ruins, the cost of goods is sky high and there is extreme shipping congestion.

This is all really bad news for me if I want to buy a new mountain bike from the UK.

The bike I want is only available from one online store, globally. There are country-specific domains for the website, but the stock of these bikes is shared globally. Because the bikes are so scarce, the site also does not offer a stock alert for them. This creates a very interesting situation that I believe I can solve for myself (and now you).

When the bike I want does come into stock, I want to know about it instantly.

I’m going to create an AWS Lambda bot that will check the stock of the bike I want every 5 minutes and send me an SMS message if it is in stock.

Selenium

Selenium is a powerful web browser automation tool.

Today I’m going to use it to load the website where I want to buy my bike and determine the stock level. First I need to load the website and locate the information.

Using the inspect tool, you can analyse the HTML of any webpage. The goal here is to determine where the key information is located, and how you can extract it.

The key information for me is:

  • Mountain bike size
  • Stock Status

HTML from mountain bike shop

Let’s get a better look at that important bit of HTML:

<li title="Large" for="104711085" class="email-when-in-stock  bem-sku-selector__option-group-item" style="display: list-item;">
        <input id="104711085" data-colour="Artichoke Green " data-size="Large" data-size-cd="" data-out-of-stock-for-country="False" type="radio" data-ga-action="Size" data-ga-label="Large" data-display-buy="{&quot;BuyType&quot;:1,&quot;AddToBasketButtonText&quot;:&quot;Add to Basket&quot;,&quot;ProductAvailabilityMessage&quot;:&quot;Currently out of stock&quot;,&quot;IsAvailableToOrder&quot;:true,&quot;IsInStock&quot;:false,&quot;EmailWhenInStockAvailable&quot;:false,&quot;ShowAddToBasketButton&quot;:false,&quot;IsAddedToDefaultWishList&quot;:false,&quot;ProductAvailabilityAdditionalMessage&quot;:&quot;Out of stock. Normally available in 2-4 weeks&quot;}" data-list-price="NZ$7,743.83" data-unit-price="NZ$7,743.83" data-price-reason="Regular Price Change" data-additional-message="Out of stock. Normally available in 2-4 weeks" name="id" value="104711085" data-promo-message="" data-promo-sticker="" data-promo-ends="" data-percentage-saving="" data-product-availability-message="Currently out of stock" data-ewis-subscribed="false" data-ewis-message="" data-available-to-order="0" class="js-product-sku productId_104711085">
        <span class="bem-sku-selector__size js-size">Large</span>
        <span class="bem-sku-selector__price pull-right">NZ$7,743.83</span>
            <div class="bem-sku-selector__status">
                <span class="bem-sku-selector__status-stock bem-product-selector__radio out-of-stock js-stock-status-message">Currently out of stock</span>
            </div>
    </li>

In this snippet, the title attribute of the li element holds the size of the bike. This is how I will extract the bike size information.

Inside the li there is a child input element with an attribute called data-display-buy. That attribute contains a JSON structure, and one of its keys is called ProductAvailabilityMessage. That’s how I’ll get the bike’s stock level.
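To make the structure concrete before bringing Selenium in, here is a minimal offline sketch that pulls the same two values out of a trimmed copy of the markup using only the Python standard library. The SizeOptionParser class and the SNIPPET constant are names made up for this illustration, and the snippet keeps just the attributes that matter here.

```python
import json
from html.parser import HTMLParser

# Trimmed copy of the <li>/<input> markup shown above (most attributes removed).
SNIPPET = '''
<li title="Large" class="email-when-in-stock bem-sku-selector__option-group-item">
  <input id="104711085" type="radio"
         data-display-buy="{&quot;ProductAvailabilityMessage&quot;:&quot;Currently out of stock&quot;,&quot;IsInStock&quot;:false}">
</li>
'''


class SizeOptionParser(HTMLParser):
    """Collects (size, availability message) pairs from size-option markup."""

    def __init__(self):
        super().__init__()
        self.current_size = None
        self.options = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "title" in attrs:
            # The size lives in the title attribute of the <li>.
            self.current_size = attrs["title"]
        elif tag == "input" and "data-display-buy" in attrs:
            # HTMLParser has already turned &quot; back into ", so this is plain JSON.
            payload = json.loads(attrs["data-display-buy"])
            self.options.append((self.current_size, payload["ProductAvailabilityMessage"]))


parser = SizeOptionParser()
parser.feed(SNIPPET)
print(parser.options)
```

The same two lookups (title on the li, data-display-buy on the input) are what the Selenium code in the following sections performs against the live page.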

Get the Size

First we need to set things up.

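from selenium import webdriver

# Chrome options (assumed here: headless mode; adjust to your own setup)
options = webdriver.ChromeOptions()
options.add_argument("--headless")

# Create webdriver
driver = webdriver.Chrome("/opt/bin/chromedriver", options=options)
# Load the web page
driver.get("https://<insert bike product page>")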

Now the driver is set up. I want to search for all elements whose class name is bem-sku-selector__option-group-item. I know there is one of these elements for each size of the bike.

From there, all I need to do is get the title of each element.

# iterate through all elements of class name bem-sku-selector__option-group-item
# i.e. iterate through all sizes
for bike_driver in driver.find_elements_by_class_name("bem-sku-selector__option-group-item"):

    # get title from element.
    size = bike_driver.get_attribute("title").lower()

Get the Stock Status

A bike_driver variable exists for each size of the bike, which I’m iterating through. bike_driver represents the element of class name bem-sku-selector__option-group-item.

I know that element has a child element of type input, and I want to get inside that and extract some information from one of its attributes.

    # get inner element, from within the previously used 'bike_driver' (specific size)
    inner_element = bike_driver.find_element_by_xpath("input").get_attribute("data-display-buy")

    # load the json and extract the value from key 'ProductAvailabilityMessage'.
    status = json.loads(inner_element)['ProductAvailabilityMessage'].lower()
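The find_elements_by_class_name and find_element_by_xpath helpers used above belong to the older Selenium API and were removed in Selenium 4.3. As a hedged sketch, the two steps can be combined into one function that uses the generic find_elements(by, value) form instead. The function name read_sizes and the two BY_ constants are made up for this example; the constants hold the same strings as Selenium's By.CLASS_NAME and By.XPATH, so the function works with a real driver without importing Selenium itself.

```python
import json

# Locator strategy strings; these match selenium's By.CLASS_NAME and By.XPATH.
BY_CLASS_NAME = "class name"
BY_XPATH = "xpath"


def read_sizes(driver):
    """Return {size: status message} for every size option on the loaded page."""
    results = {}
    for option in driver.find_elements(BY_CLASS_NAME, "bem-sku-selector__option-group-item"):
        # The size is the title attribute of the option element.
        size = option.get_attribute("title").lower()
        # The stock message is inside the JSON stored on the child <input>.
        raw = option.find_element(BY_XPATH, "input").get_attribute("data-display-buy")
        results[size] = json.loads(raw)["ProductAvailabilityMessage"].lower()
    return results
```

With a driver that has already loaded the product page, read_sizes(driver) would give a dictionary keyed by lower-cased size, which makes the "only Large" filter a simple lookup.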

Putting the Information to Work

When the bike is out of stock, the status message reads: currently out of stock. I also know that the in-stock status message is different depending on the level of stock.

I want to be notified for every single stock level.

I also only care about the Large size.

            # Throw away anything that's not large.
            if size == 'large':
                # The bike is in stock if the out-of-stock message
                # is NOT part of the current status
                if 'currently out of stock' not in status:
                    # Print the successful result
                    print(f"{bike.split('/')[-1].replace('-',' ')} available in {size}. See here - {bike}")

                    # Send me an SMS via AWS SNS.
                    sns_client.publish(TopicArn='arn:aws:sns:ap-southeast-2:111122223333:mobile', Message=f"{bike.split('/')[-1].replace('-',' ')} available in {size}. See here - {bike}")

                # If out of stock
                else:
                    # Print unsuccessful result
                    print(f"{bike.split('/')[-1].replace('-',' ')} is unavailable in {size}")
    # Catch any error
    except Exception as e:

        # Send me an email of the error so I can fix it ASAP.
        sns_client.publish(TopicArn='arn:aws:sns:ap-southeast-2:111122223333:personal-email', Message=f"MTB Lambda Error: {e}")

        # Print error
        print(e)

# We're all done! Close the driver.
driver.quit()
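The decision logic above is easier to check when it is separated from Selenium and SNS. The following is a minimal sketch under that assumption: check_and_notify and product_name are hypothetical helper names, the URL is a placeholder, and publish stands for any callable that delivers a message (in the Lambda it could wrap sns_client.publish with the mobile topic ARN).

```python
OUT_OF_STOCK = "currently out of stock"


def product_name(bike_url):
    """Turn the last URL path segment into a readable name."""
    return bike_url.split("/")[-1].replace("-", " ")


def check_and_notify(bike_url, size, status, publish, wanted_size="large"):
    """Call publish(message) when the wanted size is not out of stock.

    Returns True if a notification was sent.
    """
    if size != wanted_size:
        return False  # Throw away anything that's not the size I care about.
    if OUT_OF_STOCK in status:
        return False  # Still out of stock, nothing to report.
    publish(f"{product_name(bike_url)} available in {size}. See here - {bike_url}")
    return True


# Usage with a plain list standing in for SNS.
sent = []
url = "https://example.com/bikes/some-trail-bike"  # hypothetical URL
check_and_notify(url, "large", "currently out of stock", sent.append)
check_and_notify(url, "medium", "in stock", sent.append)
check_and_notify(url, "large", "in stock", sent.append)
print(sent)
```

Only the last call produces a message, because it is the only one where the size is large and the status does not contain the out-of-stock text.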

AWS Lambda

I want to run this bot day and night, non-stop, and be notified immediately when stock appears. But I also don’t want to pay for a server to run it.

That’s where AWS Lambda comes in.

AWS Lambda is a serverless compute service and it costs peanuts. Peanuts, however, is more than zero! Thankfully, AWS has a huge free tier and Lambda is a key part of that. I get 400,000 GB-seconds of Lambda compute every month, forever, for free. Thanks Jeff mate!

Now that sounds great, but it’s a little more tricky to run Selenium within Lambda… To solve that, I’ll be using Docker.
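As a rough sanity check on the free tier, here is a back-of-the-envelope calculation. It uses the 230 MB memory size and 120 second timeout from the SAM template later in this post, and it assumes the worst case where every run lasts the full timeout and that Lambda counts 1 GB as 1024 MB. Real runs should finish sooner, so treat the result as an upper bound.

```python
MEMORY_MB = 230                  # MemorySize from the SAM template
TIMEOUT_SECONDS = 120            # Timeout from the SAM template (worst-case duration)
RUNS_PER_HOUR = 60 // 5          # one run every 5 minutes
FREE_TIER_GB_SECONDS = 400_000   # monthly free allowance quoted above


def monthly_gb_seconds(days=31):
    """Worst-case GB-seconds used in a month of the given length."""
    runs = RUNS_PER_HOUR * 24 * days
    return runs * TIMEOUT_SECONDS * MEMORY_MB / 1024


usage = monthly_gb_seconds()
print(f"{usage:,.1f} GB-seconds, {usage / FREE_TIER_GB_SECONDS:.0%} of the free tier")
```

Even if every invocation ran until it timed out, a 31-day month would use about 240,000 GB-seconds, which stays under the 400,000 GB-second allowance.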

Docker

I’m not the first person to have this issue, and the good people of the internet have shared their knowledge for me to learn from. Thanks docker-selenium-lambda!

FROM public.ecr.aws/lambda/python:3.8 as build
RUN mkdir -p /opt/bin/ && \
    mkdir -p /tmp/downloads && \
    yum install -y unzip && \
    curl -SL https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip > /tmp/downloads/chromedriver.zip && \
    curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip > /tmp/downloads/headless-chromium.zip && \
    unzip /tmp/downloads/chromedriver.zip -d /opt/bin/ && \
    unzip /tmp/downloads/headless-chromium.zip -d /opt/bin/

FROM public.ecr.aws/lambda/python:3.8
COPY requirements.txt ./
RUN mkdir -p /opt/bin && python3.8 -m pip install -r requirements.txt -t .
COPY google-chrome.repo /etc/yum.repos.d/
RUN yum install -y --enablerepo=google-chrome google-chrome-stable
COPY --from=build /opt/bin/headless-chromium /opt/bin/
COPY --from=build /opt/bin/chromedriver /opt/bin/
COPY app.py ./
CMD ["app.lambda_handler"]

We’re almost there!

We currently have:

  • A Lambda script that sends me an SMS if a Large bike is in stock
  • A container image that packages the Lambda script

And now we need to deploy this to AWS, and run it on a schedule.

Deploy via AWS SAM

AWS SAM (Serverless Application Model) is a framework to build and deploy serverless apps.

I’ll be using AWS Lambda, SNS, and EventBridge, so AWS SAM is perfect for me!

This post’s focus is on Selenium and running it within Lambda, so I won’t dwell on AWS SAM. Here’s a getting started guide.

Template

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  python3.8

  SAM Template for mtb-stock-notification  

# More info about Globals: https://github.com/awslabs/serverless-application-model/blob/master/docs/globals.rst
Globals:
  Function:
    Timeout: 120

Resources:
  MTBStockNotification:
    Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
    Properties:
      PackageType: Image
      MemorySize: 230
      ImageUri: 'mtbstocknotification'
      Events:
        MTBStockNotificationSchedule:
          Type: Schedule
          Properties:
            Schedule: 'rate(5 minutes)'
            Name: mtb-stock-notification
            Enabled: True
      Policies:
        - SNSPublishMessagePolicy:
            TopicName: mobile
        - SNSPublishMessagePolicy:
            TopicName: personal-email
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: ./source
      DockerTag: python3.8-v1
      DockerBuildArgs: {"--platform": "linux/amd64"}

Outputs:
  MTBStockNotification:
    Description: "MTBStockNotification Function ARN"
    Value: !GetAtt MTBStockNotification.Arn
  MTBStockNotificationIamRole:
    Description: "Implicit IAM Role created for MTBStockNotification function"
    Value: !GetAtt MTBStockNotificationRole.Arn

Build & Deploy

sam build

sam deploy --guided

Conclusion

Nicely done!

We’ve successfully used Selenium to extract the important information from a website. Using serverless AWS tools, we deployed a system that will run our code over and over, reporting any MTB stock success to our mobile phone.

I’ve been running this bot for 9 months now. The bike came into stock once over that time, for 3 hours, while I was overseas. Maybe next time I’ll be more successful…

The finished product
