Using the Amazon Workspaces Cost Optimizer


Travers Annan

5 minute read


With so many teams working remotely these days, most IT departments have experienced a jump in demand for remote workstations, and many are turning to cloud providers. Amazon WorkSpaces is a convenient way to provide digital workspaces to your employees, but how can you be sure that you’re getting the most out of it?

Enter the Amazon WorkSpaces Cost Optimizer. This premade CloudFormation template provided by AWS sets up a Lambda function that checks the usage of each workspace on your account, then updates the billing mode for that workspace to the most cost-effective plan. The function runs every 24 hours and generates a report detailing the changes made to your billing structure and the monthly usage of your workspaces. With this running, you’ll be able to limit overspending on both underused and overused workspaces.

To deploy it, download the template, then create a new stack from that template. CloudFormation will prompt you with a set of parameters, allowing you to customize billing thresholds and deployment VPCs; however, the solution works well as-is. Once you deploy this stack, it will create a Lambda function and S3 buckets for logging and reporting.
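If you prefer to script the deployment, a minimal sketch with boto3 might look like the following. The stack name and template URL are placeholders; point TemplateURL at the copy of the template you host in your own S3 bucket.

import boto3

cfn = boto3.client('cloudformation')

# Deploy the cost optimizer stack. TemplateURL is a placeholder - substitute
# the S3 URL where you host the downloaded template. The template creates
# IAM roles, so the IAM capability acknowledgement is required.
cfn.create_stack(
    StackName='workspaces-cost-optimizer',
    TemplateURL='https://my-bucket.s3.amazonaws.com/workspaces-cost-optimizer.template',
    Capabilities=['CAPABILITY_IAM'],
)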

The template is customizable and there are many parameters to play around with, but the base install does a lot of work. Note that different workspace types have different optimal cost thresholds: for example, Windows Performance instances should be converted to monthly billing after 74 hours at the time of writing. It’s always worthwhile to check the break-even point of your workstations.
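As a quick sanity check, the arithmetic looks like this. All three prices below are hypothetical placeholders; the real hourly rate, monthly infrastructure fee, and monthly rate vary by bundle and region, so pull yours from the WorkSpaces pricing page.

# Break-even point for switching a workspace from hourly to monthly billing.
# All three prices are hypothetical placeholders.
monthly_price = 44.00  # flat monthly rate for the bundle
base_fee = 9.75        # fixed monthly infrastructure fee on hourly billing
hourly_rate = 0.47     # per-hour usage rate

break_even_hours = (monthly_price - base_fee) / hourly_rate
print(f'Monthly billing wins above ~{break_even_hours:.0f} hours/month')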

If you’re nervous about automatically making changes to your billing settings, there is a dry-run option that can be enabled in the parameters section. In dry-run mode this solution will continue to generate reports and recommendations, but will refrain from modifying your billing settings. If you enjoy the test drive, you can always disable dry-run mode later by updating the stack and changing that parameter; a sketch of that update follows.
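Here is one way to flip it off with boto3. The 'DryRun' parameter key is an assumption; check the Parameters tab of your stack for the exact key used by your template version.

import boto3

cfn = boto3.client('cloudformation')

# Turn off dry-run mode while keeping the existing template. The 'DryRun'
# key is an assumption - verify it against your stack's parameter list.
cfn.update_stack(
    StackName='workspaces-cost-optimizer',
    UsePreviousTemplate=True,
    Parameters=[
        {'ParameterKey': 'DryRun', 'ParameterValue': 'No'},
        # Repeat for each remaining parameter so it keeps its current value:
        # {'ParameterKey': '<other key>', 'UsePreviousValue': True},
    ],
    Capabilities=['CAPABILITY_IAM'],
)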

This solution is perfectly serviceable for keeping track of and managing your workstation spending, but it would be nice to have some more detailed reporting. Thankfully it isn’t that difficult to implement a monitoring solution for this use case.

The plan is to create a Lambda function that triggers on object-creation events in the cost optimizer bucket, scraping the relevant data from the report CSV file into a DynamoDB table. Once we have that data tabulated, we can use additional Lambda functions to run reports. Let’s take a look at the scraper function.

import boto3
import json
import pandas as pd
import datetime
from botocore.exceptions import ClientError


def lambda_handler(event, context):
    # start clients
    s3_client = boto3.client('s3')
    workspaces_client = boto3.client('workspaces')
    dynamo_client = boto3.client('dynamodb')

    # Get the CSV contents into a data frame
    df = get_data(s3_client, workspaces_client, event)

    # push df to DynamoDB
    put_items(dynamo_client, df)


# Populate the DynamoDB table with the parsed values in the df
def put_items(client, df):
    for index, row in df.iterrows():
        client.put_item(
            TableName='WorkspacesReportTable',
            Item={
                'WorkspaceID': {'S': str(row['WorkspaceID'])},
                'Billable Hours': {'N': str(row['Billable Hours'])},
                'UserID':{'S': str(row['UserID'])},
                'Date':{'S': str(row['Date'])},
                'Directory':{'S': str(row['Directory'])},
                'Computer Name':{'S': str(row['Computer Name'])},
                'Country':{'S': str(row['Country'])},
                'Office':{'S': str(row['Office'])},
            }
        )


# Get data from a CSV uploaded to the workspaces bucket and parse it into a data frame object.
def get_data(s3_client, workspaces_client, event):
    # Declare empty lists for the enrichment columns
    users = []
    report_dates = []
    directories = []
    ws_office = []
    ws_country = []
    computer_names = []

    try:
        # Get bucket name and object key from the S3 event record
        bucket = event['Records'][0]['s3']['bucket']['name']
        obj_key = event['Records'][0]['s3']['object']['key']
    except KeyError:
        print('Couldn\'t read bucket and obj_key from event.')
        raise

    # Get current date
    tz = datetime.datetime.now().astimezone().tzinfo
    current_time = datetime.datetime.now(tz)
    current_date = current_time.date()

    # Read columns WorkspaceID and Billable Hours from the cost optimizer CSV.
    csv_obj = s3_client.get_object(Bucket=bucket, Key=obj_key)
    body = csv_obj['Body']
    df = pd.read_csv(body, skipinitialspace=True, usecols=['WorkspaceID', 'Billable Hours'])

    # Build columns for the date, user, directory, computer name, and
    # location tags of each workstation in the CSV.
    for index, row in df.iterrows():
        desc = workspaces_client.describe_workspaces(WorkspaceIds=[row['WorkspaceID']])
        tags = workspaces_client.describe_tags(ResourceId=row['WorkspaceID'])

        # Base tag var states; flipped to True when the optional tags are found
        tag_country_found = False
        tag_office_found = False

        for t in tags['TagList']:
            if t['Key'] == 'ws-country':
                ws_country.append(t['Value'])
                tag_country_found = True
            elif t['Key'] == 'ws-office':
                ws_office.append(t['Value'])
                tag_office_found = True

        # Fall back to a placeholder when a workspace is missing a tag
        if not tag_country_found:
            ws_country.append('Unknown')
        if not tag_office_found:
            ws_office.append('Unknown')

        workspace = desc['Workspaces'][0]
        users.append(workspace['UserName'])
        directories.append(workspace['DirectoryId'])
        computer_names.append(workspace.get('ComputerName', 'Unknown'))
        report_dates.append(str(current_date))

    # Attach the enrichment columns to the data frame
    df['UserID'] = users
    df['Date'] = report_dates
    df['Directory'] = directories
    df['Computer Name'] = computer_names
    df['Country'] = ws_country
    df['Office'] = ws_office

    return df

Prerequisites for this solution:

  • A DynamoDB table to store your data (a minimal sketch of creating one follows this list).
  • The WorkSpaces Cost Optimizer template deployed and running.
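If you don’t have the table yet, here is one way to create it. The table name matches the one used in put_items() above; the key schema (WorkspaceID plus Date) is an assumed design that allows one item per workspace per report date.

import boto3

dynamodb = boto3.client('dynamodb')

# Create the report table. The WorkspaceID + Date key schema is an
# assumption that suits one report row per workspace per day.
dynamodb.create_table(
    TableName='WorkspacesReportTable',
    AttributeDefinitions=[
        {'AttributeName': 'WorkspaceID', 'AttributeType': 'S'},
        {'AttributeName': 'Date', 'AttributeType': 'S'},
    ],
    KeySchema=[
        {'AttributeName': 'WorkspaceID', 'KeyType': 'HASH'},
        {'AttributeName': 'Date', 'KeyType': 'RANGE'},
    ],
    BillingMode='PAY_PER_REQUEST',
)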

This function does three things:

  1. Imports data from the CSV file generated by the cost optimizer.
  2. Enriches and transforms that data using API calls and the Pandas library.
  3. Pushes the transformed data into the DynamoDB table.

The main API calls used in this solution are:

  1. get_object()
    • Downloads an object from an S3 bucket (the cost optimizer CSV in this case).
  2. describe_workspaces() and describe_tags()
    • Return detailed workspace information and the tags attached to each workspace.
  3. put_item()
    • Adds an item to a DynamoDB table. Using Pandas makes our lives easier when manipulating CSV files or dealing with tables in general.

In this solution we’re enriching the original data with potentially relevant information, such as user names, tag data, and the report date. Once the information is in a table, it becomes much easier to visualize usage patterns and run additional reports, such as the quick aggregation sketched below.
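For example, here is a rough sketch of one such report: total billable hours per office on a given report date. The table and attribute names match the scraper above; the scan-and-aggregate approach is just for illustration, and a query against a secondary index would scale better.

import collections

import boto3

dynamodb = boto3.client('dynamodb')

def hours_by_office(report_date):
    # Scan the report table for one date and total billable hours per office.
    totals = collections.Counter()
    paginator = dynamodb.get_paginator('scan')
    pages = paginator.paginate(
        TableName='WorkspacesReportTable',
        FilterExpression='#d = :d',
        # 'Date' is a DynamoDB reserved word, so alias it.
        ExpressionAttributeNames={'#d': 'Date'},
        ExpressionAttributeValues={':d': {'S': report_date}},
    )
    for page in pages:
        for item in page['Items']:
            totals[item['Office']['S']] += float(item['Billable Hours']['N'])
    return totals

print(hours_by_office('2020-05-01'))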

We’ve used a similar solution to send reports directly to Slack for one of our customers, allowing them to keep a close eye on their WorkSpaces usage; the push itself can be as simple as an incoming webhook.
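A minimal version of that push might look like this. The webhook URL is a placeholder; create an incoming webhook for your own Slack workspace.

import json
import urllib.request

# Post a plain-text summary to a Slack incoming webhook. The URL below is
# a placeholder for a webhook you create in your own workspace.
def post_to_slack(message, webhook_url='https://hooks.slack.com/services/XXX/YYY/ZZZ'):
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps({'text': message}).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    urllib.request.urlopen(req)

post_to_slack('WorkSpaces report: 3 workspaces switched to monthly billing today.')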

The WorkSpaces Cost Optimizer is a handy free tool to have in your belt that can make a real difference in your WorkSpaces spending. Throw in customizability and detailed reporting, and you almost can’t afford not to install it.

