Category Archives: Analysis

Excel Requests – HTTP for Humans

Written by Bjoern Stiel

APIs are the lifeblood of the internet, providing access to bits and bytes that make our personal and professional worlds work.

For example, Quandl provide a vast amount of financial and economic data through their API, while GitHub and StackOverflow let you pull out all sort of user and repo data through their API.

But how do we communicate with these APIs… from Excel?

Working with HTTP with all its different authentication methods and custom headers can be painful. Plus, Excel doesn’t play nicely with Json. Sure, VBA is always an option (VBA-Web seems a very decent project) and Power Query has been a huge leap forward, but neither of them are ideal if you need something quick or programmatic.

As a seasoned Pythonista, I am a big fan of Kenneth Reitz’ Requests package as it massively simplifies HTTP communication in the Python world. This inspired me to write Excel Requests, a dead simple Excel Addin to make Json over HTTP from Excel a walk in the park.

Installation

Installing Excel Requests is fairly simple, download the PowerShell installation script from the docs and run it (right click, Run with PowerShell).

Making your first Request

Start a new Excel (in case you haven’t already done so after the installation) and let’s make a GET request to get the Facebook share price history from Quandl.

 

>>> =REQUESTS.GET("https://www.quandl.com/api/v3/datasets/WIKI/FB.json")
    "https://www.quandl.com/api/v3/datasets/WIKI/FB.json"

Now, this function simply returns the URL which might be a bit surprising. This URL represents the Quandl server response. It is a (nested) Json dictionary that we cannot represent easily in our two dimensional grid system. So instead we return a “handle” to the cached dictionary, which we can access through two other Excel functions.

Let’s see what data the response dictionary contains:

>>> =REQUESTS.DICT.KEYS("https://www.quandl.com/api/v3/datasets/WIKI/FB.json")
    {"Text";"StatusCode";"StatusDescription";"ContentType";"Method";"Json"}

REQUEST.DICT.KEYS gives us the list of dictionary fields. In this case it is the response dictionary with meta information on StatusCode, StatusDescription, ContentType and Method and the actual response in the “Text” field. And, in the Json case, the actual response parsed as… you guessed it, another dictionary.

If we go back to https://www.quandl.com/api/v3/datasets/WIKI/FB.json we can see, that the timeseries data we are after is in dataset/data, so let’s pull it out (you can see from the Json representation that it is an array, so bear in mind to do it as an Excel array formula):

>>> =REQUESTS.DICT.GET("https://www.quandl.com/api/v3/datasets/WIKI/FB.json", "Json/dataset/data")
    {"2016-09-12",125.96;"2016-09-09",129.71; ...}

Authentication

You might often need to pass authentication. If you happen to have a GitHub account, you need to authenticate to pull information about yourself. A quick look at the GitHub API docs reveals that we can use basic authentication whereby the authentication string is simply <username>:<token>:

>>> =REQUESTS.GET("https://api.github.com/user",,,"<username>:<token>")
    https://api.github.com/user

 

 

Custom Headers

If you’d like to add HTTP headers to a request, simply pass in a two-column range to the headers parameter (column one refers to the header name and column two to the header value).

For example, Appveyor uses a (very slightly) different authentication method, usually referred to as Bearer Token Authentication

>>> =REQUESTS.GET("https://ci.appveyor.com/api/projects",,{"Authorization","Bearer <token>"})
    https://ci.appveyor.com/api/projects
 

 

Excel Requests is an open source Addin for Excel written in C#
To get started, check out the docs – http://excel-requests.pathio.com/en/latest/
If you need help, contact me bjoern.stiel@pathio.com

 

Written by Bjoern Stiel
Minimalist, Ex-Investment Banker, Founder of Pathio – Version control for Excel
Email me bjoern.stiel@pathio.com, follow me @bjoernstiel

 

 

Another way to do bullet graphs in Excel

Those of you who know me, know I’m a huge fan of Stephen Few. (I mean, I had him autograph my copy of Information Dashboard Design!) Few is highly regarded as an expert on data visualization and dashboard design. He is the original designer of bullet graphs.

Bullet graphs are not native to Excel’s chart library. If you want to create a bullet graph, be prepared for some finagling with chart features, layouts, and color. Several Excel blogs have articles about creating bullet graphs. Here’s a list of my favorites:

How to Make Horizontal Bullet Graphs in Excel and  How to Make Vertical Bullet Graphs in Excel by Jon Peltier

How to Create Bullet Graphs To Replace Gauges in Excel by Charley Kyd

Excel Bullet Graphs by Chandoo

Bullet Graphs for Excel: A Simple Way? by Matt Grams

In addition, Michael Alexander’s book Excel 2007 Dashboards & Reports for Dummies has several great suggestions regarding bullet graphs. And you might also consider Sparklines in Excel, a free tool that can create bullet graphs (and much more) for you automatically. Finally, if you haven’t attempted making these graphs on your own before, I’d start with Jon Peltier’s tutorials. In my opinion, his are the best. Or, alternatively, you could just wait for my book to come out. (Great idea!) But I don’t blame you if you don’t feel like waiting. 

My solution

I’m ready to submit my own solution to creating bullet graphs in Excel. I’ll tell you now it’s an imperfect solution – and there are tradeoffs depending upon what you want to display. You’ll see what I mean in a moment. Basically, my version is a combination of Excel’s Sparklines features and in-cell bar charts.

Here’s how it works. First, you create a series of numbers like I’ve done in the graphic below. These numbers will help form the shaded part of my bullet chart. The basic rule for now is that you need three different groups of three different numbers.

image

Next, select the entire range and insert a new Sparkline Column Chart (in the Sparklines Group on the Insert tab). Choose a location off to the side in which to place your new chart to take advantage of the full length cell. You should have something that looks like the image below.

image

If white spaces appear between the columns in your cell, adjust the cell’s column width to a smaller size until the white spaces disappear.

Next, click on the cell so that the Design tab appears in the tab menu at the top. Check both the High Point and Low Point checkboxes in the Show group. Now, staying on the Design tab, select Marker Color (in the Style Group), pull up the High Point color selector and choose a dark shade of green. Click Marker Color again and this time select the Low Point color and choose a very light shade of green. Finally, click Sparkline Color and select a middle-level shade of green. You now see something like this:

image

From here, we’ll do some axes adjustments. From the Design tab, press the Axis dropdown and select custom value for the minimum value. Type 0 and press ENTER. Now, again, select a custom value but this time click on the maximum value. Type 1 and press ENTER. You should now see a graph like this:

image

OK. With our shaded regions defined, we’re now ready to do some experimentation.

For a simple bar, we can use the ol’ pipe symbol repetition method, like this. One cool feature of the pipe symbol method is that I can make the bar go in the other direction by changing the alignment from left-aligned to right-aligned. Make sure to try it out in the enclosed example file.

image

We could also use various symbols as well (as a variation on the on the pipe symbol method). See this article by Jon Peltier for ideas.

image

If we wanted to insert a target into the chart, we could also insert a line shape like this (now with new colors!):

image

I’m not a huge fan of this method because it’s not easy to automate. If you have a series of bullet charts all of which have a target of 100%, the line isn’t such a bad idea. You could draw a straight line down across the series of them. But the pipe symbol repetition and shaded regions are all easily defined and changed with formulas. For example, I went back to our original data range and replaced several three’s with two’s to increase the range of the middle region.

image

One way to make the target-line automated is to create the line out of the Sparkline columns. If I go back to my original data set, I can change it to look like this:

image

Note that I’ve inserted a one into the series of twos and left the remainder blank. (You don’t need the red font, I just wanted to draw your attention to the difference.) My graph will now look like this:

image

But if I change the Low Point’s marker color to black, I can make it look like this:

image 
If I move that one in the range, I can make it look like this:

image

Obviously, I lose the final shaded region with this method. But that might not be a big deal in the end. With some scales and design, the last shade is still visually implied:

image

One last alternative (see edit!), also not really a favorite, is to use the the pipe symbol for the target and setting the font to a strikethrough style, like this:

image

I don’t like this method as much because the thinness of the strikethrough line bothers me. But of all the methods presented so far, strikethrough doesn’t require I give up any regions. Also, note, the last region hasn’t been shaded but the region still feels somewhat there.

EDIT!

I don’t know what I was thinking when I posted the strikethrough method. If you use strikethrough, you won’t ever get to a point where there’s distance between your performance measure and your target. I’ve kept it in the in the example file so you can see what I mean. However, you can still get something similar to the strikethrough method by using the dash (“-“) character with the pipe symbol instead. 

Thoughts

I really haven’t said anything about scaling – and there are scaling issues! You’ll have to play around with your REPT() formula to figure out how many pipes (or other desired character) can fit in the cell. You’ll then have to figure how much one character unit represents. Finally, you’ll have to ensure your character units align correctly with your Sparkline Column units. For now, I leave that up to you. In the future, I plan to have more applied examples. Until then, you can download this file of examples:

In Cell Chart Examples.xlsx

As a bonus, I’ve included this in-cell box and whisker chart:

image

One last, final point. I’d love to hear your feedback. I still have yet to play around with these charts in full, so I’m not entirely convinced of their usefulness. If you hate them, let me know. If you love them, let me know. Or, if you have other ideas for creating in cell charts using Excel Sparklines, share ‘em!

Also, Stephen Few recently released a new edition of Information Dashboard Design: Displaying Data for At-a-Glance Monitoring, which I highly recommend.

Newspapers – Read Your Sources!

Update – Apologies for the bad proofreading job in the original blog post. Before writing this, I had spent the entire day writing a book proposal for a technical publisher. I was a bit burned-out.

It’s an exciting time for the United States. In the forthcoming days (possibly tomorrow), the Supreme Court of the United States will issue a ruling on whether all or part of the Affordable Care Act is constitutional. Surrounding the hype is an attempt by the media to figure out–that is, predict–the court’s ruling before it’s released this week. One factoid passed around is that a group of former Supreme Court law clerks and attorneys surveyed after arguments a few months ago suggest that the law will be overturned.

In an article by Business Insider, author Brett LoGiurato writes:

Most former Supreme Court attorneys and clerks believe that the individual mandate to buy health insurance — the signature provision of President Barack Obama’s Affordable Care Act — will soon be struck down as unconstitutional by the high court.

That’s according to a joint poll by the American Action Forum, Center Forward and Purple Strategies. An astounding 57 percent of clerks and attorneys polled now think that the mandate will be ruled unconstitutional. That’s up from just 35 percent in March, before the court held oral arguments on the case.

Read more: http://www.businessinsider.com/obamacare-unconstitutional-supreme-court-barack-obama-health-care-2012-6#ixzz1yko2eVxa

The pic below appears in the actual Business Insider article and is a snapshot from the actual survey.

I’m not sure how Business Insider misunderstood the study, especially because a picture of it (the very same picture above) appears in their article. Clearly the survey does not say that 57 percent of clerks and attorneys think the mandate will be ruled unconstitutional. Rather, it says the average probability given by all the respondents is 57 percent. But can we say that “most” of the survey respondents believe that the mandate will likely be struck down? No, that’s wrong, too. We really don’t have enough information to draw that conclusion. More on that point in a minute.

This article, by the Washington Post, gets it wrong too:

new poll of 56 former Supreme Court clerks finds that 57 percent think the individual mandate will be overturned. That’s a 22-point jump from the last time the same group of clerks was surveyed, right before oral arguments. Back then, 35 percent thought the court would toss out the required purchase of health insurance.

Ezra Klien (who loves charts) gets it wrong.

Here, the Wall Street Journal‘s Joseph Rago, too.

Did anyone get it right? Yes. Laura Green, of the Palm Beach Post, writes:

A poll of former Supreme Court clerks and lawyers who have argued before the justices found that many switched their initial prediction that the court would uphold the sweeping law. The oral argument performance led them to now speculate that there is a 57 percent chance the court will strike down the heart of the law, which requires virtually every American to buy insurance.

Read more: http://www.wptv.com/dpp/news/national/health-care-act-studies-of-oral-arguments-predict-court-will-rule-against-reform-this-week#ixzz1yksclk3m

What can we say about the survey’s results?

Not a ton. I question the usefulness of engaging folks on the probability that a dependent, nonrandom event will occur. When the poll’s respondents came up with their “probabilities,” they were really giving a score to how confident they are in their belief that the Supreme Court will strike down the mandate. Calling this a “probability” is misleading.

The average is above 50%–can’t we say that most believe the mandate will be struck down? 

Not really. Consider this scenario. Of the 56 respondents, nine said the probability of the Supreme Court overturning the mandate was 100%; that’s a rather slim, if certain, margin. Now consider that the rest — 47 respondents — giving the probability of the court overturning the mandate a 49%. In other words, the 47 feel the likelihood is slightly less certain than flipping a coin. Do the math; you’ll see that it averages to about 57%. Do those nine respondents in this example constitute a “most believe”?

Survey Source:
Affordable Care Act Survey, June 2012
Sponsored by: American Action Forum, Center Forward, and Purple Strategies

Easy One-Way Sensitivity Analysis on Weighted Sum Models in Excel (Part 2)

Wow, it’s been a while since my last article. I’ve actually been pretty busy the last few weeks putting the finishing touches on an Excel deliverable for one of my company’s clients. We delivered the product about a week ago and things have just now started to slow. So now it’s back to blogging.

If you’ve been following so far, we left off with Easy One-Way Sensitivity Analysis on Weighted Sum Models in Excel (Part 1). In that article, I describe a method for automating one-way sensitivity analysis, and, in Part 1, we construct the mechanism that drives the automation. However, where we left off, our sheet was a bit ugly. Sure, the mechanism worked, but it wasn’t a dashboard. We’ll talk about making a dashboard in this article, Easy One-Way Sensitivity Analysis on Weighted Sum Models in Excel (Part 2). If you’re scratching your head at this point (“what the heck is sensitivity analysis?”), go ahead and start with Part 1. There, I give a brief introduction to the mathematics and rational behind sensitivity analysis (and the weighted-sum model) before going into the Excel stuff. Also, these tutorials are a bit on the long side, so you may just want to do the analog thing and print them out instead of following along on here. 

I’ve been scratching my head as to how to segue from my last article to this one. I’ve decided to do what good chefs do and start with something already prepared. So go ahead and download this file: Healthcare Sensitivity Analysis Example.xlsx. You’re welcome, of course, to recreate the layout of that file, but I’ve done the heavy lifting for you. The core difference between this file and the one you created in the last blog post is that I’ve added many more countries (with fake data). Let’s get the lay of the land of this new file.

If you start with the Data tab you’ll something similar to our last blog post.

The table on the left isn’t all that different from the last file. If you play with the scrollbars, you’ll see that they work exactly as they did before. In the table to the right, each metric has more information broken out than in our last spreadsheet, but the mechanism we created before is exactly the same here (notwithstanding the extra information). Note, too, that the final weighted scores have been moved the front (that is, to the first column on the left) of the final table. You’ll see why at the end of this tutorial.

(The reason more info is broken out is because that info is used in the Healthcare Analysis dashboard in the Fun Downloads section. We won’t use those extra columns here, but if you get the hang of this stuff, consider creating a dropdown to individual weighted scores using that extra information. I might write about how to do that in a Part 3.)

Now click the Example tab. You should see the barebones of our dashboard. If you scroll all the way down (or zoom out) you should see another table below the dashboard called VLookup Table.

Dashboard 

Intermediate Table

This table is an important, if unnecessary, intermediate step. I say unnecessary because you could go without out, but I wouldn’t recommend it; the table makes life much easier. So, this is my advice: whenever you make a dashboard, you should have an intermediate table off-screen that summarizes most of the selection and number crunching for you. For the computer science and java geeks out there, think of this table as part of the model-view-controller framework. The dashboard is our view, or “reporting” layer – it reports data to the users, but it doesn’t allow them to make specific underlying changes to it (they can only change how the data is shown to them). This intermediate table acts kind of like a controller. It handles input stuff like scrolling, and it’s usually only for the Excel developer and not for the client or front-end user (that’s why it’s off-screen). Finally, the underlying data on the backend is our model. I admit that’s a pretty rough explanation, but it serves this very fundamental point to dashboard creation: you should separate the reporting, event handling, and underlying data areas on your spreadsheet. This separation mitigates damage caused by underlying and user-created errors and separates the logic of your work into distinct modules. In other words: it makes life easier.

Let’s go.

Step 1: Link the percentages from your data to your dashboard. 


Start by selecting cell I3 on the Example tab and link it to cell E4 on the Data tab, which holds its corresponding weight. Then go across the boxes on your Example tab and ensure that each cell is linked to its correct proportion on the data tab. Your numbers might be different from mine above, but if you do everything correctly, your mapping should look like this:

Health Level                → Data!E4
Responsiveness              → Data!E5
Financial Fairness          → Data!E6
Health Distribution         → Data!E7
Response Distribution       → Data!E8

It’s always good to double-check to make sure your references are correct. Check twice, reference once.

Step 2: Copy the scrollbars to your dashboard. Resize accordingly.

Remember those scrollbars from the last blog post? They’ll come in handy here. Go to the Data tab. While holding down Ctrl, select all five scroll bars by clicking each one individually. Press Ctrl+C to copy them. Next, go to your dashboard on the Example tab. Press Ctrl+V to paste to your dashboard.

You’ll now need to adjust the size of each scroll bar – then position it next to each number as shown below.
 

Pro Tip: You can make your life easier by selecting the first scrollbar on the left (select it with a Ctrl+Left-Click) on your dashboard. Move the control on top of, or near, the Health Level box and adjust it to the desired height. Use the arrow keys on your keyboard to fine tune its placement (you may need to adjust the height one more time). When you like the scrollbar’s size and position, right-click and select Format Control. Select the Size tab. Take note of its current Height and Width (write it down if you need). Click OK.

Now select all of the remaining scrollbars using the Ctrl+Left-Click as you did in Step 1. Right-click any one of the selected scrollbars, then select Format Control. In the Height and Width boxes enter the information you just noted. Click OK. Each scroll bar is now the right size. Move each scrollbar next to its associated number. Use the arrow keys to fine tune if you’re neurotic (like me!).

Lookin’ good….

Step 3: Create the scrolling indices to show each country.

On the Example tab, scroll all the way down to the VLookup table. In cell F32 of the Example tab you should see only the value “1” alone, by itself. In the cell below the 1 (F33), type =F32 + 1.

Hit Enter. Now drag cell F33 down until you reach the value of 15. Next, go to the Developer tab and insert another From Control scroll bar to the left of the column of numbers we just created. Just like in Step 1, we’ll link this scroll bar to a specific cell on the sheet; in this case, we’ll link it to the cell that held the 1, cell F32. So right-click the scroll bar, then select Format Control. Click the Control tab. In the cell-link box, select (or type) F32. Click OK.

Press the up and down arrows on the scroll bar to see the indices change.

Step 4: Use the indices to pull the final weight values from the Data tab.

Here, we’ll use the scrolling indices to pull information from the Data tab. We can use the Large() formula to ensure the data we pull is sorted. Use the scrollbar to scroll up so that there is only a 1 in F32. (If you see a zero in F32, you’ve scrolled too far. Scroll up one.) Now, in the cell to the right of it, type “=Large(“. Now select the entire column of Final Weights from your data tab, Data!G4:G53.

Hit F4 to make it an absolute reference. Type a comma to go to the next parameter. Now, go back on your Example tab, select the 1 (cell F32) to the left. Your formula for F3 should look like this:

=LARGE(Data!$G$4:$G$53,Example!F32)

What we’ve told Excel to do is pull the greatest value from the set of Final Weights. Hit Enter.

If we drag that formula all the way down, we are then telling Excel to pull the second largest value, third largest value, and so forth. Thus, the resulting values in the Largest Values column are always nth largest value, where n is the number in the cell to the left. That’s how we make an automated sorted list. 

Step 5: Fix the Scrollbar minimum and maximum values.

If you play with the scroll bar, you’ll see that scrolling all the way up or scrolling up the way down results in #NUM errors. This happens because we only have 50 countries to choose from. Indices less than one or greater than fifty are outside the bounds of our set. So right-click the scrollbar, select Format Control. Select the Control tab.

Right away, we know that the Minimum value should be 1. But if we have 50 countries in our set, what do you think the maximum value should be? Hint: it’s not 50. Remember, we’re only changing the value in cell F32, the rest of the values are calculated for us on the spreadsheet. So the maximum value should actually be 36. Why? Because while we have 15 different indices showing, one of them is the cell link and the other 14 are calculated on the spreadsheet. 50 – 14 = 36.

In Sum: 

Minimum Value: 1
Maximum Value: 36 

Click OK.

If you play with the scrollbar, you’ll see that it keeps the indices within the correct bounds.

Step 6: Use VLookup to take largest values to its corresponding information.

In the first cell to the right of the Largest Values column (cell H32), we’ll use the Weight Value to the left as a lookup value. Now, do you see why I moved the final weighted scores all the way to the left on the ata tab? It’s a lookup column now! So, in Cell H32, you’ll have something like the formula shown below. But rather than typing what I have, try to recreate the formula yourself. Then double-check to ensure that our work agrees. 

=VLOOKUP(G32,Data!$G$4:$R$53,2,FALSE)

(1) G32 is the weighted score;
(2) Data!$G$4:$R$53 is the entire table from the Data tab;
(3) 2 tells us to pull from the second column, that’s the name of the country we’re interested in; and,
(4) FALSE ensures an exact match.

But wait, the VLookup Table on the example tab has a column for each metric, and we’ve only filled in info for one column. We’re not just interested in the second column from the table on the data tab. We’re actually interested in columns 2, 4, 6, 8, 10, and 12 as demarcated in red below.


We can actually pull all that info out with only one VLookup. On your data tab, select H32. Now, rewrite the VLookup formula as follows:

=VLOOKUP(G32,Data!$G$4:$R$53,{2,4,6,8,10,12},FALSE) 

Note we have changed the formula from looking up only column 2 to looking up a set of columns. Ensure that you use the curly braces to surround your set of columns as shown above. Now drag the selected cell to the right until you reach column M, the end of the Vlookup table. With H32:M32 still selected, click into the formula bar, then press Ctrl+Shift+Enter. Viola! We use the Ctrl+Shift+Enter here because the VLookup is returning an array of numbers and not just one number. Now drag the selected row all the way down to fill the table. If you’ve done everything correct so far, your table should look like mine.

Use the scrollbar to see the values change.

Step 7: Map the values from the intermediate table to the dashboard.

First, select the scrollbar to the left of your VLookup Table. Press Ctrl+C to copy. Scroll all the way up to the dashboard. Now paste. Move this new scrollbar to a location on your screen that makes sense. You could put it between columns E and G. Or, you can put on the right, as I have, to the right of column O.

Looking at your dashboard, select cell G6. Set G6 to reference the nation at the top of the list in the VLookup table. So, G6 should have “=H32” as its formula. Now drag G6 down 14 rows. If you see one of your rows start showing a value of “0,” you’ve dragged too far. Now, while looking at your dashboard, select cell O6. Map this cell reference the first weighted score on your VLookup table. O6, then, should have the value “=G32.” Drag down. The preset bar charts should show up automatically.

Now play with every scrollbar on the dashboard.

***

This is a good stopping point for Part 2. I might be working on a Part 3 as time allows. If I end up writing a Part 3, we’ll talk about how to make those bar charts. And, we’ll talk about how to build another graph that shows how each country performed in just one metric. Lastly, we’ll talk about how to populate the top and bottom rankings.

If you’re really curious about the incell bar charts, you can see a good discussion on Chandoo’s site,
How to Visualize Survey Results using Incell Panel Charts.

Questions? Feel free to ask.