Author Archives: bjoernstiel

About bjoernstiel

Python expert. Minimalist. Founder of xltrail (https://www.xltrail.com )

Excel Requests – HTTP for Humans

Written by Bjoern Stiel

APIs are the lifeblood of the internet, providing access to bits and bytes that make our personal and professional worlds work.

For example, Quandl provide a vast amount of financial and economic data through their API, while GitHub and StackOverflow let you pull out all sort of user and repo data through their API.

But how do we communicate with these APIs… from Excel?

Working with HTTP with all its different authentication methods and custom headers can be painful. Plus, Excel doesn’t play nicely with Json. Sure, VBA is always an option (VBA-Web seems a very decent project) and Power Query has been a huge leap forward, but neither of them are ideal if you need something quick or programmatic.

As a seasoned Pythonista, I am a big fan of Kenneth Reitz’ Requests package as it massively simplifies HTTP communication in the Python world. This inspired me to write Excel Requests, a dead simple Excel Addin to make Json over HTTP from Excel a walk in the park.

Installation

Installing Excel Requests is fairly simple, download the PowerShell installation script from the docs and run it (right click, Run with PowerShell).

Making your first Request

Start a new Excel (in case you haven’t already done so after the installation) and let’s make a GET request to get the Facebook share price history from Quandl.

 

>>> =REQUESTS.GET("https://www.quandl.com/api/v3/datasets/WIKI/FB.json")
    "https://www.quandl.com/api/v3/datasets/WIKI/FB.json"

Now, this function simply returns the URL which might be a bit surprising. This URL represents the Quandl server response. It is a (nested) Json dictionary that we cannot represent easily in our two dimensional grid system. So instead we return a “handle” to the cached dictionary, which we can access through two other Excel functions.

Let’s see what data the response dictionary contains:

>>> =REQUESTS.DICT.KEYS("https://www.quandl.com/api/v3/datasets/WIKI/FB.json")
    {"Text";"StatusCode";"StatusDescription";"ContentType";"Method";"Json"}

REQUEST.DICT.KEYS gives us the list of dictionary fields. In this case it is the response dictionary with meta information on StatusCode, StatusDescription, ContentType and Method and the actual response in the “Text” field. And, in the Json case, the actual response parsed as… you guessed it, another dictionary.

If we go back to https://www.quandl.com/api/v3/datasets/WIKI/FB.json we can see, that the timeseries data we are after is in dataset/data, so let’s pull it out (you can see from the Json representation that it is an array, so bear in mind to do it as an Excel array formula):

>>> =REQUESTS.DICT.GET("https://www.quandl.com/api/v3/datasets/WIKI/FB.json", "Json/dataset/data")
    {"2016-09-12",125.96;"2016-09-09",129.71; ...}

Authentication

You might often need to pass authentication. If you happen to have a GitHub account, you need to authenticate to pull information about yourself. A quick look at the GitHub API docs reveals that we can use basic authentication whereby the authentication string is simply <username>:<token>:

>>> =REQUESTS.GET("https://api.github.com/user",,,"<username>:<token>")
    https://api.github.com/user

 

 

Custom Headers

If you’d like to add HTTP headers to a request, simply pass in a two-column range to the headers parameter (column one refers to the header name and column two to the header value).

For example, Appveyor uses a (very slightly) different authentication method, usually referred to as Bearer Token Authentication

>>> =REQUESTS.GET("https://ci.appveyor.com/api/projects",,{"Authorization","Bearer <token>"})
    https://ci.appveyor.com/api/projects
 

 

Excel Requests is an open source Addin for Excel written in C#
To get started, check out the docs – http://excel-requests.pathio.com/en/latest/
If you need help, contact me bjoern.stiel@pathio.com

 

Written by Bjoern Stiel
Minimalist, Ex-Investment Banker, Founder of Pathio – Version control for Excel
Email me bjoern.stiel@pathio.com, follow me @bjoernstiel

 

 

Advertisements

Developing a JSON Excel Add-In With VB.NET

Written by Bjoern Stiel

Add-Ins are a great way to increase Excel’s firepower. Add-Ins not only add additional functionality but are also an ideal vehicle for distributing your custom functions across multiple workbooks and users.

If writing a new Add-In is on your to-do list anyway, this is the perfect opportunity to look beyond VBA. VBA has become a bit middle-aged. Integration with modern stuff like NoSQL databases or web services can become quite painful as it involves messing around with DLLs and cumbersome APIs.

Excel-DNA is a free open source tool that lets you expose .NET functions to Excel via the XLL interface. This gives you immediate access to all .NET languages (VB.NET, C#, F# etc.) and libraries. Which makes developing modern looking WPF forms or hooking into a JSON API really a breeze.

In this post, I show you how to get started with Excel-DNA. We hit the ground running busting out our first two functions and have a brief look at what to watch out for when migrating over from VBA. To wrap it all up, we build a slick Excel JSON client in just a few lines of code and show use it to retrieve data from a web server with a simple =OBJECT_GET(…) function call.

All source code is available for download at GitHub:
https://github.com/Pathio/excel-json-client

An example spreadsheet is available at https://spreadgit.com/bjoern/excel-json-client.

1. Getting Started
There are two ways to write a User Defined Function in Excel-DNA. Either as a text file (.dna) or a .NET assembly (.dll). It’s easiest to start with the text file approach which we will do throughout this post. You need three things:
– a text editor such as Notepad
– the .NET Framework 2.0 runtime or later (.NET 4.0 or later if you are on Excel 64-bit), should already be installed with Windows anyway, so you probably do not need to worry about this bit
– the Excel-DNA XLL (https://exceldna.codeplex.com/releases/view/119190)

If you want to get jump-started, clone my git repository https://github.com/Pathio/excel-json-client and you’re good to go, it comes with all batteries. Get the example workbook at https://spreadgit.com/bjoern/excel-json-client and open one of the XLLs (ending with 32 or 64 -depending whether you are on Excel 32- or 64-bit) via drag & drop, file/open or the Add-In manager.

2. Writing Our First Function
I’ve defined the code in the excel-json-client.dna file which is just a text file. As mentioned above, you can Visual Studio studio (which you probably should) but to avoid all the overhead, I’ll keep it simple in this post.

Let’s look at the DISTINCT function defined in the Helpers module. It gives you a first idea of how we pass around data between .NET and Excel. We receive a two-dimensional array of type Object (think Variant), filter out all duplicates and return another two dimensional Object array. I’m pretty sure there’s a more elegant way of removing the duplicates via LINQ (.NET’s Language Integrated Query language) but it does the job.

Public Module Helpers
 Public Function DISTINCT(data As Object(,))
   Dim dict As Dictionary(Of String, Integer) = New Dictionary(Of String, Integer)
   Dim key As String
   Dim i As Integer = 0, j As Integer = 0
 
   For i = 0 To data.GetUpperBound(0)
     key = ""
     For j = 0 To data.GetUpperBound(1)
       key += data(i, j).ToString()
     Next
     If Not dict.ContainsKey(key) Then
       dict.Add(key, i)
     End If
   Next
   Dim out(dict.Count-1, data.GetUpperBound(1)) As Object
   i = 0
   For Each key In dict.Keys
     For j = 0 To data.GetUpperBound(1)
       out(i, j) = data(dict.Item(key), j)
     Next
     i += 1
   Next
   DISTINCT = out
 End Function
End Module

3. Migrating From VBA to VB.NET
Migrating from VBA to VB.NET/Excel-DNA is relatively straight forward. But there are a couple of gotchas and things to watch out for, most importantly:
– Variant is no longer supported: use the Object type instead
– Function parameters of type Object will arrive as one of the following: Double, String, Boolean, ExcelDna.Integration.ExcelError, ExcelDna.Integration.ExcelMissing, ExcelDna.Integration.ExcelEmpty, Object[,] containing an array with a mixture of the above types, ExcelReference (only if AllowReference=true in ExcelArgumentAttribute causing R type instead of P)
– Arrays: In VBA, the default lower bound of an array dimension is 0 but can be set to 1 using “Option Base”. VB.NET does not support “Option Base” and the lower bound of an array dimension must be 0. Be careful though that most Office collections begin with 1. For a more exhaustive list, I definitely recommend checking out Patrick O’Beirne’s excellent guide and Govert’s (the guy who wrote and maintains Excel-DNA) documentation on Excel-DNA’s Data marshalling.

4. A JSON Excel client
JSON is all over the web and has become the number one choice for RESTful APIs. Pulling JSON from a web server onto Excel is definitely a useful thing to have. For example, at spreadgit, we run our entire admin-backend off Excel via a JSON API. Unfortunately, building a JSON client in VBA is quite a messy job and involves a lot of HTTP and deserialization stuff which does really feel like re-inventing the wheel.

With Excel-DNA however, we can use whatever is available in the .NET world. HTTP request classes are built in and for handling the JSON bit we use the excellent Json.NET library, written and maintained by James Newton-King’s. All we need to do is some plumbing work. Before we roll up sleeves, let’s briefly design the data flow.

1. User requests data from a URL (that returns a JSON) by calling the Excel function =OBJ_GET(http://myobject#field)
2. Let’s see if the object http://myobject already exists in our cache from a previous call. If not, do an HTTP request to http://myobject and stick it into our cache.
3. Return the value for “field” on object http://myobject (now definitely in our cache)

As you can see, we make use of the hash mark separator to identify a field within our JSON document. Let’s start with a thin JSON class. We want to deserializes the JSON data we get from the web server as a (nested) dictionary so we can easily gain access to its fields via keys.

Public Class Json
 Public Shared Function Deserialize(json As String) As Object
   Deserialize = ToObject(JToken.Parse(json))
 End Function

 Public Shared Function ToObject(token As Newtonsoft.Json.Linq.JToken) As Object
   If token.Type = JTokenType.Object Then
   Dim dict As Dictionary(Of String, Object) = New Dictionary(Of String, Object)
   Dim prop As Object
   For Each prop In CType(token, JObject).Properties()
     dict.Add(prop.Name, ToObject(prop.Value))
   Next
   ToObject = dict
   ElseIf token.Type = JTokenType.Array Then
     Dim lst As List(Of Object) = New List(Of Object)
     Dim value As Object
     For Each value In token
       lst.Add(ToObject(value))
     Next
     ToObject = lst
   Else
     ToObject = DirectCast(token, JValue).Value
   End If
   End Function
 End Class

Next thing on our list is the cache. It is simply another dictionary, whereby the url is our key (up to the hash mark separator as mentioned above) and the JSON deserialized dictionary our value.

Public Class Cache
 Private _objects As New Dictionary(Of String, Object)

 Public Sub Clear()
   _objects.Clear()
 End Sub

 Public Sub ImportObject(baseUrl As String)
   Dim wrGETURL As WebRequest
   wrGETURL = WebRequest.Create(baseUrl)
   Dim dictionary As New Dictionary(Of String, Object)
   Dim objStream As Stream
   objStream = wrGETURL.GetResponse.GetResponseStream()
   Dim objReader As New StreamReader(objStream)
   _objects.Add(baseUrl, Json.Deserialize(objReader.ReadToEnd))
   objReader.Close()
   objStream.Close()
 End Sub

 Public Function GetObject(url As String) As Object
   Dim baseUrl As String = Nothing
   baseUrl = url.Split("#")(0)
   If Not _objects.ContainsKey(baseUrl) Then
     ImportObject(baseUrl)
   End If
   GetObject = _objects.Item(baseUrl)
 End Function

 Public Function GetObjectPartial(url As String) As Object
   Dim baseUrl As String = Nothing, obj As Object
   Dim key As String
   baseUrl = url.Split("#")(0)
   Try
     obj = GetObject(baseUrl)
     If url.Contains("#") Then
       For Each key In url.Split("#")(1).Split("/")
         obj = obj.Item(key)
       Next
     End If
     GetObjectPartial = obj
   Catch ex as Exception
     GetObjectPartial = ex.Message
   End Try
 End Function`
End Class

That’s pretty much it. Last thing to do is to wrap it all up nicely in a couple of Excel functions. We use a slight trick to be able to work with complex objects: Anything we cannot resolve on a two-dimensional grid, we just return the URL of the object instead of the data. Let’s assume we retrieve a JSON document from http://localhost:8000/test.json that looks like this (which is a nested object):

  {
    'id': 1234,
    'object': {
      'name': 'test', 
      'data': [['date' 'value'], ['2014-04-01', 100], ['2014-04-02', 101]]
    }
  }

In this instance
=OBJECT_GET(“http://localhost:8000/test.json&#8221;) returns http://localhost:8000/test.json
=OBJECT_GET(“http://localhost:8000/test.json#id&#8221;) returns 1234
=OBJECT_GET(“http://localhost:8000/test.json#object&#8221; returns http://localhost:8000/test.json#object
=OBJECT_GET(“http://localhost:8000/test.json#object#data&#8221;) returns [[‘date’ ‘value’], [‘2014-04-01’, 100], [‘2014-04-02’, 101]]

 Public Module Objects
 Dim cache As Cache = New Cache()
 
 Public Function OBJECT_GET(url As String)
   Dim path As String() = Nothing
   Dim obj As Object
   Dim i As Long = 0, j As Long
   Dim row As Object, col As Object, item As Object

   obj = cache.GetObjectPartial(url)
   If TypeOf obj Is List(Of Object) Then
   If TypeOf CType(obj, List(Of Object))(0) Is List(Of Object) Then
   Dim o(,) As Object
   ReDim o(CType(obj, List(Of Object)).Count()-1, CType(obj, List(Of Object))(0).Count()-1)
   For Each row In obj
     j = 0
     For Each col In row
       o(i, j) = col
       j += 1
     Next
     i += 1
   Next
   obj = o
   Else
     Dim o() As Object
     ReDim o(CType(obj, List(Of Object)).Count()-1)
     For Each item In obj
       o(i) = item
       i += 1
     Next
     obj = o
   End If
   ElseIf TypeOf obj Is Dictionary(Of String, Object) Then
     obj = url
   End If
   OBJECT_GET = obj
 End Function
End Module

Let’s give it a try and access some Apple stock data via the quandl.com API. Feel free to download the example workbook at https://spreadgit.com/bjoern/excel-json-client. Take it for a spin and try with any other JSON API you can find.

Please note that the code above doesn’t cover all cases, for instance arrays of objects aren’t handled properly. Another nice thing to have is an object explorer UI to explorer the objects in your cache easily. Or adding POST, PUT and DELETE requests, all of which can be easily added. This gives you a powerful, generic Add-In that lets you hook into literally any JSON API.

How do you like using VB.NET via Excel-DNA?

About the Author
Bjoern Stiel
Founder of Pathio (formerly known as spreadgit), a GitHub like version control platform for Excel. As a former quant developer and trader, used to push Excel beyond its limits, now jack of all trades. You can find me online, tweeting about Excel, writing posts or developing .NET Add-Ins. Contact me @path_io or drop me a line bjoern.stiel@pathio.com.