Upcoming Live Workshops and Courses in Los Angeles and Cincinnati

I have several great training sessions and workshops coming up. If you’re interested in receiving training for your organization, let me know.

Los Angeles, California, August 25-27th

Free wine and beer will be served at both classes!

Get the Picture: Intro to Data Visualization Tools With Microsoft Excel

August 25th, 2014, 7-9pm Whether you’re an account, controller, business analyst, manager, report writer, or IT professional, learning how to further leverage the power of Microsoft Excel and better understand the needs of the business user community can do you good. In this workshop, you’ll find out how to explore Excel’s best built-in data visualization tools that you can start using today – no programming required.

Register

The Art-Science of Excel Dashboards

August 27th, 2014, 7-9pm This informative, hands-on workshop is designed especially for business users of Microsoft Excel. You’ll create stunning, interactive Excel dashboards by leveraging concepts you’ve already mastered in Excel, and learn new formulas and simple macro techniques. We’ll also explore data science concepts and visualization theory with dashboard design best practices.

Register

 

Cincinnati, Ohio, October 30-31 (Tentatively)

Analytics in Excel, Two-Day Short Course

This course will introduce intermediate-to-advanced tools in Excel for analytics. We will cover data visualization topics that move beyond the basic charting tools in Excel. Descriptive analytics methods for analyzing data and generating meaningful insights will be covered using PivotTables, PivotCharts and other Excel tools. We will use Excel for predictive analytics by utilizing Excel’s regression tools and other forecasting capabilities.

Registration Details to follow.

Developing a JSON Excel Add-In With VB.NET

Written by Bjoern Stiel

Add-Ins are a great way to increase Excel’s firepower. Add-Ins not only add additional functionality but are also an ideal vehicle for distributing your custom functions across multiple workbooks and users.

If writing a new Add-In is on your to-do list anyway, this is the perfect opportunity to look beyond VBA. VBA has become a bit middle-aged. Integration with modern stuff like NoSQL databases or web services can become quite painful as it involves messing around with DLLs and cumbersome APIs.

Excel-DNA is a free open source tool that lets you expose .NET functions to Excel via the XLL interface. This gives you immediate access to all .NET languages (VB.NET, C#, F# etc.) and libraries. Which makes developing modern looking WPF forms or hooking into a JSON API really a breeze.

In this post, I show you how to get started with Excel-DNA. We hit the ground running busting out our first two functions and have a brief look at what to watch out for when migrating over from VBA. To wrap it all up, we build a slick Excel JSON client in just a few lines of code and show use it to retrieve data from a web server with a simple =OBJECT_GET(…) function call.

All source code is available for download at GitHub:
https://github.com/spreadgit/excel-json-client

An example spreadsheet is available at https://spreadgit.com/bjoern/excel-json-client.

1. Getting Started
There are two ways to write a User Defined Function in Excel-DNA. Either as a text file (.dna) or a .NET assembly (.dll). It’s easiest to start with the text file approach which we will do throughout this post. You need three things:
– a text editor such as Notepad
– the .NET Framework 2.0 runtime or later (.NET 4.0 or later if you are on Excel 64-bit), should already be installed with Windows anyway, so you probably do not need to worry about this bit
– the Excel-DNA XLL (https://exceldna.codeplex.com/releases/view/119190)

If you want to get jump-started, clone my git repository https://github.com/spreadgit/excel-json-client and you’re good to go, it comes with all batteries. Get the example workbook at https://spreadgit.com/bjoern/excel-json-client and open one of the XLLs (ending with 32 or 64 -depending whether you are on Excel 32- or 64-bit) via drag & drop, file/open or the Add-In manager.

2. Writing Our First Function
I’ve defined the code in the excel-json-client.dna file which is just a text file. As mentioned above, you can Visual Studio studio (which you probably should) but to avoid all the overhead, I’ll keep it simple in this post.

Let’s look at the DISTINCT function defined in the Helpers module. It gives you a first idea of how we pass around data between .NET and Excel. We receive a two-dimensional array of type Object (think Variant), filter out all duplicates and return another two dimensional Object array. I’m pretty sure there’s a more elegant way of removing the duplicates via LINQ (.NET’s Language Integrated Query language) but it does the job.

Public Module Helpers
 Public Function DISTINCT(data As Object(,))
   Dim dict As Dictionary(Of String, Integer) = New Dictionary(Of String, Integer)
   Dim key As String
   Dim i As Integer = 0, j As Integer = 0
 
   For i = 0 To data.GetUpperBound(0)
     key = ""
     For j = 0 To data.GetUpperBound(1)
       key += data(i, j).ToString()
     Next
     If Not dict.ContainsKey(key) Then
       dict.Add(key, i)
     End If
   Next
   Dim out(dict.Count-1, data.GetUpperBound(1)) As Object
   i = 0
   For Each key In dict.Keys
     For j = 0 To data.GetUpperBound(1)
       out(i, j) = data(dict.Item(key), j)
     Next
     i += 1
   Next
   DISTINCT = out
 End Function
End Module

3. Migrating From VBA to VB.NET
Migrating from VBA to VB.NET/Excel-DNA is relatively straight forward. But there are a couple of gotchas and things to watch out for, most importantly:
- Variant is no longer supported: use the Object type instead
- Function parameters of type Object will arrive as one of the following: Double, String, Boolean, ExcelDna.Integration.ExcelError, ExcelDna.Integration.ExcelMissing, ExcelDna.Integration.ExcelEmpty, Object[,] containing an array with a mixture of the above types, ExcelReference (only if AllowReference=true in ExcelArgumentAttribute causing R type instead of P)
- Arrays: In VBA, the default lower bound of an array dimension is 0 but can be set to 1 using “Option Base”. VB.NET does not support “Option Base” and the lower bound of an array dimension must be 0. Be careful though that most Office collections begin with 1. For a more exhaustive list, I definitely recommend checking out Patrick O’Beirne’s excellent guide and Govert’s (the guy who wrote and maintains Excel-DNA) documentation on Excel-DNA’s Data marshalling.

4. A JSON Excel client
JSON is all over the web and has become the number one choice for RESTful APIs. Pulling JSON from a web server onto Excel is definitely a useful thing to have. For example, at spreadgit, we run our entire admin-backend off Excel via a JSON API. Unfortunately, building a JSON client in VBA is quite a messy job and involves a lot of HTTP and deserialization stuff which does really feel like re-inventing the wheel.

With Excel-DNA however, we can use whatever is available in the .NET world. HTTP request classes are built in and for handling the JSON bit we use the excellent Json.NET library, written and maintained by James Newton-King’s. All we need to do is some plumbing work. Before we roll up sleeves, let’s briefly design the data flow.

1. User requests data from a URL (that returns a JSON) by calling the Excel function =OBJ_GET(http://myobject#field)
2. Let’s see if the object http://myobject already exists in our cache from a previous call. If not, do an HTTP request to http://myobject and stick it into our cache.
3. Return the value for “field” on object http://myobject (now definitely in our cache)

As you can see, we make use of the hash mark separator to identify a field within our JSON document. Let’s start with a thin JSON class. We want to deserializes the JSON data we get from the web server as a (nested) dictionary so we can easily gain access to its fields via keys.

Public Class Json
 Public Shared Function Deserialize(json As String) As Object
   Deserialize = ToObject(JToken.Parse(json))
 End Function

 Public Shared Function ToObject(token As Newtonsoft.Json.Linq.JToken) As Object
   If token.Type = JTokenType.Object Then
   Dim dict As Dictionary(Of String, Object) = New Dictionary(Of String, Object)
   Dim prop As Object
   For Each prop In CType(token, JObject).Properties()
     dict.Add(prop.Name, ToObject(prop.Value))
   Next
   ToObject = dict
   ElseIf token.Type = JTokenType.Array Then
     Dim lst As List(Of Object) = New List(Of Object)
     Dim value As Object
     For Each value In token
       lst.Add(ToObject(value))
     Next
     ToObject = lst
   Else
     ToObject = DirectCast(token, JValue).Value
   End If
   End Function
 End Class

Next thing on our list is the cache. It is simply another dictionary, whereby the url is our key (up to the hash mark separator as mentioned above) and the JSON deserialized dictionary our value.

Public Class Cache
 Private _objects As New Dictionary(Of String, Object)

 Public Sub Clear()
   _objects.Clear()
 End Sub

 Public Sub ImportObject(baseUrl As String)
   Dim wrGETURL As WebRequest
   wrGETURL = WebRequest.Create(baseUrl)
   Dim dictionary As New Dictionary(Of String, Object)
   Dim objStream As Stream
   objStream = wrGETURL.GetResponse.GetResponseStream()
   Dim objReader As New StreamReader(objStream)
   _objects.Add(baseUrl, Json.Deserialize(objReader.ReadToEnd))
   objReader.Close()
   objStream.Close()
 End Sub

 Public Function GetObject(url As String) As Object
   Dim baseUrl As String = Nothing
   baseUrl = url.Split("#")(0)
   If Not _objects.ContainsKey(baseUrl) Then
     ImportObject(baseUrl)
   End If
   GetObject = _objects.Item(baseUrl)
 End Function

 Public Function GetObjectPartial(url As String) As Object
   Dim baseUrl As String = Nothing, obj As Object
   Dim key As String
   baseUrl = url.Split("#")(0)
   Try
     obj = GetObject(baseUrl)
     If url.Contains("#") Then
       For Each key In url.Split("#")(1).Split("/")
         obj = obj.Item(key)
       Next
     End If
     GetObjectPartial = obj
   Catch ex as Exception
     GetObjectPartial = ex.Message
   End Try
 End Function`
End Class

That’s pretty much it. Last thing to do is to wrap it all up nicely in a couple of Excel functions. We use a slight trick to be able to work with complex objects: Anything we cannot resolve on a two-dimensional grid, we just return the URL of the object instead of the data. Let’s assume we retrieve a JSON document from http://localhost:8000/test.json that looks like this (which is a nested object):

  {
    'id': 1234,
    'object': {
      'name': 'test', 
      'data': [['date' 'value'], ['2014-04-01', 100], ['2014-04-02', 101]]
    }
  }

In this instance
=OBJECT_GET(“http://localhost:8000/test.json”) returns http://localhost:8000/test.json
=OBJECT_GET(“http://localhost:8000/test.json#id”) returns 1234
=OBJECT_GET(“http://localhost:8000/test.json#object” returns http://localhost:8000/test.json#object
=OBJECT_GET(“http://localhost:8000/test.json#object#data”) returns [['date' 'value'], ['2014-04-01', 100], ['2014-04-02', 101]]

 Public Module Objects
 Dim cache As Cache = New Cache()
 
 Public Function OBJECT_GET(url As String)
   Dim path As String() = Nothing
   Dim obj As Object
   Dim i As Long = 0, j As Long
   Dim row As Object, col As Object, item As Object

   obj = cache.GetObjectPartial(url)
   If TypeOf obj Is List(Of Object) Then
   If TypeOf CType(obj, List(Of Object))(0) Is List(Of Object) Then
   Dim o(,) As Object
   ReDim o(CType(obj, List(Of Object)).Count()-1, CType(obj, List(Of Object))(0).Count()-1)
   For Each row In obj
     j = 0
     For Each col In row
       o(i, j) = col
       j += 1
     Next
     i += 1
   Next
   obj = o
   Else
     Dim o() As Object
     ReDim o(CType(obj, List(Of Object)).Count()-1)
     For Each item In obj
       o(i) = item
       i += 1
     Next
     obj = o
   End If
   ElseIf TypeOf obj Is Dictionary(Of String, Object) Then
     obj = url
   End If
   OBJECT_GET = obj
 End Function
End Module

Let’s give it a try and access some Apple stock data via the quandl.com API. Feel free to download the example workbook at https://spreadgit.com/bjoern/excel-json-client. Take it for a spin and try with any other JSON API you can find.

Please note that the code above doesn’t cover all cases, for instance arrays of objects aren’t handled properly. Another nice thing to have is an object explorer UI to explorer the objects in your cache easily. Or adding POST, PUT and DELETE requests, all of which can be easily added. This gives you a powerful, generic Add-In that lets you hook into literally any JSON API.

How do you like using VB.NET via Excel-DNA?

About the Author
Bjoern Stiel
Founder and CEO at spreadgit, a version control system for Excel. As a former quant developer and trader, used to push Excel beyond its limits, now jack of all trades. You can find me online, tweeting about Excel, writing posts or developing .NET Add-Ins. Contact me @spreadgit or drop me a line bjoern@spreadgit.com.

A VBA Coding Manitesto (Part 2)

In the first part of my manifesto, I outlined why I think we should really change the way we code. The important takeaways of that article were:

(1) the Hungarian Notation coding style is old and should no longer be preferred;

(2) we cannot simply code for ourselves anymore; and

(3) the implications of the older coding style has made some programmers question whether VBA and Excel are really even still relevant anymore.

Before moving forward, I should be clear: Hungarian Notation is a symptom of the overall problem. That Hungarian Notation is still championed is a demonstration of how Excel development has been left behind as compared to other technologies. Below, we’ll go beyond Hungarian Notation and address many of the common perceptions of Excel development.

Nobody thinks of us as developers.

Right now, the United States (and, from what I understand, the rest of the world) has a shortage of computer science/data analysis professionals. In the next decade, the demand for developers will greatly outstrip the current supply. I know of many computer science professionals who won’t go into Excel development because they don’t view it as a professional development platform. I know of many accounting and finance professionals proficient with macros who would never think of themselves as developers.

The fact is many of these individuals could help address the workforce demands of the next decade. But there’s virtually no crossover: the computer science major decides not to use Excel to solve a problem, and the company he interns for purchases a large unstable product from a vendor—or just simply abandons the idea altogether. The accounting professional wants to link all her workbooks together with an Access database backend, but her boss remembers how macros crashed her machine several years ago and frowns on the idea. A sophomore in his Information Systems class silently questions everyday why anybody would ever use VBA.

These are anecdotes I’ve pulled from my own experience and those of my friends. And, to be sure, I’ve found organizations who understand quite well how useful Excel and VBA are. But such organizations are fewer these days (at least, in my experience). Excel is not taken as seriously as say Tableau or QlikView. And unless we do something, there’s little reason to expect this to change.

“It just works.”

It just works is the compatibility level Microsoft has committed to Visual Basic 6.0, the language on which VBA is based. Microsoft keeps VBA around because there’s still demand for it. But even they discourage the use of Hungarian Notation in other languages.

So why should we use or even prefer VBA? The typical answer is that it can do a lot of with little effort. In fact, if you ask folks about the future of Excel desktop and VBA, the standard answer is there will always be place for VBA (as there will always be a place for Excel desktop).

I’m not here to disagree. But there’s an undertone of defeatism here; we use VBA because it just works. We can’t imagine a world without it, because so many spreadsheets at big companies still make use of it (what will happen if it were eliminated?). People should know VBA in case their local VBA Expert decides to retire.

None of these points however promote the objective advantages of VBA. Rather they point to the fact that VBA still exists, and then conclude the Office suite is much better with it than without. It’s really a lack of imagination on our part to assume something better couldn’t come to replace it, or even Excel.

I think these arguments represent a sad state of affairs. So long as VBA is viewed as the arbitrary (and old) language in which we must operate, we’re not going to progress. There are so many VBA coders who won’t ever touch a lick of C# or C++ because they’re scared of anything C. Some of these coders even do VB.net but won’t touch C# although the semantic differences between VB.net and C# are almost trivial.

So long as we encourage people to code in methods that have not progressed since 1998, I’m not sure there is an objective advantage to learning Visual Basic for Applications. Many people start their programming with VBA… and end there. If we encouraged people to use Option Explicit for instance, they’d be well armed to advance to other languages. If we encouraged folks to write code in a style similar to .net, the transition to it becomes almost seamless. And, given the shortage of developers required, I think it’s a goal worth pursuing.

Coding for ourselves no longer

No longer should we write code just for ourselves. As long as we do that, every accounting or finance office is going to freak out when their VBA expert leaves. Why should they go through this?

At my first job in 2007, the macros being used had not been updating since Office 2000. At first I was told not to touch any of the code because they were afraid what would happen. Once I convinced them there was a better way, I had to go through and replace all the previous code? The way the code had been written previously was almost entirely unreadable. Like I said: we’ve been taught to code for ourselves. And the prevailing thought is, if my code is working fine, then there’s no problem. Why fix what ain’t broke?

Because the entire system is broken. If the person who comes to replace you as the VBA expert can’t figure out what’s going on, then your code is not good code. Consider the state of modern languages: they accommodate team coding and reuse by a community. They use versioning tools like SVN and Git. (To my knowledge, no versioning tools have written to do the same with office files that use VBA even though it desperately needs it.)

In the Excel TV episode in which I argued with Bill Jelen about using VBA, Oz argued that everyone’s programming experience is personal. But since when has that been true? Programming isn’t a pInterest page. I’m not saying there aren’t different styles and preferences worth considering. But no other language platform operates under the idea that programming a serious Excel application for a large financial institution should be as personal as one’s Facebook. We should have standards—modern standards.

Comments won’t save you

Isn’t explaining what you’re code is doing what comments are for? Sure, but comments are supplements to your code. They were never intended—and should no longer be thought of—as the sole explanation of what you’re code is doing.

We’ll go into comments a bit more in the part 3. For now, think about them and where they might fit into everything I’ve presented so far.

Conclusion

As you can see, my beef isn’t just with Hungarian Notation. Hungarian Notation is a coding style that encourages personal tastes because there’s no practical standard. In addition, it encourages unreadable coding styles.

A larger effect of which Hungarian Notation is a part is that Excel developers aren’t taken seriously. Part of the problem is writing code that only really one person—that’s you, the developer—understands. By contrast, a feature of modern languages is coding for others, especially for a community at large. That many coders write in a way that only they can understand is a hindrance both to businesses and to the advancement of the community as a whole.

In part 3 of this exciting serious, we’ll go into the role of comments. I’ll also provide a new way of thinking about coding, which I refer (perhaps incorrectly) as semantic coding. Finally, we’ll end with a rather mundane example.

 

 

A VBA Coding Manifesto (Part 1)

A little while ago, I made a blog post on why I think we no longer need Hungarian Notation. Truthfully, it’s not the first time I’ve advocated for doing something against the grain. My blog on the principles for app/game development with Excel argues you should keep most of your procedures in the sheet object (versus a module), and you should use active objects (like ActiveSheet, ActiveCell etc) sparingly. At the MVP Summit, Roger Govier told me I should challenge everything. So I am.

This morning, the esteemed Excel expert, Dick Kusleika of Daily Dose of Excel, wrote a response post to my rant against Hungarian Notation in The Great Hungarian Debate. I started writing my response in the comments. But I’m terrible at brevity. So I’m going to post my response here. My response started small, but it ultimately became a manifesto. So I’m breaking it up into parts. This, of course, was my intention all along. However, I was still working on drafting the rest of series. Dick’s response has forced me to get it all out there, which I’m actually very thankful for.

Not that long ago, in the same galaxy as this one…

Let’s travel back in time to January 30th, 2007—the date Office 2007 was released. I remember I was working as a junior auditor in 2007 for a federal auditing agency. Near the end of the year, we upgraded our Office suite from 2003 to 2007.

clip_image002

For those who remember, Office 2003 was driven by menus and toolbars (above). Office 2007 brought a dramatic change replacing menus and toolbars with ribbons (Office 2013 shown below).

clip_image004

I vividly remember the response in my office to the upgrades. Nobody liked the change. Nothing was where they had remembered it. At the time, I was the Excel expert in the office (and likely Microsoft’s only defender of the new ribbons). Nobody really bought my arguments; that everything was easier to find, that we could really showcase our internal add-ins on a new ribbon (at the time, people in the office had trouble finding our internal toolbar). I had been there for only less than a year—so I was easily dismissed as a young pisher. The veteran auditors didn’t want to learn anything new. Once developers on the internet began releasing those ribbon add-ins that recreated the layout of Office 2003, many in our regional office flocked to them.

In 2007, if I had disparaged the new ribbon, it would have been in likeminded company. Could the same be said today in 2014? Would anybody be so bold as to defend that old menu system? Is there anyone reading this right now who will take up the mantle to breathlessly defend the layout of the old Office? Anyone? Now’s your chance.

***

Hopefully you see where I’m going with all of this. Visual Basic for Applications is a dialect of Visual Basic 6.0, which was discontinued by Microsoft in 1998. That’s more than a decade ago. Most modern languages have all but abandoned Hungarian Notation,

And while I greatly respect the opinions of the Excel experts and MVPSswho paved the way for my success—and whose work I owe a great debt of gratitude—the fact remains no argument presented so far has objectively addressed my points. If I may be so bold, I’ll sum up the typical arguments against the motion:

1. I can read my code so what’s the big deal?

2. I’ve been doing this for 25 years, if I changed now, I won’t be able to understand my code.

3. I’m stubborn, why should I change now?

If these arguments sound familiar, it’s because they’re fundamentally the same as the ones I heard from those veteran auditors. Nobody should fault those who prefer familiarity over change. It’s worth noting detractors of 2007’s ribbon were helpful to the Office product team. Therefore, I’m not arguing that everyone who is against change is fundamentally on the wrong side of things. My point is to demonstrate that problems persist with Hungarian Notation no amount of familiarity can overcome. And they same could have been said about 2007’s new system.

So let’s consider the idea of familiarity. Does familiarity address confusion over abbreviations between programmers? Again, a textbox in my code, might be txbInput. Yours might be txtInput. In my previous post, I presented code which showed such confusion. So I ask you: Which of these conventions is right? Moreover: What is the point of a coding convention if there isn’t ubiquity?

Near the end of his article, Dick argues

Another advantage of data type prefixing is being able to use reserved words. For my experiments, if I want to use a reserve word I’m going to tack on an underscore. When I want to code Dim lEnd As Long, I will instead use Dim End_ As Long.

Here again we have the problem that Hungarian Notation encourages: both lEnd and End_ are not good variable names (no offense). What does End_ do? As I argued in my original post, we need to focus on descriptive variable names. Hungarian Notation encourages the use of inelegant descriptors.

Hopefully, you now see what the resistance to go beyond Hungarian Notation means in practice: namely, we write code that only we can understand. But if the code works, how one choose’s to write it amounts to a hill of beans, right?

I don’t think so. If you want to keep doing what you’ve done in the past, well, that’s up to you. But hopefully we can agree it’s time to move beyond 1998. Even if we can’t bring ourselves to break old habits, we should encourage others not to do what we do. Yes, that may be hypocritical, but that’s what I meant by the “good bye.” It may be too late for us to change, but we should the lay the groundwork. We need to transition from an old model to a new one. I’m not the only one who feels this way. Rob Collie’s Modern Excel movement makes similar claims about moving toward modern development. Even as he and I might readily disagree on what that future ought to look like, the point is made: just because we’ve done something the same way for a long time does not mean it’s better.

Even as people complained back in 2007, there was an acceptance change was coming. Those ribbon toolbars, which mimicked the old layout, were pyrrhic victories at best. Eventually we had to accept the ribbon changes whether we liked them or not. This simple inevitability, I believe, is why few if any will still defend the old menu system today. Objectively, the ribbon toolbar was (and still is) better for a variety of reasons. But what most helped us see the light is that we had no choice but to accept it!

VBA isn’t going anywhere, right? At least, that’s the perception from the inside. But from the outside vantage point, VBA has already been left behind. So my views on why we should code differently are more than just preference. It’s a struggle for relevance.

And, in the next exciting chapter you’ll find out why.