Microsoft Professional Program Artificial Intelligence

Building on the momentum† of completing the Data Science track of the Microsoft Professional Program, and inspired by the amazing season 2 of Westworld, I have now also completed the Artificial Intelligence track, Microsoft’s internal AI course just opened to the public. This combines theory with Python programming (no R option this time sadly) for deep learning (DL) and reinforcement learning (RL), leading up to a Capstone project, which I completed with Keras and CNTK, scoring 100% this time. Of the 4 available optional courses, I chose Natural Language Processing. The track also includes a course on the ethical implications of AI/machine learning/data science, something that should be mandatory for the employees of certain companies…


I had had some exposure to neural nets earlier but this was my first encounter with RL, and that was easily my favourite and the most rewarding part, and definitely something I want to explore further, with tools like OpenAI Gym. A fair amount of independent reading is needed to answer the assessment questions in this and the other more advanced courses; obviously I was not looking to be spoon-fed but it would have been better for the material to be self-contained. Rumsfeld’s Theory applies here: if you don’t know what you don’t know, how can you assess the validity or currency of an external source? For example, what has changed in Sutton & Barto between the 1st edition (1998) and the 2nd (October 2018, so not actually published yet!), and which one was the person who set the assessment questions reading? Or the latest edition of Jurafsky & Martin? Many students raised this concern in the forum and the edX proctor said they were taking the feedback on board, so perhaps by the time any readers of this blog come to it, it will be improved. The NLP course was particularly bad for this; I wonder if something was missed when MS reworked the courses for an external audience? So frustrating when it is such an interesting subject!

Obviously there is not the depth of theory in these relatively short courses to do academic research in the field of AI. Each of the later courses (7-9) takes a few weeks but to go fully in depth would take a year or more. But there is certainly enough to understand how the relevant maths corresponds to and interacts with the moving parts, to confidently identify situations or problems DL and RL could be applied to, and to subsequently implement and operationalize a solution with open source tooling, Azure, or both. Overall I am pretty happy with the experience. I learnt an awful lot, have plenty of avenues in addition to RL mentioned previously to go on exploring, and have picked up both a long term foundation and some skills that are immediately useful in the short term. Understanding the maths is so important to be able to develop intuition, and is an investment that will continue to pay off even as the technologies change. Working on this part time over several months, I am very conscious that a lot of this stuff is quite “use it or lose it”, so I will need to maintain the momentum and internalize it all properly. For my next course I think I’ll do Neuronal Dynamics or maybe something purely practical.

Oh, and I previously mentioned that I had finally upgraded my late-2008 Macbook Pro to a Surface Laptop. The lack of a discrete GPU‡ on this particular model means that the final computation for the Capstone took about an hour to complete… On an NC6 instance in Azure I am seeing speedups of 4-10× on the K80, which is actually less than I had expected, but still pretty good and I expect the gap would open up with larger datasets. I think I will stick with renting a GPU instance for now, until my Azure bill indicates it’s time to invest in a desktop PC with a 1080; I’m just not sure that it makes sense on a laptop. Extensive use is made in these courses of Jupyter Notebook, which when running locally is pretty clunky compared to the MathCAD I remember using as a Mech Eng undergrad in the ’90s, but there is no denying that Azure Notebooks is very convenient, and it’s free!


It begins with the birth of a new people, and the choices they’ll have to make and the people they will decide to become.

Did I mention that I am obsessed with Westworld?

† A 3-course overlap/headstart!

‡ PlaidML is nearly 2x as fast as CNTK on the same processor with integrated GPU, but gave lower accuracy in my experiments, so you need more epochs anyway; it depends where the lines cross for your specific hardware and workload.

Posted in AI, azure, C++, Cloud, data science, edx, Microsoft, Python, R

Not-learning is a skill too

To be successful in tech, it’s well known that you must keep your skills up to date. The onus is on each individual to do this; no-one will do it for you, and companies that provide ongoing personal development are few and far between. Many companies would rather “remix our skills”, which means laying off workers with one skill (on statutory minimum terms) and hiring people with the new skill. This is short-termist in the extreme; the new workers are no better than the old, they just happened to enter the workforce later, and the churn means there is no accumulation of institutional knowledge. If you were one of the newer workers, why would you voluntarily step onto this treadmill? And if you were a client, why would you hire such a firm when it provides no value-add over just hiring the staff you need yourself? Anyway, I digress.

It is clear that C++11 was an enormous improvement over C++98. The list of new features is vast and all-encompassing, yet at the same time, backwards compatibility is preserved. You can have all the benefits of the new while preserving investment in the old (“legacy”). Upgrading your skills to C++11 was a very obvious thing to do, and because of the smooth transition, you could make quick wins as you brought yourself up to speed. That is just one example of the sort of thing I am talking about. You still need to put the effort in to learn it and seek out opportunities to use it, but the path from the old to the new is straightforward and there are early and frequent rewards along the way, and from there to C++14, 17, 20…

But I look around the current technology landscape and I see things that are only incremental improvements on existing programming languages or technologies and yet require a clean break with the past, which in practice means not only learning the new thing, but also rebuilding the ecosystem and tooling around it, porting/re-writing all the code, encountering all new bugs and edge cases, and rediscovering the design patterns or new idioms in the language. The extent to which the new technology is “better” is dwarfed by the effort taken to use it, so where is the improved productivity coming from? Every project consists of either learning the language as you go, or maintaining and extending something written by someone who was learning the language as they went, perhaps gambling on getting in on the ground floor of the next big thing. The paradox is that things only get big if people stick with them!

So I am pretty comfortable with my decision to mostly ignore lots of new things, including but not limited to Go, Rust, Julia, Node.js, Perl6 in favour of deepening my skills in C++, R, Python and pushing into new problem domains (e.g. ML/AI) with my tried and trusted tools. When something comes along that is a big enough leap forward over any of them, of course I’ll jump – just like I did when I learnt Java in 1995 and was getting paid for it the same year! I had a lot of fun with OCaml, Haskell and Scala too, but none of them gained significant traction in the end. I don’t see anything on the horizon; all the cutting edge stuff is appearing as libraries or features for my “big 3” while the newer ecosystems are scrambling to backfill their capabilities and will probably never match the breadth and depth, before falling out of fashion and fading away. I’ll be interested in any comments arguing why I’m wrong to discount them, or any pointers to things that are sufficiently advanced to be worth taking a closer look at.

Posted in C++, data science, Haskell, Ocaml, Python, R

Blockchain 101

  1. If you are a developer who uses Git and knows what fast-forwards are and when and when not to use them, you already know literally everything there is to know about distributed/decentralised ledgers.
  2. A blockchain controlled by a single organisation is just a really crappy database. And if you wanted a really crappy database for some reason, you might as well just use MongoDB†.
  3. There is no 3. That’s everything. A consulting firm will charge you a million dollars and not give you advice as good as this. You’re welcome!
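To make the Git comparison in point 1 concrete, here is a minimal sketch of a hash-linked ledger that only accepts fast-forward updates, much as `git merge --ff-only` does. The names `Ledger`, `commit`, `clone` and `fast_forward` are made up for illustration; this is a toy under those assumptions, not how any particular blockchain is implemented.

```python
import hashlib
import json


def block_hash(parent, payload):
    """Hash a block from its parent's hash and its payload, like a Git commit id."""
    data = json.dumps({"parent": parent, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


class Ledger:
    """Toy append-only ledger: every block points at its parent by hash."""

    def __init__(self):
        self.blocks = {}   # block hash -> (parent hash, payload)
        self.head = None   # hash of the latest block, None when empty

    def commit(self, payload):
        """Append a block on top of the current head and return its hash."""
        new_hash = block_hash(self.head, payload)
        self.blocks[new_hash] = (self.head, payload)
        self.head = new_hash
        return new_hash

    def clone(self):
        """Copy the whole history, as a second participant would."""
        other = Ledger()
        other.blocks = dict(self.blocks)
        other.head = self.head
        return other

    def is_ancestor(self, ancestor, descendant):
        """Walk parent links back from descendant looking for ancestor."""
        node = descendant
        while node is not None:
            if node == ancestor:
                return True
            node = self.blocks[node][0]
        return ancestor is None  # the empty history precedes everything

    def fast_forward(self, other):
        """Adopt other's head only if it strictly extends our own history."""
        if not other.is_ancestor(self.head, other.head):
            return False  # histories diverged: a real system needs a consensus rule here
        self.blocks.update(other.blocks)
        self.head = other.head
        return True


if __name__ == "__main__":
    mine = Ledger()
    mine.commit("genesis")
    theirs = mine.clone()
    theirs.commit("pay alice 5")
    print(mine.fast_forward(theirs))  # their history extends mine
```

The interesting case is the one where `fast_forward` refuses: two participants have each appended to the same parent, and something outside this sketch has to decide which history wins.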


† It boggles my mind that there is sufficient demand for such a thing that the company behind it is still in business. Just use Postgres! You’re welcome.

Posted in Business, Random thoughts

WSL is a Game Changer

Why did we (developers) flock to Macbooks? Even if using platform-agnostic languages and/or writing applications that would run on servers, we wanted portable Unix workstations with a high build quality and none of the hardware compatibility issues that come with trying to run Linux on a laptop. It’s been over 20 years since I first tried it and it is still woeful. The only way to run Linux on a laptop, even now, and not lose your mind is as a virtual guest of Windows or OSX. And with OSX all the power of Unix is right there already, great!

But Apple have really dropped the ball recently. The build quality isn’t there anymore, the CPU/GPU/memory specs of the MBP are lagging†, and there is a new player in town: Windows Subsystem for Linux. And it is seriously impressive, super-slick and Just Works™. Debian is available, you can develop for it with Visual Studio. There are still a few things to iron out – I still haven’t quite figured out how to have a single project that can target both – but no need to run a heavyweight, high-overhead VM or even a container, it’s deeply integrated with Windows, the experience is pretty seamless. I’m running it on a Surface Laptop now and by the way, I love the keyboard and I love the screen on this device. My first new laptop since 2008…
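For scripts that need to behave differently on native Windows, native Linux and Linux under WSL, one commonly used heuristic is to look for "microsoft" in the kernel release string. The sketch below assumes that convention holds; it is not an official API, the sample release strings in the comments are only illustrative, and the function names are made up for this example.

```python
import platform


def looks_like_wsl(kernel_release):
    """Heuristic: WSL kernels are commonly reported with 'Microsoft' in the release."""
    return "microsoft" in kernel_release.lower()


def describe_target():
    """Return a short label for the platform this script is running on."""
    system = platform.system()  # e.g. 'Windows', 'Linux', 'Darwin'
    if system == "Linux" and looks_like_wsl(platform.release()):
        return "linux-on-wsl"
    return system.lower()


if __name__ == "__main__":
    # Illustrative inputs: a WSL-style release string and a stock distro kernel.
    print(looks_like_wsl("4.4.0-17134-Microsoft"))
    print(looks_like_wsl("5.15.0-91-generic"))
    print(describe_target())
```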

I think this is going to cost Apple a lot of developer mindshare, as long as MS manages not to screw up their acquisition of GitHub‡, and where the devs go the apps go and the users follow. I saw first hand a decade ago in the wholesale migration from SPARC/Solaris to Linux on x86 that a superior OS can’t save a vendor if they don’t have a good hardware story, and it’s not as if OSX can claim to be far ahead of Windows anymore. What amazing new feature did they demo at WWDC – the animated poop emoji??


† Their desktop workstation story is even worse, it’s almost as if they want to just be a phone/tablet company now. That’s where the revenue is but the apps and the content for iOS only exist because of the developer/media author ecosystem on OSX.

‡ Who will buy GitLab now is the question. Oracle??

Posted in C++, Linux, Microsoft, Random thoughts, Virt

Microsoft Professional Program Data Science

I’ve finally gotten around to completing the Microsoft Professional Program in Data Science, which I started nearly a year ago. It’s a pretty comprehensive sequence of courses that gives a solid grounding in (and/or revision of!):

  • Probability and Statistics (the heart of all of this)
  • Programming in Python and/or R
  • Importing and cleansing various types of data from different sources
  • Visualising data (including timeseries and spatial)
  • Machine Learning (regression, classification and clustering)

… and shows how they all fit together into a “big picture”. Obviously the course is run by Microsoft via edX and does make use of some Microsoft technologies such as Azure ML Studio, but it is not actually particularly Microsoft-centric. The maths is universal and most of the programming is in open-source languages; for example, I completed the final Capstone project with the free RStudio on my late-2008 MacBook Pro (achieving a final score of 97%).

So I definitely recommend this course (and it’s free if you don’t care about getting a cert at the end, and doesn’t require owning any high-end hardware, all you need is time and self-discipline). I think there is a lot of data science hype around right now, and a lot of unrealistic expectations both from data scientists and organisations employing them. I am certainly not planning on any abrupt career changes myself! But when the smoke clears and the dust settles, these kinds of skills will be applicable to all industries and most roles, even if the job title isn’t Official Data Scientist. Data munging/wrangling (or “ETL” to use the fancy term) is something I’ve done my entire career for example, but I haven’t previously done much dimensionality reduction or feature engineering, and I do forecasts of things all the time, so I will be looking to apply some of that perhaps.

Next I think I will do the recently-launched MPP in Artificial Intelligence.

These violent delights have violent ends.


Posted in azure, Cloud, data science, edx, Microsoft, Python, R

Less Facebook, More Faces and Books

I made the decision back in mid-November to radically cut down on my use of Facebook. Thus far it has been a great success; I have recovered at least ½-hr per day, maybe more. Even if I spent it sleeping, that would be a huge net win, but instead I have been using the time to make a dent in my to-read list. For example I have two 20-minute train journeys on a working day that are now better used. There are other more subtle benefits too: I feel that I am less easily distracted, more able to work on things for a solid block of time and at the end feel like I have accomplished something.

What brought this on was an increasing awareness of the intrusiveness and manipulation of the algorithm. This crept up slowly like boiling a frog, but Facebook deploys cutting-edge ML to one and only one end, to maximise the time you spend looking at Facebook. I’d be looking at the next thing and the next thing thinking, why am I being shown this? It’s not important in a general sense, nor is it important to me personally… And what important things am I missing because I’m looking at this instead? I rarely write in this blog anymore; I don’t write much Open Source anymore; where did all that time and energy and attention go?

It’s an interesting aspect of neural nets and so-called “deep learning” (which should really be called “machine intuition”) that no-one really understands how to unpick it; give it a lot of data (everything you’ve ever done on FB or any site with a like or share button) and an objective function and it will maximise that function of course, but the how and the why remain opaque. “Fake News” is a thing because fake news and controversy in general generate clicks and “engagement” and so that’s what the algo pushes to you, no humans in the loop at all. I grew up in a more innocent age on the Internet; there was no algorithm on IRC or AIM or Usenet statistically analyzing every line of text before deciding whether showing it to me or not was more likely to make me spend more time there, and injecting an ad every few lines. There have been a few prominent ex-FB execs coming forward recently saying that this manipulation of the timeline/newsfeed has gone too far too. It pretends to be engagement with your friends but it isn’t really, it’s just engagement with Facebook itself. We didn’t need this extra layer before, why do we need it now?
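As a toy illustration of "give it an objective function and it will maximise it", here is a minimal epsilon-greedy bandit whose only goal is clicks. It is far simpler than whatever Facebook actually runs (which is not public); the item names and click probabilities are invented for the example.

```python
import random


def run_feed(click_prob, rounds=10000, epsilon=0.1, seed=0):
    """Show one item per round, learning only from clicks.

    click_prob maps item name -> probability the user clicks it (hidden from
    the algorithm). Returns how many times each item ended up being shown.
    """
    rng = random.Random(seed)
    items = list(click_prob)
    shown = {item: 0 for item in items}
    clicks = {item: 0 for item in items}

    def estimated_rate(item):
        # Optimistic default so every item gets tried at least once.
        return clicks[item] / shown[item] if shown[item] else 1.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            item = rng.choice(items)              # explore at random
        else:
            item = max(items, key=estimated_rate)  # exploit the best guess so far
        shown[item] += 1
        if rng.random() < click_prob[item]:
            clicks[item] += 1
    return shown


if __name__ == "__main__":
    feed = run_feed({"holiday photos": 0.05, "local news": 0.10, "outrage bait": 0.30})
    for item, count in sorted(feed.items(), key=lambda kv: -kv[1]):
        print(f"{item}: shown {count} times")
```

Nothing in the loop knows what the items mean. Whatever gets clicked most gets shown most, which is the whole problem in about twenty lines.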

Anyway, if anyone is considering this, or needs to find more time in the day (it’s a matter of priorities; will you die thinking I wish I’d spent more time clicking like on things an algorithm showed me?) this is how to do it:

  1. Start by switching off notifications. Get into the habit of looking at your phone when you want to, not when it wants attention. This might take a couple of weeks to ingrain.
  2. Cue up plenty of other stuff on your phone or mobile device. If you have a few minutes to kill, something other than Facebook to do. It took me a while to unlearn the muscle memory of pulling my phone out and clicking that blue f icon, but it is actually just as easy to click Kindle instead. Or even a quick game or anything that will take the edge off boredom. Also you probably aren’t really bored in the same way as you don’t snack on junk food because you’re really hungry. You will unlearn this impulse too.
  3. Once you are ready just uninstall the app. This will also boost the battery life of your device! Facebook has another web interface that provides an absolutely minimal experience; if you really need to check an event or reply to a message, you will still be able to, no need to worry.
  4. Generate a random password on your desktop, e.g. with iCloud Keychain or whatever you use, and activate two-factor auth. This little extra step will reduce the temptation to look at it on a whim.
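If you don't have a password manager to hand for step 4, a random password can be generated with Python's standard `secrets` module. This is a minimal sketch; the length of 24 and the choice of alphabet are arbitrary assumptions, and some sites reject certain punctuation characters.

```python
import secrets
import string

# Letters, digits and punctuation; trim this if a site rejects some symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=24):
    """Build a password from cryptographically strong random choices."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(random_password())
```

The point is that you should not be able to remember it, so logging in means a trip to wherever it is stored.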

Anyway there’s no high principle here or paranoia about tracking or anything; I need more time to do more important and ultimately more fulfilling things. Resisting the engagement algorithm, and the thousands of “data scientists” who would rather work on selling ads than curing cancer, requires as much or more willpower than resisting junk food, so I simply choose not to play. And actually after a few weeks I don’t even want to play, and I find it a little weird that I ever spent so much time doing it.

I’m no-one special or unique, I don’t think anything I do is particularly unusual, so perhaps 2018 will be the year mass Facebook Fatigue sets in…

Posted in Random thoughts

My First MOOC

Just completed the Data Analysis for Life Sciences XSeries on edX, my first MOOC. I had been meaning to learn R for a while as I’ve seen some cool stuff being done with it (I am mainly a Python guy), and to try one of these online courses, and it was very interesting to take a peek into a field of computing that I’ve had no exposure to, biostats. Obviously a course like this barely scratches the surface of a very deep and broad area, but it was enjoyable to do and a good foundation for future practice. And most interesting computation these days is really matrix manipulation/LinAlg so all the skills are very transferable. My first degree was in Mech Eng and a lot of it was familiar once I had dredged it up from memory.

I actually started out doing the Microsoft Professional Program in Data Science a while ago after reading an article in El Reg but got sidetracked; I expect I’ll finish that one up in the next few months too in my copious free time… The real skills are in the maths so again, all very transferable, across industries and technology platforms. It’s free too, you only pay if you want the cert at the end.

I should really update this blog more often… My technology work and interests are quite different now than when I started it all those years ago… I am mainly planning to use so-called “data science” to explore some government data, perhaps I will write up what I discover (if anything) here, or if I create any tooling that might be useful to anyone else.

Posted in R, Random thoughts


I have been having a play with some cloud stuff recently, as hinted at in my last post, and have put together some nice Python objects wrapping APIs/command-line tools so I can do things like:

>>> from cloudlib import Aws # or Azure or VBox
>>> cloud = Aws() 
>>> ip = cloud.spinup_vm()

VBox is just my local machine, which I suppose is what people mean by private cloud 🙂 I have a bunch of other functions for managing them too, so the main code is transparently the same whatever the VMs are actually running on. This has made it easy to run a few scenarios like running Postgres in a VM somewhere, and Barman in a VM somewhere else. Pretty cool!
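The `cloudlib` module above is private, so the following is only a guess at the general shape of such a wrapper: an abstract base class that the main code talks to, with one subclass per backend. `Cloud`, `FakeCloud`, `destroy_vm` and `run_scenario` are hypothetical names, and the in-memory backend stands in for real AWS/Azure/VirtualBox calls, which are deliberately not shown.

```python
from abc import ABC, abstractmethod
import itertools


class Cloud(ABC):
    """Common interface so calling code does not care where the VMs run."""

    @abstractmethod
    def spinup_vm(self) -> str:
        """Start a VM and return its IP address."""

    @abstractmethod
    def destroy_vm(self, ip: str) -> None:
        """Tear down the VM with the given IP address."""


class FakeCloud(Cloud):
    """In-memory stand-in for a real backend, handy for dry runs and tests."""

    def __init__(self):
        self._hosts = itertools.count(10)
        self.running = set()

    def spinup_vm(self) -> str:
        # 192.0.2.0/24 is reserved for documentation, so it never clashes.
        ip = f"192.0.2.{next(self._hosts)}"
        self.running.add(ip)
        return ip

    def destroy_vm(self, ip: str) -> None:
        self.running.discard(ip)


def run_scenario(db_cloud: Cloud, backup_cloud: Cloud):
    """Same code path whichever backends are passed in."""
    db_ip = db_cloud.spinup_vm()          # e.g. the Postgres VM
    backup_ip = backup_cloud.spinup_vm()  # e.g. the Barman VM
    return db_ip, backup_ip


if __name__ == "__main__":
    print(run_scenario(FakeCloud(), FakeCloud()))
```

A real `Aws` or `Azure` subclass would implement the same two methods by calling the provider's SDK or command-line tool.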

I also have come to realize that I have been thinking of AWS and Azure as glorified hypervisors but that isn’t really true; with all the services layered on top of them, it makes more sense to consider them to be fully-fledged operating systems in their own right. And they are clearly mature enough by now to be worth investing some time in, which I don’t think was necessarily the case when I first signed up for AWS in 2011 and found it wasn’t as good a fit for our use cases as the vSphere + 3Par platform we were operating at the time.

Speaking of 3Par, I am astonished that HP has bought SGI now. Never would have seen that coming in the ’90s!

Update July 2017: It’s been pointed out to me that Vagrant does this so I have actually switched to using that. I have also settled on Azure as my cloud of choice…

Posted in Cloud, Linux, Postgres, Python, Random thoughts, Virt

Turbo Boost

We have a couple of machines at home†:

  • Core i7 – 2 cores, hyperthreading, 3GHz turbo boosting to 3.5GHz, 16GB RAM
  • Core i5 – 4 cores, no HT, 3.2GHz, 32GB RAM

I compared them using the POV-Ray 3.7 benchmark with a single worker thread and no other workload than normal background tasks: the i7 system completed in 13m25s on average, the i5 system in 12m58s. Going on clock speed alone I would expect the i5 to have been 6% faster but it was only 3%, so clearly clock-for-clock the i7 does have some advantages; with boost, though, it should have beaten the i5 in this scenario. The whole point of Turbo Boost is that if you are only using one core it can accelerate it, but I’m not seeing that it actually does in any useful way.

Other results: with 4 worker threads the i7 runs it in 6m13s, with 2 threads in 7m1s, so there is some advantage in HT of about 12% even for a compute-bound task. But the i5 system runs with 4 threads in 3m35s; nothing beats real actual cores, at a higher frequency! Which is obvious really. Everything else is a gimmick.
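The percentages above can be reproduced from the quoted timings with a few lines of Python. This is just the arithmetic, measuring "X% faster" as the ratio of run times minus one; the helper names are made up for the example.

```python
import re


def to_seconds(text):
    """Convert a timing like '13m25s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised timing: {text!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


def percent_faster(slow, fast):
    """How much faster the 'fast' run is, as a percentage of throughput."""
    return (to_seconds(slow) / to_seconds(fast) - 1) * 100


if __name__ == "__main__":
    # Expected from base clocks alone: 3.2GHz vs 3.0GHz, about 6.7%.
    print(f"clock advantage:   {(3.2 / 3.0 - 1) * 100:.1f}%")
    # Measured single-thread gap, i5 vs i7: about 3.5%.
    print(f"i5 vs i7, 1 thread: {percent_faster('13m25s', '12m58s'):.1f}%")
    # Hyperthreading on the i7, 4 threads vs 2: about 12.9%.
    print(f"i7 HT gain:         {percent_faster('7m1s', '6m13s'):.1f}%")
    # i5 scaling from 1 to 4 threads, as a plain ratio: about 3.6x.
    print(f"i5 4-thread scaling: {to_seconds('12m58s') / to_seconds('3m35s'):.2f}x")
```

Measured as a reduction in run time instead, the HT gain comes out nearer 11%, so "about 12%" is a fair summary either way.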

No regrets buying the i7 box mind, it will do everything I need it to, and these results are the perfect excuse to really get to grips with something I’ve been meaning to do for years, and have only dabbled in so far, which is offloading compute in a serious way to Azure.

† Well more than a couple!

Posted in Cloud, Random thoughts

HP Calculators I Have Owned

Over the years I have owned several HP calculators:

  • HP 28S, my first RPN programmable, which I used as an engineering student in the mid-90s and with which I made the jump from GCSE/A-level Casio to a grown-up calculator.
  • HP 48GX, a very serious machine that I never really used the full power of, but would certainly have used every day had I continued into mechanical engineering. Upgrade from the 28S for better programming and graphing. Probably bought in 1996, still used for certain tasks.
  • HP 12C, which I acquired while working as a consultant in about 1999 and needing to do TVM calculations, then used later while writing financial software to prototype and validate bond/portfolio pricing calculations. This is the one that presently sits on my desk at work and is still regularly used.
  • HP 17B-II which I bought as an upgrade to the 12C but then discovered I preferred the older one, so it was relegated to the status of a backup. This one is on my desk at home.
  • I also seem to have acquired a modern HP 35s somewhere along the way, probably for nostalgia’s sake. This one gets used for things neither the 12C nor the 17B can do, such as the odd bit of trig or binary/hex.

Quality products, built to last, providing their owners with literally decades of faithful service, just new batteries every now and then. Looking at HP now, it’s sad to see how far they have fallen, and I suspect they haven’t hit the bottom yet.
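For anyone who hasn't met TVM (time value of money) calculations, the classic one is solving for the level payment on a loan. Here is a minimal Python sketch of the standard annuity formula, PMT = PV · r / (1 − (1 + r)^−n); it ignores the cash-flow sign convention and begin/end payment modes that a financial calculator also handles, and the function name is made up for the example.

```python
def payment(present_value, rate_per_period, periods):
    """Level end-of-period payment that repays present_value over `periods`."""
    if rate_per_period == 0:
        return present_value / periods  # no interest: just split the principal
    growth = (1 + rate_per_period) ** -periods
    return present_value * rate_per_period / (1 - growth)


if __name__ == "__main__":
    # 100,000 borrowed at 6% a year, repaid monthly over 30 years.
    monthly = payment(100_000, 0.06 / 12, 360)
    print(f"{monthly:.2f}")  # 599.55
```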

Posted in Random thoughts