Thursday 1 January 2015

Future of Programming - Rise of the Scientific Programmer (and fall of the craftsman)

Level [C3]

[Disclaimer: I am by no means a Scientific Programmer, but I am striving to become one.] It is the turn of yet another year, and the time is ripe for reviews of the last year, predictions for the new one and its resolutions. Last year I made some bold statements and some radical decisions to start transitioning. I picked up a Mac, learnt some Python and Bash and, a year on, I think it was a good move and I really enjoyed it. Still (as I predicted), I spent most of my time writing C# [working on a Reactive Cloud Actor micro-Framework, in case it interests you]. Now, a year on, Microsoft is a different company: new CEO, moving towards Open Source and embracing non-Windows operating systems. How this is going to shift the innovation imbalance is a matter of wait-and-see. But anyway, that was last year and it is behind us.

Now let's talk about 2015. And perhaps programming in general. Are you sick of hearing Big Data buzzwords? Do you believe Data Science is a pile of mumbo jumbo to bamboozle us, actually used by a teeny tiny number of companies and producing value in even fewer? Is IoT just another hype? I hope that by the end of this post I will have been able to answer you. Sorry, no TL;DR.

*     *     *

It was a warm, sunny and all-around really nice day in June. The year is 2007 and I am on a University day trip (and punting) to Cambridge along with my classmates, many of whom are at least 15 years younger than me. Punting is fun, but as a part-time student this is one of the few times I have leisurely access to our Image Processing lecturer - a bright and young guy - again younger than me. And I open the discussion with how we have not moved much since the 80s in the field of Artificial Intelligence. We improve and optimise algorithms but there is no game-changing giant leap. And he argues that the state of the art usually improves little by little.

"Day out punting in cambridge"

The following year, we work on a project involving some machine learning to recognise road markings. I spend a lot of time on feature extraction and use a 2-layer Neural Network, since I get better results out of it than out of a 3-layer one. I am told not to use many layers of neurons as training usually gets stuck in a local minimum - I actually tried it and saw exactly that. Overall the result was OK, but it involved many pre- and post-processing techniques to achieve acceptable recognition.

*     *     *

I wake up and it is 2014. Many universities, research organisations (and companies) across the world have successfully implemented Deep Learning using Deep Neural Networks - which have many layers of neurons. Watson has beaten human champions on Jeopardy!, and object recognition from images is almost a solved problem - with essentially no feature extraction.

A Deep Neural Network

Perhaps my lecturer was right: with improved training algorithms and lots and lots of labelled data, we suddenly have a big leap in science (or was I right?!). It seems that for the first time implementation has got ahead of the mathematics: we do not fully understand why Deep Learning works - but it works. And when it fails, we still don't know why it fails.

And guess what, industry and academia have not been this close for a long time.

And what has all this got to do with us? The rise of machine intelligence is going to change programming. Forever.

*     *     *

Honestly, I am sick of the amount of bickering and fanboyism that goes on today in the programming world. The culture of "nah... I don't like this" or "ahhh... that is s..t" or "ah that is a killer" is what has plagued our community. One day Angular is super hot, next week it is the worst thing. Be it zsh or Bash. Be it vim vs. Emacs vs. Sublime Text vs. Visual Studio. Be it Ruby, Node.js, Scala, Java, C#, you name it. And the same goes for technologies such as MongoDB, Redis... subjectivism instead of facts. As if we forgot we come from a long line of scientists.

Like children we get attached to new toys and, with the attention span of a goldfish, instead of solving real world problems we ruminate on how we can improve our coding experience. We are ninjas and what we do no one can do. And we can do whatever we want to do.

"I have got power"

Yes, we are lucky. A 23-year-old kid with a couple of years of programming experience can earn double what a 45-year-old retail manager with 20 years of experience earns annually. And what do we do with that money? Spend all of it on booze, specialty burgers, travelling and conferences, gadgets - basically whatever we want to.

But those who remember the first .com crash can tell you it has not always been like this. In fact, back in 2001-2002 it was really hard to get a job. And the problem was, there were many really good candidates. The IT industry became almost impenetrable since there was this catch-22 of requiring job experience to get the job experience. But anyway, the good ones, the stubborn ones and those with little talent but a lot of passion (that includes me) stayed on for the good days that we have now. The reality was that many programmers of the time had read "Access in 24 hours" and landed a fat salary in a big company. And on the other hand, projects were failing since we spent most of our time writing documentation. The industry had to weed out bad coders and inefficient practices.

And so we got the software craftsmanship movement and agile practices.

*     *     *

The opposition has already started. You might have seen the discussions DHH has had with Kent Beck and Martin Fowler on TDD. I do not agree 100% with what Erik Meijer says here (only 90%) but there is a lot of truth in it. We have replaced a fact-based, data-backed attitude with a faith-based wishy-washy peace-hug-freedom hippie agile way, forcing us mechanically to follow some steps and believe that it will be good for us. Agile has taken us a long way from where we started at the turn of the century, but there are problems. From personal experience, I see no difference in quality between developers who do TDD and those who do not. And to be frank, I actually see a negative effect: people who do TDD do not think hard about the consequences of the code they write - I know this could be inflammatory but hand on heart, that is my experience. I think TDD and agile have given us a safety net, and like a tightrope walker who, instead of focusing on walking technique, keeps improving the safety net, we believe that as long as we go through the motions, we are safe. Unit tests, coverage, planning poker, retrospective, definition of done, story, task, creating tickets, moving tickets. How many bad programmers have you seen who are masters of agile?

You know what? It is the mediocrity we have been against all the time. Mediocre developers who in the first .com boom got into the market by taking a class or reading a book are back in a different shape: those who know how to be opinionated, look cool, play the game and take the paycheck. We are in another .com boom now, and if there is a crash, sadly they are out - even if it includes me.

*     *     *

I think we have neglected the scientific side of our jobs. Our maths is rusty and those who did study CompSci do not remember a lot of what they read. We cannot calculate the complexity of our code and fall into the trap that machines are fast now - yes, it didn't matter for a time, but when you are dealing with petabytes of data and pay by processing hours? When our team first started working on recommendations, the naive implementation took 1000 nodes for 2 days; now the implementation uses 24 nodes for a few hours, and perhaps this is still way, way too much.
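To make the complexity point concrete, here is a minimal sketch in Python. It is not our actual recommendations code; the toy problem (counting how often two items appear in the same basket) and the names `cooccurrence_naive` and `cooccurrence_by_basket` are assumptions made purely for illustration. Both functions return the same answer, but the first tests every possible item pair against every basket, while the second only touches pairs that actually occur together.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_naive(baskets, items):
    """Roughly O(|items|^2 * |baskets|): try every item pair on every basket."""
    counts = {}
    for a, b in combinations(sorted(items), 2):
        n = sum(1 for basket in baskets if a in basket and b in basket)
        if n:
            counts[(a, b)] = n
    return counts


def cooccurrence_by_basket(baskets):
    """Roughly O(sum of basket_size^2): only visit pairs seen together."""
    counts = defaultdict(int)
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical toy data: three baskets over four items.
baskets = [{"a", "b", "c"}, {"a", "b"}, {"b", "d"}]
items = set().union(*baskets)

naive = cooccurrence_naive(baskets, items)
fast = cooccurrence_by_basket(baskets)
```

With a large catalogue and small baskets, the first version spends almost all of its time on pairs that never co-occur, and that is the kind of waste you end up paying for by the processing hour.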

"we are craftsmen and craftswomen" (from Anders Drachen)

But really, since when did our job look like that of a craftsman (a carpenter)? We are Ninjas? And we do code Kata to keep our skills/swords sharp. This has all gone too far into the world of fantasy. The world of warcraft. This is now a full-blown New Age religion.

What utter rubbish.

*     *     *

Now back on earth, the languages of the 90s and early 2000s are on the decline. Java, C#, C++: all on the decline. But they are being replaced by other languages such as Scala, right? I leave that to you to decide based on the diagram below.

Google trends of "Java", "Scala", "C#" and "Python Programming" (so that it does not get mixed up with Python the snake) - source: Google

The only counter-trend is Python. The recent rise in Python popularity is what I call "rise of the scientific programmer" - and that is just one of the signs. Python is a very popular language in the academic space. It is easy to pick up, works everywhere and has some functional aspects that make it terse. But that is not all: it sits on top of a huge wealth of scientific libraries and it can talk to Java and C as well. Industry innovations have started to come straight from the universities. We have gone from the early 2000s, when academia seemed completely irrelevant, to now, when it leads the innovation. PySpark has come straight from the heart of UC Berkeley. Many of the contributors to the Hadoop code and its wide ecosystem are in academia.

We are now in need of people who can argue scientifically about algorithms and data (is coding anything but code+data?) and who, for the most part, can implement an algorithm given the paper or its mathematical notation. And guess what, this is the trend for jobs with "Machine Learning":

Trend of jobs containing "Machine Learning" - Source: ITJobsWatch

And this is really not just Hadoop. According to the source above, Machine Learning jobs rose 41% from 2013 to 2014 while Hadoop jobs rose only 16%.

This Deep Learning thing is real. It is already here. All those existing algorithms need to be polished and integrated with the new concepts, and some will just be replaced. If you can give the interactions of a person with a site to a deep network, it can predict with high confidence whether they are going to buy, leave or remain undecided. It can find patterns in diseases that we as humans cannot. This is what we were waiting for (and were afraid of?). Machine intelligence is here.

The Scientific Programmer [and yes, they have to know more]

Now one might say that the answer is the Data Scientists. True. But first, we don't have enough of them, and second, based on first-hand experience, we need people with engineering rigour to produce production-ready software - something that certainly some Data Scientists have, but not all. So I feel that a programmer turned statistician can build more robust software than the other way around. We need people who understand what it takes to build software that you can put in front of millions of customers. People who understand linear scalability, SLAs, monitoring and architectural constraints.

*     *     *

The horizon is shifting.

We can pick a new language (be it Go, Haskell, Julia, Rust, Elixir or Erlang) and start re-inventing the wheel, starting from pretty much the same scratch again because hey, this is easy now, we have done it before and don't have to think. We can pick a new, albeit cleaner, abstraction and re-implement thousands of hours of hard work and sweat that we and the community have suffered - since hey, we can. We can rewrite the same HTTP pipeline 1000s of different ways and never be happy with what we have achieved, be it Ruby on Rails, Sinatra, Nancy, ASP.NET Web API, Flask, etc. And stay happy that we are striving for that perfection, that unicorn. We can argue about how to version APIs and how one service is RESTful and another is not. We can mull over the pettiest of things such as a semicolon or the gender of a pronoun and let insanely clever people leave our community. We can exchange the worst of words over "females in the industry" while we are more or less saying the same thing. Too much drama.

But soon this will be no good. Not good enough. We have got to grow up and go back to school, relearn all about maths, statistics, and scientific reasoning in general. We need to man up and re-learn that being a good coder has nothing to do with the number of stickers you have on the back of your Mac. It is all scientific - we come from a long line of scientists, and we have got to live up to our heritage.

We need to go and build novelties for the second half of the decade. This is what I hope to be able to do.


  1. It's not easy to post a comment in there, but I wish more people would read this. I'm not quite an expert in Machine Learning but I can definitely relate.

    I blame hipsters! Half jokingly.

  2. I have noticed a general trend upwards in the interest of scientific programming for a few months now, and the community (most specifically Hacker News) has driven my interest in that area as well. The idea of functional programming and thinking in mathematically sound ways really appeals to me, but my lack of math and comp sci background is holding me back from going full-speed learning and getting better at it.

    I feel many of us are lost swimming in a sea of opinions and juggling frameworks du jour, development methods, and business strategies, that it keeps us from focusing on improving our skills in areas that matter. This frustrates me and I've been looking for ways to get out of it. There is also this fear of another bubble mixed with trying to keep up with the trends and hipness of the industry, to remain gainfully employed.

    I realize I am sort of just reiterating the author's point, so I guess what I'm saying is I agree.

  3. Bravo.

    That summarized all of my professional frustrations, fears and insecurities in one post.

  4. interesting post, but might be on the pessimistic side...
    I think some crappy programmers can use their github and experience w/ agile to fool people, but for the most part, being opinionated about a technology correlates pretty highly with having some experience with it

  5. This comment has been removed by the author.

  6. Thank you for the nice post, which makes you think about a lot of things. I agree with many of its points. But I can't accept the main idea that the IT world needs more science and mathematics (at the current moment). Your point that IT had more maths in the 80s is explained by one fact: in the 80s the major customer of IT was science, and programmers solved scientific problems. It is hard to solve those problems without knowledge of maths, physics, etc. But let's look at 2015. Who is the main customer? Businesses of different sizes and private individuals. Do they deal with science a lot? Well, sometimes it is needed to save money, but mostly it isn't so important. That is why programming nowadays has only a small part of science in it. Also, in my opinion (a 23-year-old kid with some years of experience), programming is closer to engineering or a building process than to maths, because it wraps the knowledge and rules of the area in which it is used. So if we write code for a university it is full of science; for banks, money and economics; for teenagers, hippy stuff. And that's OK, because we are all hired to solve our clients' problems and not ours.

    Once again thanks for the good post, but it's too early for now.

  7. "But soon this will be no good. Not good enough. We got to grow up and go back to school, relearn all about Maths, statistics, and generally scientific reasoning. We need to man up and re-learn that being a good coder has nothing to do with the number of stickers you have at the back of your Mac. It is all scientific - we come from a long line of scientists, we have got to live up to our heritage."

  8. If the subject is about science then why not talk about the fundamentals of all those hypes, the Lisp programming language or at least its modern variation, the JVM-ready Clojure?

    And as for the Java language, yes, it is unnecessarily object-laden and verbose, but it still dominates in continental Europe, if you look at it from the programming-for-a-living angle. And as a platform, the JVM looks like a scientific breakthrough, IMHO.

  9. Add javascript to your trend graph, and cry :)

  10. I was actually considering the opposite a week or so ago. I really believe it is easier for a quant type (statistician, mathematician, anyone with some kind of ML, AI, signal/image processing or algorithms experience) to come up to speed on software methodologies and the like. Analytical skills are really useful for learning things fast... That is my own experience.

  11. This is totally well said. I find myself going back to these resources to get myself started

  12. I've been feeling this way for some time too. Generally everything goes through this cycle: invent it, tweak it, reinvent it and back again. However, I don't think it is a bad thing; it's the natural process of evolution - refactoring our thoughts. Although we're going back to engineering concepts, as we should, we have much more knowledge now to make our field an even more exciting place to be in. Thanks for posting!

  13. This comment has been removed by the author.

  14. All my experience shows that mathematicians write wrong/bad programs. Same story with programmers - instead of using a mathematical model they prefer to build it themselves with common-sense logic. A super-duper scientific programmer will definitely win the job race. No doubt. But a mathematician paired with a programmer is a more scalable approach. A more efficient way from a money perspective is to have just 1 mathematician for a whole software division! How many software companies hire mathematicians? That is the first question to the author of this excellent article. Many thanks!

    1. Problem is, two sides (scientist and programmer) live in different worlds and their vocabulary and approach to problems is completely different. They do not know how to present the problem to the other. That is where the scientific programmer comes in.

  21. 3 years have gone by since I read this article. In that time I have seen all the maths guys refuse to become professional developers and move to actuarial science or business analysis. None of them chose development, for the same reason - it is too boring for such intelligent guys!!!

    Same story on the dev side. None of them know maths well enough and they do not like to put effort into maths terms. It is much easier to make money without maths.

    I stand by the same view - hire 1 professional maths specialist per 100 developers instead of waiting for a new kind of person to appear in some good future. School or postgrad-level maths and programming skills are enough to communicate.

