Sunday, October 27, 2013

Two talks at the local Software Development meetup: me and Pavel Knorr

Hello,
Both Pavel Knorr and I, from Logicify, gave talks at the local IT meetup last Thursday. It was quite fun; thanks to DataArt Kherson for organizing the people and the facility.
Thanks to Pavel for putting together his excellent presentation with code samples, and to everyone who worked on our nodejs-sample application.
Here are the relevant links:
  1. https://github.com/Logicify/nodejs-sample-task - the source code of the sample app. We also use it as a ‘seed’, or an ‘archetype’, for starting new apps with the same structure.
  2. https://docs.google.com/presentation/d/1XW8suMG4hpzmccoKtvtLpytVgZLyVPZqZgrqCGnyNUY/present#slide=id.g121b56501_148 - Russian-language slides for Pavel’s talk about the basics and caveats of Node.js. They are best viewed alongside the recording of the talk, and with some background in client-side JavaScript and/or server-side programming (Java, for instance).
  3. https://docs.google.com/presentation/d/1Osy7XQd9LuY7npicwK-r37cNOFUW0No3AwzEbEmRNqQ/present#slide=id.p - Russian-language slides for my presentation on Node.js and unit testing. It helps if the reader understands the concepts of unit testing (mocks, verification, structure, the goal of unit testing, level of granularity, the usual terminology), though that's not strictly required.
Happy to help if you have questions!
Best,
Alex.

Node.js, elasticsearch, mongodb, ext.js and Heroku sample application - we have it as a work in progress

Hello!
We have built a sample application to use as a base (archetype-like) for the aforementioned stack. It is accessible on GitHub, and we really welcome everyone interacting with it in any way. Clone it, copy it, use it, fork it, augment it - great. Comments and questions? Totally wonderful. Complaints and ideas? Great as well! We look forward to improving it.
The idea of the app is deliberately boring - it's a CRUD+S(earch) app for bookshelf functionality. The interesting part is the stack: a Node.js, elasticsearch, mongodb, ext.js and Heroku sample application, written in 2 days by 4 people who had never used, or even tried, any of this tech before. This is a great illustration of quite a fast time-to-dev. The domain itself (books) is quite unintrusive, which means you can play around with the different components easily.
As of now, here’s what we have:
  1. The source code is accessible at https://github.com/Logicify/nodejs-sample-task
  2. The app is deployed to http://nodejs-sample-task.logicify.com/ and visible on the Internet.
  3. It has all the stuff necessary for Heroku deployment.
  4. It exposes a minimal architecture. Since we started working with this, we have arrived at much better ways of handling configuration, exceptions, logging, etc.
  5. It has around half of the testing facilities (supertest, mocha, jscoverage, chaijs) set up, but is missing, for instance, nice usage of Sinon. A minimal test in this style is sketched right after this list.
  6. Has some basic documentation.
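To give a feel for that setup, here is a minimal supertest + mocha test; the route and the path to the app module are hypothetical, just to illustrate the shape:

var request = require('supertest');
var app = require('../app'); // hypothetical path to the express app

describe('GET /books', function () {
    it('responds with a JSON list of books', function (done) {
        request(app)
            .get('/books')
            .expect('Content-Type', /json/)
            .expect(200, done);
    });
});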
Upcoming plans:
  1. Add ‘proper’ configuration handling (with several config files and an env variable switching between them) - see the sketch right after this list.
  2. A proper logger wrapper.
  3. Improved testing - showcasing Sinon.js and some better assertions.
  4. Growing a bit more functionality.
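To illustrate what we mean by item 1, a rough sketch (the file names are hypothetical): a tiny config module picks a per-environment file based on NODE_ENV and overlays it on the defaults.

// config/index.js - hypothetical sketch of env-switched configuration
var env = process.env.NODE_ENV || 'development';
var config = require('./default.json');
var overrides = require('./' + env + '.json');

// shallow-merge the per-environment values over the defaults
Object.keys(overrides).forEach(function (key) {
    config[key] = overrides[key];
});

module.exports = config;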
This demo was created partly as a proof of concept we did for one of our customers back in May, and partly as a demo app for the 2 talks Pavel Knorr and I gave at the local Kherson IT-talk meetup. It still has huge room for improvement, and that's what we'll keep working on.
Also, it was created by the Logicify team
[ADS MODE ON]
we do great custom software development and services. You can reach me with any enquiries, or use the website :)
[ADS MODE OFF]
, and at Logicify we have indeed already used this project twice as an archetype - cloning it, tweaking a few things in it and making it effectively the seed of a new project.
So, if you’d like to interact with it - please do!
Cheers,
Alex.

Friday, October 18, 2013

Node.js express middleware: beware of reading data before the bodyParser()!

Found this yesterday, as part of implementing HMAC-like functionality. So, here's what we did:
/*
THIS CODE DOES ____NOT____ WORK! PLEASE DON'T USE IT!
*/
app.use(function (req, res, next) {
    var headerValue = req.header(secureAPIHeader);
    req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
    req.setEncoding('utf8');
    req.on('data', function (chunk) {
        req.hasher.update(chunk);
    });
    req.on('end', function () {
        var hash = req.hasher.digest('hex');
        if (hash != headerValue) {
            log.err(util.format("Received wrong Admin request, hashes do not match. Received hash %s want hash %s", headerValue, hash));
            res.json(403, {message: "Not authorized"});
        } else {
            log.warn("Received admin request for url " + req.url);
            req.isSuperSecure = true;
            next();
        }
    });
});
It exposed quite weird (or not so weird, in hindsight) behavior. When a request with a wrong HMAC came in, the server returned the correct 403 error. However, if the HMAC was correct, the server would never return from the call chain.
Further investigation showed the cause was quite simple. Since we had already read all the data from the stream (and received the ‘end’ event), the underlying bodyParser was blocked forever, waiting for an ‘end’ event that would never be emitted again - bodyParser only subscribed to the event after the whole body had already been consumed.
We didn't really want to depart from the streaming nature of the node.js crypto hasher and read all the data into one huge string. So this middleware came into play:
/*
A correct version. The underlying data sources are not blocked.
*/
app.use(function (req, res, next) {
    var headerValue = req.header(secureAPIHeader);
    req.hasher = crypto.createHmac("sha256", config.secretAdminAPIKey);
    req.setEncoding('utf8');
    req.on('data', function (chunk) {
        req.hasher.update(chunk);
    });
    req.on('end', function () {
        var hash = req.hasher.digest('hex');
        if (hash != headerValue) {
            log.err(util.format("Received wrong Admin request, hashes do not match. Received hash %s want hash %s", headerValue, hash));
            res.json(403, {message: "Not authorized"});
        } else {
            log.warn("Received admin request for url " + req.url);
            req.isSuperSecure = true;
        }
    });
    // This is the only difference - we execute middlewares in parallel! 
    // => 
    next();
});
The trick here is that we pass execution to the underlying bodyParser straight away. When bodyParser receives data, so do we - there are 2 listeners for the data and end events here. There is an issue with this solution, though: we do not strongly couple the moment the hmac becomes available on the request with the moment processing is done; bodyParser effectively does that for us.
Moreover, we got an extra-quick reaction from the express team, and now there's a verify method (in a pull request as of 2013-10-18) which does exactly what we do, but without streaming (for JSON only) - that should be good for 90% of users. Here's the discussion link: https://github.com/visionmedia/express/issues/897#issuecomment-26575496
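Once that lands, usage should look roughly like this - a sketch based on my reading of the PR discussion (the verify callback receives the raw body before JSON parsing, and throwing from it rejects the request), so treat the exact signature as an assumption:

// Sketch only - verify() as proposed in the PR referenced above.
app.use(express.json({
    verify: function (req, res, body) {
        var hash = crypto.createHmac("sha256", config.secretAdminAPIKey)
            .update(body)
            .digest('hex');
        if (hash !== req.header(secureAPIHeader)) {
            throw new Error("Not authorized"); // turned into a 403 by the middleware
        }
    }
}));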
Thanks!

Monday, October 14, 2013

JMeter not sending body with the PUT method

I noticed very, very strange behavior. When you issue a PUT request with JMeter's HTTP Sampler, it totally ignores what you have in the Post Data or Raw Post Data section.
That's kind of crazy, since I would expect the PUT method to be an HTTP mutator as well - and where's the state if we don't send it?
However, HTTP itself is fine; it's rather an issue of JMeter or its underlying libraries. To make the PUT method send the body, simply set the content encoding to UTF-8 (that's the box right after the method dropdown). And it works. A kind of magic, but it ate another 30 minutes of my time.
I had that issue on JMeter 2.9, and judging by the discussion where I found the solution, it is quite a common one.

Wednesday, October 9, 2013

Chaining calls with Underscore.js

I came across underscore (http://underscorejs.org/) - a wonderful collection of javascript functions making one's life easier - a while ago.
However, I only recently found out that underscore.js has a beautiful chaining interface. The samples below are quite self-descriptive for anyone familiar with javascript. The idea is that, with chaining, underscore automatically passes the result of the previous operation to the next one.
Here's what the code looks like without chaining:
var someData = [1,4,2,4,5,6];
var results = _.map(
   _.filter(
      _.sortBy(someData, _.identity),
      function(a) {return a>3}),
   function(m) {return m*m}
);
//outputs [16, 16, 25, 36]
and that’s how it looks like when we use method chaining:
var someData = [1,4,2,4,5,6];
var results = _.chain(someData)
   .sortBy(_.identity)
   .filter(function(a){return a>3})
   .map(function(m){return m*m})
   .value(); // that's the piece which 'executes' the chain
//outputs [16, 16, 25, 36]
You judge which one is better!

Tuesday, October 8, 2013

Domain name history search tool - found a non-paid one!

Hi there,
I was fighting the DNS system recently - and GoDaddy, to name names. The issue was that I was changing the nameservers for a given domain, but for some reason GoDaddy would not allow forwarding that domain. It all worked out afterwards (it just takes some waiting).
However, I made a mistake along the way: I changed the NS servers without writing down the old ones. Once the change went through, I was unable to recover the original IP of the service I was moving (oh sh!).
There was a moment of panic; however, after some searching I came across a (finally, free!) tool which shows the history of a domain name's changes over time.
That's http://whoisrequest.org/history/, and it works just perfectly. Using it, I was able to find the historical NS records, then query the old nameserver for the old IP - what a relief :)

Friday, October 4, 2013

Setting up a TeamCity coverage report for node.js tests with mocha and jscoverage

Hi!

The tech stack mentioned in the title has been popular for quite a while, and many people use it (especially in startups, for quick and robust API implementations, etc.). The question of testing is solved quite well - there is a number of testing, mocking and assertion frameworks out there, which I find astonishing.

This post is about quite a regular task: how to set up coverage metrics automatically on the build server (TeamCity is my favourite).

Disclaimer: I strongly believe that coverage metrics should never be given to business people w/out technical oversight. When they see _any_ metric, they get excited about it, and it ends up overused. Coverage should be used by the developers themselves to see which code paths have not been tested, and which tests they can add.


So, let's start. Prerequisites:

  1. You have mocha tests
  2. You have a working installation of TeamCity, which already pulls the sources from the repo, builds the app and runs these tests on a post-commit basis (and, surely, sends notifications to the people in charge).
  3. You want to add coverage metrics to the party, to see what the progress is and which areas need more attention. 
Given that your project structure knows nothing about jscoverage, we'll go the least-intrusive, scripting way. 

Jscoverage

You need a specific jscoverage build which knows about node. You can install it, as usual, with npm:
 
npm install -g jscoverage

Verify the install by issuing jscoverage on the command line. The output is not very helpful, but it shows the most interesting parameters you'll need. 

The way all coverage tools work is that they either instrument the build product of your code (or the code itself, for scripting languages), or plug into the interpreter; then, as you apply some interaction to this code (tests, users coming and going, etc.), they record how many times each line was executed. Once these numbers are obtained, static analysis can step in to work out how many functions/branches/conditions were covered. 

So, first we need to obtain such ‘instrumented’ code from our target code. That's what jscoverage does:


jscoverage --exclude tests,node_modules,.git,target,.idea  CURRENT_FOLDER OUTPUT_FOLDER

After that you can examine the results in the output folder. All your files should be copied there (except the excluded ones), and they should have some weird code inside (that's the recording part!), with pieces and bits of your code scattered across the file. 
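To give an idea of what that weird code is: roughly speaking, the instrumented copy bumps a hit counter before each executable line (an illustration, not the exact jscoverage output):

// original lib/add.js
function add(a, b) {
    return a + b;
}

// roughly what the instrumented copy looks like
_$jscoverage['lib/add.js'][1]++;
function add(a, b) {
    _$jscoverage['lib/add.js'][2]++;
    return a + b;
}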

mocha

As you are probably already using mocha for testing, nothing here is really tough. I assume that you load your modules-under-test with require('../../lib/something'), rather than via absolute paths. 
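For example, a test like the one below (the module and the assertion are made up for illustration) keeps working after the tests folder is copied next to the instrumented code, because the relative require then resolves to the instrumented module:

var assert = require('assert');
var something = require('../../lib/something'); // resolves to the instrumented copy

describe('something', function () {
    it('loads and exposes its API', function () {
        assert.ok(something);
    });
});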

As we already have a specific folder containing the jscoverage-processed main app code, without all the unnecessary folders, we can now just copy the tests themselves into that folder and run mocha. No results yet - we are only verifying that the mocha tests run against our processed code. 

Next, we are good to add some html-cov reporting. This reporter is already bundled with mocha, so there is no need to reference it in package.json. What it does is: after the mocha tests have run, it collects the coverage statistics from the coverage-augmented files under test and flushes them (as HTML) to stdout. So, to see what the coverage looks like, we do (while in the OUTPUT_FOLDER): 

../.././node_modules/mocha/bin/mocha -t 20000 --noinject --reporter html-cov --recursive ./tests/ --coverage

--noinject is required for me, as I am using the rewire library for testing, and it somehow clashes with the jscoverage instrumentation.

You can redirect that output to a file, or just pipe it to w3m.

TeamCity

Quite an easy part now: since we have everything working locally on the server, we just need to automate it. In the end I used the html-file-cov mocha reporter: it prints the test results as dots to stdout, and writes the coverage report to coverage.html in the working folder; quite an easy and handy tool: https://npmjs.org/package/html-file-cov

In TeamCity, we need to add another build step to the post-commit configuration. I went the easy and ugly script way instead of the built-in report processing, since I was quite happy with the HTML format. 

So, having added that build step (of type Custom script), we can now fill it with the actual logic: 

#!/bin/bash

rm -rf ../target-cov/*
rm -rf ./target/target-cov
rm -rf ./target/coverage.html

mkdir -p ../target-cov
jscoverage --exclude tests,node_modules,.git,target,.idea . ../target-cov

mv ../target-cov ./target
cp -r ./tests ./target/target-cov
echo "JS files instrumented correctly"

cd target/target-cov
echo "Running mocha tests with 'dot' reporter"
NODE_ENV=dev ../.././node_modules/mocha/bin/mocha -t 20000 --noinject --reporter html-file-cov --recursive ./tests/ --coverage
mv ./coverage.html ..
echo "Done. Configuration stored to `pwd`/../coverage.html"


As a result of these manipulations (making a dir one level up, storing the jscoverage-processed files there, moving it into the local 'target' folder, copying the tests over unprocessed, running the tests, moving coverage.html to the target dir), we have the artifact - coverage.html - which contains all the stuff we need. 

We just need to publish it: in the TeamCity project settings, on the very first page, there's an 'Artifacts' box which takes paths. Just add +:target/coverage.html, and after each build you'll have a nice 'Artifacts' drop-down with a link to your coverage HTML. 

Voila! 
 

Wednesday, March 13, 2013

Portecle - a nice tool to manage certificates, keystores, keys, etc., instead of keytool (JDK)

I had always used keytool from the JDK installation. However, when it came down to the BouncyCastle keystore implementation, too much work was needed to get it running - download a crypto provider, install it, etc. Here's a description: http://stackoverflow.com/questions/4065379/how-to-create-a-bks-bouncycastle-format-java-keystore-that-contains-a-client-c

As Android uses BouncyCastle by default, I had to pack the public key of an RSA pair into a BKS store to make a resilient HttpsUrlConnection.

I found http://portecle.sourceforge.net/ very nice and handy as a GUI tool.

Works flawlessly; recommended.

Tuesday, March 12, 2013

Linux/mac alternative to Fiddler - WebScarab!

I really enjoy using Fiddler - a web logging proxy or, as it is usually called, a web debugger.

It works in a very simple way:

  1. When started, it modifies the Internet Explorer proxy settings to point at itself.
  2. All requests are then logged and ready to be analyzed. 
  3. As much software uses the IE proxy settings for its own setup, less straightforward things such as SOAP calls, Ajax and embedded-browser calls can also be caught. 
  4. With the captured interchange, one can analyze every aspect of a single roundtrip.
The only thing I dislike about it is its nature: being .NET-built, it works flawlessly on the MS platform, but not on Linux.

The good, working alternative I have found is the WebScarab project, which is written in Java, looks pretty much the same, and does most of the same job. Easy: just run it with java -jar and set up your software to proxy through 127.0.0.1:8008.

Monday, March 11, 2013

Web-harvest scraper 2.1 - how to use

It often happens that you need to get some data from a web source, but the site's developers disallow it or simply do not have the resources to implement any B2B API.

That's where scrapers come in: pieces of software that act as a browser, but provide a programmatic API for processing the results in your code.

During my search for an acceptable solution, I came across http://web-harvest.sourceforge.net/ ; its nice features are:

  1. All-java, used as a library
  2. Great versatility - it is packed, really: XQuery, XPath, regex searches, emulation of browser activity, templating and variables, integration with different scripting languages. 
  3. A nice UI workbench, which lets you develop scripts easily and see the results immediately. Then just save the XML configuration and invoke it from your code. 
However, the version at the link above is 2.0. It is not present in Maven, and I build with Maven. There is a newer 2.1 version, though, which has been heavily reworked - a Maven build process, a switch to Guice injection, etc. 

I fancy 2.1 a lot, but it has some issues - NO documentation at all, and slightly different behavior. I made a fork for myself on GitHub (https://github.com/lexaux/web-harvest) and am applying changes there. Hopefully I will be able to contact the developers and contribute. 

For now, a quick how-to on running the 2.1 web-harvest scraper from your code (the UI is pretty straightforward). It is really different from 2.0. So, here it goes:

Thursday, February 28, 2013

Building Android application with Maven -- Unknown packaging APK

It should have been an easy thing. However, I lost a good 15 minutes by not following the rules. So here they are, for starting up with Maven and Android:

  1. Read the Getting Started here http://code.google.com/p/maven-android-plugin/wiki/GettingStarted
  2. Create your own pom.xml, copying one from one of the samples. Samples can be downloaded also from Google code: http://code.google.com/p/maven-android-plugin/wiki/Samples
After performing these steps I was unfortunately unable to proceed. The maven build gave an error saying my pom was not correct - it looked like it did not pick up the android plugin, and thus did not understand the apk packaging. 

Unknown packaging: apk @ line 15, column 16

So I started looking around and soon spotted the difference: I had missed a single line in the build plugin declaration - the <extensions>true</extensions> element. Without it, Maven does not pick up the new packaging type. Oh, and also - the plugin requires Maven 3.0.3 or newer. 

The build/plugins declaration then boils down to something like this (version omitted; coordinates as in the plugin's Getting Started guide):

<plugin>
    <groupId>com.jayway.maven.plugins.android.generation2</groupId>
    <artifactId>android-maven-plugin</artifactId>
    <extensions>true</extensions>
</plugin>

plus the android artifact (com.google.android:android, scope provided) in the dependencies section.

Saturday, February 23, 2013

How to build GWT 2.4 now - if the build fails with errors

Okay, so the goal is to build Google Web Toolkit version 2.4 now - when 2.5 is the current stable. The task seems fairly easy but, unfortunately, it fails on the first attempt. Continuing the tradition of writing memos for the future, here it goes:

  1. http://google-web-toolkit.googlecode.com/svn/releases/2.4/ - the main GWT source, release 2.4; check it out with subversion.
  2. http://google-web-toolkit.googlecode.com/svn/tools/ - GWT-tools, some binary dependencies required for building.
    The trick is that they are versioned in a streamlined manner, and no one left a note about which version of the tools a particular version of GWT requires to build.
    So, for the record: GWT release 2.4 requires Tools rev 10417. Do update the Tools repo to this revision. 
  3. The next step is straightforward - just set the GWT_TOOLS environment variable to point to the tools working copy containing rev 10417, descend into the GWT 2.4 release working copy, and issue ant.

    It did not work for me
    [gwt.javac] /home/lexaux/work/gwt/2_4/user/src/com/google/web/bindery/requestfactory/shared/impl/FindRequest.java:34: error: name clash: find(EntityProxyId) in FindRequest and find(EntityProxyId) in RequestContext have the same erasure, yet neither overrides the other
    [gwt.javac]   Request find(EntityProxyId proxy);

    Which is fairly weird, I thought, as ant explicitly sets the language and class output level.
    After about a day of searching, I simply tried downgrading my JDK to version 6 - and voila, it worked. It looks like a change in the treatment of generics in JDK 7 broke the compilation.
After a successful build, you can find the results in build/lib - gwt-dev.jar and gwt-user.jar are there.

Wednesday, January 30, 2013

Java to Scala idioms: id list from object list

Well, I was amazed.

I just needed to get a list of IDs of database-stored objects, in order to pass them to later updates with a 'where ... in' query. And Scala does this in a mind-blowing way. Instead of the Java version (please hold the comments about auto-boxing, array creation, etc. - it's only a verbosity-related sample):

List<DBObject> dbObjects = ...
List<Long> idArray = new ArrayList<Long>(dbObjects.size()); // assuming the IDs are Longs
for (DBObject o : dbObjects) {
    idArray.add(o.getId());
}

I would just write something like

val ids = dbObjects.map(_.getId)

And that's it!

Moreover, I'm now using Slick with the lifted embedding. For it, I need to map the objects I want to store in the DB to tuples consistent with the DB schema (prior to insert). I'm about to do this by providing a mapping function in the Table descendants describing the DB schema, and using it whenever I want to store objects. I just need to think the types through a bit so it stays consistent, but the general use would be, instead of:

DBObjects.insertAll(objects.map(x => (x.id, x.name, ...))) 

I would use

DBObjects.insertAll(objects)
while inside, this thing would use its own mapping functions.
The great thing about it is that it is all compile-time checked - e.g. if I change the schema, I will get compilation errors about the tuples not matching in that conversion function.

Isn't that nice?

Monday, January 28, 2013

sbt and IntelliJ idea support

For now, it is not as good as the Maven integration, but still worth a look.
Found:

  1. https://github.com/mpeltonen/sbt-idea - an SBT plugin for generating IDEA files (the .idea structure, not the old one) from the SBT build definition.
    Unfortunately, the generated project is not ideal - it adds an extra module for the SBT build itself, and in my case it picked up the wrong web.xml for the webapp.
    It is OK to change this manually; however, as soon as you add one more lib in SBT, you'll have to re-generate and re-configure the IDEA project again. 
  2. https://github.com/orfjackal/idea-sbt-plugin/wiki - an IDEA plugin to run SBT commands in the IDE, include them in the build process and show notifications in place. Not too bad. 
The issue with both is that, unlike the Maven integration, IDEA does not automatically track changes in SBT, and re-configuring the build takes some manual work. 

As an aside, I wonder whether full integration is possible at all. Maven's pom.xml provides a ready-made model, while SBT provides a set of transformations - some of them may be pretty complex. One would need to run sbt behind IDEA to gather the final values of each Setting[T]. I guess :)

Tuesday, January 22, 2013

Android HttpsUrlConnection, self-signed certificate and hostname verifier

Today, an Android hint. A note to my future self :)

So, what we have:

  1. Server
    REST (JAX-RS + akka), published to Tomcat. Tomcat serves the app via HTTPS with a self-signed certificate (made per the official Tomcat docs).
    No CA, no chain - just a single certificate done in the usual way.
    Tomcat is published on a box accessible only via IP; no DNS here.
    The actual server is https://46.4.224.49, just in case someone wants to take a look at the cert.
  2. Client
    Android, trying to talk to the server above in JSON. Should use HTTPS for both traffic encryption and identity confirmation.
Generally, the implementation seems pretty straightforward - get the public certificate, and make the Android HTTPS facilities trust it. 

I decided to use HttpsUrlConnection rather than HttpClient, as the documentation suggests the latter was preferable only on older Android versions (prior to 3). The trick is that Android does not ship with the JKS keystore implementation, which seems to belong to Oracle (Sun, previously).

After some googling, I combined a few articles into a good working example:
  1. http://blog.crazybob.org/2010/02/android-trusting-ssl-certificates.html
    use the first 2 steps: how to obtain the public certificate from the server, how to pack it into a BKS keystore, and some clues on setting up BouncyCastle
    (hint - as of early 2013 we need bcprov-jdk15on-146.jar, though newer versions exist)
    I added the appropriate line to jre/lib/security/java.security to register the extra provider, and dropped the jar into jre/lib/ext. This is only needed for keytool to understand the new format.
  2. http://developer.android.com/reference/javax/net/ssl/HttpsURLConnection.html
    Feed HttpsUrlConnection an SSLSocketFactory relying on that keystore. The keystore goes into res/raw and is loaded via context.getResources().openRawResource(int id)
And this is where the trouble started. The test code kept producing errors. After the certificate was added, the error changed from 'Not trusted certificate' to 'Can not verify hostname'. Which is progress. 

The vast majority of the solutions on the Internet just allow all hostnames or, even worse, accept all certificates - which was unacceptable for security reasons.

The certificate had an IP address in the CN field. After some research, it was clear that the certificate was there and trusted; it was the hostname check that failed. That's when DefaultHostnameVerifier was spotted (yes, I know the javadoc is old, but the debugger shows this very class, hidden in a library with no source). And the old javadoc says clearly that it declines all checks!

Surely this check can never pass, as nothing is allowed. 

The issue is resolved easily by setting conn.setHostnameVerifier(new BrowserCompatHostnameVerifier()) - this switches HttpsUrlConnection into the same mode as a browser (accepting *.domain CNs, IP addresses, etc.), which is perfectly OK for our case.