Adding Amazon Alexa Voice Services to Your iOS App with Swift

Major thanks to the MacLexa project out on GitHub for providing motivation, source code, and a great starting place for this blog post.


The Amazon Echo is an always-listening device that acts as an always-available assistant. Thanks to its deep integrations with online services, you can issue a wide variety of questions and commands via voice and get real-world responses.

Once your Amazon Echo is set up, it is always listening for commands. The key word being listened for is “Alexa”. For example: “Alexa, play Taylor Swift.” If you have an Amazon Music account, and you have purchased Taylor Swift songs (who doesn’t have the 1989 album by this point?), then your Amazon Echo will start playing Taylor Swift.

Notice that I used “Alexa” as a prefix in the above voice command. Saying “Alexa” to an Amazon Echo is the trigger. It is like saying “Hey Siri” to your iPhone, “OK, Google” to your Android phone, or “Xbox” to your Xbox, then issuing a voice command. Once you have spoken “Alexa”, the Amazon Echo records the rest of your voice command and relays it to Amazon’s Alexa Voice Service, which takes specific actions and returns a response from Amazon.

The Alexa Voice Service has another trick up its sleeve: Skills. Third-parties can create a Skill using Amazon’s Alexa Skills Kit which can extend the Alexa Voice Service via additional service integrations.

  • Fidelity has created a stock value skill that allows you to get answers to questions about values on the stock market. Once you have activated the Fidelity stock skill for your Amazon login via an Android or iOS app, then you can ask Alexa: “What is the value of Microsoft stock?” and get back an answer via the Fidelity stock skill.
  • There is a skill called Yonomi that integrates with Phillips Hue lightbulbs. It allows you to change the color of your lights with a voice command, so you could flood the room with red light whenever you say: “Alexa, set lights to ‘Game of Thrones’ mode.”
  • Of course, you can also buy stuff from Amazon. Just say: “Alexa, buy me some Moon Cheese.” If you are an Amazon Prime subscriber, and you have already ordered Moon Cheese, it will be ordered automatically and FedEx will deliver it to your doorstep in 2 days or less (or the same day if you have Amazon Prime Now).
  • If you have a Spotify subscription, you can configure Alexa to integrate with the Spotify Skill and issue voice commands to playback all the music you can stream from that service.

Let’s review all the terms in this Amazon voice control and audio device realm:

  • Echo — The always-on, standalone device that listens and responds to voice commands.
  • Alexa — The keyword that the Amazon Echo uses to determine when a voice command is being listened for. It is also the brand name for the back-end voice services.
  • Alexa Voice Service — The backend service that receives voice commands and delivers audio responses to voice commands.
  • Skill — A third party add-on to handle special voice commands.
  • Alexa Skills Kit — The collection of APIs and resources developers use to create skills.

 

iOS App Integration of the Alexa Voice Service

The Amazon Echo is an interesting device; however, I have an iPhone with a perfectly good microphone and speaker combination in it. Let’s integrate Amazon’s Alexa Voice Service (AVS) directly within an iOS app.

Why would you want to do this kind of in-app Alexa Voice Service integration?

  • It is just plain fun to see this actually work.
  • It is an interesting in-iOS-app stopgap that addresses the limitations of SiriKit’s intent domains for your specific app type.
  • Promote and do in-app integration of your own custom AVS Skill.
  • Get insight into Amazon’s web API design methodology.

Side Note: Not to be underestimated is the amazing configurability that Cortana allows app developers for Windows 10 Universal Windows Platform apps. However, Cortana integration and LUIS are topics for a different Blog post.

Let’s go through the steps needed to perform AVS integration within your native iOS app.

If you have a Mac and want to try out Alexa Voice Services + Skills on your Mac right now, check out MacLexa.

There is a four-step in-app procedure for integrating the Alexa Voice Service into your iOS application:

  • Authorize the app via Login with Amazon and retrieve an Access Token.
    • This requires a user account with Amazon.
    • It is used to associate future Alexa Voice Service requests with the Alexa settings that any Amazon user has setup on their own personal Amazon account.
    • This step also lets your in-app Alexa integration fully work with any Skills that are associated to an Alexa + Amazon account via Amazon’s Alexa configuration apps.
  • Record audio from the device microphone and store it locally to your app.
  • HTTP POST the Access Token and the audio from the device microphone to the Alexa Voice Service.
  • Playback Alexa’s voice response audio which is returned raw from the Alexa Voice Service.

 

[Image: amazon-voice-services]

Sample Source Code

Feel free to pull from my AlexaVoiceIntegration GitHub repo for a full working sample. (Sorry, it is still in Swift 2.3.)

The most fun chunk of code is the upload function which performs the upload to AVS and the playback of the Alexa voice response.

 

Registration of your Alexa Voice Service app integration with Amazon Developer Portal

In order for your app to integrate with Alexa Voice Services, you need to go to the Amazon Developer Portal and get a few specific values:

  • Application Type ID
  • Bundle Id
  • API Key

To get started, keep the end goal in mind: the in-app Alexa startup procedure needs to produce an access token string that we can send via HTTP to the Alexa Voice Service API.

We get the access token by using the LoginWithAmazon.framework for iOS, feeding in the Application Type ID, Bundle Id, and API Key values you will configure and generate on the Amazon Developer Portal.

From your Amazon Developer Portal / Alexa configuration you need the following values:

  • Application Type ID from your created Alexa Application / Application Type Info section
  • API Key and Bundle Id pair that you will create in the Alexa Application / Security Profile / iOS Settings area

Be sure to keep track of the Application Type ID, API Key, and Bundle Id. We are going to need these later on when we setup our iOS app.

 

iOS code and Xcode project setup to use the LoginWithAmazon.framework from Swift

Start by going through Amazon’s documented steps to get the LoginWithAmazon.framework file.

What follows is a fairly standard way of adding a third-party framework to an existing Xcode project.

Copy the LoginWithAmazon.framework file to a folder within your iOS Xcode project.

Open Xcode and go to your iOS project General settings:

[Screenshot: Xcode project General settings]

In the Embedded Binaries section press the + button.

Navigate the file chooser to where you copied the LoginWithAmazon.framework bundle. Press OK.

You should see something like the above, where the LoginWithAmazon.framework file appears in both the Embedded Binaries and the Linked Frameworks and Libraries sections.

To fix an app deployment bug with the framework go to the Build Phases section and ensure that the Copy only when installing checkbox is checked:

[Screenshot: “Copy only when installing” checkbox in Build Phases]

The final step is to ensure that the master header from the LoginWithAmazon.framework is included in your Objective-C to Swift bridge header file.

If you already have an Objective-C to Swift bridge header file, then include the following line:

#import "LoginWithAmazon/LoginWithAmazon.h"

If you do not have a bridge header file, then you need to configure your Xcode project with an Objective-C to Swift bridge header, then include the above line in it.

See also the official Swift and Objective-C in the Same Project documentation provided by Apple.

Test to see if all this worked:

  • Get a clean build of your app.
  • Go into a Swift source file and use the standard auto-complete to try and access the AIMobileLib static class.
    • The auto-complete should present the list of functions you can call.

 

Configure your app with the API Key, Bundle Id, and Application Type ID from the Amazon Developer Portal

First up is to ensure that Bundle Id, API Key, and other values are properly configured in your Info.plist and application.

Open up your app’s Info.plist within Xcode:

[Screenshot: Info.plist entries for Login with Amazon]

 

Bundle identifier

Whoa, that’s a lot of weirdness with $(…) stuff.

As you can see, the core of the Login with Amazon values we need is the value of $(PRODUCT_BUNDLE_IDENTIFIER).

The value for $(PRODUCT_BUNDLE_IDENTIFIER) comes from your General project setting page within Xcode:

[Screenshot: Bundle Identifier field in the General project settings]

The above value in the Bundle Identifier field has to match the Bundle Id value from the Amazon Developer Portal.

If the Bundle Ids don’t match, then it is easy to go back to the Amazon Developer Portal and add a new value. Just be sure to track the right API Key and Application Type Id with the updated Bundle Id.

 

URL Types in Info.plist 

The LoginWithAmazon.framework file uses an in-app UIWebView to handle user login scenarios.

Part of those login scenarios involve the UIWebView needing to navigate or redirect to a URL on login success / failure scenarios.

The redirect URL used is generated by the LoginWithAmazon.framework using your Bundle Id as a seed.

When the login result redirect happens within the UIWebView during login, the AppDelegate’s openURL function is called in your app.

This boilerplate Swift implementation ensures that openURL portion of the login procedure properly routes through the LoginWithAmazon.framework file to call back on all the properly setup delegates:

func application(application: UIApplication, openURL url: NSURL, sourceApplication: String?, annotation: AnyObject) -> Bool {
    return AIMobileLib.handleOpenURL(url, sourceApplication: sourceApplication)
}

Debugging Tip: If you place a breakpoint on the above function and it is never hit during the login procedure, then you have a misconfigured Info.plist –> URL Types area.

App Transport Security Settings

The settings shown above are the most liberal and allow all HTTP traffic from within the process to be sent. To really lock this down, you should follow Amazon’s instructions which only allow requests to be sent from your app to specific Amazon domains.

API Key

The API Key entry within the Info.plist is read and processed by the LoginWithAmazon.framework to get all the right IDs for the requests (i.e. Client ID, and others). The API Key has to match exactly what the Amazon Developer Portal provided. It will be a fairly huge blob of text, and that is OK.

 

Config is now done! Woo hoo! Let’s log in with AIMobileLib

ViewController.swift up on my Github repo shows how login will all come together.

The AIMobileLib is your static gateway provided by the LoginWithAmazon.framework.

Side note: I have ‘Swiftified’ my calls into LoginWithAmazon.framework by using an AIAuthenticationEventHandler wrapper class that implements the AIAuthenticationDelegate and bridges the delegate calls to closures.

The call chain to AIMobileLib to successfully login to Amazon:

  • clearAuthorizationState – Clear out any stored tokens and values.
    • authorizeUserForScopes – Pops a web view, user logs in, retrieves an authorization token.
      • getAccessTokenForScopes – Takes all the cookies and keychain stored values from authorizeUserForScopes and retrieves the access token.
        • We then use the access token in calls to the Alexa Voice Service.

In this sample I chose to clear out any stored tokens and values by using the clearAuthorizationState function on every run.

AIMobileLib.clearAuthorizationState(AIAuthenticationEventHandler(
    name: "Clear Or Logout",
    fail: { () -> Void in
        NSLog("Clear Or Logout Fail")
    },
    success: { (result: APIResult!) -> Void in
        NSLog("Clear Or Logout Success")
    }))

 

Now that all tokens / cookies have been cleared, let’s have the user login via authorizeUserForScopes.

Finally, we are at the location where we need to use that Application Type Id from the Amazon Developer Portal.

We need to feed the Application Type Id from the Amazon Developer Portal into the scope options JSON:

let options = [kAIOptionScopeData: "{\"alexa:all\":{\"productID\":\"Application Product Id\", \"productInstanceAttributes\": {\"deviceSerialNumber\":\"1234567890\"}}}"]
       

Note: kAIOptionScopeData key value comes from the LoginWithAmazon.framework headers.

 

When authorizeUserForScopes is successful we then turn around and call the getAccessTokenForScopes function.

When getAccessTokenForScopes is successful, we have the access token string we can use in direct calls to the Alexa Voice Service.
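Here is a rough Swift 2 sketch of chaining those two calls with the AIAuthenticationEventHandler closure wrapper described above, using the options dictionary from the previous snippet. Treat the exact method signatures as assumptions; they may differ slightly between versions of the LoginWithAmazon.framework, so check the headers that ship with your copy.

let scopes = ["alexa:all"]

AIMobileLib.authorizeUserForScopes(scopes, delegate: AIAuthenticationEventHandler(
    name: "Authorize",
    fail: { () -> Void in
        NSLog("Authorization failed")
    },
    success: { (result: APIResult!) -> Void in
        // The framework has stored the authorization state (cookies / keychain),
        // so we can immediately ask for the access token.
        AIMobileLib.getAccessTokenForScopes(scopes, withOverrideParams: nil, delegate: AIAuthenticationEventHandler(
            name: "GetAccessToken",
            fail: { () -> Void in
                NSLog("Access token retrieval failed")
            },
            success: { (tokenResult: APIResult!) -> Void in
                // The token arrives as a plain string that we can hand to the Alexa Voice Service.
                if let accessToken = tokenResult.result as? String {
                    NSLog("Access token: %@", accessToken)
                }
            }))
    }), options: options)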

 

We have the access token!  Let’s call into that Alexa Voice Service.

The Alexa Voice Service makes sending voice commands and receiving voice responses a straightforward process:

  • Record the user’s question from the microphone as PCM data (a recording sketch follows this list).
  • Send an HTTP POST request to the Alexa Voice Service containing the sound data.
  • The response to the HTTP POST contains the sound data to play back to the user, answering their question.
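Here is a minimal Swift 2 sketch of that recording step using AVAudioRecorder. The 16 kHz / 16-bit / mono LPCM settings are an assumption based on what AVS has historically expected, so verify them against the current AVS documentation; the app also needs microphone permission.

import AVFoundation

// Minimal sketch: capture microphone input as LPCM (assumed 16 kHz, 16-bit, mono).
let recordingPath = (NSTemporaryDirectory() as NSString).stringByAppendingPathComponent("alexa-request.wav")
let recordingURL = NSURL(fileURLWithPath: recordingPath)

func startRecording() throws -> AVAudioRecorder {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(AVAudioSessionCategoryPlayAndRecord)
    try session.setActive(true)

    let settings: [String: AnyObject] = [
        AVFormatIDKey: Int(kAudioFormatLinearPCM),
        AVSampleRateKey: 16000.0,
        AVNumberOfChannelsKey: 1,
        AVLinearPCMBitDepthKey: 16,
        AVLinearPCMIsFloatKey: false,
        AVLinearPCMIsBigEndianKey: false
    ]

    let recorder = try AVAudioRecorder(URL: recordingURL, settings: settings)
    recorder.record()
    return recorder
}

func finishRecording(recorder: AVAudioRecorder) -> NSData? {
    recorder.stop()
    // The raw file contents are what gets POSTed to the Alexa Voice Service.
    return NSData(contentsOfURL: recordingURL)
}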

The Alexa Voice Service calls are all handled within the AVSUploader class.

  • startUpload – The master function that manages the upload process. Takes in completion handlers for success / failure.
    • start – Just a general internal helper function, checks parameters and other cleanup.
      • postRecording – Manages an HTTP POST operation to the Alexa Voice Service using the access token, some boilerplate JSON, and the raw PCM data recorded from the device microphone (a rough sketch of this request follows this list).
    • When postRecording is successful, the completion handler fed into startUpload will be called.
      • A parameter to the success completion handler is the raw PCM sound data (in the form of NSData) from the Alexa Voice Service that contains the voice response for the recorded user command.
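To make that concrete, here is a rough Swift 2 sketch of the kind of multipart POST the postRecording step performs (this is not the sample repo’s code). The endpoint URL, part names, and metadata JSON are assumptions based on the v1 AVS REST API as I recall it; the AVSUploader class in the sample repo and Amazon’s current documentation are the authoritative references.

import Foundation

// Rough sketch of an AVS speech-recognizer upload (Swift 2.x, NSURLSession).
// The endpoint and multipart layout are assumptions -- verify against the AVS docs.
func postRecording(accessToken: String, audioData: NSData, completion: (NSData?, NSError?) -> Void) {
    let boundary = "BOUNDARY-\(NSUUID().UUIDString)"
    let url = NSURL(string: "https://access-alexa-na.amazon.com/v1/avs/speechrecognizer/recognize")!

    let request = NSMutableURLRequest(URL: url)
    request.HTTPMethod = "POST"
    request.addValue("Bearer \(accessToken)", forHTTPHeaderField: "Authorization")
    request.addValue("multipart/form-data; boundary=\(boundary)", forHTTPHeaderField: "Content-Type")

    // Boilerplate JSON describing the audio being sent (assumed v1 format).
    let metadata = "{\"messageHeader\": {}, \"messageBody\": {\"profile\": \"alexa-close-talk\", " +
        "\"locale\": \"en-us\", \"format\": \"audio/L16; rate=16000; channels=1\"}}"

    let body = NSMutableData()
    func append(string: String) {
        body.appendData(string.dataUsingEncoding(NSUTF8StringEncoding)!)
    }

    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"request\"\r\n")
    append("Content-Type: application/json; charset=UTF-8\r\n\r\n")
    append("\(metadata)\r\n")

    append("--\(boundary)\r\n")
    append("Content-Disposition: form-data; name=\"audio\"\r\n")
    append("Content-Type: audio/L16; rate=16000; channels=1\r\n\r\n")
    body.appendData(audioData)
    append("\r\n--\(boundary)--\r\n")

    request.HTTPBody = body

    NSURLSession.sharedSession().dataTaskWithRequest(request) { data, _, error in
        // On success, data contains Alexa's audio answer (wrapped in multipart framing
        // that the caller strips before playback).
        completion(data, error)
    }.resume()
}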

The PCM sound data returned from the Alexa Voice Service can then be played through an AVAudioPlayer (From ViewController.swift):

self.player = try AVAudioPlayer(data: audioData)
self.player?.play()
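In a real view controller you will want the player retained as a property (which is what the self.player above implies) and the throwing initializer wrapped in error handling. A minimal sketch:

import AVFoundation

class ResponsePlayer {
    // Keep a strong reference; playback stops if the AVAudioPlayer is deallocated.
    var player: AVAudioPlayer?

    func play(audioData: NSData) {
        do {
            player = try AVAudioPlayer(data: audioData)
            player?.play()
        } catch {
            NSLog("Failed to play Alexa response: %@", "\(error)")
        }
    }
}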

 

Go out and play… and also take a look at making your own Skill

The sample Swift code to access the Alexa Voice Services is just meant to get you started. It allows you to have some level of control in how Alexa Voice Services are accessed and used within your own app. It can also be used to test any Skills you choose to integrate with your Amazon profile without needing an Amazon Echo device.

The techniques outlined above can also be fully replicated for any other native device, platform, or front end.

Using Aurelia’s Dependency Injection Library in a Non-Aurelia App, Part 1

If you are anything like me then you like to try to keep your code loosely coupled, even your JavaScript code.  The ES2015 module spec helped solve a lot of issues with dependency management in JavaScript apps, but it did not really do anything to prevent having code that is tightly coupled to the specific imports. When Aurelia was originally announced, one of the things that first caught my eye was that it included a dependency injection library that was designed to be standalone so you could use it even if you were not including the rest of the Aurelia framework.  Now that Aurelia has had some time to mature, I decided to see how exactly it might look to use the dependency injection library in a variety of non-Aurelia applications.

In this two-part blog series, I will unpack a few basics about the library itself, and then show how it might be used in three different apps: a vanilla JavaScript app, a React app, and then a React app that uses Redux for its state management.

The DI Library

Before we dive into how you would integrate the dependency injection library into your application, we first need to take a look at the how the library works.

If you want to use Aurelia’s dependency injection library, then I would suggest installing it from NPM with “npm install aurelia-dependency-injection”.  You’ll notice there are only two total dependencies that also get installed: aurelia-pal and aurelia-metadata. Aurelia-metadata is used to read and write metadata from your JavaScript functions, and aurelia-pal is a layer that abstracts away the differences between the browser and server so that your code will work across both environments.

Once you have installed the library, the concept is similar to the Unity dependency injection container for .NET.  You create one or more nested containers, each containing its own type registrations, and then types are resolved against a container with the ability to traverse up the parent container chain if desired.  When you register a type, you are able to specify how exactly it will be constructed, or if it is already an instance that should just be returned as-is.

Registration Types

There are three basic lifecycles that you can choose when you register something with the container, and then there are several advanced methods if you need more flexibility.  Let us consider the three lifecycle options first.

Standard Usage

When you want to register an object or an existing instance with the container, you should use the registerInstance method.  When the container resolves an instance, it will not attempt to construct a new one or manipulate it in any way.  The registered instance will simply be returned.

If you want to have your type be constructed every time that it is resolved from the container, then you want to use the registerTransient method.  When you register a transient you need to register the constructor function so that the container can create new instances every time that it is resolved.

You might have something that you want to be a singleton but it still needs to be constructed that first time.  You could either construct it yourself and register it as an instance, or register the constructor function using the registerSingleton method.  This behaves like the registerTransient function except that it will only construct the object the first time it is resolved, and then it will return that instance every other time.  When a singleton is registered, the container that it was registered with will hold on to a reference to that object to prevent it from getting garbage collected.  This ensures that you will always resolve to the exact same instance.  One thing to remember with singletons though is that they are only considered a singleton by a specific container.  Child containers or other containers are able to register their own versions with the same key, so if that happens, then you might get different instances depending on which container resolved it.  If you want a true application level singleton then you need to register it with the root container and not register that same key with any child containers.

If you attempt to resolve a type that does not exist the default behavior is to register the requested type as a singleton and then return it.  This behavior is configurable, though, so if you do not want it, then you should disable the autoregister feature.

Advanced Usage

Now that we have looked at the three basic use cases for registering with the container let us take a look at the more advanced approaches.  If you have a very specific use case that is not covered by the standard instance/transient/singleton resolvers then there are two other functions available to you to give the flexibility to achieve your goals.

If you need a custom lifetime other than singleton/instance/transient, you may register a custom handler with the container.  The handler is simply a function that takes the key, the container, and the underlying resolver and lets you return the object.

If you need a custom resolution approach, then you can register your own custom resolver with the registerResolver function.


import { Container } from 'aurelia-dependency-injection';

// create the root container using the default configuration
const rootContainer = new Container();
// makeGlobal() will take the current instance and set it on the static Container.instance property so that it is available from anywhere in your app
rootContainer.makeGlobal();

const appConstants = { name: 'DI App', author: 'Joel Peterson' };

function AjaxService() {
    return {
        makeCall: function() { ... }
    };
}

// registerInstance will always return the object that was registered
rootContainer.registerInstance('AppConstants', appConstants);

// create a nested Container
const childContainer = rootContainer.createChild();

// register a singleton with the child container
childContainer.registerSingleton(AjaxService);

Resolution

Now that we have considered how to use the container to register individual instances or objects, let us take a look at how types are resolved.

In my opinion, the real benefit of the Aurelia container comes when you use it to automatically resolve nested dependencies of your resolved type.  However, before it can resolve your nested dependencies you have to first tell the container what those dependencies are supposed to be.

The Aurelia dependency injection library also provides some decorators that can be used to add metadata to your constructor functions that will tell the container what dependencies need to be injected when resolving the type.  If you want to leverage the decorator functions as actual decorators then you will need to add the legacy Babel decorators plugin since Babel 6 does not support decorators at the moment, https://github.com/babel/babel/issues/2645.  However, my advice would be to use the decorator functions as plain functions so that you do not have to rely on experimental features.


import { inject } from 'aurelia-dependency-injection';
import MyDependency1 from './myDependency1';
import MyDependency2 from './myDependency2';

function MyConstructor(myDependency1, myDependency2) { ... }

inject(MyDependency1, MyDependency2)(MyConstructor);

export default MyConstructor;

In this example, the inject function adds metadata to the constructor function that indicates which dependencies need to be resolved and injected into the argument list for your constructor function.  It is important to keep in mind that the dependencies will be injected in the order that they were declared, so be sure to make your arguments list order align with your inject parameter list.

Some developers might decide that they do not want to have to manually register all of their types with the container and would rather have it automagically be wired up for them.  Aurelia does support this approach as well with the autoregister feature.  However, it is probably not going to be ideal to have everything be registered as singletons so Aurelia provides other decorators that you can use to explicitly declare how that type will be autoregistered.  Once you decorate your items as singletons or transients, then whenever they are resolved they will autoregister with that lifetime modifier and you can build up your app’s registrations on-demand.


import { singleton, transient } from 'aurelia-dependency-injection';

function MyType() { ... }

// by decorating this type as a singleton, if this is autoregistered it will instead be registered as a singleton instead of as an instance
// default is to autoregister in root container
singleton()(MyType);
// or you can allow it to be registered in the resolving container
singleton(true)(MyType);

// or you can specify that this is a transient type
transient()(MyType);

export default MyType;

Resolver Modifiers

Aurelia also provides a few different resolver modifiers that you can use to customize what actually gets injected into your constructor functions. For instance, maybe you do not want the container to autoregister the requested type if it does not exist and just return a null value instead, or maybe you want to return all of the registered types for a given key instead of just the highest priority type.  These resolver modifiers are used when you specify the dependencies for your given constructor function.


// this list is not exhaustive, so be sure to check out Aurelia's documentation for additional resolvers

import { Optional, Lazy, All, inject } from 'aurelia-dependency-injection';

function MyConstructor(optionalDep, lazyDep, allDeps) {

    // optional dependencies will be null if they did not exist
    if (optionalDep !== null) { ... }

    // lazy dependencies will return a function that will return the actual dependency
    this.actualLazyDep = lazyDep();

    // all will inject an array of registrations that match the given key
    allDeps.forEach((dep) => { ... });
}

inject(Optional.of(Dep1), Lazy.of(Dep2), All.of(Dep3))(MyConstructor);

export default MyConstructor;

Vanilla JavaScript

Now that we have talked about the basics of how to use Aurelia’s dependency injection library, the first type of application that I want to consider is a simple application written in vanilla JavaScript.  The full source code for this example app can be found at my Github repo, so I will just explain some of my choices and the reasons behind them.

I created a module that is responsible for returning the root container.  My personal preference is to be able to explicitly import the root container in my app instead of relying on the Container.instance static property being defined.

There are many different ways that you can create your UI with vanilla JavaScript and I opted to create a simple component structure where each component has a constructor, an init function, and a render function.  I decided to keep the init phase separate from the constructor so that the container can use the constructor solely for passing in dependencies.  There is a way in which you can supply additional parameters to the constructor, but I decided it would be simpler to just have the init functions.  However you end up writing your UI layer, I would advise that you do it in such a way that your constructor parameters are only the required dependencies; otherwise, you will have to do a more complicated container setup.

I also decided to allow for my components to track a reference to the specific container instance that resolved them.  This allows components to create a child container off of the specific container that resolved the current component and build a container tree.

However, one thing that I did discover with the deeply nested container hierarchies is that child dependencies resolve at the container that resolved the parent, and the resolution of child dependencies does not start back down at the original container.  For instance, consider this example.


// component A depends on component B

rootContainer.registerTransient(ComponentA);
rootContainer.registerTransient(ComponentB);

const childContainer = rootContainer.createChild();
childContainer.registerTransient(ComponentB);
const componentA = childContainer.get(ComponentA);

In this example, I would expect that since ComponentA does not exist in the child container that it would fall back to the root container to resolve.  However, when it sees that ComponentA depends on ComponentB and attempts to resolve ComponentB, I would expect it to start from childContainer since that is where the initial resolution happened.  Based on my experience, however, it seems like it starts at rootContainer since that is the container that actually resolved ComponentA.  This can cause issues if you attempt to override a previously registered item in a child container and that is a dependency of something that is only defined in the parent container.  In my example app, I ran across this and ended up re-registering the dependents of my overridden module in my child container so that the resolution would occur properly.

Conclusion

In this article, we discussed some of the basic functionality of Aurelia’s dependency injection library and how you might incorporate it into a vanilla JavaScript application.  In Part 2, we will look at how you might also wire up dependency injection into a plain React application as well as a React application that uses Redux for state management.

Virtual Reality vs. Augmented Reality – Impact on Your Business


Virtual reality (VR) and augmented reality (AR) have come out of nowhere to become ‘the next big thing’.

With the failure of Google Glass as an augmented reality platform for consumers, the seeming success of the Oculus Rift as a gaming-based virtual reality platform, and the weird novelty of Microsoft HoloLens as a resurgence in the augmented reality realm, it can be hard to understand the scope, purpose, and worth of these new ‘worn on the head devices’.

  • Are they toys?
  • Are they the ‘next big thing’?
  • How could these directly impact my bottom line?
  • Can they add efficiency to current day to day work across an enterprise?
  • Is this VR / AR stuff even worth paying any attention to?
  • How did these things rise up and become a multi-billion dollar business, when I just see them as weird head-mounted toys?

VR and AR – A History

Everything starts with MEMS. MEMS stands for Micro Electromechanical Systems. MEMS allowed for the at-scale manufacturing of tiny, easily integrated, highly accurate, and super cheap sensors such as accelerometers and gyroscopes.

These MEMS-based sensors can fit within a housing that is no larger than a chip on a tiny motherboard. The sensors are super accurate and easily integrated into anything with a breadboard. The first consumer breakthrough MEMS device was the accelerometer within the Wii Remote from Nintendo. An accelerometer measures the relationship of the sensor to gravity, meaning it can tell you whether you are moving up (i.e. negative acceleration relative to gravity) or moving down (i.e. positive acceleration relative to gravity). As Nintendo found out, this is all you need to create a multi-billion dollar home gaming business and revive your whole company.

Technology moved on from the Nintendo Wii game console and the simple accelerometer. Soon we had MEMS-based gyroscopes that could measure the sensor’s orientation across all spatial dimensions, not just its relationship to gravity. We got super small magnetometers that measure the device’s relationship to the Earth’s magnetic North Pole, barometers that measure air pressure and hence altitude, and GPS sensors that measure the latitude / longitude / time / altitude of the device using the Global Positioning System. Then you get the iPhone 5, which incorporates all of these sensors, plus Apple’s CoreMotion and CoreLocation iOS frameworks, which allow any iOS app developer to discover and log the relationship of an iPhone to all of these real-world spatial dimensions by regulating, smoothing, and aggregating input from all those sensors in real time for easy in-app consumption.

All of this using sensors that cost less than $10 to make and are as easy as soldering their dangling pins to a motherboard.

In addition to sensor technology, we also have the ARM and SOC (System on a Chip) revolution. ARM Holdings is an intellectual property and design company that produces simple, low-power chip designs. ARM happily toils away shipping out amazingly fast, super low-power instruction set and chip designs that anyone can license off the shelf and manufacture. ARM doesn’t make the chips; they are simply masters of chip and instruction set design. Then, seemingly out of nowhere, the ARM revolution arrives in full force: Apple, Nokia, and Samsung take these ARM designs off the shelf and start to manufacture super cheap, low-power, super fast, highly efficient full systems on a chip.

Those ARM SOCs start to make their way into hugely profitable smartphones. The high margin of smartphones relative to their cost causes a virtuous cycle of ARM SOC manufacturing improvements + cost reduction + ARM design improvements. Suddenly, we have all this high-end sensor and CPU power at super low energy cost and monetary cost. It is now that we can put high-end ARM hardware + high-end sensor packages into the tiniest of shells with off-the-shelf battery power. Through the natural cycle of profitability driving innovation, we are at the point where we can incorporate actual supercomputers into the smallest of form factors. Even wearables.

At the same time as we are getting all this high-end ARM CPU and high-end super low form factor sensor packages, we get the capacitive touch revolution and the blue LED revolution. Now we can also create screens that sip power and are instantly relatable via the human power of touch.

On the side as part of the SOC revolution we also get the PowerVR line of high-end graphics chips which sip power and have a full hardware accelerated 3D API via OpenGL and DirectX.

It is from the shoulders of PowerVR 3D chips, all those MEMS sensors, LED screens, the ARM SOC revolution, and the virtuous cycle of aggregation + capitalization via the smartphone, that we now have a platform from which to go at full head mounted wearables, which are viable for early adopters to experience virtual reality and augmented reality applications.

The first round of these head mounted wearables falls into 2 camps:

  • Virtual Reality (VR)
  • Augmented Reality (AR)

Virtual Reality

Virtual reality works off the basis that the only thing the user can see and hear are the outputs of the virtual reality device. No outside stimulus is meant to enter the user experience.

Augmented reality works off the basis that the user needs their real world sensory experience augmented with virtual data and graphics. The scope of human vision while using an augmented reality device is meant to be completely real world, while in real time a digital overlay will be presented into the main sensory experience of the user.

The current mainstream implementation of virtual reality exists in the following major consumer products:

  • Google Cardboard
  • Samsung Gear VR
  • Oculus Rift

Google Cardboard
Google Cardboard is exactly what the name implies. It is just a simple cardboard (yes, just cardboard) mount for a smartphone.

The primary use of Google Cardboard is to present stereoscopic, 360 degree / spherical, video.

There are existing SDKs for Android and iOS which allow easy division of a smartphone screen for stereoscopic viewing of a specially created 360-degree video.

360-degree videos are easily created using special multi-camera rigs that film the full sphere of a given location. Software tools then aggregate the input from all the cameras on the rig into a single video. The single video is then projected onto a smartphone screen using special 3D spherical transforms with the 3D support of a PowerVR chip. 360-degree video playback is augmented further by aggregating input from all the gyroscope, accelerometer, and other position sensors present on a cellphone, to make the user feel immersed in the environment of the 360-degree / VR video.

Almost any smartphone can provide a VR experience for a user via Google Cardboard. It is simple canned video playback, with a few 3D tricks, to make the filmed environment seem more real to the user.

Gear VR
Samsung Gear VR ups the ante just a little over Google Cardboard by providing a more dynamic experience than a canned 360-degree video.

Oculus Rift
The ultimate VR experience is the Oculus Rift. Oculus Rift is a full division within Facebook.

The Oculus Rift completely shields the user from the real world via a full headset. The headset is wired into a high-end Intel based PC + high end graphics card. Less than 1% of all PCs shipped can power the Oculus Rift. The full power of the PC is pushed into the Oculus Rift to create fully immersive virtual worlds.

The revenue model for Oculus Rift is primarily gaming to start.  One can easily envision more live social interaction, and/or some enterprise uses (i.e. simulated training) for the fully VR worlds that can be created with Oculus Rift.

Most VR developers have stated that they are glad they get to work on VR and not AR. Not having to bring in live computer vision from the real world, in real time, and augment real-time human perception is seen as making their job way easier.

Augmented Reality

Augmented reality largely needs to solve all the problems of virtual reality, in addition to bringing in the live sensor input of human vision.

It is my opinion that augmented reality systems are much more suited to future enterprise use than virtual reality systems. This value within enterprise has largely been vetted by the weird Beta release and consumer level failure of Google Glass.

Google Glass
Google Glass was a solution looking for a problem. Google sought that problem in the consumer space, hoping for some level of mainstream attraction and adoption. Instead, Google Glass has found a second life in the enterprise.

For those of you who haven’t used Google Glass: all it did was project the equivalent of a 32-inch flat panel TV into the upper right portion of your vision.

It was that simple. Any time you wanted to interact with the digital world, all you had to do was look up and to the right and there was a full virtual screen showing you something.

In many enterprise applications, the simple implementation of augmented reality via Google Glass can be extremely powerful. Imagine being an aircraft mechanic who would like to look up the design, plans, or details of a given part you are working on, but you are stuck in a fuel tank. Now you can get that information by simply looking up and to the right. The very simple projection of a single rectangle overlaid onto human vision can lead to huge efficiencies in many areas of real work. The initial $1,500 price tag for Google Glass may have been a small one to pay for many in-the-field jobs. If all Google Glass did was allow a technician to view a relevant page of a PDF-based manual in the upper right area of their vision during tasks such as aircraft maintenance, automobile fleet maintenance, or assembly line work, the value of the simple augmented reality model that Google Glass presented may be realized quite quickly in increased quality, reduced repair times, or even increased safety.

It was unfortunate that Google went after the consumer instead of the enterprise markets.

Google Glass was just the start of augmented reality. We now have Microsoft HoloLens, which goes beyond the simple projected rectangle in the upper right of your vision and can incorporate fully 3D overlays onto any object within your field of vision.

Microsoft HoloLens
Microsoft HoloLens starts with simple gaming models for mass-consumer based revenue, but one can see a much larger enterprise vision as a dramatic improvement over the possible gains from Google Glass.

Imagine a supervisor looking out over a manufacturing assembly line while wearing Microsoft HoloLens and seeing real time stats on each station.

Imagine UPS, Amazon, or FedEx workers being able to get guidance to where something is in a warehouse overlaid directly onto their vision without needing any physical signs.

Imagine software engineers that can place status and development windows anywhere in virtual space for their work.

Imagine DevOps staff who can look out over a data center and see which machines have likely future hard drive failures, or the real-time status of a software deployment and which servers it has reached.

Realization of all the above with Microsoft HoloLens is aided by Microsoft’s vision of the Universal Windows Platform (or UWP). UWP allows businesses to reuse much of their existing C# / .NET / C++ code across a myriad of devices: Windows 10 Desktops, Xbox One, Windows 10 Mobile, and Microsoft HoloLens. In essence, many enterprises may already have logic that they can integrate into an augmented reality device so they can realize certain efficiencies with minimal software development overhead.

Augmented reality holds boundless promise for efficiencies within an enterprise, and we are on the cusp of being able to actually realize those efficiencies especially with Microsoft HoloLens and the Universal Windows Platform.

Microsoft has also recently announced the launch of Windows Holographic which allows third parties to create their own augmented (or mixed) reality hardware using Microsoft’s software.

Virtual reality and augmented reality each have their future killer applications and killer niches. Hopefully you can find them within your business and start to realize greater efficiencies and value with augmented reality, and possibly virtual reality, based solutions.

Increase Local Reasoning with Stateless Architecture and Value Types

It is just another Thursday of adding features to your mobile app.

You have blasted through your task list by extending the current underlying object model + data retrieval code.

Your front-end native views are all coming together. The navigation between views and specific data loading is all good.

Git Commit. Git Push. The build pops out on HockeyApp. The Friday sprint review goes well. During the sprint review the product manager points out that full CRUD (Create, Read, Update, Delete) functionality is required in each of the added views. You only have the ‘R’ in ‘CRUD’ implemented. You look through your views, think it just can’t be that bad to add C, U and D, and commit to adding full CRUD to all the views by next Friday’s sprint review.

The weekend passes by, you come in on Monday and start going through all your views to add full CRUD. You update your first view with full CRUD; start navigating through your app; do some creates, updates, and deletes; and notice that all of those other views you added last week are just broken. Whole swaths of classes are sharing data you didn’t know was shared between them. Mutation to data in one view has unknown effects on the other views due to the shared references to data classes from your back-end object model.

Your commitment to having this all done by Friday is looking like a pipe-dream.

Without realizing it, you are now a victim of code that cannot be locally reasoned about, due to heavy use of reference types.

Local reasoning is the ability of a programmer to look at a single unit of code (i.e. class, struct, function, series of functions) and ensure that changes made to data structures and variables within that unit of code don’t have an unintended effect on unrelated areas of the software.

The primary example of poor local reasoning is handing out references to a single class instance to many different areas of the application. That shared instance is then mutated by every client that was handed the reference.

Adding to the pain is the possibility that a reference to your single instance was held by a client, then mutated by a parallel running thread.

Code that you thought that was one-pass, single-path, easy-peasy has now become a spider web of live wires blowing in the wind that short circuit and spark at random times.

With the recent rise of Swift, there has been a movement to use value types to avoid all that sparking caused by random mutation within reference types.

For so long, we were conditioned to use class types for everything. Classes seem like the ultimate solution: lightweight hand-off of references (i.e. pointers) instead of allocation and copying of all internal members across function calls; the ability to globally mutate in one shot via known publicly exposed functions. It all looks so awesomely ‘object-oriented’. Until you hit the complex scenarios that a CRUD-based user interface has to implement. Suddenly that awesome class-based object model is being referenced by view after view after subview after sub-subview. Mutation can now occur across tens of classes with tens of running threads.
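A tiny Swift sketch of the trap (all names here are hypothetical): two views hold the same class instance, so a mutation made on behalf of one silently changes what the other displays.

final class OrderModel {
    var items = ["Moon Cheese", "1989 (Deluxe)"]
}

final class OrderListView {
    let model: OrderModel
    init(model: OrderModel) { self.model = model }
}

final class OrderDetailView {
    let model: OrderModel
    init(model: OrderModel) { self.model = model }
}

let shared = OrderModel()
let list = OrderListView(model: shared)
let detail = OrderDetailView(model: shared)

// The detail view performs its 'D' in CRUD...
detail.model.items.removeAll()

// ...and the list view's data is gone too, even though none of its code ran.
print(list.model.items.count) // 0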

Time to go way too far up the geek scale to talk about possible solutions.

A classic trope in many Star Trek episodes was something sneaking onto the ship. Once on the ship, the alien / particle / nanite / Lwaxana Troi would start to wreak havoc with the red shirts / warp core / main computer / Captain Picard’s patience (respectively).

By using nothing but classes and reference types, even with a well defined pure OO interface to each, you are still spending too much time with your shields down, and letting too many things sneak onto the ship. It is time to raise the shields forever by using value types as an isolated shuttlecraft to move values securely between the ship and interested parties.

Apple has been emphasizing the use of value types for the past two years via their release of the Swift language.

Check out the WWDC 2015 / 2016 presentations that emphasize the use of Swift value types as a bringer of stability and performance, using the language itself to bring local reasoning to code.

Apple has even migrated many existing framework classes (i.e. reference typed) to value types in the latest evolution of Swift 3.0. Check out the WWDC 2016 Video: What’s New in Foundation for Swift.

At Minnebar 11, Adam May and Sam Kirchmeier presented on Exploring Stateless UIs in Swift. In their presentation, they outline a series of techniques using Swift to eliminate common iOS state and reference bugs. Their techniques meld together stateless concepts from React, Flux, and language techniques in Swift, to dramatically increase the local reasoning in standard iOS code.

Riffing off of Adam and Sam’s presentation, I came up with a basic representation of stateless concepts in Xamarin to solve a series of cross-platform concerns.

The rise of using value types to bring local reasoning to code is not isolated to Apple and Swift. The recognition that emphasizing local reasoning can eliminate whole swaths of bugs is also burning its way through the JavaScript community. React.js and the underlying Flux architectural concepts enforce a one-way, one-time push + render to a view via action, dispatcher, and store constructs. React + Flux ensure that JavaScript code doesn’t do cross-application mutation in random and unregulated ways. The local reasoning in React + Flux is assured by the underlying architecture.

Even the PUT, DELETE, POST, and GET operations lying underneath REST based web interfaces are recognition of the power of local reasoning and the scourge of mutation of shared references to objects.

C# and .NET languages are very weak on the use of value types as a bringer of local reasoning. For so long Microsoft has provided guidance along the lines of ‘In all other cases, you should define your types as classes‘, largely due to performance implications.

Have no fear, you can bring similar concepts to C# code as well via the struct.

The one drawback of C# is the ease with which a struct can expose mutators in non-obvious ways via functions and public property setters.

Contrasting with C#, Swift has the ‘mutating‘ keyword. Any function that will mutate a member of a struct requires the ‘mutating‘ keyword to be attached in order for compilation to succeed. Unfortunately, there is no such compiler enforced mutability guard for structs in C#. The best you can usually do is to omit the property set from most of your struct property definitions, and also use private / internal modifiers to ensure the reasoning scope of your type is as local as you can possibly make it.
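Here is a minimal Swift sketch of what that buys you: a copy is independent, and the compiler forces every mutation point to be declared with mutating (the names are hypothetical):

struct ShipPosition {
    var latitude: Double
    var longitude: Double

    // Without 'mutating' this method will not compile,
    // so every mutation point is explicit in the type itself.
    mutating func move(latitudeDelta: Double, longitudeDelta: Double) {
        latitude += latitudeDelta
        longitude += longitudeDelta
    }
}

var original = ShipPosition(latitude: 28.2, longitude: -177.3)
var copy = original          // value semantics: an independent copy
copy.move(latitudeDelta: 1.0, longitudeDelta: 0.5)

// 'original' is untouched -- no other view or thread holding a ShipPosition
// value can mutate this one behind our back.
print(original.latitude)     // 28.2
print(copy.latitude)         // 29.2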

The next time you see a bug caused by a seemingly random chunk of data mutating, give a thought to how you may be able to refactor that code using stateless architecture concepts and value types to increase the local reasoning of all associated code. Who knows, you may find and fix many bugs you didn’t even realize that you had.

Practical Data Cleaning Using Stanford Named Entity Recognizer

I enjoy learning about all of the events in and around World War II, especially the Pacific theater.

I was reading the book Miracle at Midway by Gordon W. Prange (et al.) and started to get curious about the pre- and post-histories of all the naval vessels involved in the Battle of Midway.

Historically, so many people, plans, and materials had to merge together at a precise moment in time in order for the Battle of Midway to be fought where it was and realize its radical impact on the outcome of World War II.

I got curious as to the pre- and post-history of all the people and ships that fought at Midway and wanted a way to visualize all of that history in a mobile application.

As a software guy, I started digging into possible data sources of ship histories. I was looking forward to constructing my vision of a global map view onto which I can overlay and filter whole historical data sets in and around naval vessels such as:

  • Date of departure
  • Date of arrival
  • Location in lat / long of source / destination
  • Other ships encountered during mission
  • Mission being performed
  • People who commanded and served
  • People who may have been passengers or involved in the mission some way

I am not anywhere near constructing my vision of fully time and space dynamic filtered ship history maps yet. The keyword being: Yet.

This story details one tiny step I undertook to try and get to my goal against a found data source.

 

Finding a data source

I found a whole set of plain text ship histories as part of the Dictionary of American Naval Fighting Ships (DANFS).

An awesome Pythonista had already ripped the DANFS site content into a SQLite database which contains all 2000+ ship histories. – Thanks jrnold!

The ship histories are interesting to read but are not organized in a machine friendly format. The histories are largely just OCR’d text + a little HTML styling markup which was embedded as a side effect of the OCR process. I fully acknowledge that beggars can’t be choosers when it comes to data sources of this type. DANFS is as good a place to start as I could find this side of petitioning the National Archives for all naval ship logs. As any data scientist will tell you: You will spend 90% of your time cleaning your data, then 10% of your time actually using your data.

In this case, most of the ship histories are written in chronological order from past to present (once you separate out some header text regarding the background of the naming of the ship). In theory, if we just had a way to markup all the location names in the text, we could process through the text and create an in-order list of places that the ship has visited. Please note: All this is largely naive. It’s actually much more complicated than this, but you have to start somewhere.

Needless to say: I don’t want to do location identification by hand across 2,000+ free text ship histories. We are talking about the entire history of the American Navy. Having to classify and separate out locations by hand, by myself, is a huge task!

It turns out that there is a library that will mark up locations within the text via an already trained machine learning based model: the Stanford Named Entity Recognizer (or Stanford NER)

Note: I continue in this post by using Stanford NER due to its C# / .NET compatibility. You may also want to check out Apache openNLP. Also keep in mind that you can also pre-process your data using NER / Natural Language Processing (NLP) and feed the results into ElasticSearch for even more server side search power!

DANFS History Snippet Before being run through Stanford NER:

<i>Abraham Lincoln </i>returned from her deployment to NAS Alameda on
9 October 1995. During this cruise, the ship provided a wide variety of
on board repair capabilities and technical experts to 17 American and
allied ships operating in the Middle East with limited or non-existent
tender services. In addition, the Communications Department completed a
telemedicine video conference with Johns Hopkins Medical Center that
supported X-ray transfers and surgical procedure consultations.

 

DANFS History Snippet After Stanford NER:

<i>Abraham Lincoln</i>
returned from her deployment to
<LOCATION>NAS Alameda</LOCATION>
on 9 October 1995. During this cruise, the ship provided a wide
variety of on board repair capabilities and technical experts to
17 American and allied ships operating in the
<LOCATION>Middle East</LOCATION>
with limited or non-existent tender services. In addition, the
<ORGANIZATION>Communications Department</ORGANIZATION>
completed a telemedicine video conference with
<ORGANIZATION>Johns Hopkins Medical Center</ORGANIZATION>
that supported X-ray transfers and surgical procedure consultations.

Pretty cool, huh?

Notice the ORGANIZATION and LOCATION XML tags added to the text? I didn’t put those into the text by hand. Stanford NER took in the raw text from the ship history of the U.S.S. Abraham Lincoln (CVN-72) (Before) and used the machine learning trained model to markup locations and organizations within the text (After).

 

Machine Learning: Model = Sample Data + Training

From the documentation regarding the Stanford Named Entity Recognizer:

Included with Stanford NER are a 4 class model trained on the CoNLL 2003 eng.train, a 7 class model trained on the MUC 6 and MUC 7 training data sets, and a 3 class model trained on both data sets and some additional data (including ACE 2002 and limited amounts of in-house data) on the intersection of those class sets. (The training data for the 3 class model does not include any material from the CoNLL eng.testa or eng.testb data sets, nor any of the MUC 6 or 7 test or devtest datasets, nor Alan Ritter’s Twitter NER data, so all of these remain valid tests of its performance.)

3 class: Location, Person, Organization
4 class: Location, Person, Organization, Misc
7 class: Location, Person, Organization, Money, Percent, Date, Time
These models each use distributional similarity features, which provide some performance gain at the cost of increasing their size and runtime. Also available are the same models missing those features.

Whoa, that’s an academic mouthful. Let’s take a deep breath and try to clarify the Stanford Named Entity Recognizer pipeline:

[Diagram: Stanford Named Entity Recognizer pipeline]

 

All the parts at the top of the diagram labelled ‘Machine Learning’ (with black borders) are already done for you. The machine learning + training output is fully encapsulated and ready for use via models available for download as a handy 170MB Zip file.

  • When you crack open the model zip file, you will see the classifiers sub directory which contains the 4 class Model, 7 class Model, and 3 class Model:
    • It is one of these 3 models that you will use to initialize the CRFClassifier in your client code depending on your needs.

The parts at the bottom of the diagram in the ‘Using NER Classifier’  space (with purple borders) are what you will do in your client code to get person, location, and organization markup placed within your text. Check out the instructions on how to do this in C# / .NET.

The key to the client code is the CRFClassifier from the Stanford.NER.NLP NuGet Package. The CRFClassifier takes in your text and the trained model output file, from the above zip file, to do classification of person, location, and organization within your text.

Sample C# code:

// Loading the 3 class classifier model
var classifier = CRFClassifier.getClassifierNoExceptions(
    classifiersDirectory + @"\english.all.3class.distsim.crf.ser.gz");

var s1 = "Good afternoon Rajat Raina, how are you today?";
Console.WriteLine("{0}\n", classifier.classifyToString(s1));

 

Cleaning the data

Cleaning a set of OCR’d free form text files and converting it to a format that makes it easy for a developer to process it using code can be difficult. Data cleaning is an ad-hoc process requiring use of every data processing, and possibly coding, tool you have in your tool belt:

  • Regular Expressions
    • String replacement
  • XML DOM traversal
    • Node + attribute add
    • Node + attribute remove
  • Standard string split and replacement operations
  • Log to console during processing and manually create filtered lists of interesting data.
    • These lists can also be fed back as training data for future machine learning runs.
    • Filter values on created lists from source data.
  • Use of trained models

All that said, Named Entity Recognition gives you a fun and solid starting point to start cleaning your data using the power of models from machine learning outputs.

I highly recommend using Stanford NER as one or more stages in a pre-production data cleaning pipeline (especially if you are targeting the data for rendering on mobile platforms).

Beware: your data may have a series of false positive PERSON, ORGANIZATION, or LOCATION tags written into it by Stanford NER that may have to be filtered out or augmented by additional post-processing.

In my case, Stanford NER also marks up the names of people in the text with PERSON XML tags. You may notice that Abraham Lincoln above is part of a ship name as well as the name of a person. In my first run of this ship history text through Stanford NER, the text Abraham Lincoln was surrounded by PERSON XML tags. Having 1,000 occurrences of the person 'Abraham Lincoln' in a ship history about the U.S.S. Abraham Lincoln is probably not very useful to anyone. I had to run a post-processing step that used the XML DOM to remove any PERSON, ORGANIZATION, or LOCATION tags whose parent was an 'i' tag. I found that the 'i' tag was written into the original history text from DANFS to indicate a ship name, so it was the easiest (and only) data marker I had to aid in cleaning. The same problem occurs for the U.S.S. Arizona, U.S.S. New Jersey, U.S.S. Missouri, and other text where locations, people, or organizations are used as ship names.
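
As an illustration, here is a minimal sketch of that post-processing step using System.Xml.Linq. It assumes the NER output has already been loaded into an XDocument and that ship names are wrapped in 'i' elements as they are in the DANFS text; the method name and the ReplaceWith approach are mine, not part of Stanford NER.

using System.Linq;
using System.Xml.Linq;

static void StripEntityTagsInsideShipNames(XDocument nerOutput)
{
    var entityTags = new[] { "PERSON", "ORGANIZATION", "LOCATION" };

    // Find entity elements whose parent is an <i> (ship name) element
    var falsePositives = nerOutput.Descendants()
        .Where(e => entityTags.Contains(e.Name.LocalName)
                    && e.Parent != null
                    && e.Parent.Name.LocalName == "i")
        .ToList();

    // Replace each tagged element with its inner content, dropping the tag itself
    foreach (var element in falsePositives)
    {
        element.ReplaceWith(element.Nodes().ToList());
    }
}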

I fully intend on trying to use machine learning, and alternate training sets for Stanford NER, to ensure that PERSON, ORGANIZATION, and LOCATION tags are not written if the text is within an ‘i’ tag (but haven’t done this yet).

As I was cleaning the data using Stanford NER I ran across instances where the rank or honorific of a person was not included within the PERSON tag:

Capt. <PERSON>William B. Hayden</PERSON>

My initial implementation to include the honorific and/or rank of the person was problematic.

  • After the NER stage, I scan through the XML output using System.Xml.Linq (i.e. XDocument, XNode, XElement) looking for any PERSON tags.
  • Using XML DOM order, I went to the text node immediately preceding the opening PERSON tag.
  • I then displayed the last three words, as split by a space, at the end of that preceding text node (a rough sketch of this scan follows below).
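
Here is a minimal sketch of that scan, assuming the NER output has been loaded into an XDocument; the three-word window and the console logging are just the diagnostic step described above, not part of any Stanford NER API.

using System;
using System.Linq;
using System.Xml.Linq;

static void LogPossibleHonorifics(XDocument nerOutput)
{
    foreach (var person in nerOutput.Descendants("PERSON"))
    {
        // The text node immediately preceding the opening PERSON tag
        var previousText = person.PreviousNode as XText;
        if (previousText == null) continue;

        // Grab the last three whitespace-separated words before the name
        var words = previousText.Value.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var candidate = string.Join(" ", words.Skip(Math.Max(0, words.Length - 3)));

        Console.WriteLine("{0} => {1}", candidate, person.Value);
    }
}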

In my naiveté I figured that the honorific and/or rank would be at worst about 50 different 3 word variations. Things like:

  • Private First Class
  • Lt. Col.
  • Col.
  • Gen.
  • Maj. Gen.
  • Capt.

Well, imagine my astonishment when I discovered that the DANFS data doesn’t only contain the names and honorifics of military personnel, but also of passengers and other people in and around the history of each ship:

  • Senator
  • Sen.
  • King
  • Queen
  • Princess
  • Mrs.
  • Representative
  • Congressman
  • …. and many, many, many more

In addition, I found that almost every possible abbreviation for rank exists in the ship history text:

  • Lt.
  • Lieutenant
  • Lieut.
  • Commander
  • Cmdr.

My second pass at determining the honorific of a person may just involve a custom training stage to create a separate ‘Honorific’ model trained on the set of honorific data I have derived by hand from the whole DANFS text data set.

As stated above, data scientists spend way too much time cleaning data using any technique they can. On this project, I found that I was never really done: I kept extracting new things from the data, then cleaning the output data some more, in a never-ending loop.

I hope in the future to provide all the sample cleaned DANFS data, my data cleaning source code, and a sample mobile app that will interactively render out historical ship locations.

Writing Node Applications as a .NET Developer – My experience in Developing in Node vs .NET/C# (Part 3)

While the previous posts described what one needs to know before starting a Node project, what follows are some of the experiences I had while actually writing a Node application.

How do I structure my project?

The main problem I had when developing my Node application was figuring out a sound application structure. As mentioned earlier, there is a significant difference between Node and C# when it comes to declaring file dependencies. C#’s using statement is more of a convenience feature for specifying namespaces and its compiler does the dirty work of determining what files and DLLs are required to compile a program. Node’s CommonJS module system explicitly imports a file or dependency into a dependent file at runtime. In C#, I generally inject a class’s dependencies via constructor injection, delegating object instantiation and resolution to an Inversion of Control container. In Javascript, however, I tend to write in a more functional manner where I write and pass around functions instead of stateful objects.
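
For contrast, here is a minimal sketch of the constructor-injection style I am describing on the C# side; the interface and class names are purely illustrative.

// Dependencies are declared on the constructor and supplied by an IoC container
public interface IReportRepository
{
    string GetReport(string id);
}

public class ReportService
{
    private readonly IReportRepository _repository;

    public ReportService(IReportRepository repository)
    {
        _repository = repository;
    }

    public string Describe(string id) => _repository.GetReport(id);
}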

This difference in styles and structure made me question my design choices and forced me to decide between the following:

  • Passing a module’s dependency(s) in as a function parameter OR
  • “Require-ing” the dependency module via Node’s module system

Right or wrong, I opted for the latter.  Doing so allowed my module to encapsulate its dependencies and decoupled its implementation from its dependent modules. In addition, for unit testing purposes, I was able to mock and stub any modules that I imported via “require” statements using the library “rewire”.

After feeling as though this was the “wrong” way of designing my application, I came to realize the following:

The CommonJS module system is a type of IoC container

In fact, when “require-ing” a module, the result of that module is cached and returned for subsequent require calls to that same file path within an application context.  After realizing this, my anxiety around application structure melted away as I realized I could use the same patterns I would use in a C# application.

How do I write unit tests?

Being the disciplined developer that I am, I rely heavily on unit tests as a safety net against regressions, as well as to implement new features through Test Driven Development. In C# (with Visual Studio’s help), the testing story is straightforward: one only needs to create a test project, write tests, and use the IDE’s built-in test runner to run them. If using NUnit or Visual Studio’s Test Tools, tests and test fixtures are denoted via attributes that the test runner picks up while running tests. The developer experience is quite frictionless, as testing is a first-class citizen in the ecosystem and within Visual Studio.

Setting up a testing environment in a Node project is a different story. The first decision one must make is the test framework to utilize; the most popular being Jasmine and Mocha.  Both require a configuration file that details the following:

  • Which files (via a file pattern) should (and shouldn’t) be considered tests and therefore processed by the test runner
  • What reporter to use to output test results and detail any failed tests or exceptions
  • Any custom configuration related to transpilation or code processing that will need to be performed prior to running your tests

While the first two items are fairly straightforward, the third can be a major point of frustration, especially for those new to Javascript build tools and transpilers. Some of the biggest problems I faced with Javascript testing involved getting my files to transpile before being run through the test runner.

My first approach was to use Webpack (since I was already using it in development and production for bundling and transpilation) to create a bundle of my test files that would then run through the test runner. This required a separate webpack configuration (to indicate which test files needed to be bundled) along with configuring my Jasmine config file to point to this bundle. While this did work, it was painfully slow, as a bundle had to be created and run through the test runner on every run. In addition, it felt like a hack, as I’d need to clean up the generated bundle file after each test run. My eventual solution was to use babel-register as a helper to allow Jasmine to run all of my files through this transpiler utility. This worked well (albeit slowly) and seemed like the cleaner solution, as babel-register acts as a transpilation pipeline, transpiling your code in memory and providing it to Jasmine for testing.

Many of the issues I faced with setting up a test harness for my Node application were related to the pain points inherent to transpilation. If I hadn’t been using advanced Javascript language features, this pain would have been eased slightly. However, this fact points to the differences in decision points one must face when developing a Node application compared to developing a .NET application.

Overall experience compared to C#

Aside from the pain points and confusion that I faced in the preceding sections, my overall experience in developing a Node application was delightful. Much of this is due to my love for Javascript as a language, but the standard Node library, as well as the immense number of third party libraries available via npm, allowed me to easily accomplish whatever programming goal I had. In addition, I found that when I was stuck using a certain library or standard library module, I had plenty of resources available to troubleshoot any issues, whether they were Github issues or Stack Overflow articles. As a last resort, if Googling my problem didn’t result in a resolution, I was able to look at the actual source code of my application’s dependencies, which was available in the node_modules folder.

After clearing these initial hurdles, the overall development experience in Node was not that much different from developing an application in C#. The major difference between the two platforms is that the standard tooling for .NET applications is arguably the best available to any development community. Visual Studio does so much for the developer in all facets of application development, which is great for productivity, but it can abstract away so much of what your application and code are doing under the hood that it becomes an impediment to growing as a programmer. Although at first it seemed like a step backwards, having the command line as a tool in my Node development process exposed me to the steps required to build the application, giving me better insight into the process.

Summary

At the end of the day, both .NET and Node are very capable frameworks that will allow you to create nearly any type of application that you desire. As with many things in technology, deciding between the two generally comes down to your project’s resources and time at hand, as well as the amount of familiarity and experience on your team for a given framework. Both frameworks have pros and cons when compared against each other but one can’t go wrong in choosing one over the other.

From a personal perspective, I thoroughly enjoy developing Node applications for a few reasons. The first being that Javascript is my favorite language and the more Javascript code I can write, the happier I am. Writing Node applications is also a great way to improve your skills in the language compared to Javascript development for the web as you can focus solely on your application logic and not be slowed down by issues related to working with the DOM or different browser quirks. Finally, I find Node to be a great tool for rapid development and prototypes.  The runtime feels very lightweight and if you have a proper build/tool chain in place, the developer feedback loop can be very tight and gratifying.

Overall, you can’t go wrong between the two frameworks but if you want to get out of your comfort zone and fully embrace the full-stack javascript mindset, I strongly recommend giving Node development a shot!

Coping with Device Rotation in Xamarin.Android

You think you have your Android application in a state where you can demo it to your supervisor, when you accidentally rotate your device and the app crashes. We have all been there before, and the good news is that the fix is usually pretty simple, even if it can sometimes take a while to find.

This has always been an issue for Android developers, but I have found that, due to the unique interaction between your C# classes and the corresponding Java objects, it seems to be a little more sensitive with Xamarin.Android apps. In this post, we will discuss what happens when you rotate your device and cover the different techniques that you might choose to use to manage your application state through device rotations as well as the ramifications of each of them.

Configuration Changes

So, what happens when you rotate your device?  Your device’s orientation is considered to be a part of the configuration of your application and, by default, Android will restart your Activity whenever it detects a change to the configuration.  At first glance, this seems like a pretty heavy-handed approach to handling device rotations, but there is a reason behind it. To understand that reason, we need to go back and review a few basics about Android app development.

Android, and by extension Xamarin.Android, has a way for you to create resources that only apply for a particular configuration value. Resources can be added to a folder that is tied to a configuration value, and those resources will only be used if that configuration value exists in the current setup. This is seen most often with drawables, where the various drawable-mdpi or other drawable-*dpi folders let you provide images that are scaled appropriately for the resolution of the device. The Android system will choose the proper drawable folder and fall back to the base drawable folder at runtime whenever a given image is requested. This system of coupling resources to a given configuration goes beyond drawables to include all of the resource types (layouts, strings, values, colors, etc.), which can all be restricted using the same configuration qualifiers.

Device orientation is one of those configuration values, and it can be used to conditionally load different resources through the “*-port” or “*-land” qualifiers. This means there could be different resources in use when viewing the app in landscape mode than in portrait mode, and that difference needs to be accounted for when the device rotates. The Android team decided that restarting the Activity would be the best way to handle this, so the resources can be reloaded with the new configuration values when the Activity restarts.

That might all sound fine, except that it can cause problems if you have not correctly accounted for this behavior in your code.  There are several approaches that you can take to deal with this, although the simplest approaches can also be the most restrictive to your app.

Prevent Orientation Changes

The first approach is also going to be the easiest to handle, but it is the most limiting, because it involves telling Android that your Activity only supports a single orientation. You can do this by setting the ScreenOrientation property of the ActivityAttribute on your Activity class to the orientation that you want to force your Activity to use.

[Activity(ScreenOrientation = ScreenOrientation.Portrait)]
public class MyActivity : Activity
{
}

This approach has some obvious drawbacks, but if it works for your UX needs then it will be a simple way to ensure that an orientation change will not affect your application.

However, it is important to keep in mind that orientation changes are not the only configuration changes that might occur and cause your Activity to restart. For instance, a user can change their font/text size, which will also trigger your Activity to restart. So suppressing an orientation change is not a complete fix for any issues that your app would experience with restarting the Activity.

Manually Handling Configuration Changes

The second approach is much more manual. It is possible to tell Android that you want to handle configuration changes yourself in your code instead of having the Activity restarted. To implement this approach, you subscribe to specific configuration changes and then override the OnConfigurationChanged method in your Activity to process the new configuration object and do whatever needs to be done. Just like preventing an orientation change, subscribing is as simple as tweaking your ActivityAttribute to specify which configuration changes will trigger calls to your method.

[Activity(ConfigurationChanges = ConfigChanges.Orientation | ConfigChanges.ScreenSize)]
public class MyActivity : Activity
{
    public override void OnConfigurationChanged(Configuration newConfig)
    {
        base.OnConfigurationChanged(newConfig);

        // perform actions to update your UI
    }
}

You might have noticed that I included the ScreenSize configuration change in my attribute. The reason is that, since API 13 (Honeycomb), the screen size also changes with the orientation, so you need to subscribe to both changes in order to keep your Activity from restarting.

This source code example will indeed prevent the configuration change from restarting your Activity, but it will not change which resources are used. If you were originally in portrait mode, then you would still be using your portrait resources even though you are now in landscape mode. If you want to also update your resources, then you will need to manually inflate the new resources and replace the old ones. However, since the Activity is not being restarted, the lifecycle events (OnCreate, OnResume, etc.) are not executing, so any code that obtains references to views in your layout, or initializes values in your layout, will need to run again for your newly inflated views.
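
As a rough sketch of what that manual work looks like, assuming a layout resource named Resource.Layout.Main and a hypothetical BindViews helper that re-finds and re-populates your views:

using Android.App;
using Android.Content.PM;
using Android.Content.Res;

[Activity(ConfigurationChanges = ConfigChanges.Orientation | ConfigChanges.ScreenSize)]
public class MyActivity : Activity
{
    public override void OnConfigurationChanged(Configuration newConfig)
    {
        base.OnConfigurationChanged(newConfig);

        // Re-inflate the layout so any orientation-specific resources are picked up
        SetContentView(Resource.Layout.Main);

        // Re-acquire view references and re-populate them, since OnCreate will not run again
        BindViews();
    }

    private void BindViews()
    {
        // hypothetical helper: FindViewById calls, adapter setup, event handlers, etc.
    }
}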

This can quickly become a lot of work if your UI is anything other than trivial, and it is going to be prone to breaking if you forget a step or something changes in the future. As a result, I cannot recommend this approach unless your app does not utilize orientation-based resource overrides that would require inflation.

Retaining your Fragment Instance

If you are making use of fragments in your app, then they are also destroyed and recreated along with your Activity when a configuration change occurs. If your Activity class itself is just a thin wrapper around different fragments that actually contain most of your application state, then it may be enough for you to retain your fragments and let the Activity recreate itself. Like the previous examples, this is very simple to do since you just have to set one property on your fragment to let Android know that the instance needs to be saved.

// from activity
var fragment = new MyFragment();
fragment.RetainInstance = true;

// OR in fragment
public override void OnCreate(Bundle bundle)
{
    base.OnCreate(bundle);
    RetainInstance = true;
}

Android will save the instance of your fragment and when it rebuilds your Activity it will reuse that fragment instance instead of creating a new one. This sounds pretty good, but if your fragment makes use of any resources that are orientation specific then you run into the exact same problem that you have with manually managing the orientation change. You will still need to inflate your new resources and initialize them manually, so other than saving your member variables for you this approach does not gain you a lot.

However, one place where this technique can be very useful is if you have some objects that may not serialize/deserialize well. As long as those objects do not retain references to the Context/Activity, such as other views or drawables, you can add those objects to a dummy fragment and retain that fragment. The fragment itself should just be a thin fragment that does not do anything else, and since there is no resource inflation happening in it you do not have to worry about reinflating anything.
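
A minimal sketch of such a holder fragment might look like this; the HolderFragment name and the RetainedObject property are illustrative only:

using Android.App;
using Android.OS;

// A thin fragment whose only job is to survive the configuration change
// and carry a non-serializable object across the Activity restart.
public class HolderFragment : Fragment
{
    public object RetainedObject { get; set; }

    public override void OnCreate(Bundle savedInstanceState)
    {
        base.OnCreate(savedInstanceState);
        RetainInstance = true;
    }
}

In the Activity, you would add this fragment once under a known tag via the FragmentManager and look it up by that same tag after the restart.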

Save and Restore your Application State

The Android designers knew that destroying and recreating the Activity was going to cause problems so they provided a mechanism for developers to save their state and then restore it after recreation.

Both the Activity and Fragment classes have an OnSaveInstanceState method that receives a Bundle where you can store serializable data. This method is called just prior to those objects being destroyed, so the class states are still valid. You can use this Bundle to store member variables from your class, data that was retrieved and that you do not want to have to retrieve again, or anything else that is serializable.

protected override void OnSaveInstanceState(Bundle outState)
{
    base.OnSaveInstanceState(outState);

    outState.PutBoolean("someBoolean", someBoolean);
    outState.PutInt("someInt", someInt);

    // assume someModels is of type IList<SomeModel>
    // I like to use Newtonsoft.Json to serialize to strings and back
    outState.PutString("someModels", JsonConvert.SerializeObject(someModels));
}

Activity classes have an OnRestoreInstanceState method that receives the Bundle containing the saved state and gives you a chance to repopulate the class’s members with their data, although the same Bundle is also passed to OnCreate, so you could put your restore logic there as well depending on your needs. OnRestoreInstanceState is called after OnStart, so if you need to initialize views before the Activity is started then you will want to use OnCreate. Keep in mind that the Bundle in OnCreate can be null if the Activity is being launched fresh, so you will need to perform a null check.

protected override void OnCreate(Bundle savedInstanceState)
{
        base.OnCreate(savedInstanceState);

        // you must do a null check before referencing it here
        if (savedInstanceState != null)
        {
            someBoolean = savedInstanceState.GetBoolean("someBoolean", false);
            someInt = savedInstanceState.GetInt("someInt", 0);

            someModels = (IList<SomeModel>)JsonConvert.DeserializeObject<IList<SomeModel>>(savedInstanceState.GetString("someModels", null));
        }
}

protected override void OnRestoreInstanceState(Bundle savedInstanceState)
{
        base.OnRestoreInstanceState(savedInstanceState);

        // this method is only called when restoring state, so no need to do a null check
        someBoolean = savedInstanceState.GetBoolean("someBoolean", false);
        someInt = savedInstanceState.GetInt("someInt", 0);

        someModels = (IList<SomeModel>)JsonConvert.DeserializeObject<IList<SomeModel>>(savedInstanceState.GetString("someModels", null));
}

Fragments are a little different in that there are multiple methods that receive the Bundle with the saved state, so you can restore your state in any of them. Generally speaking, I would recommend using the OnActivityCreated method to restore your state, since it runs before the UI views in your Fragment are restored. If you need to restore your state after the UI views have been restored, then you can use the OnViewStateRestored method.

public override void OnActivityCreated(Bundle savedInstanceState)
{
	base.OnActivityCreated(savedInstanceState);

	// load the data from the saved cache if it exists
	if (savedInstanceState != null)
	{
		someModels = (IList<SomeModel>)JsonConvert.DeserializeObject<IList<SomeModel>>(savedInstanceState.GetString("someModels", null));
	}
}

Android does not want you to have to do all of the work, so it will automatically save the state of any views in your UI that have IDs. Pieces of information like your scroll location are saved and restored with the views between OnActivityCreated and OnViewStateRestored, so if you want your scroll location to be restored correctly, you will need to populate new adapters with your saved data and attach them to your lists in OnActivityCreated so that the scroll size is correct before Android sets the location.
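
For example, here is a minimal sketch of attaching the restored data early enough for the scroll position to be reapplied, assuming a ListView with a hypothetical Resource.Id.shipList id and a hypothetical ShipAdapter:

public override void OnActivityCreated(Bundle savedInstanceState)
{
    base.OnActivityCreated(savedInstanceState);

    // someModels was restored from the Bundle as shown above

    // Attach a populated adapter here so the list has its final size before
    // Android restores the saved scroll position in OnViewStateRestored
    var listView = View.FindViewById<ListView>(Resource.Id.shipList);
    listView.Adapter = new ShipAdapter(Activity, someModels);
}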

One other piece that you will need to keep in mind is that Android will also attempt to restore your fragments and the back stack in the fragment manager. However, if your Activity keeps a reference to any of the fragments within it, you will need to save that fragment identifier so that Android can restore it with the correct instance. Fortunately they provide an easy way to do that.

protected override void OnSaveInstanceState(Bundle outState)
{
    base.OnSaveInstanceState(outState);

    // I am using SupportFragmentManager here since I am using AppCompat with the support libraries.
    // This should also work with FragmentManager if you are not using AppCompat.
    SupportFragmentManager.PutFragment(outState, "currentFragment", currentFragment);
}

protected override void OnRestoreInstanceState(Bundle savedInstanceState)
{
    base.OnRestoreInstanceState(savedInstanceState);

    currentFragment = SupportFragmentManager.GetFragment(savedInstanceState, "currentFragment") as MyFragment;
}

It might seem like a lot of work to save and restore your state, but all you really need to do is save off your Activity and Fragment’s instance variables and restore them at the appropriate moments. Most of the issues that I run into deal with forgetting to save/restore variables that I have added.

How does this affect async/await?

One other thing to keep in mind is that you will need to manage your async/await Tasks. You should try to implement your Tasks so that they can be cancelled if needed. If you have a pending Task when your Activity restarts, then when the Task completes it will try to resume its continuation in an Activity instance that no longer exists. Ideally, you should cancel any pending Tasks when the Activity or Fragment is stopped, or come up with an approach where the Task runs in some class instance that is not destroyed with the Activity.
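
One simple approach, sketched below, is to tie pending work to a CancellationTokenSource that gets cancelled in OnStop; the LoadDataAsync call is a hypothetical stand-in for your real work:

using System;
using System.Threading;
using System.Threading.Tasks;
using Android.App;

public class MyActivity : Activity
{
    private CancellationTokenSource _cts;

    protected override void OnStart()
    {
        base.OnStart();
        _cts = new CancellationTokenSource();
    }

    protected override void OnStop()
    {
        base.OnStop();
        // Cancel pending work so its continuation does not touch a destroyed Activity
        _cts.Cancel();
    }

    private async Task RefreshAsync()
    {
        try
        {
            var data = await LoadDataAsync(_cts.Token);
            // update the UI with data here
        }
        catch (OperationCanceledException)
        {
            // expected when the Activity is stopping; nothing to do
        }
    }

    private async Task<string> LoadDataAsync(CancellationToken token)
    {
        await Task.Delay(1000, token); // stand-in for a real network or database call
        return "data";
    }
}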

Conclusion

As a developer, it can be annoying work to properly maintain your application’s state. If your application has a relatively simple UI that does not involve resource overrides, then you are probably going to be safe ignoring orientation changes. However, if your requirements change in the future, then that decision could give you a headache. This is one of those cases where it is probably easier to implement it properly from the beginning rather than ignoring it and refactoring it later once it has become an issue.

I hope that this article has helped you come to a better understanding of Android configuration changes and how you can take steps to make sure that your app is going to work properly when it is rotated.

Writing Node Applications as a .NET Developer – Getting Ready to Develop (Part 2)

In the previous blog post, I provided a general overview of some of the key differences between the two frameworks. With that out of the way, we’re ready to get started writing an application. However, there are some key decisions to make regarding which development tools to use, as well as getting the execution environment set up.

Selecting an IDE/Text Editor

Before I could write a line of code, I needed to decide on an IDE/text editor to write my application with. As a C# developer, I was spoiled by the number of features Visual Studio offers that allow for a frictionless and productive development experience. I wanted the same experience when writing a Node application, so before deciding on an IDE I had a few prerequisites:

  • Debugging capabilities built into the IDE
  • Unobtrusive and generally correct autocomplete
  • File navigation via symbols (CTRL + click in Visual Studio with Resharper extension)
  • Refactoring utilities that I could trust; Find/Replace wasn’t good enough

While I love Visual Studio, I find that its JavaScript editor is more annoying than helpful. Its autocomplete often gets in the way of my typing, and it will automatically capitalize my symbols without prompting. Add to that the fact that I was already working with a new framework and spreading my wings, so I wanted to expose myself to another tool for learning’s sake.

Given my preferences above, I decided that JetBrains’ Webstorm would fit my needs:

  • Webstorm offers a Node debugging experience that rivals VS’s. One can set breakpoints, view locals and evaluate code when a breakpoint is hit.
  • The IDE’s autocomplete features (although not perfect) offer not only the correct symbols I’m targeting but often times would describe the signature of the function I was intending to call.
  • By indexing your project files on application start, Webstorm allows for symbol navigation via CTRL + click.  I was even able to navigate into node_modules files.
  • When refactoring code, Webstorm will search filenames, symbols and text comments, providing a safe way of refactoring code without (too many) headaches.

While not at the same level as Visual Studio’s C# development experience, Webstorm offers the next best thing: an environment that provides a familiar developer experience. Although there are other (free) options available (Sublime Text, Atom, Visual Studio Code), I found that with these editors I had to do more work to set up an environment that would allow me to develop at a productive pace.

Embracing the Command Line

Due to the power of Visual Studio as a tool and its ability to abstract away mundane operations, your average .NET developer tends to be a little wary of using the command line to perform common tasks.  Actions such as installing dependencies, running build commands and generating project templates are handled quite well in Visual Studio through wizards and search GUIs, preventing the user from having to know a myriad of tool-specific commands.

This is not the case when working with the Node ecosystem and its contemporary toolset. Need to install a dependency via npm? A quick `npm i -S my-dependency` is required. Want to run a yeoman generator to scaffold out an express application? You only need to download the generator (if you don’t have it) using the same npm tool, run the generator with `yo my-awesome-generator`, and walk through the prompts. How about a build command? Assuming you have an npm script set up, typing `npm run build:prod` will do (even though this is just an alias for another command line command that you will have to write). In Node development, working with that spooky prompt is unavoidable.

While it might feel tedious and like a step backwards as a developer, using the command line as a development tool has many benefits. You generally see the actions that a command is performing, which gives you better insight into what is actually happening when you run `npm run build:prod`. By using various tools via the command line, you gain a better grasp of which tool is meant for what purpose. Compare this to Visual Studio, where at first blush one equates NuGet, compiling via the F5 key, and project templates with Visual Studio as a whole, not grasping that each of these commands invokes a separate toolset or dependency. Having better insight into your toolchain can help in troubleshooting when a problem arises.

Running Your Code

The final step in writing a Node application is preparing your environment to run your Node code. The only things you will need to run your application are the Node runtime and the Node Package Manager (included in the same download and installed alongside Node).

Node.exe is the actual executable that will run your Node application. Since Javascript is an interpreted language, the code you write is passed to this executable, which parses and runs your application. There is no explicit compilation step that a user must perform prior to running a Node application. Furthermore, unlike applications written in a .NET language, Node programs are not dependent on a system-wide framework being present. The only requirement for your code to run is to have node.exe on the system path. This makes the deployment story of a Node application simpler and allows for cross-platform deployment that is not yet readily available to .NET applications.

The neat thing about Node is that if you type in the `node` command without any file parameters, you get access to the Node REPL right in your console.  While this is great for experimentation or running/testing scripts, it’s a little lacking and I’ve only used it for simple operations and language feature tests.

While node.exe is what runs your application, npm is what will drive your application’s development.  Although the “pm” might stand for Package Manager to pull in your project dependencies, it’s more of a do-all utility that can run predefined scripts, specify project dependencies and provide a manifest file for your project if you publish it as an npm module.

Summary

Oftentimes with new frameworks and technologies, I have experienced frustration in getting my environment set up so that I could write code that runs at the click of a button. However with Node, the process is very straightforward, simply requiring one to install the runtime and package manager which are both available as an MSI that can be found on Node’s website.  From there, you can run your Node program by accessing the command line and pointing to your entry file.  In all honesty, the hardest part was deciding on an IDE that offered some of the features I became accustomed to when working in Visual Studio.

In the next and final post in this series, I will provide my overall experience with writing a Node application, detailing some questions I had surrounding application structure and testing, as well as giving a summary on my feelings on the runtime.

Writing Node Applications as a .NET Developer

As a .NET developer, creating modern web apps using Node on the backend can seem daunting. The amount of tooling and setup required before you can write a “modern” application has resulted in the development community displaying “Javascript Fatigue”: a general wariness related to the exploding amount of tooling, libraries, frameworks, and best practices introduced on a seemingly daily basis. Contrast this with building an app in .NET using Visual Studio, where the developer simply selects a project template to build off of and is ready to go.

Universal Windows Platform – The Undiscovered Country for Windows Desktop Apps

It is unfortunate, but I think Dr. McCoy from Star Trek: The Original Series said it best: “Windows 10 Mobile is dead, Jim”… and just to pile on with one more.

Buried underneath the red-shirt-like death of Windows 10 Mobile lies the amazing Universal Windows Platform (UWP).

The developer capabilities of the Universal Windows Platform have been documented in many a location.

UWP provides:

  • Full developer parity at the API and framework level across all flavors of Windows 10, Xbox One, HoloLens, and Windows 10 Mobile devices.
  • A simplified install model via APPX bundles.
  • A per-app separation of registry and file systems.
  • XAML controls ‘just work’ across all Windows 10 / UWP devices.
  • Targeting the UWP API set ensures that your app works across Windows 10 / UWP devices.

In my experience, there is nothing weirder than seeing your Windows 10 / UWP app just work on an Xbox One,  in a virtual projected rectangle via a HoloLens, and via mouse and keyboard on a standard Windows 10 PC.

 

[Diagram: the ".NET Today" platform overview]

The above diagram is from the great Build 2016 presentation – .NET Overview given by Scott Hanselman and Scott Hunter.

UWP fits within a completely different vertical of development than the existing Windows Desktop technologies of Windows Presentation Foundation (WPF) and Windows Forms. As such, UWP has access to the more modern feature set common to Windows Store applications. UWP both duplicates and provides subsets of the API / XAML features that exist within WPF.

Right now may be a good time to take a look at your existing Windows Desktop apps and assess if they can be moved to UWP.

A UWP port of your legacy desktop app will allow access to modern UWP features:

  • Live tile support
  • Push notifications
  • Cortana
  • Background Downloader

There are 3 primary techniques to bring your current legacy desktop app to UWP:

  • Project Centennial
  • Straight UWP + Portable Library Breakout
  • Brokered Windows Runtime Components

Microsoft’s Project Centennial can be thought of as a UWP enabling wrapper for your existing Windows Desktop app. You may be able to repackage your existing Windows Desktop application and run it as a UWP app via a Project Centennial wrapper on the forthcoming Windows 10 Anniversary Update. As with all huge development efforts, there may be some caveats to using Project Centennial to package your Windows Desktop app as a UWP app (i.e. driver installation for custom hardware devices is not supported).

Update: Project Centennial based apps can go-live against the Windows 10 Anniversary Update as of August 2, 2016.

A straight UWP + Portable Library code breakout of your app could allow the features of existing Windows Desktop apps to be migrated in a straightforward way to Windows Store / UWP applications.

Note that the main UI layer of your existing Windows Desktop app is not easily converted to UWP.

  • WPF XAML is not the same as UWP XAML with differences in control support, touch events, and lack of Triggers in UWP XAML.
  • UWP + XAML is a radical departure from WinForms. The headaches you may have encountered going from WinForms to WPF are only magnified in a WinForms to UWP conversion due to no real access to the underlying Win32 API set.

Brokered Windows Runtime Components allow a UWP app to tunnel between the UWP API and the Windows Desktop environment to run full .NET Framework 4.6 code on the Windows Desktop side. The downsides of this approach are many, the most limiting of which is that this technique can only be used on sideloaded UWP apps, not on UWP apps which are meant to be distributed via the Windows Store.

Interesting and non-obvious outcomes I have discovered during Windows Desktop app to UWP app code migrations:

  • You discover that possibilities may exist for your app on Android and iOS, by using third party Portable Libraries along with your own code repackaged as Portable Libraries running via Xamarin.
  • I have found that having UWP compatible code, and Portable Libraries, allows you to create live prototypes that can bring your business services into an augmented reality future on Microsoft’s HoloLens, and a passable default experience on Xbox One.

We are at a true maturity point across the Windows, iOS, and Android ecosystems when it comes to repurposing managed code across all the .NET runtime variants via portable libraries. There is no better time than now to dust off that old-school Windows Desktop code and see how you may be able to put it to great new purposes on modern devices via UWP / .NET Core and/or Xamarin for iOS and Android.

I believe that the great things that Microsoft has shipped via their Universal Windows Platform have been lost in the 5 year sideways journey to catch up to Android and iOS. That sideways journey started with Windows CE variants in the form of Windows Phone 7; through the slight update of Windows Phone 8;  the flub of full Windows 8;  the long forgotten Windows RT / Windows On ARM strategy; through the full revamp (and necessary stepping stone) that was Windows Phone 8.1; and onto today’s real deal: Windows 10 / Universal Windows Platform.

Microsoft always nails something the third time they take a stab at it. In this case it feels like they have something real with UWP after 5 or 6 stabs and a whole bunch of missteps.

In a recent post to ArsTechnica, Peter Bright, master writer of all things Microsoft, wrote up a concise technical history regarding the path that Microsoft took in order to arrive at the current Universal Windows Platform destination.

With the maturity and solidity of UWP, and the positive mindshare linkage to Windows 10 via marketing and branding schemes, I think it is right to declare that UWP gets Microsoft out of the ‘Bull wandering through a china shop’ phase. With UWP, Microsoft is firmly into ‘We have something that is truly a firm third place contender, and primed for growth’ territory.

 

Windows Desktop app development has always been a major pain when it comes to:

  • Installation and Upgrades — The only clean way to do this was with Windows Installer, and the only clean way to do Windows Installer is Windows Installer XML (just look at that crazy XML schema, and bizarre otherworldly installer object model — Thank goodness for Votive though!)
  • COM + Win32 API / PInvoke — No matter what you do, it always seems like you have to resort to COM or  Win32 / PInvoke for something.
  • Device Access — It seems as if you need to know DIFx to install a driver, then have to know way too much about Windows Driver Model, Bluetooth sockets, or UMDF just to get read and write streams to your device from a program.
  • Security — How many times have you seen the ‘Red Banner Dialog of Doom‘ while installing something? Only to be told to click ‘Install this driver anyway’?

To solve the above platform deficiencies, the Windows platform needed a simplified app development model along the lines of Android and iOS. They have that scheme today in the form of UWP, with far superior tooling in Visual Studio 2015, and far easier app coding with C# / XAML.

On previous projects, the primary things holding back any sort of Windows Desktop to Windows Store / UWP migration were:

  • Third Party Software Libraries — Tied directly to full .NET Framework 4.x and usable only in Windows Desktop apps.
  • Specialized device access (i.e. USB / COM Port / Bluetooth) — Oftentimes these devices came from third parties, along with an access library written for the device that was only usable in Windows Desktop apps.

The third party software library realm has softened quite a bit. Many third party libraries that were directly tied to full .NET Framework 4.x have been open sourced (allowing you to pull needed bits out), or turned into Portable Libraries for use in UWP (and other mobile platform) applications.

If your existing Windows Desktop app requires integration with devices, you may have some issues with a straight Project Centennial wrap of your app. You may have to adapt and rewrite your device access layers against UWP APIs.

Specialized device access has matured greatly within UWP / Windows Store apps with the release of Windows 10 / UWP. There are now fully supported UWP API sets which allow direct access to USB, serial (COM port), and Bluetooth devices from a UWP app.
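
For example, here is a minimal sketch of enumerating and opening a serial (COM port) device from UWP using the Windows.Devices.SerialCommunication APIs; note that the app manifest also needs the matching DeviceCapability declaration, and the helper class name here is mine.

using System.Threading.Tasks;
using Windows.Devices.Enumeration;
using Windows.Devices.SerialCommunication;

public static class SerialPortHelper
{
    // Enumerate serial (COM port) devices and open the first one found
    public static async Task<SerialDevice> OpenFirstSerialDeviceAsync()
    {
        string selector = SerialDevice.GetDeviceSelector();
        DeviceInformationCollection devices = await DeviceInformation.FindAllAsync(selector);

        if (devices.Count == 0)
        {
            return null;
        }

        // Read and write via port.InputStream / port.OutputStream once configured
        SerialDevice port = await SerialDevice.FromIdAsync(devices[0].Id);
        if (port != null)
        {
            port.BaudRate = 9600;
        }

        return port;
    }
}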

A more friendly environment has been created in the world of Windows application development via a series of fortunate events:

  • The depths of despair that came with Windows 8 are gone with the rise of Windows 10.
  • The purchase of Xamarin by Microsoft.
    • Along with the support guarantees and savvy short term moves made since the acquisition.
  • Maturity of open source code + flexible licensing of that code
  • The rise of the UWP, iOS, and Android compatible Portable Library.

In general, the environment is such that it is a great time to look at a way to fracture and/or pivot your Windows Desktop apps onto UWP, and possibly even iOS and Android via Xamarin. Give it a try, you might be surprised how far your existing old-school code can stretch across modern devices.

Your pivot to get your Windows Desktop application working on Universal Windows Platform may even work on Surface devices from The Next Generation.